Highest Probability SVM Nearest Neighbor Classifier for Spam Filtering

Blanzieri, Enrico and Bryl, Anton (2007) Highest Probability SVM Nearest Neighbor Classifier for Spam Filtering. Technical Report DIT-07-007.

    Abstract

    In this paper we evaluate the performance of the highest probability SVM nearest neighbor classifier, a combination of the SVM and k-NN classifiers, on a corpus of email messages. To classify a sample, the algorithm performs the following actions: for each k in a predefined set {k1, ..., kN}, it trains an SVM model on the k nearest labelled samples and uses this model to classify the given sample; it then fits a sigmoid approximation of the probabilistic output for the SVM model and computes the probabilities of the positive and the negative answers. Finally, it selects, from the 2 × N resulting answers, the one with the highest probability. The experimental evaluation shows that this algorithm is able to achieve higher accuracy than the pure SVM classifier, at least in the case of equal error costs.
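
The final selection step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that, for each k, a local SVM has already produced a decision value for the sample and that a sigmoid of the common Platt form P(positive) = 1 / (1 + exp(A*f + B)) has already been fitted. The function names, the tuple layout, and all numeric values below are hypothetical and chosen only for illustration.

```python
import math


def platt_probability(decision_value, a, b):
    """Sigmoid estimate of P(positive | decision value).

    Assumes the Platt-style form 1 / (1 + exp(a * f + b)); the report's
    abstract only says that a sigmoid approximation is fitted.
    """
    return 1.0 / (1.0 + math.exp(a * decision_value + b))


def highest_probability_answer(candidates):
    """Pick the most probable of the 2 * N answers.

    candidates: iterable of (k, decision_value, a, b), one entry per
    neighborhood size k (hypothetical layout).
    Returns (label, probability, k), where label is +1 or -1.
    """
    best = None
    for k, f, a, b in candidates:
        p_pos = platt_probability(f, a, b)
        # Each k contributes two answers: positive and negative.
        for label, p in ((+1, p_pos), (-1, 1.0 - p_pos)):
            if best is None or p > best[1]:
                best = (label, p, k)
    return best


# Hypothetical decision values and sigmoid parameters for three values of k.
candidates = [
    (50, 0.4, -2.0, 0.0),
    (100, -1.5, -2.0, 0.0),
    (200, 0.1, -2.0, 0.0),
]
label, prob, k = highest_probability_answer(candidates)
print(label, round(prob, 4), k)
```

With these made-up inputs, the model trained on the 100 nearest samples gives the most confident answer, so its negative label is the one returned. Training the local SVMs and fitting the sigmoid parameters are outside the scope of this sketch.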

    Item Type: Departmental Technical Report
    Department or Research center: Information Engineering and Computer Science
    Subjects: Q Science > QA Mathematics > QA076 Computer software
    Report Number: DIT-07-007
    Repository staff approval on: 10 Apr 2007
