Using a similarity measure for credible classification
This paper concerns classification by Boolean functions. We investigate the classification accuracy obtained by standard classification techniques on unseen points (elements of the domain {0, 1}^n, for some n) that are similar, in particular senses, to the points observed as training observations. Explicitly, we use a new measure of how similar a point x ∈ {0, 1}^n is to a set of such points to restrict the domain of points on which we offer a classification. For points that are sufficiently dissimilar, no classification is given. We report on experimental results which indicate that the classification accuracies obtained on the resulting restricted domains are better than those obtained without restriction. These experiments involve a number of standard data-sets and classification techniques. We also compare these classification accuracies with those obtained when the domain on which classification is given is instead restricted using the Hamming distance.
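The idea of withholding a classification on dissimilar points can be illustrated with the Hamming-distance baseline mentioned in the abstract. The following is a minimal sketch only, not the report's method: the new similarity measure is not defined in the abstract and is not reproduced here. The names `hamming`, `min_hamming_distance`, `restricted_classifier`, the toy `majority` rule, and the `threshold` parameter are all hypothetical, and taking the distance from a point to a set as the minimum over the set is an assumption.

```python
from typing import Callable, Optional, Sequence, Tuple

Point = Tuple[int, ...]  # an element of {0, 1}^n


def hamming(x: Point, y: Point) -> int:
    """Number of coordinates in which x and y differ."""
    return sum(a != b for a, b in zip(x, y))


def min_hamming_distance(x: Point, sample: Sequence[Point]) -> int:
    """Distance from x to the nearest training point (assumed set distance)."""
    return min(hamming(x, s) for s in sample)


def restricted_classifier(
    classify: Callable[[Point], int],
    sample: Sequence[Point],
    threshold: int,
) -> Callable[[Point], Optional[int]]:
    """Wrap a classifier so that it abstains (returns None) on points
    farther than `threshold` from every training point."""

    def predict(x: Point) -> Optional[int]:
        if min_hamming_distance(x, sample) > threshold:
            return None  # too dissimilar: no classification is given
        return classify(x)

    return predict


# Toy data and a toy Boolean classifier, both purely illustrative.
training_points = [(0, 0, 0, 0), (1, 1, 0, 0)]


def majority(x: Point) -> int:
    """Label 1 when at least half of the coordinates are 1."""
    return int(2 * sum(x) >= len(x))


predict = restricted_classifier(majority, training_points, threshold=1)
```

With these toy values, `predict((0, 0, 0, 1))` is classified because the point lies within distance 1 of a training point, while `predict((0, 0, 1, 1))` returns `None` because its nearest training point is at distance 2.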
| Item Type | Report (Technical Report) |
|---|---|
| Departments | Mathematics |
| Date Deposited | 23 Oct 2008 09:43 |
| URI | https://researchonline.lse.ac.uk/id/eprint/13927 |