Large-width bounds for learning half-spaces on distance spaces
A half-space over a distance space is a generalization of a half-space in a vector space. Given two points of the space, the half-space they define is the set of all points closer to the first point than to the second. An important advantage of a distance space over a metric space is that the distance function need not satisfy the triangle inequality, which makes our results potentially very useful in practice. In this paper we consider the problem of learning half-spaces in any finite distance space, that is, any finite set equipped with a distance function. We make use of a notion of ‘width’ of a half-space at a given point: this is the difference between the distances from the point to the two points that define the half-space. We obtain probabilistic bounds on the generalization error when learning half-spaces from samples. These bounds depend on the empirical error (the fraction of sample points on which the half-space does not achieve a large width) and on the VC-dimension of the effective class of half-spaces that achieve a large width on the sample. Unlike some previous work on learning classification over metric spaces, the bound does not involve the covering number of the space and can therefore be tighter.
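To make the definitions concrete, here is a minimal sketch (not from the paper) of half-spaces and widths over a finite distance space represented as a distance matrix `D`; the names `width`, `half_space`, and `large_width_points`, the example matrix, and the threshold `gamma` are all illustrative assumptions, and the distance need not satisfy the triangle inequality.

```python
import numpy as np

# Hypothetical finite distance space on 4 points: D[x, y] is the distance
# between points x and y. Symmetric, but no triangle inequality is assumed.
D = np.array([
    [0.0, 2.0, 5.0, 1.0],
    [2.0, 0.0, 3.0, 4.0],
    [5.0, 3.0, 0.0, 2.0],
    [1.0, 4.0, 2.0, 0.0],
])

def width(D, x, a, b):
    """Width of the half-space defined by (a, b) at point x: the difference
    between the distances from x to b and from x to a. A positive width
    means x is strictly closer to a than to b."""
    return D[x, b] - D[x, a]

def half_space(D, a, b):
    """The half-space defined by (a, b): all points closer to a than to b."""
    return {x for x in range(D.shape[0]) if width(D, x, a, b) > 0}

def large_width_points(D, a, b, gamma):
    """Points on which the half-space achieves width at least gamma; the
    fraction of sample points *not* in this set plays the role of the
    empirical error in the large-width bounds."""
    return {x for x in range(D.shape[0]) if width(D, x, a, b) >= gamma}

print(half_space(D, a=0, b=2))           # {0, 1, 3}
print(large_width_points(D, 0, 2, 2.0))  # {0}: only point 0 has width >= 2
```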
| Field | Value |
|---|---|
| Item Type | Article |
| Copyright holders | © 2018 Elsevier |
| Keywords | Large width learning, Distance and metric spaces, Half spaces, Pseudo rank, Margin |
| Departments | Mathematics |
| DOI | 10.1016/j.dam.2018.02.004 |
| Date Deposited | 26 Mar 2018 16:13 |
| Acceptance Date | 2018-02-28 |
| URI | https://researchonline.lse.ac.uk/id/eprint/87357 |