Pure exploration and regret minimization in matching bandits

Sentenac, Flore; Yi, Jialin; Calauzènes, ‪Clément; Perchet, Vianney; and Vojnovic, MilanORCID logo (2021) Pure exploration and regret minimization in matching bandits. In: Proceedings of the 38th International Conference on Machine Learning. Proceedings of Machine Learning Research, 139 . Journal of Machine Learning Research, pp. 9434-9442.
Copy

Finding an optimal matching in a weighted graph is a standard combinatorial problem. We consider its semi-bandit version where either a pair or a full matching is sampled sequentially. We prove that it is possible to leverage a rank-1 assumption on the adjacency matrix to reduce the sample complexity and the regret of off-the-shelf algorithms up to reaching a linear dependency in the number of vertices (up to poly log terms).

picture_as_pdf

picture_as_pdf
subject
Published Version

Download

Atom BibTeX OpenURL ContextObject in Span OpenURL ContextObject Dublin Core MPEG-21 DIDL Data Cite XML EndNote HTML Citation METS MODS RIOXX2 XML Reference Manager Refer ASCII Citation
Export

Downloads