Pure exploration and regret minimization in matching bandits

Sentenac, F., Yi, J., Calauzènes, C., Perchet, V. & Vojnovic, M. (2021). Pure exploration and regret minimization in matching bandits. In Proceedings of the 38th International Conference on Machine Learning (pp. 9434-9442). Journal of Machine Learning Research.

Finding an optimal matching in a weighted graph is a standard combinatorial problem. We consider its semi-bandit version, where either a pair or a full matching is sampled sequentially. We prove that it is possible to leverage a rank-1 assumption on the adjacency matrix to reduce the sample complexity and the regret of off-the-shelf algorithms to a linear dependency on the number of vertices (up to poly-log terms).
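A minimal sketch (not the paper's algorithm) of why a rank-1 assumption helps: if the edge weights factor as W[i][j] = u[i] * v[j] with nonnegative u and v, then by the rearrangement inequality the maximum-weight perfect matching simply pairs vertices by their ranks in u and v. Learning the matching thus reduces to ranking the vertices on each side, which involves only linearly many unknowns rather than one per edge. The names below (`matching_weight`, `sorted_match`) are illustrative.

```python
import itertools
import random

random.seed(0)
n = 5

# Rank-1 weight matrix: W[i][j] = u[i] * v[j], all entries nonnegative.
u = [random.random() for _ in range(n)]
v = [random.random() for _ in range(n)]
W = [[u[i] * v[j] for j in range(n)] for i in range(n)]

def matching_weight(perm):
    """Total weight of the matching pairing left vertex i with right vertex perm[i]."""
    return sum(W[i][perm[i]] for i in range(n))

# Brute force over all n! perfect matchings.
best = max(itertools.permutations(range(n)), key=matching_weight)

# Sorting-based matching: pair the k-th smallest u with the k-th smallest v.
order_u = sorted(range(n), key=lambda i: u[i])
order_v = sorted(range(n), key=lambda j: v[j])
sorted_match = [0] * n
for i, j in zip(order_u, order_v):
    sorted_match[i] = j

# Under the rank-1 assumption, the sorting-based matching is optimal.
assert abs(matching_weight(tuple(sorted_match)) - matching_weight(best)) < 1e-9
```

This structural collapse from O(n^2) edge parameters to O(n) vertex parameters is the intuition behind the linear (up to poly-log) dependency on the number of vertices stated in the abstract.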
