Skip to Main content Skip to Navigation
Conference papers

Information Complexity in Bandit Subset Selection

Abstract : We consider the problem of efficiently exploring the arms of a stochastic bandit to identify the best subset of a specified size. Under the PAC and the fixed-budget formulations, we derive improved bounds by using KL-divergence-based confidence intervals. Whereas the application of a similar idea in the regret setting has yielded bounds in terms of the KL-divergence between the arms, our bounds in the pure-exploration setting involve the ``Chernoff information'' between the arms. In addition to introducing this novel quantity to the bandits literature, we contribute a comparison between strategies based on uniform and adaptive sampling for pure-exploration problems, finding evidence in favor of the latter.
Complete list of metadata
Contributor : Telecomparis Hal Connect in order to contact the contributor
Submitted on : Saturday, September 14, 2019 - 6:46:53 PM
Last modification on : Tuesday, October 19, 2021 - 11:16:16 AM


  • HAL Id : hal-02288406, version 1



Emilie Kaufmann, Shivaram Kalyanakrishnan. Information Complexity in Bandit Subset Selection. Conference On Learning Theory, Jun 2013, Princeton, United States. ⟨hal-02288406⟩



Record views