Skip to Main content Skip to Navigation
Conference papers

Information Complexity in Bandit Subset Selection

Abstract : We consider the problem of efficiently exploring the arms of a stochastic bandit to identify the best subset of a specified size. Under the PAC and the fixed-budget formulations, we derive improved bounds by using KL-divergence-based confidence intervals. Whereas the application of a similar idea in the regret setting has yielded bounds in terms of the KL-divergence between the arms, our bounds in the pure-exploration setting involve the ``Chernoff information'' between the arms. In addition to introducing this novel quantity to the bandits literature, we contribute a comparison between strategies based on uniform and adaptive sampling for pure-exploration problems, finding evidence in favor of the latter.
Complete list of metadatas

https://hal.telecom-paris.fr/hal-02288406
Contributor : Telecomparis Hal <>
Submitted on : Saturday, September 14, 2019 - 6:46:53 PM
Last modification on : Friday, July 31, 2020 - 11:28:08 AM

Identifiers

  • HAL Id : hal-02288406, version 1

Collections

Citation

Emilie Kaufmann, Shivaram Kalyanakrishnan. Information Complexity in Bandit Subset Selection. Conference On Learning Theory, Jun 2013, Princeton, United States. ⟨hal-02288406⟩

Share

Metrics

Record views

24