# Information Complexity in Bandit Subset Selection

1 S2A - Signal, Statistique et Apprentissage
LTCI - Laboratoire Traitement et Communication de l'Information
Abstract : We consider the problem of efficiently exploring the arms of a stochastic bandit to identify the best subset of a specified size. Under the PAC and the fixed-budget formulations, we derive improved bounds by using KL-divergence-based confidence intervals. Whereas the application of a similar idea in the regret setting has yielded bounds in terms of the KL-divergence between the arms, our bounds in the pure-exploration setting involve the Chernoff information'' between the arms. In addition to introducing this novel quantity to the bandits literature, we contribute a comparison between strategies based on uniform and adaptive sampling for pure-exploration problems, finding evidence in favor of the latter.
Keywords :
Domain :

https://hal.telecom-paris.fr/hal-02288406
Contributor : Telecomparis Hal <>
Submitted on : Saturday, September 14, 2019 - 6:46:53 PM
Last modification on : Thursday, March 5, 2020 - 4:02:38 PM

### Identifiers

• HAL Id : hal-02288406, version 1

### Citation

Emilie Kaufmann, Shivaram Kalyanakrishnan. Information Complexity in Bandit Subset Selection. Conference On Learning Theory, Jun 2013, Princeton, United States. ⟨hal-02288406⟩

Record views