Multi-armed Bandits

Sub-sampling for multi-armed bandits, A. Baransi, O.-A. Maillard, S. Mannor, in European Conference on Machine Learning (ECML), 2014. Publisher website HaL

Latent bandits, O.-A. Maillard and S. Mannor, in Proceedings of the International Conference on Machine Learning (ICML), 2013. Publisher website HaL

Robust risk-averse stochastic multi-armed bandits, O.-A. Maillard, in Proceedings of the International Conference on Algorithmic Learning Theory (ALT), volume 8139 of Lecture Notes in Computer Science, pages 218–233. Springer Berlin Heidelberg, 2013. Publisher website HaL

Finite-time analysis of multi-armed bandits problems with Kullback-Leibler divergences, O.-A. Maillard, R. Munos, and G. Stoltz, in Proceedings of the 24th annual Conference On Learning Theory (COLT), 2011. Publisher website HaL

Online learning in adversarial Lipschitz environments, O.-A. Maillard and R. Munos, in Proceedings of the 2010 European Conference on Machine Learning and Knowledge Discovery in Databases: Part II (ECML-PKDD), pages 305–320, Berlin, Heidelberg, 2010. Springer-Verlag. Publisher website HaL

Adaptive bandits: Towards the best history-dependent strategy, O.-A. Maillard and R. Munos, in Proceedings of the 14th International Conference on Artificial Intelligence and Statistics (AI&STATS), volume 15 of JMLR W&CP, 2011. Publisher website HaL

