MODAL'X Seminar: Evgenii Chzhen (LMO, Université Paris-Saclay)
Published on February 1, 2024
–
Updated on April 30, 2024
Small total-cost constraints in contextual bandits with knapsacks
Date(s)
May 2, 2024
1:30 pm - 2:30 pm
Location(s)
Abstract: I will discuss recent developments in the literature on contextual bandits with knapsacks (CBwK), a problem in which, at each round, a scalar reward is obtained and vector-valued costs are incurred. The goal is to maximize the cumulative reward while ensuring that the cumulative costs remain below predetermined cost constraints. In this setting, total-cost constraints previously had to be at least of order T^{3/4}, where T is the number of rounds, and were typically even assumed to grow linearly with T. After elaborating on the main technical challenge and the drawback of previous approaches, I will present a dual strategy based on projected-gradient-descent updates that can handle total-cost constraints of order T^{1/2}, up to poly-logarithmic terms. This strategy is direct and relies on a careful, adaptive tuning of the step size. The approach is inspired by parameter-free-type algorithms from the (online) convex optimization literature.
The talk is based on joint works with C. Giraud, Z. Li, and G. Stoltz.
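
To make the idea of a dual strategy with projected-gradient-descent updates more concrete, here is a minimal sketch of a generic Lagrangian scheme for bandits with knapsacks. It is not the algorithm presented in the talk: the function names, the per-round budget vector, and in particular the fixed step size are assumptions made for illustration, whereas the abstract stresses that the actual strategy tunes the step size adaptively.

```python
import numpy as np


def dual_pgd_step(lam, cost, budget_per_round, step_size):
    """One projected-gradient step on the dual variables (hypothetical helper).

    Each coordinate of lam increases when the observed cost exceeds the
    per-round budget and decreases otherwise; the result is projected back
    onto the nonnegative orthant. The step size is fixed here (assumption).
    """
    return np.maximum(0.0, lam + step_size * (cost - budget_per_round))


def pick_action(est_rewards, est_costs, lam, budget_per_round):
    """Pick the action maximizing the estimated Lagrangian payoff.

    est_rewards has shape (K,) and est_costs has shape (K, d), where K is
    the number of actions and d the number of cost constraints. How these
    estimates are obtained from the context is left out of this sketch.
    """
    scores = est_rewards - (est_costs - budget_per_round) @ lam
    return int(np.argmax(scores))
```

With all dual variables at zero, `pick_action` simply maximizes the estimated reward; as costs overshoot the per-round budget, the corresponding dual variables grow and expensive actions are penalized.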