
MODAL'X Seminar: Evgenii Chzhen (LMO, Université Paris-Saclay)

Published on February 1, 2024. Updated on April 30, 2024.

Small total-cost constraints in contextual bandits with knapsacks

Date(s)

May 2, 2024

13:30 - 14:30
Location(s)

Maurice Allais Building (G)

Allais Building (G), Modal'X room
Abstract: I will discuss recent developments in the literature on contextual bandits with knapsacks (CBwK), a problem in which, at each round, a scalar reward is obtained and vector-valued costs are incurred. The goal is to maximize the cumulative reward while keeping the cumulative costs below predetermined cost constraints. In this setting, total-cost constraints so far had to be at least of order T^{3/4}, where T is the number of rounds, and were typically even assumed to depend linearly on T. After elaborating on the main technical challenge and the drawback of previous approaches, I will present a dual strategy based on projected-gradient-descent updates that can handle total-cost constraints of order T^{1/2}, up to poly-logarithmic terms. This strategy is direct and relies on a careful, adaptive tuning of the step size. The approach is inspired by parameter-free-type algorithms from the (online) convex optimization literature.

The talk is based on joint work with C. Giraud, Z. Li, and G. Stoltz.
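
For readers unfamiliar with dual strategies, the following is a minimal illustrative sketch of a generic Lagrangian approach with a projected-gradient update on the dual variable. It is not the algorithm presented in the talk. It assumes a fixed step size (the abstract instead describes a careful, adaptive tuning), ignores contexts and estimation by taking the per-round rewards and costs of each action as given, and does not enforce the budget as a hard constraint. All names (`projected_dual_step`, `run_dual_strategy`, `rewards`, `costs`, `budget`, `step_size`) are hypothetical.

```python
import numpy as np


def projected_dual_step(lam, cost, budget_per_round, step_size):
    """One gradient step on the dual variable, projected onto lam >= 0.

    The dual variable grows when the incurred cost exceeds the average
    per-round budget and shrinks (down to zero) otherwise.
    """
    return np.maximum(0.0, lam + step_size * (cost - budget_per_round))


def run_dual_strategy(rewards, costs, budget, step_size):
    """Generic dual strategy (illustrative assumption, fixed step size).

    rewards: array of shape (T, K), reward of each of K actions per round.
    costs:   array of shape (T, K, d), vector-valued cost of each action.
    budget:  array of shape (d,), total-cost constraint over the T rounds.
    """
    T, K = rewards.shape
    budget_per_round = budget / T
    lam = np.zeros(costs.shape[2])
    total_reward = 0.0
    total_cost = np.zeros(costs.shape[2])
    for t in range(T):
        # Play the action maximizing the cost-penalized (Lagrangian) reward.
        scores = rewards[t] - costs[t] @ lam
        action = int(np.argmax(scores))
        total_reward += rewards[t, action]
        total_cost += costs[t, action]
        # Update the dual variable with the cost actually incurred.
        lam = projected_dual_step(lam, costs[t, action], budget_per_round, step_size)
    return total_reward, total_cost, lam
```

In this sketch the dual variable acts as a price on cost: actions that overspend relative to the per-round budget become less attractive in later rounds. How the step size is tuned is precisely the point the abstract highlights, and it is left as a plain parameter here.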
