The very heart of sequential decision making

Facing the traveler's tree, you wonder: which path shall I pick this time? Choosing the right alternative in an uncertain world is not easy. Advancing multi-armed bandit theory will help you.

Mathematical Statistics for Sequential Learning

The more applied you go, the stronger the theory you need. This is an equilibrium between questions and answers, between dreams and practice. Mathematics is both the door and the key to optimisation, learning guarantees and making your dreams come true.

Provably adaptive decisions in the wild

Providing algorithms with truly adaptive capabilities when facing unknown dynamics and environments. Reinforcement Learning is the basic formalism and optimism in the face of uncertainty a good tool, but robustness and adaptivity to unknown structure are the real challenges.

Sequential Learning for Sustainable Systems

Understanding the dynamics of complex systems, and how to act optimally in them, can have a huge positive impact on all aspects of human societies that require careful management of natural, energy, human and computational resources. It is our duty to answer this challenge optimally.

The wind of change - An avenue of novel applications.

Choosing which future we want to shape is as important as picturing the world we dream of beyond the existing applications of current research. From e-learning to permaculture or the circular economy, embrace the potential of sequential learning for our societies.

All you need is a deep passion for mathematics, computer science and changing the world.

On these pages, you will find information about my research activities in the broad fields of Mathematics > Statistical Theory and Computer Science > Machine Learning. You may want to read and comment on my publications, attend the Séminaires d'Apprentissage et de Statistique de l'Université Paris-Saclay, subscribe to the Probability and Statistics news mailing list, or follow many more interesting links. If you are a student looking for a research internship, go read this page.

In case you
  • believe that understanding the dynamics of complex systems, as well as how to act optimally in them, can have a huge positive impact on all aspects of human societies that require careful management of natural, energy, human and computational resources, and that it is thus our duty to answer this challenge optimally,
  • consider that for that purpose, due to the limitations of human capabilities to process large amounts of data, we should pursue the long-term development of an optimal and automatic method that can, from mere observations and interactions with a complex system, understand its dynamics and how to optimally act in it,
  • want to attack this problem by using any combination of the following four pillar domains: Machine Learning, Mathematical Statistics, Dynamical Systems and Optimization,
  • then do not hesitate to contact me, I'll be very happy to help you achieve this goal.
    Research Domains

    Streaming confident regression

    In a streaming regression setting with dependent data, how can one build a confidence distribution for the next point from the history?

    Pliable Rejection Sampling

    Using kernel estimates to make rejection sampling applicable at provably low cost.
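As background, this work builds on classical rejection sampling. A minimal sketch of the vanilla scheme, sampling a triangular density from a uniform proposal (all names here are illustrative, not taken from the paper):

```python
import random

random.seed(0)  # reproducibility of the example

def rejection_sample(target_pdf, proposal_sampler, proposal_pdf, m, n):
    """Draw n samples from target_pdf by rejection sampling.

    Assumes the envelope condition target_pdf(x) <= m * proposal_pdf(x)."""
    samples = []
    while len(samples) < n:
        x = proposal_sampler()
        # Accept x with probability target_pdf(x) / (m * proposal_pdf(x)).
        if random.random() * m * proposal_pdf(x) <= target_pdf(x):
            samples.append(x)
    return samples

# Example: sample the triangular density f(x) = 2x on [0, 1]
# from a uniform proposal (pdf = 1), with envelope constant m = 2.
xs = rejection_sample(lambda x: 2 * x, random.random, lambda x: 1.0,
                      m=2.0, n=10_000)
```

The cost of the scheme is driven by the envelope constant m (here half the proposals are rejected); pliable rejection sampling is about keeping that cost provably low via a kernel-based proposal.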

    Random Projections MCMC is hard

    Random projections may replace sub-sampling techniques for MCMC with large data. Whether this actually works is a tricky issue.

    How hard is my MDP?

    How many samples do you need for confidence tight enough to solve an MDP? The Bernstein norm of the value function helps!

    Selecting State Representations

    When you have many possible notions of state, perhaps all wrong: you don't know which is best, but still want an optimal regret guarantee.

    Sub-sampling Bandits

    A surprisingly simple bandit strategy that achieves the state of the art in a vast range of settings, without knowing the reward model.

    Sampling without replacement

    What concentration inequalities can you show when sampling without replacement?
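A quick simulation illustrates the phenomenon in question: the sample mean concentrates faster without replacement, its variance shrinking by the finite-population factor (N − n)/(N − 1). A hedged sketch (population, sample sizes and names are mine):

```python
import random
import statistics

random.seed(1)

population = list(range(100))  # finite population, true mean 49.5
n, trials = 30, 2000

# Empirical spread of the sample mean under the two sampling schemes.
with_repl = [statistics.mean(random.choices(population, k=n))
             for _ in range(trials)]
without_repl = [statistics.mean(random.sample(population, n))
                for _ in range(trials)]

# Without replacement the mean concentrates strictly faster.
print(statistics.variance(with_repl) > statistics.variance(without_repl))
```

The question of the paper is how to turn this empirical advantage into clean concentration inequalities.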

    Latent Bandits

    In recommender systems, not all features of the users may be known. Ignoring the latent features may lead to dramatic results.

    Robust risk-averse Bandits

    Choosing the right action when minimizing the risk of each trial, instead of simply the mean, and how to get near-optimal guarantees.

    Handling infinitely many state models

    Solving an RL problem in a single stream of interactions when you don't know the state model, but have infinitely many candidates.

    Better selecting the state representation

    Solving an RL problem in a single stream of interactions in an optimal way when you have not one but many plausible state models.

    Optimal bandit allocation strategy

    An old KL-based class of bandit algorithms is shown not only to be asymptotically optimal, but is also analyzed for a finite number of pulls.
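The class in question is that of KL-based upper-confidence indices, as in kl-UCB. A minimal sketch of the Bernoulli index, computed by bisection (function names are illustrative, not from the paper):

```python
import math

def kl_bernoulli(p, q):
    """KL divergence between Bernoulli(p) and Bernoulli(q)."""
    eps = 1e-12
    p = min(max(p, eps), 1 - eps)
    q = min(max(q, eps), 1 - eps)
    return p * math.log(p / q) + (1 - p) * math.log((1 - p) / (1 - q))

def kl_index(mean, pulls, t):
    """Largest q in [mean, 1] with pulls * kl(mean, q) <= log t.

    This is the upper-confidence value pulled at time t; it is found
    by bisection since kl(mean, .) is increasing on [mean, 1]."""
    level = math.log(max(t, 1.0))
    lo, hi = mean, 1.0
    for _ in range(50):  # 50 halvings give ample precision
        mid = (lo + hi) / 2
        if pulls * kl_bernoulli(mean, mid) <= level:
            lo = mid
        else:
            hi = mid
    return lo
```

As expected of a confidence index, it shrinks toward the empirical mean as the number of pulls grows: with the same mean 0.5 and horizon t = 10, an arm pulled 1000 times gets an index close to 0.5, while an arm pulled only 10 times gets a noticeably larger one.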

    Random Projections Linear Regression

    For a high-dimensional function space, how to reduce the dimension to a manageable size while preserving risk-minimization guarantees?

    Active curiosity-based sampling

    Curiosity-driven learning naturally trades off between too-complex and too-easy tasks. It is applied here to active sampling.

    Active sampling and partitioning

    Building a piecewise-constant approximation of a function, by actively sampling and refining a partition of the space in a near-optimal way.

    Finite-time optimal bandit strategy

    Proving that an old bandit strategy based on KL divergence is optimal for discrete distributions.

    Selecting the state-representation in RL

    Solving an RL problem in a single stream of interactions when you have many plausible state models and don't know which is right.

    Sparse recovery with Brownian sensing

    When compressed sensing fails because your sampling matrix has no good properties, apply Brownian sensing and you'll be fine.

    Online learning with smooth opponents

    Given a continuum of actions, an opponent chooses your feedback, only assumed to be smooth. How to get efficient, optimal actions?

    Bound for Bellman residual minimization

    In the setting of discounted MDPs, we show a generalization bound for the Bellman residual in linear approximation spaces.

    History-dependent Adaptive Bandits

    Say you face an opponent in a bandit game. Knowing her limitations, you can design an optimal strategy. What if you don't know it?

    Scrambled function spaces for regression

    Building from a large function space a subspace of manageable dimension by scrambling your basis functions.

    Random Projections for MDPs

    Applying random-projection regression to approximate the value function, which is only available via a fixed-point formulation.

    Compressed least-squares regression

    You basically build a random matrix to solve a regression problem in a smaller space, handling the approximation-error overhead.
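In code, the recipe reads roughly as follows: project the features with a Gaussian random matrix, then solve ordinary least squares in the small space. A toy sketch with a 2-dimensional projection so the least-squares step has a closed form (sizes and names are mine, not from the paper):

```python
import math
import random

random.seed(0)

# Toy regression problem: n samples in D dimensions, compressed to d = 2.
n, D, d = 200, 50, 2
X = [[random.gauss(0, 1) for _ in range(D)] for _ in range(n)]
w_true = [random.gauss(0, 1) for _ in range(D)]
y = [sum(x * w for x, w in zip(row, w_true)) + 0.1 * random.gauss(0, 1)
     for row in X]

# Gaussian random projection matrix A (D x d), scaled by 1/sqrt(d).
A = [[random.gauss(0, 1) / math.sqrt(d) for _ in range(d)] for _ in range(D)]
# Compressed design matrix Z = X A (n x d).
Z = [[sum(row[i] * A[i][j] for i in range(D)) for j in range(d)]
     for row in X]

# Ordinary least squares in the compressed space via the normal
# equations (closed form since d = 2): w = (Z^T Z)^{-1} Z^T y.
s11 = sum(z[0] * z[0] for z in Z)
s12 = sum(z[0] * z[1] for z in Z)
s22 = sum(z[1] * z[1] for z in Z)
b1 = sum(z[0] * t for z, t in zip(Z, y))
b2 = sum(z[1] * t for z, t in zip(Z, y))
det = s11 * s22 - s12 * s12
w = ((s22 * b1 - s12 * b2) / det, (s11 * b2 - s12 * b1) / det)
```

The analysis then quantifies how much prediction risk is lost by working in the d-dimensional compressed space rather than the original one.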

    Many views agreement regularization

    You observe the same data through different representations, each learner providing a different answer, and you want them to agree.

    Have a good day :)

    If you are interested in actively saving academic research in France, you may ask your university to open a “Travail de Communication de la Recherche” (T.C.R.): this is a Teaching Unit (Unité d'Enseignement) for students to practice communicating research activities.