NETLEARN/ORANGE workshop on Learning and Networks
took place:

At Orange Labs
38, rue du Général Leclerc
Metro: Corentin Celton (Line 12)
Google map

On Friday, 9 October 2015

Videos and presentations are available below.
Here are the results of the survey of the participants conducted after the workshop: [survey]


9h-9h30: Welcome
9h30-9h45: Presentation of the NETLEARN project (M. Coupechoux, Telecom ParisTech) [slides-Coupechoux] [video-Coupechoux]
9h45-11h15: Tutorial: Online optimization for wireless communication systems (Panayotis Mertikopoulos, INRIA) [slides-Mertikopoulos] [video-Mertikopoulos]
11h15-11h30: Break
11h30-13h: Tutorial: Portfolio of machine learning algorithms (Olivier Teytaud, INRIA) [slides-Teytaud] [video-Teytaud]
13h-14h30: Lunch
14h30-15h15: Performance characterization of stochastic games with i.i.d. states (Samson Lasaulce, CentraleSupélec, CNRS) [slides-Lasaulce] [video-Lasaulce-1] [video-Lasaulce-2]
15h15-16h: Some variations of gossip algorithms (Vivek Borkar, IIT Bombay) [slides-Borkar] [video-Borkar]
16h-16h30: Break
16h30-17h15: Time sharing policies for combining exploration and exploitation (Eitan Altman, INRIA) [slides-Altman] [video-Altman]
17h15: End

Morning: Tutorials

Panayotis Mertikopoulos (INRIA), Online optimization for wireless communication systems.

Abstract: Online optimization is the study of optimal decision-making in sequentially changing environments (such as the weather, financial markets, wireless networks, etc.). More precisely, the basic goal in online optimization is to react optimally to an unknown cost function that evolves over time due to exogenous factors that are beyond the decision-maker's control, either directly or indirectly. As such, online optimization methods are readily applicable to a wide array of systems that are neither static nor stationary, but instead evolve over time in a potentially arbitrary fashion. This tutorial talk is intended to provide a bird's eye view of online learning and online optimization for researchers and practitioners working on wireless telecommunications. More precisely, we will go over the fundamental limits of online learning, how to attain these limits in practice, and the impact of feedback imperfections and observation noise. Throughout the talk, we will eschew generality in favor of specific examples and we will focus on concrete applications to wireless channel selection, power control, energy efficiency, and other aspects of telecommunication networks.
Panayotis Mertikopoulos received his M.Sc. and M.Phil. degrees in Mathematics from Brown University in 2005 and 2006, and his Ph.D. degree from the University of Athens in 2010. In 2010–2011 he held a post-doctoral fellowship at École Polytechnique, and since 2011 he has been a CNRS researcher (CR2) at the Laboratoire d’Informatique de Grenoble in the Inria project-team MESCAL. P. Mertikopoulos has been a member of the AMS since 2003 and a member of the IEEE since 2010. He has served on the organization committee of several international conferences (ValueTools 2012, WiOpt 2013), and together with Y. Viossat, he co-founded in 2010 the Paris Working Group on Evolutionary Game Theory. He is the lead coordinator of the LACODS research project at the Université Joseph Fourier, a member of the European project NEWCOM# (FP7-NoE-318306) and of the Franco-Chilean research network “Algorithms and Dynamics for Games and Optimization”.

Olivier Teytaud (INRIA), Portfolio of machine learning algorithms.

Olivier Teytaud received the M.S. degree in computer science from the University of Normale Sup, Lyon, France, in 1998 and the Ph.D. degree from the Lyon 2 University in 2001. He is an experienced Research Fellow with the Tao team, INRIA, France. He is working on applications of optimization and machine learning in power systems.
Abstract: Many algorithms exist for handling similar problems; for example, the literature provides many optimization algorithms and many supervised learning algorithms. The best algorithm for a given task depends on many features. It is often more important to choose the best algorithm in the set than to define yet another algorithm. Portfolio methods are the art and science of choosing, possibly dynamically, the best algorithm(s) in a pool of algorithms. In the first part, we survey the state of the art and provide the terminology of portfolio methods; afterwards we focus on the uncertain case (noisy or adversarial cases).

Afternoon: Presentations

Samson Lasaulce (L2S), Performance characterization of stochastic games with i.i.d. states.

Samson Lasaulce is a CNRS Director of Research in the Laboratory of Signals and Systems (joint lab between CNRS, Supélec, and Univ. Paris Sud). He is also a Professor in the Department of Physics at Ecole Polytechnique. Before joining CNRS, he worked for five years in private R&D companies (Motorola Labs and Orange Labs). Dr. Lasaulce is the recipient of several awards. He served as an Associate Editor for the IEEE Transactions on Signal Processing (2011-2014). His current research interests lie in distributed networks, with a focus on game theory, network information theory, learning, distributed optimization, and network control for communication and energy networks. He is a co-author of the book "Game Theory and Learning for Wireless Networks: Fundamentals and Applications".

Abstract: The main purpose of this talk is to exploit some connections between stochastic games with i.i.d. states and Shannon theory to find the feasible set of average payoffs. One interesting feature of our approach is that agents are assumed to have a quite general observation structure. Power control is considered as a case study for the derived results.

Eitan Altman (INRIA), Time sharing policies for combining exploration and exploitation.

Eitan Altman received the B.Sc. degree in electrical engineering, the B.A. degree in physics, and the Ph.D. degree in electrical engineering, all from the Technion-Israel Institute, Haifa, in 1984, 1984, and 1990, respectively. In 1990, he received the B.Mus. degree in music composition from Tel-Aviv University. Since 1990, he has been a researcher at the National Research Institute in Computer Science and Control (INRIA) in Sophia-Antipolis, France. His areas of interest include networking, stochastic control and game theory. Dr. Altman has been on the editorial boards of several scientific journals: Wireless Networks (WINET), Computer Networks (COMNET), Computer Communications (Comcom), Journal of Discrete Event Dynamic Systems (JDEDS), SIAM Journal on Control and Optimization (SICON), Stochastic Models, and the Journal of Economic Dynamics and Control (JEDC). He received the Best Paper Award at the Networking 2006, Globecom 2007, and IFIP Wireless Days 2009 conferences, and is a coauthor of two papers that have received Best Student Paper awards (at QoFis 2000 and at Networking 2002). More information can be found at

Abstract: We begin by describing the framework of learning in a multiobjective environment. We relate it to a constrained optimization problem with unknown parameters. We present a possible application in SONs (Self Organising Networks) in LTE, in which one is interested in optimizing several KPIs (Key Performance Indicators) in parallel. Standard Q-learning no longer works when there is more than one KPI, since the optimality principle no longer holds. We show that control and estimation cannot be separated in the following sense: there is no guarantee that, when the estimators converge to the true parameter, the optimal policy for the problem with estimated parameters converges to the one corresponding to the real parameter. We construct a learning policy that combines exploration and exploitation and whose performance converges to that obtained when the controller has all information on the parameters from the beginning. Our approach is based on occupation measure theory and on martingale convergence.

Vivek Borkar (IIT Bombay), Some variations of gossip algorithms.

Vivek S. Borkar received his B.Tech. in Electrical Engineering from the Indian Institute of Technology, Mumbai, in 1976, his M.S. in Systems and Control Engineering from Case Western Reserve University in 1977, and his Ph.D. in Electrical Engineering and Computer Science from the University of California, Berkeley, in 1980. He has held positions at TIFR-CAM and the Indian Institute of Science in Bangalore, and is currently a Distinguished Professor at the Tata Institute of Fundamental Research, Mumbai. He has held visiting positions at the University of Twente, MIT, the University of Maryland at College Park, the University of California at Berkeley, and the University of Illinois at Urbana-Champaign. He is the recipient of several honors in India and is a Fellow of the IEEE. His research interests are in stochastic modeling and optimization, with applications to communications.

Abstract: This talk will describe distributed algorithms on networks where the interaction is through a gossip-like mechanism. Such schemes were introduced by Tsitsiklis in the mid-80s and have been extensively studied. We will focus on some 'nonlinear' counterparts. The first is a scheme in which the averaging probabilities of the gossip component are modulated by the 'state' of the algorithm. The second is one in which we replace the averaging by a fully nonlinear operator satisfying certain properties. The main thrust will be on analyzing the time asymptotics, for which a 'two time scale' interpretation can be given.