August 2011-June 2016. With Prof. Y. Narahari from Department of Computer Science and Automation, Indian Institute of Science (IISc), Bangalore, Karnataka.

Title: Mechanisms for Stochastic Multi-armed Bandit Problems.

Abstract: The multi-armed bandit (MAB) problem is a widely studied problem in the machine learning literature, in the context of online learning. The MAB problem becomes more relevant but at the same time more challenging when the arms are controlled by strategic agents. Dealing with strategic agents warrants the use of game theory and mechanism design principles in conjunction with online learning, and leads to non-trivial technical challenges. This thesis focuses on designing mechanisms for stochastic MAB problems where the rewards produced are stochastic. Our work is motivated by many contemporary, practical applications such as internet advertising, crowdsourcing, procurement, and smart grids. The work offers novel mechanisms for stochastic MAB problems in various representative, generic settings inspired by the above applications. Our contributions include the following.

  • Design of an Exploration-Separated, Contextual MAB Mechanism Motivated by demand management in smart grids, we propose a novel exploration separated MAB mechanism to incentivize strategic users to reduce their demands when there is a shortfall due to grid failures or time varying supply. The proposed mechanism incorporates realistic features of the demand response problem. A time varying and non-linear cost function in this setting throws up non-trivial challenges for mechanism design.
  • Design of a Low Regret, Combinatorial, Randomized MAB Mechanism Existing work on mechanisms for stochastic MAB problems deals with settings where either a single agent needs to be selected at a time or a subset of top-ranking agents needs to be selected at a time. Our work non-trivially generalizes this to a situation where a subset of agents needs to be selected so as to solve a general combinatorial optimization problem. Motivated by an application in crowdsourcing, we consider a problem of selecting an optimal subset of workers for each crowdsourced task so that the outcome obtained after aggregating the labels from the selected workers guarantees a target accuracy level in a cost optimal way.
  • Design of a Low Regret, Deterministic MAB Mechanism There are two broad classes of mechanisms for stochastic MAB problems: (a) deterministic, which invariably suffer from high regret, and (b) randomized, which often suffer from high variance in payments. We propose a deterministic, low regret, Upper Confidence Bound (UCB) based mechanism by working with a slightly weaker notion of truthfulness which serves the purpose perfectly for crowdsourcing applications.
  • Design of a Low Regret, Thompson Sampling based MAB Mechanism Many existing MAB mechanisms use Upper Confidence Bound (UCB) based algorithms, which provide a frequentist approach for learning the parameters of the distributions. We develop, for the first time, a Thompson sampling based (Bayesian) approach to MAB mechanism design. The randomized nature of Thompson sampling creates unique challenges for mechanism design but offers many attractive advantages. The mechanism proposed has immediate applications to procurement auctions.

  • M.E.

    August 2008-2010. With Prof. Shirish Shevade from Department of Computer Science and Automation, Indian Institute of Science (IISc), Bangalore, Karnataka.

    Title: An Efficient Algorithm for online SVM Classifier Design

    In this work we designed an efficient algorithm for online SVM classifier without losing accuracy. We designed the concept of a margin band and adapted this margin band according to the incoming data so that the algorithm gives low test error and is efficient both in terms of space and time. We then applied an improved version of SMO algorithm to decide which example falls into the margin band, which is a support vector and which can be discarded.


    August 2004-2008. Shri Govindram Seksaria Institute of Technology and Science (SGSITS), Indore, Madhya Pradesh