I am a researcher in the Reinforcement Learning group at Microsoft Research AI, Redmond, USA. I also work closely with the Reinforcement Learning groups at MSR NYC and the research team of Microsoft Chief Science Officer, Eric Horvitz.
I finished my PhD at the Robotics Institute, Carnegie Mellon University, USA, where I was advised by Prof. J. Andrew (Drew) Bagnell. I do fundamental as well as applied research in machine learning, control and computer vision with applications to autonomous agents in general and robotics in particular.
My interests include decison-making under uncertainty, reinforcement learning, artificial intelligence and machine learning. I regularly area-chair/review for NeurIPS, ICLR, ICML. On occasion for ICRA, IROS, IJRR, JFR, CVPR, ECCV, ICCV.
PhD in Robotics, 2015
Carnegie Mellon University
MS in Robotics, 2012
Carnegie Mellon University
Bachelor of Electrical Engineering, 2007
Delhi College of Engineering
We propose a neural architecture search (NAS) algorithm, Petridish, to iteratively add shortcut connections to existing network layers. The added shortcut connections effectively perform gradient boosting on the augmented layers. The proposed algorithm is motivated by the feature selection algorithm forward stage-wise linear regression, since we consider NAS as a generalization of feature selection for regression, where NAS selects shortcuts among layers instead of selecting features. In order to reduce the number of trials of possible connection combinations, we train jointly all possible connections at each stage of growth while leveraging feature selection techniques to choose a subset of them. We experimentally show this process to be an efficient forward architecture search algorithm that can find competitive models using few GPU days in both the search space of repeatable network modules (cell-search) and the space of general networks (macro-search). Petridish is particularly well-suited for warm-starting from existing models crucial for lifelong-learning scenarios
Developing and testing algorithms for autonomous vehicles in real world is an expensive and time consuming process. Also, in order to utilize recent advances in machine intelligence and deep learning we need to collect a large amount of annotated training data in a variety of conditions and environments. We present a new simulator built on Unreal Engine that offers physically and visually realistic simulations for both of these goals. Our simulator includes a physics engine that can operate at a high frequency for real-time hardware-in-the-loop (HITL) simulations with support for popular protocols (e.g. MavLink). The simulator is designed from the ground up to be extensible to accommodate new types of vehicles, hardware platforms and software protocols. In addition, the modular design enables various components to be easily usable independently in other projects. We demonstrate the simulator by first implementing a quadrotor as an autonomous vehicle and then experimentally comparing the software components with real-world flights.
Increasingly, real world problems require multiple predictions while traditional supervised learning techniques focus on making a single best prediction. For instance in advertisement placement on the web, a list of advertisements is placed on a page with the objective of maximizing click-through rate on that list. In this work, we build an efficient framework for making sets or lists of predictions where the objective is to optimize any utility function which is (monotone) submodular over a list of predictions. Other examples of tasks where multiple predictions are important include: grasp selection in robotic manipulation where the robot arm must evaluate a list of grasps with the aim of finding a sucessful grasp, as early on in the list as possible and trajectory selection for mobile ground robots where given the computational time limits, the task is to select a list of trajectories from a much larger set of feasible trajectories for minimizing expected cost of traversal. In computer vision tasks like frame-to-frame target tracking in video, multiple hypotheses about the target location and pose must be considered by the tracking algorithm. For each of these cases, we optimize for the content and order of the list of predictions. Crucially– and in contrast with existing work on list prediction – our approach to pre- dicting lists is based on very simple reductions of the problem of predicting lists to a series of simple classification/regression tasks. This provides powerful flexibility to use any existing prediction method while ensuring rigorous guarantees on prediction performance. We analyze these meta-algorithms for list prediction in both the online, no-regret and generalization settings. Furthermore we extend the methods to make multiple predictions in structured output domains where even a single prediction is a combinatorial object, e.g. , challenging vision tasks like semantic scene labeling and monocular pose estimation. We conclude with case studies that demonstrate the power and flexibility of these reductions in problems from document summarization, prediction of the pose of humans in images, to predicting the best set of robotic grasps and purely vision based autonomous flight in densely cluttered environments.
Sequence optimization, where the items in a list are ordered to maximize some reward has many applications such as web advertisement placement, search, and control libraries in robotics. Previous work in sequence optimization produces a static ordering that does not take any features of the item or context of the problem into account. In this work, we propose a general approach to order the items within the sequence based on the context (e.g., perceptual information, environment description, and goals). We take a simple, efficient, reduction-based approach where the choice and order of the items is established by repeatedly learning simple classifiers or regressors for each “slot” in the sequence. Our approach leverages recent work on submodular function maximization to provide a formal regret reduction from submodular sequence optimization to simple costsensitive prediction. We apply our contextual sequence prediction algorithm to optimize control libraries and demonstrate results on two robotics problems: manipulator trajectory prediction and mobile robot path planning.