About me
I am a new member of Microsoft Research New England, a new lab in Cambridge, MA. I was previously an assistant professor of computer science at Georgia Tech and TTI-Chicago. I was extremely fortunate to receive my PhD at CMU under the ingenious Avrim Blum, followed by an NSF postdoc at MIT under the wise guidance of Santosh Vempala.
My main research interests are game theory (recent), machine learning, and randomized/online algorithms.
Teaching and Students
Spring 2008: Game Theory and Computer Science, (Georgia Tech)
Fall 2006: Machine Learning Theory (Weizmann Institute)
Autumn 2004: Online Algorithms (University of Chicago)
Previous students/interns: Varun Kanade (Georgia Tech), Duru Turkoglu (University of Chicago); interns at MSR: Jacob Abernethy, Azarakhsh Malekian, Ankur Moitra, Aaron Roth, Greg Valiant; interns at TTI: Elad Verbin (Tel Aviv University), Katrina Ligett (CMU), Nina Balcan (CMU), Abie Flaxman (CMU), Brendan McMahan (CMU), Claire Monteleoni (MIT)
Publications
2009
- Adam Tauman Kalai, Alex Samorodnitsky, and Shang-Hua Teng. Learning and Smoothed Analysis. To appear in Proceedings of the 50th Annual Symposium on Foundations of Computer Science (FOCS), 2009.
- Adam Tauman Kalai and Varun Kanade. Potential-based agnostic boosting. To appear in Advances in Neural Information Processing Systems (NIPS), 2009.
- Adam Tauman Kalai and Ravi Sastry. The Isotron Algorithm: High-Dimensional Isotonic Regression. To appear in Proceedings of the 22nd Annual Conference on Learning Theory (COLT), 2009.
- The Perceptron algorithm elegantly solves binary classification problems that have a margin between positive and negative examples. Isotonic regression (fitting an arbitrary increasing function in one dimension) is also a natural problem with a simple solution. By combining the two, we get a new simple regression algorithm in high dimensions, with strong guarantees.
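The combination can be sketched in a few lines. This is an illustrative sketch, not the paper's exact procedure: the averaged update, the fixed iteration count, and all function names are my own choices. Pool-adjacent-violators (PAV) supplies the one-dimensional isotonic fit, and a Perceptron-style update refines the direction w.

```python
def pav(z, y):
    """Isotonic (nondecreasing) least-squares fit of y against z via
    pool-adjacent-violators; returns fitted values in input order."""
    order = sorted(range(len(z)), key=lambda i: z[i])
    blocks = []  # each block: [sum of y, count]
    for i in order:
        blocks.append([y[i], 1])
        # pool adjacent blocks while their means are decreasing
        while len(blocks) > 1 and blocks[-2][0] * blocks[-1][1] > blocks[-1][0] * blocks[-2][1]:
            s, c = blocks.pop()
            blocks[-1][0] += s
            blocks[-1][1] += c
    fit = [0.0] * len(z)
    pos = 0
    for s, c in blocks:
        for _ in range(c):
            fit[order[pos]] = s / c
            pos += 1
    return fit

def isotron(X, y, iters=50):
    """Sketch of the Isotron idea: alternately fit a monotone link with
    PAV and update w with the resulting residuals."""
    m, d = len(X), len(X[0])
    w = [0.0] * d
    for _ in range(iters):
        z = [sum(wj * xj for wj, xj in zip(w, x)) for x in X]
        u = pav(z, y)  # current monotone link, evaluated at the data
        for j in range(d):
            w[j] += sum((y[i] - u[i]) * X[i][j] for i in range(m)) / m
    return w
```

On one-dimensional data that is monotone in -x, a few iterations suffice for the fitted link to interpolate the labels exactly.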
- Varun Kanade, Adam Kalai, and Yishay Mansour. Reliable Agnostic Learning. To appear in Proceedings of the 22nd Annual Conference on Learning Theory (COLT), 2009.
- We give efficient algorithms for learning in a model of learning with one-sided agnostic noise, such as a model of SPAM emails in which legitimate emails come from one set but SPAM emails may be distributed arbitrarily. The goal is to achieve an accuracy as high as possible while maintaining a near-zero rate of false positives. We show how to learn the class of DNFs using membership queries in this setting.
2008
- Parikshit Gopalan, Adam Tauman Kalai, and Adam R. Klivans. Agnostically Learning Decision Trees. In Proceedings of the fortieth annual ACM symposium on Theory of computing (STOC), pp. 527-536, 2008.
- We consider the problem of learning a decision tree in the presence of arbitrary noise. More formally, we are given black-box access (a type of active learning) to an arbitrary binary function on n bits, and our output is a function whose accuracy is almost as good as that of the best size-s decision tree, where accuracy is measured over the uniform distribution on inputs.
- Christian Borgs, Jennifer Chayes, Nicole Immorlica, Adam Tauman Kalai, Vahab Mirrokni, and Christos Papadimitriou. The Myth of the Folk Theorem. In Proceedings of the fortieth annual ACM symposium on Theory of computing (STOC), pp. 365-372, 2008.
- The folk theorem for repeated games suggests that repeated games should be easier to play than one-shot games. This can be made formal by giving an efficient algorithm for finding Nash equilibria in any repeated game with two players. Perhaps surprisingly, we show that for three or more players, it is as hard to find Nash equilibria in repeated games as it is in one-shot games.
- Adam Tauman Kalai, Yishay Mansour, and Elad Verbin. Agnostic Boosting and Parity Learning. In Proceedings of the fortieth annual ACM symposium on Theory of computing (STOC), pp. 629-638, 2008.
- We give the first boosting algorithm that is capable of handling arbitrary noise and boosting to optimal accuracy. We give an application to the problem of learning parity with arbitrary noise, over an arbitrary distribution.
- Reid Andersen, Christian Borgs, Jennifer Chayes, Uriel Feige, Abraham Flaxman, Adam Tauman Kalai, Vahab Mirrokni, Moshe Tennenholtz. Trust-Based Recommendation Systems: an Axiomatic Approach. To appear in Proceedings of the 17th international conference on World Wide Web (WWW), 2008.
- We cast the trust-based recommendation system problem as a voting problem on a graph with a partial +/- assignment, where each node receives a personalized recommendation. We consider natural axioms that lead to a unique algorithm, as well as alternative axioms that lead to impossibility.
2007
- Adam Tauman Kalai. Learning Nested Halfspaces and Uphill Decision Trees. In Proceedings of 20th Annual Conference on Learning Theory (COLT), pp. 378-392, 2007.
- Predicting class probabilities and other real-valued quantities is often more useful than binary classification, but comparatively little work in PAC-style learning addresses this issue. We show that two rich classes of real-valued functions are learnable in the probabilistic-concept framework of Kearns and Schapire. The classes naturally generalize halfspaces, decision lists, and SVMs.
- Sham Kakade, Adam Tauman Kalai, and Katrina Ligett. Playing Games with Approximation Algorithms. In Proceedings of the thirty-ninth annual ACM symposium on Theory of computing (STOC), pp. 546-555, 2007.
- How should one play a large repeated two-person game when one can only compute approximate best responses? How should one adaptively solve repeated optimization problems that are so complex that one can only approximately solve the offline (full-information) counterpart? This paper answers these two questions. The approach builds on a nice algorithm of Zinkevich (2003) and is very different from the previous work (for the exact cases) due to Hannan (1957) and Kalai and Vempala (2005, see below).
- Eli Ben-Sasson, Adam Tauman Kalai, and Ehud Kalai. An Approach to Bounded Rationality. In Advances in Neural Information Processing Systems 19 (NIPS), pp. 145-152, 2007.
- Many interactions in complex environments, e.g., chess, are affected by computational limitations. An extreme example is the factoring game, where the first player chooses a large number and sends it to the second player, who then attempts to factor it. Ignoring computational considerations, the second player can factor any number and win, but with computational considerations the game seems to favor the first player. This well-known issue in game theory falls under the term bounded rationality, yet there is no general model of playing games with computational limitations. (Prior work focused mainly on the case of finite automata playing repeated games like prisoner's dilemma.) We propose an extremely simple model of a game with additional costs (computational or otherwise) for each strategy. We illustrate that this model fits nicely into existing concepts in game theory such as zero-sum games, potential games, and submodular games, showing that natural learning dynamics continue to converge to equilibrium.
2006
- Ivona Bezakova, Adam Tauman Kalai, and Rahul Santhanam. Graph Model Selection using Maximum Likelihood. In Proceedings of the 23rd International Conference on Machine Learning (ICML), pp. 105-112, 2006.
- In recent years, a number of random graph models, such as preferential attachment, have been proposed as probabilistic models of large graphs. We suggest an objective method for ranking their performance on actual graphs. In particular, we look at the probability that a model assigns to a given graph, and design efficient MCMC algorithms for estimating these quantities.
- Elad Hazan, Adam Tauman Kalai, Satyen Kale, and Amit Agarwal. Logarithmic Regret Algorithms for Online Convex Optimization. In Proceedings of 19th Annual Conference on Learning Theory (COLT), pp. 499-513, 2006.
- We present improved algorithms for online convex optimization that use second-derivative information, analogous to Newton's method.
- Adam Tauman Kalai and Santosh Vempala. Simulated Annealing for Convex Optimization. Mathematics of Operations Research, 31(2), pp. 253-266, 2006. (15-min ppt, 50-min ppt)
- Simulated annealing is a general-purpose optimization technique that has proven useful in practice, but is notoriously difficult to analyze. We show that it can be used for the problem of minimizing a linear function over a convex set, where the set is specified only by a membership oracle. Moreover, its guaranteed convergence rate is faster than that of any known alternative algorithm.
2005
- Abie Flaxman, Adam Tauman Kalai, and Brendan McMahan. Online Convex Optimization in the Bandit Setting: Gradient Descent Without a Gradient. In Proceedings of the Sixteenth Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), pp. 385-394, 2005. (20-min ppt, 50-min ppt)
- In many problems, a decision-maker has to make a sequence of decisions online, with limited feedback. For example, consider a company that has to choose how much to spend on advertising through a number of channels each month. The feedback the company receives may be only its total profit that period. Such feedback is indirect and noisy -- it may be influenced more by the economy than by the company's choices during that period. To model such problems, we consider a fixed convex feasible set. On the ith period, the decision maker chooses a feasible point x_i, and receives some reward, f_i(x_i), where f_i is any bounded concave function. The only feedback the decision maker receives is this reward f_i(x_i) -- no other information is revealed about f_i. We give an efficient algorithm that guarantees an average reward approaching the best average reward achievable by any single feasible point, where the best is chosen with the benefit of hindsight. These guarantees hold for an arbitrary sequence of bounded concave functions.
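The heart of the approach can be sketched as follows (an illustrative sketch under assumptions of my own: a unit-ball domain, Gaussian-normalized random directions, and fixed step sizes, none of which come from the paper). A single evaluation of the reward at a randomly perturbed point yields an estimate of the gradient of a smoothed version of the reward, which then drives ordinary projected gradient ascent.

```python
import math
import random

def bandit_gradient_ascent(reward, T, d=2, delta=0.1, eta=0.01):
    """Gradient ascent from one-point feedback (a sketch of the idea).
    The domain here is the Euclidean unit ball; we keep the center point
    in the radius-(1-delta) ball so the perturbed query stays feasible."""
    x = [0.0] * d
    total = 0.0
    for _ in range(T):
        # uniform random direction on the unit sphere
        u = [random.gauss(0.0, 1.0) for _ in range(d)]
        norm = math.sqrt(sum(c * c for c in u))
        u = [c / norm for c in u]
        query = [xi + delta * ui for xi, ui in zip(x, u)]
        r = reward(query)                        # the only feedback observed
        total += r
        g = [(d / delta) * r * ui for ui in u]   # one-point gradient estimate
        x = [xi + eta * gi for xi, gi in zip(x, g)]
        nx = math.sqrt(sum(c * c for c in x))
        if nx > 1.0 - delta:                     # project back into the ball
            x = [c * (1.0 - delta) / nx for c in x]
    return x, total / T
```

The key fact, easy to check on a linear function f(x) = a.x, is that the expectation of (d/delta) f(x + delta u) u over random unit directions u equals the gradient of a smoothed version of f; for linear f it is exactly a.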
- Sham Kakade and Adam Tauman Kalai. From Batch to Transductive Online Learning. In Advances in Neural Information Processing Systems 18, 2006 (NIPS 05). (20-min NIPS ppt)
- We consider the online transductive learning model (Ben-David, Kushilevitz and Mansour, 1995), where an arbitrary sequence of examples is presented to a learner, one at a time. The learner must predict each label, online, given only the labels of the previous examples and the future unlabeled examples. In contrast, more standard i.i.d. models of learning assume that examples are drawn independently and identically distributed from a fixed distribution. We give an algorithm that employs unlabeled data to convert any efficient learning algorithm in an i.i.d. setting to an efficient learning algorithm in the more difficult online transductive model.
- Adam Tauman Kalai, Adam Klivans, Yishay Mansour, and Rocco Servedio. Agnostically Learning Halfspaces. In Proceedings of the 46th Annual Symposium on the Foundations of Computer Science (FOCS), pp. 11-20, 2005. (long version) (15-min FOCS ppt)
- The agnostic model of learning (Kearns, Schapire, and Sellie, 1992) is a natural model of learning where one aims to do as well as the best function in a given class of functions. Unfortunately, there are many previous negative and impossibility results for agnostic learning, due to computational intractability. We give a new algorithm for agnostic learning, based on the celebrated learning algorithm of Linial, Mansour, and Nisan. We show that, under mild distributional assumptions, it learns to predict as well as the best halfspace in n dimensions. The algorithm does not suffer from the "curse of dimensionality" -- it runs in time polynomial, rather than exponential, in the dimension n.
- Sanjoy Dasgupta, Adam Tauman Kalai, and Claire Monteleoni. Analysis of Perceptron-Based Active Learning. In Proceedings of 18th Annual Conference on Learning Theory (COLT), 2005.
- In active learning, the set of unlabeled examples is known in advance, and the learner can choose which examples are labeled as training data. We give a simple and efficient active learning procedure based on the classic Perceptron algorithm. Under distributional assumptions, the number of labels needed to achieve a desired error rate grows only logarithmically in the inverse of that error rate.
- Adam Tauman Kalai and Rocco Servedio. Boosting in the Presence of Noise. Journal of Computer and System Sciences 71(3): 266-290, 2005. (STOC version)
- Boosting is a technique that attempts to improve the accuracy of any learning algorithm. It has both strong theoretical guarantees as well as overwhelming empirical success. One often-cited weakness of boosting is its inability to deal with noisy data, even when the noise is chosen completely randomly and independently. Following work of Kearns, Mansour, and McAllester, we show that a simple divide-and-conquer boosting algorithm can boost optimally in the presence of noise.
- Adam Tauman Kalai and Santosh Vempala. Efficient Algorithms for On-line Optimization. Journal of Computer and System Sciences 71(3): 291-307, 2005. (COLT version)
- In online optimization, one wants to solve a sequence of combinatorial optimization problems in the face of uncertainty. For example, each day one wants to choose a path to drive to work, and only afterwards, one finds out the times on the various roads. Suppose that one can efficiently solve the one-shot version of the optimization problem, e.g., one can easily find the shortest path in a graph with known times. We give an efficient procedure for adaptively choosing solutions to a sequence of such problems, where the average quality of the solution approaches that of the best single solution. This holds for an arbitrary sequence of problems, yet guarantees are given with respect to the best solution, chosen with the benefit of hindsight. Our algorithm is as efficient as the best offline solution, i.e., the fastest algorithm to solve the problem if the entire sequence were known in advance.
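The follow-the-perturbed-leader idea behind this reduction can be sketched in the simplest setting, where the "combinatorial" decisions are just n experts and the offline solver is an argmin; the perturbation distribution and scale below are illustrative choices, not the paper's tuned parameters.

```python
import random

def fpl(cost_rounds, epsilon=1.0, seed=0):
    """Follow the Perturbed Leader: each round, solve the *offline* problem
    on randomly perturbed cumulative costs. Here the offline solver is an
    argmin over n experts; in general it can be any fast offline algorithm
    (e.g., a shortest-path routine)."""
    rng = random.Random(seed)
    n = len(cost_rounds[0])
    cum = [0.0] * n
    total = 0.0
    picks = []
    for costs in cost_rounds:
        # perturb the cumulative costs, then call the offline solver
        perturbed = [cum[i] - rng.uniform(0.0, 1.0 / epsilon) for i in range(n)]
        i = min(range(n), key=perturbed.__getitem__)
        picks.append(i)
        total += costs[i]
        for j in range(n):
            cum[j] += costs[j]
    return total, picks
```

Once one expert's cumulative cost falls behind another's by more than the perturbation scale, the algorithm locks onto the leader, which is what drives the hindsight guarantee.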
2004
- Adam Tauman Kalai. Learning Monotonic Linear Functions. In Proceedings of 17th Annual Conference on Learning Theory, pp. 487-501, 2004. (50-minute ppt)
- Generalized Additive Models (Hastie and Tibshirani) are a broad generalization of both linear and logistic regression. We give the first computationally efficient algorithm that provably learns this class of models. It is efficient in two senses: its runtime is polynomial in the size of the training data, and its error approaches the optimum at an inverse-polynomial rate.
2003
- Avrim Blum, Suchi Chawla, and Adam Tauman Kalai. Static Optimality and Dynamic Search Optimality in Lists and Trees. Algorithmica 36:3, pp. 249-260, 2003. (SODA version)
- Splay Trees are an efficient algorithm for maintaining a dynamic binary search tree, popular both in theory and practice. They are guaranteed to be only a constant factor worse than the best possible static binary search tree. We give algorithms that are within a (1+eps) factor of the best fixed binary search tree, with similar results for lists.
- Avrim Blum, Adam Tauman Kalai, and Hal Wasserman. Noise-Tolerant Learning, the Parity Problem, and the Statistical Query Model. JACM 50(4):506--519, 2003. (STOC version)
- The parity with noise learning problem is one of the most beautiful problems in computer science that is believed to be hard. It is not only believed to be hard on worst-case instances but also on random instances. An efficient algorithm for this problem would find applications in many fields, such as error-correcting codes. We give the fastest-known algorithm. While it is not truly efficient, it is the first solution that is faster than brute force.
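To make the problem concrete, here is a self-contained toy setup. This is not the paper's algorithm, which is substantially cleverer; it is the brute-force baseline that the paper improves on: generate noisy parity examples, then try all 2^n candidate parities and keep the most consistent one.

```python
import itertools
import random

def noisy_parity_samples(secret, m, noise, rng):
    """m uniformly random examples labeled by the parity of the bits
    selected by 'secret', each label flipped with probability 'noise'."""
    n = len(secret)
    samples = []
    for _ in range(m):
        x = [rng.randint(0, 1) for _ in range(n)]
        y = sum(a * b for a, b in zip(secret, x)) % 2
        if rng.random() < noise:
            y ^= 1
        samples.append((x, y))
    return samples

def brute_force_parity(samples, n):
    """Try all 2^n candidate parities; return the most consistent one.
    This is the exponential-time baseline the paper beats."""
    best, best_agree = None, -1
    for cand in itertools.product([0, 1], repeat=n):
        agree = sum(1 for x, y in samples
                    if sum(a * b for a, b in zip(cand, x)) % 2 == y)
        if agree > best_agree:
            best, best_agree = list(cand), agree
    return best
```

With modest noise and enough samples, the true parity agrees with roughly a (1 - noise) fraction of the labels while every other parity agrees with about half, so brute force recovers the secret with high probability.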
- Adam Tauman Kalai. Generating Random Factored Numbers, Easily. Journal of Cryptology 16(4):287-289, 2003. (SODA version)
- Our goal is to generate a uniformly random number between 1 and N, along with its prime factorization. Since there are no known efficient factoring algorithms, one cannot simply generate a random number and then factor it. Eric Bach's award-winning dissertation introduces and solves this problem. We present a four-line solution that is still efficient.
<2003
- Adam Tauman Kalai. Efficient Pattern-Matching with Don't Cares. In Proceedings of the Thirteenth Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), pp. 655-656, 2002.
- We give a randomized algorithm for the string matching with don't cares problem, the standard string matching problem with wildcards. We present a four-line solution that is slightly faster and much simpler than previous algorithms.
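The paper's solution is randomized; for illustration, here is the closely related deterministic sum-of-squares identity (due to Clifford and Clifford), evaluated directly in O(nm) below. A match at shift j means the sum over i of p'_i * t'_{i+j} * (p_i - t_{i+j})^2 is zero, where p' and t' are 0/1 masks that zero out the don't-care positions; expanding the square turns each term into a cross-correlation, so FFTs bring the cost down to O(n log m).

```python
def wildcard_match(pattern, text, wild='?'):
    """All shifts j where pattern matches text[j:j+m], with 'wild'
    matching any character (in either string). Direct O(nm) evaluation
    of the sum-of-squares identity; each expanded term is a
    cross-correlation, so FFTs would give O(n log m)."""
    m, n = len(pattern), len(text)
    p = [0 if c == wild else ord(c) for c in pattern]
    pm = [0 if c == wild else 1 for c in pattern]   # 0/1 mask for pattern
    t = [0 if c == wild else ord(c) for c in text]
    tm = [0 if c == wild else 1 for c in text]      # 0/1 mask for text
    matches = []
    for j in range(n - m + 1):
        s = sum(pm[i] * tm[i + j] * (p[i] - t[i + j]) ** 2 for i in range(m))
        if s == 0:
            matches.append(j)
    return matches
```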
- Adam Tauman Kalai. Better computers for better people. Ph.D. thesis, technical report CMU-CS-01-132, 2001.
- Adam Tauman Kalai and Santosh Vempala. Efficient Algorithms for Universal Portfolios. Journal of Machine Learning Research 3(3):423--440, 2002. (FOCS version)
- The Universal Portfolio is an investment strategy introduced by Thomas Cover. Previously, all known implementations required computation exponential in the number of stocks. In this paper, we use random walks for an implementation that is polynomial in the number of stocks.
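For intuition, here is the naive discretization for two stocks that the paper's random-walk approach replaces; a grid over rebalancing fractions is tractable only for very few stocks, which is exactly the exponential blow-up the paper avoids. With a uniform prior, the universal portfolio's wealth is the average wealth of all constant-rebalanced portfolios (CRPs).

```python
def universal_wealth_2stocks(price_relatives, K=100):
    """Wealth of the (discretized) universal portfolio for two stocks:
    the average, over a grid of rebalancing fractions b, of the wealth
    of the CRP that keeps fraction b in stock 1 each period.
    Returns (universal wealth, best single grid-CRP wealth)."""
    grid = [k / K for k in range(K + 1)]
    wealth = [1.0] * len(grid)
    for x1, x2 in price_relatives:          # per-period price relatives
        for i, b in enumerate(grid):
            wealth[i] *= b * x1 + (1 - b) * x2
    return sum(wealth) / len(wealth), max(wealth)
```

Since the universal wealth is an average of K+1 nonnegative CRP wealths, it is automatically at least a 1/(K+1) fraction of the best grid CRP, mirroring the flavor of Cover's regret bound.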
- Adam Tauman Kalai and Ehud Kalai. Strategic Polarization. Journal of Mathematical Psychology, 45:4, pp. 656-663, 2001.
- An example of polarization is a marriage in which one spouse gives generously to charity while the other donates nothing. This may misrepresent what is, in actuality, a small discrepancy in preferences. It may be that the donating spouse would like to see 10% of their combined income go to charity each year, while the apparently frugal spouse would like to see 9% donated. A simple game-theoretic analysis suggests that the spouses will end up donating 10% and 0%, respectively. This paper gives a strategic (game-theoretic) analysis of polarization.
- Avrim Blum, Carl Burch, and Adam Tauman Kalai. Finely Competitive Paging. In Proceedings of the 40th Annual Symposium on the Foundations of Computer Science (FOCS), pp. 450-458, 1999.
- In the on-line paging problem, one has a fixed-size cache. Each time a page is requested that is not in the cache, it has to be brought in at a cost of 1 and another page has to be evicted from the cache. We present an algorithm that has the same worst-case performance as the previous algorithms, but for a large and important class of request sequences, is superior.
- Heung-Yeung Shum, Adam Tauman Kalai, and Steve Seitz. Omnivergent Stereo. International Journal of Computer Vision 48:3, pp 159-172, 2002. (also at ICCV 1999)
- We describe the design of an optimal camera for 3D reconstruction. In our model, a camera may capture a fixed number of pixels (light rays) from anywhere inside the circular frame of the camera. Afterwards, points outside the camera must be reconstructed as accurately as possible. We show that capturing tangent rays (in both directions) gives optimal uniform reconstructions.
- Avrim Blum, Adam Tauman Kalai, and John Langford. Beating the Holdout: Bounds for K-Fold and Progressive Cross-Validation. In Proceedings of the Twelfth Annual Conference on Computational Learning Theory (COLT), 1999.
- K-Fold cross-validation is a popular technique in machine learning for estimating the performance of a learned hypothesis on a data set. We provide the first theoretical justification for this method that shows that it is, on average, more accurate than a held-out test set of comparable size.
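A minimal sketch of the estimator in question (the fold construction and the toy majority-vote learner are illustrative choices of mine, not from the paper): split the data into k folds, train on k-1 of them, measure loss on the held-out fold, and average over the k choices.

```python
def kfold_error(data, k, train_fn, loss_fn):
    """Average held-out loss over k folds: the K-fold estimate of the
    error of the hypothesis produced by train_fn."""
    folds = [data[i::k] for i in range(k)]
    fold_errors = []
    for i in range(k):
        held_out = folds[i]
        train = [ex for j in range(k) if j != i for ex in folds[j]]
        h = train_fn(train)                       # learn on the other folds
        fold_errors.append(sum(loss_fn(h, ex) for ex in held_out) / len(held_out))
    return sum(fold_errors) / len(fold_errors)
```

For example, with labels alone, a majority-vote "learner" and 0/1 loss make the estimate easy to compute by hand and compare against the true error.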
- Stan Chen, Adam Tauman Kalai, Avrim Blum, and Roni Rosenfeld. On-line Algorithms for Combining Language Models. In Proceedings of the International Conference on Acoustics, Speech, and Signal Processing, 1999.
- We show how Cover's Universal Portfolios can be applied to the field of language modeling. The goal of a language model is to accurately predict the next word in a sequence of words, with applications to speech recognition and data compression. In this setting, the universal guarantees are very striking. We support this with experiments on several different corpora.
- Avrim Blum and Adam Tauman Kalai. Universal Portfolios With and Without Transaction Costs. Machine Learning 35:3, pp 193-205, 1999. (15-min COLT slides).
- We present a much simpler analysis of Thomas Cover's Universal Portfolios. This analysis easily extends to the case of transaction costs.
- Adam Tauman Kalai and Mel Siegel. Improved Rendering of Parallax Panoramagrams for a Time-Multiplexed Autostereoscopic Display. In Stereoscopic Displays and Applications IX: Proceedings of SPIE 3295, 1998.
- We describe a new kind of 3D display that is viewable without glasses.
- Avrim Blum and Adam Tauman Kalai. A Note on Learning from Multiple-Instance Examples. Machine Learning 30:1, pp. 23-30, 1998.
- In the multiple-instance learning setting introduced by Dietterich et al, the example data consist of n-tuples of points. An n-tuple is labeled positive if any of the n constituent points are positive. We show that this problem can be reduced to that of learning in the presence of noise.