Department of Electrical Engineering and Computer Science,
Massachusetts Institute of Technology
Title: Intelligent Robots Redux
Abstract: The fields of AI and robotics have made great advances in many individual subfields, including motion planning, symbolic planning, probabilistic reasoning, perception, and learning. Our goal is to develop an integrated approach to solving very large problems that are hopelessly intractable to solve optimally. We make a number of approximations during planning, including serializing subtasks, factoring distributions, and determinizing stochastic dynamics, but regain robustness and effectiveness through a continuous state-estimation and replanning process. I will describe our application of these ideas to an end-to-end mobile manipulation system, as well as ideas for current and future work on improving correctness and efficiency through learning.
Bio: Leslie is a Professor at MIT. She has an undergraduate degree in Philosophy and a PhD in Computer Science from Stanford, and was previously on the faculty at Brown University. She was the founding editor-in-chief of the Journal of Machine Learning Research. She is not a robot.
Host: Matthew Walter
Natural Language Processing Group,
Title: Reading Comprehension and Natural Language Inference
Abstract: Much of computational linguistics and text understanding sits either towards one end of the spectrum, where there is no representation of compositional linguistic structure (bag-of-words models), or near the other extreme, where very complex representations are employed (first-order logic, AMR, HPSG, …). A unifying theme of much of my recent work is to explore models with just a little bit of appropriate linguistic structure. I will focus here on recent case studies in reading comprehension and question answering, exploring the use of both natural logic and deep learning methods for these tasks.
Enabling a computer to understand a document so that it can answer comprehension questions is a central, yet still unsolved goal of NLP. I’ll first introduce our recent work on the DeepMind QA dataset, a recently released dataset of millions of examples constructed from news articles. On the one hand, we show that (simple) neural network models are surprisingly good at solving this task and achieve state-of-the-art accuracies; on the other hand, we did a careful hand-analysis of a small subset of the problems, and we argue that we are quite close to a performance ceiling on this dataset, and that it is still quite far from genuine deep / complex understanding. I will then turn to the use of Natural Logic, a weak proof theory on surface linguistic forms which can nevertheless model many of the common-sense inferences that we wish to make over human language material. I will show how it can support common-sense reasoning and be part of a more linguistically based approach to open information extraction which outperforms previous systems. I show how to augment this approach with a shallow lexical classifier to handle situations where we cannot find any supporting premises. With this augmentation, the system gets very promising results on answering 4th grade science questions, improving over the classifier in isolation, a strong IR baseline, and prior work. Finally, I will look at how we can incorporate more of the compositional structure of language, which is standardly used in logical approaches to understanding, into a deep learning model. I will emphasize some recent work which shows how that can be done quite efficiently by building the structure like a shift-reduce parser, and how the resulting system can produce stronger results than a sequence model on a natural language inference task.
The talk will include joint work with Gabor Angeli, Danqi Chen, and Sam Bowman.
Bio: Christopher Manning is a professor of computer science and linguistics at Stanford University. He received his Ph.D. from Stanford in 1995, and he held faculty positions at Carnegie Mellon University and the University of Sydney before returning to Stanford. His research goal is computers that can intelligently process, understand, and generate human language material. Manning concentrates on machine learning approaches to computational linguistic problems, including syntactic parsing, computational semantics and pragmatics, textual inference, machine translation, and using deep learning for NLP. He is an ACM Fellow, a AAAI Fellow, and an ACL Fellow, and has coauthored leading textbooks on statistical natural language processing and information retrieval. He is a member of the Stanford NLP group (@stanfordnlp).
Host: Kevin Gimpel
Department of Brain and Cognitive Sciences,
Massachusetts Institute of Technology
Photo credit: Azeddine Tahiri
Title: Engineering and reverse-engineering common sense
Abstract: Many recent successes in computer vision, machine learning and other areas of artificial intelligence have been driven by methods for sophisticated pattern recognition, such as deep neural networks. But human intelligence is more than just pattern recognition. In particular, it depends on a suite of commonsense capacities for modeling the world: for explaining and understanding what we see, imagining things we could see but haven’t yet, solving problems and planning actions to make these things real, and building new models as we learn more about the world. I will talk about how we are beginning to capture these distinctively human capacities in computational models using the tools of probabilistic programs and program induction, embedded in a Bayesian framework for inference from data. These models help to explain how humans can perceive rich three-dimensional structure in visual scenes and objects, perceive and predict objects’ motion based on their intrinsic physical characteristics, and learn new visual object concepts from just one or a few examples.
Bio: Josh Tenenbaum is Professor of Computational Cognitive Science in the Department of Brain and Cognitive Sciences at MIT, a principal investigator at MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL), and a thrust leader in the Center for Brains, Minds and Machines (CBMM). His research centers on perception, learning, and common-sense reasoning in humans and machines, with the twin goals of better understanding human intelligence in computational terms and building more human-like intelligence in machines. The machine learning and artificial intelligence algorithms developed by his group are currently used by hundreds of other science and engineering groups around the world.
Tenenbaum received his PhD from MIT in 1999, and was an Assistant Professor at Stanford University from 1999 to 2002 before returning to MIT. His papers have received awards at the Cognitive Science (CogSci), Computer Vision and Pattern Recognition (CVPR), Neural Information Processing Systems (NIPS), and Uncertainty in Artificial Intelligence (UAI) conferences, the International Conference on Learning and Development (ICDL) and the International Joint Conference on Artificial Intelligence (IJCAI). He has given invited keynote talks at all of the major machine learning and artificial intelligence conferences, as well as the main meetings of the Cognitive Science Society, the Cognitive Development Society, and the Society for Mathematical Psychology, and has held distinguished lectureships at Stanford University, the University of Amsterdam, McGill University, the University of Pennsylvania, the University of California, San Diego, and the University of Arizona. He is the recipient of the Early Investigator Award from the Society of Experimental Psychologists, the Distinguished Scientific Award for Early Career Contribution to Psychology from the American Psychological Association, and the Troland Research Award from the National Academy of Sciences, and is a fellow of the Society of Experimental Psychologists and the Cognitive Science Society.
Host: Nati Srebro
Computer Science and Artificial Intelligence Lab,
Massachusetts Institute of Technology
Title: Fast Algorithms for Structured Sparsity
Abstract: Sparse representations of signals (i.e., representations that have only a few non-zero or large coefficients) have emerged as powerful tools in signal processing theory, algorithms, machine learning and other applications. However, real-world signals often exhibit rich structure beyond mere sparsity. For example, a natural image, once represented in the wavelet domain, often has the property that its large coefficients occupy a subtree of the wavelet hierarchy, as opposed to arbitrary positions. A popular approach to capturing this type of additional structure is to model the support of the signal of interest (i.e., the set of indices of large coefficients) as belonging to a particular family of sets. Computing a sparse representation of the signal then corresponds to the problem of finding the support from the family that maximizes the sum of the squares of the selected coefficients. Such a modeling approach has proved to be beneficial in a number of applications including compression, de-noising, compressive sensing and machine learning. However, the resulting optimization problem is often computationally difficult or intractable, which is undesirable in many applications where large signals and datasets are commonplace.
In this talk, I will outline some of the past and more recent algorithms for finding structured sparse representations of signals, including piecewise constant approximations, tree-sparse approximations and graph-sparse approximations. If time allows, I will also mention our recent work that generalizes sparse supports to arbitrary subspace approximations, enabling applications such as low-rank approximation of matrices. The algorithms borrow several techniques from combinatorial optimization (e.g., dynamic programming), graph theory, and approximation algorithms. For many problems the algorithms run in (nearly) linear time, which makes them applicable to very large datasets.
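As a rough illustration of the support-selection problem stated in the abstract (choose, from a family of allowed supports, the one that maximizes the sum of squares of the selected coefficients), here is a minimal Python sketch. The family used, all contiguous index windows of length `k`, is an assumption picked only because it is easy to solve exactly in linear time with prefix sums; it is not one of the models or algorithms from the talk, and the function names `best_interval_support` and `project_onto_interval_model` are hypothetical.

```python
from itertools import accumulate


def best_interval_support(x, k):
    """Find the length-k contiguous window of x with the largest sum of squares.

    Returns (start_index, energy). Ties are broken in favor of the
    leftmost window. The "contiguous window" family is an illustrative
    assumption, not a model taken from the talk.
    """
    if not 1 <= k <= len(x):
        raise ValueError("k must be between 1 and len(x)")
    # prefix[i] holds the sum of squares of x[0:i].
    prefix = [0.0] + list(accumulate(v * v for v in x))
    # Energy of the window starting at i is prefix[i + k] - prefix[i].
    energies = [prefix[i + k] - prefix[i] for i in range(len(x) - k + 1)]
    start = max(range(len(energies)), key=energies.__getitem__)
    return start, energies[start]


def project_onto_interval_model(x, k):
    """Keep the coefficients on the best window and zero out the rest."""
    start, _ = best_interval_support(x, k)
    return [v if start <= i < start + k else 0.0 for i, v in enumerate(x)]


if __name__ == "__main__":
    signal = [1, 0, 3, 4, 0, 2]
    print(best_interval_support(signal, 2))
    print(project_onto_interval_model(signal, 2))
```

Richer families such as the tree-sparse and graph-sparse models mentioned in the abstract replace the simple window scan with more involved combinatorial routines, but the objective being maximized has the same form.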
Joint work with Chinmay Hegde and Ludwig Schmidt.
Bio: Piotr joined MIT in September 2000, after earning a PhD from Stanford University. Earlier, he received a Magister degree from Uniwersytet Warszawski in 1995. As of July 2010, he holds the title of Professor in the Department of Electrical Engineering and Computer Science.
Piotr’s research interests include algorithms for high-dimensional geometric problems, algorithms using sublinear time and/or space, and streaming algorithms.
Host: Julia Chuzhoy
All talks will be held at TTIC in room #526, located at 6045 South Kenwood Avenue (intersection of 61st Street and Kenwood Avenue).
Parking: Street parking, or in the free lot on the corner of 60th St. and Stony Island Avenue.
For questions and comments contact Julia Chuzhoy.