Date | Topic | Speaker |
---|---|---|

Wed 8.22 (10:00AM WEB 3780) | What is wrong with semantic parsing? | Jonathan Berant |

Wed 8.29 |
Probabilistic streaming tensor decomposition
[Slides]
## Probabilistic streaming tensor decomposition
Tensor decomposition is a fundamental tool for multiway data analysis. While most decomposition algorithms operate on a static collection of data and process it in batch, many practical applications produce data in a streaming fashion: at each time step a subset of entries is generated, and previously seen entries cannot be revisited. In such scenarios, traditional decomposition approaches are inappropriate, because they cannot provide timely updates when new data come in, and they need to access the whole dataset many times for batch optimization. To address this issue, we propose POST, a Probabilistic Streaming Tensor decomposition algorithm, which enables real-time updates and predictions upon receiving new tensor entries, and supports dynamic growth of all the modes. Compared with the state-of-the-art streaming decomposition approach MAST, POST is more flexible in that it can handle arbitrary orders of streaming entries, and hence is more widely applicable. In addition, as a Bayesian inference algorithm, POST can quantify the uncertainty of the latent embeddings via their posterior distributions, as well as the confidence levels of its predictions for missing entry values. On several real-world datasets, POST exhibits predictive performance better than or comparable to that of MAST and other static decomposition algorithms. |
Yishuai Du |
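The streaming idea at the heart of such methods can be illustrated with the simplest conjugate case. The sketch below is not POST itself (which updates posteriors over tensor factors); it folds one observation at a time into a Gaussian posterior over a scalar mean, never revisiting past data, and keeps a posterior variance as the uncertainty estimate. All names and values are illustrative.

```python
import numpy as np

def streaming_update(mu, tau2, y, sigma2=1.0):
    """Conjugate Gaussian update: fold one new observation y into the
    posterior N(mu, tau2) without revisiting past data."""
    prec = 1.0 / tau2 + 1.0 / sigma2          # posterior precision
    mu_new = (mu / tau2 + y / sigma2) / prec  # precision-weighted mean
    return mu_new, 1.0 / prec

rng = np.random.default_rng(0)
mu, tau2 = 0.0, 100.0                          # vague prior
for y in rng.normal(2.0, 1.0, size=500):       # entries arrive one at a time
    mu, tau2 = streaming_update(mu, tau2, y)   # one pass, no revisits
print(mu, tau2)                                # mean near 2, variance shrinks
```

The posterior variance `tau2` shrinks as entries stream in, which is the kind of uncertainty quantification a Bayesian streaming method provides for free.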

Wed 9.05 |
Gaps in Information Access in Social Networks
## Gaps in Information Access in Social Networks
The study of influence maximization in social networks has largely ignored the disparate effects these algorithms might have on the individuals contained in the social network. Individuals may place a high value on receiving information, e.g., job openings or advertisements for loans. While well-connected individuals at the center of the network are likely to receive the information being distributed through the network, poorly connected individuals are systematically less likely to receive it, producing a gap in access to the information between individuals. In this work, we study how best to spread information in a social network while minimizing this access gap. We propose to use the maximin social welfare function as an objective, maximizing the minimum probability of receiving the information under an intervention. We prove that in this setting this welfare function constrains the access gap, whereas maximizing the expected number of nodes reached does not. We also investigate the difficulties of using the maximin objective, and present hardness results and an analysis of standard greedy strategies. Finally, we investigate practical ways of optimizing for the maximin, and give empirical evidence that a simple greedy-based strategy works well in practice. |
Ashkan Bashardoust |
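A minimal sketch of a greedy maximin seeding strategy on a toy two-triangle graph under the independent cascade model. The graph, spread probability, and Monte Carlo estimator are illustrative choices, not taken from the talk.

```python
import random

def access_probs(edges, nodes, seeds, p=0.5, trials=2000, rng_seed=0):
    """Monte Carlo estimate of each node's probability of being reached
    from the seed set under an (undirected) independent cascade model."""
    rng = random.Random(rng_seed)
    hits = {v: 0 for v in nodes}
    for _ in range(trials):
        live = [(u, v) for u, v in edges if rng.random() < p]
        adj = {}
        for u, v in live:
            adj.setdefault(u, []).append(v)
            adj.setdefault(v, []).append(u)
        reached, frontier = set(seeds), list(seeds)
        while frontier:                      # BFS over live edges
            u = frontier.pop()
            for w in adj.get(u, []):
                if w not in reached:
                    reached.add(w)
                    frontier.append(w)
        for v in reached:
            hits[v] += 1
    return {v: hits[v] / trials for v in nodes}

def greedy_maximin(edges, nodes, k, p=0.5):
    """Greedily add the seed that most raises the minimum access probability."""
    seeds = []
    for _ in range(k):
        best = max((v for v in nodes if v not in seeds),
                   key=lambda v: min(access_probs(edges, nodes, seeds + [v], p).values()))
        seeds.append(best)
    return seeds

# Two triangles joined by a bridge: maximin should seed both sides.
nodes = [0, 1, 2, 3, 4, 5]
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
seeds = greedy_maximin(edges, nodes, k=2)
print(seeds)
```

Maximizing expected reach could happily place both seeds on one side; the maximin objective forces coverage of the poorly connected side as well.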

Wed 9.12 |
Hamiltonian Monte Carlo
## Hamiltonian Monte Carlo
Markov Chain Monte Carlo originated with the classic paper of Metropolis et al. (1953). In this talk, we will briefly review MCMC and then dive into Hamiltonian Monte Carlo. Beginning with Hamiltonian dynamics and the reasoning behind why the algorithm works, we will go through its advantages and disadvantages relative to traditional MCMC, the implementation details of the algorithm, and other related topics. |
Mingxuan Han |
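A compact sketch of HMC with leapfrog integration on a one-dimensional standard normal target. Step size and trajectory length are illustrative, not tuned recommendations.

```python
import numpy as np

def hmc_sample(logp, logp_grad, q0, n_samples=2000, eps=0.2, n_leap=20, seed=0):
    """Hamiltonian Monte Carlo: simulate Hamiltonian dynamics with a
    leapfrog integrator, then Metropolis accept/reject."""
    rng = np.random.default_rng(seed)
    q, samples = q0, []
    for _ in range(n_samples):
        p = rng.normal()                          # resample momentum
        q_new, p_new = q, p
        p_new += 0.5 * eps * logp_grad(q_new)     # half step in momentum
        for _ in range(n_leap - 1):
            q_new += eps * p_new                  # full step in position
            p_new += eps * logp_grad(q_new)       # full step in momentum
        q_new += eps * p_new
        p_new += 0.5 * eps * logp_grad(q_new)     # final half step
        h_old = -logp(q) + 0.5 * p ** 2           # Hamiltonian = U + K
        h_new = -logp(q_new) + 0.5 * p_new ** 2
        if rng.random() < np.exp(h_old - h_new):  # Metropolis correction
            q = q_new
        samples.append(q)
    return np.array(samples)

# Target: standard normal, log p(q) = -q^2/2 up to a constant.
draws = hmc_sample(lambda q: -0.5 * q ** 2, lambda q: -q, q0=0.0)
print(draws.mean(), draws.std())
```

The momentum resampling plus the energy-based accept step is what distinguishes HMC from a plain random-walk Metropolis sampler.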

Wed 9.19 |
Robust statistics meets Theory CS
## Robust statistics meets Theory CS
In this talk I will survey some recent developments on problems of the following form: we wish to estimate a statistical quantity (such as the mean of a distribution) using samples from that distribution, where a small fraction of the samples have been corrupted arbitrarily. The classic estimators incur an estimation error that grows badly with the dimension of the data. Recently, new ideas from theoretical CS have been used to obtain algorithms that perform significantly better. We will give an overview of some of the work that has come out in the last couple of years. |
Aditya Bhaskara |
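A small experiment illustrating the failure mode the talk addresses: under 5% corruption in 50 dimensions, the empirical mean incurs a large error, while even a simple coordinate-wise median does far better (its error still degrades with dimension and corruption rate, which is what the newer TCS-inspired algorithms improve on). The data sizes are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
d, n, eps = 50, 2000, 0.05            # dimension, samples, corruption fraction
true_mean = np.zeros(d)

data = rng.normal(true_mean, 1.0, size=(n, d))
k = int(eps * n)
data[:k] = 100.0                      # adversary replaces k points far away

naive = data.mean(axis=0)             # classic estimator: empirical mean
robust = np.median(data, axis=0)      # simple robust baseline

err_naive = np.linalg.norm(naive - true_mean)
err_robust = np.linalg.norm(robust - true_mean)
print(err_naive, err_robust)
```

Each corrupted point shifts every coordinate of the mean, so the naive error is roughly eps * 100 * sqrt(d), while the median barely moves.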

Wed 9.26 |
New Equivalence Results and Metric-Constrained LP Algorithms for Correlation Clustering
[Slides]
## New Equivalence Results and Metric-Constrained LP Algorithms for Correlation Clustering
Correlation clustering is the problem of partitioning a signed graph in a way that simultaneously avoids cutting positive edges and leaving negative edges uncut. Recently we discovered that a simple weighted variant of the problem effectively generalizes and interpolates a number of common graph clustering objectives, including modularity, sparsest cut, and cluster deletion. Our clustering framework, which we call LambdaCC, takes an unsigned graph and turns it into an instance of correlation clustering by converting edges and non-edges in the original graph into positive and negative edges in a new signed graph, respectively, both weighted with respect to a resolution parameter lambda. We generalize existing linear programming approximation algorithms for correlation clustering to obtain constant-factor approximations for our generalized clustering problem in certain parameter regimes. As a special case, our results lead to the first 2-approximation algorithm for cluster deletion. Typically, LP-based correlation clustering algorithms have been viewed only as theoretical results due to their high memory requirements. However, we develop a new approach based on applying projection methods, which allows us to solve this LP relaxation in practice on a much larger scale than was previously possible. These projection methods apply broadly to a larger class of so-called metric-constrained optimization problems, which arise frequently as convex relaxations of NP-hard clustering objectives. In practice we use our techniques to solve relaxations of this form involving up to 700 billion constraints. |
Nate Veldt |
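A minimal sketch of the signed-graph construction and disagreement objective described above, as I understand the standard LambdaCC formulation: edges become positive pairs with weight 1 - lambda, non-edges become negative pairs with weight lambda. The toy graph and lambda value are illustrative.

```python
def lambda_cc_instance(n, edges, lam):
    """Turn an unsigned graph into a signed, weighted correlation
    clustering instance governed by resolution parameter lam."""
    E = {frozenset(e) for e in edges}
    pos, neg = [], []
    for i in range(n):
        for j in range(i + 1, n):
            if frozenset((i, j)) in E:
                pos.append((i, j, 1 - lam))   # edge -> positive pair
            else:
                neg.append((i, j, lam))       # non-edge -> negative pair
    return pos, neg

def cc_objective(pos, neg, labels):
    """Disagreements: positive pairs cut plus negative pairs left uncut."""
    cost = sum(w for i, j, w in pos if labels[i] != labels[j])
    cost += sum(w for i, j, w in neg if labels[i] == labels[j])
    return cost

# Two triangles joined by one edge: splitting them beats one big cluster.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
pos, neg = lambda_cc_instance(6, edges, lam=0.3)
split = cc_objective(pos, neg, [0, 0, 0, 1, 1, 1])   # cuts one positive edge
merged = cc_objective(pos, neg, [0, 0, 0, 0, 0, 0])  # keeps all negatives uncut
print(split, merged)
```

Varying lambda trades off the two disagreement types, which is how the framework interpolates between objectives like sparsest cut and cluster deletion.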

Wed 10.3 |
Grasp Planning using Probabilistic Inference
## Grasp Planning using Probabilistic Inference
Abstract: Grasping is a difficult problem in robotics. In this talk, I will present how we plan grasps using probabilistic inference. I will first talk about our novel approach to multi-fingered grasp planning leveraging learned deep neural network models. We train a convolutional neural network to predict grasp success as a function of both visual information of an object and the grasp configuration. We can then formulate grasp planning as inferring the grasp configuration which maximizes the probability of grasp success. Our experimental results show that our planning method outperforms existing neural-network-based planning methods. Then I will introduce our probabilistic grasp planner that explicitly models grasp type in order to plan high-quality precision and power grasps in real time. We compare our learned grasp model with a model that does not encode type and show that modeling grasp type helps to plan better grasps. At the end of the talk, I will briefly discuss our future work on how to efficiently explore the grasp configuration and object space for self-supervised grasp learning. |
Qingkai Lu |

Wed 10.10 | ||

Thu 10.18 (12:00PM) WEB 3780 |
From artificial intelligence to operations research and back again: decision diagrams and deep learning
## From artificial intelligence to operations research and back again: decision diagrams and deep learning
Abstract: One form of characterizing the expressiveness of a piecewise linear neural network is by the number of linear regions, or pieces, of the function it models. The first part of the talk concerns the number of linear regions that these networks can attain, both theoretically and empirically. We present upper and lower bounds for the maximum number of linear regions of rectifier and maxout networks, and a method to perform exact counting of the number of regions by modeling the DNN with a mixed-integer linear formulation. These bounds come from leveraging the dimension of the space defining each linear region, and they indicate that a deep rectifier network can only have more linear regions than any shallow counterpart with the same number of neurons if that number exceeds the dimension of the input. The second part of the talk is mostly based on how we approximate the number of linear regions of specific rectifier networks with an algorithm for probabilistic lower bounds of mixed-integer linear sets, and we present a tighter upper bound that leverages network coefficients. We also discuss how decision diagrams could be used for faster exact counting. The algorithm for probabilistic lower bounds is several orders of magnitude faster than exact counting and the values reach similar orders of magnitude, making our approach a viable method to compare the expressiveness of such networks. The refined upper bound is particularly stronger on networks with narrow layers. Bio: Thiago Serra is a visiting research scientist at Mitsubishi Electric Research Labs (MERL). He received his Ph.D. in Operations Research from Carnegie Mellon University. Before his Ph.D., he worked for four years as an operations research analyst at Petrobras, obtained an M.S. in Computer Science from the University of Sao Paulo (USP), and a B.S. in Computer Engineering from the University of Campinas (Unicamp). He has received several awards: the Gerald L. Thompson Doctoral Dissertation Award in Management Science from Carnegie Mellon University, the Judith Liebman Award from INFORMS, and the best poster awards at the INFORMS Annual Meeting in 2016 and at the Princeton Day of Optimization in 2018. Thiago's research focuses on theory and applications of decision diagrams, deep learning, and integer programming algorithms. |
Thiago Serra |
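The counting-by-sampling idea can be sketched in a few lines: sample inputs densely and count distinct ReLU activation patterns, each of which certifies one linear region, giving a probabilistic lower bound on the true count. This is a hand-rolled illustration on a toy one-hidden-layer network, not the MILP-based exact counting from the talk.

```python
import numpy as np

rng = np.random.default_rng(1)
n_hidden, d_in = 5, 2
W = rng.normal(size=(n_hidden, d_in))   # hidden-layer weights
b = rng.normal(size=n_hidden)           # hidden-layer biases

def pattern(x):
    """ReLU on/off pattern at input x; inputs sharing a pattern lie in
    the same linear region of the network function."""
    return tuple((W @ x + b > 0).astype(int))

# Each distinct pattern observed certifies one linear region, so the
# count below lower-bounds the true number of regions (at most 16 here:
# 5 generic hyperplanes split the plane into 1 + 5 + C(5,2) regions).
xs = rng.uniform(-5, 5, size=(20000, d_in))
n_regions = len({pattern(x) for x in xs})
print(n_regions)
```

Exact counting instead asks, for every candidate pattern, whether its defining system of linear inequalities is feasible, which is what the mixed-integer formulation encodes.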

Wed 10.24 |
General Message Passing for Probabilistic Graphical Model Inference
## General Message Passing for Probabilistic Graphical Model Inference
Abstract: The task of calculating posterior marginals on nodes in an arbitrary probabilistic graphical model is prohibitively expensive; in fact, it is NP-hard. However, due to the practical importance of calculating posterior marginals, there has been considerable interest in developing efficient approximate inference methods. Message passing is a family of Bayesian inference methods for graphical models that estimate the posterior marginals by minimizing probability divergence measures. In this talk, I will first give a brief introduction to probabilistic graphical models and exact inference methods for simple graphical models. Then, before getting into the message passing algorithms, I will introduce some properties of the alpha-divergence measure, the exponential family, and fully factorized approximations. Finally, I will present a unifying view of message passing algorithms in terms of the divergence measures they aim to minimize. |
Zheng Wang |
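A minimal example of exact inference in a simple graphical model: sum-product message passing on a chain of three binary variables, checked against brute-force enumeration. The pairwise potentials are illustrative.

```python
import itertools
import numpy as np

# Chain x0 - x1 - x2 with pairwise potentials psi[k](x_k, x_{k+1}).
psi = [np.array([[2.0, 1.0], [1.0, 3.0]]),
       np.array([[1.0, 2.0], [2.0, 1.0]])]

def marginal_bp(psi, i):
    """Sum-product: forward and backward messages give exact marginals
    on a chain in linear time."""
    n = len(psi) + 1
    fwd = [np.ones(2)]
    for k in range(n - 1):
        fwd.append(psi[k].T @ fwd[k])       # message passed rightward
    bwd = [np.ones(2)]
    for k in reversed(range(n - 1)):
        bwd.insert(0, psi[k] @ bwd[0])      # message passed leftward
    m = fwd[i] * bwd[i]
    return m / m.sum()

def marginal_brute(psi, i):
    """Enumerate all joint states (exponential; for checking only)."""
    n = len(psi) + 1
    m = np.zeros(2)
    for xs in itertools.product([0, 1], repeat=n):
        w = np.prod([psi[k][xs[k], xs[k + 1]] for k in range(n - 1)])
        m[xs[i]] += w
    return m / m.sum()

print(marginal_bp(psi, 1), marginal_brute(psi, 1))
```

On loopy graphs the same message updates are no longer exact, which is where the divergence-minimization view of approximate message passing comes in.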

Wed 10.31 |
Physics Informed Deep Learning: Data-driven Solutions of Nonlinear Partial Differential Equations
## Physics Informed Deep Learning: Data-driven Solutions of Nonlinear Partial Differential Equations
Abstract: With the explosive growth of available data and computing resources, recent advances in machine learning and data analytics have yielded transformative results across diverse scientific disciplines. However, more often than not, in the course of analyzing complex physical, biological, or engineering systems, the cost of data acquisition is prohibitive. This talk introduces physics informed neural networks – neural networks that are trained to solve supervised learning tasks while respecting any given law of physics described by general nonlinear partial differential equations. The proposed method utilizes prior knowledge, such as principled physical laws that govern the time-dependent dynamics of a system, or empirically validated rules or other domain expertise, as a regularization agent that constrains the space of admissible solutions to a manageable size. The developments are presented in the context of two main problem classes: data-driven solution and data-driven discovery of partial differential equations. |
Wenzheng Tao |
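The loss composition described above can be sketched for a toy ODE, u' + u = 0 with u(0) = 1, using finite differences in place of automatic differentiation. This is a hand-rolled illustration of the idea, not the authors' implementation.

```python
import numpy as np

def physics_loss(u, xs, dx=1e-4):
    """PINN-style residual for u'(x) + u(x) = 0: penalize the squared
    equation residual at collocation points."""
    du = (u(xs + dx) - u(xs - dx)) / (2 * dx)   # finite-difference derivative
    return np.mean((du + u(xs)) ** 2)

def data_loss(u, xs, ys):
    """Ordinary supervised loss on observed data."""
    return np.mean((u(xs) - ys) ** 2)

xs_col = np.linspace(0.0, 2.0, 50)                # collocation points
x_obs, y_obs = np.array([0.0]), np.array([1.0])   # observed: u(0) = 1

def total_loss(u):
    # The physics term acts as a regularizer constraining the space of
    # admissible solutions, exactly as the abstract describes.
    return data_loss(u, x_obs, y_obs) + physics_loss(u, xs_col)

exact = lambda x: np.exp(-x)    # satisfies u' = -u and u(0) = 1
wrong = lambda x: 1.0 - x       # fits the data point but violates the physics
print(total_loss(exact), total_loss(wrong))
```

In an actual PINN, `u` would be a neural network, the derivative would come from automatic differentiation, and `total_loss` would be minimized over the network's weights.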

Thu 11.08 (12:00PM) WEB 3780 |
How Do We Make Ethical Robots?
## How Do We Make Ethical Robots?
Abstract: Robots are playing increasing roles in our society, not just as tools for people taking actions, but as goal-seeking agents, deciding for themselves which actions to take. This has raised concerns about robots, inadvertently or deliberately, behaving in destructive ways. If robots are to participate in our society, we want them to behave ethically. Ethics is society’s way to encourage its individual members to be trustworthy, encouraging cooperation, which leads to positive-sum interactions, making the society as a whole stronger and more successful. In contrast, behaviors that exploit trust for individual gain tend to be negative-sum interactions: trust and cooperation are discouraged, and the society as a whole becomes weaker and less successful. We consider examples of different levels of ethical reasoning, including responding to immediate desires; maximizing individual expected utility; following ethical principles and social norms; and resolving ethical dilemmas. As technology advances, and as non-human agents including intelligent robots and other AIs increasingly act as autonomous decision-makers, they must be designed to follow ethical principles, demonstrate trustworthiness, and encourage cooperation among all members of society. |
Benjamin Kuipers |

Tue 11.13 (11:00AM) 3780 WEB |
Challenges of Human-Aware AI Systems
## Challenges of Human-Aware AI Systems
Abstract: Research in AI suffers from a longstanding ambivalence to humans, swinging as it does between their replacement and their augmentation. Now, as AI technologies enter our everyday lives at an ever increasing pace, there is a greater need for AI systems to work synergistically with humans. To do this effectively, AI systems must pay more attention to the aspects of intelligence that helped humans work with each other, including emotional and social intelligence. I will discuss the research challenges in designing such human-aware AI systems, including modeling the mental states of humans in the loop, recognizing their desires and intentions, providing proactive support, exhibiting explicable behavior, giving cogent explanations on demand, and engendering trust. I will survey the progress made so far on these challenges, and highlight some promising directions. I will also touch on the additional ethical quandaries that such systems pose. I will end by arguing that the quest for human-aware AI systems broadens the scope of the AI enterprise, necessitates and facilitates truly interdisciplinary collaborations, and can go a long way towards increasing public acceptance of AI technologies. |
Subbarao Kambhampati |

Fri 11.16 (11:00AM) Saltair Room at the Union |
Machine Learning at Europa: Smart Spacecraft Explorers
## Machine Learning at Europa: Smart Spacecraft Explorers
Abstract: Upcoming missions to remote destinations like Jupiter’s moon Europa will operate at extreme distances from the Earth. The high-radiation environment limits their expected lifetimes, so they must make the most of every observing opportunity. We have developed methods to analyze data as it is collected during a Europa flyby to quickly identify content of interest: an active icy plume, surface mineral deposits, or any unexpected phenomena (scientific anomalies). Data with positive detections can be marked for high-priority downlink to Earth and inform the next steps in mission planning. In addition, for operations on the Europa surface, we are developing vision-based methods to strategically select candidate locations for sampling with a robotic arm. This talk discusses data analysis and machine learning methods that can operate onboard and increase the rate of exploration and discovery. |
Kiri Wagstaff |

Wed 11.21 |
Bayesian Low-Rank Decomposition of Incomplete Multiway Tensors
## Bayesian Low-Rank Decomposition of Incomplete Multiway Tensors
Abstract: This talk presents a Bayesian framework for low-rank decomposition of multiway tensor data with missing observations. The key issue of pre-specifying the rank of the decomposition is sidestepped in a principled manner using a multiplicative gamma process prior. Both continuous and binary data can be analyzed within the framework, in a coherent way, using fully conjugate Bayesian analysis. In particular, the analysis in the non-conjugate binary case is facilitated via the use of the Polya-Gamma sampling strategy, which yields closed-form Gibbs sampling updates. |
Yishuai Du |
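A sketch of the multiplicative gamma process shrinkage idea mentioned above, with illustrative hyperparameter values: component precisions are cumulative products of gamma draws, so higher-index components are shrunk ever harder toward zero, which is how an explicit rank choice is sidestepped.

```python
import numpy as np

rng = np.random.default_rng(0)
a1, a2, H = 2.0, 3.0, 10   # illustrative hyperparameters, H candidate components

# Multiplicative gamma process: delta_1 ~ Gamma(a1, 1), delta_h ~ Gamma(a2, 1),
# and the precision of component h is tau_h = prod_{l<=h} delta_l.
delta = np.concatenate([rng.gamma(a1, 1.0, size=1),
                        rng.gamma(a2, 1.0, size=H - 1)])
tau = np.cumprod(delta)

# Factor entries for component h would be drawn as N(0, 1/tau_h), so the
# typical scale of later components shrinks and unused rank is pruned away.
scales = 1.0 / np.sqrt(tau)
print(scales)
```

With a2 > 1 the precisions grow stochastically with the index, so only the first few components retain appreciable scale on average.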

Wed 11.28 |
Introduction to Laplace Propagation
## Introduction to Laplace Propagation
Abstract: Laplace Propagation is an approximate inference method for Bayesian models and regularized risk functionals, based on the propagation of means and variances derived from the Laplace approximation. In this talk, Laplace Propagation will be introduced from three aspects: algorithm basics, connections with message passing, and special cases. |
Zheng Wang |
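The Laplace approximation underlying the method can be sketched in a few lines: find the mode of the log-density, then match a Gaussian using the curvature there. A one-dimensional illustration with a Gaussian target (where the approximation is exact); the optimizer and finite differences are illustrative choices.

```python
import numpy as np

def laplace_approx(logp, x0, steps=200, lr=0.1, h=1e-5):
    """Laplace approximation: climb to the mode by gradient ascent, then
    return the Gaussian matching the negative inverse curvature there."""
    grad = lambda x: (logp(x + h) - logp(x - h)) / (2 * h)
    x = x0
    for _ in range(steps):
        x += lr * grad(x)                                 # ascend to the mode
    curv = (logp(x + h) - 2 * logp(x) + logp(x - h)) / h ** 2
    return x, -1.0 / curv                                  # mean, variance

# Target: log-density of N(3, 0.5^2) up to a constant; Laplace is exact here.
logp = lambda x: -0.5 * ((x - 3.0) / 0.5) ** 2
mean, var = laplace_approx(logp, x0=0.0)
print(mean, var)
```

Laplace Propagation applies this local Gaussian matching factor by factor and passes the resulting means and variances around as messages.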

Thu 12.06, 4-5pm, JWB 335 |
Orthogonal Tensor Decomposition
## Orthogonal Tensor Decomposition
Abstract: Tensor decomposition has many applications; however, it is often a hard problem. In this talk I will discuss a family of tensors, called orthogonally decomposable tensors, which retain some of the properties of matrices that general tensors lack. A symmetric tensor is orthogonally decomposable if it can be written as a linear combination of tensor powers of n orthonormal vectors. Such tensors are interesting because their decomposition can be found efficiently. We study their spectral properties and give a formula for all of their eigenvectors. We also give equations defining all real symmetric orthogonally decomposable tensors. Analogously, we study nonsymmetric orthogonally decomposable tensors, describing their singular vector tuples and giving polynomial equations that define them. In an attempt to extend the definition to a larger set of tensors, we define tight-frame decomposable tensors and study their properties. Finally, I will conclude with some open questions and future research directions. |
Elina Robeva |
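Tensor power iteration, a standard way such decompositions are found efficiently, can be sketched as follows on a symmetric orthogonally decomposable third-order tensor. Sizes, weights, and iteration counts are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

# Build an orthogonally decomposable tensor T = sum_i w_i * v_i^{(x)3},
# where the v_i are orthonormal (columns of Q).
Q, _ = np.linalg.qr(rng.normal(size=(4, 4)))
w = np.array([3.0, 2.0, 1.5, 1.0])
T = np.einsum('i,ai,bi,ci->abc', w, Q, Q, Q)

def power_iteration(T, iters=100, seed=1):
    """Tensor power iteration u <- T(I, u, u) / ||T(I, u, u)||.
    For odeco tensors this converges to one of the components v_i."""
    rng = np.random.default_rng(seed)
    u = rng.normal(size=T.shape[0])
    u /= np.linalg.norm(u)
    for _ in range(iters):
        u = np.einsum('abc,b,c->a', T, u, u)   # contract two modes with u
        u /= np.linalg.norm(u)
    return u

u = power_iteration(T)
overlap = np.max(np.abs(Q.T @ u))   # alignment with the closest component
print(overlap)
```

Repeating with deflation (subtracting the recovered rank-one term) recovers the remaining components one by one, which is exactly the efficiency the abstract refers to.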