Hinge-Loss Markov Random Fields and Probabilistic Soft Logic: A Scalable Approach to Structured Prediction

Bach, Stephen Hilliard
Getoor, Lise
A fundamental challenge in developing impactful artificial intelligence technologies is balancing the ability to model rich, structured domains with the ability to scale to big data. Many important problem areas are both richly structured and large scale, from social and biological networks, to knowledge graphs and the Web, to images, video, and natural language. In this thesis I introduce two new formalisms for modeling structured data, distinguished from previous approaches by their ability to both capture rich structure and scale to big data. The first, hinge-loss Markov random fields (HL-MRFs), is a new kind of probabilistic graphical model that generalizes different approaches to convex inference. I unite three views of inference from the randomized algorithms, probabilistic graphical models, and fuzzy logic communities, showing that all three views lead to the same inference objective. I then derive HL-MRFs by generalizing this unified objective. The second new formalism, probabilistic soft logic (PSL), is a probabilistic programming language that makes HL-MRFs easy to define, refine, and reuse for relational data. PSL uses a syntax based on first-order logic to compactly specify complex models. I next introduce an algorithm for inferring most-probable variable assignments (MAP inference) for HL-MRFs that is extremely scalable, much more so than commercially available software, because it uses message passing to leverage the sparse dependency structures common in inference tasks. I then show how to learn the parameters of HL-MRFs using a number of learning objectives. The learned HL-MRFs are as accurate as traditional, discrete models, but much more scalable. To enable HL-MRFs and PSL to capture even richer dependencies, I then extend learning to support latent variables, i.e., variables without training labels. 
To overcome the bottleneck of repeated inferences required during learning, I introduce paired-dual learning, which interleaves inference and parameter updates. Paired-dual learning learns accurate models and is also scalable, often completing before traditional methods make even one parameter update. Together, these algorithms enable HL-MRFs and PSL to model rich, structured data at scales not previously possible.
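
As a rough illustration of the kind of objective the abstract describes, the sketch below evaluates a weighted sum of hinge-loss potentials for a few ground rules. It is not code from the thesis or from the PSL software. It assumes the Łukasiewicz relaxation of a logical implication as the fuzzy-logic view, and the function names, weights, and truth values are all hypothetical. Under these assumptions, MAP inference would correspond to choosing truth values in [0, 1] that minimize this energy.

```python
def hinge_potential(linear_value, power=1):
    """Return max(linear_value, 0) ** power; power is 1 (linear) or 2 (squared)."""
    if power not in (1, 2):
        raise ValueError("power must be 1 or 2")
    return max(linear_value, 0.0) ** power


def rule_distance(body, head):
    """Linear distance to satisfaction of (body_1 AND ... AND body_k) -> head.

    Assumes the Lukasiewicz relaxation; all truth values lie in [0, 1].
    A positive result means the rule is violated by that amount.
    """
    return sum(body) - (len(body) - 1) - head


def energy(ground_rules, power=1):
    """Weighted sum of hinge-loss potentials over (weight, body, head) triples."""
    return sum(
        weight * hinge_potential(rule_distance(body, head), power)
        for weight, body, head in ground_rules
    )


# Hypothetical ground rules: (weight, body truth values, head truth value).
ground_rules = [
    (2.0, [1.0, 0.5], 0.25),  # violated by 0.25
    (1.0, [0.5, 0.5], 1.0),   # satisfied, contributes 0
]
print(energy(ground_rules))  # prints 0.5
```

Each potential is a hinge applied to a linear function of the variables, so the energy is convex in the truth values. Passing `power=2` squares each hinge, which penalizes large violations more heavily than small ones.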