On the Stability of Structured Prediction

Thumbnail Image


Publication or External Link





Many important applications of artificial intelligence---such as image segmentation, part-of-speech tagging and network classification---are framed as multiple, interdependent prediction tasks. These structured prediction problems are typically modeled using some form of joint inference over the outputs, to exploit the relational dependencies. Joint reasoning can significantly improve predictive accuracy, but it introduces a complication in the analysis of structured models: the stability of inference. In optimizations involving multiple interdependent variables, such as joint inference, a small change to the input or parameters could induce drastic changes in the solution.

In this dissertation, I investigate the impact of stability in structured prediction. I explore two topics, connected by the stability of inference. First, I provide generalization bounds for learning from a limited number of examples with large internal structure. The effective learning rate can be significantly sharper than rates given in related work. Under certain conditions on the data distribution and stability of the predictor, the bounds decrease with both the number of examples and the size of each example, meaning one could potentially learn from a single giant example. Secondly, I investigate the benefits of learning with strongly convex variational inference. Using the duality between strong convexity and stability, I demonstrate, both theoretically and empirically, that learning with a strongly convex free energy can result in significantly more accurate marginal probabilities. One consequence of this work is a new technique that ``strongly convexifies" many free energies used in practice. These two seemingly unrelated threads are tied by the idea that stable inference leads to lower error, particularly in the limited example setting, thereby demonstrating that inference stability is of critical importance to the study and practice of structured prediction.