Deep Thinking Systems: Logical Extrapolation with Recurrent Neural Networks

Date

2023

Abstract

Deep neural networks are powerful machines for visual pattern recognition, but reasoning tasks that are easy for humans are still difficult for neural models. Humans possess the ability to extrapolate reasoning strategies learned on simple problems to solve harder examples, often by thinking for longer. We study neural networks that have exactly this capability. By employing recurrence, we build neural networks that can expend more computation when needed. Using several datasets designed specifically for studying generalization from easy problems to harder test samples, we show that our recurrent networks can extrapolate from easy training data to much harder examples at test time, and they do so with many more iterations of a recurrent block of layers than are used during training.
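The core architectural idea in the abstract, a recurrent block whose weights are shared across iterations, so that the iteration count becomes a test-time knob, can be sketched as follows. This is a minimal illustrative sketch, not the authors' code: the class name, dimensions, and random initialization are all hypothetical, and no training is shown.

```python
# Illustrative sketch (hypothetical names, untrained weights): a network whose
# effective depth is chosen at run time, because one recurrent block's weights
# are reused at every iteration.
import numpy as np

rng = np.random.default_rng(0)

class DeepThinkingNet:
    """Toy weight-shared recurrent network: input head -> shared block x k -> output head."""

    def __init__(self, dim=8):
        self.W_in = rng.normal(scale=0.5, size=(dim, dim))
        self.W_rec = rng.normal(scale=0.5, size=(dim, dim))  # shared across all iterations
        self.W_out = rng.normal(scale=0.5, size=(dim, dim))

    def forward(self, x, iterations):
        h = np.tanh(x @ self.W_in)
        for _ in range(iterations):   # same weights applied at every step
            h = np.tanh(h @ self.W_rec)
        return h @ self.W_out

net = DeepThinkingNet()
x = rng.normal(size=(1, 8))
y_train_depth = net.forward(x, iterations=5)    # depth used during training
y_test_depth = net.forward(x, iterations=50)    # "thinking longer" at test time
```

Because `W_rec` is reused rather than having distinct weights per layer, running 50 iterations at test time requires no new parameters beyond those fit with 5 iterations; this weight sharing is what lets such models spend extra computation on harder test examples.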
