QBLink: A Dataset for Sequential Open-Domain Question Answering
Files
Related Publication Link
Date
Authors
Advisor
Related Publication Citation
DRUM DOI
Abstract
We introduce QBLink, a new dataset of about 18,000 question sequences, each sequence consists of three naturally occurring human-authored questions (totaling around 56,000 unique questions). The sequences themselves are also naturally occurring (i.e., we do not artificially combine individually-authored questions to form sequences), which allows us to focus more on the important connections between questions that should be incorporated to improve the end-to-end question answering accuracy. QBLink is based on the bonus questions of Quiz Bowl tournaments. Unlike previous work that only uses the starter (or tossup) questions, bonus questions are not interruptable (players always hear the complete question) and have greater variability in difficulty. Bonus questions start with a lead-in text, which sets the stage for the rest of the question, followed by a sequence of related questions.