Trick Me If You Can: Human-in-the-Loop Generation of Adversarial Examples for Question Answering

Wallace, Eric; Rodriguez, Pedro; Feng, Shi; Yamada, Ikuya; Boyd-Graber, Jordan

doi:10.1162/tacl_a_00279

Transactions of the Association for Computational Linguistics (Nov 2019)

DOI: https://doi.org/10.1162/tacl_a_00279
Journal volume & issue: Vol. 7, pp. 387–401

Abstract

Adversarial evaluation stress-tests a model’s understanding of natural language. Because past approaches expose superficial patterns, the resulting adversarial examples are limited in complexity and diversity. We propose human-in-the-loop adversarial generation, where human authors are guided to break models. We aid the authors with interpretations of model predictions through an interactive user interface. We apply this generation framework to a question answering task called Quizbowl, where trivia enthusiasts craft adversarial questions. The resulting questions are validated via live human–computer matches: Although the questions appear ordinary to humans, they systematically stump neural and information retrieval models. The adversarial questions cover diverse phenomena from multi-hop reasoning to entity type distractors, exposing open challenges in robust question answering.

Published in Transactions of the Association for Computational Linguistics

ISSN: 2307-387X (Online)
Publisher: The MIT Press
Country of publisher: United States
LCC subjects: Language and Literature: Philology. Linguistics: Computational linguistics. Natural language processing
Website: https://direct.mit.edu/tacl
