Is compositionality overrated? The view from language emergence

Is compositionality overrated? The view from language emergence

Date Posted:  June 24, 2020
Date Recorded:  June 23, 2020
Speaker(s):  Marco Baroni, Facebook AI Research and University Pompeu Fabra Barcelona
  • All Captioned Videos
  • Brains, Minds and Machines Seminar Series
Associated CBMM Pages: 
Description: 

Abstract: Compositionality is the property whereby linguistic expressions that denote new composite meanings are derived by a rule-based combination of expressions denoting their parts. Linguists agree that compositionality plays a central role in natural language, accounting for its ability to express an infinite number of ideas by finite means. "Deep" neural networks, for all their impressive achievements, often fail to quickly generalize to unseen examples, even when the latter display a predictable composite structure with respect to examples the network is already familiar with. This has led to interest in the topic of compositionality in neural networks: can deep networks parse language compositionally? how can we make them more sensitive to compositional structure? what does "compositionality" even mean in the context of deep learning? I would like to address some of these questions in the context of recent work on language emergence in deep networks, in which we train two or more networks endowed with a communication channel to solve a task jointly, and study the communication code they develop. I will try to be precise about what "compositionality" mean in this context, and I will report the results of proof-of-concept and larger-scale experiments suggesting that (non-circular) compositionality is not a necessary condition for good generalization (of the kind illustrated in the figure). Moreover, I will show that often there is no reason to expect deep networks to find compositional languages more "natural" than highly entangled ones. I will conclude by suggesting that, if fast generalization is what we care about, we might as well focus directly on enhancing this property, without worrying about the compositionality of emergent neural network languages.