Does the brain represent words?
The paper is by Jon Gauthier and Anna Ivanova, and is from June 2018.
My interest in this paper comes from the claim made therein that work in NLP on universal representations appears to be on the right track.
Brief Summary of the Paper
The seminal work of Mitchell et al. (2008) used a trillion-word corpus to define semantic representations for words based on their co-occurrence with a specifically chosen set of 25 sensorimotor verbs thought to be associated with semantic representation: see, hear, listen, taste… They found these features to be significantly predictive of neural activation patterns across nine subjects.
As an aside, I thought one of the more interesting parts of that paper was the suggestion that a neural representation could be obtained as a linear superposition of representations from sub-modules.
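To make this concrete, here is a minimal sketch of a Mitchell et al.-style model under some simplifying assumptions of mine (it is not the paper's code): each word gets a feature vector of normalised co-occurrence counts with the 25 verbs, and a voxel's predicted activation is a learned weighted sum, i.e. a linear superposition, of those features. All array names, sizes, and the random placeholder data are illustrative.

```python
import numpy as np

# Minimal sketch of a Mitchell et al. (2008)-style encoding model.
# Everything here is illustrative: `cooc` stands in for corpus co-occurrence
# counts of each word with the 25 sensorimotor verbs, and `fmri` stands in
# for the recorded voxel activations for each word.
n_words, n_verbs, n_voxels = 60, 25, 500

rng = np.random.default_rng(0)
cooc = rng.poisson(5.0, size=(n_words, n_verbs)).astype(float)
fmri = rng.normal(size=(n_words, n_voxels))

# Semantic feature vector per word: co-occurrence counts normalised to unit
# length, so a word is characterised by its relative association with the verbs.
features = cooc / np.linalg.norm(cooc, axis=1, keepdims=True)

# Linear superposition: a voxel's predicted activation is a weighted sum of the
# word's feature values.  The weights are fit by least squares on training words
# and then used to predict activation patterns for held-out words.
train, test = np.arange(58), np.arange(58, 60)
weights, *_ = np.linalg.lstsq(features[train], fmri[train], rcond=None)
predicted = features[test] @ weights   # predicted patterns for the two held-out words
```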
Following this, other researchers have looked for better feature spaces based on, e.g., behavioural ratings and distributional statistics, and the approach has also been extended to sentence decoding.
A decoding study is described as follows:
- goal: derive a set of stimulus-specific linguistic features and measure how they are associated with brain activity
- method: see whether the brain activity patterns can predict the chosen features (a minimal sketch is given after this list)
- conclusion: if the features reflect semantic properties of the stimulus, then the brain activity pattern is considered a “semantic representation”
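To make the recipe concrete, the sketch below is an entirely illustrative decoding analysis: a ridge-regression decoder is learned from voxel patterns to a target feature space and scored with a rank-based metric on held-out stimuli. The random placeholder data, the choice of ridge regression, and the particular metric are my assumptions, picked because they are typical of this literature rather than taken from any one paper.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import KFold

# Illustrative decoding study: random placeholders stand in for the fMRI
# patterns (one row per stimulus) and the target features (e.g. word embeddings).
rng = np.random.default_rng(0)
n_stimuli, n_voxels, n_features = 180, 5000, 300
brain = rng.normal(size=(n_stimuli, n_voxels))
features = rng.normal(size=(n_stimuli, n_features))

scores = []
for train, test in KFold(n_splits=6, shuffle=True, random_state=0).split(brain):
    # Learn a linear map from brain activity to the chosen feature space.
    decoder = Ridge(alpha=1.0).fit(brain[train], features[train])
    decoded = decoder.predict(brain[test])

    # Rank-based score: for each held-out stimulus, how highly does its true
    # feature vector rank (by correlation) among all held-out candidates?
    sims = np.corrcoef(decoded, features[test])[: len(test), len(test):]
    ranks = (-sims).argsort(axis=1).argsort(axis=1).diagonal()   # 0 = best possible
    scores.append(1.0 - ranks.mean() / (len(test) - 1))          # ~0.5 at chance

print(f"mean rank accuracy: {np.mean(scores):.2f} (chance ~ 0.5)")
```

If the decoder beats chance, the study would then argue, per the conclusion step above, that the brain activity pattern is a "semantic representation" of the stimulus.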
The authors of this paper argue for the claim that
such talk of representation is meaningless unless one also specifies the brain mechanisms utilizing those representations and the task they are designed to solve.
since such representational claims
wildly over-generate, leading us to award the label of “representation” to brain activity evoked by any arbitrary aspect of the stimulus, so long as it has some vague relation to the stimulus “meaning”
A specific example of “over-generation”: in the study of Pereira et al. (2018), fMRI data from subjects reading a sentence was used to predict the embeddings of the words in that sentence, and the authors claimed that their decoder could read out “linguistic meaning”.
But since word embeddings have been shown at best to capture a limited range of things such as “elements of syntax” and “hypernymy relations”
we could just as well claim that the decoder has captured “elements of syntax” or “hypernymy relations.”
and since we do more than reason about syntax and hypernymy relations when reading a sentence, this underdetermines the nature and function of neural computations.
Furthermore, representations do not exist in a vacuum: they are created by some part of the brain to be (potentially) consumed by another part in order to produce behaviour. (Some interesting references to philosophical work I would like to read on this point: Papineau, 1992; Dretske, 1995.)
So, the authors re-run the experiments of Pereira et al. (2018), learning decoders that map the fMRI data to the representations produced by neural models trained to perform specific tasks.
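In the same illustrative spirit, that comparison might look like the sketch below: the same assumed ridge-plus-rank-accuracy recipe as in the earlier sketch is applied with each candidate model's representations as the decoding target, and the resulting scores are compared. The target matrices are random placeholders standing in for, e.g., GloVe vectors or an NLI encoder's states; this is a sketch of the general setup, not the authors' exact pipeline.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import cross_val_predict

def rank_accuracy(brain, targets):
    """Cross-validated ridge decoding followed by a rank-based score (~0.5 at chance)."""
    decoded = cross_val_predict(Ridge(alpha=1.0), brain, targets, cv=6)
    sims = np.corrcoef(decoded, targets)[: len(targets), len(targets):]
    ranks = (-sims).argsort(axis=1).argsort(axis=1).diagonal()
    return 1.0 - ranks.mean() / (len(targets) - 1)

# Random placeholders: `brain` stands in for the fMRI data, and each entry of
# `target_spaces` for the representations of one candidate NLP model.
rng = np.random.default_rng(0)
brain = rng.normal(size=(180, 5000))
target_spaces = {
    "glove":  rng.normal(size=(180, 300)),   # e.g. averaged GloVe vectors per sentence
    "nli":    rng.normal(size=(180, 512)),   # e.g. an NLI-trained sentence encoder
    "tagger": rng.normal(size=(180, 128)),   # e.g. hidden states of a tagging model
}

for name, targets in target_spaces.items():
    print(f"{name:>7s}: rank accuracy {rank_accuracy(brain, targets):.2f}")
```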
All neural models perform above chance, and the best performance is achieved by those that are more general (e.g. GloVe and NLI).
These results are where the suggestion mentioned at the start comes from: the more general the NLP model, the better the fMRI data can predict its representations.