I was recently interviewed about our ACL 2019 paper on Data Skeptic. Many thanks to the Data Skeptic team for inviting me to participate, for asking some great questions, and for editing the interview as well as I think possible given the raw material I provided.

It was my first interview of this nature, and a learning experience for me. I was only able to devote a limited amount of time to preparing for it. This post is a brief note about what I wish I had done better, and a restatement of an argument I made about empiricism and nativism in AI - especially topical in light of the release of Gary Marcus and Ernest Davis’ new book.

In the main, I should have referenced other researchers more. A conversational style is nice and relaxed, but clarifying where ideas come from is non-negotiable. Most of the ideas I presented are not original. For example, I first read the idea that supervised learning with text faces a Chinese Room argument in a paper by Douwe Kiela. Yoshua Bengio has also recently been arguing for grounded language learning, for a greater willingness to look at the results of cognitive science, and for the importance of out-of-distribution generalization.

As far as I know, perhaps the only original opinion expressed in the interview, beyond the results of our paper, was the counter-argument our work offers to an “argument from engineering success” for the kind of strongly empiricist program advocated by people like Yann LeCun. I briefly reiterate that argument here.

I don’t know all the reasons why Yann LeCun sees innate structure as an “evil” to be minimized. However, watching his debate with Gary Marcus, among those reasons appears to be what could be called an “argument from engineering success”:

(1) The less innate structure we put into our models, the better they have performed.

(2) Engineering success is a strong indication of the right path, scientifically.

(3) Therefore, less innate structure is better.

We of course need to define what we mean by innate structure. From the same debate, it appears LeCun and Marcus have different ideas of what this means. Marcus argues that NIPS papers roundly ignore innate structure; LeCun states exactly the opposite is the case. That’s a question I want to return to in the future.

But for the time being, the growing number of findings in NLP demonstrating that our best deep learning models learn spurious solutions to datasets via superficial statistics directly undermines (1): the improved performance does not represent the kind of learning we care about.

Bengio (whom I continue to admire greatly, not just for his scientific achievements, which are amazing, but perhaps even more for what a high-quality human being he is - see his work on AI for social good, and his passionate advocacy for action on climate change) has also been characterizing recent deep learning successes as specifically “System 1” successes, in the sense of Kahneman’s systems theory. If this view is correct, then the argument from engineering success should be modified as follows:

(1) The less innate structure we put into our models, the better they have performed at System 1 tasks.

(2) Engineering success is a strong indication of the right path, scientifically.

(3) Therefore, less innate structure is better for System 1 tasks.

This modified argument is at least more defensible in light of the growing evidence I referred to in my counter-argument, although I am not prepared to judge it at this point.

As for System 2 tasks, we will have to wait and see. But at the very least there is a reasonable case for taking the results of cognitive science, and contemporary nativism, seriously.