Artificial Intelligence is colossally hyped these days, but the dirty little secret is that it still has a long, long way to go. Sure, A.I. systems have mastered an array of games, from chess and Go to “Jeopardy” and poker, but the technology continues to struggle in the real world. Robots fall over while opening doors, prototype driverless cars frequently need human intervention, and nobody has yet designed a machine that can read reliably at the level of a sixth grader, let alone a college student. Computers that can educate themselves — a mark of true intelligence — remain a dream.
Even the trendy technique of “deep learning,” which uses artificial neural networks to discern complex statistical correlations in huge amounts of data, often comes up short. Some of the best image-recognition systems, for example, can successfully distinguish dog breeds, yet remain capable of major blunders, like mistaking a simple pattern of yellow and black stripes for a school bus. Such systems can neither comprehend what is going on in complex visual scenes (“Who is chasing whom and why?”) nor follow simple instructions (“Read this story and summarize what it means”).
Although the field of A.I. is exploding with microdiscoveries, progress toward the robustness and flexibility of human cognition remains elusive. Not long ago, for example, while sitting with me in a cafe, my 3-year-old daughter spontaneously realized that she could climb out of her chair in a new way: backward, by sliding through the gap between the back and the seat of the chair. My daughter had never seen anyone else disembark in quite this way; she invented it on her own — and without the benefit of trial and error, or the need for terabytes of labeled data.
Presumably, my daughter relied on an implicit theory of how her body moves, along with an implicit theory of physics — how one complex object travels through the aperture of another. I challenge any robot to do the same. A.I. systems tend to be passive vessels, dredging through data in search of statistical correlations; humans are active engines for discovering how things work.
To get computers to think like humans, we need a new A.I. paradigm, one that places “top down” and “bottom up” knowledge on equal footing. Bottom-up knowledge is the kind of raw information we get directly from our senses, like patterns of light falling on our retina. Top-down knowledge comprises cognitive models of the world and how it works.
Deep learning is very good at bottom-up knowledge, like discerning which patterns of pixels correspond to golden retrievers as opposed to Labradors. But it is no use when it comes to top-down knowledge. If my daughter sees her reflection in a bowl of water, she knows the image is illusory; she knows she is not actually in the bowl. To a deep-learning system, though, there is no difference between the reflection and the real thing, because the system lacks a theory of the world and how it works. Integrating that sort of knowledge of the world may be the next great hurdle in A.I., a prerequisite to grander projects like using A.I. to advance medicine and scientific understanding.
I fear, however, that neither of our two current approaches to funding A.I. research — small research labs in the academy and significantly larger labs in private industry — is poised to succeed. I say this as someone who has experience with both models, having worked on A.I. both as an academic researcher and as the founder of a start-up company, Geometric Intelligence, which was recently acquired by Uber.
Academic labs are too small. Take the development of automated machine reading, which is a key to building any truly intelligent system. Too many separate components are needed for any one lab to tackle the problem. A full solution will incorporate advances in natural language processing (e.g., parsing sentences into words and phrases), knowledge representation (e.g., integrating the content of sentences with other sources of knowledge) and inference (reconstructing what is implied but not written). Each of those problems represents a lifetime of work for any single university lab.
Corporate labs like those of Google and Facebook have the resources to tackle big questions, but in a world of quarterly reports and bottom lines, they tend to concentrate on narrow problems like optimizing advertisement placement or automatically screening videos for offensive content. There is nothing wrong with such research, but it is unlikely to lead to major breakthroughs. Even Google Translate, which pulls off the neat trick of approximating translations by statistically associating sentences across languages, doesn’t understand a word of what it is translating.
I look with envy at my peers in high-energy physics, and in particular at CERN, the European Organization for Nuclear Research, a huge, international collaboration, with thousands of scientists and billions of dollars of funding. They pursue ambitious, tightly defined projects (like using the Large Hadron Collider to discover the Higgs boson) and share their results with the world, rather than restricting them to a single country or corporation. Even the largest “open” efforts at A.I., like OpenAI, which has about 50 staff members and is sponsored in part by Elon Musk, is tiny by comparison.
An international A.I. mission focused on teaching machines to read could genuinely change the world for the better — the more so if it made A.I. a public good, rather than the property of a privileged few.
Gary Marcus is a professor of psychology and neural science at New York University.