AI: The Missing Piece

Recently, I wrote a Facebook post stating that I was not particularly impressed with deep learning -- from an academic perspective. This is not to say that deep learning is not significant at all. Indeed, the applications and impact of deep learning are opening up so many possibilities (mostly scary ones) like never before! There is indeed a lot of "scope" for young professionals to major in deep learning.

My reservations have to do with the dearth of conceptual insights provided by deep learning.

My post understandably caused a lot of consternation among many in my network -- most of all, my former students. And predictably, I was subjected to argumentum ad throwing-the-book-at-me, with links to the high-end mathematics that gets used in deep learning.

There is a difference between using high-end mathematics to do something, and obtaining a conceptual breakthrough in understanding some underlying principle. Despite all the really awesome mathematics that goes into deep learning models today, they are still pretty much optimisation engines. The concepts underlying today's deep learning networks have been around for several decades.
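To make "optimisation engine" concrete, here is a minimal sketch (in plain NumPy; the tiny network, toy data, and learning rate are all made up for illustration) of what training a deep network reduces to: gradient descent on a loss function.

```python
# Illustrative sketch only: a deep net's training loop, stripped of all
# architectural sophistication, is gradient descent on a loss.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))            # toy inputs
y = (X[:, 0] * X[:, 1]).reshape(-1, 1)   # toy target to fit

W1 = rng.normal(scale=0.5, size=(2, 8))  # parameters of a tiny 2-layer net
W2 = rng.normal(scale=0.5, size=(8, 1))

lr = 0.05
for step in range(2000):
    h = np.tanh(X @ W1)                  # forward pass
    pred = h @ W2
    loss = np.mean((pred - y) ** 2)

    d_pred = 2 * (pred - y) / len(X)     # backward pass: loss gradients
    dW2 = h.T @ d_pred
    dW1 = X.T @ ((d_pred @ W2.T) * (1 - h ** 2))

    W1 -= lr * dW1                       # the whole of "learning" is
    W2 -= lr * dW2                       # this optimisation update

print(f"final mean-squared error: {loss:.4f}")
```

Everything a much larger model does during training is, structurally, this same loop at scale; the sophistication lies in the architecture and the optimiser, not in any new principle.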

I'm aware of several arguments challenging even the above notion. One of the primary breakthroughs that made deep learning applications possible is an advance in hardware: general-purpose GPU processors being used to build massively parallel neural network applications.

Sometimes, advances in hardware help us explore conceptual spaces that were hitherto unreachable with purely analytical models, and help bring about a conceptual breakthrough. For example, the advent of digital computers helped settle open problems in mathematics like the four-colour theorem for planar graphs. The theorem had remained open since the 1850s; it was finally proven in 1976 with the help of computer programs, and a fully machine-checked proof using an automated proof assistant followed in the 2000s.

But purists aren't exactly happy with theorem-proving of this kind. A mathematical proof usually carries some kind of philosophical insight into the problem, which is the primary attraction of a conceptual breakthrough. A computer-based proof only helps validate a conjecture, without offering any insight into why the theorem is true.

But let's not talk about theorems and proofs; let's talk about intelligence, which is what the promise of DL and AI is all about.

Recently I read another argument that all the human brain does is also just optimisation, and "intelligence" as we know it is simply the sum of lucky breaks obtained by the greatest optimisation process on earth -- evolution. Genetic evolution simply foraged its way over millions of years and got a few lucky breaks, which resulted in the intelligent life of today.

Well... yes and no.

Consider the following pictures:

[Image: Diamonds]
[Image: Anthill]

Both pictures show intricate patterns that result from complex optimisation processes in nature. The first picture shows a set of diamonds formed by a process of intense high-temperature annealing, and the second shows an anthill built by a swarm of termites, each of which is trying to compute some local optimum for itself.

While both represent intricate, complex structures resulting from processes of optimisation, I'm sure we would agree that only the second picture represents the output of some kind of "intelligent" activity.

What is the difference between an optimisation process that forms diamonds, and an optimisation process that builds an anthill? Why is the former not a result of intelligence, while the latter is? 

The point I am trying to make here is that while the human brain does forage around a lot, and mostly indulges in optimisation, its activities are not arbitrary. The brain forages in very specific ways -- and is driven by a concept of "Self".

The same is true of the termites that built the anthill. While the termites are collectively optimising something, individual termites are optimising too, by autonomously acting in their self-interest. Each element of the termite system is an autonomous agent, while the same is not true of the carbon atoms that form the diamonds. (In the Western model of physics, that is. In Eastern dharmic models, both molecules and creatures are made of the same essence of "being" called Atma, and are essentially optimisers.)

Regardless of whether we use the Western model or the Eastern model of physics, the key takeaway of my argument is that the essential element of intelligence is a sense of "Self", and optimisation processes driven autonomously by the Self, either in trying to sustain itself or out of a more generalised form of "self-interest". An intelligent collective is something that has a collective sense of self -- like an ant community or a nation state. The collective sense of self is an emergent property of the interactions (and interferences) among the constituent set of selves that make up the collective.

But deep learning based on artificial neural networks models neurons as "gates" rather than the "agents" that they really are. The "gate" model of a neuron comes from traditional models of computing based on digital logic, which in turn comes from electronic circuit designs based on transistors and valves, which in turn come from flow models in electrical engineering.

In contrast, life forms are better understood as "societies" of interacting autonomous agents, each of which is pursuing self-interest. Each neuron in our brain is much more than a gate. Neurons are autonomous agents that take decisions on their own about whom to connect to, where to get information from, and where to transmit to. Of course, the decisions taken by neurons are based on their surroundings and the signals they receive. But there is no overarching blueprint that decides a priori how neurons connect or remain connected to one another.
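To sharpen the contrast, here is a toy sketch (pure Python; the classes, the utility rule, and the rewiring heuristic are all invented for this illustration, not drawn from any established neural model) of a "gate" neuron with fixed wiring versus an "agent" neuron that owns and revises its own connections:

```python
# Purely illustrative toy code: nothing here is an established algorithm;
# it only dramatises the "gate" vs "agent" distinction in the text.
import math


class GateNeuron:
    """The 'gate' view: a fixed weighted sum passed through an activation."""

    def __init__(self, weights):
        self.weights = weights  # wiring fixed by an external blueprint

    def fire(self, inputs):
        return math.tanh(sum(w * x for w, x in zip(self.weights, inputs)))


class AgentNeuron:
    """The 'agent' view: the neuron itself decides whom to listen to."""

    def __init__(self, candidate_sources):
        # provisional connections, each with a locally estimated utility
        self.utility = {src: 0.0 for src in candidate_sources}

    def fire(self, signals):
        # listen only to sources this neuron currently finds useful
        active = [s for s in signals if self.utility.get(s, -1.0) >= 0.0]
        return math.tanh(sum(signals[s] for s in active))

    def adapt(self, src, reward):
        # local, self-interested update: strengthen or abandon a source
        if src in self.utility:
            self.utility[src] += reward
            if self.utility[src] < -1.0:
                del self.utility[src]  # autonomously drop the connection


# Example: the agent neuron rewires itself as feedback comes in.
n = AgentNeuron(["a", "b"])
print(n.fire({"a": 0.9, "b": -0.4}))   # listens to both sources initially
n.adapt("b", -1.5)                     # source "b" proves useless...
print(n.fire({"a": 0.9, "b": -0.4}))   # ...and is no longer consulted
```

The difference is where the wiring lives: in the gate model it is a parameter set from outside; in the agent sketch it is state the neuron itself maintains and revises out of its own "interest".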

Societies of agents are inherently declarative in nature, and routinely build abstractions on their own to help them coordinate. For instance, when we grasp something with our hands, we only hold a mental image of our hands doing the grasping. Our minds have no idea how many muscles are involved in this process, or how, or by whom, they are coordinated. The mind just creates an abstract image of the hand grasping something, and this image is translated into action by who-knows-how-many layers of neural abstractions underneath.

The complex coordination that such an action requires is managed by a distributed system of incentives and disincentives that brings about the desired collective behaviour.

I would like to conjecture that, without a sense of self and autonomous decision-making, deep learning will remain largely a process of generalising from training data, or finding patterns in data. For anything more, neurons should be able to build their architectures on the fly and continuously keep adapting them, fundamentally driven by a sense of self-preservation and utility maximisation.
