The War Against Machine Learning

In 1969, during the war for supremacy in Artificial Intelligence, the symbolists dropped a bomb on their adversaries, the connectionists, that proved so destructive that it stopped the advance of what would later be known as Machine Learning in its tracks, bringing to a close a conflict that had lasted over a decade. Fifty years have passed since these events, and the fortunes of both camps have reversed completely: while ML is on everyone’s lips, heralded as the engine of the next technological revolution, only a few readers would be familiar with the work of the symbolists.

The story goes back to the 1950s, during the genesis of AI, when researchers in computer science, neurobiology and information theory came together to study the feasibility of creating an electronic brain gifted with human intelligence. From the outset, two schools of thought emerged on the correct way to formalize the fundamentals of the theory. On one side there were the symbolists, whose central premise was that, for a machine to perform a task, you need to explicitly provide it with all the pieces of knowledge relevant to the problem that have been collected by human experts. The embodiment of the solution is a computer program that assembles new structures of information through the manipulation of abstract symbols with a specific set of rules and exceptions.

A symbolist with the task of turning a machine into an expert chef, capable of creating new and exciting recipes, would write a computer program listing all types of food with their chemical properties, the available kitchenware with their instruction manuals, templates of basic recipes, and categories of meals. The computer program would then collapse the art of creating a new recipe into a series of sequential steps as if following a decision tree.

Don’t give the fish to the computer; teach it how to fish.

On the other side of the confrontation, there were the connectionists, who were inspired by the massively distributed network of neurons in the brain, and the synaptic mechanism for communication between them. They proposed a computer architecture of basic units (the “artificial neurons”, or “neurodes”) that would be activated depending on intrinsic characteristics specific to each of them, and the pattern of the incoming signals. An input to the artificial neural network triggers a cascading reaction on arrays of neurodes until reaching an outermost layer from which a result would be recovered.
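As a rough illustration of the cascade described above, here is a minimal Python sketch. It assumes that each neurode's "intrinsic characteristics" are a list of weights and a firing threshold, which is one common simplification and not necessarily the formulation the early connectionists used. The names `neurode`, `network` and `LAYERS`, and the weights themselves, are made up for this example.

```python
def neurode(inputs, weights, threshold):
    """Fire (return 1) when the weighted sum of incoming signals reaches the threshold."""
    total = sum(w * x for w, x in zip(weights, inputs))
    return 1 if total >= threshold else 0


def network(inputs, layers):
    """Pass the signal through each layer in turn and return the outermost layer's output."""
    signal = list(inputs)
    for units in layers:
        signal = [neurode(signal, weights, threshold) for weights, threshold in units]
    return signal


# Hypothetical hand-picked parameters (an assumption for illustration only).
# Each unit is a (weights, threshold) pair.
LAYERS = [
    # First layer: one unit fires if input 1 or input 2 is active,
    # the other fires if input 3 is active.
    [([1, 1, 0], 1), ([0, 0, 1], 1)],
    # Outermost layer: fires only when both first-layer units fired.
    [([1, 1], 2)],
]

for example in [(1, 0, 1), (0, 0, 1), (1, 1, 0)]:
    print(example, "->", network(example, LAYERS))
```

Here the parameters are fixed by hand; the whole point of the connectionist program was that they should be found by the machine instead.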

The critical step is the calibration of those intrinsic characteristics of each neurode, which would be accomplished by “training” the network. The training consists of passing to the system many examples with known outputs and varying the internal properties of all neurodes until the error is minimal. In other words, the machine learns an appropriate configuration of internal parameters that solves the problem, at least for a set of examples.

A connectionist looking to make her machine an expert chef would train it by showing it thousands of examples of meals together with the ingredients and kitchen appliances that were used to prepare them. It would be up to the machine to figure out the patterns shared by all of them, and at the end of the training process, the network would have hard-wired the characteristics enabling it to produce a recipe, perhaps one never imagined by a human. Notice that neither the principles of cooking nor explicit knowledge of the properties of food is used by the connectionist, only vast amounts of examples, that is, vast amounts of data.

Minsky and Rosenblatt go to War.

The two theories were gradually developed during the 1950s, with AI researchers trying connectionist or symbolic approaches interchangeably, oblivious to the irreconcilable differences between them. But as they made progress in their work, the profound discrepancies between the assumptions made by the two sides became too obvious to ignore. At its core, the AI schism reflects a profound philosophical debate about intelligence, cognition, and learning, all highly complex matters that can ignite passionate confrontations about who holds the truth.

The lines were drawn along opposite views that transcended mere technicalities. The symbolic side championed attributes related to the left-brained, analytical mind (logical, serial, discrete, local, hierarchical) while the connectionists took up the cause of the right-brained, creative mind (analogical, parallel, continuous, distributed, heterarchical). This characterization of both sides is, of course, simplistic, but it hints at the vast discrepancies that were dividing the AI family.

Enter Marvin Minsky and Frank Rosenblatt. Both scientists had traced parallel careers that took them from the Bronx High School of Science in New York, where they were classmates in the 1940s, to the helm of the two intellectual ships. Minsky, once a connectionist himself, became the leader and spokesman of the symbolist movement from his headquarters at MIT; Rosenblatt, at Cornell, assumed the same role for the ML side.

The first resounding victory for the connectionists came in 1958, when Rosenblatt introduced the “perceptron”, one of the simplest neural networks, and announced that it had an almost magical property: regardless of the complexity of the problem posed to the machine, the perceptron algorithm was guaranteed to find a solution in a finite number of steps whenever one existed, that is, whenever some setting of the perceptron’s internal parameters could solve the problem. This property was proved with mathematical rigor by Rosenblatt, and later by Block and Novikoff, placing the ML approach on the most solid of grounds.
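To make the convergence property concrete, here is a minimal sketch of the textbook perceptron learning rule for a single threshold unit. It is a modern simplification, not a reconstruction of Rosenblatt's original machine, and the names `predict`, `train_perceptron` and `AND_EXAMPLES` are chosen for this example. The task is the logical AND of two inputs, a problem for which a solution is known to exist.

```python
def predict(weights, bias, x):
    """Output 1 when the weighted sum plus bias is positive, otherwise 0."""
    return 1 if sum(w * xi for w, xi in zip(weights, x)) + bias > 0 else 0


def train_perceptron(examples, max_epochs=100):
    """Nudge weights after every mistake; stop at the first mistake-free pass.

    Returns (weights, bias, epoch), with epoch set to None if no
    mistake-free pass occurred within max_epochs.
    """
    weights = [0.0] * len(examples[0][0])
    bias = 0.0
    for epoch in range(1, max_epochs + 1):
        mistakes = 0
        for x, target in examples:
            error = target - predict(weights, bias, x)  # -1, 0 or +1
            if error != 0:
                mistakes += 1
                # Move the weights toward (or away from) the misclassified input.
                weights = [w + error * xi for w, xi in zip(weights, x)]
                bias += error
        if mistakes == 0:
            return weights, bias, epoch
    return weights, bias, None


# Logical AND: some setting of two weights and a bias solves it,
# so the rule is expected to stop after finitely many corrections.
AND_EXAMPLES = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]

weights, bias, epoch = train_perceptron(AND_EXAMPLES)
print("weights:", weights, "bias:", bias, "mistake-free at epoch:", epoch)
```

The guarantee says nothing about problems for which no such setting of weights exists; on those, the loop above simply runs until `max_epochs` and returns `None` for the epoch.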

Connectionists were jubilant with the invention of the perceptron, and their excitement prompted them to start dreaming of a future of endless delights. An elated Rosenblatt would take his vision to the media and in 1958 the New York Times reported:

“The Navy revealed the embryo of an electronic computer today that it expects will be able to walk, talk, see, write, reproduce itself and be conscious of its existence. Later perceptrons will be able to recognize people and call out their names and instantly translate speech in one language to speech and writing in another language, it was predicted.”

The symbolists didn’t take these declarations well, deeming them extravagant and ridiculous. In their view, Rosenblatt was irresponsible to paint these eye-catching images for the public, disrespecting the scientific standard within which the debate should be framed. But for the connectionists, being in the spotlight brought a boost in appeal, and they started to attract hordes of researchers scattered across university campuses all over the world and, perhaps most important, the resources of funding bodies.

Dropping the Bomb

Although the 1960s are remembered as the first golden age of ML, the truth is that during those years the progress made by the connectionists did not match the expectations they had created. There were some advances on the theoretical side, but applications were still clunky and far from impressive. It became clear that neural networks required much more computing power than was available at the time, and the lobbying for more resources started running into a wall of skepticism. The Office of Naval Research had funded some of the development of the perceptron, but to take things to the next level, the connectionists would need to reach the deep pockets of ARPA, the powerful US government agency responsible for the development of new military technologies (now rebranded as DARPA).

Meanwhile, for Minsky, the wave of connectionist enthusiasm had reached an unacceptable level. It didn’t make sense for the AI community to keep wasting time on an approach that was just full of hot air, and that violated the most straightforward fact about intelligence and learning: “No machine can learn to recognize X unless it possesses, at least potentially, some scheme for representing X”. He also had a fundamental problem with the lack of rigor of the connectionist movement and, in his view, their apathy toward answering the most critical question of all: why were the learning systems able to learn to recognize certain kinds of patterns and not others?

And here is where the genius of Minsky was displayed to its fullest. Instead of waiting for the connectionists to provide the answer to this question, he embarked on the mission of answering it himself. Together with Seymour Papert, also at MIT, Minsky published “Perceptrons” in 1969, a 292-page treatise of extraordinary elegance and profound insights about neural networks. For their exposition, they chose a very basic architecture that allowed them to frame the analysis with the rigor they had expected from Rosenblatt and his troops. And though the book was dispassionate in its presentation of the pros and cons of neural networks, leaving the mathematics to do most of the talking, it carried a bombshell within: the limitations of perceptrons included some embarrassingly simple problems that any AI system should be in a position to solve.

The book didn’t explicitly say so but, for any reader, the message that the king may have no clothes was just too obvious. What could be the future of a technology that couldn’t even tell if a figure is connected or not? How could those perceptrons recreate the complexities of human intelligence if they couldn’t reproduce the humble exclusive OR logic gate?
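The exclusive OR limitation is easy to see in a few lines of Python. The sketch below is an illustration rather than a proof, and its names (`unit`, `single_unit_fits`, `two_layer_xor`, `GRID`) and the grid of candidate parameters are assumptions made for this example. It tries every combination of two weights and a bias from a coarse grid on a single threshold unit: some combination reproduces AND, but none reproduces XOR, because no straight line separates the inputs (0, 1) and (1, 0) from (0, 0) and (1, 1). Stacking two layers of the same units, with hand-picked weights, does compute it.

```python
from itertools import product

AND = {(0, 0): 0, (0, 1): 0, (1, 0): 0, (1, 1): 1}
XOR = {(0, 0): 0, (0, 1): 1, (1, 0): 1, (1, 1): 0}


def unit(x, w1, w2, bias):
    """A single threshold unit with two inputs."""
    return 1 if w1 * x[0] + w2 * x[1] + bias > 0 else 0


def single_unit_fits(table, grid):
    """True if some (w1, w2, bias) taken from the grid reproduces the whole truth table."""
    return any(
        all(unit(x, w1, w2, bias) == y for x, y in table.items())
        for w1, w2, bias in product(grid, repeat=3)
    )


# Candidate values -5.0, -4.5, ..., 5.0 (an arbitrary grid, assumed for illustration).
GRID = [i / 2 for i in range(-10, 11)]


def two_layer_xor(x):
    """Hand-wired two-layer network: (x1 OR x2) AND NOT (x1 AND x2)."""
    h_or = unit(x, 1, 1, -0.5)
    h_and = unit(x, 1, 1, -1.5)
    return unit((h_or, h_and), 1, -1, -0.5)


print("single unit can fit AND:", single_unit_fits(AND, GRID))
print("single unit can fit XOR:", single_unit_fits(XOR, GRID))
print("two-layer network on XOR:", {x: two_layer_xor(x) for x in XOR})
```

A finer or wider grid would not change the XOR result, since the obstacle is geometric rather than a matter of finding the right numbers.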

We know now that the picture that Minsky and Papert painted was incomplete, in the sense that neural networks can overcome the limitations that they described so precisely. The trick they pulled consisted of sufficiently narrowing the type of systems that they analyzed, disregarding multi-layer perceptrons (which were already known when they wrote the book). There is something paradoxical about the rigor of a mathematical theorem: on the one hand it serves as the ultimate framework for establishing the truth, but on the other, it has the power to obfuscate a discussion unnecessarily. The Perceptrons controversy is an example of a mathematical result that ended up diverting the attention of researchers in a decisive but unfair way, and it is by no means the only one.

The publication of Perceptrons opened the gates for the disbandment of the connectionist movement. Its researchers moved to other areas, leaving behind only a handful of the most loyal members to keep dreaming of learning machines. Compounding this shock, ARPA explicitly decided to back symbolic AI and not to fund neural network research. And, before the curtain closed on this story, tragedy struck the connectionists two years after the publication of Perceptrons, when Frank Rosenblatt died in a boating accident on the day of his 43rd birthday.

The second coming of neural networks would have to wait for the next generation of researchers in the late 1980s, but the challenges they faced then proved again to be too strong. The current hype around ML is, in reality, the third time that this technology has grabbed control of the AI movement, but this time it has done so with a vengeance that makes you wonder whether, on this occasion, it is here to stay.
