On Intelligence: Lessons for Meaning Extraction

One of the books that inspired my thinking on meaning extraction was On Intelligence by Jeff Hawkins, one of Silicon Valley’s most successful computer architects, and Sandra Blakeslee, a highly regarded science writer. Hawkins founded the Redwood Neuroscience Institute to study memory and cognition, but he is not just an academician doing brain research. Rather, he is a substantial practitioner: he is the CTO of palmOne and counts the creation of the PalmPilot among his accomplishments. I am always attracted to people who not only think a lot about a problem, but who then solve the myriad practical problems required to translate those thoughts into devices and processes out here in concrete reality. (A colleague recently described me as a “poster boy for pragmatism.”)

Hawkins begins the book by observing that artificial intelligence has generally failed to produce anything like the results that its originators promised and expected. (Echoing this theme, I once heard someone describe AI as the failed dream of the visionaries of the ’70s, then again of the visionaries of the ’80s, then again of the visionaries of the ’90s, and so on.) Hawkins believes the problem is that the AI industry never actually took the time to learn how intelligence works in humans. You can’t build a replica, image, or functioning duplicate of something about which you are clueless. Furthermore, he believes that what brain research we have done on human intelligence indicates that people think and reason very differently from the way AI researchers supposed we might, and very differently from the way they designed their computer programs as a result.

Hawkins does not claim to know how to make artificially intelligent computer programs, but he presents the argument that intelligent computer programs are more likely to succeed if they process information the way the brain does.  And while we don’t know everything, or even much, about how intelligence arises in the brain, we do know some things.

One thing we know is how the brain interprets and acts on outside sensory data. What is interesting is that there are six levels of processing going on, with each level supplying a more general interpretation of the data. (Warning: I am writing this from memory of the book, which I read last year, so it presents Hawkins’ general thinking with a lot of paraphrasing and interpreting; I am not using his specific words and specific examples.) For example, the eye sees patterns of light, dark, and color and passes the information to level one in the brain’s sensory processing systems. Level one examines its conceptual models for light and dark patterns, says, “Hey, that might be a triangle,” and passes that information up to level two.

Level two examines its models for triangles and says, “Triangles like this one are likely noses; expect circles above the triangle,” and passes that information down to level one as a prediction. Simultaneously, motor functions shift the eyes up and back and forth, scanning from one point of fixation to another in what scientists call a saccade, bringing in more external data. Level one, amazingly, treats the downloaded prediction from level two as equal in weight and reality to the data the eyes are now sending in saying we have circles above the triangle. As long as the download from level two and the input from the eyes are congruent, level one keeps processing both streams as if they were both from outside, now sending data up to level two that there is a triangle with two circles just above it. Level two examines its models for a triangle and two circles, selects the best one, and passes the interpretation to level three, saying we have two eyes and a nose.

Level three examines its conceptual models and says the object appears to be a face, predicting there will be a mouth, ears, and a chin, and sending that prediction down to level two. Level two integrates the level three generalization and sends the data down to level one, which again interprets the download as additional sensory data from the outside while scanning outside inputs for patterns of light, dark, and color that might be mouths, ears, and chins. Meanwhile, level three is also sending its interpretation up to level four, with the same strength as sensory data, reporting that a “face” is present.

Level four looks at its models and says it is probably the face of someone in our tribe, because the face is coming from the direction of our camp, passing that information both up to level five and down to the lower levels to use in interpreting the sensory data coming up.

Level five says it is probably the face of Frank, based on some perceived similarity, and passes this to level six, which concludes, “This is Frank, who is a friend.” Level six then passes the information that our friend Frank is approaching to the executive regions of the brain, which decide that we should smile and put out our hand to shake his in warm greeting. Notice how profound level six is compared to level one.
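To make the cascade concrete, here is a toy sketch in Python of the predict-and-confirm loop just described. This is my own illustration, not Hawkins’ actual cortical model: the level numbering is compressed, the patterns are invented, a `sample` function stands in for a saccade, and a real hierarchy would of course be probabilistic rather than exact set matching.

```python
# Toy sketch (my illustration, not Hawkins' model): each level matches
# the evidence seen so far against a stored pattern, sends the missing
# pieces back down as a prediction, and reports a more general
# interpretation upward once the prediction is confirmed.

LEVEL_MODELS = [
    {"expects": {"triangle", "two circles above"},                "means": "two eyes and a nose"},
    {"expects": {"two eyes and a nose", "mouth", "ears", "chin"}, "means": "a face"},
    {"expects": {"a face", "coming from our camp"},               "means": "a tribe member's face"},
    {"expects": {"a tribe member's face", "resembles Frank"},     "means": "Frank, a friend"},
]

def perceive(initial_evidence, sample):
    """Run evidence up the hierarchy; each level's prediction tells the
    senses exactly what to look for next (the 'download')."""
    evidence = set(initial_evidence)
    interpretation = None
    for level, model in enumerate(LEVEL_MODELS, start=2):
        predicted = model["expects"] - evidence   # top-down prediction
        for part in predicted:
            if sample(part):                      # cheap, targeted check
                evidence.add(part)                # treated like real input
        if model["expects"] <= evidence:          # prediction confirmed
            interpretation = model["means"]
            evidence.add(interpretation)          # report upward
            print(f"level {level}: {interpretation}")
        else:
            break                                 # prediction violated
    return interpretation

# The eyes start with one fixation: a triangle-like patch. Here every
# targeted check succeeds; in a real scene some would fail.
perceive({"triangle"}, sample=lambda part: True)
```

Notice that each level never rescans the scene; it only asks the senses to check the specific pieces its model predicts, which is the property the next paragraphs turn on.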

What is marvelous about this information processing architecture, as presented by Hawkins in On Intelligence, is that it can operate with very little outside data compared to the amount that is actually out there. Our eyes can only see a small percentage of the area in our “field of vision.” In any given moment we focus on only a minuscule percentage of what is in front of us. The prediction downloads from the higher levels fill in the data that should be there. Most of what we perceive out there is really just these predictions from the higher layers of the brain’s sensory processing systems. Because the brain accumulates a huge store of experience as we go through life, these predictions get better and better; by the time even young children are able to leave the immediate area made safe by adult protectors, the predictions are correct an astonishingly high percentage of the time.

By filling in the predictions at the lower levels as if they were sensory data, it is not necessary to process all the data from outside before a picture is formed. The limited-bandwidth sensory apparatus need only confirm the prediction; it does not have to process the entire environment. Only because the system is made so efficient is there time to operate in our complex world at adequate speeds to be successful as high-level organisms. Based on a glimpse of only 1% of the scene, the system of experience-based models and predictions leads us to grab a spear and assume a ready stance, because that dimly perceived, hardly discerned movement in the bush is almost certainly a lion looking for dinner.
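A quick back-of-envelope calculation shows how large the savings are. The numbers below are purely illustrative, my own rather than Hawkins’:

```python
# Illustrative arithmetic only (my numbers, not from the book): compare
# processing every patch in the field of vision against sampling just
# enough fixation points to confirm a prediction.
SCENE_PATCHES = 1_000_000       # hypothetical patches in the full scene
FIXATIONS = 3                   # saccades needed to confirm the model
PATCHES_PER_FIXATION = 2_000    # high-resolution foveal sample per fixation

sensed = FIXATIONS * PATCHES_PER_FIXATION
print(f"fraction actually sensed: {sensed / SCENE_PATCHES:.2%}")  # 0.60%
```

Everything else is filled in by prediction, which is only safe because the models are usually right.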

So, how does all this relate to meaning extraction in text analytics solutions? I submit that there are two similarities. First, there is the necessity of operating with limited consumption of the available data. Second, there is the use of pre-existing, experience-based models to infer larger, important conclusions.

First, just as with the challenge of perceiving the world visually, there is far more content in a modern research repository than any one person can consume in any given search process. A researcher studying a business strategy problem or a clinical research problem is only going to examine a very small percentage of the information available, just as our eyes only focus on a small percentage of the light patterns falling on the retina.

For example, in a database of two or three hundred thousand IT analyst reports, search on a company name and you are going to get thousands of hits. In a pharmaceutical database of twenty-five million journal articles, search on a disease state and you will get hundreds of thousands of hits. My suggestion is that the user’s only real hope of coping with so much information is a “download” from a “higher level” of the search application: a prediction of what is present, and of where the most important documents are for confirming that prediction.

If the search engine can infer meaning from the mass of search results, the user can be guided to the relatively few reports and journal articles that would make the most difference in correctly perceiving the underlying reality. The user can then fully consume just this small number of reports and articles, much as the eye saccades to a new point of fixation looking to confirm the interpretation being suggested by the higher levels of reasoning in the brain.
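As a hypothetical sketch of what that guidance could look like (the function, scoring, and data shapes here are my invention, not a description of any particular engine), ranking a large result set by how strongly each document matches a meaning model lets the user “saccade” to a handful of documents instead of wading through thousands:

```python
# Hypothetical sketch (my invention, not any particular engine): rank a
# large result set by how strongly each document matches a meaning
# model, so the user reads only the few most confirming documents.

def rank_for_reading(results, model_terms, k=10):
    """Return the k documents most likely to confirm the model."""
    def score(doc):
        text = doc["text"].lower()
        return sum(text.count(term) for term in model_terms)
    return sorted(results, key=score, reverse=True)[:k]

# In practice `hits` would hold thousands of documents.
hits = [
    {"title": "Q3 analyst note", "text": "Rumors of an acquisition..."},
    {"title": "Earnings recap",  "text": "Revenue grew modestly..."},
]
top = rank_for_reading(hits, model_terms={"acquisition", "divestiture"})
```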

Second, then, how does a search application in a research portal make this prediction of the most important documents, given the business purpose of the search? With conceptual models, of course. Based on the long experience of practitioners in the field the user is working in, the meaning extraction application is armed with models that look for bits of information, and relationships between them, that imply meaning. These models get encoded in the search application in the form of things like meaning-loaded entities in a meaning taxonomy.
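A minimal sketch of what meaning-loaded entities in a meaning taxonomy might look like in code follows. This is again my own illustration; the themes, entities, and surface forms are invented stand-ins for what domain experts would actually encode:

```python
# Minimal illustration (invented themes, entities, and surface forms):
# each entity carries the business meaning an expert attached to it,
# keyed to the surface forms that signal its presence in text.

MEANING_TAXONOMY = {
    "competitive moves": {
        "acquisition": ["acquire", "buyout", "merger agreement"],
        "price cut":   ["price reduction", "discounting", "undercut"],
    },
    "clinical findings": {
        "adverse event": ["toxicity", "serious adverse event"],
    },
}

def tag_entities(text):
    """Return the meaning-loaded entities whose surface forms appear."""
    text = text.lower()
    return {(theme, entity)
            for theme, entities in MEANING_TAXONOMY.items()
            for entity, surface_forms in entities.items()
            if any(form in text for form in surface_forms)}
```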

Much like level three implied a face from what level one thought of as a sufficient number of pixels of light and dark to form two circles and a triangle, a meaning extraction application can infer the presence of a strategically important business initiative by a competitor from a few lines of text in a sufficient number of reports, or can infer the presence of a technically significant finding from the pattern of terms in the text of enough of the journal articles in the search result.
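Continuing the sketch above (same caveats: invented names, and the threshold is an arbitrary stand-in), the inference step would aggregate entity hits across documents and only report an initiative or finding once enough independent documents carry the same pattern:

```python
# Sketch of the inference step: like level three needing enough pixels
# across fixations, only flag an initiative or finding when the same
# entity recurs across enough independent documents.
from collections import Counter

def infer_findings(documents, threshold=5):
    """Flag entities that recur in at least `threshold` documents."""
    counts = Counter()
    for doc in documents:
        for hit in tag_entities(doc["text"]):  # from the sketch above
            counts[hit] += 1
    return [hit for hit, n in counts.items() if n >= threshold]
```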

The really important factors in such systems are the models. You have to know what to monitor, why it is important, and how to perceive that it is present. While the plumbing, moving the data along fast enough, is critically important to making the system work as a mechanism, intelligence only arises when the brain, er, I mean search engine, invokes the right model given the text present in the mass of documents in the search results.

I, like Hawkins, do not claim that anyone, including Northern Light, has succeeded in building search engines that are truly intelligent in the same way that human beings are intelligent. True, we have made important progress in enabling meaning extraction in search applications. But grading the state of the art against the brain’s levels, I think we may be somewhere between level three (it’s a face) and level four (it’s a face from our tribe, or not). In our parlance, we may be able to identify that a competitor is taking market share from you (he is in your camp), but not what he hopes to gain strategically as a result (take your dinner), or what you should do in response (grab that spear).

Also, the current state of meaning extraction lacks the ability for the system to spawn the models itself, something our brain does spontaneously from the moment we first open our eyes. How do we know that market share is important in this market segment anyway, and what is it that competitors actually do to gain or lose it? How can the system learn these things without the guidance of human beings? And can the system be programmed to find on its own the patterns that imply a model is present, which right now we have to hand-tool? Profound conclusions and sophisticated analytical tools like these are up there in level six, or even higher in the more mysterious executive regions of the frontal cortex. We are not yet able to operate our software engines in such rarefied space, where the most uniquely human intelligence arises.

But nature did it one step at a time.  So can we.