Beyond Sentiment Scoring

There are times I think the text analytics industry has painted itself into a corner with sentiment scoring.  Not too long ago I attended an industry event at which every text analytics provider that presented talked about how their solution could do sentiment scoring, and also a few other things.  Speaker after speaker, I thought the “few other things” mentioned in passing were way more useful than sentiment scoring.  But the speakers appeared to feel that sentiment scoring was what text analytics is about, at least from a PR and marketing perspective.

I suspect it was history and commercial pressure that caused this.  The history part is that the origin of text analytics was in intelligence agencies trying to analyze world media.  Around 1997, I was invited to observe what must have been one of the first implementations of text analytics.  The developer was government research firm Bolt, Beranek and Newman.  (Right, that BBN, the one that invented the Internet.)  This modest project was recording all of the media broadcasts in all languages around the world, translating them to English, performing entity extraction, and scoring the sentiment toward each entity.  The client of BBN was, of course, the obvious three-letter intelligence agency.  The entities in question were nations, governments, political leaders, militaries, guerrilla organizations, and the like.  The use was to assess such things as developing threats and political upheavals.  Needless to say, I was blown away.  And sentiment scoring was the essence of the value add the application delivered.  The whole operation was carried out, fundamentally, for the purpose of scoring the sentiment toward the entities and watching for trends.

A few years later sentiment scoring made the jump to the commercial space.   Companies doing media monitoring, counting stories, and providing “clips” were able to add sentiment scoring as a flashy new technology.  And (Whamo! Presto!)  the industry of reputation management was born.  From a marketing perspective, sentiment scoring for reputation management was a brilliant move.  If you weren’t tracking your positives and negatives, you were just plain inadequate as a marketing communications manager.  And the economics were (and still are) terrific for those of us in the sentiment scoring business; we had a tool that we could sell to almost every company and industry with few if any changes to the core platform.

But now I think our success as text analytics vendors of sentiment scoring solutions has painted us into the corner I mentioned above.  Selling sentiment scoring worked so well we haven’t learned how to apply the techniques of text analytics to other problems.  And these other problems may be more interesting to solve.  Take this piece of text for example.

“Investors were cheered by Company A’s announcement that it is engaged in a cost reduction effort in order to improve profits.”

Entity extraction can clearly identify Company A as a company entity mentioned in this article.  Also, a sentiment scoring engine might conclude that positive sentiment is being expressed toward Company A because of the emotionally laden words “cheered” and “improve.”

Of course, the real world is messy.  Now consider the rest of the news article.

“Company A will be laying off employees, cutting salaries, terminating health benefits for retirees, and closing plants.”

Suddenly, we don’t know how to score the sentiment of the article or the sentiment toward the entity Company A.  “Cheered” and “improve” are mixed in with “laying off,” “cutting,” “terminating,” and “closing.”  We still suspect the investors are positive toward Company A, but now worry that the retirees are not.  We are not sure about the author, editor, or publisher of the story; they may just be dispassionately reporting facts, and have no feelings whatsoever toward Company A.
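
To make the problem concrete, here is a minimal sketch of a word-counting sentiment scorer run against the two pieces of text above.  The tiny lexicon and the score function are made up for this illustration; a commercial engine would use a much larger, weighted vocabulary and would try to tie each word to a specific entity.

```python
import re

# Hypothetical toy lexicon, invented for this illustration only.
LEXICON = {
    "cheered": 1, "improve": 1,
    "laying off": -1, "cutting": -1, "terminating": -1, "closing": -1,
}

def score(text):
    """Sum the polarity of every lexicon term found in the text."""
    lowered = text.lower()
    total = 0
    for term, polarity in LEXICON.items():
        hits = re.findall(r"\b" + re.escape(term) + r"\b", lowered)
        total += polarity * len(hits)
    return total

first = ("Investors were cheered by Company A's announcement that it is "
         "engaged in a cost reduction effort in order to improve profits.")
rest = ("Company A will be laying off employees, cutting salaries, "
        "terminating health benefits for retirees, and closing plants.")

print(score(first))               # 2: the first sentence alone looks positive
print(score(first + " " + rest))  # -2: the full story flips to negative
```

The single number swings from positive to negative, yet neither value says whose sentiment is being measured: the investors’, the retirees’, or the reporter’s.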

But more important than how to score the sentiment, who cares about sentiment in this story anyway?

This news story is rich with strategic information about Company A.   A text analytics solution that is trying to extract business meaning rather than score sentiment would latch onto these “meaning-loaded entities,” expressed as strategic scenarios of interpretation:

Company A is Reducing Costs with a Staff Reduction

Company A is Reducing Costs with a Salary and Wage Reduction

Company A is Reducing Costs with Plant Closings
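
One simple way to produce statements like these is a set of pattern rules, each mapping a phrase in the text to a scenario.  The sketch below is only an illustration under that assumption: the RULES table and the extract_scenarios function are hypothetical, the patterns cover just this example, and a production system would need real entity resolution and far broader linguistic coverage.

```python
import re

# Hypothetical rules: each maps a surface pattern to a strategic scenario.
# The assumption that these actions are cost reductions is hard-coded here.
RULES = [
    (r"\blay(?:ing)? off\b|\blayoffs?\b",
     "Reducing Costs with a Staff Reduction"),
    (r"\bcut(?:ting)? (?:salaries|wages)\b",
     "Reducing Costs with a Salary and Wage Reduction"),
    (r"\bclos(?:e|ing) (?:\w+ )?plants?\b",
     "Reducing Costs with Plant Closings"),
]

def extract_scenarios(text, company):
    """Return scenario statements for a company named in the text."""
    if company.lower() not in text.lower():
        return []
    return [f"{company} is {label}"
            for pattern, label in RULES
            if re.search(pattern, text, flags=re.IGNORECASE)]

article = ("Company A will be laying off employees, cutting salaries, "
           "terminating health benefits for retirees, and closing plants.")

for scenario in extract_scenarios(article, "Company A"):
    print(scenario)
```

Run on the second sentence of the story, this prints the three scenarios listed above.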

Automatically extracted meaning such as this, when performed not on one story that is all of two sentences long but on tens of thousands of full-length news articles being published weekly on all the companies doing business with one’s own organization, can greatly shorten the time to insight for business analysts working for competitors, customers, or suppliers of organizations like Company A.  If you are a competitor, you may be facing a leaner, meaner, more price-aggressive Company A in the marketplace.  If you are a customer, you may want to look around for alternative sources of supply if one of the plants being closed is the one you have been buying from.  If you are a supplier, you may want to pitch Company A on ways they can cut costs by using your products in place of your competitors’ offerings.

Sentiment scoring is a nice idea.  But it is time for the text analytics industry to move forward with new, more powerful capabilities.