7 Writing Tips for Accurate Machine Learning Summaries
There’s a wrong way and a right way to “talk” to a machine. And if you want to craft influential content, it’s important to know the distinction. That’s because it is now practical for organizations to use Machine Learning (ML)-enabled systems to “read” and summarize complex documents, as Northern Light SinglePoint does for market research and competitive intelligence.
If you want the content you write represented accurately in those ML-generated summaries, it’s important to observe some basic writing do’s and don’ts to be the most “ML friendly.”
Here are seven simple guidelines to help ensure an ML-enabled system will accurately analyze your writing.
- Authors Should Use Full Stops. Period.
Machine Learning (ML) works best with simple, declarative, self-contained sentences that end in periods. Declarative sentences have a subject and verb and express a complete idea within the boundary of the sentence. For example, “IBM bought Red Hat,” is just such a declarative sentence because it expresses a complete idea with ‘IBM’ as the subject, ‘bought’ as the verb, and ‘Red Hat’ as the object of the verb. The machine knows the idea is complete when it hits the period at the end.
On this last point, if you end your sentences with anything but a period, you may not have written a declarative sentence. Exclamation points and question marks are red flags that the sentence may not express an idea understandable by the machine.
Also, question marks may indicate that you separated the question and the answer. Splitting a question and its answer into separate sentences causes a problem because the machine will not be able to semantically relate the two sentences. So authors should pretend to be the opposite of a Jeopardy contestant, and always write a question in the form of an answer.
Don’t say: “What percentage of users update their antivirus programs weekly? 67%.”
Do say: “67% of users update their antivirus programs weekly.”
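As a rough illustration of this tip, an author could scan a draft for sentences that end in a question mark or exclamation point. The sketch below is only a minimal example under simple assumptions: the function name `flag_non_period_sentences` is hypothetical, and the sentence splitting is naive (it breaks on any `.`, `?`, or `!` followed by whitespace, so abbreviations and quoted text may be split incorrectly).

```python
import re

def flag_non_period_sentences(text):
    """Return the sentences in text that end in '?' or '!' instead of a period."""
    # Naive split: break after '.', '?', or '!' when followed by whitespace.
    sentences = re.split(r"(?<=[.?!])\s+", text.strip())
    return [s for s in sentences if s.endswith(("?", "!"))]

draft = "What percentage of users update their antivirus programs weekly? 67%."
for sentence in flag_non_period_sentences(draft):
    print("Consider rewriting:", sentence)
```

A flagged sentence is only a prompt for the author to check whether the idea can be restated as a single declarative sentence.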
- Authors Should Avoid Pronouns Referring to Names in Prior Sentences.
Words like “he, she, they, it” or even pronoun-like words such as “company” and “firm” aren’t easily understood by ML. It works best when you provide the actual name of the person or the business instead of the pronoun. This is especially important when you have the pronoun in one sentence and the antecedent of the pronoun in a previous sentence.
Don’t say: “We interviewed John Bowman, SVP of Cloud Computing Is Us. He said, ‘Cloud computing rocks.’”
Do say: “John Bowman, SVP of Cloud Computing Is Us, said in an interview, ‘Cloud computing rocks.’”
- Authors Should Integrate Lists into Sentences.
As already noted, ML does best with declarative sentences. This means ML won’t easily associate the heading of a list with its actual items if they’re formatted as bullet-pointed, numbered, or lettered items below that heading. The answer to this particular issue is simple: authors should put everything together in a single statement. If you need to use a list or bullet points, summarize the list in a following sentence.
Don’t say: “The key factors in enterprise cloud vendor selection are:
- a) Implementation assistance
- b) Near-instant response to increased server needs
- c) Uptime history
- d) Cost”
Do say: “The key factors in enterprise cloud vendor selection are implementation assistance, near-instant response to increased server needs, uptime history, and cost.”
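For authors who draft with bullet points, the conversion into a single sentence can be sketched in a few lines. This is a minimal example, not a prescribed tool: the helper name `list_to_sentence` is hypothetical, and it assumes the items are already worded and capitalized the way they should appear mid-sentence.

```python
def list_to_sentence(heading, items):
    """Join a list heading and its items into one declarative sentence."""
    # Drop surrounding whitespace and any trailing period from each item.
    cleaned = [item.strip().rstrip(".") for item in items if item.strip()]
    if len(cleaned) > 2:
        joined = ", ".join(cleaned[:-1]) + ", and " + cleaned[-1]
    else:
        joined = " and ".join(cleaned)
    # Remove the trailing colon (and spaces) from the heading before joining.
    return f"{heading.rstrip(': ')} {joined}."

factors = [
    "implementation assistance",
    "near-instant response to increased server needs",
    "uptime history",
    "cost",
]
print(list_to_sentence("The key factors in enterprise cloud vendor selection are:", factors))
```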
- Authors Should Save Social Media Conventions for Social Media.
A hashtag may help a tweet get quickly picked up on Twitter, but it will just confound an ML system. So authors should stick with the actual words sans symbols, and their chances of being understood improve considerably. Also, authors should avoid the sort of “shortcut grammar” often found in tweets and texts.
Don’t say: “#IBM bring the technology and #Accenture bring the industry expertise sums up why #partnering is so important.”
Do say: “IBM bringing the technology and Accenture bringing the industry expertise sums up why partnering is so important.”
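When social media text is reused in a report, the hashtag symbols can be removed mechanically before the grammar is cleaned up by hand. The following is a minimal sketch; the function name `strip_hashtags` is hypothetical, and it only removes the `#` symbol, leaving any “shortcut grammar” for the author to fix.

```python
import re

def strip_hashtags(text):
    """Remove the '#' from each hashtag while keeping the word itself."""
    # '#' followed by word characters becomes just the word characters.
    return re.sub(r"#(\w+)", r"\1", text)

tweet = "#IBM and #Accenture show why #partnering is so important."
print(strip_hashtags(tweet))
```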
- Authors Should Avoid Creative Similes and Expressions.
If you employ language like, “It’s hotter than an oven at 500 degrees,” you may liven up your writing, but ML may assume you’re writing about cooking instead of the actual subject of the piece. So, simply stick to the facts and ML will reward you for your efforts.
Don’t say: “The study shows that compared to XYZ’s servers, IBM Power Systems are faster than a greased pig.”
Do say: “The study shows IBM Power Systems are faster than XYZ’s servers by a large margin.”
- Authors Should Use Common Ways of Referring to Objects and Processes.
You’re working too hard if you’re trying to come up with novel expressions and names that aren’t routinely used in others’ writing. Authors should also use the common acronyms that apply. So, authors should be sure to word their references the same way as most writers.
Don’t say: “Geospatial location systems are an essential element of driverless car technology.”
Do say: “GPS is an essential element of driverless car technology.”
- Authors Should Not Use the Imperative Mood.
Finally, avoid the imperative form of a sentence. The previous statement was, in fact, an example of using the imperative, in which there is no explicit subject, just a command or a request. Machine Learning is prone to perceive a sentence without a subject as an incomplete idea, which will compromise the insights it gains from your content.
Don’t say: “Don’t use the imperative mood.”
Do say: “Authors should not use the imperative mood.”
By writing simple declarative sentences containing commonly used language, you’ll find it’s not difficult to match up your human intelligence with a system’s artificial intelligence. It is not hard to write for both people and the machine at the same time. As a matter of fact, this article is written using all the principles laid out above. (Declarative sentences, no questions with separated answers, minimum use of pronouns, no creative terminology containing essential information, no imperative mood.) Most likely, you, the readers of this article, never noticed the subtle changes that make this article machine learning-ready. If you just use the tips and tricks in this post, your articles and reports can become the superstars of any ML-generated search results summary, including those read by the many Fortune 100 clients using our knowledge management platform, SinglePoint.