Four Ways Natural Language Processing Will Accelerate Digital Transformation
Digital transformation in security is rapidly picking up pace as we continuously bring more processes online. We collect more data than ever, but we struggle to use it effectively. Much of our data is in the form of raw text that we don’t have time to parse and analyze—potentially leading to missed threats or undiagnosed root causes. Thankfully, recent advances in the field of Natural Language Processing (NLP) offer solutions to these problems.
NLP is a field of artificial intelligence (AI) that enables machines to process and interpret human languages. NLP is used to filter our inboxes, correct our spelling, and power our Internet searches. While I was writing this, my son asked me where the world’s deepest lake was. The NLP in Siri came to my rescue to answer his question: Siberia, home to Lake Baikal, which is both the deepest freshwater lake in the world and the largest by volume.
In 2021, specialized security NLP solutions will arrive to help process text data more effectively. Security software companies are embedding NLP to assist in the collection, classification, extraction, and generation of text. The result will be substantial time savings in incident management, faster response times, more accurate threat detection, and better-quality data.
NLP streamlines the collection of data by converting streams of information into machine-readable text. Examples of NLP-based applications for data collection include optical character recognition, voice-to-text conversion, and language translation. These are mature NLP applications, but uptake within security software solutions has been slow. This is changing as algorithms from the major cloud providers (Amazon, Google, Microsoft) have become easier to connect to modern security management products. Each of the above features is now available within some products, either directly or through integration.
Once we have collected more data, NLP classification algorithms can help us prioritize where to spend time. For example, sentiment analysis can differentiate between emotional states such as anger, happiness, or sadness in a block of text. Sentiment analysis is used in several social media monitoring products to carve through noisy data to identify potential threats.
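To make the idea concrete, here is a minimal sketch of emotion tagging using a hand-made word list. The lexicon, the function names, and the sample sentences are all assumptions for illustration. Commercial monitoring products rely on trained models rather than keyword lists, so treat this as a toy stand-in for the technique, not a description of how any product works.

```python
import re
from collections import Counter

# Hypothetical, hand-made emotion lexicon (an assumption for illustration).
EMOTION_LEXICON = {
    "anger": {"furious", "hate", "angry", "outraged", "revenge"},
    "happiness": {"happy", "glad", "great", "love", "thanks"},
    "sadness": {"sad", "sorry", "miss", "lonely", "grief"},
}


def score_emotions(text: str) -> Counter:
    """Count how many lexicon words for each emotion appear in the text."""
    tokens = re.findall(r"[a-z']+", text.lower())
    scores = Counter()
    for emotion, words in EMOTION_LEXICON.items():
        scores[emotion] = sum(1 for token in tokens if token in words)
    return scores


def dominant_emotion(text: str) -> str:
    """Return the emotion with the most hits, or 'neutral' if none match."""
    emotion, hits = score_emotions(text).most_common(1)[0]
    return emotion if hits > 0 else "neutral"


if __name__ == "__main__":
    for post in [
        "I am furious and I hate this place",
        "The shipment arrived on Tuesday",
    ]:
        print(dominant_emotion(post), "<-", post)
```

A post with no lexicon hits falls through to "neutral", which is how a simple filter like this would set aside the bulk of noisy data and leave the emotionally charged posts for review.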
Perhaps more interesting are classification algorithms currently being trained on specific incident data to remove the manual effort involved in incident triage. In the coming year, look for tools that automatically classify incidents to improve escalation and routing. Others that pick out critical attributes on an incident—such as whether a weapon was used or law enforcement was engaged—are also under development.
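As a rough illustration of what training a classifier on incident data can look like, the sketch below fits a small multinomial Naive Bayes model with add-one smoothing on four made-up incident descriptions. The class name, the labels ("escalate" and "routine"), and the training sentences are hypothetical. A real triage model would be trained on thousands of labeled incidents and would likely use a more capable algorithm.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


class NaiveBayesTriage:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, examples):
        self.label_counts = Counter()
        self.word_counts = defaultdict(Counter)
        self.vocab = set()
        for text, label in examples:
            self.label_counts[label] += 1
            for token in tokenize(text):
                self.word_counts[label][token] += 1
                self.vocab.add(token)
        return self

    def predict(self, text):
        total_docs = sum(self.label_counts.values())
        best_label, best_score = None, -math.inf
        for label, doc_count in self.label_counts.items():
            # Log prior plus smoothed log likelihood of each known token.
            score = math.log(doc_count / total_docs)
            denom = sum(self.word_counts[label].values()) + len(self.vocab)
            for token in tokenize(text):
                if token in self.vocab:  # ignore words never seen in training
                    score += math.log((self.word_counts[label][token] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Made-up training data (an assumption for illustration).
TRAINING = [
    ("suspect displayed a knife and threatened staff", "escalate"),
    ("armed robbery at the front entrance police called", "escalate"),
    ("visitor lost an access badge in the lobby", "routine"),
    ("door alarm triggered by a propped open door", "routine"),
]

model = NaiveBayesTriage().fit(TRAINING)

if __name__ == "__main__":
    print(model.predict("man with a knife threatened the guard"))
    print(model.predict("badge lost in the lobby"))
```

The model learns which words tend to appear with each label, so new reports can be routed without someone reading every one first. With only four examples it is easy to fool, which is why the quantity and quality of labeled incident history matters so much for these tools.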
Narrative reports and intelligence briefs contain important data related to a threat or incident. Today, security teams manually review this raw text and record involved people, locations, and organizations during a triage stage. Named Entity Recognition (NER) algorithms can simplify this process. NER algorithms, as their name implies, recognize named entities, allowing us to extract them programmatically. This automation not only saves triage time but also dramatically improves data quality.
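The sketch below shows the shape of the output an entity extraction step produces. It is not a trained NER model. It uses a simple gazetteer lookup, meaning a fixed list of known names, as a stand-in, and every name, label, and sentence in it is invented for illustration. A real NER algorithm would also recognize entities it has never seen before, which a lookup list cannot do.

```python
import re

# Hypothetical gazetteer of known entities (an assumption for illustration).
GAZETTEER = {
    "PERSON": ["Jane Doe", "John Smith"],
    "LOCATION": ["Toronto", "Warehouse 7"],
    "ORGANIZATION": ["Acme Logistics"],
}


def extract_entities(text):
    """Return (entity, label, start_offset) tuples in order of appearance."""
    found = []
    for label, names in GAZETTEER.items():
        for name in names:
            # Match the name as a whole phrase, not inside a longer word.
            pattern = r"\b" + re.escape(name) + r"\b"
            for match in re.finditer(pattern, text):
                found.append((match.group(), label, match.start()))
    return sorted(found, key=lambda item: item[2])


report = (
    "Jane Doe reported that John Smith of Acme Logistics "
    "entered Warehouse 7 in Toronto."
)
entities = extract_entities(report)

if __name__ == "__main__":
    for entity, label, start in entities:
        print(f"{label:<13} {entity} (offset {start})")
```

Each extracted tuple can be written straight into the structured fields of an incident record, which is the step that teams perform by hand today.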
The effort to tag involved entities is time-consuming, so most teams only capture the most critical ones. Other involved entities are left buried in reports and narratives where they are harder to visualize. Neglected data causes us to miss connections that could help us find the root causes and prevent future incidents.
One of the most visible projects in the field of NLP is Open AI’s GPT-3. This is a general AI model that takes an input set and produces narrative text afterwards. The most prevalent use of text generation today is autocomplete in your inbox. At present, the graphical user interface for this is clunky but the consistency of suggestions is uncanny.
The above paragraph was written by an AI algorithm that summarized an extended description that I fed it. It isn’t perfect, but it generated a credible summary far faster than I could have. Text generation is a fast-developing subfield within NLP with near-constant innovation. There aren’t any productized security applications using text generation yet, but this is likely to change quickly. A likely first application will use an algorithm like the one I used above to condense incident and threat data into a clear summary.
Accurate data is the holy grail for security leaders. But getting their teams to capture it in a complete and consistent format is a constant frustration. NLP offers a tantalizing potential to automate many of the most painful data processing steps with algorithms that will be far more efficient and accurate.
NLP offers us a free lunch: we get better data, and we save time that we can reallocate to analyzing that data and delivering more value to our businesses. NLP solutions are rare in security products today, but that will change as the technology becomes more mainstream in 2021.
Will Anderson is CEO of Resolver Inc.