How NLP Sounded the Global Alarm for COVID-19

Photo credit: iStockphoto/HT-Pix

At first glance, the COVID-19 epidemic and technologies such as natural language processing (NLP) would appear to have little in common.

The reality is different. NLP, a branch of AI that powers chat programs, helped detect the outbreak in its initial stages. The same technology is now driving monitoring efforts that, it is hoped, will help minimize the spread of the virus.

NLP breaks down the virus chatter

There are at least two accounts of the early detection of the virus in Wuhan, China, and both involve companies deploying NLP techniques.

In Boston, U.S., it is claimed that the first public alert outside of China came from the automated HealthMap system at the Boston Children’s Hospital.

Using NLP to scan online news and social media reports, HealthMap sent out an alert about unidentified pneumonia cases in Wuhan at 11:12 p.m. Boston time on December 30, 2019.

HealthMap uses a software mapping program from Mapbox Inc to analyze the data in the repository.

The HealthMap system, which ranks alerts on a scale of 1 to 5, ranked the Wuhan alert as only a 3. But the more researchers looked into it, the more they understood how serious it was.

Still in North America, a Canadian startup called BlueDot was also sounding the alarm, sending out bulletins to its customers on Dec. 31, 2019, a few days before the authorities made their announcements.

On the same date, in Toronto, Canada, BlueDot found a Mandarin-language article that said 27 people were sick with pneumonia in Wuhan. The two key expressions were “pneumonia” and “unknown case.”

BlueDot describes its business as “automated infectious diseases surveillance.” It uses NLP and an algorithm to digest and process thousands of news reports, along with air traffic information, in 65 languages. Every 15 minutes, the algorithm reviews the data for keywords and phrases.
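The keyword review described above can be pictured with a minimal Python sketch. It is an illustration only, not BlueDot's actual system: the `Article` type, the `WATCHLIST` contents, the `min_hits` threshold and the function names are all assumptions made for this example. The article names only the two English expressions; the Chinese terms are illustrative additions.

```python
from dataclasses import dataclass

# Hypothetical watchlist. Only "pneumonia" and "unknown case" come from the
# article; the Chinese terms ("pneumonia", "of unknown cause") are
# illustrative additions for this sketch.
WATCHLIST = {
    "en": ["pneumonia", "unknown case"],
    "zh": ["肺炎", "不明原因"],
}


@dataclass
class Article:
    source: str    # name of the news outlet or feed (hypothetical)
    language: str  # language code used to pick the right phrase list
    text: str      # raw article text


def find_matches(article, watchlist=WATCHLIST):
    """Return the watchlist phrases that occur in the article text."""
    phrases = watchlist.get(article.language, [])
    text = article.text.lower()
    return [phrase for phrase in phrases if phrase.lower() in text]


def scan(articles, min_hits=2):
    """Flag articles matching at least `min_hits` phrases for human review."""
    flagged = []
    for article in articles:
        hits = find_matches(article)
        if len(hits) >= min_hits:
            flagged.append((article.source, hits))
    return flagged


if __name__ == "__main__":
    # Invented sample texts, used only to exercise the sketch.
    sample = [
        Article("example-feed-1", "en",
                "Officials report 27 people with pneumonia; "
                "an unknown case cluster is under review."),
        Article("example-feed-2", "en", "Local football team wins the cup."),
        Article("example-feed-3", "zh", "武汉出现不明原因肺炎病例"),
    ]
    for source, hits in scan(sample):
        print(source, hits)
```

In a setup like the one described, a scheduler would call `scan` on newly collected articles every 15 minutes and pass the flagged items to human experts. Plain substring matching is the simplest possible stand-in here; a production system would presumably rely on far richer NLP than this.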

BlueDot founder Kamran Khan, a physician who had the idea for the company after the SARS epidemic in 2002, told the news agency AFP the system was looking for “needles in a haystack.”

“There is an overwhelming amount of data,” he told the news agency.

“The machine looks for needles and presents them to human experts. While we didn’t know it would be a big global outbreak, we noticed some ingredients similar to what we saw during the SARS epidemic.”

Building on Google’s failure

The HealthMap and BlueDot alerts suggest that the use of data in epidemiology has improved since Google’s ill-fated attempt to predict flu incidents.

Many will remember that back in 2008, Google claimed that it could use data from internet searches to predict flu trends.

The idea was that if more people googled for information on the flu in particular geographies, then this would enable forecasts of flu activity in the U.S. with a reporting lag of one day.

The idea failed spectacularly when Google missed the 2009 H1N1 pandemic and then the peak of the 2013 flu season, resulting in a rebuke from Wired magazine for “big data hubris.”

Ultimately, this technology is only as good as the data it analyzes, and with Google it was a case of “garbage in, garbage out.”

The HealthMap and BlueDot examples show that combining better NLP, machine learning and an exponential increase in data sources can increase accuracy and improve the algorithm over time. In contrast, the Google initiative rested on a fairly one-dimensional assumption.

Global NLP effort to monitor virus grows

HealthMap and BlueDot are not the only AI companies involved in the COVID-19 outbreak.

In Israel, Cobwebs Technologies is tracking mentions of the virus across the internet, including social media posts and Twitter.

The company claims that its insights were useful during Hurricane Harvey in 2017, when its tools helped identify people in need of urgent help.

With COVID-19, the Cobwebs platform tracks help requests and people in close contact with patients. While the company says this information will help authorities, there are also fears about unfair monitoring of people.

It seems that NLP-powered AI is developing significant abilities in incident detection, which could find further applications in the health sector.

Beyond COVID-19

In the U.S., researchers at Indiana University, the Regenstrief Institute and pharmaceutical giant Merck are using these techniques to examine electronic medical records. They aim to identify patients at risk of developing dementia.

Many people who develop Alzheimer’s disease or dementia are never diagnosed. Yet, the researchers claim that the NLP algorithm could predict the onset within one to three years of diagnosis.

The market for NLP could reach over USD 80 billion by 2026, according to one recent study by Fortune Business Insights. The technology is set to unleash sweeping changes in fields beyond healthcare, from manufacturing to speeding up the Know Your Customer (KYC) onboarding process in financial services.

So, while AI is the big buzzword, it is often NLP that is doing the heavy lifting and delivering many of the practical benefits in areas as diverse as health, customer service and compliance.
