This AI Just Beat Human Doctors On A Clinical Exam

By Parmy Olson for Forbes

The lights were dimmed in an auditorium packed with doctors on Wednesday night at London’s Royal College of Physicians. They were there to find out how AI might fundamentally change the way they work.

On stage Dr. Mobasher Butt, a director at digital healthcare startup Babylon Health, stood before a podium to read out the results of an exam taken by his company’s carefully trained AI doctor.

The average pass mark for the MRCGP exam, which trainee general practitioners take to test their ability to diagnose, has been 72% over the past five years.

“How did Babylon Health do?” he asked, before waiting a beat. “It got 82%.” Several people clapped loudly, sparking applause from the rest of the audience. The AI had beaten human doctors handily.

Babylon has been developing its own artificial intelligence software for the last two years, an ambitious endeavor since hiring the engineers needed to build your own AI is eye-wateringly expensive and the return on investment can be hit and miss. AI models can become brittle and fail if they’re subjected to a confusing array of environmental factors.

But for now, the artificial doctor’s brain built by Babylon, which has raised $85 million since its founding five years ago, works when it’s put in a test environment. And while regulations limit it to providing medical advice only, it may be only a matter of time before it’s trusted to make a diagnosis and even write a prescription.

The startup’s charismatic founder, Ali Parsa, has called it a world first and a major step towards his ambitious goal of putting accessible healthcare in the hands of everyone on the planet. “Five billion people globally have no access to surgery,” he said.

Without adequate primary care either, “a $10 problem becomes a $100 solution.” But catch it early with a robo-doctor like Babylon’s, and Parsa believes he can stop an illness from becoming an expensive problem for a state provider like the National Health Service or an insurer.

In some cases that means using Babylon’s intelligent chatbot to prevent an unnecessary consultation by “reassuring” people that they don’t need to see a live doctor, Parsa said in a separate interview with Forbes at Babylon’s headquarters.

He points out, a little ambitiously, that Babylon’s end-to-end clinical service, which blends medical advice from humans and software, can deliver a diagnosis more cheaply over time, given that two-thirds of healthcare costs come from people’s salaries.

Babylon sells access to its network of 250 work-from-home doctors, whom people can video-call on their mobiles, directly to consumers or providers. It also sells access to medical-advice software that people can use to investigate an ailment.

It is the latter feature that Parsa has spent the last two years heavily investing in, so that his human doctors are freed up from note-taking and diagnosing common illnesses to look after more complicated problems. “You don’t need to see a doctor for a diagnosis,” Parsa told Forbes. “What you want is a treatment.”

In the demonstration on Wednesday night, a large screen above Parsa showed an animated, 3-D web of symptoms and diseases, as the voice of a woman resounded through the auditorium, answering automated questions from a chatbot about her recent dizzy spells.

As she answered the questions, a table showed the software constantly readjusting her likely ailments, before settling on an 80% probability that she had Ménière’s disease.

At the heart of this chatbot was Babylon’s diagnostic engine, which its engineers are constantly training with data from its interactions with humans (and which has now passed the doctor’s exam).

The screen then filled with text and graphics, the interface that Babylon’s doctors see when they video-call with patients. An image of the woman calling in was covered in a digital web of lines: a facial tracking system that told the doctor if she was feeling confused, worried or neutral, based on the movements of 117 muscles in her nose, lips or eyebrows.

As they spoke, another box on the side of the screen was transcribing their conversation and categorizing it into sections.

In another box was a graphical, translucent illustration of the woman’s body, highlighting her organs and muscles. This was her “digital twin,” and over time, Parsa said, Babylon’s software would be able to make predictions about which parts of the body were most at risk of illness or disease, by running a vast number of simulations on the twin.

“I’m going to agree with our AI assessment,” said the live, human doctor who was featured in another box on the screen, after he asked the woman a few more questions. “It looks like you’ve got Ménière’s disease. I’d like to prescribe something called prochlorperazine. I’ll send that prescription to your usual pharmacy.”

As the demo ended, Parsa walked back on stage and asked doctors in the audience how much time they spent writing doctor’s notes. “About 50% of my time,” answered Megan Mahoney, who was chief of general primary care and population health at Stanford University, and who had overseen Babylon’s test.

Such is Parsa’s pitch to healthcare providers: Use his service and doctors can spend their time more efficiently. Over time, you won’t need to hire quite so many of them.

Parsa’s most important customer till now has been Britain’s state-run NHS, which since last year has allowed 26,000 citizens in London to switch from its physical GP clinics to Babylon’s service instead.

Another 20,000 are on a waiting list to join. The NHS pays Babylon an average of $80 per patient each year, which is on par with what it pays a typical doctor’s clinic; older people and those with chronic illnesses cost in the range of $2-$300 while younger, healthier people can cost as little as $30.

Now Parsa is bringing his software service and virtual doctor network to insurers in the U.S. His pitch is that the smarter and more “reassuring” his AI-powered chatbot gets, the more likely patients across the Atlantic are to resolve their issues with software alone.

It’s a model that could save providers millions, potentially, but Parsa has yet to secure a big-name American customer.

“The American market is much more tuned to the economics of healthcare,” he said from his office. “We’re talking to everyone: insurers, employers, health systems. They have massive gaps in delivery of the care.”

“We will set up physical and virtual clinics, and AI services in the United States,” he said, adding that Babylon would be operational with U.S. clinics in 2019, starting state by state. “For a fixed fee, we take total responsibility for the cost of primary care.”

Parsa isn’t shy about his transatlantic ambitions: “I think the U.S. will be our biggest market shortly,” he adds.

If he can eventually quantify the cost savings that his automated-care model can bring to a provider like the NHS, he might have a chance.
