Significant disparities found in symptom assessment apps

Digital health researchers have found significant variance in coverage, accuracy and safety among the eight most popular online symptom assessment apps, raising questions about how ready some of these technologies are for use in clinical settings.

Chloe Kent December 16, 2020

Symptom assessment apps have huge potential to reduce the burden on strained healthcare systems and improve outcomes. Credit: Ada Health

The study, published in BMJ Open, compared the results of Ada Health, Babylon, Buoy, K Health, Mediktor, Symptomate, WebMD and Your.MD.

Ada Health chief medical officer Dr Claire Novorol said: “Symptom assessment apps have seen rapid uptake by users in recent years as they are easy to use, convenient and can provide invaluable guidance and peace of mind. When used in a clinical setting to support – rather than replace – doctors, they also have huge potential to reduce the burden on strained healthcare systems and improve outcomes.”

Researchers assessed each app using 200 clinical vignettes – fictional patient cases based partly on real-world examples – and used a panel of human general practitioners (GPs) to benchmark performance.

The vignettes were generated using transcripts from the UK’s NHS 111 non-emergency telephone service and through cases the research team had seen themselves. They were then reviewed by an external panel of primary care practitioners to ensure quality and to assess diagnosis and urgency.

The vignettes were then entered into each of the apps by eight external GPs playing the role of patient. Each app was tested once against every vignette. Seven external GPs were also tested on the vignettes, providing preliminary diagnoses after telephone consultations.

The study found that only a handful of apps came close to the performance of human GPs.

The study looked at how comprehensively the apps covered all possible conditions and user types – a tool with poor coverage may exclude users who are too young, too old or pregnant, for example. Human GPs provided 100% coverage.

The most comprehensive app was Ada, which provided a condition suggestion 99% of the time, followed by WebMD at 93% and Buoy at 88.5%. The lowest scorers were Babylon, which was able to provide a condition suggestion only 51.5% of the time, followed by Symptomate at 61.5% and Your.MD at 64.5%.

The accuracy of each symptom assessment was also tested by comparing the conditions suggested with what a panel of doctors deemed to be the ‘gold standard’ response for each case. This metric also varied widely between apps, against a benchmark of 82.1% accuracy for the GPs.

While Ada was rated as the most accurate, with the right condition in its top three suggestions 71% of the time, the other apps fell far below this. The next most accurate was Buoy at 43%, while the lowest scorer, Symptomate, managed only 27.5%.

The study also assessed the safety of the apps by examining whether the advice they gave to users carried the appropriate level of urgency. The apps gave safe advice most of the time, all scoring above 80%, but only Ada, Babylon and Symptomate came close to the 97% safety rating of the human GPs.

Brown University associate professor of Medical Science Dr Hamish Fraser said: “These results should help to determine which apps are ready for clinical testing in observational studies and then randomised controlled trials. The study design could form a model for future evaluations of symptom checker apps, and as part of assessment for regulatory approval.”
