A dataset of Solicited Cough Sound for Tuberculosis Triage Testing

Summary

The article presents a dataset of 733,756 cough sounds from 2,143 patients across 7 countries, accompanied by accurate demographic, clinical, and microbiological diagnostic annotations. It was collected using an early version of Hyfe’s research application, specifically designed to capture cough audio.

Cough is a common and commonly ignored symptom of lung disease. It is often perceived as difficult to quantify, frequently self-limiting, and non-specific. However, cough has a central role in the clinical detection of many lung diseases, including tuberculosis (TB), which remains the leading infectious disease killer worldwide. TB screening currently relies on self-reported cough, which fails to meet the World Health Organization (WHO) accuracy targets for a TB triage test. Artificial intelligence (AI) models based on cough sound have been developed for several respiratory conditions, but only limited work has been done in TB.

To support the development of an accurate, point-of-care cough-based triage tool for TB, we have compiled a large multi-country database of cough sounds from individuals being evaluated for TB. The dataset includes more than 700,000 cough sounds from 2,143 individuals, with detailed demographic, clinical, and microbiological diagnostic information. We aim to empower researchers developing cough sound analysis models to improve TB diagnosis, where innovative approaches are critically needed to end this long-standing pandemic.
