“Acoustic AI” refers to the application of artificial intelligence techniques to analyze sound data, including cough sounds, for purposes that can include speech recognition, music information retrieval and, of course, healthcare. Acoustic AI is the technology that enables the field of Acoustic Epidemiology.
Acoustic AI platforms leverage advanced algorithms and machine learning models to interpret the acoustic features of coughs, such as their inherent structure, their frequency, energy, and patterns over time, to detect and predict respiratory conditions (ref). Essentially, AI models are trained to look for health signals in sounds generated by patients. These sounds can include coughs, snores, wheezes or the tone of voice.
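To make "acoustic features" a little more concrete, here is a minimal sketch of two classic ones: short-time energy (how loud a slice of audio is) and zero-crossing rate (a rough proxy for how much high-frequency content it has). This is only an illustration, not the feature set of any particular platform; the function name `frame_features`, the 25 ms frame length and the synthetic signal are all assumptions made for the example.

```python
import math

def frame_features(samples, sample_rate, frame_ms=25):
    """Return a (rms_energy, zero_crossing_rate) pair for each non-overlapping frame."""
    frame_len = max(1, int(sample_rate * frame_ms / 1000))
    features = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        # Root-mean-square amplitude: a simple measure of energy in the frame.
        rms = math.sqrt(sum(x * x for x in frame) / frame_len)
        # Fraction of adjacent sample pairs whose sign differs.
        crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a < 0) != (b < 0))
        zcr = crossings / (frame_len - 1) if frame_len > 1 else 0.0
        features.append((rms, zcr))
    return features

# Synthetic stand-in for a recording (an assumption for illustration):
# 50 ms of silence followed by 50 ms of a 1 kHz tone, sampled at 8 kHz.
sample_rate = 8000
silence = [0.0] * 400
tone = [math.sin(2 * math.pi * 1000 * n / sample_rate) for n in range(400)]
features = frame_features(silence + tone, sample_rate)

for index, (rms, zcr) in enumerate(features):
    print(f"frame {index}: rms={rms:.3f} zcr={zcr:.3f}")
```

Real systems typically feed richer representations, such as spectrograms, into a trained model, but the idea is the same: turn raw audio into numbers that describe its energy and frequency content over time.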
While imaging AI (the application of AI algorithms to analyze images) has experienced significant growth and broad applicability across healthcare, acoustic AI is emerging as a viable and far more scalable alternative. It has the potential to deliver a genuine shift from traditional, invasive screening, diagnostic and compliance methods towards non-invasive, scalable and patient-friendly digital health solutions.
In this review we will focus on acoustic AI that detects and analyzes cough sounds, as this is a particularly exciting niche in the world of Acoustic AI, with the potential to redefine respiratory health as we know it.
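There are currently two foundational approaches to Acoustic AI applied to health: single-sound analysis, which looks for disease signals in an individual cough, and continuous cough monitoring, which tracks cough patterns and frequency over time.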
Let’s compare the two approaches and explore the transformative potential of integrating artificial intelligence with acoustic data to enhance screening and diagnostic accuracy, patient compliance and adherence, and healthcare accessibility in respiratory disease management.
Single-sound analysis operates on the premise that individual cough sounds carry distinct markers (ref), indicative of specific respiratory conditions. The process involves a patient coughing into a device that captures the sound, which is then analyzed by AI algorithms to detect disease-specific signals.
There are many research teams and companies implementing various versions of this framework, and, while the existing scientific literature as well as the regulatory frameworks have not confirmed their viability, the theoretical potential of this approach is intriguing and highly seductive. It would be very impactful to be able to cough once into a device and have accurate diagnostics delivered on the spot.
The biggest strength of this approach is its ability to drive technical innovation and accelerate the development of powerful acoustic classifiers. Because innovation is iterative, progress in our ability to build and train classifiers is very impactful. Even if disease-specific classifiers have significant limitations (see below), other classifiers can be built and trained using the same approach and add immediate value to healthcare, for example pediatric vs. adult cough classifiers or productive vs. dry cough classifiers.
The single-cough diagnostic approach also mimics a traditional diagnostic flow: a patient activates the process in some way and then delivers the data sample, the cough. This means that few changes to patient behavior or patient flows are needed, and the need to educate patients and providers is reduced.
As seductive as this framework is, in practice there are a series of critical operational, medical and biological challenges that have to be taken very seriously when considering building or engaging with single-cough diagnostic models.
While possible in theory, there is no evidence that specific diseases carry a distinct signal in a person’s cough that is also consistent and uniform across every population segment. Indeed, there are cautionary tales of diagnostic classifiers that seemed to have a high sensitivity for COVID-19 during the recent pandemic but were actually not sensitive to COVID-19 at all. They were simply distinguishing between sick and healthy people, in a time and place where “sick” meant a high likelihood of COVID-19. The literature on the matter is fairly consistent: there is no scientific evidence for uniform and consistent disease-specific signatures in cough.
An additional limitation of single-cough analysis models is that they have to rely on elicited or forced coughs. There may be subtle but significant differences between naturally occurring coughs (coughs that occur because the cough reflex is triggered) and forced coughs (coughs the patient produces without the natural trigger).
In addition to the biological nature of coughs themselves, the accuracy of single-sound analysis is significantly affected by the variability in cough sounds due to age, gender and other demographic factors, which often affect the mechanics of coughing. This variability can lead to inconsistencies in diagnosis, particularly when deployed across diverse populations.
Effective machine learning models require vast, diverse datasets to train on. Single-sound diagnostic models require indication-specific sounds across every possible socio-demographic segment and across all stages of a disease, and those sounds have to be paired with ground-truth diagnostics and medical data. Collecting high-quality data at that scale is practically impossible, which means that all current models on the market are likely trained on insufficient data. As a result, inherent biases will almost certainly skew performance at scale, in the real world. This is one of the reasons that several single-cough diagnostic tools have been rejected by the FDA (example).
Even if the problems above are somehow overcome, scaling a high-performance single-cough model requires a controlled environment: specific hardware and software configurations as well as a highly controlled acoustic environment (like a soundproof booth that would eliminate or reduce the background noise and reverberation in the sample). This significantly limits the scalability of such models and radically limits access to such a tool, increasing costs and reducing the practicality of widespread implementation. In fact, the operational complexity inherent in scaling such a model, and in building vertically integrated supply chains for it, is significantly higher than that of simply using a PCR test or similar for diagnostics, which leverages well-established supply chains and procedures with predictable and well-understood economics.
Regulatory frameworks pose significant hurdles for the scalability of single-cough models. The precision required in diagnostics necessitates indication-specific models and representative studies, complicating the deployment and adoption of single-sound analysis models on a large scale.
Continuous cough monitoring shifts the focus from individual cough sounds to analyzing cough patterns and frequency over time. This approach relies on much simpler classifiers, which only have to distinguish between coughs and non-coughs. Due to their simplicity, longitudinal models are basically small pieces of software that can run on any device and collect data passively, requiring no active participation from the patient while statistical AI models monitor cough dynamics continuously.
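As a rough illustration of what "analyzing cough frequency over time" means in practice, the sketch below takes the output of a hypothetical cough/non-cough classifier (a list of timestamped labels) and rolls it up into coughs per hour. The event format, the label strings and the function name `hourly_cough_counts` are assumptions made for this example, not a description of any vendor's data model.

```python
from collections import Counter
from datetime import datetime

def hourly_cough_counts(events):
    """Count cough detections per clock hour.

    events: iterable of (datetime, label) pairs, where label is the
    classifier's decision for one detected sound ("cough" or anything else).
    """
    counts = Counter()
    for timestamp, label in events:
        if label == "cough":
            # Truncate the timestamp to the start of its hour.
            counts[timestamp.replace(minute=0, second=0, microsecond=0)] += 1
    return dict(sorted(counts.items()))

# Hypothetical detections from one evening.
events = [
    (datetime(2024, 3, 1, 21, 5), "cough"),
    (datetime(2024, 3, 1, 21, 17), "non-cough"),
    (datetime(2024, 3, 1, 21, 48), "cough"),
    (datetime(2024, 3, 1, 22, 2), "cough"),
    (datetime(2024, 3, 1, 23, 30), "non-cough"),
]
counts = hourly_cough_counts(events)

for hour, n in counts.items():
    print(hour.strftime("%Y-%m-%d %H:00"), n)
```

Note that only the count per hour leaves this step; no audio needs to be stored once a sound has been classified, which is part of what keeps this kind of pipeline lightweight.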
By minimizing infrastructure requirements and leveraging existing devices like smartphones, wearables, smart home devices and any other mic-enabled device, continuous monitoring tools offer a scalable, cost-effective solution adaptable to various environments. Due to their simplicity, these models can run on device - using edge computing - meaning that they also do not require access to the internet or any other specialized infrastructure.
Continuous monitoring tools rely on simple binary classifiers that only need to distinguish between cough and non-cough sounds. Companies like Hyfe have successfully solved this problem, building classifiers that are small enough to run on the smallest possible device, yet perform with near-perfect accuracy in the real world. These models are so good because they are trained on billions of data points, all of them collected in the real world, across every possible socio-demographic segment and using every possible combination of hardware, software and acoustic environment.
While the existence of consistent and uniform signals in individual coughs is not a scientifically proven hypothesis, there is significant scientific evidence that cough frequency patterns (the way cough frequency changes in relation to an individual threshold) are consistent with specific indications and diseases. Peer-reviewed literature has shown that cough frequency can predict TB (reference here and here), and there is rigorous evidence showing that cough frequency can predict clinical outcomes in COVID-19 patients.
Continuous cough detection, such as Hyfe’s, enables the identification of even subtle changes in cough patterns, facilitating early detection of respiratory exacerbations (such as acute COPD exacerbation or decompensation in CHF) or the onset of new conditions (such as lung cancer), even before the patient is aware of them and feels the need to seek medical attention.
It also allows for the creation of personalized health baselines, enhancing diagnostic precision and patient-specific care including personalized digital therapeutics.
The passive nature of data collection ensures high patient compliance, broadening the technology's applicability across patient populations, including those with chronic conditions or in remote areas. Continuous monitoring allows for the establishment of rolling, individual baselines, which also allows for individual customization of management regimes and therapies, significantly improving outcomes.
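One simple way to picture a rolling, individual baseline is to compare each day's cough count with that same person's previous week. The sketch below does exactly that and flags days that sit well above the recent norm. It is a deliberately naive illustration: the function name `flag_deviations`, the 7-day window and the 2-standard-deviation threshold are assumptions chosen for the example, not clinically validated values or any product's actual algorithm.

```python
from statistics import mean, stdev

def flag_deviations(daily_counts, window=7, threshold=2.0):
    """Return indices of days whose count exceeds the rolling baseline.

    The baseline for each day is the mean of the previous `window` days;
    a day is flagged when it exceeds baseline + threshold * spread.
    """
    flagged = []
    for day in range(window, len(daily_counts)):
        history = daily_counts[day - window:day]
        baseline = mean(history)
        # Floor the spread at 1 cough so a very stable week does not
        # make tiny fluctuations look significant.
        spread = max(stdev(history), 1.0)
        if daily_counts[day] > baseline + threshold * spread:
            flagged.append(day)
    return flagged

# Hypothetical daily cough counts for one person; day 9 is a spike.
daily = [20, 22, 19, 21, 20, 23, 21, 22, 20, 48, 21]
flagged = flag_deviations(daily)
print(flagged)
```

Because the baseline is computed per person, someone who habitually coughs 20 times a day and someone who coughs 200 times a day are each measured against their own normal. A production system would also need to handle gaps in monitoring and keep spikes from inflating the baseline that follows them.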
Due to their simplicity and lack of infrastructure dependencies, continuous cough monitoring models are pure software products and benefit from the operational economics of software: they scale at very low marginal cost and enjoy software-specific economies of scale. This means that cough detection models such as the ones developed by Hyfe can be rolled out almost overnight.
Relatedly, the framework's flexibility allows for integration into diverse platforms and devices, facilitating widespread adoption within diverse health systems without significant infrastructural investments. This adaptability extends the reach of continuous monitoring into low-resource settings, democratizing access to advanced digital health tools, and underscores its value as a global health tool. By offering an accessible, non-invasive means of early detection and monitoring for respiratory conditions, continuous monitoring can play a pivotal role in global health equity, offering new avenues for disease management and prevention, especially in underserved populations.
Continuous monitoring holds the potential to transform respiratory healthcare through early detection and intervention that happens before symptoms are present or before patients develop the awareness to seek medical attention. Cough monitoring predicts medically significant events and enables preventive measures, thus reducing hospital readmissions and healthcare costs.
The simplicity and flexibility of continuous monitoring models come with some tradeoffs that need to be addressed:
While evidence suggests that subtle changes in cough frequency can be a powerful predictor of disease risks, it should not be expected that cough frequency alone would replace the need for thorough diagnostics. This means that continuous monitoring tools are not good diagnostic replacements. However, they are powerful diagnostics support tools that can help screen for diseases and guide patients and providers through a rigorous diagnostic evaluation faster and earlier.
Healthcare is notoriously conservative. Providers in particular are deeply entrenched in processes and frameworks that just work. Most of these require the patient to initiate contact with healthcare. Passive, continuous models flip this framework, meaning that contact can be initiated by either patient or provider. Additionally, providers need to learn to factor longitudinal datasets into their clinical evaluation, where before they only dealt with static, single-sample data (laboratory results, evaluations, etc.).
However, the growing prevalence of other longitudinal data streams, such as continuous glucose monitoring, heart rate (HR) and heart rate variability (HRV), is accelerating this change.
Both frameworks harness the power of AI in transforming respiratory diagnostics. However, continuous monitoring's approach to data collection and analysis—focusing on patterns over time rather than the acoustic properties of individual coughs—offers a more robust, scalable, and patient-friendly model. Its ability to operate in diverse real-world environments and integrate seamlessly into patients' daily lives without requiring specialized equipment or environments positions it as a superior alternative.
While single-sound analysis brings valuable insights into the potential of acoustic-based diagnostics, its practical limitations in terms of variability, data collection challenges, and scalability concerns highlight significant drawbacks. In contrast, continuous or longitudinal analysis emerges as a more viable, scalable, and patient-centric solution. Its strengths in predictive analytics, personalization of care, and operational flexibility showcase a forward path in respiratory health, aligning with the broader goals of precision medicine and healthcare accessibility.
If you want to dive deeper into the ins and outs of continuous cough monitoring, here is a comprehensive FAQ