Researchers develop a machine-learning method that finds patterns of health decline in ALS, informing future clinical trial designs and mechanism discovery.
Alzheimer, a type of neuro-degenerative disease is a chronic ailment that can present with a variety of symptoms, worsen at different rates, and have many underlying genetic and environmental causes, some of which are unknown. ALS, or Lou Gehrig’s disease, and Parkinson’s are some similar degenerative diseases.
A new machine-learning method developed by researchers from MIT, IBM Research, and elsewhere aims to better characterize ALS disease progression patterns to inform clinical trial design. Their technique, indeed, identified discrete and robust clinical patterns in ALS progression, many of which are non-linear. Further, these disease progression subtypes were consistent across patient populations and disease metrics. The team additionally found that their method can be applied to Alzheimer’s and Parkinson’s diseases as well.
They designed an unsupervised machine-learning model that employed two methods: Gaussian process regression and Dirichlet process clustering. These inferred the health trajectories directly from patient data and automatically grouped similar trajectories together without prescribing the number of clusters or the shape of the curves, forming ALS progression “subtypes.”
Primarily, they applied the model to five longitudinal datasets from ALS clinical trials and observational studies. four dominant patterns of disease popped out of the many trajectories — sigmoidal fast progression, stable slow progression, unstable slow progression, and unstable moderate progression — many with strong nonlinear characteristics. Notably, it captured trajectories where patients experienced a sudden loss of ability, called a functional cliff, which would significantly impact treatments, enrollment in clinical trials, and quality of life.
The new work outperformed other commonly used linear and nonlinear approaches in the field. Impressively, when data were withheld, the model was able to interpolate missing values, and, critically, could forecast future health measures. The model could also be trained on one ALSFRS-R dataset and predict cluster membership in others, making it robust, generalizable, and accurate with scarce data.