Researchers at MIT and Dana-Farber Cancer Institute have developed a machine learning (ML) model that analyzes around 400 genes to predict where in the body a tumour originated, making it easier to identify the origins of enigmatic cancers.
For a small percentage of cancer patients, doctors cannot determine where the cancer originated. These cancers of unknown primary (CUP) still affect a significant number of people every year. Because many cancer drugs are developed for specific cancer types and most therapies require knowledge of the primary site, patients with CUP have few treatment options.
The model uses the sequences of around 400 genes to predict a tumour’s origin. In a dataset of around 900 patients, it could accurately classify at least 40% of tumours of unknown origin with high confidence.
Uncovering Cancer’s Concealed Origins
The researchers analyzed genetic data routinely collected at Dana-Farber, comprising the sequences of around 400 genes that are often mutated in cancer, to see whether it could be used to predict cancer type. They trained an ML model on data from nearly 30,000 patients diagnosed with one of 22 known cancer types, drawn from Memorial Sloan Kettering Cancer Center, Vanderbilt-Ingram Cancer Center, and Dana-Farber.
The researchers tested the resulting model, OncoNPC, on roughly 7,000 tumours that it had not seen during training but whose origin was known. It predicted their origins with about 80% accuracy. For high-confidence predictions, which made up about 65% of the total, accuracy rose to around 95%. They also compared OncoNPC’s predictions with an analysis of inherited mutations in certain tumours, which can reveal genetic predispositions to specific cancer types.
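The link between overall accuracy and high-confidence accuracy can be illustrated with a small sketch. It assumes a classifier that outputs a probability for each cancer type and treats a prediction as "high confidence" when its top probability reaches a cut-off. The 0.9 threshold, the function names, and the toy tumours below are illustrative assumptions, not details taken from the study.

```python
from typing import Dict, List, Tuple


def top_prediction(probs: Dict[str, float]) -> Tuple[str, float]:
    """Return the most probable cancer type and its probability."""
    label = max(probs, key=lambda name: probs[name])
    return label, probs[label]


def evaluate(
    predictions: List[Dict[str, float]],
    truths: List[str],
    threshold: float = 0.9,  # assumed cut-off for "high confidence"
) -> Dict[str, float]:
    """Compute overall accuracy and accuracy on the high-confidence subset."""
    total = len(truths)
    correct = confident = confident_correct = 0
    for probs, truth in zip(predictions, truths):
        label, confidence = top_prediction(probs)
        hit = int(label == truth)
        correct += hit
        if confidence >= threshold:
            confident += 1
            confident_correct += hit
    return {
        "accuracy": correct / total,
        "high_confidence_fraction": confident / total,
        "high_confidence_accuracy": (
            confident_correct / confident if confident else float("nan")
        ),
    }


# Made-up probabilities for four held-out tumours with known origins.
toy_predictions = [
    {"lung": 0.95, "breast": 0.03, "colorectal": 0.02},
    {"lung": 0.05, "breast": 0.92, "colorectal": 0.03},
    {"lung": 0.40, "breast": 0.05, "colorectal": 0.55},
    {"lung": 0.30, "breast": 0.60, "colorectal": 0.10},
]
toy_truths = ["lung", "breast", "lung", "breast"]

result = evaluate(toy_predictions, toy_truths)
print(result)
```

On this toy data, restricting attention to confident predictions raises accuracy at the cost of covering fewer tumours, which is the same trade-off the reported figures describe.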
Guiding Treatment Decisions
The researchers compared CUP patients’ survival times with the typical prognosis of the cancer type the model predicted for them. Patients predicted to have poor-prognosis cancers had shorter survival times, while those predicted to have better-prognosis cancers survived longer. About 10% of patients had received a targeted treatment; among them, those whose treatment was consistent with the model’s prediction fared better than those whose treatment was not.
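The comparison between consistent and inconsistent treatments amounts to splitting patients into two groups and comparing their outcomes. The following is a minimal sketch of that grouping step only: the record fields, the patient values, and the use of a plain median are all assumptions made for illustration. A real survival analysis would also have to account for patients who are still alive at last follow-up, which this sketch ignores.

```python
from dataclasses import dataclass
from statistics import median
from typing import Dict, List


@dataclass
class Patient:
    predicted_type: str      # cancer type predicted by the model (hypothetical field)
    treated_as: str          # cancer type the given treatment targets (hypothetical field)
    survival_months: float   # observed survival time (hypothetical field)


def split_by_concordance(patients: List[Patient]) -> Dict[str, List[float]]:
    """Group survival times by whether treatment matched the predicted type."""
    groups: Dict[str, List[float]] = {"concordant": [], "discordant": []}
    for p in patients:
        key = "concordant" if p.treated_as == p.predicted_type else "discordant"
        groups[key].append(p.survival_months)
    return groups


# Made-up records, not data from the study.
toy_patients = [
    Patient("lung", "lung", 20.0),
    Patient("breast", "breast", 30.0),
    Patient("lung", "colorectal", 8.0),
    Patient("colorectal", "breast", 12.0),
]

groups = split_by_concordance(toy_patients)
medians = {name: median(times) for name, times in groups.items()}
print(medians)
```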
The team also found that an additional 15% of patients could have received a targeted treatment had their cancer type been known; instead, they received general chemotherapy. The researchers now aim to expand the model to include other data, such as pathology and radiology images, for more comprehensive predictions, potentially even identifying the optimal treatment.