Mathematical model finds the cancer mutations that matter
The new tool more accurately distinguishes mutations that drive cancer from ones that don’t, and could help focus future research and drug development.
By Stephanie McPherson
Credit: Susanna Hamilton
Researchers have generated a long list of genetic mutations linked to cancer, but sorting out which ones really drive tumors to grow uncontrollably and which ones don’t has been a challenge. A new mathematical model developed by researchers at the Broad Institute of MIT and Harvard and Massachusetts General Hospital (MGH) could help with this task, by accurately picking out the “driver” mutations from the less important “passenger” ones. Doing this more accurately could help drug developers focus their work on the true drivers of cancer.
Previous models identified many driver mutations, but they largely lacked the ability to dig down into the genome at finer scales, and so often misidentified passenger mutations as drivers. To reduce this false positive rate, the researchers built a more precise model using sequencing data from more patients than previous efforts had. They also accounted for differences in the overall mutation rate across the genome.
“We are building the most accurate-to-date model of the background rate of mutations in cancer,” said Gad Getz, director of the cancer genome computational analysis group at the Broad Institute, the Paul C. Zamecnik Chair in Oncology at the MGH Cancer Center, and co-senior author of the study in Cancer Cell. “And we are showing that our list of driver mutations is clean from the false positives that previous methods had.”
Cancer mutations tend to recur in the exact same spots in patients’ genomes. The model examined these locations, known as “hotspots.” Some hotspots are in famous cancer driver genes such as RAS and TP53, and their cancer-promoting effects have been verified in the laboratory. However, the authors showed that many other hotspots are actually “passenger hotspots” — specific genomic positions at which mutations frequently occur because these regions are easily mutated.
“The most important thing to come out of the paper is a warning to researchers that just because we see, say, 10 cancer patients who all have the same base pair mutated, that does not mean that’s a driver mutation hotspot,” says Michael Lawrence, co-senior author of the paper, an investigator at MGH, and a group leader in computational biology at the Broad Institute. “It doesn’t mean that it’s necessarily a mutation that would be worthwhile to spend money and time following up in the lab.”
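A back-of-the-envelope calculation shows why recurrence alone can mislead. The sketch below is not the study's model; it is a plain binomial calculation, and the cohort size, recurrence threshold, and both mutation rates are hypothetical values chosen only for illustration. It asks how likely it is that at least 10 of 10,000 patients share a passenger mutation at one position, first for an ordinary position and then for one assumed to be 1,000 times more mutable.

```python
import math


def log_binom_pmf(i, n, p):
    """Log of P(X = i) for X ~ Binomial(n, p), computed in log space to avoid overflow."""
    log_choose = math.lgamma(n + 1) - math.lgamma(i + 1) - math.lgamma(n - i + 1)
    return log_choose + i * math.log(p) + (n - i) * math.log1p(-p)


def prob_at_least(k, n, p):
    """P(X >= k): chance that at least k of n patients share a mutation at one position."""
    return sum(math.exp(log_binom_pmf(i, n, p)) for i in range(k, n + 1))


# Hypothetical numbers for illustration only; none come from the study.
N_PATIENTS = 10_000   # assumed cohort size
RECURRENCE = 10       # patients sharing the same mutated base pair
TYPICAL_RATE = 1e-6   # assumed per-patient passenger rate at an ordinary position
MUTABLE_RATE = 1e-3   # assumed per-patient passenger rate at an easily mutated position

p_typical = prob_at_least(RECURRENCE, N_PATIENTS, TYPICAL_RATE)
p_mutable = prob_at_least(RECURRENCE, N_PATIENTS, MUTABLE_RATE)

print(f"ordinary position: {p_typical:.3g}")
print(f"mutable position:  {p_mutable:.3g}")
```

Under these assumed rates, ten shared mutations at an ordinary position would be vanishingly unlikely by chance, while at the easily mutated position the same recurrence is about as likely as not. A background model that treats every position alike would flag both as significant, which is the kind of false positive the researchers set out to remove.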
Needle in a haystack
The research team began working on this new model a few years ago, after reviewing some surprising data generated by another model.
“As datasets got larger and larger, we found implausibly high numbers of these significant hotspot mutations occurring at the exact same genomic position across multiple patients,” says Julian Hess, first author of the paper and a member of the Getz lab. It seemed unlikely that every mutation in all these hotspots was a driver. The researchers began to question the model they were using.
They used genetic data from approximately 10,000 cancer patients to build their new model, and generated a list of driver mutations with a 97 percent reduction in the false positive rate relative to other models.
The study also raised new questions, such as why there are so many passenger hotspots.
"The model's results indicate that there may be as-yet undiscovered genomic features that predispose certain base pairs to be much more intrinsically mutable than others," says Hess. This will be an area of exploration going forward.
The researchers say they may have to refine the model further as datasets grow, to make sure new passenger mutations aren’t being falsely named as drivers. But for now, they say their model is ready for other researchers to use as a better starting point than had been available before.
“The hotspots identified by the new model definitely can be followed up in future research with more confidence that these mutations are actually important in cancer,” says Getz.
This work was supported in part by National Cancer Institute GDAC grants (U24CA143845, U24CA210999).
Paper(s) cited
Hess, JM, Bernards, A, Kim, J, et al. Cancer Cell. Online September 16, 2019. DOI: 10.1016/j.ccell.2019.08.002