New computer model helps identify cancer-causing mutations

Scientists have built a computer model capable of rapidly scanning the entire genome of cancer cells and identifying mutations that occur more frequently than expected, suggesting that they are driving tumor growth.

The findings come from a new Massachusetts Institute of Technology study that was funded, in part, by the National Institutes of Health and the National Cancer Institute.

Distinguishing harmful driver mutations, which push a tumor to grow, from neutral passenger mutations could help researchers identify better drug targets. To boost these efforts, an MIT-led team built the new model to estimate how often mutations should be expected in each part of the genome. This type of prediction has been difficult because some genomic regions have an extremely high frequency of passenger mutations, drowning out the signal of real drivers.

“We created a probabilistic deep learning method that allowed us to get a really accurate model of how many passenger mutations should exist anywhere in the genome,” says MIT graduate student Maxwell Sherman. “Then we can search throughout the genome for regions where you have an unexpected accumulation of mutations, suggesting that these are driver mutations.”

In their new study, the researchers found additional mutations in the genome that appear to contribute to tumor growth in 5-10% of cancer patients. The findings could help doctors identify which drugs are more likely to successfully treat these patients, the researchers say. Currently, at least 30% of cancer patients have no detectable driver mutation that can be used to guide treatment.

Sherman, MIT graduate student Adam Yaari, and former MIT research assistant Oliver Priebe are the lead authors of the study, which appears today in Nature Biotechnology. Bonnie Berger, the Simons Professor of Mathematics at MIT and head of the Computation and Biology group at the Computer Science and Artificial Intelligence Laboratory (CSAIL), is a senior author of the study, along with Po-Ru Loh, an assistant professor at Harvard Medical School and associate member of the Broad Institute of MIT and Harvard. Felix Dietlein, an associate professor at Harvard Medical School and Boston Children’s Hospital, is also an author of the paper.

A new tool

Since the human genome was sequenced two decades ago, researchers have scoured it for mutations that contribute to cancer by causing cells to grow out of control or evade the immune system. This search has yielded targets such as epidermal growth factor receptor (EGFR), which is commonly mutated in lung tumors, and BRAF, a common driver of melanoma. Both of these mutations can now be targeted by specific drugs.

Although these targets have proven useful, protein-coding genes make up only about 2% of the genome. The remaining 98% also contains mutations that can occur in cancer cells, but it has been much more difficult to determine whether any of these mutations contribute to the development of cancer.

“There really has been a lack of computational tools that allow us to look for these driver mutations outside of protein-coding regions,” Berger says. “That’s what we were trying to do here: design a computational method to allow us to look at not just the 2% of the genome that codes for proteins, but 100% of it.”

To do this, the researchers trained a type of computer model known as a deep neural network to search cancer genomes for mutations that occur more frequently than expected. They first trained the model on genomic data from 37 different cancer types, which allowed the model to determine background mutation rates for each of these types.

“The really good thing about our model is that you train it once for a given type of cancer, and it learns the mutation rate all over the genome simultaneously for that particular type of cancer,” Sherman says. “Then you can query the mutations you see in a patient cohort versus how many mutations you should expect to see.”

The data used to train the models come from the Roadmap Epigenomics Project and an international collection of data called Pan-Cancer Analysis of Whole Genomes (PCAWG). The model's analysis of these data gave the researchers a map of the expected passenger mutation rate across the genome, so that the expected number of mutations in any set of regions (down to a single base pair) can be compared with the number of mutations actually observed there.
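The comparison described above, expected versus observed mutation counts per region, can be illustrated with a small sketch. This is not the researchers' method: it assumes, purely for illustration, that passenger mutations in a region follow a simple Poisson distribution around the expected count, and it applies a Bonferroni correction across regions. The region names and counts are made up.

```python
import math


def poisson_upper_tail(observed: int, expected: float) -> float:
    """Return P(X >= observed) for X ~ Poisson(expected).

    Assumption: a plain Poisson model stands in for the study's
    probabilistic deep learning model of passenger mutation rates.
    """
    if observed <= 0:
        return 1.0
    # Accumulate P(X = k) for k = 0 .. observed - 1, then take the complement.
    term = math.exp(-expected)  # P(X = 0)
    cdf = term
    for k in range(1, observed):
        term *= expected / k  # P(X = k) from P(X = k - 1)
        cdf += term
    return max(0.0, 1.0 - cdf)


def flag_regions(regions, alpha=0.05):
    """Return names of regions with significantly more mutations than expected.

    `regions` is a list of (name, expected_count, observed_count) tuples.
    The Bonferroni threshold (alpha / number of regions) is an assumption.
    """
    if not regions:
        return []
    threshold = alpha / len(regions)
    return [
        name
        for name, expected, observed in regions
        if poisson_upper_tail(observed, expected) < threshold
    ]


# Hypothetical cohort-level counts: (region, expected passengers, observed).
example = [
    ("region_A", 3.0, 4),
    ("region_B", 3.0, 15),
    ("region_C", 40.0, 45),
]
flagged = flag_regions(example)
print(flagged)
```

In this toy input, region_C has the most mutations overall but is not flagged, because its expected passenger count is also high. Only region_B, where the observed count far exceeds the expectation, stands out. This is why an accurate background rate matters in regions with many passenger mutations.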

Changing the landscape

Using this model, the MIT team was able to add to the known landscape of mutations that can drive cancer. Currently, when cancer patients' tumors are screened for cancer-causing mutations, a known driver turns up about two-thirds of the time. The new results from the MIT study suggest possible driver mutations for an additional 5-10% of the patient group.

One type of non-coding mutation that the researchers focused on is the cryptic splice mutation. Most genes are made up of sequences of exons, which encode building instructions for proteins, and introns, which are spacer elements that are usually removed from messenger RNA before it is translated into protein. Cryptic splice mutations are found in introns, where they can confuse the cellular machinery that splices the introns out. This results in introns being included when they should not be.

Using their model, the researchers found that many cryptic splice mutations appear to disrupt tumor suppressor genes. When these mutations are present, the tumor suppressors are mis-spliced and stop working, and the cell loses one of its defenses against cancer. The number of cryptic splice sites discovered by the researchers in this study represents approximately 5% of the driver mutations found in tumor suppressor genes.

Targeting these mutations could offer a new way to treat these patients, the researchers say. One possible approach that is still in development uses short strands of RNA called antisense oligonucleotides (ASOs) to patch over a piece of mutated DNA with the correct sequence.

“If you could make the mutation go away somehow, then you would solve the problem. These tumor suppressor genes could continue to work and possibly fight cancer,” Yaari says. “ASO technology is being actively developed, and it could be a very good application.”

Another region where the researchers found a high concentration of non-coding driver mutations is in the untranslated regions of certain tumor suppressor genes. The tumor suppressor gene TP53, which is defective in many types of cancer, was already known to accumulate many deletions in these sequences, called 5′ untranslated regions. The MIT team found the same pattern in a tumor suppressor called ELF3.

The researchers also used their model to investigate whether common mutations that were already known might also be driving other types of cancer. For example, they found that BRAF, previously linked to melanoma, also contributes to cancer progression in smaller percentages of other cancers, including pancreatic, liver, and gastroesophageal cancers.

“That means there’s actually a lot of overlap between the landscape of common drivers and the landscape of rare drivers. This provides an opportunity for therapeutic repurposing,” Sherman says. “These results could help guide the clinical trials we should put in place to expand these drugs from approval in just one cancer to approval in multiple cancers, and be able to help more patients.”
