In this blog miniseries, we’ve asked Junaid Shabbeer, Clinical Science Director at QIAGEN Bioinformatics, to guide us through the process of analyzing DNA variants for genetic tests. In our last post we walked through what it takes to launch a new genetic test. Today, we look at the steps involved in scoring a variant.
“All labs go about gathering evidence to classify variants in more or less the same manner,” Junaid says, noting that the process of scoring a germline (hereditary disease) variant fundamentally differs from the process of scoring a somatic (tumor) variant. The goal of scoring a germline variant is to get at whether the variant is causal or not causal for a disease, and guidelines from organizations such as ACMG have helped make this process very consistent. The goal of scoring a somatic variant is to get at whether there are targeted treatments that correspond to the variant, as well as any prognostic or diagnostic information pertinent to that variant that could aid in the cancer patient’s outcome.
The process begins, naturally, with DNA sequencing data generated from the test taken by the patient. This data goes to the clinical lab director, who has to decide what information to report. Each lab has a variant interpretation workflow — involving a variant committee or a dedicated variant analyst — drawing upon external data as well as internal data from variants processed by that lab in the past.
Once the list of variants has been sifted out from the test data, the analysis process kicks off with a search of what is already known about each variant. Analysts comb through COSMIC, HGMD, and other public databases (including gene-specific databases) relevant to the particular disease or condition being tested for. They will use modeling tools as well to get a better sense of how each variant might affect gene expression or the structure or stability of the protein.
Publications are very important to variant scoring, so analysts will also query a search engine such as Google Scholar or PubMed to find information about each variant in the literature (both clinical and research papers). While research papers generally cannot be relied upon entirely in the variant scoring process, Junaid says, “they might be useful if the scientists did functional studies, say, in a model system like yeast cells.” The literature search can be very time-consuming for analysts and is not a completely reproducible science, though standard search strings can be used across variants to ensure consistency in the publication search process. “What if you miss that one critical paper?” Junaid says, noting that analysts are acutely aware that their reports might sway a physician and patient to go ahead with a drastic surgery or treatment.
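To make the idea of standard search strings concrete, here is a minimal sketch of how a lab might generate the same set of queries for every variant. The function name, the query templates, and the gene and variant names are all hypothetical, chosen only for illustration; they are not taken from any particular lab's workflow or from a specific search engine's syntax beyond plain AND/OR operators.

```python
def build_search_strings(gene, aliases):
    """Return the same ordered set of query strings for any variant.

    `gene` is a gene symbol and `aliases` is a list of ways the variant
    may be written in papers (cDNA name, protein name, and so on).
    """
    # Drop blanks and duplicates while keeping the original order.
    seen = []
    for alias in aliases:
        alias = alias.strip()
        if alias and alias not in seen:
            seen.append(alias)

    names = " OR ".join(f'"{alias}"' for alias in seen)

    # Hypothetical templates: one broad query, two narrower ones.
    templates = [
        '"{gene}" AND ({names})',
        '"{gene}" AND ({names}) AND (functional OR assay)',
        '"{gene}" AND ({names}) AND (family OR pedigree OR segregation)',
    ]
    return [t.format(gene=gene, names=names) for t in templates]


if __name__ == "__main__":
    # GENE1 and the variant names below are made up for this example.
    for query in build_search_strings("GENE1", ["c.122A>G", "p.Lys41Arg"]):
        print(query)
```

Because the templates are fixed, two analysts searching for the same variant start from identical queries. That does not guarantee the critical paper is found, but it removes one source of variation.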
For somatic variants, analysts will consult drug labels and focus heavily on guideline articles in the literature, for example from NCCN and ASCO, and will search for clinical trials that pertain to the variant. Additional literature containing treatment studies and prognostic data is also consulted.
For germline variants, the analysts will also consult pedigree information for variant scoring. “The more often you see a variant in affected individuals in families, the greater the weight of evidence that the variant tracks with disease,” Junaid says. “But if you’re seeing it in unaffected family members or broadly through the population, then it is more likely to be benign. Lab directors wouldn’t rely only on observations within families, but that’s certainly an important factor.”
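As a rough illustration of the family observations Junaid describes, the sketch below simply tallies how often a variant is seen in affected and unaffected relatives. The record layout (`affected` and `carrier` flags) and the function name are assumptions made for this example. It is a bookkeeping aid, not a scoring rule, and it does not replace the formal criteria a lab would apply.

```python
from collections import Counter


def tally_carriers(family):
    """Count family members by disease status and carrier status.

    `family` is a list of dicts with boolean `affected` and `carrier`
    keys (a hypothetical layout for this example).
    """
    counts = Counter()
    for member in family:
        status = "affected" if member["affected"] else "unaffected"
        genotype = "carrier" if member["carrier"] else "non_carrier"
        counts[(status, genotype)] += 1
    return counts


if __name__ == "__main__":
    # Made-up pedigree data for illustration only.
    family = [
        {"affected": True, "carrier": True},
        {"affected": True, "carrier": True},
        {"affected": False, "carrier": False},
        {"affected": False, "carrier": True},
    ]
    for key, n in sorted(tally_carriers(family).items()):
        print(key, n)
```

Many affected carriers would add weight to the idea that the variant tracks with disease, while carriers among unaffected relatives would point the other way. As noted above, these counts are only one factor among several.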
The goal of all the research is to find enough evidence to classify a variant as pathogenic or as benign, with as few variants as possible in the “unknown significance” category that lies between (for somatic variants, analysts aim to further classify them in terms of their actionability in the patient’s particular cancer or in another cancer). “Once a lab has enough evidence to confidently classify a variant, that’s very good,” Junaid says. “That’s what labs try to do: understand as accurately as possible the role of that variant in disease causation or what targeted treatments are available.”
When a particular variant has been characterized well enough to be definitively classified, it can be included in automated reporting pipelines for that test in the future, an important step in reducing both the test turnaround time and the time spent by analysts and lab directors in the analysis process.
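One way to picture that step is a lookup against the lab's own record of past classifications. The sketch below is a simplified assumption of how such routing could work; the dictionary of known variants, the labels, and the function name are hypothetical and do not describe any specific product or pipeline.

```python
def route_variants(variants, known):
    """Split variants into auto-reportable and manual-review lists.

    `variants` is a list of (gene, variant_name) tuples from a test.
    `known` maps those tuples to a previously assigned classification.
    """
    auto, manual = [], []
    for variant in variants:
        label = known.get(variant)
        # Only definitive calls are reused; anything else goes to an analyst.
        if label in ("pathogenic", "benign"):
            auto.append((variant, label))
        else:
            manual.append(variant)
    return auto, manual


if __name__ == "__main__":
    # Made-up genes and variants for illustration only.
    known = {
        ("GENE1", "c.122A>G"): "pathogenic",
        ("GENE2", "c.45C>T"): "unknown significance",
    }
    observed = [("GENE1", "c.122A>G"), ("GENE2", "c.45C>T"), ("GENE3", "c.7del")]
    auto, manual = route_variants(observed, known)
    print("auto:", auto)
    print("manual:", manual)
```

In this sketch, variants of unknown significance and variants the lab has never seen both fall through to manual review.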
Particularly for new variants, however, such automated reporting has not been possible and the interpretation process remains a customized, hands-on effort. “We’re dealing with the information that makes that individual unique,” Junaid says, “so every result has to be examined and reviewed.”
Because of that, variant scoring can be an incredibly tedious effort. “It might take a person a whole day to classify a single difficult variant,” Junaid says. Analysts can typically expect to spend several hours classifying each variant, he adds. “You have to search through all this information, read lengthy clinical research papers, analyze clinical data, and weigh the importance and relevance of each study. Then there are the modeling programs, such as RNA splicing predictions. It’s a very time-consuming process,” he notes.
As NGS-based genetic testing becomes more commonplace, and as labs see increasing test volumes, having such a hands-on interpretation workflow will not be sustainable. Accelerating that process is one of the key goals of our newest product, Ingenuity Clinical, currently being evaluated by early access users. Our team has worked hard on this new application and we’re eager to get it out to a broader customer base so we can help clinical teams save time and perform high-quality variant interpretation for these important tests.
If you’ll be attending this week’s AMP conference, visit the QIAGEN booths (#707 and #1023) for a demo of the new platform. Junaid will be at the conference to answer any of your questions. In the meantime, check back for the last blog in our series, which will focus on interpreting cancer variants.