GenomeWeb: Human Reference Genome Choice Impacts Variant Calling, But Switching is Tricky
By Ciara Curtin
The choice of which human reference genome a lab uses could influence the variants called, an issue research and clinical labs need to keep in mind as they choose a reference.
The GRCh38 human reference genome came out more than seven years ago. It filled in gaps, added alternate scaffolds, and made other updates, and new patches are still being released for it. But not all labs have made the switch from the GRCh37 reference genome, also known as hg19, to GRCh38. One recent survey has suggested that most clinical labs are reluctant to make the switch, as they are not sure whether the benefit is worth the cost of changing workflows.
A new study has found that, for some regions of the genome, the reference genome used to call variants from exome sequencing data can make a difference. Moez Dawood, an M.D./Ph.D. student at Baylor College of Medicine, and his colleagues uncovered a set of more than 200 genes, including ones implicated in Mendelian diseases, that are enriched for discordant variant calls.
"The implication in the entire field [that] moving up to 38 is just going to make things better is not quite what we saw," Dawood said, noting that in some cases the older reference performed better than the newer one, and in others the reverse was true. "You really need to be savvy about what you're looking at."
Still, the field is slowly updating its tools to work with GRCh38, though that changeover may be complicated by the release of the telomere-to-telomere human genome assembly.
"We're very quickly coming to a point where even as clinical labs migrate to 38, that there's a whole other frontier that's already live that they have to contend with," said Midhat Farooqi, the director of molecular oncology at the Center for Pediatric Genomic Medicine at Children's Mercy Kansas City.
According to Farooqi and Lansdon's survey of about two dozen clinical labs offering next-generation sequencing-based testing, only 7 percent had already moved over to GRCh38, and most, 54 percent, had no plans to change. Most commonly, labs said they did not think the benefits of changing over outweighed the costs in both time and money. Clinical labs, Farooqi noted, must revalidate their pipelines as well as realign their existing clinical data to GRCh38 for their internal variant database, a time-consuming process.
Still, a change might be coming. Both Lansdon and Farooqi noted that more tools that labs rely on for variant interpretation — like gnomAD and DECIPHER — now also support GRCh38 coordinates. Farooqi additionally pointed out that research labs are more likely to have made the switch, which will also lead the literature to switch and could, in turn, spur clinical labs to make the changeover.
"It's coming to a point where you cannot avoid the problem anymore," he said.
Read the full article via GenomeWeb