What is the Difference Between HG19 and HG38?

HG19 and HG38 are human reference genome assemblies; they correspond to the Genome Reference Consortium builds GRCh37 and GRCh38, respectively. They serve as the basis for mapping DNA sequences in phylogenetic and bioinformatic analysis. The key differences between HG19 and HG38 are:

  1. Regions with alternate loci: HG19 has 7 regions with alternate loci, while HG38 has 207 regions with alternate loci.
  2. Year of release: HG19 was released in 2009, while HG38 was released in 2013.
  3. Genome coverage: HG38 has better genome coverage than HG19. Around 6.5% of the bases in HG19 have no reads aligned, compared with around 4.4% of the bases in HG38.
  4. Variant calling: Calling with HG38 generates more single nucleotide variants (SNVs) than calling with HG19.
  5. Alignment: For genomic regions with intermediate amounts of aligned reads, HG19 and HG38 perform equally well. For regions with very high or very low amounts of aligned reads, however, alignments against HG19 and HG38 differ more often.
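
The coverage figures in item 3 are fractions of reference bases with zero aligned reads. A minimal sketch of that calculation follows; the function name `fraction_uncovered` and the depth values are hypothetical and chosen only for illustration, not taken from real alignments.

```python
def fraction_uncovered(depths):
    """Return the fraction of positions whose read depth is zero."""
    if not depths:
        raise ValueError("depths must not be empty")
    uncovered = sum(1 for d in depths if d == 0)
    return uncovered / len(depths)


# Hypothetical per-base read depths for one sample aligned to each build.
# These numbers are made up for illustration; they are not real data.
depth_hg19 = [0, 12, 30, 0, 8, 0, 15, 22, 0, 9]
depth_hg38 = [0, 12, 30, 4, 8, 0, 15, 22, 6, 9]

print(f"HG19 uncovered: {fraction_uncovered(depth_hg19):.1%}")
print(f"HG38 uncovered: {fraction_uncovered(depth_hg38):.1%}")
```

In practice the per-base depths would come from an alignment file rather than a hand-written list.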

In clinical exome sequencing data, some variants called against HG19 are not called against HG38, and vice versa. Overall, more variants are called with HG38 than with HG19.
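
One way to see this kind of discordance is to compare the two call sets directly. The sketch below assumes each variant carries a build-independent identifier (for example an rsID), because raw coordinates differ between the builds; the function name `compare_call_sets` and all identifiers shown are hypothetical.

```python
def compare_call_sets(calls_a, calls_b):
    """Split two collections of variant IDs into shared and build-specific sets."""
    a, b = set(calls_a), set(calls_b)
    return {"shared": a & b, "only_a": a - b, "only_b": b - a}


# Hypothetical variant identifiers, made up for illustration only.
hg19_calls = {"rs0001", "rs0002", "rs0003"}
hg38_calls = {"rs0002", "rs0003", "rs0004", "rs0005"}

result = compare_call_sets(hg19_calls, hg38_calls)
print("Called with both builds:", sorted(result["shared"]))
print("Called with HG19 only:  ", sorted(result["only_a"]))
print("Called with HG38 only:  ", sorted(result["only_b"]))
```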

Comparative Table: HG19 vs HG38

The following table summarizes the key differences between HG19 and HG38:

Feature | HG19 | HG38
Regions with alternate loci | 7 | 207
Year released | 2009 | 2013
Number of gaps between scaffolds | 271 | 349

The newer genome version (HG38) has better genome coverage and is generally more accurate than the older version (HG19). Additionally, converting coordinates from HG38 to HG19 is more error-prone than converting in the opposite direction. Given these differences, HG38 is the recommended reference for sequencing data analysis aimed at variant calling.
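
Coordinate conversion between builds works by mapping positions through blocks of sequence that align between the two assemblies; positions outside any aligned block cannot be converted. The toy sketch below illustrates only that idea. The `Block` type, the `lift` function, and the block coordinates are hypothetical; real conversions rely on chain files and dedicated tools such as UCSC liftOver.

```python
from typing import List, NamedTuple, Optional


class Block(NamedTuple):
    src_start: int  # 0-based inclusive start in the source build
    src_end: int    # exclusive end in the source build
    dst_start: int  # start of the matching block in the target build


def lift(pos: int, blocks: List[Block]) -> Optional[int]:
    """Map a source position to the target build, or None if it is unmapped."""
    for b in blocks:
        if b.src_start <= pos < b.src_end:
            return b.dst_start + (pos - b.src_start)
    return None  # position falls in a region with no aligned block


# Hypothetical aligned blocks, made up for illustration (not real chain data).
blocks = [Block(100, 200, 150), Block(300, 400, 360)]

for pos in (120, 250, 350):
    print(pos, "->", lift(pos, blocks))
```

Positions that return `None` here correspond to the kind of loss that makes conversion error-prone.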