On 4 August, PacBio published a study in Nature Methods introducing the “Platinum Pedigree,” a long-read genome benchmark built from deep sequencing of a 28-member, multi-generational family. The dataset catalogues more than 37 megabases of genetic variation and adds over 200 million bases to existing reference regions, expanding benchmark coverage to 2.77 gigabases, including segmental duplications and other difficult-to-map areas. When Google’s DeepVariant, an AI-based variant caller, was retrained on the new benchmark, erroneous variant calls fell by 34 percent across the genome, with the largest gains in complex regions. PacBio developed the resource with researchers at the University of Washington, the University of Utah and other institutions, and is making the data and pipelines freely available through a public repository. The company says the Platinum Pedigree sets a higher bar for evaluating sequencing workflows and for training AI models used in clinical and population genomics, which could accelerate the adoption of long-read technologies in diagnostic and research settings.
#PacBio long-read data was used to build a new genome benchmark that reduced DeepVariant errors by 34 percent in difficult regions. Platinum Pedigree improves variant calling accuracy and sets a higher standard for AI in genomics. Press release here: https://t.co/615fzP3Fxz https://t.co/02ckX7jAY6
A Telomere-to-Telomere Diploid Reference Genome and Centromere Structure of the Chinese Quartet. #HumanGenome #ReferenceGenome #ChinesePopulation #T2T #Genomics @biorxiv_genomic https://t.co/c78sVt3qo7 https://t.co/pTfUnpmrHP
$PACB - Nature Methods Paper Leverages PacBio Sequencing Technology to Develop the Platinum Pedigree Benchmark, a New Standard for Accurate Characterization of Variation in the Human Genome that Improves Training for AI Models - https://t.co/k5NXxYQCth