Nanopore sequencing and assembly of a human genome with ultra-long reads

January 29, 2018
Authors: Jain, Miten; Koren, Sergey; Miga, Karen H; Quick, Josh; Rand, Arthur C; Sasani, Thomas A; Tyson, John R; Beggs, Andrew D; Dilthey, Alexander T; Fiddes, Ian T; Malla, Sunir; Marriott, Hannah; Nieto, Tom; O’Grady, Justin; Olsen, Hugh E; Pedersen, Brent S; Rhie, Arang; Richardson, Hollian; Quinlan, Aaron R; Snutch, Terrance P; Tee, Louise; Paten, Benedict; Phillippy, Adam M; Simpson, Jared T; Loman, Nicholas J; Loose, Matthew

Abstract:

We report the sequencing and assembly of a reference genome for the human GM12878 Utah/Ceph cell line using the MinION (Oxford Nanopore Technologies) nanopore sequencer. 91.2 Gb of sequence data, representing 30× theoretical coverage, were produced. Reference-based alignment enabled detection of large structural variants and epigenetic modifications. De novo assembly of nanopore reads alone yielded a contiguous assembly (NG50 3 Mb). We developed a protocol to generate ultra-long reads (N50 > 100 kb, read lengths up to 882 kb). Incorporating an additional 5× coverage of these ultra-long reads more than doubled the assembly contiguity (NG50 6.4 Mb). The final assembled genome was 2,867 million bases in size, covering 85.8% of the reference. Assembly accuracy, after incorporating complementary short-read sequencing data, exceeded 99.8%. Ultra-long reads enabled assembly and phasing of the 4-Mb major histocompatibility complex (MHC) locus in its entirety, measurement of telomere repeat length, and closure of gaps in the reference human genome assembly GRCh38.
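
For readers unfamiliar with the contiguity and coverage figures quoted in the abstract, the following is a minimal sketch of how N50, NG50, and theoretical coverage are conventionally computed. It is not the authors' pipeline. The function names, the toy contig lengths, and the 3.1 Gb genome size are illustrative assumptions that do not come from the paper.

```python
def nx50(lengths, total=None):
    """Return the length L such that sequences of length >= L together
    span at least half of `total`.

    total=None     -> half of the summed sequence length (N50)
    total=G (int)  -> half of an assumed genome size G (NG50)
    """
    target = (sum(lengths) if total is None else total) / 2
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running >= target:
            return length
    return 0  # the sequences span less than half of `total`


def theoretical_coverage(total_bases, genome_size):
    """Average depth: sequenced bases divided by genome size."""
    return total_bases / genome_size


# Hypothetical toy contig lengths, for illustration only.
contigs = [2, 3, 4, 5, 6]
n50 = nx50(contigs)             # half of the assembly size (20) is the target
ng50 = nx50(contigs, total=30)  # half of an assumed genome size (30) is the target

# 91.2 Gb is from the abstract; the 3.1 Gb genome size is an assumption.
coverage = theoretical_coverage(91.2e9, 3.1e9)
```

NG50 differs from N50 only in the denominator: it is measured against the genome size rather than the assembly size, so it does not reward an assembly for being incomplete.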
