Genomic Technologies Reviewed
Is a $1000 genome feasible? Jay Shendure and colleagues break down this question into its components:
To resequence a genome, the sequencing error rate must be significantly lower than the amount of variation that is to be detected. As human chromosomes differ at 1 in every 1,000 bases, an error rate of 1/100,000 bp is a reasonable goal. If the base accuracy of a raw read is 99.7% (on a par with state-of-the-art instruments), and assuming that errors are random and independent, then ×3 coverage of each base will yield the desired error rate. However, to ensure a minimum ×3 coverage of >95% of a diploid human genome, ×6.5 coverage is required, or 40 billion raw bases. In this situation, the cost per base for an accurate US $1,000 genome must approach 40 million raw bases per US $1, a 4–5-log improvement over current methods. Although they could potentially approach the cost of a US $2,000 computer, current integrated genomics devices typically cost US $50,000–500,000. If we assume that the capital/operating costs of our hypothetical instrument are similar to those of conventional electrophoretic sequencers, the bulk of improvements must derive from an increase in the rate of sequence acquisition per device from 24 bases per second (bp/s) to 450,000 bp/s. No assembly is required in resequencing a genome; sequencing reads need only be long enough to allow a given read to be matched to a unique location in an assembled reference genome, and then to determine whether and how that read differs from the reference. In a model in which bases are ordered at random, nearly all 20-bp reads would be expected to be unique (4^20 >> 3 × 10^9). However, as the mammalian genome falls short of being random, only 73% of 20-bp genomic reads can in fact be assigned to a single unique location. Achieving >95% uniqueness, a modest goal, will require reads of 60 bp.
Given these assumptions, a resequencing instrument that can deliver a US $1,000 human genome with reasonable coverage and accuracy will need to achieve 60-bp reads with 99.7% raw-base accuracy, acquiring data at a rate of 450,000 bp/s. Departures from this situation are almost certain, but will generally involve some trade-off — for example, dropping capital/operating costs by tenfold would enable an instrument with one-tenth of the throughput to achieve the same cost per base.
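The arithmetic in the passage above can be checked with a short Python sketch. Everything in it is my own reconstruction, not the authors' calculation: I assume a consensus call goes wrong only when at least two of three reads agree on the same wrong base (with the three wrong bases equally likely), that read depth follows a Poisson distribution, and that a diploid genome is two copies of roughly 3 × 10^9 bp. The function names and the cost-rate figure in the last step are hypothetical.

```python
import math

RAW_ERROR = 1 - 0.997        # per-base error of a raw read (from the text)
DIPLOID_BASES = 2 * 3e9      # assumption: two copies of a ~3 x 10^9 bp genome
TARGET_COST_USD = 1000

def consensus_error(e, n_wrong=3):
    """P(at least two of three reads agree on the same wrong base).

    Assumption: errors are independent and each of the n_wrong
    incorrect bases is equally likely.
    """
    p = e / n_wrong  # chance a read reports one particular wrong base
    return n_wrong * (3 * p**2 * (1 - p) + p**3)

def frac_covered(mean_cov, min_cov=3):
    """Fraction of bases with depth >= min_cov under a Poisson depth model."""
    below = sum(math.exp(-mean_cov) * mean_cov**k / math.factorial(k)
                for k in range(min_cov))
    return 1 - below

def cost_per_base(cost_per_second, bases_per_second):
    """Cost per raw base for an instrument with the given running cost rate."""
    return cost_per_second / bases_per_second

raw_bases = 6.5 * DIPLOID_BASES
print("consensus error at x3:", consensus_error(RAW_ERROR))
print("fraction with >= x3 depth at x6.5:", frac_covered(6.5))
print("raw bases needed:", raw_bases)
print("raw bases per US $1:", raw_bases / TARGET_COST_USD)
print("log10 speed-up, 24 -> 450,000 bp/s:", math.log10(450_000 / 24))
print("4^20 vs genome size:", 4**20, "vs", 3e9)

# Trade-off from the text: a tenfold cheaper instrument with one-tenth
# of the throughput has the same cost per base. The cost rate is made up.
baseline = cost_per_base(0.01, 450_000)
cheaper = cost_per_base(0.001, 45_000)
print("same cost per base:", math.isclose(baseline, cheaper))
```

Under these assumptions the consensus error comes out just under 1/100,000, about 95.7% of bases reach ×3 depth at ×6.5 mean coverage, and the speed-up from 24 bp/s to 450,000 bp/s is a little over four logs, all consistent with the quoted figures. The raw-base total works out to 39 billion, which the text rounds to 40 billion. A plain majority vote (any two wrong reads, whether or not they agree) would give an error closer to 1 in 37,000, so the exact error model matters here.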
Read the whole article.
Jay Shendure, Robi Mitra, Chris Varma, and George Church. Advanced Sequencing Technologies: Methods and Goals. Nature Reviews Genetics 5:335-344, 2004. (I doubt the link to the article will be persistent. Benevolently, the authors have posted a preprint (pdf) at their recently updated Personal Genome Project website.)