The Drosophila santomea genome - release 1.0

 

Release 1.0 details.  A standard Illumina library was prepared from genomic DNA of 10 adult females from a single strain (STO-CAGO1402-3, courtesy of Manyuan Long) that was inbred for 10 generations.  Illumina sequence reads (54 bp) were aligned to the D. yakuba reference genome sequence (release 1.3) using bwa (http://bio-bwa.sourceforge.net/bwa.shtml) with the option -n 0.15.  The average coverage is ~10X.  This is an as yet unpublished work in progress, so use it at your own risk and keep in mind the limitations inherent in a reference-based alignment approach of this type.


Release 1.0 files

    dsan-all-chromosome-yak1.3-r1.0.fastq.zip  - Standard fastq format (@sequence/+phredQV).
                                                 Note that the default for all base positions is the
                                                 D. yakuba genome reference state with quality score 0.

    dsan-all-chromosome-yak1.3-r1.0.fasta.zip  - just the fasta formatted sequence

    dsan-all-chromosome-yak1.3-r1.0.qual.zip   - just the alignment phred quality scores
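
Because positions without D. santomea read support default to the D. yakuba reference base with quality score 0, users may want to mask those positions before any analysis.  The following is a minimal Python sketch of one way to do that, not part of the release itself.  It assumes that quality characters use the Phred+33 ASCII encoding (adjust `offset` if the files use a different encoding) and that sequence and quality strings may be wrapped over several lines.  The function names `read_fastq` and `mask_reference_default` and the record name `chrDemo` are hypothetical, introduced only for illustration.

```python
import io


def read_fastq(handle):
    """Yield (name, sequence, quality) tuples from a FASTQ text stream.

    Sequence and quality strings may be wrapped over several lines.
    """
    line = handle.readline()
    while line:
        line = line.rstrip("\r\n")
        if not line:
            # Skip blank lines between records.
            line = handle.readline()
            continue
        if not line.startswith("@"):
            raise ValueError("expected '@' header, got: " + line[:20])
        name = line[1:].strip()

        # Sequence lines run until the '+' separator line.
        seq_parts = []
        line = handle.readline()
        while line and not line.startswith("+"):
            seq_parts.append(line.strip())
            line = handle.readline()
        seq = "".join(seq_parts)

        # Quality lines run until they cover the whole sequence.
        qual_parts = []
        qual_len = 0
        while qual_len < len(seq):
            line = handle.readline()
            if not line:
                raise ValueError("truncated quality string for " + name)
            chunk = line.rstrip("\r\n")
            qual_parts.append(chunk)
            qual_len += len(chunk)
        qual = "".join(qual_parts)
        if len(qual) != len(seq):
            raise ValueError("sequence/quality length mismatch for " + name)

        yield name, seq, qual
        line = handle.readline()


def mask_reference_default(seq, qual, offset=33, min_q=1, mask_char="N"):
    """Replace bases whose quality is below min_q with mask_char.

    offset=33 is an ASSUMPTION about the quality encoding (Phred+33).
    With min_q=1, only quality-0 (reference default) positions are masked.
    """
    return "".join(
        base if ord(q) - offset >= min_q else mask_char
        for base, q in zip(seq, qual)
    )


# Tiny in-memory example; '!' encodes quality 0 and 'I' quality 40 in Phred+33.
demo = io.StringIO("@chrDemo\nACGT\nAC\n+\n!I!I\nI!\n")
records = [(name, mask_reference_default(seq, qual))
           for name, seq, qual in read_fastq(demo)]
print(records)
```

To apply this to the release, unzip the fastq file and pass an open text handle to `read_fastq` in place of `demo`.  Raising `min_q` gives a stricter filter.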



Background.  D. santomea was first described by Lachaise et al. (2000) as a new melanogaster-group sister species endemic to the island of São Tomé off the coast of West Africa.  D. santomea is most closely related to D. yakuba, and the two species diverged ~0.5 Mya (Cariou et al. 2001; Llopart et al. 2005; Bachtrog et al. 2006).  The species has a number of derived characters relative to D. yakuba, including highly reduced pigmentation as well as distinct mating and temperature preferences, making it fertile ground for studies of the evolution of novel characters (Llopart et al. 2002; Coyne et al. 2004; Carbone et al. 2005; Llopart et al. 2005; Mas and Jallon 2005; Moehring et al. 2006; Jeong et al. 2008; Matute and Coyne 2009).  The species is only partially reproductively isolated from D. yakuba, facilitating the genetic dissection of the factors underlying reproductive isolation and other derived phenotypic traits of interest.


The current draft genome is a reference-guided assembly of 65.9 million 54 bp Illumina sequence reads aligned to the D. yakuba reference genome sequence (release 1.3), yielding an average coverage of ~10X.  Further updates are expected soon, including higher coverage with paired-end reads and a de novo assembly in collaboration with Mike Eisen at UC Berkeley.


We hope this will stimulate evolutionary genetic research on D. santomea and other Drosophila species - enjoy!


Peter Andolfatto, pandolfa[at]princeton.edu

Tina Hu

Kevin Thornton