Genome evolution.

A major theme of research in my lab over the past decade, in which we have had a marked impact on the field, has been to document the effects of positive and negative selection in the Drosophila genome, through data analysis and the development of novel methods to estimate population genetic parameters. Notably, we have shown that most non-coding DNA in the Drosophila genome is subject to purifying selection and exhibits greater divergence than predicted by the neutral theory, consistent with an important role for recurrent positive selection. We have arrived at this conclusion from several complementary lines of evidence, including patterns of polymorphism and divergence; the distribution of polymorphism frequencies; and patterns of neutral diversity linked to divergent amino acid substitutions. Together, these findings, along with work from other groups, reveal that positive selection is much more frequent in the Drosophila genome than previously believed. They therefore strongly challenge long-held views about the importance of positive selection relative to genetic drift in determining patterns of genome evolution.

Figure 1. Relative rates of lineage-specific divergence on the X chromosome versus autosomes in D. melanogaster and D. simulans (Hu et al. Genome Res 2012). Estimates are partitioned by annotation class: 0f = 0-fold degenerate; 4f = 4-fold degenerate; FEI = fastest evolving intronic;



Our work continues along two paths that use both population genomic and transgenic/engineering approaches. First, piggybacking on the efforts to understand the evolution of gene regulatory networks, we continue to develop genomic resources for the associated species in the D. yakuba and D. simulans species groups by establishing improved genome reference sequences and new population genomic datasets using a combination of recent technologies (e.g., PacBio, Illumina and 10x Genomics). With these data, we continue to work with theoreticians to estimate selection parameters shaping patterns of genome variability in these species. A key parameter in such inferences is the fine-scale recombination landscape. Previously, my group developed a novel high-throughput genotyping method that enables us to rapidly and cost-effectively genotype thousands of recombinant Drosophila genomes for roughly $1-2 per genome. We have applied this method to construct linkage maps in a variety of contexts. In a continuing collaboration with David Stern (HHMI Janelia Farm) and Justin Blumenstiel (U Kansas), we are using this approach to generate the first high-resolution genetic maps for species in the D. simulans and D. yakuba species groups.

Second, we are also employing transgenic/engineering approaches to investigate the relationship between non-coding genome divergence (a substantial fraction of which is putatively adaptive) and phenotype. For example, much of the “adaptive non-coding DNA divergence” that we detected in Drosophila is presumed to affect gene expression phenotype. We have just completed a five-year NIH R01 grant investigating the effect of 3’ UTR sequence divergence on gene expression divergence between two closely related species of Drosophila (as an example of the effects of regulatory non-coding DNA). During the course of that study, we developed transgenic reporter and CRISPR-Cas9-mediated genome editing approaches to evaluate the effect of 3’ UTR divergence on gene expression phenotype in D. melanogaster. A major finding to emerge from these studies so far is that substitutions between closely related Drosophila species that result in gene expression differences often involve interactions with the genetic background on which they arose (a phenomenon called “epistasis”), in a sex- and tissue-dependent manner. This finding has broad implications for the relationship between genotype and phenotype in regulatory non-coding DNA and provides an interesting link to evidence for epistasis we have previously documented in Drosophila protein evolution.

Representative papers:

Hu TT*, Eisen MB, Thornton KR, Andolfatto P. 2013. A second generation assembly of the Drosophila simulans genome provides new insights into patterns of lineage-specific divergence. Genome Research, 23:89-98. [Epub ahead of print Aug 2012]

Rogers RL, Cridland JM, Shao L, Hu TT*, Andolfatto P, Thornton KR. 2014. Landscape of standing variation for tandem duplications in Drosophila yakuba and Drosophila simulans. Mol Biol Evol, 31:1750-66.

Elyashiv E, Sattath S, Hu TT*, Strustovsky A, McVicker G, Andolfatto P, Coop G, Sella G. 2016. A genomic map of the effects of linked selection in Drosophila. PLoS Genetics, 12:e1006130.