Non-coding RNA and evolution

Simple Mendelian disorders can be caused by a wide variety of genetic variants, from single-nucleotide point mutations to short insertions and deletions (indels) to larger structural variants. However, most research into complex disease has applied genetic association methods to single nucleotide polymorphisms (SNPs) in conserved, single-copy parts of the genome. This overlooks the fast-evolving, structurally variable parts of the human genome, which represent perhaps 5-10% of the total sequence and are known to be enriched for functional non-coding sequence. In particular, miRNAs evolve rapidly and are often present in multiple, variable copy numbers. We are exploring the contribution of such regions to quantitative traits in a vertebrate model system, focusing in particular on miRNA gene evolution, and developing methods that may be applicable to the structurally diverse portion of the human genome.
Cichlid fish from the Great Lakes of Africa present an unparalleled opportunity to study the genetic basis of vertebrate phenotypes in a rapidly evolving system. In the last 5 million years, since humans began to diverge from chimps, African cichlids have radiated into more than 1,500 species that differ in craniofacial morphology, pigmentation, behavior and many other traits. These species can be considered a collection of “natural mutants”, screened by evolutionary selection for phenotypes expressed at all life stages. Because the species are so closely related, they share a fundamentally similar genetic system, which allows us to explore the specific genetic differences responsible for their diverse phenotypes. In recognition of this potential, five African cichlid genomes were sequenced early in 2011 (Broad Institute). As with other fish genomes, these show extensive structural and copy number variation, even within a single diploid individual.
In collaboration with Richard Durbin, we are using genome sequencing to investigate genetic diversity within a set of 100 phenotypically diverse but closely related Malawi haplochromine cichlids, and to assess how copy-number-variable regions and non-coding RNAs contribute to their phenotypic diversity. We expect this work to deliver new insights into functional non-coding RNAs in general and miRNA/miRNA target interactions in particular.
