"Next-generation Mendelian genetics by exome sequencing"
Whole exome resequencing reveals an unexpected amount of variability with possible functional consequences in human microRNAs
Targeted exome resequencing projects produce enormous amounts of data on protein-coding sequences as well as on other functionally interesting elements of the genome, including microRNAs. These non-coding RNAs are key components of the gene regulatory network in a wide range of species and operate by base complementarity. Because of this mode of action, such regulatory elements are commonly believed to be highly conserved. Analysis of 23 exomes from healthy individuals from southern Spain, sequenced in the context of the Medical Genome Project, has uncovered an unexpected amount of variability in microRNAs. A total of 558 variants were found in 291 different miRNAs, 131 of which are known to be involved in almost 200 diseases. Of the 558 variants, 487 (87%) were described for the first time in this study. This figure almost doubles the number of known variants in microRNAs and constitutes a remarkably high discovery rate. Variants affected different parts of the mature microRNA structure, which suggests that the variability found may have functional effects. The average number of variants found within miRNA positions was 118 per individual. Although miRNAs were thought to be highly conserved genomic elements, our study has uncovered an unexpectedly high level of variability.
Sequencing whole exomes to identify highly penetrant variants in a few individuals is becoming relatively easy, and variant calling can appear to be a simple push-button procedure. However, understanding data quality and filtering out potential false positives in SNP calling is far more difficult. We will give a tour of the key QC and filtering issues, and discuss our experiences in calling variants from exome sequencing projects at UCL Genomics.
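As a rough illustration of the kind of filtering step mentioned above, the following minimal sketch drops variant calls with a low quality score or low read depth from VCF-style records. It is not the pipeline used at UCL Genomics: the function names, the thresholds (`min_qual=30.0`, `min_depth=10`) and the example records are all assumptions chosen for illustration, and real projects typically combine many more criteria.

```python
def parse_info(info):
    """Turn a VCF INFO string such as 'DP=35;MQ=60' into a dict."""
    fields = {}
    for item in info.split(";"):
        if "=" in item:
            key, value = item.split("=", 1)
            fields[key] = value
        elif item and item != ".":
            fields[item] = True  # flag entries carry no value
    return fields


def passes_filters(vcf_line, min_qual=30.0, min_depth=10):
    """Return True if a VCF data line meets the (hypothetical) thresholds."""
    cols = vcf_line.rstrip("\n").split("\t")
    qual, info = cols[5], parse_info(cols[7])
    if qual == ".":
        return False  # no quality score reported: treat as unreliable
    depth = int(info.get("DP", 0))  # a missing DP counts as zero depth
    return float(qual) >= min_qual and depth >= min_depth


def filter_variants(lines, **thresholds):
    """Keep data lines that pass the filters; header lines are skipped."""
    return [
        line
        for line in lines
        if not line.startswith("#") and passes_filters(line, **thresholds)
    ]


# Made-up records for illustration only (not real data).
EXAMPLE = [
    "##fileformat=VCFv4.1",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "1\t1000\t.\tA\tG\t50\t.\tDP=40",  # good quality and depth
    "1\t2000\t.\tC\tT\t12\t.\tDP=40",  # low quality
    "1\t3000\t.\tG\tA\t80\t.\tDP=4",   # low depth
]

kept = filter_variants(EXAMPLE)
for line in kept:
    print(line)
```

Hard thresholds like these are only a starting point: suitable cut-offs depend on the capture kit, the sequencing depth and the variant caller, so they are best checked against the data rather than fixed in advance.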