Preprocessing and analysis of microRNA-seq data


Author(s): Matthew Nicholson McCall, Sami Leon, Andrea Baran

Affiliation(s): University of Rochester Medical Center

Social media: https://twitter.com/matthewnmccall

MicroRNAs are a class of small (18-24 nucleotide) RNAs that are essential regulators of gene expression; they act within the RNA-induced silencing complex (RISC) to bind mRNAs and suppress translation (Valencia-Sanchez et al., 2006). Alterations in microRNA expression have been shown to disrupt entire cellular pathways, substantially contributing to a variety of human diseases (Mendell and Olson, 2012). Despite nearly 25 years of research, microRNAs remain difficult to measure because of their short length, relatively small number, sequence similarity, and the difficulty of isolating them from other small RNA fragments. Most recent studies use small RNA-seq (also called microRNA-seq) to quantify microRNA expression because it allows quantification of isomiRs (microRNA isoforms) and makes it possible to identify novel microRNAs. Statistical analyses of microRNA-seq data are typically performed using methods developed for mRNA-seq data, even though microRNA-seq data violate several assumptions of these methods. We propose new statistical methods for preprocessing and analysis of microRNA-seq data that are tailored to the specific complexities of these data.