Computational Regulatory Genomics

OHLER LABORATORY

Multivariate Markov Modeling Inference Engine

MUMMIE is a modeling tool designed for advanced sequence analysis in the post-genomic era. In addition to traditional HMM modeling of nucleic or amino acid sequences, it can use any number of additional “parallel tracks” containing continuous or discrete data, for example epigenetic marks, next-gen sequencing read coverage, conservation scores, or thermodynamic scores. Anything you can view in the UCSC browser can in principle be used in MUMMIE.

Whereas a standard HMM models the grammatical structure of a DNA sequence in isolation (or multiple DNA sequences in the case of a phyloHMM or pairHMM), MUMMIE allows continuous covariates to inform the prediction process: the probability distribution in each state is modified to jointly model the sequence and the covariates. This makes the approach applicable to a wide range of problems. So far, MUMMIE has been used to predict miRNA target sites in RNA molecules, to perform motif finding in next-gen sequencing data, and to infer combinatorial binding of regulatory proteins.
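
The idea of letting each state jointly model the sequence and a covariate can be illustrated with a small Python sketch. This is not MUMMIE's code, file format, or parameterization. It is a toy two-state HMM in which every state emits a nucleotide (categorical distribution) together with one continuous track value (Gaussian), and the two are assumed independent given the state purely to keep the example short. The state names, probabilities, and function names are all hypothetical.

```python
import math

# Hypothetical two-state model; all numbers are illustrative assumptions.
STATES = ["background", "site"]

LOG_START = {"background": math.log(0.9), "site": math.log(0.1)}
LOG_TRANS = {
    "background": {"background": math.log(0.9), "site": math.log(0.1)},
    "site": {"background": math.log(0.2), "site": math.log(0.8)},
}

# Discrete part of the emission: nucleotide probabilities per state.
NUC_PROBS = {
    "background": {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25},
    "site": {"A": 0.1, "C": 0.4, "G": 0.4, "T": 0.1},
}

# Continuous part of the emission: Gaussian (mean, sd) of one track per state.
TRACK_PARAMS = {"background": (0.0, 1.0), "site": (3.0, 1.0)}


def log_gaussian(x, mean, sd):
    """Log density of a univariate normal distribution."""
    return -0.5 * math.log(2 * math.pi * sd * sd) - (x - mean) ** 2 / (2 * sd * sd)


def log_emission(state, nucleotide, track_value):
    """Joint log emission: nucleotide term plus track term.

    Assumes the nucleotide and the track value are independent given the state.
    """
    mean, sd = TRACK_PARAMS[state]
    return math.log(NUC_PROBS[state][nucleotide]) + log_gaussian(track_value, mean, sd)


def viterbi(sequence, track):
    """Return the most probable state path for a sequence and its parallel track."""
    if len(sequence) != len(track):
        raise ValueError("sequence and track must have the same length")
    if not sequence:
        return []
    scores = [{s: LOG_START[s] + log_emission(s, sequence[0], track[0]) for s in STATES}]
    backpointers = []
    for nucleotide, value in zip(sequence[1:], track[1:]):
        previous = scores[-1]
        current, pointers = {}, {}
        for s in STATES:
            # Best predecessor state for s at this position.
            best = max(STATES, key=lambda p: previous[p] + LOG_TRANS[p][s])
            current[s] = previous[best] + LOG_TRANS[best][s] + log_emission(s, nucleotide, value)
            pointers[s] = best
        scores.append(current)
        backpointers.append(pointers)
    # Trace back from the best final state.
    state = max(STATES, key=lambda s: scores[-1][s])
    path = [state]
    for pointers in reversed(backpointers):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path
```

Calling `viterbi("AAGCTT", [0.1, -0.2, 4.8, 5.1, 0.0, 0.3])` labels the two middle positions as "site" and the rest as "background", because the high track values there outweigh the cost of switching states. With the same sequence and a flat track of zeros, every position is labeled "background", which shows how the covariate changes the parse.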

As a modeling framework, MUMMIE supports these and other analyses. It provides command-line UNIX programs that researchers use to build models, train them in a supervised, unsupervised, or semi-supervised manner, and deploy them to perform parsing, clustering, or classification of test sequences using rigorous probabilistic methods.

MUMMIE is open-source software developed at Duke University.

MUMMIE has been successfully used for microRNA target-site prediction in PAR-CLIP data: see the microMUMMIE page.