Max Planck Institute for Plant Breeding Research, University of Cologne
"Turning Genomics Data into Text - An Application of Hidden Markov Models"
One of the key questions in biology is how the information encoded in the DNA is read and interpreted. A plethora of proteins bind the DNA at specific positions and thereby modify the way the genomic text is transcribed into RNA, the next layer of functional molecules in the cell. The combination of proteins bound to a given DNA locus is highly characteristic and is therefore termed the “epigenomic code” of the cell. Modern biochemical techniques allow us to measure the binding behavior of these proteins genome-wide, yet interpreting these binding patterns requires statistical tools such as hidden Markov models, which turn fuzzy, high-dimensional protein binding data into discrete, interpretable patterns. After this conversion step, the genome can be searched like a standard text, e.g., using regular expression search, to discover and annotate new functionally relevant regions in the genome.
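The conversion from binding signal to searchable text can be illustrated with a minimal sketch. It is not the speaker's actual model: the three state labels (B, P, E), the Gaussian emission means, the self-transition probability, and the toy signal for two proteins are all assumptions chosen for illustration. Viterbi decoding assigns one state letter to each genomic bin, and the resulting string is then searched with a regular expression.

```python
import math
import re

# Hypothetical state labels: B = background, P = protein 1 bound, E = protein 2 bound.
STATES = "BPE"
# Assumed mean signal of (protein 1, protein 2) in each state, with unit variance.
MEANS = {"B": (0.0, 0.0), "P": (3.0, 0.0), "E": (0.0, 3.0)}
STAY = 0.8  # assumed probability of remaining in the same state


def log_emission(state, obs):
    """Log density of one observation under independent unit-variance Gaussians."""
    return sum(
        -0.5 * (x - m) ** 2 - 0.5 * math.log(2 * math.pi)
        for x, m in zip(obs, MEANS[state])
    )


def log_transition(prev, cur):
    """Log transition probability; the leftover mass is split evenly over other states."""
    return math.log(STAY if prev == cur else (1 - STAY) / (len(STATES) - 1))


def viterbi(observations):
    """Return the most probable state sequence as a string of state letters."""
    scores = {
        s: math.log(1 / len(STATES)) + log_emission(s, observations[0])
        for s in STATES
    }
    backpointers = []
    for obs in observations[1:]:
        new_scores, pointers = {}, {}
        for cur in STATES:
            best_prev = max(STATES, key=lambda p: scores[p] + log_transition(p, cur))
            new_scores[cur] = (
                scores[best_prev] + log_transition(best_prev, cur) + log_emission(cur, obs)
            )
            pointers[cur] = best_prev
        scores = new_scores
        backpointers.append(pointers)
    # Trace back from the best final state.
    state = max(STATES, key=lambda s: scores[s])
    path = [state]
    for pointers in reversed(backpointers):
        state = pointers[state]
        path.append(state)
    return "".join(reversed(path))


# Toy binding signal: one (protein 1, protein 2) measurement per genomic bin.
signal = [
    (0.1, -0.2), (0.3, 0.1), (2.9, 0.2), (3.1, -0.1), (2.8, 0.3),
    (0.2, 3.2), (-0.1, 2.9), (0.0, 0.1), (0.2, -0.3),
]

path = viterbi(signal)
# Search the decoded "genome text" for a P run directly followed by an E run.
hits = [(m.start(), m.end()) for m in re.finditer(r"P+E+", path)]
print(path, hits)
```

In a real analysis the emission and transition parameters would be estimated from the data rather than fixed by hand, and the regular expression would encode a biologically motivated pattern of states.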
Research Interest: Computational Biology, RNA metabolism, Epigenomics
Research Statement: To obtain a systems-level understanding of the cell, we develop and apply new statistical models capable of dealing with large amounts of heterogeneous data.