Associate Professor, Biological Statistics & Computational Biology
Director of Graduate Studies, Computational Biology
Associate Director, Cornell Center for Comparative and Population Genomics
Address: 102E Weill Hall,
Cornell University, Ithaca, NY 14853
Email: acs4 at cornell dot edu
My research interests lie in the area where statistics, computer science, evolutionary biology, and genomics meet. Currently, my main focus is developing computational methods for the identification of functional elements in eukaryotic (primarily mammalian) genomes, based on comparative sequence data. A major theme in my work is to model and analyze the evolution and the function of genomic sequences simultaneously, so that evolution sheds light on function, and function sheds light on evolution. I like to tackle problems of practical importance in genomics, such as gene finding and conserved element identification, using methods from machine learning and computational statistics. As much as possible, I try to stay grounded in biology by working with experimentalists to test predicted functional elements in the lab.
In the fall, I alternate between teaching "BTRY 4840/6840: Computational Genomics" and "BTRY 6790: Probabilistic Graphical Models" (cross-listed as CS 6782).
In Fall 2013, I am teaching BTRY 6790 (course web page). The official description for this course is as follows:
A thorough introduction to probabilistic graphical models, a flexible and powerful graph-based framework for probabilistic modeling. Covers directed and undirected models, exact and approximate inference, and learning in the presence of latent variables. Hidden Markov models, conditional random fields, and Kalman filtering are explored in detail.
In Fall 2014, I plan to teach BTRY 4840/6840 (course web page). The official description for this course is as follows:
A rigorous treatment of important computational principles and methods for the analysis of genomic data, emphasizing comparative and evolutionary genomics. Topics include sequence alignment, gene and motif finding, phylogeny reconstruction, and inference of gene regulatory networks. Covers both maximum likelihood and Bayesian principles, and both exact and approximate algorithms for inference. Draws heavily on general concepts from probabilistic graphical models.
Arbiza L, Gronau I, Aksoy BA, Hubisz MJ, Gulko B, Keinan A, Siepel A. Genome-wide inference of natural selection on human transcription factor binding sites. Nature Genetics 45(7):723-729, 2013.
Guertin MJ*, Martins AL*, Siepel A, Lis JT. Accurate prediction of inducible transcription factor binding intensities in vivo. PLoS Genetics 8(3):e1002610, 2012.
Gronau I, Hubisz MJ, Gulko B, Danko CG, Siepel A. Bayesian inference of ancient human demography from individual genome sequences. Nature Genetics 43(10):1031-1034, 2011.
Siepel A. Phylogenomics of primates and their ancestral populations. Genome Res 19:1929-1941, 2009.
Kosiol C, Vinar T, da Fonseca RR, Hubisz MJ, Bustamante CD, Nielsen R, Siepel A. Patterns of positive selection in six mammalian genomes. PLoS Genetics 4(8):e1000144, 2008.
Siepel A, Diekhans M, Brejova B, Langton L, Stevens M, Comstock CLG, Davis C, Ewing B, Oommen S, Lau C, Yu H-C, Li J, Roe BA, Green P, Gerhard DS, Temple G, Haussler D, Brent MR. Targeted discovery of novel human exons by comparative genomics. Genome Res 17:1763-1773, 2007.
Rhesus Macaque Genome Sequencing and Analysis Consortium. Evolutionary and biomedical insights from the rhesus macaque genome. Science 316(5822):222-234, 2007.
Pollard KS, Salama SR, Lambert N, Lambot M-A, Coppens S, Pedersen JS, Katzman S, King B, Onodera C, Siepel A, Kern AD, Dehay C, Igel H, Ares M, Vanderhaeghen P, Haussler D. An RNA gene expressed during cortical development evolved rapidly in humans. Nature 443:167-172, 2006.
Siepel A, Bejerano G, Pedersen JS, Hinrichs A, Hou M, Rosenbloom K, Clawson H, Spieth J, Hillier LW, Richards S, Weinstock GM, Wilson RK, Gibbs RA, Kent WJ, Miller W, Haussler D. Evolutionarily conserved elements in vertebrate, insect, worm, and yeast genomes. Genome Res 15:1034-1050, 2005.
Siepel A, Haussler D. Phylogenetic estimation of context-dependent substitution rates by maximum likelihood. Mol Biol Evol 21:468-488, 2004.
Siepel A, Haussler D. Combining phylogenetic and hidden Markov models in biosequence analysis. J Comput Biol 11:413-428, 2004.