Associate Professor, Biological Statistics & Computational Biology
Director of Graduate Studies, Computational Biology
Associate Director, Cornell Center for Comparative and Population Genomics
Address: 102E Weill Hall,
Cornell University, Ithaca, NY 14853
Email: acs4 at cornell dot edu (For students)
My research interests lie in the area where statistics, computer science, evolutionary biology, and genomics meet. Currently, my main focus is developing computational methods for the identification of functional elements in eukaryotic (primarily mammalian) genomes, based on comparative sequence data. A major theme in my work is to model and analyze the evolution and the function of genomic sequences simultaneously, so that evolution sheds light on function, and function sheds light on evolution. I like to tackle problems of practical importance in genomics, such as gene finding and conserved element identification, using methods from machine learning and computational statistics. As much as possible, I try to stay grounded in biology by working with experimentalists to test predicted functional elements in the lab.
My research is supported by a Packard Fellowship, a Microsoft Research New Faculty Fellowship, a Sloan Research Fellowship, a National Science Foundation CAREER Award, and grants from the National Institutes of Health.
In the fall, I alternate between teaching "BTRY 4840/6840: Computational Genomics" and "BTRY 6790: Probabilistic Graphical Models" (cross-listed as CS 6782).
In Fall 2011, I am teaching BTRY 4840/6840 (course web page). The official description for this course is as follows:
A rigorous treatment of important computational principles and methods for the analysis of genomic data, emphasizing comparative and evolutionary genomics. Topics include sequence alignment, gene and motif finding, phylogeny reconstruction, and inference of gene regulatory networks. Covers both maximum likelihood and Bayesian principles, and both exact and approximate algorithms for inference. Draws heavily on general concepts from probabilistic graphical models.
In Fall 2010, I taught BTRY 6790 (course web page). The official description for this course is as follows:
A thorough introduction to probabilistic graphical models, a flexible and powerful graph-based framework for probabilistic modeling. Covers directed and undirected models, exact and approximate inference, and learning in the presence of latent variables. Hidden Markov models, conditional random fields, and Kalman filtering are explored in detail.
Unfortunately, this course will not be offered in 2012, as I will be on sabbatical for the 2012-2013 academic year, but I hope to offer it again in 2013 or 2014.
I also contribute to a team-taught course called "BTRY 6700: Applied Bioinformatics", together with Haiyuan Yu, Jason Mezey, and Alon Keinan.
Siepel A. Phylogenomics of primates and their ancestral populations. Genome Res, 19:1929-1941, 2009.
Kosiol C, Vinar T, da Fonseca RR, Hubisz MJ, Bustamante CD, Nielsen R, Siepel A. Patterns of positive selection in six mammalian genomes. PLoS Genet, 4(8):e1000144, 2008.
Siepel A, Diekhans M, Brejova B, Langton L, Stevens M, Comstock CLG, Davis C, Ewing B, Oommen S, Lau C, Yu H-C, Li J, Roe BA, Green P, Gerhard DS, Temple G, Haussler D, Brent MR. Targeted discovery of novel human exons by comparative genomics. Genome Res, 17:1763-1773, 2007.
Rhesus Macaque Genome Sequencing and Analysis Consortium. Evolutionary and biomedical insights from the rhesus macaque genome. Science, 316:222-234, 2007.
Pollard KS, Salama SR, Lambert N, Lambot M-A, Coppens S, Pedersen JS, Katzman S, King B, Onodera C, Siepel A, Kern AD, Dehay C, Igel H, Ares M, Vanderhaeghen P, and Haussler D. An RNA gene expressed during cortical development evolved rapidly in humans. Nature 443:167-172, 2006.
Siepel A, Bejerano G, Pedersen JS, Hinrichs A, Hou M, Rosenbloom K, Clawson H, Spieth J, Hillier LW, Richards S, Weinstock GM, Wilson RK, Gibbs RA, Kent WJ, Miller W, and Haussler D. Evolutionarily conserved elements in vertebrate, insect, worm, and yeast genomes. Genome Res 15:1034-1050, 2005.
Siepel A and Haussler D. Phylogenetic estimation of context-dependent substitution rates by maximum likelihood. Mol Biol Evol 21:468-488, 2004.
Siepel A and Haussler D. Combining phylogenetic and hidden Markov models in biosequence analysis. J Comput Biol 11:413-428, 2004.
Siepel A, Farmer A, Tolopko A, Zhuang M, Mendes P, Beavis W, and Sobral B. ISYS: A decentralized, component-based approach to the integration of heterogeneous bioinformatics resources. Bioinformatics 17:83-94, 2001.
Siepel AC, Halpern AL, Macken C, and Korber B. A computer program designed to screen rapidly for HIV Type 1 intersubtype recombinant sequences. AIDS Res Hum Retroviruses 11:1413-1416, 1995.