
Image from ScienceNOW, July 14, 2005
Carlos Bustamante
Jason Mezey
Andy Clark
Chip Aquadro
Lee Kraus
Steve Tanksley
Rick Durrett
David Haussler
Webb Miller
Mathieu Blanchette
Gill Bejerano
Katie Pollard
Michael Brent
Rasmus Nielsen
Assistant Professor, Biological Statistics & Computational Biology
Address: 101 Biotechnology Building,
Cornell University, Ithaca, NY 14853
Phone: 607-254-1157
Email: acs4 at cornell dot edu (Please read before emailing)
Ph.D., Computer Science, UC Santa Cruz, 2005
M.S., Computer Science, University of New Mexico, 2001
B.S., Ag & Bio Engineering, Cornell, 1994
Short bio / CV
My research interests lie in the area where statistics, computer science, evolutionary biology, and genomics meet. Currently, my main focus is developing computational methods for the identification of functional elements in eukaryotic (primarily mammalian) genomes, based on comparative sequence data. A major theme in my work is to model and analyze the evolution and the function of genomic sequences simultaneously, so that evolution sheds light on function, and function sheds light on evolution. I like to tackle problems of practical importance in genomics, such as gene finding and conserved element identification, using methods from machine learning and computational statistics. As much as possible, I try to stay grounded in biology by working with experimentalists to test predicted functional elements in the lab.
Details about selected research projects.
My research is supported by a Packard Fellowship, a Microsoft Research New Faculty Fellowship, a National Science Foundation CAREER Award, and grants from the National Institutes of Health.
In the Fall, I alternate between teaching "BTRY 484/684: Computational Genomics" and "BTRY 479/679: Probabilistic Graphical Models" (cross-listed CS 4782/6782). In Fall 2008, I will teach BTRY 479/679. The official description for this course is:
A thorough introduction to graphical models, a flexible and powerful framework for machine learning and probabilistic modeling that combines graph theory and probability theory. Covers both directed models (Bayesian networks) and undirected models, inference and parameter learning, and exact and approximate algorithms. Special cases such as hidden Markov models, tree-like Bayesian nets, and conditional random fields are discussed in detail.
The description for BTRY 484/684 is:
A rigorous treatment of important computational principles and methods for the analysis of genomic data, emphasizing comparative and evolutionary genomics. Topics include sequence alignment, gene and motif finding, phylogeny reconstruction, and inference of gene regulatory networks. Covers both maximum likelihood and Bayesian principles, and both exact and approximate algorithms for inference. Draws heavily on general concepts from probabilistic graphical models.
The course web page for the Fall 2007 version of BTRY 484/684 can be found at http://compgen.bscb.cornell.edu/btry484.
I also teach a journal-club-style graduate seminar in computational genomics in the Spring: "BTRY 720: Topics in Computational Genomics" (2008 web page).
Kosiol C, Vinar T, da Fonseca RR, Hubisz MJ, Bustamante CD, Nielsen R, Siepel A. Patterns of positive selection in six mammalian genomes. To appear in PLoS Genetics.
Siepel A, Diekhans M, Brejova B, Langton L, Stevens M, Comstock CLG, Davis C, Ewing B, Oommen S, Lau C, Yu H-C, Li J, Roe BA, Green P, Gerhard DS, Temple G, Haussler D, and Brent MR. Targeted discovery of novel human exons by comparative genomics. Genome Res 17:1763-1773, 2007.
Rhesus Macaque Genome Sequencing and Analysis Consortium. Evolutionary and biomedical insights from the rhesus macaque genome. Science 316:222-234, 2007.
Pollard KS, Salama SR, Lambert N, Lambot M-A, Coppens S, Pedersen JS, Katzman S, King B, Onodera C, Siepel A, Kern AD, Dehay C, Igel H, Ares M, Vanderhaeghen P, and Haussler D. An RNA gene expressed during cortical development evolved rapidly in humans. Nature 443:167-172, 2006.
Siepel A, Bejerano G, Pedersen JS, Hinrichs A, Hou M, Rosenbloom K, Clawson H, Spieth J, Hillier LW, Richards S, Weinstock GM, Wilson RK, Gibbs RA, Kent WJ, Miller W, and Haussler D. Evolutionarily conserved elements in vertebrate, insect, worm, and yeast genomes. Genome Res 15:1034-1050, 2005.
Siepel A and Haussler D. Phylogenetic estimation of context-dependent substitution rates by maximum likelihood. Mol Biol Evol 21:468-488, 2004.
Siepel A and Haussler D. Combining phylogenetic and hidden Markov models in biosequence analysis. J Comput Biol 11:413-428, 2004.
Siepel A, Farmer A, Tolopko A, Zhuang M, Mendes P, Beavis W, and Sobral B. ISYS: A decentralized, component-based approach to the integration of heterogeneous bioinformatics resources. Bioinformatics 17:83-94, 2001.
Siepel AC, Halpern AL, Macken C, and Korber B. A computer program designed to screen rapidly for HIV Type 1 intersubtype recombinant sequences. AIDS Res Hum Retroviruses 11:1413-1416, 1995.