Systems biology of regulatory motifs and networks – towards understanding gene expression from the DNA sequence

The regulation of gene expression in response to developmental or environmental stimuli is central to all organisms. Transcription is regulated by trans-acting transcription factors that recognize cis-regulatory DNA elements (cis-regulatory modules, CRMs, or enhancers) and function in a combinatorial fashion. We use both bioinformatics and molecular biology methods to gain a systematic understanding of enhancer structure and function. Our goal is to “crack” the regulatory code, to predict enhancer activity from the DNA sequence, and to understand how transcriptional networks define cellular and developmental programs.

The regulatory code, gene regulatory motifs and regulatory networks

The regulation of gene expression is central to the development of all organisms. In higher eukaryotes, genes are expressed dynamically in complex spatial patterns, and mis-expression often results in developmental defects and diseases such as cancer. Tissue-specific gene expression is determined by regulatory programs and in turn defines the different animal cell types and their characteristics. Consistent with this central role, core regulatory circuits, or kernels, have been found to be conserved between animals as divergent as flies and mammals.

Figure 1

A major challenge in molecular biology is to define these circuits and to decipher how the cell utilizes the regulatory information present in the DNA. This is currently hampered by the lack of a regulatory code that, analogous to the genetic code for protein-coding sequences, would allow us to predict spatio-temporal enhancer activity from the DNA sequence. Our group uses both bioinformatics and molecular biology methods to study enhancer structures, to characterize tissue- and cell-type-specific expression, and to predict and validate regulatory targets of transcription factors. We focus on the different cell types and organs in Drosophila and aim to explain their expression programs using the regulatory connections of transcription factors.

We use ChIP-Seq experiments to identify tissue-specific targets of transcription factors and the sequence determinants (e.g. other motifs, factors, and their combinations) that mediate this tissue specificity.

We are currently determining the spatio-temporal activity patterns of a large collection of putative enhancers in embryos and systematically testing candidate promoters and enhancers in specific cell types. Sequence analyses of enhancers with similar activities will allow us to determine the sequence features underlying enhancer function, which we plan to integrate with qualitative and quantitative information about transcription factor expression.

Experimental and computational comparative genomics

Figure 2

Functional elements in a genome are typically under evolutionary selection to maintain their functions in related organisms. In collaboration with the Zeitlinger group (Stowers Institute), we study in vivo transcription factor binding sites in six Drosophila species at various evolutionary distances from Drosophila melanogaster. We find that transcription factor binding is highly conserved, even in species as distant from D. melanogaster as platypus or chicken are from human. Conservation of binding is strongly correlated with conservation of the corresponding transcription factor sequence motifs. We anticipate that these comparative data will allow a detailed dissection of enhancer structure and grammar.

Figure 3

We have developed computational methods to score motif conservation in 12 Drosophila genomes. Using these methods, we discovered novel motif types, identified functional targets of many transcription factors and microRNAs with high confidence, and found that motif conservation can help to interpret and refine experimental ChIP data. Comparative genomics and related bioinformatics approaches will allow us to integrate our data and knowledge to predict developmental enhancers, regulatory targets of transcription factors, and the expression patterns of genes. They also allow us to integrate microRNA-mediated regulation into regulatory networks and to understand the role of microRNAs in tissue-specific expression programs.
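To make the idea of scoring motif conservation concrete, the following is a minimal sketch and not the published scoring method. It assumes an ungapped multiple alignment given as equal-length strings, a motif written as an IUPAC consensus, and it simply reports, for every motif match in the reference species, the fraction of other species whose aligned bases also match. The function names, the species labels, the toy sequences, and the motif `GGAAWT` are all hypothetical and chosen for illustration only; a real analysis would also account for the phylogenetic tree and for alignment gaps.

```python
import re

# IUPAC nucleotide codes mapped to regular-expression character classes
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}


def motif_to_regex(motif):
    """Compile an IUPAC consensus string into a regular expression."""
    return re.compile("".join(IUPAC[base] for base in motif.upper()))


def conservation_scores(alignment, reference, motif):
    """Return (start, fraction) for each motif match in the reference row.

    `fraction` is the share of non-reference species whose aligned window
    also matches the motif. Windows containing gap characters never match.
    """
    pattern = motif_to_regex(motif)
    width = len(motif)
    ref_seq = alignment[reference].upper()
    others = [name for name in alignment if name != reference]
    scores = []
    for start in range(len(ref_seq) - width + 1):
        if not pattern.fullmatch(ref_seq[start:start + width]):
            continue
        conserved = sum(
            1
            for name in others
            if pattern.fullmatch(alignment[name][start:start + width].upper())
        )
        scores.append((start, conserved / len(others)))
    return scores


# Hypothetical toy alignment (made-up sequences, not real genomic data)
toy_alignment = {
    "dmel": "ACGGGAAATCCTT",
    "dsim": "ACGGGAAATCCTT",
    "dyak": "ACGGGAATTCCTT",
    "dpse": "ACTGCAAATGCTT",
}

scores = conservation_scores(toy_alignment, reference="dmel", motif="GGAAWT")
for start, fraction in scores:
    print(f"match at alignment position {start}: conserved in {fraction:.2f} of other species")
```

Motif instances with a high conserved fraction would be the more confident candidates in such a scheme, while matches found only in the reference species are more likely to be chance occurrences.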

Regulation of gene expression and genome stability by novel classes of small RNAs

High-throughput sequencing technology has revealed a myriad of novel small RNAs from different functional classes. These are involved, for example, in regulating gene expression through the microRNA and siRNA pathways and in controlling mobile genetic elements through related silencing pathways involving the PIWI clade of Argonaute proteins. We collaborate extensively with experimental labs on the analysis of small RNAs and their functional characterization.

Selected Publications

Fly comparative genomics

  • Stark, A., Lin, M.F., Kheradpour, P., Pedersen, J.S., Parts, L., Carlson, J.W., Crosby, M.A., Rasmussen, M.D., Roy, S., Deoras, A.N., et al. (2007). Discovery of functional elements in 12 Drosophila genomes using evolutionary signatures. Nature 450, 219-232.

Regulation of transcription

  • Kheradpour, P., Stark, A., Roy, S., and Kellis, M. (2007). Reliable prediction of regulator targets using 12 Drosophila genomes. Genome Res 17, 1919-1931.
  • Zeitlinger, J., Zinzen, R.P., Stark, A., Kellis, M., Zhang, H., Young, R.A., and Levine, M. (2007). Whole-genome ChIP-chip analysis of Dorsal, Twist, and Snail suggests integration of diverse patterning processes in the Drosophila embryo. Genes Dev 21, 385-390.

microRNA gene finding and target prediction

  • Stark, A., Bushati, N., Jan, C.H., Kheradpour, P., Hodges, E., Brennecke, J., Bartel, D.P., Cohen, S.M., and Kellis, M. (2008). A single Hox locus in Drosophila produces functional microRNAs from opposite DNA strands. Genes Dev 22, 8-13.
  • Brennecke, J., Stark, A., Russell, R.B., and Cohen, S.M. (2005). Principles of MicroRNA-Target Recognition. PLoS Biol 3, e85.
  • Stark, A., Brennecke, J., Bushati, N., Russell, R.B., and Cohen, S.M. (2005). Animal MicroRNAs Confer Robustness to Gene Expression and Have a Significant Impact on 3'UTR Evolution. Cell 123, 1133-1146.