
Andrea Pauli

Recent genome-wide analyses have spurred the notion of ‘pervasive translation’ outside of known protein-coding genes. Some of these translated regions have been predicted to encode short, conserved proteins, while others lack signatures of protein conservation and might have regulatory roles. We aim to identify functions for these newly discovered short translated open reading frames (ORFs) during embryogenesis by employing genetic, molecular, cellular and genomics approaches in zebrafish embryos.

Functions of short translated open reading frames (ORFs) in the context of development

Figure 1: Identification and annotation of novel, embryonically expressed coding and non-coding transcripts. RNA-Seq-based transcript assemblies were evaluated for their coding potential by PhyloCSF (Lin et al., 2011; Pauli & Valen et al., 2012), BLAST and Ribosome Profiling (Chew et al., 2013; Pauli et al., 2014). About 400 novel coding genes in zebrafish, including Toddler, were identified (Pauli et al., 2014).

Large-scale forward-genetic screens towards the end of the last century identified the majority of known embryonic signalling pathways and embryonically essential genes. This relatively small set of genes with known essential embryonic functions is in stark contrast to the large number of transcripts that are expressed during embryogenesis (Pauli & Valen et al., 2012). Moreover, hundreds of these transcripts were recently predicted to encode small proteins (Chew et al., 2013; Pauli et al., 2014; Bazzini et al., 2014). Apart from novel protein-coding genes, these studies have also identified a large number of translated regions outside of protein-coding ORFs, both in the 5’ leaders of coding genes (upstream ORFs, uORFs) and on transcripts previously thought to be non-coding.

Figure 2: Toddler signalling promotes gastrulation movements. A) Toddler is conserved in vertebrates. ClustalW2 multiple protein sequence alignment of the Toddler peptide sequences from five vertebrates. B) toddler mutant embryos have defects in endoderm and mesoderm migration, lack a functional heart, and show posterior accumulation of blood. Shown are in situ hybridizations (endoderm (sox17), mesoderm (fn1), heart (cmlc2)) or live embryos (morphology) of wild type (top) and toddler mutant embryos at the indicated times of development (hpf, hours post fertilization). C) Illustration of mesendoderm migration in wild type (top) and toddler mutant (bottom) embryos during gastrulation. toddler mutant embryos show reduced animal pole-directed migration. Adapted from Pauli et al., 2014.

To determine if any of these uncharacterised, translated ORFs might have a function, we focused on the putative signalling protein, Toddler (Pauli et al., 2014). Toddler had previously been annotated as a non-coding RNA, but it encodes a short, conserved, and secreted peptide. Zebrafish embryos lacking Toddler peptide die during embryogenesis and lack a functional heart.

Both local and ubiquitous expression of Toddler promote mesendodermal cell movement during gastrulation, suggesting that Toddler is neither an attractant nor a repellent but acts globally as a motogen. Toddler drives internalization of G-protein-coupled APJ/Apelin receptors, and activation of APJ/Apelin receptor signalling rescues toddler mutants. These results indicate that Toddler is an activator of APJ/Apelin receptor signalling that promotes gastrulation movements, and that it might be the first in a series of uncharacterized developmental regulators. Moreover, the discovery of Toddler provides proof of principle that functional, short translated ORFs remain to be identified.

We will build on these initial findings and investigate the functions and mechanisms of short translated ORFs during embryogenesis. We will mainly use zebrafish embryos since they are an ideal model system for functional genomics in a developing organism. 

Figure 3: Identifying functions of short translated ORFs. Many regions outside of known protein-coding genes are translated. Protein-coding regions are shown in blue, non-coding/untranslated regions in black. Translated yet non-conserved ORFs are indicated in grey (e.g. upstream ORFs (uORFs)). We will focus on three main areas: 1) Identifying the mechanism of the essential embryonic signal Toddler. 2) Identifying functions of other newly discovered short proteins. 3) Identifying functions for regulatory translation (e.g. uORFs).

We will focus on three main areas:

  1. What are the molecular and cellular mechanisms by which Toddler/Apelin receptor signalling promotes gastrulation movements?
  2. What are the functions of other conserved yet uncharacterized short proteins during embryogenesis?
  3. How does regulatory translation, e.g. of upstream ORFs (uORFs), contribute to developmental transitions?

Selected Publications

  • Pauli, A., Valen, E., Schier, AF. (2015). Identifying (non-)coding RNAs and small peptides: challenges and opportunities. Bioessays. 37(1):103-12
  • Pauli, A., Norris, ML., Valen, E., Chew, GL., Gagnon, JA., Zimmerman, S., Mitchell, A., Ma, J., Dubrulle, J., Reyon, D., Tsai, SQ., Joung, JK., Saghatelian, A., Schier, AF. (2014). Toddler: an embryonic signal that promotes cell movement via Apelin receptors. Science. 343(6172):1248636
  • Chew, GL., Pauli, A., Rinn, JL., Regev, A., Schier, AF., Valen, E. (2013). Ribosome profiling reveals resemblance between long non-coding RNAs and 5' leaders of coding RNAs. Development. 140(13):2828-34
  • Pauli, A., Valen, E., Lin, MF., Garber, M., Vastenhouw, NL., Levin, JZ., Fan, L., Sandelin, A., Rinn, JL., Regev, A., Schier, AF. (2012). Systematic identification of long noncoding RNAs expressed during zebrafish embryogenesis. Genome Res. 22(3):577-91
  • Pauli, A., Rinn, JL., Schier, AF. (2011). Non-coding RNAs as regulators of embryogenesis. Nat Rev Genet. 12(2):136-49