Scientists reveal how biological activity is regulated in fruit fly and roundworm genomes
The genetic code is complex. Cells convert four different chemical units that are strung along the DNA fiber into functional molecules. These molecules interact with enzymes, transcription factors, myriad RNAs, and other molecules to regulate the activity of the genome.
The genomes of the fruit fly, Drosophila melanogaster, and roundworm, Caenorhabditis elegans, were initially sequenced as part of the Human Genome Project. Both species are powerful models for understanding human biology and even disease, as many functional elements have been conserved across the vast evolutionary distances separating these species from humans.
The fruit fly and roundworm manuscripts appear in the Dec. 24 issue of Science and are authored by members of the model organism ENCyclopedia Of DNA Elements Consortium (modENCODE), which is funded by the National Human Genome Research Institute (NHGRI), part of the National Institutes of Health. In addition, more than a dozen companion modENCODE papers are published online in the journals Nature, Genome Research and Genome Biology.
“These findings will enable scientists everywhere to carry out experiments in flies and worms to better understand the relationship between molecular and biological activities in these animals,” said NHGRI Director Eric D. Green, M.D., Ph.D. “What we learn from these model organisms will contribute significantly to our understanding of health and disease in humans.”
The fruit fly and roundworm modENCODE projects were launched in 2007 to complement the work being done by the ENCyclopedia Of DNA Elements Consortium (ENCODE), which is building a comprehensive catalog of elements in the human genome crucial to biological function. In 2007, ENCODE completed a pilot project that developed innovative methods and technologies to find functional elements in about 1 percent of the human genome.
modENCODE takes advantage of many of the same tools, but applies them to the smaller, and therefore more manageable, genomes of the fruit fly and the roundworm. Unlike researchers studying the human genome, modENCODE researchers can conduct genetic experiments on flies or worms to validate the biological relevance of the functional elements they have identified. By combining and comparing data from worms, flies and humans, researchers can learn far more about the functional elements than if they analyzed the genome of just one species.
To analyze the function of the fly and worm genomes, researchers studied many different cell types and stages of development and produced preliminary catalogs of functional elements. In addition to genes that code for proteins, these functional parts include non-protein-coding genes; regulatory elements involved in the control of gene transcription; and DNA sequences that mediate the structure and dynamics of chromosomes.
In today’s findings, the fruit fly and roundworm researchers report the discovery of hundreds of new protein-coding genes. For instance, the roundworm was previously estimated to have 20,000 genes in its genome, and there is now direct evidence for a total of more than 21,000 genes. Moreover, they identified thousands of novel gene transcripts (instructions from genes that produce proteins) along with thousands of new non-protein-coding RNAs (ncRNAs), which regulate gene expression, that is, whether a gene is turned on or off.
“We now know when these genes are used in the life cycle and increasingly what cells the genes are used in,” said senior author of the roundworm paper Robert H. Waterston, M.D., Ph.D., chair of the Department of Genome Sciences, University of Washington in Seattle. “Putting the pieces together has begun to reveal how genes may work in concert to produce the marvelous biology of the roundworm and fruit fly.”
“Identification of thousands of new transcripts has significantly increased our knowledge of the repertoire of proteins used in flies,” said Susan Celniker, Ph.D., head of the Department of Genome Dynamics, Lawrence Berkeley National Laboratory and lead on the project to identify fruit fly RNAs. “Our work provides new resources for studying development, sex determination and aging.”
The researchers also examined the organization and structure of chromatin in the cells throughout the life stages of each organism. Chromatin is the protein superstructure that packages DNA and modulates which sections of the genome are accessible to regulatory molecules that convert the genetic code into cellular action. Both groups discovered specific chromatin signatures associated with the regulation of protein-coding genes in the fruit fly and roundworm genomes. Unique chromatin signatures were associated with distinct regions of the genome that turn genes either on or off.
“Chromatin signatures are emerging as a powerful lens into the structure and function of the regulatory portion of the genome that controls cell activity,” said senior author of the fly paper Manolis Kellis, Ph.D., associate professor of computer science at MIT.
Next, to identify sites responsible for controlling when genes are turned on during the development of an organism, and in which tissues genes are used, the researchers searched across the genomes of worm and fly during key stages of development. Primarily, they looked for transcription factors, which are regulatory proteins often found in specific tissues that control the expression of different genes.
They found that in both organisms many different regulatory proteins bind to the same, overlapping regions of the genome. The researchers identified approximately 300 such sites in the roundworm genome and 2,000 sites in the fruit fly genome, which they considered high-occupancy targets (HOT) for the transcription factors. The researchers from both groups speculate that HOT regions play a role in opening the chromatin so that other transcription factors can easily bind to the DNA inside.
The ability to combine the functional data in the fruit fly and roundworm genomes allowed the researchers to construct predictive models of functional networks that connected regulatory elements with gene expression changes across specific life stages. The researchers were able to use the combined data to predict the function and expression of thousands of genes in each genome.
“The results of modENCODE allow scientists to begin reading the genome sequences, moving from a list of letters to delineating words and punctuation marks,” said Elise Feingold, Ph.D., an NHGRI program director in charge of the ENCODE and modENCODE projects.
“To make this effort a success required a high level of coordination and teamwork amongst the groups that enabled the synthesis and high-level view of the data presented in these publications,” added NHGRI program director Peter Good, Ph.D., who is also involved in the management of the ENCODE and modENCODE projects.
Over the next year, modENCODE researchers will work to identify additional functional components to expand on the fruit fly and roundworm functional catalogs. Moreover, by combining and comparing data from worms, flies and humans, scientists hope to learn far more about the functional elements and regulatory principles in each genome, and gain new insights into human health and disease. The data for the project are released rapidly after being produced and are available from the modENCODE Data Coordination Center and the modMINE data warehouse.
NHGRI is one of 27 institutes and centers at the NIH, an agency of the Department of Health and Human Services. The NHGRI Division of Extramural Research supports grants for research and for training and career development at sites nationwide.
The National Institutes of Health — "The Nation's Medical Research Agency" — includes 27 institutes and centers, and is a component of the U.S. Department of Health and Human Services. It is the primary federal agency for conducting and supporting basic, clinical and translational medical research, and it investigates the causes, treatments, and cures for both common and rare diseases.