Human Genome Project Information


Functional and Comparative Genomics Fact Sheet


What is functional genomics?

Understanding the function of genes and other parts of the genome is known as functional genomics. The Human Genome Project was just the first step in understanding humans at the molecular level. Though the project is complete, many questions remain unanswered, including the function of most of the estimated 25,000 human genes. Researchers also don't know the role of single nucleotide polymorphisms (SNPs), which are single DNA base changes within the genome, or the role of noncoding regions and repeats in the genome.


What is comparative genomics? How does it relate to functional genomics?

Comparative genomics is the analysis and comparison of genomes from different species. The purpose is to gain a better understanding of how species have evolved and to determine the function of genes and noncoding regions of the genome. Researchers have learned a great deal about the function of human genes by examining their counterparts in simpler model organisms such as the mouse. Genome researchers look at many different features when comparing genomes: sequence similarity, gene location, the length and number of coding regions (called exons) within genes, the amount of noncoding DNA in each genome, and highly conserved regions maintained in organisms as simple as bacteria and as complex as humans.

Comparative genomics involves the use of computer programs that can line up multiple genomes and look for regions of similarity among them. Some of these sequence-similarity tools are accessible to the public over the Internet. One of the most widely used is BLAST, which is available from the National Center for Biotechnology Information. BLAST is a set of programs designed to perform similarity searches on all available sequence data. For instructions on how to use BLAST, see the tutorial Sequence similarity searching using NCBI BLAST available through Gene Gateway, an online guide for learning about genes, proteins, and genetic disorders.
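To make the idea of "looking for regions of similarity" concrete, here is a minimal sketch of local alignment scoring (the Smith-Waterman dynamic-programming method). It is not BLAST's actual algorithm; BLAST uses faster heuristics to approximate this kind of search over very large databases. The function name, the scoring values (match, mismatch, gap), and the example sequences are assumptions chosen for illustration only.

```python
def local_alignment_score(a: str, b: str, match: int = 2,
                          mismatch: int = -1, gap: int = -2) -> int:
    """Return the best local alignment score between sequences a and b.

    Illustrative scoring only: +2 per matching base, -1 per mismatch,
    -2 per gap. A higher score means a longer or cleaner shared region.
    """
    prev = [0] * (len(b) + 1)  # scores for the previous row of the grid
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # A local alignment may restart anywhere, so scores never drop below 0.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


# Two made-up sequences that share the region "ACGT".
print(local_alignment_score("TTACGTTT", "GGACGTGG"))
```

With these scoring values, the shared four-base region contributes a score of 8, while two sequences with no bases in common score 0.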


Why is model organism research important? Why do we care what diseases mice get?

Functional genomics research is conducted using model organisms such as mice. Model organisms offer a cost-effective way to follow the inheritance of genes that are very similar to human genes through many generations in a relatively short time. Some model organisms studied in the HGP were the bacterium Escherichia coli, yeast Saccharomyces cerevisiae, roundworm Caenorhabditis elegans, fruit fly Drosophila melanogaster, and laboratory mouse. Additionally, HGP spinoffs have led to genetic analysis of other environmentally and industrially important organisms in the United States and abroad. For more information see HGN 11(1-2) "Public, Private Sectors Join in Mouse Consortium," HGN 8(1) "Third Branch of Life Confirmed," and HGN 7(3-4) "Microbial Genomes Sequenced."


How closely related are mice and humans? How many genes are the same?

Answer provided by Lisa Stubbs of Lawrence Livermore National Laboratory, Livermore, California.

Mice and humans (indeed, most or all mammals including dogs, cats, rabbits, monkeys, and apes) have roughly the same number of nucleotides in their genomes -- about 3 billion base pairs. This comparable DNA content implies that all mammals contain more or less the same number of genes, and indeed our work and the work of many others have provided evidence to confirm that notion.

I know of only a few cases in which no mouse counterpart can be found for a particular human gene, and for the most part we see essentially a one-to-one correspondence between genes in the two species. The exceptions generally appear to be of a particular type: genes that arise when an existing sequence is duplicated.

Gene duplication occurs frequently in complex genomes; sometimes the duplicated copies degenerate to the point where they no longer are capable of encoding a protein. However, many duplicated genes remain active and over time may change enough to perform a new function. Since gene duplication is an ongoing process, mice may have active duplicates that humans do not possess, and vice versa. These appear to make up a small percentage of the total genes. I believe the number of human genes without a clear mouse counterpart, and vice versa, won't be significantly larger than 1% of the total. Nevertheless, these novel genes may play an important role in determining species-specific traits and functions.

However, the most significant differences between mice and humans are not in the number of genes each carries but in the structure of genes and the activities of their protein products. Gene for gene, we are very similar to mice. What really matters is that subtle changes accumulated in each of the approximately 25,000 genes add together to make quite different organisms. Further, genes and proteins interact in complex ways that multiply the functions of each. In addition, a gene can produce more than one protein product through alternative splicing or post-translational modification; these events do not always occur in an identical way in the two species. A gene can produce more or less protein in different cells at various times in response to developmental or environmental cues, and many proteins can express disparate functions in various biological contexts. Thus, subtle distinctions are multiplied across all of the approximately 25,000 estimated genes.

The often-quoted statement that we share over 98% of our genes with apes (chimpanzees, gorillas, and orangutans) actually should be put another way. That is, there is 95% to 98% similarity between related genes in humans and apes in general. (Just as in the mouse, quite a few genes probably are not common to humans and apes, and these may influence uniquely human or ape traits.) Similarities between mouse and human genes range from about 70% to 90%, with an average of 85% similarity but a lot of variation from gene to gene (e.g., some mouse and human gene products are almost identical, while others are nearly unrecognizable as close relatives). Some nucleotide changes are "neutral" and do not yield a significantly altered protein. Others, but probably only a relatively small percentage, would introduce changes that could substantially alter what the protein does.
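A similarity figure such as "85%" can be read as the share of aligned positions at which two related sequences carry the same base. The sketch below shows that arithmetic under simple assumptions: the two sequences are already aligned to equal length, gap columns (marked "-") are skipped, and the 20-base sequences are invented for illustration rather than taken from real mouse or human genes. The function name is hypothetical.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percentage of compared positions where two aligned sequences match."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for x, y in zip(seq_a.upper(), seq_b.upper()):
        if x == "-" or y == "-":
            continue  # skip alignment gaps rather than count them as mismatches
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared


# Invented example: 20 aligned bases that differ at 3 positions.
human_like = "ATGGCCAAGTTCGACCTGAA"
mouse_like = "ATAGCCAAGCTCGACTTGAA"
print(percent_identity(human_like, mouse_like))  # 17 of 20 match, so 85.0
```

Published similarity figures depend on how the alignment was made and whether gaps are counted, so this is only one reasonable way to define the number.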

Put these alterations in the context of known inherited human diseases: a single nucleotide change can lead to inheritance of sickle cell disease, cystic fibrosis, or breast cancer. A single nucleotide difference can alter protein function in such a way that it causes a terrible tissue malfunction. Single nucleotide changes have been linked to hereditary differences in height, brain development, facial structure, pigmentation, and many other striking morphological differences; due to single nucleotide changes, hands can develop structures that look like toes instead of fingers, and a mouse's tail can disappear completely. Single-nucleotide changes in the same genes but in different positions in the coding sequence might do nothing harmful at all. Evolutionary changes are of the same kind as the sequence differences linked to person-to-person variation: many of the average 15% nucleotide changes that distinguish human and mouse genes are neutral; some lead to subtle changes, whereas others are associated with dramatic differences. Add them all together, and they can make quite an impact, as evidenced by the huge range of metabolic, morphological, and behavioral differences we see among organisms.
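The contrast between a harmful and a harmless single-nucleotide change can be illustrated with the well-known sickle cell substitution, in which the codon GAG (glutamic acid) in the beta-globin gene becomes GTG (valine). The sketch below uses a deliberately tiny codon table containing only the three codons needed; the function name and the labels it prints are assumptions for illustration, and "synonymous" is only one of several ways a change can be neutral.

```python
# Only the three codons needed for this example, not the full genetic code.
CODON_TABLE = {"GAG": "Glu", "GAA": "Glu", "GTG": "Val"}


def classify_substitution(codon: str, position: int, new_base: str) -> str:
    """Describe the effect of replacing one base (0-indexed) in a codon."""
    mutated = codon[:position] + new_base + codon[position + 1:]
    before = CODON_TABLE[codon]
    after = CODON_TABLE[mutated]
    kind = "neutral (synonymous)" if before == after else "amino acid change"
    return f"{codon} ({before}) -> {mutated} ({after}): {kind}"


# Middle base A -> T: the sickle cell substitution, glutamic acid becomes valine.
print(classify_substitution("GAG", 1, "T"))
# Last base G -> A in the same codon: still glutamic acid, protein unchanged.
print(classify_substitution("GAG", 2, "A"))
```

Both changes alter exactly one nucleotide of the same codon, yet only the first alters the protein. Not every amino acid change is harmful, so the second label should be read as "the protein differs", not "disease".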


What are knockout mice? How will they help us determine human gene function?

Knockout mice are genetically engineered mice in which researchers have altered a specific gene by inserting foreign genetic material into the animal's DNA. Using this technology, researchers target specific genes, most often inactivating ("knocking out") them, although related techniques can cause a gene to be expressed instead. These mice are then bred, creating a population of offspring with the trait.

When researchers isolate human genes with unknown functions, they can create knockout mice with these genes and observe the results. Instead of creating merely the mouse equivalent of the human gene, researchers are able to reproduce and express actual human genes and their corresponding proteins in mice. Subsequent offspring will inherit not only the instructions coded by their original mouse genome, but also the traits coded for by the inserted human DNA. This helps researchers understand health and disease by observing how genes work in cells.

Knockout mice have many benefits. They not only allow researchers to determine gene function and understand diseases at the molecular level, but they also aid scientists in testing new drugs and devising novel therapies.


Why are mice used in this research?

Mice are genetically very similar to humans. They also reproduce rapidly, have short life spans, are inexpensive and easy to handle, and can be genetically manipulated at the molecular level.


What genomes have been sequenced completely?

In addition to the human genome, the genomes of about 800 organisms have been sequenced in recent years. These include the mouse Mus musculus, the fruit fly Drosophila melanogaster, the worm Caenorhabditis elegans, the bacterium Escherichia coli, the yeast Saccharomyces cerevisiae, the plant Arabidopsis thaliana, and many microbes.

Other resources for information on sequenced genomes:

  • DOE Joint Genome Institute -- Human, plant, animal, and microbial sequencing.
  • GOLD -- Genomes Online Database provides comprehensive access to information regarding complete and ongoing genome projects around the world.
  • Comprehensive Microbial Resource -- A tool that allows the researcher to access all of the bacterial genome sequences completed to date.
  • Entrez Genome -- A resource from the National Center for Biotechnology Information (NCBI) for accessing information about completed and in-progress genomes.

What are the comparative genome sizes of humans and other organisms being studied?

Organism                              Estimated size (base pairs)   Estimated gene number   Average gene density       Chromosome number
Homo sapiens (human)                  3.2 billion                   ~25,000                 1 gene per 100,000 bases   46
Mus musculus (mouse)                  2.6 billion                   ~25,000                 1 gene per 100,000 bases   40
Drosophila melanogaster (fruit fly)   137 million                   13,000                  1 gene per 9,000 bases     8
Arabidopsis thaliana (plant)          100 million                   25,000                  1 gene per 4,000 bases     10
Caenorhabditis elegans (roundworm)    97 million                    19,000                  1 gene per 5,000 bases     12
Saccharomyces cerevisiae (yeast)      12.1 million                  6,000                   1 gene per 2,000 bases     32
Escherichia coli (bacteria)           4.6 million                   3,200                   1 gene per 1,400 bases     1
H. influenzae (bacteria)              1.8 million                   1,700                   1 gene per 1,000 bases     1
*Information extracted from genome publication papers below.

Genome size does not correlate with evolutionary status, nor is the number of genes proportionate with genome size.
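The gene density column can be checked with simple division: genome size divided by estimated gene number. The sketch below does this using the sizes and gene counts listed in the table above; the variable and function names are illustrative. Because it does not round, its results differ somewhat from the table's rounded figures (for example, the human genome works out to about 128,000 bases per gene, which the table gives as 100,000).

```python
# (genome size in base pairs, estimated gene count), copied from the table above
GENOMES = {
    "Homo sapiens": (3_200_000_000, 25_000),
    "Mus musculus": (2_600_000_000, 25_000),
    "Drosophila melanogaster": (137_000_000, 13_000),
    "Arabidopsis thaliana": (100_000_000, 25_000),
    "Caenorhabditis elegans": (97_000_000, 19_000),
    "Saccharomyces cerevisiae": (12_100_000, 6_000),
    "Escherichia coli": (4_600_000, 3_200),
    "Haemophilus influenzae": (1_800_000, 1_700),
}


def bases_per_gene(size_bp: int, gene_count: int) -> float:
    """Average number of bases per gene (a lower value means denser genes)."""
    return size_bp / gene_count


density = {name: bases_per_gene(size, genes)
           for name, (size, genes) in GENOMES.items()}

for name, value in density.items():
    print(f"{name:26s} 1 gene per {value:>9,.0f} bases")
```

The calculation makes the point of the preceding sentence visible: Arabidopsis and humans have about the same estimated number of genes, yet the human genome is 32 times larger, so gene count clearly does not scale with genome size.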

Genome Publication Papers

Human
International Human Genome Sequencing Consortium. Initial sequencing and analysis of the human genome. Nature 409: 860-921. (15 February 2001)

Rat
Rat Genome Sequencing Project Consortium. Genome Sequence of the Brown Norway Rat Yields Insights into Mammalian Evolution. Nature 428: 493-521. (1 April 2004)

Mouse
Mouse Genome Sequencing Consortium. Initial sequencing and comparative analysis of the mouse genome. Nature 420: 520-562. (5 December 2002)

Fruit Fly
M. D. Adams, et al. The genome sequence of Drosophila melanogaster. Science 287: 2185-2195. (24 March 2000)

Arabidopsis - First Plant Sequenced
The Arabidopsis Genome Initiative. Analysis of the genome sequence of the flowering plant Arabidopsis thaliana. Nature 408: 796-815. (14 December 2000)

Roundworm - First Multicellular Eukaryote Sequenced
The C. elegans Sequencing Consortium. Genome sequence of the nematode C. elegans: A platform for investigating biology. Science 282: 2012-2018. (11 December 1998)

Yeast
A. Goffeau, et al. Life with 6000 genes. Science 274: 546, 563-567. (25 October 1996)

Bacteria - E. coli
F. R. Blattner, et al. The complete genome sequence of Escherichia coli K-12. Science 277: 1453-1474. (5 September 1997)

Bacteria - H. influenzae - First Free-living Organism to be Sequenced
R. D. Fleischmann, et al. Whole-genome random sequencing and assembly of Haemophilus influenzae Rd. Science 269: 496-512. (28 July 1995)


NCBI Entrez Genomes

Browse the genomes of model organisms with MapViewer and other genome resources from the National Center for Biotechnology Information (NCBI). For a step-by-step tutorial on how to use NCBI MapViewer to view the human genome, see "Finding a gene on a chromosome map."


More Information

Other Resources

  • Genomics: The Human Genome and Beyond - From the DOE Joint Genome Institute.
  • Interactive Mouse Genetics from Explore Learning - Learn about mouse genetics and the statistics behind the inheritance of red eyes and black fur. Requires free Shockwave plug-in.
  • Graphic - Using Mice to Understand Human Gene Function. From the HGP Image Gallery.
  • Graphic - Mouse-human homology.
  • Graphic - Researcher and mouse.
  • Discovering Genomics, Proteomics, and Bioinformatics by A.M. Campbell and L.J. Heyer. Cold Spring Harbor Laboratory Press (2003), 252 pp.



Last modified: Friday, September 19, 2008


Base URL: www.ornl.gov/hgmis

Office of Science Site sponsored by the U.S. Department of Energy Office of Science, Office of Biological and Environmental Research, Human Genome Program