Gene Ontology

Adapted from Wikipedia

Logo of the Gene Ontology project, a biological database used for organizing scientific information.

The Gene Ontology (GO) is a major bioinformatics initiative to unify the way gene and gene product attributes are described across all species. It helps scientists understand what genes do and how they work together in living things. The project maintains a controlled vocabulary, which is a fixed list of agreed terms for describing gene functions, so that genes from different organisms can be studied in a consistent way.

The Gene Ontology also aims to annotate genes and gene products, which means linking each gene to the terms that describe what it does. This helps researchers share and reuse data more effectively. The project provides tools as well, so scientists can interpret the results of their experiments using these terms.

Unlike gene nomenclature, which names genes, the Gene Ontology describes what those genes actually do. It uses a markup language to make this information machine-readable, so computers can process it easily. This helps scientists all over the world study genes in a unified way, no matter which type of living thing they are researching.

History

The Gene Ontology began in 1998 when researchers from three model organism databases worked together: FlyBase (fruit flies), Mouse Genome Informatics (mice), and the Saccharomyces Genome Database (yeast). They created a shared vocabulary to describe genes and what they do. Over time, many databases for other plants, animals, and microorganisms joined in. By July 2019, the Gene Ontology had over 44,000 terms and millions of annotations for genes in thousands of species. It has become an important tool in bioinformatics for understanding biological data.

The project has three main goals: building the vocabulary, assigning it to genes, and creating tools to help people use the data. Researchers continue to study and improve the Gene Ontology to make it even more useful.

Terms and ontology

An ontology is a way to describe and organize what we know about things and how they are related. The Gene Ontology project creates a set of terms to describe genes and what they do. These terms cover three main areas: where a gene product is found in a cell (cellular component), what it does at the molecular level, such as binding to or changing other substances (molecular function), and the larger processes, made up of many molecular activities, that keep cells and organisms working (biological process).

Each term in the Gene Ontology has a name, a unique identifier (such as GO:0000016), a clear definition, and links to related terms. This helps scientists talk about genes in a way everyone can understand, no matter what kind of living thing they study. The project regularly updates its terms with help from scientists around the world. You can explore all these terms online using tools like AmiGO.

Example term

Here’s an example of one term from the Gene Ontology:

  • ID: GO:0000016
  • Name: lactase activity
  • Type: molecular_function
  • Definition: "Catalysis of the reaction: lactose + H2O = D-glucose + D-galactose."
  • Synonyms:
    • "lactase-phlorizin hydrolase activity" (broader term)
    • "lactose galactohydrolase activity" (exact match)

This term shows how scientists describe exactly what a gene product does.
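The example term above can also be written as a small data structure. The following is a minimal sketch in Python, not the format the Gene Ontology project itself uses. The class names `GOTerm` and `Synonym`, the field names, and the scope labels "EXACT" and "BROAD" are assumptions chosen for illustration. The check on the identifier assumes that GO identifiers are written as "GO:" followed by seven digits, as in the example.

```python
import re
from dataclasses import dataclass, field

# Assumption: a GO identifier is "GO:" followed by exactly seven digits.
GO_ID_PATTERN = re.compile(r"GO:\d{7}")

# The three areas that GO terms are divided into.
NAMESPACES = {"molecular_function", "biological_process", "cellular_component"}


@dataclass(frozen=True)
class Synonym:
    text: str
    scope: str  # "EXACT" for an exact match, "BROAD" for a broader term


@dataclass
class GOTerm:
    go_id: str
    name: str
    namespace: str
    definition: str
    synonyms: list = field(default_factory=list)

    def __post_init__(self):
        # Reject malformed identifiers and unknown namespaces early.
        if not GO_ID_PATTERN.fullmatch(self.go_id):
            raise ValueError(f"not a GO identifier: {self.go_id!r}")
        if self.namespace not in NAMESPACES:
            raise ValueError(f"unknown namespace: {self.namespace!r}")

    def synonyms_with_scope(self, scope):
        """Return the synonym texts that have the given scope."""
        return [s.text for s in self.synonyms if s.scope == scope]


# The example term from the list above.
lactase = GOTerm(
    go_id="GO:0000016",
    name="lactase activity",
    namespace="molecular_function",
    definition="Catalysis of the reaction: lactose + H2O = D-glucose + D-galactose.",
    synonyms=[
        Synonym("lactase-phlorizin hydrolase activity", "BROAD"),
        Synonym("lactose galactohydrolase activity", "EXACT"),
    ],
)

print(lactase.synonyms_with_scope("EXACT"))
```

Keeping the scope next to each synonym makes it easy to use only exact matches when searching, and to treat broader synonyms with more care.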

Annotation

Genome annotation means collecting information about genes and their products. A Gene Ontology (GO) annotation links a gene product to a GO term. Each annotation also records details such as where the information came from (the reference), what kind of evidence supports it (the evidence code), and when and by whom it was added.

There are different kinds of evidence for deciding that a gene has a certain function. Sometimes a curator reads a research paper and bases the annotation on it. Other times, a computer program makes a prediction. Some predictions are reviewed by a curator, while fully automated ones are not checked by a person. The unchecked annotations are considered a bit less reliable, so they are usually made to broader, less detailed terms. You can find all these annotations on the GO website, and scientists use them to learn more about genes and how they work.

Example annotation

Here’s a simple example of a gene annotation:

  • Gene product: Actin, alpha cardiac muscle 1, UniProtKB:P68032
  • GO term: heart contraction; GO:0060047 (biological process)
  • Evidence code: Inferred from Mutant Phenotype (IMP)
  • Reference: PMID
  • Assigned by: UniProtKB, June 6, 2008
  • Data source
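The fields of the example annotation can be grouped into one record. This is a rough sketch in Python under stated assumptions: the class name `Annotation`, its field names, and the `summary` format are hypothetical and are not the file format used by the GO project. The set of "experimental" evidence codes is recalled from the GO evidence code documentation and is not given in this article, so treat it as an assumption. The reference is left as the bare "PMID" shown above, because no number is given.

```python
from dataclasses import dataclass
from datetime import date

# Assumption: evidence codes that GO groups as experimental evidence.
EXPERIMENTAL_CODES = {"EXP", "IDA", "IPI", "IMP", "IGI", "IEP"}


@dataclass(frozen=True)
class Annotation:
    gene_product: str
    gene_product_id: str
    go_id: str
    go_name: str
    aspect: str
    evidence_code: str
    reference: str
    assigned_by: str
    assigned_on: date

    def is_experimental(self):
        """True if the evidence code is in the experimental group."""
        return self.evidence_code in EXPERIMENTAL_CODES

    def summary(self):
        """One-line description: gene product, GO term, and evidence code."""
        return (
            f"{self.gene_product_id} -> {self.go_id} "
            f"({self.go_name}) [{self.evidence_code}]"
        )


# The example annotation from the list above.
example = Annotation(
    gene_product="Actin, alpha cardiac muscle 1",
    gene_product_id="UniProtKB:P68032",
    go_id="GO:0060047",
    go_name="heart contraction",
    aspect="biological_process",
    evidence_code="IMP",  # Inferred from Mutant Phenotype
    reference="PMID",  # no number is given in the example, so none is filled in
    assigned_by="UniProtKB",
    assigned_on=date(2008, 6, 6),
)

print(example.summary())
print(example.is_experimental())
```

Storing the evidence code as its own field lets a researcher filter annotations, for example to keep only those backed by experiments.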

Tools

There are many tools that use the data from the Gene Ontology project. Most of these tools are made by outside groups, but the Gene Ontology Consortium has created two tools: AmiGO and OBO-Edit.

AmiGO is a website where you can search, browse, and view diagrams of the Gene Ontology data. It also has a BLAST tool and ways to analyze large sets of data. You can use AmiGO on the internet or download it to use on your own computer. It is free, open source software.

OBO-Edit is another free tool, used for viewing and editing ontologies. It is written in Java, so it runs on different kinds of computers. OBO-Edit helps you search for and change parts of an ontology, and it can work out links between terms that are implied but not written down. Even though it was made for biology, OBO-Edit can work with any type of ontology.

Consortium

The Gene Ontology Consortium is a group of biological databases and research teams working together on the Gene Ontology project. It includes databases for studying different organisms, databases for proteins shared by many species, software developers, and an editorial office dedicated to managing the project.

This article is a child-friendly adaptation of the Wikipedia article on Gene Ontology, available under CC BY-SA 4.0.
