Fetches gene ontology (GO) annotations, terms or slims from the EBI QuickGO database. Annotations can be retrieved for specific UniProt IDs or for an NCBI taxonomy identifier. When terms are retrieved, the complete list of all GO terms is returned. To generate a slim dataset, provide the GO IDs that should be considered. A slim dataset is a subset of the full GO dataset that considers all child terms of the supplied IDs.

fetch_quickgo(
  type = "annotations",
  id_annotations = NULL,
  taxon_id_annotations = NULL,
  ontology_annotations = "all",
  go_id_slims = NULL,
  relations_slims = c("is_a", "part_of", "regulates", "occurs_in"),
  timeout = 1200,
  max_tries = 2,
  show_progress = TRUE
)

Arguments

type

a character value that indicates whether gene ontology terms, annotations or slims should be retrieved. Possible values are "annotations", "terms" and "slims". If annotations are retrieved, the maximum number of results is 2,000,000.

id_annotations

an optional character vector that specifies UniProt IDs for which GO annotations should be retrieved. This argument should only be provided if annotations are retrieved.

taxon_id_annotations

an optional character value that specifies the NCBI taxonomy identifier (TaxId) for an organism for which GO annotations should be retrieved. This argument should only be provided if annotations are retrieved.

ontology_annotations

an optional character value that specifies the ontology that should be retrieved. Possible values are "all", "molecular_function", "biological_process" and "cellular_component". The default is "all". This argument should only be provided if annotations are retrieved.

go_id_slims

an optional character vector that specifies gene ontology IDs (e.g. GO:0046872) for which a slim GO set should be generated. This argument should only be provided if slims are retrieved.

relations_slims

an optional character vector that specifies which relations between GO IDs should be considered when the slim dataset is generated. The default is c("is_a", "part_of", "regulates", "occurs_in"). This argument should only be provided if slims are retrieved.
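
To make the role of the relation filter concrete, the sketch below collects the descendants of a term in a toy ontology while following only edges whose relation is in an allowed set. It illustrates the idea only and is not QuickGO's algorithm. The edge table, the term IDs (T1 to T5) and the helper `descendants_of` are hypothetical.

```r
# Toy ontology edges (child -> parent); IDs and relations are invented.
edges <- data.frame(
  child = c("T2", "T3", "T4", "T5"),
  parent = c("T1", "T1", "T2", "T3"),
  relation = c("is_a", "regulates", "part_of", "is_a"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: collect all terms reachable from `root` by walking
# child edges whose relation is listed in `relations`.
descendants_of <- function(root, edges, relations) {
  allowed <- edges[edges$relation %in% relations, ]
  found <- character(0)
  frontier <- root
  while (length(frontier) > 0) {
    children <- allowed$child[allowed$parent %in% frontier]
    # Only expand terms that have not been visited yet.
    frontier <- setdiff(children, found)
    found <- union(found, frontier)
  }
  found
}

all_relations <- descendants_of("T1", edges, c("is_a", "part_of", "regulates"))
is_a_only <- descendants_of("T1", edges, "is_a")
```

With all three relations allowed, every toy term below T1 is reached. Restricting the walk to "is_a" stops at T2, because T3 is only connected to T1 through a "regulates" edge.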

timeout

a numeric value specifying the time in seconds until the download times out. The default is 1200 seconds.

max_tries

a numeric value that specifies the number of times the function tries to download the data in case an error occurs. The default is 2.
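
The effect of max_tries can be pictured as a simple retry loop. The following is a minimal sketch under that assumption, not the package's actual implementation. `with_retries` and `flaky_download` are hypothetical names introduced for illustration.

```r
# Hypothetical helper: call `fun` up to `max_tries` times and return the
# first successful result; re-raise the last error if all attempts fail.
with_retries <- function(fun, max_tries = 2) {
  last_error <- NULL
  for (attempt in seq_len(max_tries)) {
    result <- tryCatch(fun(), error = function(e) e)
    if (!inherits(result, "error")) {
      return(result)
    }
    last_error <- result
  }
  stop(last_error)
}

# Toy stand-in for a download that fails once, then succeeds.
calls <- 0
flaky_download <- function() {
  calls <<- calls + 1
  if (calls < 2) stop("simulated timeout")
  "payload"
}

result <- with_retries(flaky_download, max_tries = 2)
```

In this toy run the first attempt fails and the second succeeds, so the default of two tries is just enough to recover from a single transient error.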

show_progress

a logical value that indicates whether a progress bar should be shown. The default is TRUE.

Value

A data frame that contains descriptive information about gene ontology annotations, terms or slims, depending on the value of the type argument.

Examples

# \donttest{
# Annotations
annotations <- fetch_quickgo(
  type = "annotations",
  id_annotations = c("P63328", "Q4FFP4"),
  ontology_annotations = "molecular_function"
)
#> Retrieving GO annotations ... 
#> DONE(0.51s)

head(annotations)
#> # A tibble: 6 × 15
#>   gene_product_db gene_product_id go_name     symbol qualifier go_term go_aspect
#>   <chr>           <chr>           <chr>       <chr>  <chr>     <chr>   <chr>    
#> 1 UniProtKB       P63328          phosphopro… Ppp3ca enables   GO:000… molecula…
#> 2 UniProtKB       P63328          phosphopro… Ppp3ca enables   GO:000… molecula…
#> 3 UniProtKB       P63328          phosphopro… Ppp3ca enables   GO:000… molecula…
#> 4 UniProtKB       P63328          protein se… Ppp3ca enables   GO:000… molecula…
#> 5 UniProtKB       P63328          protein se… Ppp3ca enables   GO:000… molecula…
#> 6 UniProtKB       P63328          protein se… Ppp3ca enables   GO:000… molecula…
#> # ℹ 8 more variables: eco_id <chr>, go_evidence_code <chr>, reference <chr>,
#> #   with_from <chr>, taxon_id <dbl>, assigned_by <chr>,
#> #   annotation_extension <chr>, date <dbl>

# Terms
terms <- fetch_quickgo(type = "terms")

head(terms)
#> # A tibble: 6 × 13
#>   main_id    is_obsolete main_name            definition ontology usage child_id
#>   <chr>      <lgl>       <chr>                <chr>      <chr>    <chr> <chr>   
#> 1 GO:0006600 FALSE       creatine metabolic … The chemi… biologi… Unre… GO:0006…
#> 2 GO:0006601 FALSE       creatine biosynthet… The chemi… biologi… Unre… NA      
#> 3 GO:0006611 FALSE       protein export from… The direc… biologi… Unre… GO:0046…
#> 4 GO:0006612 FALSE       protein targeting t… The proce… biologi… Unre… GO:0090…
#> 5 GO:0006610 FALSE       ribosomal protein i… The direc… biologi… Unre… NA      
#> 6 GO:0006604 FALSE       phosphoarginine met… The chemi… biologi… Unre… GO:0046…
#> # ℹ 6 more variables: children_relation <chr>, chebi_id <chr>,
#> #   relations_term <chr>, database <chr>, relations_url <chr>,
#> #   relations_relation <chr>

# Slims
slims <- fetch_quickgo(
  type = "slims",
  go_id_slims = c("GO:0046872", "GO:0051540")
)

head(slims)
#> # A tibble: 6 × 2
#>   slims_from_id slims_to_ids
#>   <chr>         <chr>       
#> 1 GO:0046914    GO:0046872  
#> 2 GO:0032791    GO:0046872  
#> 3 GO:1904434    GO:0046872  
#> 4 GO:1904433    GO:0046872  
#> 5 GO:1904432    GO:0046872  
#> 6 GO:0030151    GO:0046872  
# }
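
One possible way to combine the outputs is to map annotated GO terms onto slim terms with a join. The sketch below uses toy data frames whose column names follow the printed output above, while the rows and the protein IDs are made up. It assumes each row of slims_to_ids holds a single GO ID, as in the printed example.

```r
# Toy stand-ins for the data frames returned by fetch_quickgo(); the
# column names follow the printed output, the rows are made up.
annotations_toy <- data.frame(
  gene_product_id = c("protein_A", "protein_B", "protein_C"),
  go_term = c("GO:0046914", "GO:0030151", "GO:0005515"),
  stringsAsFactors = FALSE
)
slims_toy <- data.frame(
  slims_from_id = c("GO:0046914", "GO:0030151"),
  slims_to_ids = c("GO:0046872", "GO:0046872"),
  stringsAsFactors = FALSE
)

# Inner join: keep only annotations whose GO term falls under a slim term.
mapped <- merge(
  annotations_toy,
  slims_toy,
  by.x = "go_term",
  by.y = "slims_from_id"
)
```

Here protein_C drops out because its toy GO term has no entry in the slim table. Passing `all.x = TRUE` to `merge()` would keep it with a missing slim ID instead.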