Fetches either domain-level information (including, for example, gene ontology annotations) or residue-level information from the InterPro database.
fetch_interpro(
  uniprot_ids = NULL,
  return_residue_info = FALSE,
  manual_query = NULL,
  page_size = 200,
  max_tries = 3,
  timeout = 20,
  show_progress = TRUE
)
uniprot_ids: a character vector of UniProt accession numbers.
return_residue_info: a logical value that specifies whether residue-level information is returned instead of domain-level information. The default is FALSE.
manual_query: optional, a character value that is a custom query to the InterPro database. This query is pasted after "https://www.ebi.ac.uk/interpro/api/" and before "&page_size=200". The raw data of the query is returned as a list.
page_size: a numeric value that specifies the number of entries that should be retrieved per page of a request. The function iterates through all pages regardless, but this parameter lets you fine-tune the number of iterations and thus the number of requests to the database. Default is 200.
max_tries: a numeric value that specifies the number of times the function tries to download the data in case an error occurs. The default is 3.
timeout: a numeric value that specifies the maximum request time per try. Default is 20 seconds.
show_progress: a logical value that determines if a progress bar will be shown. Default is TRUE.
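The sketch below illustrates how the manual_query and page_size arguments could translate into requests. It is not the package's implementation: build_query_url() and n_requests() are hypothetical helpers, the query string is a placeholder that has not been checked against the InterPro API, and substituting page_size into the "&page_size=" suffix is an assumption based on the argument descriptions.

```r
# Hypothetical helpers for illustration; they are not part of the package.
base_url <- "https://www.ebi.ac.uk/interpro/api/"

# Assemble a request URL the way the manual_query description reads:
# base URL, then the custom query, then the page size parameter.
# Assumption: the page_size argument fills in the "&page_size=" suffix.
build_query_url <- function(manual_query, page_size = 200) {
  paste0(base_url, manual_query, "&page_size=", page_size)
}

# Number of requests needed to page through `n_entries` results.
n_requests <- function(n_entries, page_size = 200) {
  ceiling(n_entries / page_size)
}

# Placeholder query string, not checked against the InterPro API.
example_url <- build_query_url("some/endpoint?some_parameter=value")

# A result set of 450 entries needs fewer requests with larger pages.
requests_default <- n_requests(450, page_size = 200)
requests_small <- n_requests(450, page_size = 50)
```

Under these assumptions, a larger page_size means fewer but heavier requests, while a smaller one means more but lighter requests.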
A data frame that contains either domain or residue level information for the provided UniProt IDs.
# \donttest{
uniprot_ids <- c("P36578", "O43324", "Q00796", "O32583")
domain_info <- fetch_interpro(uniprot_ids = uniprot_ids)
#> Fetching InterPro Domains ■■■■■■■■■ 25% (1/4) ETA: 4s
head(domain_info)
#> # A tibble: 6 × 13
#>   identifier identifier_name        identifier_source_da…¹ identifier_type go_id
#>   <chr>      <chr>                  <chr>                  <chr>           <chr>
#> 1 IPR002136  Large ribosomal subun… interpro               family          GO:0…
#> 2 IPR002136  Large ribosomal subun… interpro               family          GO:0…
#> 3 IPR002136  Large ribosomal subun… interpro               family          GO:0…
#> 4 IPR013000  Large ribosomal subun… interpro               conserved_site  NA
#> 5 IPR023574  Large ribosomal subun… interpro               homologous_sup… GO:0…
#> 6 IPR023574  Large ribosomal subun… interpro               homologous_sup… GO:0…
#> # ℹ abbreviated name: ¹identifier_source_database
#> # ℹ 8 more variables: go_name <chr>, go_code <chr>, go_type <chr>, start <int>,
#> #   end <int>, dc_status <chr>, representative <lgl>, accession <chr>
residue_info <- fetch_interpro(
  uniprot_ids = uniprot_ids,
  return_residue_info = TRUE
)
head(residue_info)
#>   accession start end residues     fragment_description source_database
#> 1    P36578    NA  NA     <NA>                     <NA>            <NA>
#> 2    O43324    70  70        I putative MetRS interface             cdd
#> 3    O43324    97  97        D putative MetRS interface             cdd
#> 4    O43324   100 100        S putative MetRS interface             cdd
#> 5    O43324   101 101        Y putative MetRS interface             cdd
#> 6    O43324   103 103        E putative MetRS interface             cdd
#>   source_accession source_name
#> 1             <NA>        <NA>
#> 2          cd10305 GST_C_AIMP3
#> 3          cd10305 GST_C_AIMP3
#> 4          cd10305 GST_C_AIMP3
#> 5          cd10305 GST_C_AIMP3
#> 6          cd10305 GST_C_AIMP3
# }
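The max_tries argument describes a retry-on-error behaviour. A minimal offline sketch of that idea follows. It is an assumption about the general pattern, not the package's code: with_retries() and flaky_download() are hypothetical, the download is simulated, and the per-try timeout is not modelled because it would belong to the HTTP client.

```r
# Hypothetical helper, not part of the package: call `fun` up to `max_tries`
# times and return its result, or NULL if every attempt raises an error.
with_retries <- function(fun, max_tries = 3) {
  for (attempt in seq_len(max_tries)) {
    # Capture an error as a condition object instead of stopping.
    result <- tryCatch(fun(), error = function(e) e)
    if (!inherits(result, "error")) {
      return(result)
    }
    message(
      "Attempt ", attempt, " of ", max_tries, " failed: ",
      conditionMessage(result)
    )
  }
  NULL
}

# Simulated download that fails twice and then succeeds (no network access).
calls <- 0
flaky_download <- function() {
  calls <<- calls + 1
  if (calls < 3) stop("simulated timeout")
  "payload"
}

retry_result <- with_retries(flaky_download, max_tries = 3)

# A download that always fails exhausts its tries and yields NULL.
failed_result <- with_retries(function() stop("always fails"), max_tries = 2)
```

Returning NULL after the last failed attempt is a choice made for this sketch only; the source does not state what fetch_interpro() returns when all tries fail.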