R/fetch_metal_pdb.R
fetch_metal_pdb.Rd
Fetches information about protein-metal binding sites from the MetalPDB database. A complete list of the possible search queries can be found on the MetalPDB website.
fetch_metal_pdb(
id_type = "uniprot",
id_value,
site_type = NULL,
pfam = NULL,
cath = NULL,
scop = NULL,
representative = NULL,
metal = NULL,
ligands = NULL,
geometry = NULL,
coordination = NULL,
donors = NULL,
columns = NULL,
show_progress = TRUE
)
a character value that specifies the type of the IDs provided to id_value. Default is "uniprot". Possible options include: "uniprot", "pdb", "ec_number", "molecule" and "organism".
a character vector supplying IDs of the type specified in id_type, e.g. UniProt IDs. Information for these IDs will be retrieved.
optional, a character value that specifies a nuclearity for which information should be retrieved. The specific nuclearity can be supplied as e.g. "tetranuclear".
optional, a character value that specifies a Pfam domain for which information should be retrieved. The domain can be specified as e.g. "Carb_anhydrase".
optional, a character value that specifies a CATH ID for which information should be retrieved. The ID can be specified as e.g. "3.10.200.10".
optional, a character value that specifies a SCOP ID for which information should be retrieved. The ID can be specified as e.g. "b.74.1.1".
optional, a logical that indicates whether only information on representative sites of a family should be retrieved. A representative site is a site selected to represent a cluster of equivalent sites. The selection is done by choosing the PDB structure with the best X-ray resolution among those containing the sites in the cluster. NMR structures are generally discarded in favor of X-ray structures, unless all the sites in the cluster are found in NMR structures. If it is TRUE, only representative sites are retrieved; if it is FALSE, all sites are retrieved.
optional, a character value that specifies a metal for which information should be retrieved. The metal can be specified as e.g. "Zn".
optional, a character value that specifies a metal ligand residue for which information should be retrieved. The ligand can be specified as e.g. "His".
optional, a character value that specifies a metal site geometry for which information should be retrieved. The geometry is specified using the three-letter geometry codes listed on the MetalPDB website.
optional, a character value that specifies a coordination number for which information should be retrieved. The number can be specified as e.g. "3".
optional, a character value that specifies a metal ligand atom for which information should be retrieved. The atom can be specified as e.g. "S" for sulfur.
optional, a character vector that specifies specific columns that should be retrieved based on the MetalPDB website. If nothing is supplied here, all possible columns will be retrieved.
logical, if TRUE, a progress bar will be shown. Default is TRUE.
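The arguments above can be combined into a single query. The following is a minimal sketch, assuming a hypothetical helper named build_metal_query() that is not part of the package: it checks id_type against the options listed above and drops optional filters that are left as NULL. It only illustrates how the arguments fit together and makes no claim about how fetch_metal_pdb() handles them internally.

```r
# Hypothetical helper (not part of the package): validate id_type against the
# documented options and collect the arguments for a later call.
build_metal_query <- function(id_value,
                              id_type = c("uniprot", "pdb", "ec_number",
                                          "molecule", "organism"),
                              ...) {
  # match.arg() raises an error for any value outside the listed options
  id_type <- match.arg(id_type)

  # Optional filters default to NULL in fetch_metal_pdb(), so NULL entries
  # are simply dropped here
  filters <- list(...)
  filters <- filters[!vapply(filters, is.null, logical(1))]

  c(list(id_type = id_type, id_value = id_value), filters)
}

query <- build_metal_query(
  "1g54",
  id_type = "pdb",
  metal = "Zn",
  representative = TRUE,
  pfam = NULL
)
str(query)

# The actual request needs network access and the package that provides
# fetch_metal_pdb(), so it is left commented out:
# result <- do.call(fetch_metal_pdb, query)
```

Keeping the arguments in a list makes it easy to reuse the same set of filters for several id_value vectors.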
A data frame that contains information about protein-metal binding sites. Some of its columns might not be self-explanatory:
auth_id_metal: Unique structure atom identifier of the metal, which is provided by the author of the structure in order to match the identification used in the publication that describes the structure.
auth_seq_id_metal: Residue identifier of the metal, which is provided by the author of the structure in order to match the identification used in the publication that describes the structure.
pattern: Metal pattern for each metal bound by the structure.
is_representative: A representative site is a site selected to represent a cluster of equivalent sites. The selection is done by choosing the PDB structure with the best X-ray resolution among those containing the sites in the cluster. NMR structures are generally discarded in favor of X-ray structures, unless all the sites in the cluster are found in NMR structures.
auth_asym_id_ligand: Chain identifier of the metal-coordinating ligand residues, which is provided by the author of the structure in order to match the identification used in the publication that describes the structure.
auth_seq_id_ligand: Residue identifier of the metal-coordinating ligand residues, which is provided by the author of the structure in order to match the identification used in the publication that describes the structure.
auth_id_ligand: Unique structure atom identifier of the metal-coordinating ligand residues, which is provided by the author of the structure in order to match the identification used in the publication that describes the structure.
auth_atom_id_ligand: Unique residue specific atom identifier of the metal-coordinating ligand residues, which is provided by the author of the structure in order to match the identification used in the publication that describes the structure.
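Judging from the example output, the returned data frame appears to hold several rows per metal, one for each coordinating ligand entry. The sketch below shows one way such a table could be reduced to representative sites and summarised per metal using base R. The data frame is a toy stand-in with invented placeholder values, not real MetalPDB output; only the column names are taken from this page.

```r
# Toy stand-in for a fetch_metal_pdb() result. All values are invented
# placeholders for illustration; only the column names come from the
# documentation.
sites <- data.frame(
  site = c("site_a", "site_a", "site_a", "site_b"),
  auth_id_metal = c(1L, 1L, 1L, 2L),
  residue = c("HIS", "HIS", "CYS", "ASP"),
  is_representative = c(TRUE, TRUE, TRUE, FALSE),
  stringsAsFactors = FALSE
)

# Keep only rows that belong to representative sites
representative_sites <- sites[sites$is_representative, ]

# Count the ligand rows recorded for each metal atom of each site
ligands_per_metal <- aggregate(
  residue ~ site + auth_id_metal,
  data = representative_sites,
  FUN = length
)
names(ligands_per_metal)[3] <- "n_ligand_rows"

print(ligands_per_metal)
```

When only representative sites are of interest, passing representative = TRUE to the function avoids retrieving the other rows in the first place.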
# \donttest{
head(fetch_metal_pdb(id_value = c("P42345", "P00918")))
#> # A tibble: 6 × 25
#> site organism scop site_type ec_number pfam symbol_metal auth_id_metal
#> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <int>
#> 1 4jsv_1 Homo sapiens "" Trinucle… 2.7.11.1 PI3_… Mg 22160
#> 2 4jsv_1 Homo sapiens "" Trinucle… 2.7.11.1 PI3_… Mg 22160
#> 3 4jsv_1 Homo sapiens "" Trinucle… 2.7.11.1 PI3_… Mg 22160
#> 4 4jsv_1 Homo sapiens "" Trinucle… 2.7.11.1 PI3_… Mg 22160
#> 5 4jsv_1 Homo sapiens "" Trinucle… 2.7.11.1 PI3_… Mg 22163
#> 6 4jsv_1 Homo sapiens "" Trinucle… 2.7.11.1 PI3_… Mg 22163
#> # ℹ 17 more variables: name <chr>, pattern <chr>, geometry <chr>,
#> # auth_seq_id_metal <int>, coordination <int>, molecule <chr>, cath <chr>,
#> # uniprot <chr>, is_representative <lgl>, pdb <chr>,
#> # auth_asym_id_ligand <chr>, auth_seq_id_ligand <int>, residue <chr>,
#> # symbol <chr>, distance <dbl>, auth_id_ligand <int>,
#> # auth_atom_id_ligand <chr>
fetch_metal_pdb(id_type = "pdb", id_value = c("1g54"), metal = "Zn")
#> # A tibble: 5 × 25
#> site organism scop site_type ec_number pfam symbol_metal auth_id_metal
#> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <int>
#> 1 1g54_2 Homo sapiens b.74… Mononucl… 4.2.1.1 Carb… Zn 2060
#> 2 1g54_2 Homo sapiens b.74… Mononucl… 4.2.1.1 Carb… Zn 2060
#> 3 1g54_2 Homo sapiens b.74… Mononucl… 4.2.1.1 Carb… Zn 2060
#> 4 1g54_2 Homo sapiens b.74… Mononucl… 4.2.1.1 Carb… Zn 2060
#> 5 1g54_2 Homo sapiens b.74… Mononucl… 4.2.1.1 Carb… Zn 2060
#> # ℹ 17 more variables: name <chr>, pattern <chr>, geometry <chr>,
#> # auth_seq_id_metal <int>, coordination <int>, molecule <chr>, cath <chr>,
#> # uniprot <chr>, is_representative <lgl>, pdb <chr>,
#> # auth_asym_id_ligand <chr>, auth_seq_id_ligand <int>, residue <chr>,
#> # symbol <chr>, distance <dbl>, auth_id_ligand <int>,
#> # auth_atom_id_ligand <chr>
# }