Fetches information about protein-metal binding sites from the MetalPDB database. A complete list of the possible search queries can be found on the MetalPDB website.

fetch_metal_pdb(
  id_type = "uniprot",
  id_value,
  site_type = NULL,
  pfam = NULL,
  cath = NULL,
  scop = NULL,
  representative = NULL,
  metal = NULL,
  ligands = NULL,
  geometry = NULL,
  coordination = NULL,
  donors = NULL,
  columns = NULL,
  show_progress = TRUE
)

Arguments

id_type

a character value that specifies the type of the IDs provided to id_value. Default is "uniprot". Possible options include: "uniprot", "pdb", "ec_number", "molecule" and "organism".

id_value

a character vector supplying IDs of the type specified in id_type, e.g. UniProt IDs. Information for these IDs will be retrieved.

site_type

optional, a character value that specifies a nuclearity for which information should be retrieved. The specific nuclearity can be supplied as e.g. "tetranuclear".

pfam

optional, a character value that specifies a Pfam domain for which information should be retrieved. The domain can be specified as e.g. "Carb_anhydrase".

cath

optional, a character value that specifies a CATH ID for which information should be retrieved. The ID can be specified as e.g. "3.10.200.10".

scop

optional, a character value that specifies a SCOP ID for which information should be retrieved. The ID can be specified as e.g. "b.74.1.1".

representative

optional, a logical value that indicates whether only representative sites of a family should be retrieved. A representative site is a site selected to represent a cluster of equivalent sites. The selection is done by choosing the PDB structure with the best X-ray resolution among those containing the sites in the cluster. NMR structures are generally discarded in favor of X-ray structures, unless all the sites in the cluster are found in NMR structures. If TRUE, only representative sites are retrieved; if FALSE, all sites are retrieved.

metal

optional, a character value that specifies a metal for which information should be retrieved. The metal can be specified as e.g. "Zn".

ligands

optional, a character value that specifies a metal ligand residue for which information should be retrieved. The ligand can be specified as e.g. "His".

geometry

optional, a character value that specifies a metal site geometry for which information should be retrieved. The geometry is specified using the three-letter geometry codes listed on the MetalPDB website.

coordination

optional, a character value that specifies a coordination number for which information should be retrieved. The number can be specified as e.g. "3".

donors

optional, a character value that specifies a metal ligand atom for which information should be retrieved. The atom can be specified as e.g. "S" for sulfur.

columns

optional, a character vector that specifies the columns to retrieve, named as on the MetalPDB website. If nothing is supplied, all possible columns are retrieved.

show_progress

a logical value; if TRUE, a progress bar is shown. Default is TRUE.
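The optional filter arguments can be combined in a single query. The following is a minimal sketch, not a definitive recipe: it assumes that fetch_metal_pdb() is provided by the protti package and that network access to MetalPDB is available. The filter values are the examples given in the argument descriptions above, and the query_args helper list is only an illustration device. The result depends on the current database content, so no output is shown.

```r
# Collect the arguments in a list so the query is easy to inspect and reuse.
# `query_args` is a hypothetical helper name, not part of the function's API.
query_args <- list(
  id_type = "uniprot",
  id_value = "P00918",
  metal = "Zn",            # only zinc sites
  ligands = "His",         # only sites with a histidine ligand
  representative = TRUE,   # only representative sites
  show_progress = FALSE
)

zn_sites <- NULL

# Assumption: the function lives in the protti package. The call needs
# network access, so errors are caught and NULL is returned on failure.
if (requireNamespace("protti", quietly = TRUE)) {
  zn_sites <- tryCatch(
    do.call(protti::fetch_metal_pdb, query_args),
    error = function(e) NULL
  )
}
```

Arguments left at NULL are not used as filters, so only the filters that are actually needed have to be set.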

Value

A data frame that contains information about protein-metal binding sites. Some of its columns might not be self-explanatory:

  • auth_id_metal: Unique structure atom identifier of the metal, which is provided by the author of the structure in order to match the identification used in the publication that describes the structure.

  • auth_seq_id_metal: Residue identifier of the metal, which is provided by the author of the structure in order to match the identification used in the publication that describes the structure.

  • pattern: Metal pattern for each metal bound by the structure.

  • is_representative: A representative site is a site selected to represent a cluster of equivalent sites. The selection is done by choosing the PDB structure with the best X-ray resolution among those containing the sites in the cluster. NMR structures are generally discarded in favor of X-ray structures, unless all the sites in the cluster are found in NMR structures.

  • auth_asym_id_ligand: Chain identifier of the metal-coordinating ligand residues, which is provided by the author of the structure in order to match the identification used in the publication that describes the structure.

  • auth_seq_id_ligand: Residue identifier of the metal-coordinating ligand residues, which is provided by the author of the structure in order to match the identification used in the publication that describes the structure.

  • auth_id_ligand: Unique structure atom identifier of the metal-coordinating ligand residues, which is provided by the author of the structure in order to match the identification used in the publication that describes the structure.

  • auth_atom_id_ligand: Unique residue specific atom identifier of the metal-coordinating ligand residues, which is provided by the author of the structure in order to match the identification used in the publication that describes the structure.
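As a rough illustration of how these columns can be used together, the sketch below summarises a result per metal atom. It assumes that each row describes one contact between a metal atom and a ligand atom, as the repeated site and auth_id_metal values in the example output suggest. The data frame is a hand-made stand-in that uses column names from the returned data frame; every value in it is an illustrative placeholder, not real MetalPDB output.

```r
# Toy stand-in for a fetch_metal_pdb() result.
# All values are illustrative placeholders, not real MetalPDB data.
sites <- data.frame(
  site = c("site_A", "site_A", "site_A", "site_B", "site_B"),
  symbol_metal = c("Zn", "Zn", "Zn", "Mg", "Mg"),
  auth_id_metal = c(101L, 101L, 101L, 202L, 202L),
  residue = c("HIS", "HIS", "CYS", "ASP", "GLU"),
  auth_seq_id_ligand = c(10L, 12L, 45L, 7L, 30L),
  distance = c(2.1, 2.0, 2.3, 2.2, 2.4),
  stringsAsFactors = FALSE
)

# Group rows by the metal atom identifier and summarise each group
n_contacts <- tapply(sites$residue, sites$auth_id_metal, length)
mean_distance <- tapply(sites$distance, sites$auth_id_metal, mean)

metal_summary <- data.frame(
  auth_id_metal = as.integer(names(n_contacts)),
  n_contacts = as.integer(n_contacts),
  mean_distance = as.numeric(mean_distance[names(n_contacts)])
)

print(metal_summary)
```

Grouping by auth_id_metal rather than by site keeps the metals of a multinuclear site separate, since several metal atoms can share one site identifier.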

Examples

# \donttest{
head(fetch_metal_pdb(id_value = c("P42345", "P00918")))
#> # A tibble: 6 × 25
#>   site   organism     scop  site_type ec_number pfam  symbol_metal auth_id_metal
#>   <chr>  <chr>        <chr> <chr>     <chr>     <chr> <chr>                <int>
#> 1 4jsv_1 Homo sapiens ""    Trinucle… 2.7.11.1  PI3_… Mg                   22160
#> 2 4jsv_1 Homo sapiens ""    Trinucle… 2.7.11.1  PI3_… Mg                   22160
#> 3 4jsv_1 Homo sapiens ""    Trinucle… 2.7.11.1  PI3_… Mg                   22160
#> 4 4jsv_1 Homo sapiens ""    Trinucle… 2.7.11.1  PI3_… Mg                   22160
#> 5 4jsv_1 Homo sapiens ""    Trinucle… 2.7.11.1  PI3_… Mg                   22163
#> 6 4jsv_1 Homo sapiens ""    Trinucle… 2.7.11.1  PI3_… Mg                   22163
#> # ℹ 17 more variables: name <chr>, pattern <chr>, geometry <chr>,
#> #   auth_seq_id_metal <int>, coordination <int>, molecule <chr>, cath <chr>,
#> #   uniprot <chr>, is_representative <lgl>, pdb <chr>,
#> #   auth_asym_id_ligand <chr>, auth_seq_id_ligand <int>, residue <chr>,
#> #   symbol <chr>, distance <dbl>, auth_id_ligand <int>,
#> #   auth_atom_id_ligand <chr>

fetch_metal_pdb(id_type = "pdb", id_value = c("1g54"), metal = "Zn")
#> # A tibble: 5 × 25
#>   site   organism     scop  site_type ec_number pfam  symbol_metal auth_id_metal
#>   <chr>  <chr>        <chr> <chr>     <chr>     <chr> <chr>                <int>
#> 1 1g54_2 Homo sapiens b.74… Mononucl… 4.2.1.1   Carb… Zn                    2060
#> 2 1g54_2 Homo sapiens b.74… Mononucl… 4.2.1.1   Carb… Zn                    2060
#> 3 1g54_2 Homo sapiens b.74… Mononucl… 4.2.1.1   Carb… Zn                    2060
#> 4 1g54_2 Homo sapiens b.74… Mononucl… 4.2.1.1   Carb… Zn                    2060
#> 5 1g54_2 Homo sapiens b.74… Mononucl… 4.2.1.1   Carb… Zn                    2060
#> # ℹ 17 more variables: name <chr>, pattern <chr>, geometry <chr>,
#> #   auth_seq_id_metal <int>, coordination <int>, molecule <chr>, cath <chr>,
#> #   uniprot <chr>, is_representative <lgl>, pdb <chr>,
#> #   auth_asym_id_ligand <chr>, auth_seq_id_ligand <int>, residue <chr>,
#> #   symbol <chr>, distance <dbl>, auth_id_ligand <int>,
#> #   auth_atom_id_ligand <chr>
# }