Fetches structure metadata from RCSB. If you want to retrieve atom data such as positions, use
fetch_pdb(pdb_ids, batchsize = 200, show_progress = TRUE)
pdb_ids: a character vector of PDB identifiers.
batchsize: a numeric value that specifies the number of structures to be processed in a single query. Default is 200.
show_progress: a logical value that indicates whether a progress bar will be shown. Default is TRUE.
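As a rough illustration of what batchsize controls, the sketch below splits a vector of identifiers into groups of at most batchsize elements using base R. It only shows the idea of batching and is an assumption, not the package's internal code; split_into_batches is a hypothetical helper and the identifiers are placeholders rather than real PDB IDs.

```r
# Hypothetical helper: split IDs into batches of at most `batchsize` elements.
# This illustrates the idea of batching only; it is not the package's internal code.
split_into_batches <- function(pdb_ids, batchsize = 200) {
  stopifnot(is.character(pdb_ids), batchsize >= 1)
  # Group index 1, 1, ..., 2, 2, ... so that each group holds up to `batchsize` IDs
  groups <- ceiling(seq_along(pdb_ids) / batchsize)
  unname(split(pdb_ids, groups))
}

# Placeholder identifiers, used only to show the grouping
ids <- sprintf("ID%03d", 1:450)
batches <- split_into_batches(ids, batchsize = 200)

# Number of identifiers per batch: 200, 200 and 50
lengths(batches)
```

With 450 identifiers and a batch size of 200, three queries would be needed under this scheme.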
A data frame that contains structure metadata for the PDB IDs provided. The data frame contains some columns that might not be self-explanatory.
auth_asym_id: Chain identifier provided by the author of the structure in order to match the identification used in the publication that describes the structure.
label_asym_id: Chain identifier following the standardised convention for mmCIF files.
entity_beg_seq_id, ref_beg_seq_id, length, pdb_sequence: entity_beg_seq_id is a position in the structure sequence (pdb_sequence) that matches the position given in ref_beg_seq_id, which is a position within the protein sequence (not included in the data frame). length identifies the stretch of sequence for which positions match accordingly between structure and protein sequence. entity_beg_seq_id is a residue ID based on the standardised convention for mmCIF files.
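The relationship between entity_beg_seq_id, ref_beg_seq_id and length can be sketched as a simple offset calculation. The helper below is hypothetical (structure_to_protein_pos is not part of the package) and the numbers are made up; it assumes the aligned stretch is contiguous, as the description of length suggests.

```r
# Hypothetical helper: convert a position in pdb_sequence to the matching
# position in the reference protein sequence for one row of the data frame.
structure_to_protein_pos <- function(pos, entity_beg_seq_id, ref_beg_seq_id, length) {
  # Last structure position covered by the aligned stretch
  entity_end <- entity_beg_seq_id + length - 1
  # Positions outside the aligned stretch have no defined counterpart
  ifelse(pos >= entity_beg_seq_id & pos <= entity_end,
         ref_beg_seq_id + (pos - entity_beg_seq_id),
         NA_integer_)
}

# Made-up values: structure position 1 aligns with protein position 1456
# and the aligned stretch spans 10 residues
protein_pos <- structure_to_protein_pos(c(1, 5, 10, 11),
                                        entity_beg_seq_id = 1,
                                        ref_beg_seq_id = 1456,
                                        length = 10)

# 1456, 1460, 1465 and NA (position 11 lies outside the stretch)
protein_pos
```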
auth_seq_id: Residue identifier provided by the author of the structure in order to match the identification used in the publication that describes the structure. This character vector has the same length as the pdb_sequence and each position is the identifier for the matching amino acid position in pdb_sequence. The contained values are not necessarily numbers and the values do not have to be positive.
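To make the alignment between auth_seq_id and pdb_sequence concrete, the following sketch pairs each residue with its author-provided identifier. All values are invented for illustration, and the column name sequence_position is a hypothetical label; real values would come from one row of the returned data frame.

```r
# Made-up example values; real entries come from the returned data frame
pdb_sequence <- "GSMKT"
auth_seq_id <- c("-1", "0", "1", "2", "2A")

# Split the sequence into single residues
residues <- strsplit(pdb_sequence, split = "")[[1]]

# Both vectors must line up one-to-one
stopifnot(length(residues) == length(auth_seq_id))

# Pair each residue with its position in pdb_sequence and the author identifier
residue_map <- data.frame(
  sequence_position = seq_along(residues),
  residue = residues,
  auth_seq_id = auth_seq_id,
  stringsAsFactors = FALSE
)

# Look up the residue that the author labelled "1"
residue_map$residue[residue_map$auth_seq_id == "1"]
```

Because the identifiers can be negative or contain letters, they are kept as character values here and compared as strings rather than converted to numbers.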
#> # A tibble: 6 × 32
#>   pdb_ids auth_asym_id label_asym_id reference_database_accession protein_name
#>   <chr>   <chr>        <chr>         <chr>                        <chr>
#> 1 6HG1    A            A             P27708                       CAD protein
#> 2 6HG1    A            A             P27708                       CAD protein
#> 3 6HG1    A            A             P27708                       CAD protein
#> 4 6HG1    A            A             P27708                       CAD protein
#> 5 6HG1    A            A             P05020                       Dihydroorotase
#> 6 6HG1    A            A             P05020                       Dihydroorotase
#> # … with 27 more variables: reference_database_name <chr>,
#> #   entity_beg_seq_id <int>, ref_beg_seq_id <int>, length <int>,
#> #   pdb_sequence <chr>, auth_seq_id <list>, id_nonpolymer <chr>,
#> #   type_nonpolymer <chr>, formula_weight_nonpolymer <dbl>,
#> #   name_nonpolymer <chr>, formula_nonpolymer <chr>, experimental_method <chr>,
#> #   structure_method <chr>, affinity_comp_id <chr>, affinity_value <dbl>,
#> #   pdbx_keywords <chr>, assembly_count <int>, …