The peptide type of a given peptide is assigned based on the amino acid preceding the peptide and on the peptide's C-terminal amino acid. Peptides with a preceding and a C-terminal lysine or arginine are considered fully-tryptic. A peptide located at the N- or C-terminus of a protein that otherwise fulfills the criterion is also considered fully-tryptic. Peptides that fulfill the criterion at only one terminus are semi-tryptic. Peptides that fulfill the criterion at neither terminus are non-tryptic. In addition, a peptide that lacks the initial methionine of a protein is considered "tryptic" at that site if no other peptide of that protein starts at position 1.
assign_peptide_type(
  data,
  aa_before = aa_before,
  last_aa = last_aa,
  aa_after = aa_after,
  protein_id = NULL,
  start = start
)
data
a data frame containing at least the preceding and C-terminal amino acids of each peptide.
aa_before
a character column in the input data frame (data) that contains the preceding amino
acid as a one-letter code.
last_aa
a character column in the input data frame (data) that contains the C-terminal amino
acid as a one-letter code.
aa_after
a character column in the input data frame (data) that contains the following amino
acid as a one-letter code.
protein_id
a character column in the input data frame (data) that contains the protein
accession numbers.
start
a numeric column in the input data frame (data) that contains the start position of
each peptide within the corresponding protein. This is used to check whether the protein is consistently
missing the initial methionine, in which case peptides starting at position 2 are "tryptic" at that site.
A data frame that contains the input data and an additional column, pep_type, with the peptide type of each peptide (fully-tryptic, semi-tryptic or non-tryptic).
data <- data.frame(
  aa_before = c("K", "M", "", "M", "S", "M", "-"),
  last_aa = c("R", "K", "R", "R", "Y", "K", "K"),
  aa_after = c("T", "R", "T", "R", "T", "R", "T"),
  protein_id = c("P1", "P1", "P3", "P3", "P2", "P2", "P2"),
  start = c(38, 2, 1, 2, 10, 2, 1)
)
assign_peptide_type(data, aa_before, last_aa, aa_after, protein_id, start)
#>   aa_before last_aa aa_after protein_id start      pep_type
#> 1         K       R        T         P1    38 fully-tryptic
#> 2         M       K        R         P1     2 fully-tryptic
#> 3                 R        T         P3     1 fully-tryptic
#> 4         M       R        R         P3     2  semi-tryptic
#> 5         S       Y        T         P2    10   non-tryptic
#> 6         M       K        R         P2     2  semi-tryptic
#> 7         -       K        T         P2     1 fully-tryptic
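The rules from the description can be sketched in base R to make the decision logic explicit. This is only an illustration under stated assumptions, not the package's implementation: classify_peptides() is a hypothetical helper, a protein terminus is assumed to be encoded as "" or "-" (as in rows 3 and 7 of the example above), and missing values are assumed to be absent.

```r
# Hypothetical helper, not part of the package: a base R sketch of the rules.
classify_peptides <- function(aa_before, last_aa, aa_after, protein_id, start) {
  tryptic_aa <- c("K", "R")
  terminus <- c("", "-") # assumed encodings of a protein terminus

  # TRUE for every peptide whose protein has some peptide starting at position 1
  has_pos1 <- protein_id %in% protein_id[start == 1]

  # N-terminal side: preceded by K/R, located at the protein N-terminus, or
  # starting at position 2 behind an initial methionine that is never observed
  n_ok <- aa_before %in% c(tryptic_aa, terminus) |
    (aa_before == "M" & start == 2 & !has_pos1)

  # C-terminal side: ends in K/R or is located at the protein C-terminus
  c_ok <- last_aa %in% tryptic_aa | aa_after %in% terminus

  ifelse(n_ok & c_ok, "fully-tryptic",
    ifelse(n_ok | c_ok, "semi-tryptic", "non-tryptic")
  )
}

# Same values as the example data above
pep_type <- classify_peptides(
  aa_before = c("K", "M", "", "M", "S", "M", "-"),
  last_aa = c("R", "K", "R", "R", "Y", "K", "K"),
  aa_after = c("T", "R", "T", "R", "T", "R", "T"),
  protein_id = c("P1", "P1", "P3", "P3", "P2", "P2", "P2"),
  start = c(38, 2, 1, 2, 10, 2, 1)
)
table(pep_type)
```

For the example data, the sketch returns the same seven labels as the pep_type column shown above. Rows 2, 4 and 6 illustrate the methionine rule: only P1 has no peptide starting at position 1, so only its position-2 peptide counts as tryptic at the N-terminal side.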