Calculate the sequence coverage of each identified protein, i.e. the percentage of the protein sequence that is covered by the identified peptides.

calculate_sequence_coverage(data, protein_sequence, peptides)

Arguments

data

a data frame containing at least the protein sequence and the identified peptides as columns.

protein_sequence

a character column in the data data frame that contains protein sequences. These can be obtained with the function fetch_uniprot().

peptides

a character column in the data data frame that contains the identified peptides.

Value

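The data data frame with a new coverage column containing the calculated sequence coverage, in percent, for each identified protein. The value is repeated on every peptide row that belongs to the same protein sequence.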

Examples

data <- data.frame(
  protein_sequence = c("abcdefghijklmnop", "abcdefghijklmnop"),
  pep_stripped_sequence = c("abc", "jklmn")
)

calculate_sequence_coverage(
  data,
  protein_sequence = protein_sequence,
  peptides = pep_stripped_sequence
)
#>   protein_sequence pep_stripped_sequence coverage
#> 1 abcdefghijklmnop                   abc       50
#> 2 abcdefghijklmnop                 jklmn       50
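
Both rows report 50 because the two peptides together cover 8 of the 16 positions of the shared protein sequence (3 from "abc" plus 5 from "jklmn"). The base R sketch below reproduces that arithmetic. It is only an illustration under stated assumptions, not the package's implementation: coverage_sketch() is a hypothetical helper, it matches peptides as exact substrings, and it counts each covered position once even when peptides overlap.

```r
# Hypothetical helper (not part of the package): percentage of positions in
# `protein` that are covered by at least one exact match of any peptide.
coverage_sketch <- function(protein, peptides) {
  # One flag per residue; TRUE once any peptide covers that position.
  covered <- logical(nchar(protein))

  for (pep in unique(peptides)) {
    # gregexpr() returns the start positions of exact (fixed = TRUE),
    # non-overlapping matches, or -1 when the peptide is not found.
    starts <- gregexpr(pep, protein, fixed = TRUE)[[1]]
    for (s in starts[starts > 0]) {
      covered[s:(s + nchar(pep) - 1)] <- TRUE
    }
  }

  # Covered positions as a percentage of the protein length.
  100 * sum(covered) / nchar(protein)
}

coverage_sketch("abcdefghijklmnop", c("abc", "jklmn"))
#> [1] 50
```

Tracking coverage per position, rather than summing peptide lengths, keeps overlapping or duplicated peptides from pushing the result above 100.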