R/calculate_aa_scores.R
calculate_aa_scores.Rd
Calculates a score for each amino acid position in a protein sequence. Each peptide is scored as the product of its -log10(adjusted p-value) and its absolute log2(fold change). All peptides are then aligned along the sequence of the corresponding protein, and the score of an amino acid position is the average score of the peptides covering it. In a limited proteolysis coupled to mass spectrometry (LiP-MS) experiment, the score helps to prioritize and narrow down structurally affected regions.
calculate_aa_scores(
  data,
  protein,
  diff = diff,
  adj_pval = adj_pval,
  start_position,
  end_position,
  retain_columns = NULL
)
a data frame containing at least the input columns.
a character column in the data frame containing the protein identifier or name.
a numeric column in the data frame containing the log2 fold change.
a numeric column in the data frame containing the adjusted p-value.
a numeric column in the data frame containing the start position of a peptide or precursor.
a numeric column in the data frame containing the end position of a peptide or precursor.
a vector indicating if certain columns should be retained from the input data frame. By default no additional columns are retained (retain_columns = NULL). Specific columns can be retained by providing their names in a vector (unquoted, just like other column names).
A data frame that contains the aggregated score per amino acid position, which can be used to draw a fingerprint for each individual protein.
data <- data.frame(
  pg_protein_accessions = c(rep("protein_1", 10)),
  diff = c(2, -3, 1, 2, 3, -3, 5, 1, -0.5, 2),
  adj_pval = c(0.001, 0.01, 0.2, 0.05, 0.002, 0.5, 0.4, 0.7, 0.001, 0.02),
  start = c(1, 3, 5, 10, 15, 25, 28, 30, 41, 51),
  end = c(6, 8, 10, 16, 23, 35, 35, 35, 48, 55)
)

calculate_aa_scores(
  data,
  protein = pg_protein_accessions,
  diff = diff,
  adj_pval = adj_pval,
  start_position = start,
  end_position = end
)
#> # A tibble: 47 × 3
#> # Groups:   pg_protein_accessions, residue [47]
#>    pg_protein_accessions residue amino_acid_score
#>    <chr>                   <int>            <dbl>
#>  1 protein_1                   1            6
#>  2 protein_1                   2            6
#>  3 protein_1                   3            6
#>  4 protein_1                   4            6
#>  5 protein_1                   5            4.23
#>  6 protein_1                   6            4.23
#>  7 protein_1                   7            3.35
#>  8 protein_1                   8            3.35
#>  9 protein_1                   9            0.699
#> 10 protein_1                  10            1.65
#> # ℹ 37 more rows
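
To make the scoring concrete, the scores above can be reproduced by hand in base R. This is only a sketch of the calculation as stated in the description (per-peptide score, then the mean over all peptides covering a residue), not the function's actual implementation. The names peptide_score, per_residue and manual_scores are illustrative and are not part of the package.

```r
# Same toy data as in the example above.
data <- data.frame(
  pg_protein_accessions = rep("protein_1", 10),
  diff = c(2, -3, 1, 2, 3, -3, 5, 1, -0.5, 2),
  adj_pval = c(0.001, 0.01, 0.2, 0.05, 0.002, 0.5, 0.4, 0.7, 0.001, 0.02),
  start = c(1, 3, 5, 10, 15, 25, 28, 30, 41, 51),
  end = c(6, 8, 10, 16, 23, 35, 35, 35, 48, 55)
)

# Per-peptide score: -log10(adjusted p-value) * |log2 fold change|.
peptide_score <- -log10(data$adj_pval) * abs(data$diff)

# Expand every peptide into one row per residue it covers.
n_residues <- data$end - data$start + 1
per_residue <- data.frame(
  protein = rep(data$pg_protein_accessions, times = n_residues),
  residue = unlist(Map(seq, data$start, data$end)),
  score = rep(peptide_score, times = n_residues)
)

# Average the scores of all peptides that cover the same residue.
manual_scores <- aggregate(
  score ~ protein + residue,
  data = per_residue,
  FUN = mean
)
manual_scores <- manual_scores[order(manual_scores$protein, manual_scores$residue), ]

head(manual_scores, 10)
```

Residue 5, for example, is covered by the first three peptides, whose scores are 6, 6 and about 0.699, so its score is their mean, about 4.23, which agrees with row 5 of the printed output. Residues that no peptide covers (24, 36 to 40, 49 and 50 in this data) do not appear in the result, which is why there are 47 rows for a span of 55 positions.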