R/calculate_protein_abundance.R
calculate_protein_abundance.Rd
Determines relative protein abundances from ion quantification. Only proteins with at least three peptides are considered for quantification. The three-peptide rule is applied to each sample independently.
calculate_protein_abundance(
  data,
  sample,
  protein_id,
  precursor,
  peptide,
  intensity_log2,
  method = "sum",
  for_plot = FALSE,
  retain_columns = NULL
)
data
a data frame that contains at least the input variables.
sample
a character column in the data data frame that contains the sample name.
protein_id
a character column in the data data frame that contains the protein accession numbers.
precursor
a character column in the data data frame that contains precursors.
peptide
a character column in the data data frame that contains peptide sequences. This column is needed to filter for proteins with at least three unique peptides. This can equate to more than three precursors. The quantification is done on the precursor level.
intensity_log2
a numeric column in the data data frame that contains log2 transformed precursor intensities.
method
a character value specifying the method with which protein quantities should be calculated. Possible options include "sum", which takes the sum of all precursor intensities as the protein abundance. Another option is "iq", which performs protein quantification based on a maximal peptide ratio extraction algorithm that is adapted from the MaxLFQ algorithm of the MaxQuant software. Functions from the iq package are used. Default is "sum".
for_plot
a logical value indicating whether the result should be only protein intensities or protein intensities together with precursor intensities that can be used for plotting using peptide_profile_plot(). Default is FALSE.
retain_columns
a vector indicating if certain columns should be retained from the input data frame. The default, retain_columns = NULL, retains no additional columns. Specific columns can be retained by providing their names (not in quotation marks, just like other column names, but in a vector).
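The "sum" method can be illustrated in plain R. The sketch below assumes that the log2 precursor intensities are back-transformed to the linear scale, summed, and log2 transformed again. This assumption is not taken from the package source, but it agrees with the example output on this page, where P1 in sample S1 has a protein intensity of 18.5. The variable names are chosen for illustration only.

```r
# log2 precursor intensities of protein P1 in sample S1,
# copied from the example data printed on this page
intensity_log2 <- c(15.55374, 16.64301, 14.61170, 17.42918, 13.15697, 12.58311)

# Assumption: sum on the linear scale, then return to the log2 scale
protein_intensity_log2 <- log2(sum(2^intensity_log2))

round(protein_intensity_log2, 1)
```

Summing the log2 values directly would give a much larger number (about 90), so the summation scale matters when comparing results with other tools.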
If for_plot = FALSE, protein abundances are returned. If for_plot = TRUE, precursor intensities are also returned in the same data frame. The latter output is ideal for plotting with peptide_profile_plot() and can be filtered to only include protein abundances.
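Filtering the for_plot = TRUE output down to protein abundances could look like the following sketch. It relies on the label "protein_intensity" that marks protein-level rows in the precursor column of the example output on this page. The small data frame is a hand-written stand-in for that output, not a real function result.

```r
# Hand-written stand-in for the for_plot = TRUE output (values rounded)
complete_abundances <- data.frame(
  sample = c("S1", "S2", "S1", "S1"),
  protein_id = "P1",
  intensity = c(18.5, 23.6, 15.6, 16.6),
  precursor = c("protein_intensity", "protein_intensity", "A1", "A2"),
  peptide = c(NA, NA, "A", "A")
)

# Keep only the protein-level rows and drop the precursor-specific columns
is_protein_row <- complete_abundances$precursor == "protein_intensity"
protein_only <- complete_abundances[is_protein_row, c("sample", "protein_id", "intensity")]

protein_only
```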
# \donttest{
# Create example data
data <- data.frame(
  sample = c(
    rep("S1", 6),
    rep("S2", 6),
    rep("S1", 2),
    rep("S2", 2)
  ),
  protein_id = c(
    rep("P1", 12),
    rep("P2", 4)
  ),
  precursor = c(
    rep(c("A1", "A2", "B1", "B2", "C1", "D1"), 2),
    rep(c("E1", "F1"), 2)
  ),
  peptide = c(
    rep(c("A", "A", "B", "B", "C", "D"), 2),
    rep(c("E", "F"), 2)
  ),
  intensity = c(
    rnorm(n = 6, mean = 15, sd = 2),
    rnorm(n = 6, mean = 21, sd = 1),
    rnorm(n = 2, mean = 15, sd = 1),
    rnorm(n = 2, mean = 15, sd = 2)
  )
)

data
#>    sample protein_id precursor peptide intensity
#> 1      S1         P1        A1       A  15.55374
#> 2      S1         P1        A2       A  16.64301
#> 3      S1         P1        B1       B  14.61170
#> 4      S1         P1        B2       B  17.42918
#> 5      S1         P1        C1       C  13.15697
#> 6      S1         P1        D1       D  12.58311
#> 7      S2         P1        A1       A  19.77101
#> 8      S2         P1        A2       A  21.74230
#> 9      S2         P1        B1       B  20.91708
#> 10     S2         P1        B2       B  21.78982
#> 11     S2         P1        C1       C  20.73229
#> 12     S2         P1        D1       D  20.40811
#> 13     S1         P2        E1       E  14.63165
#> 14     S1         P2        F1       F  13.14738
#> 15     S2         P2        E1       E  12.66077
#> 16     S2         P2        F1       F  12.11593

# Calculate protein abundances
protein_abundance <- calculate_protein_abundance(
  data,
  sample = sample,
  protein_id = protein_id,
  precursor = precursor,
  peptide = peptide,
  intensity_log2 = intensity,
  method = "sum",
  for_plot = FALSE
)

protein_abundance
#> # A tibble: 2 × 3
#>   sample protein_id intensity
#>   <chr>  <chr>          <dbl>
#> 1 S1     P1              18.5
#> 2 S2     P1              23.6

# Calculate protein abundances and retain precursor
# abundances that can be used in a peptide profile plot
complete_abundances <- calculate_protein_abundance(
  data,
  sample = sample,
  protein_id = protein_id,
  precursor = precursor,
  peptide = peptide,
  intensity_log2 = intensity,
  method = "sum",
  for_plot = TRUE
)

complete_abundances
#> # A tibble: 14 × 5
#>    sample protein_id intensity precursor         peptide
#>    <chr>  <chr>          <dbl> <chr>             <chr>
#>  1 S1     P1              18.5 protein_intensity NA
#>  2 S2     P1              23.6 protein_intensity NA
#>  3 S1     P1              15.6 A1                A
#>  4 S1     P1              16.6 A2                A
#>  5 S1     P1              14.6 B1                B
#>  6 S1     P1              17.4 B2                B
#>  7 S1     P1              13.2 C1                C
#>  8 S1     P1              12.6 D1                D
#>  9 S2     P1              19.8 A1                A
#> 10 S2     P1              21.7 A2                A
#> 11 S2     P1              20.9 B1                B
#> 12 S2     P1              21.8 B2                B
#> 13 S2     P1              20.7 C1                C
#> 14 S2     P1              20.4 D1                D
# }
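Protein P2 is absent from both outputs above because it has only two peptides (E and F) in each sample, so it does not pass the three-peptide rule. The rule itself can be sketched in base R by counting distinct peptides per sample and protein. This is an illustration of the rule as described on this page, not the package's internal code, and the object names are hypothetical.

```r
# Sample, protein, and peptide columns of the example data
peptides <- data.frame(
  sample = c(rep("S1", 6), rep("S2", 6), rep("S1", 2), rep("S2", 2)),
  protein_id = c(rep("P1", 12), rep("P2", 4)),
  peptide = c(rep(c("A", "A", "B", "B", "C", "D"), 2), rep(c("E", "F"), 2))
)

# Several precursors can map to one peptide, so deduplicate first
distinct_peptides <- unique(peptides)

# Count distinct peptides per sample and protein
counts <- as.data.frame(
  table(sample = distinct_peptides$sample, protein_id = distinct_peptides$protein_id),
  responseName = "n_peptides",
  stringsAsFactors = FALSE
)

# A protein is quantified in a sample only with at least three peptides
counts$quantified <- counts$n_peptides >= 3

counts
```

P1 has four distinct peptides (A, B, C, D) in each sample and is kept, while P2 has two and is dropped in both samples.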