The Relative Log Abundance (RLA) plot visualizes the distributions of standardized feature abundances across samples. RLA standardization subtracts either the within-batch or the across-batch median from each feature's log-transformed abundance. These plots are effective for identifying systematic technical variation, such as batch effects, instrument drift, or sample handling inconsistencies, because they provide a robust representation that is less susceptible to global intensity shifts.
The function also incorporates optional outlier detection and visualization functionalities to identify anomalous samples based on their median RLA values.
This function returns a list with the ggplot object representing the RLA plot and a table with detected outliers (if outlier_detection = TRUE).
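The standardization described above can be sketched in base R. This is a minimal illustration on made-up data, not the function's internal code: the column names, the toy intensities, and the choice of log2 are assumptions made for the example.

```r
# Toy long-format data: 2 features measured in 6 samples from 2 batches.
# All names and values are illustrative assumptions.
d <- data.frame(
  sample_id  = rep(paste0("S", 1:6), times = 2),
  batch_id   = rep(c("B1", "B1", "B1", "B2", "B2", "B2"), times = 2),
  feature_id = rep(c("F1", "F2"), each = 6),
  intensity  = c(100, 200, 400, 800, 1600, 3200,
                 10, 10, 40, 20, 20, 80)
)

# Log-transform the abundances (log2 is an assumption of this sketch)
d$log_int <- log2(d$intensity)

# Across-batch RLA: subtract each feature's median over all samples
d$rla_across <- d$log_int - ave(d$log_int, d$feature_id, FUN = median)

# Within-batch RLA: subtract each feature's median within each batch
d$rla_within <- d$log_int -
  ave(d$log_int, d$feature_id, d$batch_id, FUN = median)

# Per-sample median RLA, i.e. the center of each sample's box
sample_median <- tapply(d$rla_within, d$sample_id, median)
```

In this toy data the second batch has systematically higher intensities. The across-batch values keep that shift visible, while the within-batch values remove it, which is why the choice of rla_type_batch changes what the plot reveals.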
Usage
plot_rla_boxplot(
  data = NULL,
  rla_type_batch,
  variable,
  qc_types = NA,
  plot_range = NA,
  rla_limit_to_range = FALSE,
  remove_gaps = FALSE,
  filter_data = FALSE,
  include_qualifier = TRUE,
  include_istd = TRUE,
  include_feature_filter = NA,
  exclude_feature_filter = NA,
  show_timestamp = FALSE,
  min_feature_intensity = 0,
  y_lim = NA,
  outlier_detection = TRUE,
  outlier_exclude = FALSE,
  outlier_method = "mad",
  outlier_qctypes = c("SPL", "TQC", "BQC", "LTR", "NIST"),
  outlier_k = NULL,
  show_batches = TRUE,
  batch_zebra_stripe = FALSE,
  batch_line_color = "#b6f0c5",
  batch_fill_color = "grey93",
  x_gridlines = FALSE,
  linewidth = 0.2,
  base_font_size = 8,
  relative_log_abundances = TRUE,
  show_plot = TRUE
)
Arguments
- data
An MRMhubExperiment object
- rla_type_batch
Character, must be either "within" or "across", defining whether to use within-batch or across-batch RLA
- variable
Variable to plot, must be one of "intensity", "norm_intensity", "conc", "area", "height", "fwhm", or one of "intensity_raw", "intensity_before", "norm_intensity_raw", "norm_intensity_before", "conc_raw", "conc_before"
- qc_types
QC types to be plotted. Can be a vector of QC types or a regular expression pattern. NA (default) displays all available QC/Sample types.
- plot_range
Numeric vector of length 2, specifying the start and end indices of the analysis order to be plotted. NA plots all samples.
- rla_limit_to_range
Logical, whether to limit the RLA values to the specified plot_range. Default is FALSE, which means RLA values are calculated for all samples.
- remove_gaps
Logical, whether to remove gaps in the x-axis that arise from QC types that were not selected. Default is FALSE.
- filter_data
Logical, whether to use QC-filtered data based on criteria set via filter_features_qc().
- include_qualifier
Logical, whether to include qualifier features. Default is TRUE.
- include_istd
Logical, whether to include internal standard (ISTD) features. Default is TRUE.
- include_feature_filter
A regex pattern or a vector of feature names used to select features by feature_id. If NA or an empty string ("") is provided, the filter is ignored. When a vector of length > 1 is supplied, only features with exactly these names are selected (applied individually as OR conditions).
- exclude_feature_filter
A regex pattern or a vector of feature names used to exclude features by feature_id. If NA or an empty string ("") is provided, the filter is ignored. When a vector of length > 1 is supplied, only features with exactly these names are excluded (applied individually as OR conditions).
- show_timestamp
Logical, whether to use the acquisition timestamp as the x-axis instead of the run sequence number
- min_feature_intensity
Numeric, exclude features with overall median signal below this value
- y_lim
Numeric vector of length 2, specifying the lower and upper y-axis limits. Default is NA, which uses limits calculated based on outlier_exclude.
- outlier_detection
Logical, whether to show outlier fences on the plot and return a table with detected outliers based on the method defined by outlier_method.
- outlier_exclude
Logical, whether to exclude outlier values from the plot. Default is FALSE, which means outliers are shown.
- outlier_method
Character, method used for outlier detection. Default is "mad" (median absolute deviation). Other possible values are "iqr", "sd", "z_normal", "z_robust", "quantile", and "fold". See get_outlier_bounds() for details.
- outlier_qctypes
Character vector, QC types to use for outlier detection. Default is c("SPL", "TQC", "BQC", "LTR", "NIST").
- outlier_k
Numeric, multiplier for the outlier detection method. Default is NULL, which uses the default value for the selected method. See get_outlier_bounds() for details. When using the "fold" method, either a single numeric value or a vector with two values (lower and upper fences) can be supplied.
- show_batches
Logical, whether to show batch separators in the plot
- batch_zebra_stripe
Logical, whether to show batches as shaded areas instead of line separators
- batch_line_color
Character, color of the batch separator lines
- batch_fill_color
Character, color of the batch shaded areas
- x_gridlines
Logical, whether to show major x-axis gridlines
- linewidth
Numeric, line width used for whiskers of the boxplot
- base_font_size
Numeric, base font size for the plot
- relative_log_abundances
Logical, whether to use relative log abundances (RLA) or just log-transformed values
- show_plot
Logical, whether to display the plot. Default is TRUE.
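The matching rules described for include_feature_filter and exclude_feature_filter can be sketched as follows. The helpers is_unset(), matches(), and select_features() are hypothetical names written for this illustration and are not part of the package. The sketch only mirrors the documented behavior: NA or "" disables a filter, a single string is treated as a regex, and a vector of length > 1 is matched by exact name.

```r
# Hypothetical helpers illustrating the documented filter semantics

# TRUE when the filter is NA or an empty string, i.e. should be ignored
is_unset <- function(f) length(f) == 1 && (is.na(f) || identical(f, ""))

# Vector of length > 1: exact name matches (OR); single string: regex
matches <- function(ids, f) {
  if (length(f) > 1) ids %in% f else grepl(f, ids)
}

select_features <- function(ids, include = NA, exclude = NA) {
  keep <- rep(TRUE, length(ids))
  if (!is_unset(include)) keep <- keep & matches(ids, include)
  if (!is_unset(exclude)) keep <- keep & !matches(ids, exclude)
  ids[keep]
}

# Made-up feature names for demonstration
ids <- c("PC 34:1", "PC 36:2", "Cer d18:1/16:0", "ISTD PC 33:1")

by_regex <- select_features(ids, include = "^PC")
by_names <- select_features(ids, include = c("PC 34:1", "Cer d18:1/16:0"))
no_istd  <- select_features(ids, exclude = "ISTD")
```

Note that a regex such as "PC" without the anchor would also match "ISTD PC 33:1", so anchoring the pattern or passing exact names is the safer choice when feature names share substrings.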
Value
A list with the ggplot object representing the RLA plot and a table with detected outliers if outlier_detection = TRUE.
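To make the outlier table more concrete, the following sketch flags samples whose median RLA falls outside median ± k × MAD fences. It is an illustration of the general "mad" idea only and is not the implementation of get_outlier_bounds(). The values, the column names of the resulting table, and the multiplier k = 3 are assumptions of this example; the package's actual default for outlier_k is not stated here.

```r
# Made-up per-sample median RLA values
med_rla <- c(S1 = 0.02, S2 = -0.05, S3 = 0.10,
             S4 = -0.08, S5 = 1.40, S6 = 0.04)

k <- 3  # assumed multiplier, chosen for illustration only

center <- median(med_rla)
spread <- mad(med_rla)  # stats::mad, scaled by 1.4826 by default

# Lower and upper fences around the center
lower <- center - k * spread
upper <- center + k * spread

# Table of samples with an outlier flag (column names are hypothetical)
outliers <- data.frame(
  sample_id  = names(med_rla),
  median_rla = unname(med_rla),
  is_outlier = unname(med_rla < lower | med_rla > upper)
)

flagged <- outliers[outliers$is_outlier, ]
```

Because both the center and the spread are median-based, a single extreme sample such as S5 barely affects the fences, which is the usual reason to prefer this approach over mean and standard deviation.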
References
De Livera et al. (2012) Normalizing and integrating metabolomics data. Analytical Chemistry 84(24):10768-10776 DOI: 10.1021/ac302748b
De Livera et al. (2015) Statistical Methods for Handling Unwanted Variation in Metabolomics Data. Analytical Chemistry 87(7):3606-3615 DOI: 10.1021/ac502439y