Relative Log Abundance (RLA) Plot — plot_rla

The Relative Log Abundance (RLA) plot visualizes the distributions of standardized feature abundances across samples. RLA standardization subtracts either the within-batch or the across-batch median from each feature's log-transformed abundance. These plots are effective for identifying systematic technical variation, such as batch effects, instrument drift, or sample handling inconsistencies, because the standardized values give a robust representation that is less susceptible to global intensity shifts.
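The standardization described above can be sketched in base R. This is a minimal illustration on toy data, not the package's implementation: the column names (feature_id, analysis_id, batch_id, intensity) and the choice of log2 are assumptions made for the example.

```r
# Toy long-format data: one row per feature/sample pair (hypothetical columns)
d <- data.frame(
  feature_id  = rep(c("F1", "F2"), each = 4),
  analysis_id = rep(c("S1", "S2", "S3", "S4"), times = 2),
  batch_id    = rep(c("B1", "B1", "B2", "B2"), times = 2),
  intensity   = c(100, 200, 400, 800, 10, 10, 40, 40)
)

# Log-transform the abundances (log2 is an assumption for this sketch)
d$log_val <- log2(d$intensity)

# Across-batch RLA: subtract each feature's median over all samples
d$rla_across <- d$log_val - ave(d$log_val, d$feature_id, FUN = median)

# Within-batch RLA: subtract each feature's median within its own batch
d$rla_within <- d$log_val -
  ave(d$log_val, d$feature_id, d$batch_id, FUN = median)

# Per-sample median RLA, i.e. the center line of each sample's box
sample_median <- tapply(d$rla_across, d$analysis_id, median)
```

In this toy data the second batch has systematically higher intensities, so the across-batch values shift from negative to positive between batches, while the within-batch values are centered on zero in both batches.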

The function also incorporates optional outlier detection and visualization functionalities to identify anomalous samples based on their median RLA values.
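As a rough illustration of flagging samples by their median RLA, the following base R sketch applies MAD-based fences. The fence formula (median plus or minus k times the scaled MAD), the value k = 3, and the sample values are assumptions for the example; the actual rules are defined by get_outlier_bounds() and may differ.

```r
# Hypothetical per-sample median RLA values (illustrative, not real data)
median_rla <- c(S1 = 0.02, S2 = -0.05, S3 = 0.01,
                S4 = 0.04, S5 = -0.03, S6 = 1.20)

k <- 3  # assumed multiplier; the package default may differ

center <- median(median_rla)
spread <- mad(median_rla)  # stats::mad scales by 1.4826 by default

# Lower and upper outlier fences around the center
fences <- c(lower = center - k * spread, upper = center + k * spread)

# Samples whose median RLA falls outside the fences
is_outlier <- median_rla < fences["lower"] | median_rla > fences["upper"]
outliers <- names(median_rla)[is_outlier]
```

Because both the center and the spread are median-based, a single extreme sample such as S6 barely moves the fences, which is the usual motivation for a MAD-based rule over a mean and standard deviation rule.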

This function returns a list with the ggplot object representing the RLA plot and a table with detected outliers (if outlier_detection = TRUE).

Usage

plot_rla_boxplot(
  data = NULL,
  rla_type_batch,
  variable,
  qc_types = NA,
  plot_range = NA,
  rla_limit_to_range = FALSE,
  remove_gaps = FALSE,
  filter_data = FALSE,
  include_qualifier = TRUE,
  include_istd = TRUE,
  include_feature_filter = NA,
  exclude_feature_filter = NA,
  show_timestamp = FALSE,
  min_feature_intensity = 0,
  y_lim = NA,
  outlier_detection = TRUE,
  outlier_exclude = FALSE,
  outlier_method = "mad",
  outlier_qctypes = c("SPL", "TQC", "BQC", "LTR", "NIST"),
  outlier_k = NULL,
  show_batches = TRUE,
  batch_zebra_stripe = FALSE,
  batch_line_color = "#b6f0c5",
  batch_fill_color = "grey93",
  x_gridlines = FALSE,
  linewidth = 0.2,
  base_font_size = 8,
  relative_log_abundances = TRUE,
  show_plot = TRUE
)

Arguments

data: An MRMhubExperiment object
rla_type_batch: Character, must be either "within" or "across", defining whether to use within-batch or across-batch RLA
variable: Variable to plot, must be one of "intensity", "norm_intensity", "conc", "area", "height", "fwhm", or one of "intensity_raw", "intensity_before", "norm_intensity_raw", "norm_intensity_before", "conc_raw", "conc_before"
qc_types: QC types to be plotted. Can be a vector of QC types or a regular expression pattern. NA (default) displays all available QC/Sample types.
plot_range: Numeric vector of length 2, specifying the start and end indices of the analysis order to be plotted. NA plots all samples.
rla_limit_to_range: Logical, whether to limit the RLA values to the specified plot_range. Default is FALSE, which means RLA values are calculated for all samples.
remove_gaps: Logical, whether to remove gaps in the x-axis caused by QC types that were not selected. Default is FALSE.
filter_data: Logical, whether to use QC-filtered data based on criteria set via filter_features_qc().
include_qualifier: Logical, whether to include qualifier features. Default is TRUE.
include_istd: Logical, whether to include internal standard (ISTD) features. Default is TRUE.
include_feature_filter: A regex pattern or a vector of feature names used to filter features by feature_id. If NA or an empty string ("") is provided, the filter is ignored. When a vector of length > 1 is supplied, only features with exactly these names are selected (applied individually as OR conditions).
exclude_feature_filter: A regex pattern or a vector of feature names to exclude features by feature_id. If NA or an empty string ("") is provided, the filter is ignored. When a vector of length > 1 is supplied, only features with exactly these names are excluded (applied individually as OR conditions).
show_timestamp: Logical, whether to use the acquisition timestamp as the x-axis instead of the run sequence number
min_feature_intensity: Numeric, exclude features with overall median signal below this value
y_lim: Numeric vector of length 2, specifying the lower and upper y-axis limits. Default is NA, which uses limits calculated based on outlier_exclude.
outlier_detection: Logical, whether to show outlier fences on the plot and return a table with detected outliers based on the method defined by outlier_method.
outlier_exclude: Logical, whether to exclude outlier values from the plot. Default is FALSE, which means outliers are shown.
outlier_method: Character, method used for outlier detection. Default is "mad" (median absolute deviation). Other possible values are "iqr", "sd", "z_normal", "z_robust", "quantile", and "fold". See get_outlier_bounds() for details.
outlier_qctypes: Character vector, QC types to use for outlier detection. Default is c("SPL", "TQC", "BQC", "LTR", "NIST").
outlier_k: Numeric, multiplier for the outlier detection method. Default is NULL, which uses the default value for the selected method. See get_outlier_bounds() for details. When using the "fold" method, either a single numeric value or a vector with two values (lower and upper fences) can be supplied.
show_batches: Logical, whether to show batch separators in the plot
batch_zebra_stripe: Logical, whether to show batches as shaded areas instead of line separators
batch_line_color: Character, color of the batch separator lines
batch_fill_color: Character, color of the batch shaded areas
x_gridlines: Logical, whether to show major x-axis gridlines
linewidth: Numeric, line width used for whiskers of the boxplot
base_font_size: Numeric, base font size for the plot
relative_log_abundances: Logical, whether to use relative log abundances (RLA) or just log-transformed values
show_plot: Logical, whether to display the plot. Default is TRUE.
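
The semantics documented for include_feature_filter and exclude_feature_filter (a single string is treated as a regex, a longer vector as exact names combined with OR, and NA or "" disables the filter) can be sketched in base R as follows. The helper names is_unset, matches, and filter_features are hypothetical and exist only for this illustration; they are not part of the package.

```r
# TRUE when a filter should be ignored (NA, empty string, or empty vector)
is_unset <- function(filter) {
  length(filter) == 0 || all(is.na(filter)) || identical(filter, "")
}

# A vector of length > 1 means exact names (OR); a single string is a regex
matches <- function(ids, filter) {
  if (length(filter) > 1) ids %in% filter else grepl(filter, ids)
}

# Hypothetical helper combining an include and an exclude filter
filter_features <- function(ids, include = NA, exclude = NA) {
  keep <- rep(TRUE, length(ids))
  if (!is_unset(include)) keep <- keep & matches(ids, include)
  if (!is_unset(exclude)) keep <- keep & !matches(ids, exclude)
  ids[keep]
}

ids <- c("PC 34:1", "PC 36:2", "TG 52:2", "Cer d18:1/16:0")

by_regex <- filter_features(ids, include = "^PC")
by_names <- filter_features(ids, include = c("PC 34:1", "TG 52:2"))
no_cer   <- filter_features(ids, exclude = "^Cer")
combined <- filter_features(ids, include = "^PC",
                            exclude = c("PC 36:2", "TG 52:2"))
```

Here by_regex keeps both PC species, by_names keeps exactly the two listed features, no_cer drops the ceramide, and combined keeps only "PC 34:1".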

Value

A list with the ggplot object representing the RLA plot and a table with detected outliers if outlier_detection = TRUE.

References

De Livera et al. (2012) Normalizing and integrating metabolomics data. Analytical Chemistry 84(24):10768-10776 DOI: 10.1021/ac302748b
De Livera et al. (2015) Statistical Methods for Handling Unwanted Variation in Metabolomics Data. Analytical Chemistry 87(7):3606-3615 DOI: 10.1021/ac502439y