Drift Correction by Gaussian Kernel Smoothing — corr_drift

Function to correct for run-order drifts within or across batches using Gaussian kernel smoothing (see Teo et al. (2020)). Smoothing is typically based on the study samples. To avoid local biases and artefacts, this function should only be applied to analyses with a sufficient number of well-randomized samples.
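
The idea can be illustrated with a minimal base R sketch. It is not the midar implementation: it assumes simulated data, uses stats::ksmooth() with a Gaussian ("normal") kernel as a stand-in smoother, and the bandwidth of 20 and the re-centring on the median are arbitrary illustrative choices.

```r
# Illustrative sketch only; this is NOT how midar implements the correction.
set.seed(1)
run_order <- 1:100

# Hypothetical simulated intensities with a slow multiplicative drift
drift <- 1 + 0.005 * run_order
intensity <- 1000 * drift * exp(rnorm(100, sd = 0.05))

# Smooth on the log2 scale with a Gaussian ("normal") kernel
y_log <- log2(intensity)
fit <- stats::ksmooth(run_order, y_log, kernel = "normal",
                      bandwidth = 20, x.points = run_order)

# Subtract the fitted trend, re-centre on the median, and back-transform
# so that the corrected values are on the original (non-log) scale
corrected <- 2^(y_log - fit$y + median(y_log))

# Compare the coefficient of variation before and after correction
cv <- function(x) sd(x) / mean(x)
cv_before <- cv(intensity)
cv_after <- cv(corrected)
```

In this simulation the CV after correction is lower than before, because the run-order trend has been removed while the random scatter is retained.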

Usage

corr_drift_gaussiankernel(
  data,
  qc_types,
  bandwidth,
  log2_transform = TRUE,
  within_batch,
  apply_conditionally,
  apply_conditionally_per_batch = TRUE,
  feature_list = NULL,
  max_cv_ratio_before_after = 1,
  use_uncorrected_if_fit_fails = FALSE
)

Arguments

data: MidarExperiment object
qc_types: QC types used for drift correction. Typically includes the study samples (SPL).
bandwidth: Kernel bandwidth
log2_transform: Log-transform the data for correction when TRUE (the default). Note: the log transformation is applied only internally for smoothing; the results will not be log-transformed. Log transformation may result in more robust smoothing that is less sensitive to outliers.
within_batch: Apply to each batch separately if TRUE (the default)
apply_conditionally: Apply drift correction to all species if FALSE, or, if TRUE, only when the change in sample CV after smoothing does not exceed the threshold defined via max_cv_ratio_before_after
apply_conditionally_per_batch: When apply_conditionally = TRUE, correction is conditionally applied per batch when TRUE and across all batches when FALSE
feature_list: Regular expression used to subset the features for correction; only features whose names match are corrected. Default is NULL, which means all features are selected.
max_cv_ratio_before_after: Only used when apply_conditionally = TRUE. Maximum allowed ratio of the sample CV after smoothing to the sample CV before smoothing for the correction to be applied. A value of 1 (the default) means that the CV must improve or remain unchanged after smoothing for the correction to be applied. A value < 1 means that the CV must improve; a value of, e.g., 1.20 means that the CV may improve or get worse by at most 1.20-fold after smoothing.
use_uncorrected_if_fit_fails: If the smoothing function fails for a species, use the original (uncorrected) data when TRUE, or return NA for all analyses of the feature where the fit failed when FALSE (the default, as shown under Usage).
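
The conditional rule described for apply_conditionally and max_cv_ratio_before_after can be sketched as follows. The helper keep_correction() and the example vectors are hypothetical and not part of midar, and the sketch assumes that the ratio is computed as the CV after smoothing divided by the CV before smoothing.

```r
# Hypothetical helper illustrating the decision rule; not part of midar.
cv <- function(x) sd(x, na.rm = TRUE) / mean(x, na.rm = TRUE)

keep_correction <- function(raw, corrected, max_cv_ratio_before_after = 1) {
  # Assumption: ratio = CV after smoothing / CV before smoothing
  ratio <- cv(corrected) / cv(raw)
  ratio <= max_cv_ratio_before_after
}

# Made-up values for one feature before and after a correction
raw      <- c(100, 110, 120, 130, 140)
improved <- c(118, 121, 120, 119, 122)  # lower CV than raw
worse    <- c(80, 110, 120, 130, 170)   # roughly twice the CV of raw

keep_improved    <- keep_correction(raw, improved)    # CV decreased: kept
keep_worse       <- keep_correction(raw, worse)       # CV increased: rejected
keep_worse_loose <- keep_correction(raw, worse, 3)    # tolerated up to 3-fold
keep_worse_120   <- keep_correction(raw, worse, 1.2)  # exceeds 1.2-fold: rejected
```

With the default threshold of 1, only a correction that leaves the CV unchanged or lower is retained; raising the threshold tolerates a correspondingly larger increase in CV.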

Value

MidarExperiment object

References

Teo G, Chew WS, Burla B, Herr D, Tai ES, Wenk MR, Torta F, & Choi H (2020). MRMkit: Automated Data Processing for Large-Scale Targeted Metabolomics Analysis. Analytical Chemistry, 92(20), 13677–13682. https://doi.org/10.1021/acs.analchem.0c03060