Corrects run-order drifts within or across batches using Gaussian kernel smoothing (see Teo et al. (2020)). The smoothing is typically based on the study samples. To avoid local biases and artefacts, this function should only be applied to analyses with a sufficient number of well-randomized samples.

Usage

corr_drift_gaussiankernel(
  data,
  qc_types,
  bandwidth,
  log2_transform = TRUE,
  within_batch,
  apply_conditionally,
  apply_conditionally_per_batch = TRUE,
  feature_list = NULL,
  max_cv_ratio_before_after = 1,
  use_uncorrected_if_fit_fails = FALSE
)

Arguments

data

MidarExperiment object

qc_types

QC types used for drift correction. Typically includes the study samples (SPL).

bandwidth

Kernel bandwidth

log2_transform

Log2-transform the data for correction when TRUE (the default). Note: the log transformation is applied only internally for smoothing; the results are not log-transformed. Log transformation may make the smoothing more robust and less sensitive to outliers.
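
As a rough illustration of the idea described above, the following base R sketch smooths a simulated signal over run order with a Gaussian kernel on the log2 scale, removes the fitted trend, and returns values on the original scale. It is a minimal sketch, not the package's implementation: the helper names (gaussian_kernel_smooth, correct_drift), the re-centring on the median, and the simulated data are all assumptions made for this example.

```r
# Nadaraya-Watson smoother with a Gaussian kernel
# (illustrative helper, not part of the documented package).
gaussian_kernel_smooth <- function(x, y, bandwidth) {
  vapply(x, function(x0) {
    w <- dnorm((x - x0) / bandwidth)  # kernel weights centred on x0
    sum(w * y) / sum(w)               # weighted local mean
  }, numeric(1))
}

# Drift correction sketch: smooth (optionally on the log2 scale),
# remove the trend, re-centre on the median (an assumption), and
# return values on the original scale.
correct_drift <- function(run_order, intensity, bandwidth,
                          log2_transform = TRUE) {
  y <- if (log2_transform) log2(intensity) else intensity
  trend <- gaussian_kernel_smooth(run_order, y, bandwidth)
  if (log2_transform) {
    2^(y - trend + median(y))         # back-transform after correction
  } else {
    intensity / trend * median(intensity)
  }
}

# Simulated data: a downward drift over 60 injections plus random noise.
set.seed(1)
run_order <- 1:60
intensity <- 1000 * exp(-0.01 * run_order) * exp(rnorm(60, sd = 0.05))

corrected <- correct_drift(run_order, intensity, bandwidth = 10)

# Compare the coefficient of variation before and after correction.
cv <- function(v) sd(v) / mean(v)
c(before = cv(intensity), after = cv(corrected))
```

Because the trend is subtracted on the log2 scale and the result is exponentiated again, the corrected values stay on the original intensity scale, in line with the note above.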

within_batch

Apply to each batch separately if TRUE (the default)

apply_conditionally

Apply drift correction to all species if FALSE. If TRUE, apply it only when the sample CV after smoothing, relative to the CV before smoothing, does not exceed the threshold defined via max_cv_ratio_before_after

apply_conditionally_per_batch

When apply_conditionally = TRUE, correction is conditionally applied per batch when TRUE and across all batches when FALSE

feature_list

Restrict the correction to features whose names match the specified regular expression. The default is NULL, which means all features are selected.
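
The following sketch shows how a regular expression can select features by name in base R. The feature names and the helper select_features are hypothetical and only illustrate the matching logic; the function's internal matching may differ in detail.

```r
# Hypothetical feature names, for illustration only.
features <- c("PC 34:1", "PC 36:2", "TG 52:2", "Cer d18:1/24:0")

# Illustrative helper: NULL selects everything, otherwise match by regex.
select_features <- function(features, feature_list = NULL) {
  if (is.null(feature_list)) {
    return(features)
  }
  features[grepl(feature_list, features)]
}

all_features <- select_features(features)          # NULL: all features
pc_features  <- select_features(features, "^PC")   # names starting with "PC"
```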

max_cv_ratio_before_after

Only used when apply_conditionally = TRUE. Maximum allowed ratio of the sample CV after smoothing to the sample CV before smoothing for the correction to be applied. A value of 1 (the default) means the CV must improve or remain unchanged after smoothing for the correction to be applied. A value < 1 means the CV must improve; a value of, e.g., 1.20 means the CV may improve or worsen by at most 1.20-fold after smoothing.
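
The decision rule can be sketched as follows. This assumes the ratio is computed as CV after smoothing divided by CV before smoothing, which is how the 1.20 example above reads; the helper apply_if_cv_ok and the numbers are hypothetical.

```r
cv <- function(v) sd(v, na.rm = TRUE) / mean(v, na.rm = TRUE)

# Hypothetical helper: keep the corrected values only if the CV ratio
# (after / before, an assumption) does not exceed the allowed maximum.
apply_if_cv_ok <- function(raw, corrected, max_cv_ratio_before_after = 1) {
  ratio <- cv(corrected) / cv(raw)
  if (is.finite(ratio) && ratio <= max_cv_ratio_before_after) corrected else raw
}

raw      <- c(100, 110, 120, 130, 140)
improved <- c(118, 121, 120, 119, 122)   # lower CV than raw
worse    <- c(80, 115, 120, 160, 125)    # higher CV than raw

kept_improved <- apply_if_cv_ok(raw, improved)   # correction is applied
kept_raw      <- apply_if_cv_ok(raw, worse)      # correction is rejected
kept_worse    <- apply_if_cv_ok(raw, worse, max_cv_ratio_before_after = 2)
```

With the default threshold of 1, a correction that increases the CV is discarded; raising the threshold to 2 tolerates up to a twofold increase, so the third call keeps the corrected values.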

use_uncorrected_if_fit_fails

If the smoothing function fails for a species, use the original (uncorrected) data when TRUE, or return NA for all analyses of the feature where the fit failed when FALSE (the default).
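
The two fallback behaviours can be sketched with tryCatch. The helper smooth_or_fallback and the deliberately failing fit function are hypothetical and are not taken from the package.

```r
# Hypothetical helper: run a fit function and choose the fallback on error.
smooth_or_fallback <- function(values, fit_fun,
                               use_uncorrected_if_fit_fails = FALSE) {
  tryCatch(
    fit_fun(values),
    error = function(e) {
      if (use_uncorrected_if_fit_fails) {
        values                            # keep original, uncorrected data
      } else {
        rep(NA_real_, length(values))     # mark all analyses as missing
      }
    }
  )
}

# A stand-in fit function that always fails, for demonstration.
failing_fit <- function(v) stop("fit failed")

kept    <- smooth_or_fallback(c(1, 2, 3), failing_fit, TRUE)
missing <- smooth_or_fallback(c(1, 2, 3), failing_fit, FALSE)
```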

Value

MidarExperiment object
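
A call might look like the sketch below. It wraps the documented signature in a hypothetical helper (run_drift_correction) so that the argument choices are explicit. It assumes that mexp is an existing MidarExperiment object and that corr_drift_gaussiankernel() is available from the attached package. The argument values shown are illustrative, not recommendations.

```r
# Hypothetical wrapper: `mexp` must be an existing MidarExperiment object,
# and corr_drift_gaussiankernel() must be available from the attached package.
run_drift_correction <- function(mexp, bandwidth) {
  corr_drift_gaussiankernel(
    data = mexp,
    qc_types = "SPL",                     # smooth based on study samples
    bandwidth = bandwidth,                # must be chosen for the data set
    log2_transform = TRUE,
    within_batch = TRUE,                  # correct each batch separately
    apply_conditionally = TRUE,           # only keep corrections that pass the CV check
    apply_conditionally_per_batch = TRUE,
    feature_list = NULL,                  # all features
    max_cv_ratio_before_after = 1,
    use_uncorrected_if_fit_fails = FALSE
  )
}
```

The function returns a MidarExperiment object, so the result would typically be assigned back, for example to mexp.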

References

Teo G, Chew WS, Burla B, Herr D, Tai ES, Wenk MR, Torta F, & Choi H (2020). MRMkit: Automated Data Processing for Large-Scale Targeted Metabolomics Analysis. Analytical Chemistry, 92(20), 13677–13682. https://doi.org/10.1021/acs.analchem.0c03060