Corrects run-order drift within or across batches using Gaussian kernel smoothing (see Teo et al. (2020)). The smoothing is typically based on the study samples. To avoid local biases and artefacts, this function should only be applied to analyses with a sufficient number of well-randomized samples.
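The idea of kernel-based drift correction can be illustrated with a minimal base R sketch on simulated data. This is not the package's implementation: it uses stats::ksmooth() with a Gaussian ("normal") kernel as a stand-in, and the simulated signal, the bandwidth value, and the rescaling to the median are assumptions made for illustration only.

```r
# Illustrative sketch only, not the code used by correct_drift_gaussiankernel().
set.seed(1)
run_order <- 1:200
drift     <- 1 + 0.004 * run_order                    # simulated multiplicative drift
signal    <- 1000 * drift * exp(rnorm(200, sd = 0.1)) # simulated feature signal

# Estimate the run-order trend with a Gaussian kernel smoother.
# `bandwidth` is the stats::ksmooth() parameter; it is not claimed to be
# equivalent to the `kernel_size` argument documented here.
trend <- stats::ksmooth(run_order, signal, kernel = "normal",
                        bandwidth = 50, x.points = run_order)$y

# Divide out the trend and rescale to the overall median signal.
corrected <- signal / trend * median(signal)

# Compare the coefficient of variation before and after correction.
cv <- function(x) sd(x) / mean(x)
c(cv_before = cv(signal), cv_after = cv(corrected))
```

Because the trend is estimated from the samples themselves, poorly randomized runs would let biological differences leak into the trend, which is why sufficient sample numbers and randomization matter.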

Usage

correct_drift_gaussiankernel(
  data = NULL,
  variable,
  reference_qc_types,
  within_batch = TRUE,
  replace_previous = TRUE,
  kernel_size,
  outlier_filter = FALSE,
  outlier_ksd = 5,
  location_smooth = TRUE,
  scale_smooth = FALSE,
  log_transform_internal = TRUE,
  recalc_trend_after = TRUE,
  conditional_correction = FALSE,
  feature_list = NULL,
  ignore_istd = TRUE,
  cv_ratio_threshold = 1,
  use_original_if_fail = FALSE,
  show_progress = TRUE
)

Arguments

data

MidarExperiment object

variable

The variable to be corrected for drift effects. Must be one of "intensity", "norm_intensity", or "conc"

reference_qc_types

QC types used for drift correction. Typically includes the study samples (SPL).

within_batch

Apply to each batch separately if TRUE (the default), or across all batches if FALSE.

replace_previous

Logical. If TRUE (the default), replaces any previous correction; if FALSE, adds the correction on top of the previous one.

kernel_size

Kernel bandwidth

outlier_filter

Apply the kernel outlier filter if TRUE. Default is FALSE.

outlier_ksd

Threshold of the kernel outlier filter, given as k times the standard deviation of the data distribution. Default is 5.

location_smooth

Smooth the location parameter if TRUE (the default).

scale_smooth

Smooth the scale parameter if TRUE. Default is FALSE.

log_transform_internal

Apply log transformation internally for smoothing if TRUE (default). This enhances robustness against outliers but does not affect the final data, which remains untransformed.
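A minimal sketch of why smoothing in log space is more robust, assuming simulated data with one extreme value. The smoother (stats::ksmooth()), the bandwidth, and the re-centring on the median are illustrative choices, not the package's internals.

```r
# Illustrative sketch only: smoothing on the log scale, then back-transforming.
set.seed(2)
run_order  <- 1:100
signal     <- 500 * exp(0.005 * run_order + rnorm(100, sd = 0.1))
signal[50] <- signal[50] * 20          # one extreme value

# Trend estimated on the raw scale is pulled strongly towards the extreme value.
raw_trend <- stats::ksmooth(run_order, signal, kernel = "normal",
                            bandwidth = 30, x.points = run_order)$y

# Trend estimated on the log scale is affected far less.
log_signal <- log(signal)
log_trend  <- stats::ksmooth(run_order, log_signal, kernel = "normal",
                             bandwidth = 30, x.points = run_order)$y

# Remove the trend in log space, re-centre, and back-transform,
# so the returned values are on the original (untransformed) scale.
corrected <- exp(log_signal - log_trend + median(log_signal))

# Trend at the position of the extreme value, raw vs. log-based estimate.
c(raw = raw_trend[50], log_based = exp(log_trend[50]))
```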

recalc_trend_after

Recalculate trends after smoothing. The recalculated trends are used for plotting, e.g., in plot_runscatter().

conditional_correction

If FALSE (the default), apply drift correction to all species. If TRUE, apply it only to species whose sample CV ratio after versus before smoothing is at or below the threshold defined via cv_ratio_threshold.

feature_list

Regular expression used to select the features to correct; only features whose names match are corrected. Default is NULL, which selects all features.

ignore_istd

If TRUE (the default), do not apply corrections to internal standards (ISTDs).

cv_ratio_threshold

Only used when conditional_correction = TRUE. Maximum allowed ratio of the sample CV after smoothing to the sample CV before smoothing for the correction to be applied. A value of 1 (the default) means the CV must improve or remain unchanged after smoothing. A value < 1 means the CV must improve; a value of, e.g., 1.20 means the correction is also applied when the CV worsens by at most 1.20-fold.
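The decision rule described for cv_ratio_threshold can be sketched as follows. The helper keep_correction() is hypothetical and not part of the package; it assumes the ratio is computed as the CV after smoothing divided by the CV before smoothing.

```r
# Illustrative sketch only: hypothetical helper, not a package function.
cv <- function(x) sd(x) / mean(x)

keep_correction <- function(raw, smoothed, cv_ratio_threshold = 1) {
  cv_ratio <- cv(smoothed) / cv(raw)   # assumed definition: after / before
  cv_ratio <= cv_ratio_threshold
}

raw      <- c(100, 110, 120, 130, 140)
improved <- c(118, 120, 121, 119, 122)  # CV decreases after "smoothing"
worse    <- c(90, 115, 120, 135, 150)   # CV increases about 1.4-fold

keep_correction(raw, improved)                           # TRUE
keep_correction(raw, worse)                              # FALSE
keep_correction(raw, worse, cv_ratio_threshold = 1.5)    # TRUE
```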

use_original_if_fail

Determines the action when smoothing fails for a feature. If TRUE, the original data is used; if FALSE (the default), the result for each analysis is NA.

show_progress

Show progress bars. Set this to FALSE when rendering the notebook.

Value

MidarExperiment object
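A call might look like the sketch below. It assumes that `mexp` is an existing MidarExperiment object with imported data and that the package providing this function is attached; the kernel_size value is purely illustrative and should be chosen to suit the data set.

```r
# Illustrative sketch only. `mexp` is an assumed, pre-existing MidarExperiment
# object; the guard skips the call when it or the function is not available.
if (exists("correct_drift_gaussiankernel") && exists("mexp")) {
  mexp <- correct_drift_gaussiankernel(
    data = mexp,
    variable = "intensity",
    reference_qc_types = "SPL",     # smooth based on the study samples
    within_batch = TRUE,            # correct each batch separately
    kernel_size = 10,               # illustrative value, not a recommendation
    conditional_correction = TRUE,  # apply only if the CV ratio criterion is met
    cv_ratio_threshold = 1,
    show_progress = FALSE           # e.g., when rendering a notebook
  )
}
```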

References

Teo G, Chew WS, Burla B, Herr D, Tai ES, Wenk MR, Torta F, & Choi H (2020). MRMkit: Automated Data Processing for Large-Scale Targeted Metabolomics Analysis. Analytical Chemistry, 92(20), 13677–13682. https://doi.org/10.1021/acs.analchem.0c03060