Drift Correction by Gaussian Kernel Smoothing
Source: R/correct-drift-batch.R
correct_drift_gaussiankernel.Rd
Function to correct for run-order drift within or across batches using Gaussian kernel smoothing (see Teo et al. (2020)). The smoothing is typically based on the study samples. To avoid local biases and artefacts, this function should only be applied to analyses with a sufficient number of well-randomized samples.
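To make the idea concrete, here is a minimal base-R sketch of run-order drift correction by Gaussian kernel smoothing on simulated data. It is not the implementation used by this function or by MRMkit. The helper names (gaussian_kernel_trend, correct_drift_sketch), the simulated data, the treatment of the kernel size as a standard deviation in run-order units, and the rescaling to the median are all assumptions made for illustration. The smoothing is done on log-transformed values and the result is returned on the original scale.

```r
# Illustrative sketch only: NOT the midar/MRMkit implementation.
# All function names and the simulated data below are hypothetical.

# Nadaraya-Watson estimate of the trend at each run position, using Gaussian
# weights. Assumption: `kernel_size` is the kernel standard deviation,
# expressed in run-order units.
gaussian_kernel_trend <- function(run_order, y, kernel_size) {
  vapply(run_order, function(x0) {
    w <- dnorm(run_order, mean = x0, sd = kernel_size)
    sum(w * y) / sum(w)
  }, numeric(1))
}

# Smooth on the log scale, subtract the trend, re-centre on the median of the
# log values, and back-transform so the output is on the original scale.
correct_drift_sketch <- function(run_order, intensity, kernel_size) {
  log_y <- log(intensity)
  trend <- gaussian_kernel_trend(run_order, log_y, kernel_size)
  exp(log_y - trend + median(log_y))
}

set.seed(1)
run_order <- 1:200
drift <- exp(-0.004 * run_order)                  # simulated signal decay
raw <- 1000 * drift * exp(rnorm(200, sd = 0.1))   # simulated study samples
corrected <- correct_drift_sketch(run_order, raw, kernel_size = 20)

# Percent coefficient of variation before and after the correction
cv <- function(x) sd(x) / mean(x) * 100
cv_raw <- cv(raw)
cv_corrected <- cv(corrected)
```

In this simulation the trend is estimated from all samples at once, which is why a well-randomized run order matters: any biological structure that follows the run order would be smoothed away together with the drift.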
Usage
correct_drift_gaussiankernel(
  data = NULL,
  variable,
  reference_qc_types,
  within_batch = TRUE,
  replace_previous = TRUE,
  kernel_size,
  outlier_filter = FALSE,
  outlier_ksd = 5,
  location_smooth = TRUE,
  scale_smooth = FALSE,
  log_transform_internal = TRUE,
  recalc_trend_after = TRUE,
  conditional_correction = FALSE,
  feature_list = NULL,
  ignore_istd = TRUE,
  cv_ratio_threshold = 1,
  use_original_if_fail = FALSE,
  show_progress = TRUE
)
Arguments
- data
MidarExperiment object
- variable
The variable to be corrected for drift effects. Must be one of "intensity", "norm_intensity", or "conc"
- reference_qc_types
QC types used for drift correction. Typically includes the study samples (SPL).
- within_batch
Apply to each batch separately if TRUE (the default), or across all batches if FALSE.
- replace_previous
Logical. Replace the previous correction (TRUE), or add on top of the previous correction (FALSE). Default is TRUE.
- kernel_size
Kernel bandwidth
- outlier_filter
Kernel outlier filter
- outlier_ksd
Kernel K times standard deviation of data distribution
- location_smooth
Location parameter smoothing
- scale_smooth
Scale parameter smoothing
- log_transform_internal
Apply log transformation internally for smoothing if TRUE (default). This enhances robustness against outliers but does not affect the final data, which remains untransformed.
- recalc_trend_after
Recalculate trends after smoothing, used for plotting (e.g., in plot_runscatter()).
- conditional_correction
If TRUE, apply drift correction to a species only when the change in sample CV after smoothing is below the threshold defined via cv_ratio_threshold. If FALSE (the default), apply drift correction to all species.
- feature_list
Restrict the correction to features whose names match the specified text, interpreted as a regular expression. Default is NULL, which means all features are selected.
- ignore_istd
Do not apply corrections to ISTDs
- cv_ratio_threshold
Only used when conditional_correction = TRUE. Maximum allowed ratio of the sample CV after smoothing to the sample CV before smoothing for the correction to be applied. A value of 1 (the default) means the CV must improve or remain unchanged after smoothing for the correction to be applied. A value below 1 means the CV must improve; a value of, e.g., 1.20 means the CV may improve or worsen by at most 1.20-fold after smoothing.
- use_original_if_fail
Determines the action when smoothing fails for a feature. If TRUE, the original data is used; if FALSE (default), the result for each analysis is NA.
- show_progress
Show progress bars. Set this to FALSE when rendering the notebook.
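The decision rule described for conditional_correction and cv_ratio_threshold can be sketched in base R as follows. This is only an illustration under stated assumptions: the helper apply_if_improved, the example vectors, and the fallback to the original values when the ratio cannot be computed are hypothetical and are not taken from the package internals.

```r
# Illustrative sketch only; names are hypothetical, not midar internals.
cv <- function(x) sd(x, na.rm = TRUE) / mean(x, na.rm = TRUE)

# Keep the corrected values only if CV(after) / CV(before) does not exceed
# `cv_ratio_threshold`; otherwise fall back to the original values.
# Assumption: a non-finite ratio is treated as "do not apply".
apply_if_improved <- function(before, after, cv_ratio_threshold = 1) {
  cv_ratio <- cv(after) / cv(before)
  accept <- is.finite(cv_ratio) && cv_ratio <= cv_ratio_threshold
  list(
    values = if (accept) after else before,
    cv_ratio = cv_ratio,
    applied = accept
  )
}

before <- c(100, 95, 90, 85, 80, 75)      # hypothetical drifting feature
better <- c(88, 87, 89, 86, 88, 87)       # smoothing reduced the spread
worse  <- c(100, 60, 120, 50, 130, 40)    # smoothing increased the spread

res_better  <- apply_if_improved(before, better)
res_worse   <- apply_if_improved(before, worse)
res_lenient <- apply_if_improved(before, worse, cv_ratio_threshold = 10)
```

With the default threshold of 1, only a correction that leaves the CV unchanged or lower is kept. Raising the threshold above 1 tolerates a correspondingly larger CV after smoothing.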
References
Teo G, Chew WS, Burla B, Herr D, Tai ES, Wenk MR, Torta F, & Choi H (2020). MRMkit: Automated Data Processing for Large-Scale Targeted Metabolomics Analysis. Analytical Chemistry, 92(20), 13677–13682. https://doi.org/10.1021/acs.analchem.0c03060