Computes various quality control (QC) metrics for each feature in a MidarExperiment object.
Metrics are derived from different sample types and can be computed either across the full
dataset or as medians of batch-wise calculations.
Usage
calc_qc_metrics(
data = NULL,
use_batch_medians = FALSE,
include_norm_intensity_stats = NA,
include_conc_stats = NA,
include_response_stats = NA,
include_calibration_results = NA
)
Arguments
- data
A MidarExperiment object containing data and metadata. For specific QC metrics, such as statistics based on normalized intensities and concentrations, the data must already be normalized and quantitated.
- use_batch_medians
Logical, whether to compute QC metrics using the median of batch-wise derived values instead of the full dataset. Default is FALSE.
- include_norm_intensity_stats
Logical. If NA (default), statistics on normalized intensity values are included if the data is available. If TRUE, they are always calculated, raising an error if data is missing.
- include_conc_stats
Logical. If NA (default), concentration-related statistics are included if concentration data is available. If TRUE, they are always calculated, raising an error if data is missing.
- include_response_stats
Logical. If NA (default), response curve statistics are included if the required data is available. If TRUE, they are always calculated, raising an error if data is missing.
- include_calibration_results
Logical, whether to incorporate external calibration results into the QC metrics table if available. Default is NA, matching the usage above.
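The NA/TRUE behaviour of the include_* arguments can be illustrated with a small base R sketch. The helper resolve_include_flag() is hypothetical and not part of the package; treating FALSE as "skip" is an assumption, since the argument descriptions only cover NA and TRUE.

```r
# Hypothetical helper (not part of the package): decides whether an optional
# group of statistics should be computed, given a tri-state flag.
resolve_include_flag <- function(flag, data_available, what) {
  if (is.na(flag)) {
    # NA: include the statistics only when the required data exist
    return(data_available)
  }
  if (isTRUE(flag) && !data_available) {
    # TRUE: the statistics were explicitly requested, so missing data is an error
    stop("Requested ", what, " statistics, but the required data are missing.")
  }
  # TRUE with data available gives TRUE; FALSE gives FALSE (assumed to mean "skip")
  isTRUE(flag)
}

resolve_include_flag(NA, data_available = TRUE,  what = "concentration")
resolve_include_flag(NA, data_available = FALSE, what = "concentration")
```

With flag = TRUE and data_available = FALSE the sketch stops with an error, mirroring the documented behaviour.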
Value
A MidarExperiment object with an updated metrics_qc table containing computed QC metrics for each feature.
Details
Batch-wise calculations: The function computes the following QC metrics for each feature and for different QC sample types (e.g., SPL, TQC, BQC, PBLK, NIST, LTR).
The format for the metrics is standardized as metric_name_qc_type, where qc_type refers to the specific QC sample type for which the metric is calculated.
For example, intensity_min_spl refers to the minimum intensity in samples of type SPL.
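As a small illustration of the naming scheme, the sketch below builds and filters metric names in base R. The helper make_metric_name() and the example column names are hypothetical; lower-casing the sample type follows the intensity_min_spl example.

```r
# Hypothetical helper: combine a metric name and a QC sample type into a
# column name of the form metric_name_qc_type.
make_metric_name <- function(metric_name, qc_type) {
  paste(metric_name, tolower(qc_type), sep = "_")
}

make_metric_name("intensity_min", "SPL")

# Selecting all metrics of one QC sample type by their suffix
metric_names <- c("intensity_min_spl", "intensity_cv_bqc", "rt_median_spl")
spl_metrics <- metric_names[endsWith(metric_names, "_spl")]
```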
Statistics of normalized intensities, concentrations, external calibration, and response
curves can be included by setting the relevant arguments (include_norm_intensity_stats,
include_conc_stats, include_response_stats, include_calibration_results) to TRUE.
Note that when the underlying processed data is not available, the function will not
raise an error but will return NA values for the respective metrics. This, however,
does not apply to the optional metrics mentioned above: if one of these is explicitly
requested and the underlying data is missing, an error is raised.
If use_batch_medians = TRUE, batch-specific QC statistics are computed first, and then
the median of these values is returned for each feature. However, response curve and
calibration statistics are calculated per curve, irrespective of batches and of the
use_batch_medians setting.
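The difference between the two modes can be sketched in base R with made-up BQC intensities for a single feature. The column names batch_id and intensity, and the CV expressed in percent, are assumptions for illustration and do not reflect the package internals.

```r
# Made-up BQC intensities of one feature measured in three batches
bqc <- data.frame(
  batch_id  = rep(c("B1", "B2", "B3"), each = 4),
  intensity = c(100, 104, 98, 102, 120, 125, 118, 122, 90, 93, 88, 91)
)

# Coefficient of variation in percent (assumed definition)
cv_pct <- function(x) 100 * sd(x, na.rm = TRUE) / mean(x, na.rm = TRUE)

# Full-dataset mode: one CV across all batches
cv_full <- cv_pct(bqc$intensity)

# Batch-median mode: one CV per batch, then the median of these values
cv_by_batch <- tapply(bqc$intensity, bqc$batch_id, cv_pct)
cv_batch_median <- median(cv_by_batch)
```

Because the toy batches differ in their overall level, the full-dataset CV is inflated by the between-batch shifts, whereas the median of the batch-wise CVs reflects within-batch variability only.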
The calculated metrics are stored in the metrics_qc table of the MidarExperiment object
and comprise the following details:
Feature details: Specific feature information extracted from the feature metadata table, such as feature class, associated ISTD, and quantifier status.
Feature MS Method Information (if method variables are available in the analysis data): Extracts and summarizes method-related variables for each feature. If multiple values exist for the same feature, they are concatenated into a string, which would indicate inconsistent analysis conditions.
- precursor_mz: The m/z value of the precursor ion(s).
- product_mz: The m/z value of the product ion(s), concatenated if multiple values exist for the same feature.
- collision_energy: The collision energy used for fragmentation, concatenated if multiple values exist for the same feature.
Missing Value Metrics:
- missing_intensity_prop_spl: Proportion of missing intensities for the SPL sample type.
- missing_norm_intensity_prop_spl: Proportion of missing normalized intensities for SPL samples.
- missing_conc_prop_spl: Proportion of missing concentration values for SPL samples.
- na_in_all: Indicator of whether a feature has missing intensities in all samples.
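A minimal base R sketch of the missing-value metrics for one feature, using made-up intensities. The vectors are illustrative, and computing the proportion as the mean of is.na() is an assumption about the definition.

```r
# Made-up raw intensities of one feature (NA = not detected or not integrated)
intensity_spl <- c(1200, NA, 1350, NA, 1280)   # study samples (SPL)
intensity_all <- c(intensity_spl, 1100, 1150)  # all samples, including QCs

# Proportion of missing values among SPL samples
missing_intensity_prop_spl <- mean(is.na(intensity_spl))

# Flag: TRUE only if the feature is missing in every sample
na_in_all <- all(is.na(intensity_all))
```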
Retention Time (RT) Metrics: Requires that retention time data are available.
- rt_min_*: Minimum retention time across different QC sample types (e.g., SPL, BQC, TQC).
- rt_max_*: Maximum retention time across different QC sample types.
- rt_median_*: Median retention time for specific QC sample types like PBLK, SPL, BQC, TQC, etc.
Intensity Metrics:
- intensity_min_*: Minimum intensity value for features across different QC sample types such as SPL, TQC, BQC, etc.
- intensity_max_*: Maximum intensity values across sample types.
- intensity_median_*: Median intensity for various QC sample types.
- intensity_cv_*: Coefficient of variation (CV) of intensity values for specific QC types.
- sb_ratio_*: Signal-to-blank ratios such as the ratio of intensity values for SPL vs PBLK, UBLK, or SBLK.
- intensity_q10_*: The 10th percentile of intensity values for the SPL sample type.
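The intensity metrics listed above can be approximated for a single feature in base R as follows. The values are made up, the CV is given in percent, and the signal-to-blank ratio is taken as the median SPL intensity over the median PBLK intensity; these are assumptions, since the exact definitions are not stated here. The element names merely follow the metric_name_qc_type pattern.

```r
# Made-up raw intensities of one feature (illustration only)
spl  <- c(1000, 1100, 950, 1200, 1050)  # study samples
pblk <- c(20, 25, 15)                   # processed blanks

# Coefficient of variation in percent (assumed definition)
cv_pct <- function(x) 100 * sd(x, na.rm = TRUE) / mean(x, na.rm = TRUE)

intensity_metrics <- c(
  intensity_min_spl    = min(spl, na.rm = TRUE),
  intensity_max_spl    = max(spl, na.rm = TRUE),
  intensity_median_spl = median(spl, na.rm = TRUE),
  intensity_cv_spl     = cv_pct(spl),
  # 10th percentile using R's default quantile algorithm (type 7)
  intensity_q10_spl    = quantile(spl, probs = 0.10, na.rm = TRUE, names = FALSE),
  # Assumption: signal-to-blank ratio as median SPL over median PBLK
  sb_ratio_pblk        = median(spl, na.rm = TRUE) / median(pblk, na.rm = TRUE)
)
```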
Normalized Intensity Metrics (only if include_norm_intensity_stats = TRUE): Requires that raw intensities were normalized, see normalize_by_istd() for details.
- norm_intensity_cv_*: Coefficient of variation (CV) of normalized intensities for QC sample types like TQC, BQC, SPL, etc.
Concentration Metrics (only if include_conc_stats = TRUE): Requires that concentrations were calculated, see quantify_by_istd() or quantify_by_calibration() for details.
- conc_median_*: Median concentration values for different QC sample types like TQC, BQC, SPL, NIST, and LTR.
- conc_cv_*: Coefficient of variation (CV) for concentration values.
- conc_dratio_sd_*: The ratio of standard deviations of concentration between BQC or TQC and SPL samples.
- conc_dratio_mad_*: The ratio of median absolute deviations (MAD) between BQC or TQC and SPL concentrations.
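The two dispersion ratios (D-ratios) can be sketched in base R with made-up concentrations. Computing them on untransformed concentrations is an assumption, and the variable names only follow the naming pattern described above.

```r
# Made-up concentrations of one feature
conc_bqc <- c(10.1, 9.8, 10.3, 9.9, 10.0)         # pooled QC: technical variability
conc_spl <- c(6.2, 14.8, 9.1, 11.7, 8.4, 12.9)    # study samples: technical + biological

# Ratio of standard deviations, QC over study samples
conc_dratio_sd_bqc <- sd(conc_bqc) / sd(conc_spl)

# Robust variant based on the median absolute deviation; the scaling
# constant that mad() applies by default cancels out in the ratio
conc_dratio_mad_bqc <- mad(conc_bqc) / mad(conc_spl)
```

A small ratio suggests that the variability observed in study samples is dominated by differences between samples rather than by analytical noise.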
Response Curve Metrics (if include_response_stats = TRUE): Calculates response curve statistics for each feature and each curve (where # refers to the curve identifier). Requires that response curves are defined in the data. See get_response_curve_stats() for additional details.
- r2_rqc_#: R-squared value of the linear regression for the response curve, representing the goodness of fit.
- slopenorm_rqc_#: Normalized slope of the linear regression for the response curve, indicating the relationship between the response and concentration.
- y0norm_rqc_#: Normalized intercept of the linear regression for the response curve, representing the baseline or starting value.
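As a rough illustration of the response curve statistics, the base R sketch below fits a linear regression to one made-up curve and extracts R-squared, slope, and intercept. How the slope and intercept are normalized is not specified here, so the sketch reports the unnormalized values only; the column names rel_amount and intensity are hypothetical.

```r
# Made-up response curve for one feature: relative sample amount vs intensity
rqc <- data.frame(
  rel_amount = c(0.25, 0.5, 0.75, 1.0, 1.25),
  intensity  = c(260, 495, 760, 1010, 1240)
)

# Ordinary least-squares fit of intensity against relative amount
fit <- lm(intensity ~ rel_amount, data = rqc)

r2        <- summary(fit)$r.squared           # goodness of fit
slope     <- unname(coef(fit)["rel_amount"])  # unnormalized slope
intercept <- unname(coef(fit)["(Intercept)"]) # unnormalized intercept
```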
External Calibration Results: Incorporates external calibration results, if include_calibration_results = TRUE and calibration curves are defined in the data:
- fit_model: The regression model used for curve fitting.
- fit_weighting: The weighting method applied during curve fitting.
- lowest_cal: The lowest nonzero calibration concentration.
- highest_cal: The highest calibration concentration.
- r.squared: R-squared value indicating the goodness of fit.
- coef_a: For linear fits, the slope of the regression line. For quadratic fits, the coefficient of the quadratic term (x²).
- coef_b: For linear fits, the intercept of the regression line. For quadratic fits, the coefficient of the linear term (x).
- coef_c: Only present for quadratic fits, representing the intercept of the regression equation. Set to NA for linear fits.
- sigma: The residual standard error of the regression model.
- reg_failed: Boolean flag indicating if regression fitting failed.
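The mapping of coef_a, coef_b, and coef_c to regression terms differs between linear and quadratic fits, which the following base R sketch makes explicit. The helper extract_cal_coefs() is hypothetical, the calibration data are made up, and unweighted lm() fits are used for simplicity; none of this is taken from the package implementation.

```r
# Made-up calibration data following response = 1 + 2 * conc + 0.5 * conc^2
cal <- data.frame(conc = c(1, 2, 5, 10, 20))
cal$response <- 1 + 2 * cal$conc + 0.5 * cal$conc^2

fit_lin  <- lm(response ~ conc, data = cal)
fit_quad <- lm(response ~ conc + I(conc^2), data = cal)

# Hypothetical helper: map lm coefficients to coef_a / coef_b / coef_c
extract_cal_coefs <- function(fit, model = c("linear", "quadratic")) {
  model <- match.arg(model)
  cf <- unname(coef(fit))  # order: intercept, x, (x^2 for quadratic fits)
  if (model == "linear") {
    # linear: a = slope, b = intercept, c not applicable
    c(coef_a = cf[2], coef_b = cf[1], coef_c = NA_real_)
  } else {
    # quadratic: a = x^2 term, b = x term, c = intercept
    c(coef_a = cf[3], coef_b = cf[2], coef_c = cf[1])
  }
}

coefs_lin  <- extract_cal_coefs(fit_lin, "linear")
coefs_quad <- extract_cal_coefs(fit_quad, "quadratic")

# Residual standard error and R-squared of the linear fit
sigma_lin <- summary(fit_lin)$sigma
r2_lin    <- summary(fit_lin)$r.squared
```

Note that the same column therefore means different things depending on fit_model: coef_b is the intercept for a linear fit but the linear-term coefficient for a quadratic fit.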