
This function calibrates feature abundances based on a specified reference sample. Calibration can be applied to the entire dataset using one or more reference samples, or batch-wise using reference sample analyses present within each batch. For both approaches, multiple measurements of the same reference sample are summarized using either mean (default) or median (set by the summarize_fun argument).
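The difference between the whole-dataset and the batch-wise approach can be illustrated with a minimal base R sketch. It does not use the package's internal code; the data frame `toy`, its column names, and the sample ID "REF" are assumptions made up for illustration.

```r
# Toy data for a single feature: two batches, each with repeated analyses of
# a reference sample ("REF", hypothetical ID) and two study samples.
toy <- data.frame(
  batch_id  = c(1, 1, 1, 1, 2, 2, 2, 2),
  sample_id = c("REF", "REF", "S1", "S2", "REF", "REF", "S3", "S4"),
  value     = c(10, 12, 22, 33, 19, 21, 40, 10)
)

summarize_fun <- "mean"                  # or "median"
summarize_ref <- match.fun(summarize_fun)

is_ref <- toy$sample_id == "REF"

# Whole-dataset approach: one reference summary across all batches
ref_global <- summarize_ref(toy$value[is_ref])

# Batch-wise approach: one reference summary per batch
ref_by_batch <- tapply(toy$value[is_ref], toy$batch_id[is_ref], summarize_ref)

# Divide each value by the reference summary (global, or of its own batch)
toy$ratio_global    <- toy$value / ref_global
toy$ratio_batchwise <- toy$value /
  as.numeric(ref_by_batch[as.character(toy$batch_id)])
```

In this toy data the reference level differs between batches (11 vs. 20), so the batch-wise ratios remove the batch offset while the global ratios do not.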

Usage

calibrate_by_reference(
  data,
  variable,
  reference_sample_id,
  absolute_calibration,
  batch_wise = FALSE,
  summarize_fun = "mean",
  store_conc_ratio = NULL,
  undefined_conc_action = NULL,
  store_normalized = FALSE
)

Arguments

data

A MidarExperiment object containing the metabolomics data to be normalized

variable

Character string indicating which data type to calibrate. Must be one of: "intensity", "norm_intensity", or "conc".

reference_sample_id

Character vector specifying the sample ID(s) to use as reference(s) or standards

absolute_calibration

Logical indicating whether to perform absolute calibration using known concentrations of the reference sample (TRUE) or relative calibration (FALSE).

batch_wise

Logical indicating whether to perform calibration for each batch separately (TRUE) or for all samples together (FALSE).

summarize_fun

Either "mean" or "median". If absolute_calibration = TRUE, this function is used to summarize the reference sample concentrations across analyses of specified reference_sample_id. Default is "mean".

store_conc_ratio

Logical. Whether to store the ratio of the measured (non-calibrated) to the expected (known) concentrations. Only applied if absolute_calibration = TRUE. This ratio is stored under the feature variable feature_conc_ratio. By default it is TRUE when variable = "conc", otherwise FALSE.

undefined_conc_action

Character string specifying how to handle features without defined concentrations in reference samples when absolute_calibration = TRUE. Must be one of: "original" (keep original values), "na" (set to NA), or "error" (stop with an error). Default is "error".

store_normalized

Logical indicating whether to keep the normalized values in the dataset when absolute_calibration = TRUE. Default is FALSE. If TRUE, these values are stored as [VARIABLE]_normalized, where [VARIABLE] is the input variable, e.g., conc.

Value

A MidarExperiment object with calibrated data

Details

Calibration can be performed in two ways, either absolute, resulting in concentrations, or relative, resulting in ratios:

  1. Absolute calibration (when absolute_calibration = TRUE)

    Calibrates (or re-calibrates) feature abundances based on known concentrations of the corresponding features defined for a reference sample. The calibrated concentration for a given analyte is calculated as:

    $$c_\textrm{cal}^\textrm{Analyte} = \frac{c_\textrm{sample}^\textrm{Analyte}}{c_\textrm{ref}^\textrm{Analyte}} \times c_\textrm{known}^\textrm{Analyte}$$

    The input variable can be either conc, norm_intensity, or intensity, whereas the result will always be stored under the variable conc (concentration), in the unit defined for the feature concentrations in the reference sample.

    Metadata requirements:

    • sample_id and analyte_id must be defined for the reference sample and features in the analysis and feature metadata, respectively.

    • Known analyte concentrations must be defined in the QC concentration metadata for the reference sample.

    • An error will be raised if no concentrations are defined for any features

    Missing analyte concentrations for the reference sample can be handled via undefined_conc_action with the following options:

    • original: Keep the original feature values, i.e., the non-calibrated values will be returned. Note: this is only available when variable = "conc". Use with caution to avoid mixing units.

    • na: Set the affected feature values to NA.

    • error (default): The function stops with an error if any reference sample feature concentration is undefined.

    • In case all feature concentrations are undefined, the function will stop with an error.

    The re-calibrated feature concentrations are stored as conc, overwriting existing conc values. The original conc values are stored as conc_beforecal.

    The ratio between the measured and expected (known) concentrations in the reference sample is available via the feature variable feature_conc_ratio and is calculated as follows:

    $$c_\textrm{ratio}^\textrm{Analyte} = \frac{c_\textrm{measured}^\textrm{Analyte}}{c_\textrm{expected}^\textrm{Analyte}}$$

    where \(c_\textrm{measured}\) is the measured (non-calibrated) concentration, and \(c_\textrm{expected}\) is the known or reference concentration for the same analyte. A ratio of 1 indicates perfect agreement; values above or below 1 indicate over- or underestimation, respectively.

    To export the calibrated concentrations, use save_dataset_csv() with variable = "conc"; to export the non-calibrated values, use variable = "conc_beforecal". When saving the MiDAR XLSX report, the calibrated concentrations will also be stored as conc.

  2. Normalization (relative calibration, absolute_calibration = FALSE)

    Normalizes feature abundances by the corresponding feature abundances in a reference sample, resulting in ratios. Any available feature abundance variable (i.e., conc, norm_intensity, or intensity) can be used as input. The normalization is calculated for all features present. The resulting output will be stored as [VARIABLE]_normalized, where [VARIABLE] is the input variable, e.g., conc_normalized.

    To export the normalized abundances, use save_dataset_csv() with variable = "[VARIABLE]_normalized". When saving the MiDAR XLSX report via save_report_xlsx(), available unfiltered normalized feature abundances will be included by default. To include filtered normalized feature abundances, set filtered_variable = "[VARIABLE]_normalized".
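The two calibration modes and the concentration ratio described above can be sketched in a few lines of base R. This is only an illustration of the formulas under simple assumptions, not the package's implementation; all object names and numbers are hypothetical.

```r
# Measured concentrations in two analyses of the reference sample
# (rows = analyses, columns = features); values are illustrative only.
ref_measured <- rbind(
  c(A = 4.0, B = 10.0, C = 1.0),
  c(A = 6.0, B = 10.0, C = 3.0)
)

# Known concentrations in the reference sample; feature C is undefined
ref_known <- c(A = 10.0, B = 5.0, C = NA)

# Measured concentrations in one study sample
sample_conc <- c(A = 2.5, B = 20.0, C = 4.0)

# Summarize repeated reference analyses per feature (mean)
c_ref <- colMeans(ref_measured)            # A = 5, B = 10, C = 2

# Relative calibration: c_sample / c_ref
conc_normalized <- sample_conc / c_ref     # A = 0.5, B = 2, C = 2

# Ratio of measured to expected concentration in the reference sample
conc_ratio <- c_ref / ref_known            # A = 0.5, B = 2, C = NA

# Absolute calibration: (c_sample / c_ref) * c_known
conc_cal <- conc_normalized * ref_known    # A = 5, B = 10, C = NA

# Handling features without a known reference concentration:
undefined <- is.na(ref_known)
# "na":       conc_cal already holds NA for these features
# "original": fall back to the non-calibrated sample value
conc_original <- ifelse(undefined, sample_conc, conc_cal)  # C = 4
# "error":    would correspond to if (any(undefined)) stop("...")
```

Note that with the "original" option, feature C keeps its non-calibrated value while A and B are calibrated, which is why mixing units is a concern when the input variable is not a concentration.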

Examples


dat_file <- system.file("extdata", "S1P_MHQuant.csv", package = "midar")
meta_file <- system.file("extdata", "S1P_metadata_tables.xlsx", package = "midar")
# Load data and metadata
mexp <- MidarExperiment()
mexp <- import_data_masshunter(mexp, dat_file, import_metadata = FALSE)
#>  Imported 65 analyses with 16 features
#>  `feature_area` selected as default feature intensity. Modify with `set_intensity_var()`.
mexp <- import_metadata_analyses(mexp, path = meta_file, sheet = "Analyses")
#>  Analysis metadata associated with 65 analyses.
mexp <- import_metadata_features(mexp, path = meta_file, sheet = "Features")
#>  Analysis metadata associated with 65 analyses.
#>  Feature metadata associated with 16 features.
mexp <- import_metadata_istds(mexp, path = meta_file, sheet = "ISTDs")
#>  Analysis metadata associated with 65 analyses.
#>  Feature metadata associated with 16 features.
#>  Internal Standard metadata associated with 2 ISTDs.

# Load known feature concentrations in the reference sample
mexp <- import_metadata_qcconcentrations(mexp, path = meta_file, sheet = "QCconcentrations")
#>  Analysis metadata associated with 65 analyses.
#>  Feature metadata associated with 16 features.
#>  Internal Standard metadata associated with 2 ISTDs.
#>  QC concentration metadata associated with 1 annotated samples and 6 annotated analytes
mexp <- normalize_by_istd(mexp)
#> ! Interfering features defined in metadata, but no correction was applied. Use `correct_interferences()` to correct.
#>  14 features normalized with 2 ISTDs in 65 analyses.
mexp <- quantify_by_istd(mexp)
#>  14 feature concentrations calculated based on 2 ISTDs and sample amounts of 65 analyses.
#>  Concentrations are given in μmol/L.

# Absolute calibration
# --------------------

  mexp <- calibrate_by_reference(
    data = mexp,
    variable = "conc",
    reference_sample_id = "SRM1950",
    absolute_calibration = TRUE,
    batch_wise = FALSE,
    summarize_fun = "mean",
    undefined_conc_action = "original"
  )
#> ! One or more feature concentration are not defined in the reference sample SRM1950. Original values will be returned for these. To change this, modify `undefined_conc_action` argument.
#>  6 feature concentrations were re-calibrated using the reference sample SRM1950.
#>  Concentrations are given in umol/L.

  # Export absolute calibration concentrations
  save_dataset_csv(mexp, "calibrated.csv", variable = "conc", filter_data = FALSE)
#>  Concentration values for 65 analyses and 7 features have been exported to 'calibrated.csv'.

  # Export non-calibrated concentrations
  save_dataset_csv(mexp, "noncalibrated.csv", variable = "conc_beforecal", filter_data = FALSE)
#>  Conc_beforecal values for 65 analyses and 16 features have been exported to 'noncalibrated.csv'.

  # Create XLSX report with calibrated concentrations as filtered dataset
  save_report_xlsx(mexp, "report.xlsx", filtered_variable = "conc")
#>  Saving report to disk - please wait...
#>  The data processing report has been saved to 'report.xlsx'.

# Relative calibration
# --------------------

  mexp <- calibrate_by_reference(
    data = mexp,
    variable = "conc",
    reference_sample_id = "SRM1950",
    batch_wise = FALSE,
    absolute_calibration = FALSE
  )
#>  All features were normalized with reference sample SRM1950 features.
#>  Unit is: sample [conc] / SRM1950 [conc]

  # Export SRM1950-normalized concentrations
  save_dataset_csv(mexp, "normalized.csv", variable = "conc_normalized", filter_data = FALSE)
#>  Conc_normalized values for 65 analyses and 16 features have been exported to 'normalized.csv'.

  # Create XLSX report with SRM1950-normalized concentrations as filtered dataset
  save_report_xlsx(mexp, "report.xlsx", filtered_variable = "conc_normalized")
#>  Saving report to disk - please wait...
#>  The data processing report has been saved to 'report.xlsx'.