The plot_runscatter function visualizes raw or processed feature signals across
different sample/QC types along the analysis sequence. It helps identify
trends, detect outliers, and assess analytical performance. Available
feature variables, such as retention time (RT) and full width at
half maximum (FWHM), can be plotted against analysis order or timestamps.
By default, all QC types present in the dataset are plotted. User-defined QC types
that have no predefined color or shape in midar are drawn as black shapes.
To show only specific QC types, use the qc_types argument.
To plot the feature values before the last applied drift/batch correction,
append _before to the variable name, e.g., intensity_before or conc_before.
To plot the uncorrected feature values (before any drift/batch correction),
append _raw to the variable name, e.g., intensity_raw or conc_raw. To show the
corresponding fit curves, set show_trend = TRUE.
The function also supports visualizing analysis batches, reference lines (mean \(\pm\) SD), and trends. It offers customization options to display batch separators, apply outlier capping, show smoothed trend curves, add reference lines, and incorporate other features. Outlier capping is particularly useful to focus on QC or study sample trends that might otherwise be obscured by extreme values or high variability.
The plot_runscatter function serves as a central QC tool in the workflow,
providing critical insights into data quality.
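As a minimal sketch of a typical call, the wrapper below plots concentrations for two QC types with batch separators and outlier capping. The wrapper name plot_conc_overview is hypothetical and not part of midar. It assumes that mexp is an existing MidarExperiment object and that the QC type labels "SPL" and "BQC" occur in the dataset. All argument names are taken from the Usage section.

```r
# Hypothetical convenience wrapper (not part of midar).
# Assumptions: `mexp` is a MidarExperiment object; the QC type labels
# "SPL" and "BQC" are present in the dataset.
plot_conc_overview <- function(mexp) {
  midar::plot_runscatter(
    data = mexp,
    variable = "conc",             # y-axis variable
    qc_types = c("SPL", "BQC"),    # restrict plotting to these QC types
    show_batches = TRUE,           # draw batch separators
    cap_outliers = TRUE,           # cap upper outliers using MAD fences
    cap_sample_k_mad = 4,
    cap_qc_k_mad = 4,
    rows_page = 3,
    cols_page = 3
  )
}
```

Calling plot_conc_overview(mexp) requires midar to be installed. Defining the function does not.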
Usage
plot_runscatter(
data = NULL,
variable = c("intensity", "norm_intensity", "conc", "rt", "area", "height", "fwhm",
"intensity_raw", "intensity_before", "norm_intensity_raw", "norm_intensity_before",
"conc_raw", "conc_before"),
filter_data = FALSE,
qc_types = NA,
include_qualifier = TRUE,
include_istd = TRUE,
include_feature_filter = NA,
exclude_feature_filter = NA,
plot_range = NA,
output_pdf = FALSE,
path = NA,
return_plots = FALSE,
show_batches = TRUE,
batch_zebra_stripe = FALSE,
batch_line_color = "#cdf7d9",
batch_fill_color = "grey93",
cap_outliers = FALSE,
cap_sample_k_mad = 4,
cap_qc_k_mad = 4,
cap_top_n_outliers = NA,
show_reference_lines = FALSE,
ref_qc_types = NA,
reference_k_sd = 2,
reference_batchwise = FALSE,
reference_line_color = "#04bf9a",
reference_sd_shade = FALSE,
reference_fill_color = NA,
reference_linewidth = 0.75,
show_trend = FALSE,
trend_color = "#22e06b",
log_scale = FALSE,
show_gridlines = FALSE,
point_size = 1.5,
point_transparency = 1,
point_border_width = 1,
base_font_size = 11,
rows_page = 3,
cols_page = 3,
specific_page = NA,
page_orientation = "LANDSCAPE",
y_label_text = NA,
show_progress = FALSE
)
Arguments
- data
A MidarExperiment object containing the dataset and metadata.
- variable
The variable to plot on the y-axis, one of 'intensity', 'norm_intensity', 'conc', 'rt', 'fwhm', 'area', 'height', 'response'. Append _before to the variable name to plot the feature values before the last applied drift/batch correction (e.g., conc_before). Append _raw to the variable name to plot the raw, uncorrected feature values (e.g., conc_raw).
- filter_data
Logical, whether to use QC-filtered data based on criteria set via filter_features_qc().
- qc_types
QC types to be plotted. Can be a vector of QC types or a regular expression pattern. NA (default) displays all available QC/Sample types.
- include_qualifier
Logical, whether to include qualifier features. Default is TRUE.
- include_istd
Logical, whether to include internal standard (ISTD) features. Default is TRUE.
- include_feature_filter
A regex pattern or a vector of feature names used to filter features by feature_id. If NA or an empty string ("") is provided, the filter is ignored. When a vector of length > 1 is supplied, only features with exactly these names are selected (applied individually as OR conditions).
- exclude_feature_filter
A regex pattern or a vector of feature names used to exclude features by feature_id. If NA or an empty string ("") is provided, the filter is ignored. When a vector of length > 1 is supplied, only features with exactly these names are excluded (applied individually as OR conditions).
- plot_range
Numeric vector of length 2, specifying the start and end indices of the analysis order to be plotted. NA plots all samples.
- output_pdf
Logical, whether to save the plot as a PDF file.
- path
File name for the PDF output.
- return_plots
Logical, whether to return the list of ggplot objects.
- show_batches
Logical, whether to show batch separators in the plot.
- batch_zebra_stripe
Logical, whether to display batches with alternating shaded and non-shaded areas.
- batch_line_color
Color of the batch separator lines.
- batch_fill_color
Color for the shaded areas representing batches.
- cap_outliers
Logical, whether to cap upper outliers based on MAD fences of SPL and QC samples.
- cap_sample_k_mad
Numeric, k * MAD (median absolute deviation) for outlier capping of SPL samples.
- cap_qc_k_mad
Numeric, k * MAD (median absolute deviation) for outlier capping of QC samples.
- cap_top_n_outliers
Numeric, cap the top n outliers regardless of MAD fences. NA or 0 ignores this filter.
- show_reference_lines
Whether to display reference lines (mean \(\pm\) k x SD).
- ref_qc_types
QC type for which the reference lines are calculated.
- reference_k_sd
Multiplier for standard deviations to define SD reference lines.
- reference_batchwise
Whether to calculate reference lines per batch.
- reference_line_color
Color of the reference lines.
- reference_sd_shade
TRUE plots a colored band indicating the \(\pm\) k x SD reference range. FALSE (default) shows reference lines instead.
- reference_fill_color
Fill color of the batch-wise reference ranges. If NA (default), the color assigned to the qc_type is used.
- reference_linewidth
Width of the reference lines.
- show_trend
If TRUE, trend curves before or after drift/batch correction are shown.
- trend_color
Color of the trend curve.
- log_scale
Logical, whether to use a log10 scale for the y-axis.
- show_gridlines
Whether to show major x and y gridlines.
- point_size
Size of the data points.
- point_transparency
Alpha transparency of the data points.
- point_border_width
Width of the data point borders.
- base_font_size
Base font size for the plot.
- rows_page
Number of rows per page.
- cols_page
Number of columns per page.
- specific_page
Show/save a specific page number only. NA plots/saves all pages.
- page_orientation
Page orientation, "LANDSCAPE" or "PORTRAIT".
- y_label_text
Override the default y-axis label text.
- show_progress
Logical, whether to show a progress bar.
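The semantics described for include_feature_filter and exclude_feature_filter can be illustrated with a small base R sketch. This is not midar's implementation. The helper names (is_unset, matches_filter, filter_feature_ids) and the feature names are made up for illustration, and applying the include filter before the exclude filter is an assumption.

```r
# Illustrative sketch only; helper names and feature IDs are hypothetical.

# A filter is ignored when it is NA or an empty string
is_unset <- function(f) length(f) == 1 && (is.na(f) || identical(f, ""))

# A single string is treated as a regex; a vector of length > 1 is treated
# as a set of exact feature names (combined as OR conditions)
matches_filter <- function(feature_ids, f) {
  if (length(f) > 1) {
    feature_ids %in% f
  } else {
    grepl(f, feature_ids)
  }
}

filter_feature_ids <- function(feature_ids, include = NA, exclude = NA) {
  keep <- rep(TRUE, length(feature_ids))
  if (!is_unset(include)) keep <- keep & matches_filter(feature_ids, include)
  if (!is_unset(exclude)) keep <- keep & !matches_filter(feature_ids, exclude)
  feature_ids[keep]
}

ids <- c("PC 32:1", "PC 34:1", "TG 48:2", "Cer d18:1/16:0")

by_regex  <- filter_feature_ids(ids, include = "^PC")                    # regex
by_names  <- filter_feature_ids(ids, include = c("PC 32:1", "TG 48:2"))  # exact names
combined  <- filter_feature_ids(ids, include = "^PC", exclude = "34")    # both filters
unchanged <- filter_feature_ids(ids)                                     # filters ignored
```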
Details
The outlier capping feature (cap_outliers) allows you to cap upper outliers based on median absolute deviation (MAD) fences of SPL and QC samples, or to remove the top n points. This can help to focus on the trends of interest when there are outliers or high variability in the data, e.g., in the study samples.
When using a log scale (log_scale = TRUE), zero or negative values are replaced with the minimum positive value divided by 5 to avoid log(0) errors.
Reference lines/ranges corresponding to mean \(\pm\) k x SD can be shown across or within batches as lines or shaded stripes.
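The three calculations described above can be sketched in base R as follows. This is an illustration under stated assumptions, not midar's internal code. The function names are hypothetical, the upper fence is assumed to be median + k * MAD, and stats::mad() is used with its default scaling constant (1.4826), which may differ from what midar applies.

```r
# Illustrative sketch only; function names are hypothetical.

# Cap upper outliers at an assumed fence of median + k * MAD
cap_upper_mad <- function(x, k = 4) {
  fence <- stats::median(x, na.rm = TRUE) + k * stats::mad(x, na.rm = TRUE)
  pmin(x, fence)  # values above the fence are set to the fence value
}

# Replace zero/negative values by (minimum positive value) / 5 before log scaling
replace_nonpositive <- function(x) {
  min_pos <- min(x[x > 0], na.rm = TRUE)
  x[!is.na(x) & x <= 0] <- min_pos / 5
  x
}

# Reference lines: mean and mean +/- k * SD
reference_limits <- function(x, k = 2) {
  m <- mean(x, na.rm = TRUE)
  s <- stats::sd(x, na.rm = TRUE)
  c(lower = m - k * s, center = m, upper = m + k * s)
}

values <- c(10, 11, 9, 10, 12, 250)   # made-up signal with one extreme value
capped <- cap_upper_mad(values, k = 4)

log_ready <- replace_nonpositive(c(0, -2, 5, 10))
limits <- reference_limits(c(8, 10, 12), k = 2)
```

Computing the limits per batch, as with reference_batchwise = TRUE, would amount to applying reference_limits separately to the values of each batch.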
Trend curves can be displayed before or after drift/batch correction. In either case, a drift and/or batch correction must have been applied to the data to enable plotting of trend curves. To show the trend curves used for the last drift or batch correction, append _before to the variable name, e.g., conc_before or intensity_before, and set show_trend = TRUE.
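A call that shows the fit curves of the last correction might look like the sketch below. The wrapper name plot_drift_fit is hypothetical. It assumes mexp is a MidarExperiment object to which a drift and/or batch correction has already been applied. The argument names and default values come from the Usage section.

```r
# Hypothetical wrapper (not part of midar).
# Assumption: `mexp` is a MidarExperiment object with a drift and/or batch
# correction already applied; otherwise no trend curves are available.
plot_drift_fit <- function(mexp) {
  midar::plot_runscatter(
    data = mexp,
    variable = "conc_before",  # values before the last applied correction
    show_trend = TRUE,         # overlay the corresponding fit curves
    trend_color = "#22e06b",
    show_batches = TRUE
  )
}
```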