Generates a Principal Component Analysis (PCA) plot for visualizing samples including quality control (QC) samples. This function provides options for filtering data, applying transformations, and customizing visual elements to enhance the visualization.
Usage
plot_pca(
  data = NULL,
  variable,
  qc_types = NA,
  ellipse_variable = "qc_type",
  ellipse_levels = NA,
  pca_dimensions = c(1, 2),
  log_transform = TRUE,
  filter_data = FALSE,
  include_qualifier = FALSE,
  include_istd = FALSE,
  include_feature_filter = NA,
  exclude_feature_filter = NA,
  min_median_value = NA,
  show_labels = TRUE,
  labels_threshold_mad = 3,
  shared_labeltext_hide = NA,
  label_font_size = 3,
  point_size = 2,
  point_alpha = 0.8,
  font_base_size = 8,
  ellipse_confidence_level = 0.95,
  ellipse_linewidth = 1,
  ellipse_fill = TRUE,
  ellipse_fillcolor = NA,
  ellipse_alpha = 0.1
)
Arguments
- data
A MidarExperiment object
- variable
A character string indicating the variable to use for PCA analysis. Must be one of: "area", "height", "intensity", "response", "conc", "conc_raw", "rt", "fwhm".
- qc_types
A character vector specifying the QC types to plot. It must contain at least one element. The default is NA, which means any of the non-blank QC types ("SPL", "TQC", "BQC", "HQC", "MQC", "LQC", "NIST", "LTR") will be plotted if present in the dataset.
- ellipse_variable
String specifying which sample variable to show as ellipses. Must be one of: "none", "qc_type", "batch_id". "none" omits ellipses.
- ellipse_levels
A character vector specifying the levels of ellipse_variable to display as ellipses.
- pca_dimensions
A numeric vector of length 2 indicating the PCA dimensions to plot. Default is c(1, 2).
- log_transform
A logical value indicating whether to log-transform the data before the PCA. Default is TRUE.
- filter_data
A logical value indicating whether to use all data (default) or only QC-filtered data (filtered via filter_features_qc()).
- include_qualifier
A logical value indicating whether to include qualifier features. Default is FALSE.
- include_istd
A logical value indicating whether to include internal standard (ISTD) features. Default is FALSE.
- include_feature_filter
A character string or regex pattern used to filter features by feature_id. If NA or an empty string ("") is provided, the filter is ignored. When a vector of length > 1 is supplied, only features with exactly these names are selected (applied individually as OR conditions).
- exclude_feature_filter
A character string or regex pattern used to exclude features by feature_id. If NA or an empty string ("") is provided, the filter is ignored. When a vector of length > 1 is supplied, only features with exactly these names are excluded (applied individually as OR conditions).
- min_median_value
Minimum median feature value (of the selected variable) across all samples from the selected QC types that must be met for a feature to be included in the PCA. NA (default) means no filtering will be applied. This parameter provides a fast way to exclude noisy features from the analysis. However, it is recommended to use filter_data with filter_features_qc() instead.
- show_labels
A logical value indicating whether to show analysis_id labels for points outside k * MAD of the selected PCA dimensions, where k is set by labels_threshold_mad. Default is TRUE.
- labels_threshold_mad
A numeric value determining the threshold for showing labels based on the median absolute deviation (MAD). Default is 3. Set to NULL to suppress labels.
- shared_labeltext_hide
A character string representing text shared across labels to be hidden (case-sensitive). If this results in non-unique analysis_id values, an error will be raised.
- label_font_size
A numeric value indicating the font size for labels in millimeters (mm). Note that this unit differs from font_base_size, which is in points (pt). Default is 3.
- point_size
A numeric value indicating the size of points in millimeters. Default is 2.
- point_alpha
A numeric value indicating the transparency of points (0-1). Default is 0.8.
- font_base_size
A numeric value indicating the base font size for plot text elements. Default is 8.
- ellipse_confidence_level
A numeric value indicating the confidence level for the ellipses. Default is 0.95.
- ellipse_linewidth
A numeric value indicating the line width of the ellipses. Default is 1.
- ellipse_fill
A logical value indicating whether to fill the ellipses. Default is TRUE.
- ellipse_fillcolor
A vector specifying the fill colors for the ellipses corresponding to the different ellipse_variable levels. This can be either an unnamed vector or a named vector, with names corresponding to levels in ellipse_variable. Unused fill colors will be ignored. Default is NA, which corresponds to the default fill colors when ellipse_variable = "qc_type", and to automatically generated colors otherwise.
- ellipse_alpha
A numeric value indicating the transparency of the ellipse fill (0-1). Default is 0.1.
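
The argument descriptions above imply a sequence of steps: feature selection, an optional median filter, log transformation, PCA, and MAD-based labelling. The following base-R sketch illustrates that sequence on made-up data. It is not the package's implementation: select_features() and is_outlier() are hypothetical helpers, the toy matrix is invented, prcomp() with scaling is an assumption about how a PCA might be computed, and the median filter here uses all rows rather than only the selected QC types.

```r
set.seed(1)

# Toy data (invented): 12 samples x 5 features, strictly positive values
vals <- matrix(
  rlnorm(60, meanlog = 5, sdlog = 0.3), nrow = 12,
  dimnames = list(
    paste0("S", 1:12),
    c("PC 34:1", "PC 36:2", "TG 52:3", "CE 18:1", "PC 34:1 (ISTD)")
  )
)
vals["S12", ] <- vals["S12", ] * c(20, 0.05, 20, 0.05, 1) # one aberrant sample
vals[, "CE 18:1"] <- vals[, "CE 18:1"] / 1000             # low-abundance feature

# Hypothetical helper mirroring the documented include/exclude semantics:
# NA or "" is ignored, a single string is a regex, a longer vector is exact names
select_features <- function(feature_ids, include = NA, exclude = NA) {
  is_unset <- function(f) length(f) == 1 && (is.na(f) || f == "")
  matches <- function(f) {
    if (length(f) > 1) feature_ids %in% f else grepl(f, feature_ids)
  }
  keep <- rep(TRUE, length(feature_ids))
  if (!is_unset(include)) keep <- keep & matches(include)
  if (!is_unset(exclude)) keep <- keep & !matches(exclude)
  feature_ids[keep]
}

# Analogue of exclude_feature_filter = "ISTD"
x <- vals[, select_features(colnames(vals), exclude = "ISTD"), drop = FALSE]

# Analogue of min_median_value = 1: drop features whose median is below 1
x <- x[, apply(x, 2, median) >= 1, drop = FALSE]

# Analogue of log_transform = TRUE and pca_dimensions = c(1, 2)
pca <- prcomp(log(x), center = TRUE, scale. = TRUE)
scores <- pca$x[, c(1, 2)]

# Analogue of labels_threshold_mad = 3: flag points beyond k * MAD in either dimension
k <- 3
is_outlier <- function(v) abs(v - median(v)) > k * mad(v)
label_me <- is_outlier(scores[, 1]) | is_outlier(scores[, 2])
labelled <- rownames(scores)[label_me]
```

In this sketch the low-abundance feature is removed by the median filter and the aberrant sample S12 is among the labelled points. With a vector of length > 1, for example select_features(colnames(vals), include = c("PC 34:1", "TG 52:3")), only those exact names are matched.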