Import Analysis Results from Plain CSV Files

Usage

import_data_csv(
  data = NULL,
  path,
  variable_name,
  analysis_id_col = NA,
  import_metadata = TRUE,
  first_feature_column = NA,
  na_strings = "NA"
)

Arguments

data

MidarExperiment object

path

One or more file names with path, or a folder path, in which case all *.csv files in this folder will be read.

variable_name

Variable type representing the values in the table. Must be one of "intensity", "norm_intensity", "conc", "area", "height", "response".

analysis_id_col

Column to be used as analysis_id. NA (default) uses 'analysis_id' if present, or the first column if it contains unique values.

import_metadata

Import additional metadata columns (e.g. batch ID, sample type) and add them to the MidarExperiment object. Only the following metadata column names are supported: "qc_type", "batch_id", "is_quantifier", "is_istd", "analysis_order".

first_feature_column

Column number of the first column representing the feature values

na_strings

A character vector of strings which are to be interpreted as NA values. Blank fields are also considered to be missing values.

Value

MidarExperiment object

Details

This function imports analysis result data from .csv files in wide format, where each row represents a sample and each column represents a feature.

The dataset must have an analysis identifier, either in an "analysis_id" column or inferred from the first column if it contains unique values. The variable_name argument specifies the data type in the table: "area", "height", "intensity", "norm_intensity", "response", "conc", "conc_raw", "rt", or "fwhm".
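The identifier rule above can be sketched in base R. This is only an illustration of the documented behaviour, not the package's internal code: the helper name resolve_analysis_id_col is hypothetical, and raising an error when no identifier is found is an assumption.

```r
# Illustrative sketch: mimics the documented rule for choosing the ID column.
# `resolve_analysis_id_col` is a hypothetical helper, not part of midar.
resolve_analysis_id_col <- function(tbl) {
  # Prefer an explicit "analysis_id" column when present
  if ("analysis_id" %in% names(tbl)) {
    return("analysis_id")
  }
  # Otherwise fall back to the first column, but only if its values are unique
  first_col <- tbl[[1]]
  if (!anyNA(first_col) && !anyDuplicated(first_col)) {
    return(names(tbl)[1])
  }
  # Assumption: failing loudly is the sensible outcome here
  stop("No usable analysis identifier column found.")
}

tbl <- data.frame(
  sample = c("S1", "S2", "S3"),
  feat_a = c(1.2, 3.4, 5.6)
)
id_col <- resolve_analysis_id_col(tbl)
```

Here id_col is "sample", because the table has no "analysis_id" column and the first column holds unique values.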

Specific metadata columns, i.e., "analysis_order", "qc_type", "batch_id", "is_quantifier", "is_istd", can be imported if present.

When there are other columns that do not represent features, use the first_feature_column parameter to define where the feature columns start.
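To see what a column index for first_feature_column means in practice, the following base R sketch splits a small wide table at a given column number. The table, the "operator" column, and the feature names are made up for illustration; how the package itself treats the non-feature columns is not shown here.

```r
# Hypothetical wide table: three leading non-feature columns, then two features
tbl <- data.frame(
  analysis_id = c("S1", "S2"),
  qc_type     = c("SPL", "BQC"),
  operator    = c("A", "B"),      # extra column that is not a feature
  feat_a      = c(1.2, 3.4),
  feat_b      = c(5.6, 7.8)
)

# Feature values start in the 4th column of this table
first_feature_column <- 4

# Columns from that position to the end are treated as features
feature_cols <- names(tbl)[first_feature_column:ncol(tbl)]
other_cols   <- names(tbl)[seq_len(first_feature_column - 1)]
```

With such a file, the corresponding call would presumably pass first_feature_column = 4 to import_data_csv().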

When a directory path is specified, the function processes all .csv files in that directory, merging them into a single dataset. This is useful for datasets divided into multiple files during preprocessing. Ensure each feature and raw data file pair appears only once to avoid duplication errors.
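As a rough picture of the folder case, the base R sketch below writes two small CSV files to a temporary directory, reads every .csv file found there, and stacks the rows. It assumes the files share the same columns and hold different analyses; the file names and values are invented, and the package's own merging logic may differ.

```r
# Create a temporary folder with two hypothetical CSV parts
dir_path <- tempfile("csv_parts")
dir.create(dir_path)
write.csv(
  data.frame(analysis_id = c("S1", "S2"), feat_a = c(1, 2)),
  file.path(dir_path, "part1.csv"), row.names = FALSE
)
write.csv(
  data.frame(analysis_id = c("S3", "S4"), feat_a = c(3, 4)),
  file.path(dir_path, "part2.csv"), row.names = FALSE
)

# Find all .csv files in the folder and stack them row-wise
csv_files <- list.files(dir_path, pattern = "\\.csv$", full.names = TRUE)
combined  <- do.call(rbind, lapply(csv_files, read.csv))

# Check beforehand that no analysis occurs in more than one file
has_duplicates <- anyDuplicated(combined$analysis_id) > 0
```

Running a check like has_duplicates before importing a folder is one way to catch files that overlap.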

The na_strings parameter specifies which strings are interpreted as NA on import. Blank fields are also treated as missing values.
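The effect of such a setting can be illustrated with base R's read.csv(), whose na.strings argument works in a comparable way. Whether import_data_csv() uses read.csv() internally is not stated here, so treat this only as an analogy; the "ND" marker and the data are invented.

```r
# Hypothetical file content: "ND" marks a non-detect, one field is blank
csv_text <- "analysis_id,feat_a,feat_b
S1,1.2,ND
S2,,3.4
S3,NA,5.6"

# Treat both "NA" and "ND" as missing; blank numeric fields also become NA
tbl <- read.csv(text = csv_text, na.strings = c("NA", "ND"))

missing_a <- is.na(tbl$feat_a)
missing_b <- is.na(tbl$feat_b)
```

Because "ND" is declared as a missing-value string, feat_b is read as a numeric column rather than as text. The analogous setting for this function would be na_strings = c("NA", "ND").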

Examples

file_path <- system.file("extdata", "plain_wide_dataset.csv", package = "midar")

mexp <- MidarExperiment()

mexp <- import_data_csv(
  data = mexp,
  path = file_path,
  variable_name = "conc",
  import_metadata = TRUE
)
#>  Metadata column(s) 'qc_type, batch_id' imported. To ignore, set `import_metadata = FALSE`
#>  Imported 87 analyses with 5 features
#>  Analysis metadata associated with 87 analyses.
#>  Feature metadata associated with 5 features.
#>  Analysis order was based on `analysis_order` column of imported data. Use `set_analysis_order` to change the order.

print(mexp)
#> 
#> ── MidarExperiment ─────────────────────────────────────────────────────────────
#> Title:
#> 
#> Processing status: Annotated raw CONC values
#> 
#> ── Annotated Raw Data ──
#> 
#> • Analyses: 87
#> • Features: 5
#> • Raw signal used for processing: `feature_conc`
#> 
#> ── Metadata ──
#> 
#> • Analyses/samples: 
#> • Features/analytes: 
#> • Internal standards: 
#> • Response curves: 
#> • Calibrants/QC concentrations: 
#> • Study samples: 
#> 
#> ── Processing Status ──
#> 
#> • Isotope corrected: 
#> • ISTD normalized: 
#> • ISTD quantitated: 
#> • Drift corrected variables: 
#> • Batch corrected variables: 
#> • Feature filtering applied: 
#> 
#> ── Exclusion of Analyses and Features ──
#> 
#> • Analyses manually excluded (`analysis_id`): 
#> • Features manually excluded (`feature_id`):