methylationGLM_T1() is the high-level coordinator for the one-timepoint GLM stage of the dnaEPICO workflow. It prepares the merged phenotype-plus-beta input, optionally creates exploratory plots, fits one Gaussian GLM per CpG for each requested phenotype, extracts CpG-level summaries, optionally collects significant CpG coefficient tables, generates diagnostic plots, annotates the combined summary table, and optionally writes legacy-style outputs to disk. The default behavior is now in-memory and quiet, which makes the function easier to compose with other package functions and more aligned with typical Bioconductor usage.

methylationGLM_T1(
  inputPheno = "rData/preprocessingPheno/mergeData/phenoBetaT1.RData",
  outputLogs = "logs",
  outputRData = "rData/methylationGLM_T1/models",
  outputPlots = "figures/methylationGLM_T1",
  phenotypes = c("DASS_Depression", "DASS_Anxiety", "DASS_Stress", "PCL5_TotalScore",
    "MHCSF_TotalScore", "BRS_TotalScore"),
  covariates = "Sex,Age,Ethnicity,TraumaDefinition,Leukocytes,Epithelial.cells",
  factorVars = "Sex,Ethnicity,TraumaDefinition",
  cpgPrefix = "cg",
  cpgLimit = NA,
  nCores = 32,
  plotWidth = 2000,
  plotHeight = 1000,
  plotDPI = 150,
  interactionTerm = NULL,
  libPath = NULL,
  glmLibs = "glm2",
  prsMap = NULL,
  summaryPval = NA,
  summaryResidualSD = TRUE,
  saveSignificantCpGs = FALSE,
  significantCpGDir = "preliminaryResults/cpgs/methylationGLM_T1",
  significantCpGPval = 0.05,
  saveTxtSummaries = TRUE,
  chunkSize = NULL,
  summaryTxtDir = "preliminaryResults/summary/methylationGLM_T1/glm",
  fdrThreshold = 0.05,
  padjmethod = "fdr",
  annotationPackage = "IlluminaHumanMethylationEPICv2anno.20a1.hg38",
  annotationCols = c("Name", "chr", "pos", "UCSC_RefGene_Group", "UCSC_RefGene_Name",
    "Relation_to_Island", "GencodeV41_Group"),
  annotatedGLMOut = "data/methylationGLM_T1",
  display = FALSE,
  verbose = FALSE,
  logs = FALSE,
  saveOutputs = FALSE
)

Arguments

inputPheno

Character. Path to the merged phenotype-plus-beta .RData or .rds object created by preprocessingPheno(). The default points to the timepoint-1 object produced by the package workflow.

outputLogs

Character. Directory used for optional log files.

outputRData

Character. Directory used for optional serialized model and summary outputs.

outputPlots

Character. Directory used for optional TIFF plots.

phenotypes

Character vector, or a single comma-separated string, naming the phenotype variables to model.

covariates

Character. Comma-separated covariate variables included in each GLM.

factorVars

Character. Comma-separated variables that should be treated as factors before modeling.
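As a rough illustration of what a comma-separated factorVars value implies, the sketch below splits the string and converts the matching columns to factors in base R. The toy data frame and the decision to skip names that are absent from the data are assumptions made for this example; the package's internal handling may differ.

```r
# Toy data frame; column names mirror the documented defaults but the values are made up.
pheno <- data.frame(
  Sex = c("F", "M", "F"),
  Ethnicity = c("A", "B", "A"),
  Age = c(31, 45, 28),
  stringsAsFactors = FALSE
)

# Split the comma-separated argument into individual column names.
factorVars <- "Sex,Ethnicity"
factorCols <- trimws(strsplit(factorVars, ",", fixed = TRUE)[[1]])

# Assumption: only columns that are actually present in the data are converted.
factorCols <- intersect(factorCols, names(pheno))
pheno[factorCols] <- lapply(pheno[factorCols], factor)

str(pheno)
```

Age stays numeric because it is not listed in factorVars.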

cpgPrefix

Character. Prefix used to identify methylation columns in the merged phenotype-plus-beta input object. The default is "cg".

cpgLimit

Integer or NA. Maximum number of CpGs to analyse. Use NA to keep all CpGs matching cpgPrefix.
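The interplay of cpgPrefix and cpgLimit can be pictured with a small base R helper. selectCpgColumns() is a hypothetical name used only for this sketch, and it assumes that a non-NA limit keeps the first matching columns in column order, which the documentation above does not state.

```r
# Hypothetical helper, not part of dnaEPICO.
selectCpgColumns <- function(data, cpgPrefix = "cg", cpgLimit = NA) {
  # Keep column names that start with the prefix.
  cols <- names(data)[startsWith(names(data), cpgPrefix)]
  # Assumption: a non-NA limit keeps the first cpgLimit matches in column order.
  if (!is.na(cpgLimit)) cols <- utils::head(cols, cpgLimit)
  cols
}

toy <- data.frame(
  Sample_Name = "S1",
  cg00000029 = 0.20,
  cg00000108 = 0.60,
  cg00000109 = 0.41
)

selectCpgColumns(toy)                # all three CpG columns
selectCpgColumns(toy, cpgLimit = 2)  # first two CpG columns
```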

nCores

Integer. Number of worker processes to use while fitting models and extracting summaries.

plotWidth

Integer. TIFF width in pixels when plots are written to disk.

plotHeight

Integer. TIFF height in pixels when plots are written to disk.

plotDPI

Integer. TIFF resolution in DPI when plots are written to disk.

interactionTerm

Character or NULL. Optional interaction variable. When supplied and present in the input data, each model includes the phenotype together with its interaction with this variable.
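The following sketch shows one way a per-CpG Gaussian GLM with an optional interaction could be written with stats::glm(). It is not the package's implementation. It assumes that each CpG is the response, that the phenotype and covariates are predictors, and that the interaction enters as phenotype * interactionTerm. The toy data and the column name score are invented for the example.

```r
set.seed(1)

# Toy data; names and values are illustrative only.
toy <- data.frame(
  score = rnorm(20),
  Age = runif(20, 20, 60),
  Sex = factor(rep(c("F", "M"), 10)),
  cg00000029 = runif(20, 0.1, 0.9),
  cg00000108 = runif(20, 0.1, 0.9)
)

phenotype <- "score"
covariates <- "Age"
interactionTerm <- "Sex"
cpgCols <- names(toy)[startsWith(names(toy), "cg")]

# Assumption: the interaction is only added when the variable exists in the data.
mainTerm <- if (!is.null(interactionTerm) && interactionTerm %in% names(toy)) {
  paste(phenotype, interactionTerm, sep = "*")
} else {
  phenotype
}

# Fit one Gaussian GLM per CpG column.
fits <- lapply(cpgCols, function(cpg) {
  form <- stats::reformulate(c(mainTerm, covariates), response = cpg)
  stats::glm(form, data = toy, family = stats::gaussian())
})
names(fits) <- cpgCols

# One row per CpG with the phenotype main-effect statistics.
phenoStats <- t(vapply(
  fits,
  function(f) summary(f)$coefficients[phenotype, ],
  numeric(4)
))
phenoStats
```

With interactionTerm set to NULL the same code falls back to the main-effect model.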

libPath

Character vector or NULL. Optional library paths forwarded to worker processes. By default, the current .libPaths() are used.

glmLibs

Character. Comma-separated package names to validate on worker processes. The default is "glm2".

prsMap

Character or NULL. Optional phenotype-to-PRS mapping in the form "Phenotype1:PRS_1,Phenotype2:PRS_2".
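To make the mapping format concrete, the sketch below turns such a string into a named character vector. parsePrsMap() is a hypothetical helper written for this example, not a dnaEPICO function, and the package may parse the argument differently.

```r
# Hypothetical helper, not part of dnaEPICO.
parsePrsMap <- function(prsMap) {
  if (is.null(prsMap)) return(NULL)
  # Split into "Phenotype:PRS" entries, then into their two halves.
  entries <- trimws(strsplit(prsMap, ",", fixed = TRUE)[[1]])
  parts <- strsplit(entries, ":", fixed = TRUE)
  # Every entry must contain exactly one phenotype and one PRS name.
  stopifnot(all(lengths(parts) == 2L))
  stats::setNames(
    vapply(parts, function(p) trimws(p[2]), character(1)),
    vapply(parts, function(p) trimws(p[1]), character(1))
  )
}

prs <- parsePrsMap("Phenotype1:PRS_1,Phenotype2:PRS_2")
prs[["Phenotype2"]]
```

The result is indexed by phenotype name, so the PRS column for a given phenotype can be looked up directly.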

summaryPval

Numeric or NA. Optional p-value threshold applied to the returned CpG summary tables. Use NA to keep all summary rows.

summaryResidualSD

Logical. If TRUE, append residual standard deviations to the CpG summary tables and residual diagnostic plots.

saveSignificantCpGs

Logical. If TRUE, collect significant CpG coefficient tables in the returned object and, when saveOutputs = TRUE, also write them to disk.

significantCpGDir

Character. Directory used for optional significant CpG coefficient tables.

significantCpGPval

Numeric. P-value threshold used to collect or write significant CpG coefficient tables.

saveTxtSummaries

Logical. If TRUE and saveOutputs = TRUE, write tab-delimited summary tables to summaryTxtDir.

chunkSize

Integer or NULL. Number of CpGs processed per summary extraction chunk. NULL chooses a value automatically.
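Chunked processing of this kind can be sketched in base R as follows. The CpG identifiers are made up, and the rule the function uses to pick a chunk size automatically when chunkSize = NULL is not shown because it is not documented here.

```r
# Ten made-up CpG identifiers.
cpgs <- sprintf("cg%08d", 1:10)
chunkSize <- 4

# Assign each CpG to a chunk of at most chunkSize elements, preserving order.
chunks <- split(cpgs, ceiling(seq_along(cpgs) / chunkSize))

lengths(chunks)
```

Each element of chunks could then be handed to one summary-extraction call, which keeps peak memory bounded by the chunk size rather than the total number of CpGs.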

summaryTxtDir

Character. Directory used for optional tab-delimited GLM summary tables.

fdrThreshold

Numeric. False-discovery-rate threshold used to highlight CpGs in the residual-significance diagnostic plots.

padjmethod

Character. P-value adjustment method passed to stats::p.adjust(). The default is "fdr".
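For orientation, this is how stats::p.adjust() behaves with the default method on a handful of made-up p-values. In stats::p.adjust(), "fdr" is an alias for the Benjamini-Hochberg method "BH".

```r
# Made-up p-values for five CpGs.
p <- c(0.001, 0.01, 0.02, 0.04, 0.5)

# "fdr" and "BH" select the same Benjamini-Hochberg adjustment.
adjFdr <- stats::p.adjust(p, method = "fdr")
adjBH <- stats::p.adjust(p, method = "BH")

adjFdr
```

Any other method name accepted by stats::p.adjust(), such as "bonferroni" or "holm", can be supplied through padjmethod in the same way.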

annotationPackage

Character. Annotation package or object name passed to minfi::getAnnotation(), for example "IlluminaHumanMethylationEPICv2anno.20a1.hg38".

annotationCols

Character vector, or a single comma-separated string, naming the annotation columns to append to the combined GLM summary table. Available columns depend on the selected annotation package.

annotatedGLMOut

Character. Directory used for the optional annotated GLM summary CSV file.

display

Logical. If TRUE, draw exploratory and diagnostic plots on the active graphics device.

verbose

Logical. If TRUE, emit progress messages with message(). The default is FALSE, so the function is quiet unless requested.

logs

Logical. If TRUE, write the same progress messages to file.path(outputLogs, "log_methylationGLM_T1.txt").

saveOutputs

Logical. If TRUE, write optional serialized model files, summary tables, significant-CpG tables, annotated results, and TIFF plots to the requested output directories. The default is FALSE, so the function returns in-memory results without writing files.

Value

A list with class "dnaEPICO_methylationGLM_T1".

preparedData

Object returned by prepareMethylationGLM_T1Data() containing the merged phenotype-plus-beta analysis table and modeling metadata.

distributionPlots

Object returned by plotMethylationGLM_T1Distributions() describing any exploratory plots that were generated or written.

modelFits

Object returned by fitMethylationGLM_T1Models() containing the per-phenotype CpG model fits.

modelSummaries

Object returned by summarizeMethylationGLM_T1Models() containing the combined CpG summary tables used for reporting and annotation.

significantCpGs

Object returned by collectSignificantCpGsMethylationGLM_T1() containing optional phenotype-specific significant-CpG tables.

diagnosticPlots

Object returned by plotMethylationGLM_T1Diagnostics() describing the diagnostic plot objects and any written TIFF files.

annotation

Object returned by annotateMethylationGLM_T1Summaries() containing the annotated combined summary table.

savedFiles

Object returned by writeMethylationGLM_T1Outputs() when saveOutputs = TRUE, otherwise NULL.

See dnaEPICO_methylationGLM_T1 for a class-level overview.

Examples

if (requireNamespace("IlluminaHumanMethylation450kanno.ilmn12.hg19", quietly = TRUE)) {
  tmp <- tempdir()
  toy_path <- file.path(tmp, "phenoBetaT1.RData")
  phenoBT1 <- data.frame(
    Sample_Name = c("S1", "S2", "S3", "S4"),
    status = factor(c("Case", "Case", "Control", "Control")),
    sex = factor(c("F", "M", "F", "M")),
    cg00000029 = c(0.20, 0.25, 0.22, 0.27),
    cg00000108 = c(0.60, 0.55, 0.52, 0.58),
    check.names = FALSE
  )
  save(phenoBT1, file = toy_path)

  result <- methylationGLM_T1(
    inputPheno = toy_path,
    phenotypes = "status",
    covariates = "sex",
    factorVars = "status,sex",
    cpgLimit = 2,
    nCores = 1,
    summaryPval = 1,
    annotationPackage = "IlluminaHumanMethylation450kanno.ilmn12.hg19",
    annotationCols = "Name,chr,pos",
    display = FALSE,
    verbose = FALSE,
    logs = FALSE,
    saveOutputs = FALSE
  )

  class(result)
}
#> [1] "dnaEPICO_methylationGLM_T1"