About the R code for 4 datasets
(Meta-analysis using GeneMeta R package)

Overview of the Gene Expression Analysis Script

This script performs an end-to-end analysis of multiple GEO datasets: data download, normalization, batch effect correction, meta-analysis, and visualization of differential gene expression. It consists of 18 custom functions, each handling one stage of the analysis.

Key Functions


1. DownloadGEO:

Downloads the data for a specified GEO accession number and stores it locally, making it ready for further analysis.


2. ReadGEO:

Reads and processes the downloaded GEO dataset, converting it into a matrix format for downstream analysis.


3. Makephenotype:

Generates phenotype data files from GEO datasets based on the project name and comparison factors (e.g., 'grade').


4. MakeBoxPlot:

Creates boxplots of the gene expression distribution before and after normalization, making differences in spread within the dataset visible.


5. MakePCA:

Performs Principal Component Analysis (PCA) and generates plots to explore how samples group by experimental factors such as 'grade'.
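As a rough illustration of the PCA step, the sketch below uses base R's prcomp on a simulated matrix. The data, the 'grade' labels, and all variable names are made up for the example; the script's own MakePCA function may work differently.

```r
# Minimal PCA sketch on simulated data (not the script's actual MakePCA code).
set.seed(1)
n_genes <- 200
expr <- matrix(rnorm(n_genes * 6), nrow = n_genes,
               dimnames = list(paste0("gene", seq_len(n_genes)),
                               paste0("S", 1:6)))
grade <- factor(rep(c("low", "high"), each = 3))  # hypothetical phenotype

# Shift the first 50 genes in the "high" samples to create a group signal.
high <- grade == "high"
expr[1:50, high] <- expr[1:50, high] + 4

# prcomp expects samples in rows, so the genes x samples matrix is transposed.
pca <- prcomp(t(expr), center = TRUE, scale. = FALSE)
var_explained <- pca$sdev^2 / sum(pca$sdev^2)
scores <- data.frame(PC1 = pca$x[, 1], PC2 = pca$x[, 2], grade = grade)

# Plot only in interactive sessions.
if (interactive()) {
  plot(scores$PC1, scores$PC2, col = as.integer(scores$grade), pch = 19,
       xlab = sprintf("PC1 (%.1f%%)", 100 * var_explained[1]),
       ylab = sprintf("PC2 (%.1f%%)", 100 * var_explained[2]))
}
```

Samples that share a grade should fall close together along the first component when the group effect dominates the noise.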


6. VSNQuantilNorm:

Applies Variance Stabilizing Normalization (VSN) and quantile normalization to the gene expression data, providing normalized datasets for further analysis.
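The quantile normalization half of this step can be sketched in base R as follows. The function name quantile_normalize and the toy matrix are assumptions for illustration; the VSN half is not shown, and the script's VSNQuantilNorm function may rely on package implementations instead.

```r
# Quantile normalization sketch: force every sample (column) to share
# the same empirical distribution. Illustrative only.
quantile_normalize <- function(x) {
  # Rank of each value within its column (ties broken by order of appearance).
  ranks <- apply(x, 2, rank, ties.method = "first")
  # Reference distribution: mean of the sorted columns, row by row.
  ref <- rowMeans(apply(x, 2, sort))
  # Replace each value by the reference value at its rank.
  out <- apply(ranks, 2, function(r) ref[r])
  dimnames(out) <- dimnames(x)
  out
}

x <- matrix(c(5, 2, 3, 4,
              4, 1, 4, 2,
              3, 4, 6, 8),
            nrow = 4,
            dimnames = list(paste0("gene", 1:4), c("S1", "S2", "S3")))
x_norm <- quantile_normalize(x)
```

After normalization every column contains the same set of values, only in a different gene order, which is why post-normalization boxplots line up.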


7. Mergestudies:

Merges gene expression matrices from multiple studies without batch effect correction, allowing a comparison across multiple datasets.
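A merge of this kind can be sketched by keeping only the genes shared by all studies and binding the samples column-wise. The helper name merge_studies and the toy matrices are hypothetical; how the script's Mergestudies function handles identifiers and missing genes is not described here.

```r
# Merge sketch: restrict to genes present in every study, then cbind.
# No batch correction is applied at this stage.
merge_studies <- function(study_list) {
  common <- Reduce(intersect, lapply(study_list, rownames))
  merged <- do.call(cbind, lapply(study_list,
                                  function(m) m[common, , drop = FALSE]))
  # Record which study each sample came from, for later batch correction.
  batch <- factor(rep(names(study_list),
                      times = vapply(study_list, ncol, integer(1))))
  list(expr = merged, batch = batch)
}

a <- matrix(1:6, nrow = 3,
            dimnames = list(c("g1", "g2", "g3"), c("A1", "A2")))
b <- matrix(7:12, nrow = 3,
            dimnames = list(c("g2", "g3", "g4"), c("B1", "B2")))
merged <- merge_studies(list(studyA = a, studyB = b))
```

Returning the batch factor alongside the matrix keeps the study labels aligned with the sample columns.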


8. StudyBatchEffect:

Corrects for batch effects in the merged dataset so that technical variation between studies is reduced and biological variation is easier to detect.
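The correction method used by StudyBatchEffect is not specified here. As a deliberately simplified stand-in, the sketch below centers each gene within each study and restores the gene's overall mean. The function name center_by_batch and the simulated data are assumptions; methods such as ComBat (sva package) or removeBatchEffect (limma package) are more commonly used for this purpose.

```r
# Simplified batch adjustment: per-gene mean centering within each batch.
# This is a stand-in for illustration, not the script's actual method.
center_by_batch <- function(expr, batch) {
  batch <- factor(batch)
  grand <- rowMeans(expr)            # per-gene mean over all samples
  out <- expr
  for (b in levels(batch)) {
    idx <- batch == b
    sub <- expr[, idx, drop = FALSE]
    # Remove the batch-specific gene mean, then add back the overall mean.
    out[, idx] <- sub - rowMeans(sub) + grand
  }
  out
}

set.seed(7)
expr <- matrix(rnorm(5 * 6), nrow = 5,
               dimnames = list(paste0("gene", 1:5), paste0("S", 1:6)))
batch <- rep(c("studyA", "studyB"), each = 3)
expr[, batch == "studyB"] <- expr[, batch == "studyB"] + 2  # simulated offset
corrected <- center_by_batch(expr, batch)
```

Mean centering removes any biological difference that is confounded with study, so it is only appropriate when the groups of interest are balanced across studies.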


9. MakeDensityPlot:

Generates density plots to assess the distribution of gene expression levels in datasets before and after batch effect correction.


10. MakeMDSPlot:

Creates Multidimensional Scaling (MDS) plots to visualize similarities and differences between samples based on gene expression profiles.
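A minimal version of an MDS plot can be built with base R's dist and cmdscale, as sketched below on simulated data. The distance measure (Euclidean) and all names are assumptions; the script's MakeMDSPlot function may use a different distance or plotting package.

```r
# Classical MDS sketch on simulated data (illustrative only).
set.seed(3)
expr <- matrix(rnorm(100 * 6), nrow = 100,
               dimnames = list(paste0("gene", 1:100), paste0("S", 1:6)))
study <- factor(rep(c("studyA", "studyB"), each = 3))  # hypothetical labels

# Distances are computed between samples, so the matrix is transposed.
d <- dist(t(expr), method = "euclidean")
coords <- cmdscale(d, k = 2)   # one row per sample, two MDS dimensions

if (interactive()) {
  plot(coords[, 1], coords[, 2], col = as.integer(study), pch = 19,
       xlab = "MDS dimension 1", ylab = "MDS dimension 2")
  text(coords[, 1], coords[, 2], labels = rownames(coords), pos = 3)
}
```

With Euclidean distances, classical MDS recovers the same sample configuration as PCA, so the two plots are best read together as a consistency check.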


11. CochranQTest:

Performs Cochran's Q test to evaluate the heterogeneity among studies, a critical step in meta-analysis when comparing multiple datasets.
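For reference, Cochran's Q for one gene is the weighted sum of squared deviations of the per-study effect sizes from their inverse-variance weighted mean, compared against a chi-square distribution with k - 1 degrees of freedom for k studies. The sketch below computes it in base R; the function name cochran_q and the input layout (genes in rows, studies in columns) are assumptions, and the script's CochranQTest function may use GeneMeta's own routines.

```r
# Cochran's Q sketch for many genes at once (illustrative only).
# d: matrix of effect sizes, genes x studies
# v: matrix of the corresponding variances, same shape
cochran_q <- function(d, v) {
  w <- 1 / v                                  # inverse-variance weights
  mu_fixed <- rowSums(w * d) / rowSums(w)     # weighted mean effect per gene
  Q <- rowSums(w * (d - mu_fixed)^2)
  df <- ncol(d) - 1
  data.frame(Q = Q, df = df,
             p = pchisq(Q, df = df, lower.tail = FALSE))
}

# Two toy genes across three studies, all variances equal to 1.
d <- matrix(c(0.5, 0.5, 0.5,
              0.0, 1.0, 2.0), nrow = 2, byrow = TRUE,
            dimnames = list(c("geneA", "geneB"), paste0("study", 1:3)))
v <- matrix(1, nrow = 2, ncol = 3, dimnames = dimnames(d))
q_res <- cochran_q(d, v)
```

A large Q (small p-value) suggests that the studies disagree more than sampling error alone would explain, which argues for a random effects model.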


12. FEMREMAnalysis:

Conducts meta-analysis using either the Fixed Effects Model (FEM) or Random Effects Model (REM) to identify significantly differentially expressed genes across multiple studies.
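The difference between the two models can be sketched in base R. The fixed effects model weights each study by the inverse of its variance; the random effects model adds a between-study variance tau^2, estimated here with the DerSimonian-Laird method. The function name meta_effect and the input layout are assumptions, and the script's FEMREMAnalysis function may compute these quantities through GeneMeta instead.

```r
# FEM/REM sketch with the DerSimonian-Laird tau^2 estimate (illustrative only).
# d: effect sizes, genes x studies; v: matching variances.
meta_effect <- function(d, v, model = c("REM", "FEM")) {
  model <- match.arg(model)
  k <- ncol(d)
  w <- 1 / v
  mu_fixed <- rowSums(w * d) / rowSums(w)
  Q <- rowSums(w * (d - mu_fixed)^2)          # Cochran's Q per gene
  tau2 <- if (model == "REM") {
    # Between-study variance, truncated at zero.
    pmax(0, (Q - (k - 1)) / (rowSums(w) - rowSums(w^2) / rowSums(w)))
  } else {
    rep(0, nrow(d))                           # FEM assumes no heterogeneity
  }
  w_star <- 1 / (v + tau2)                    # model-specific weights
  mu <- rowSums(w_star * d) / rowSums(w_star)
  se <- sqrt(1 / rowSums(w_star))
  z <- mu / se
  data.frame(mu = mu, se = se, tau2 = tau2, z = z,
             p = 2 * pnorm(-abs(z)))
}

# One toy gene measured in two studies that disagree strongly.
d <- matrix(c(0, 4), nrow = 1, dimnames = list("geneA", c("s1", "s2")))
v <- matrix(1, nrow = 1, ncol = 2, dimnames = dimnames(d))
fem <- meta_effect(d, v, model = "FEM")
rem <- meta_effect(d, v, model = "REM")
```

Both models give the same pooled effect in this toy case, but the random effects model reports a larger standard error because it accounts for the disagreement between studies.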


13. Makevolcano:

Generates a volcano plot for differential gene expression analysis. The plot visually represents the effect size versus statistical significance and highlights upregulated and downregulated genes.
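The classification behind a volcano plot can be sketched as below: genes are labelled up, down, or not significant from an effect size cutoff and an adjusted p-value cutoff. The function name classify_genes, the cutoffs (1 and 0.05), and the use of Benjamini-Hochberg adjustment are assumptions for illustration, not the script's documented settings.

```r
# Volcano plot sketch: label genes by effect size and adjusted p-value.
# Cutoffs are hypothetical defaults.
classify_genes <- function(effect, p, effect_cut = 1, fdr_cut = 0.05) {
  fdr <- p.adjust(p, method = "BH")
  status <- ifelse(fdr < fdr_cut & effect >= effect_cut, "up",
            ifelse(fdr < fdr_cut & effect <= -effect_cut, "down", "ns"))
  data.frame(effect = effect, neg_log10_p = -log10(p),
             fdr = fdr, status = status)
}

effect <- c(geneA = 2.0, geneB = -2.0, geneC = 0.1, geneD = 2.0)
p      <- c(1e-6, 1e-6, 1e-6, 0.9)
volcano <- classify_genes(effect, p)

if (interactive()) {
  cols <- c(up = "red", down = "blue", ns = "grey")[volcano$status]
  plot(volcano$effect, volcano$neg_log10_p, col = cols, pch = 19,
       xlab = "Effect size", ylab = "-log10(p-value)")
}
```

A gene needs both a large effect and a small adjusted p-value to be highlighted, so geneC (significant but small effect) and geneD (large effect but not significant) both stay unlabelled.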


14. MakeVenna:

Creates a Venn diagram to visualize the overlap of upregulated and downregulated genes across multiple datasets. It also identifies common genes and generates files of shared DEGs.
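The set logic behind the Venn diagram can be sketched with base R's intersect. The gene identifiers, list names, and output file name below are placeholders, and the drawing itself (which the script's MakeVenna function presumably delegates to a plotting package) is not shown.

```r
# Venn overlap sketch: shared upregulated genes across studies.
# Gene IDs and study names are placeholders.
up_lists <- list(
  studyA = c("gene1", "gene2", "gene3", "gene4"),
  studyB = c("gene2", "gene3", "gene5"),
  studyC = c("gene3", "gene2", "gene6")
)

# Genes reported as upregulated in every study.
common_up <- Reduce(intersect, up_lists)

# Pairwise overlap counts (the diagonal holds each list's own size).
pairwise <- sapply(up_lists, function(a)
  sapply(up_lists, function(b) length(intersect(a, b))))

# Write the shared genes to a file (hypothetical file name, temp directory).
out_file <- file.path(tempdir(), "shared_up_genes.txt")
writeLines(common_up, out_file)
```

The same steps apply to the downregulated lists; keeping the two directions separate avoids counting a gene as shared when studies disagree on its direction.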


15. Makenetworkanalyst:

Prepares the gene expression data matrix for use in the NetworkAnalyst platform, a tool used for network-based analysis.


16. MakeHeatmap:

Generates heatmaps of the expression patterns of the top differentially expressed genes, with samples grouped by experimental conditions such as 'grade'.


17. GSEAAnalysis:

Performs Gene Set Enrichment Analysis (GSEA) to identify significantly enriched pathways, using gene set data from MSigDB and presenting the results in an NES-based visualization.


18. MakeQQPlot:

Creates a QQ plot from the results of Cochran's Q test to assess the distribution of p-values and identify potential discrepancies in the meta-analysis.
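One common way to draw such a plot is to compare the observed Q statistics with the quantiles of a chi-square distribution with k - 1 degrees of freedom, which is the reference distribution when the studies are homogeneous. The sketch below does this for k = 4 studies with simulated Q values; whether MakeQQPlot plots Q statistics or p-values in this way is an assumption.

```r
# QQ plot sketch for Cochran's Q against its chi-square reference.
# The Q values are simulated stand-ins for real test results.
set.seed(42)
k <- 4                      # number of studies
n_genes <- 1000
q_obs <- rchisq(n_genes, df = k - 1)

observed    <- sort(q_obs)
theoretical <- qchisq(ppoints(n_genes), df = k - 1)

if (interactive()) {
  plot(theoretical, observed, pch = 20,
       xlab = "Chi-square quantiles (df = k - 1)",
       ylab = "Observed Q statistics")
  abline(0, 1, col = "red")   # points on this line match the reference
}
```

Points that rise above the diagonal in the upper tail indicate more heterogeneity between studies than expected by chance.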


Usage Flow


The Phase 3 R packages: