geneExpressionFromGEO: An R Package to Facilitate Data Reading from Gene Expression Omnibus (GEO).

  • Author(s): Chicco D
  • Source:
    Methods in molecular biology (Clifton, N.J.) [Methods Mol Biol] 2022; Vol. 2401, pp. 187-194.
  • Publication Type:
    Journal Article
  • Language:
    English
  • Additional Information
    • Source:
      Publisher: Humana Press; Country of Publication: United States; NLM ID: 9214969; Publication Model: Print; Cited Medium: Internet; ISSN: 1940-6029 (Electronic); Linking ISSN: 1064-3745; NLM ISO Abbreviation: Methods Mol Biol; Subsets: MEDLINE
    • Publication Information:
      Publication: Totowa, NJ : Humana Press
      Original Publication: Clifton, N.J. : Humana Press,
    • Subject Terms:
    • Abstract:
      Gene expression profiling is a useful way to measure the activity of genes in molecular biology and, because of its effectiveness, researchers have released thousands of gene expression datasets publicly in online databases and repositories such as Gene Expression Omnibus (GEO). To read and analyze gene expression data, the computational biology community has developed several tools and platforms, including Bioconductor, an open-source R platform of software packages for analyzing these data. Despite the usefulness of Bioconductor and its packages, it is still difficult to read gene expression data from GEO and to assign gene symbols to the probesets of datasets. To alleviate this problem, we introduce a new R software package, geneExpressionFromGEO, which lets users easily download gene expression data from GEO and associate gene symbols with probesets. In this short chapter, we describe the assets of our software package and report an example of its usage. We believe that geneExpressionFromGEO can be very useful for the R community of bioinformaticians working on gene expression data.
      (© 2022. The Author(s), under exclusive license to Springer Science+Business Media, LLC, part of Springer Nature.)
    • References:
      Taub F, DeLeo J, Thompson EB (1983) Sequential comparative hybridizations analyzed by computerized image processing can identify and quantitate regulated RNAs. DNA 2(4):309–327. (DOI: 10.1089/dna.1983.2.309)
      McLachlan GJ, Do KA, Ambroise C (2005) Analyzing microarray gene expression data, vol 422. Wiley, New York.
      Edgar R, Domrachev M, Lash AE (2002) Gene Expression Omnibus: NCBI gene expression and hybridization array data repository. Nucleic Acids Res 30(1):207–210. (DOI: 10.1093/nar/30.1.207)
      Clough E, Barrett T (2016) The Gene Expression Omnibus database. Statistical genomics. Methods in molecular biology, vol 1418. Springer, New York, pp 93–110.
      Gentleman R, Carey V, Huber W et al (2006) Bioinformatics and computational biology solutions using R and Bioconductor. Springer.
      Gentleman R, Carey V, Bates D et al (2004) Bioconductor: open software development for computational biology and bioinformatics. Genome Biol 5(10):R80. (DOI: 10.1186/gb-2004-5-10-r80)
      Davis S, Meltzer PS (2007) GEOquery: a bridge between the Gene Expression Omnibus (GEO) and Bioconductor. Bioinformatics 23(14):1846–1847. (DOI: 10.1093/bioinformatics/btm254)
      The Comprehensive R Archive Network (CRAN) (2021) geneExpressionFromGEO: retrieves gene expression dataset and gene symbols from GEO code. https://cran.r-project.org/web/packages/geneExpressionFromGEO/index.html . Accessed 13 Jan 2021.
      GitHub.com (2021) geneExpressionFromGEO. https://github.com/davidechicco/geneExpressionFromGEO . Accessed 13 Jan 2021.
      Huber W, Carey VJ, Gentleman R et al (2015) Orchestrating high-throughput genomic analysis with Bioconductor. Nat Methods 12(2):115–121. (DOI: 10.1038/nmeth.3252)
      Bioconductor (2021) annotate: annotation for microarrays. https://bioconductor.org/packages/release/bioc/html/annotate.html . Accessed 18 Jan 2021.
      xml2 (2021) xml2-Parse XML. https://xml2.r-lib.org . Accessed 28 Jan 2021.
      Li L, Guturi K, Gautreau B et al (2018) Ubiquitin ligase RNF8 suppresses Notch signaling to regulate mammary development and tumorigenesis. J Clin Invest 128(10):4525–4542. (DOI: 10.1172/JCI120401)
      Cangelosi D, Morini M, Zanardi N et al (2020) Hypoxia predicts poor prognosis in neuroblastoma patients and associates with biological mechanisms involved in telomerase activation and tumor microenvironment reprogramming. Cancers 12(9):2343. (DOI: 10.3390/cancers12092343)
      Heider A, Alt R (2013) virtualArray: a R/Bioconductor package to merge raw data from different microarray platforms. BMC Bioinformatics 14(1):75. (DOI: 10.1186/1471-2105-14-75)
      Bostanabad SY, Noyan S, Dedeoglu BG et al (2021) Overexpression of β-Arrestins inhibits proliferation and motility in triple negative breast cancer cells. Sci Rep 11(1539):1–14.
      Van’t Veer LJ, Dai H, Van De Vijver MJ et al (2002) Gene expression profiling predicts clinical outcome of breast cancer. Nature 415(6871):530–536. (DOI: 10.1038/415530a)
      Sotiriou C, Pusztai L (2009) Gene-expression signatures in breast cancer. N Engl J Med 360(8):790–800. (DOI: 10.1056/NEJMra0801289)
      Ma XJ, Salunga R, Tuggle JT et al (2003) Gene expression profiles of human breast cancer progression. Proc Natl Acad Sci U S A 100(10):5974–5979. (DOI: 10.1073/pnas.0931261100)
      Raudvere U, Kolberg L, Kuzmin I et al (2019) g:Profiler: a web server for functional enrichment analysis and conversions of gene lists (2019 update). Nucleic Acids Res 47(W1):W191–W198. (DOI: 10.1093/nar/gkz369)
      The Comprehensive R Archive Network (CRAN) (2021) gprofiler2: interface to the ‘g:Profiler’ toolset. https://cran.r-project.org/web/packages/gprofiler2/index.html . Accessed 18 Jan 2021.
      Conda (2021) Package, dependency and environment management for any language. https://conda.io . Accessed 21 Jan 2021.
      Hahne F, Huber W, Gentleman R et al (2010) Bioconductor case studies. Springer, Berlin.
      Prlić A, Procter JB (2012) Ten simple rules for the open development of scientific software. PLoS Comput Biol 8(12):e1002802. (DOI: 10.1371/journal.pcbi.1002802)
      Wilkinson MD, Dumontier M, Aalbersberg IJ et al (2016) The FAIR guiding principles for scientific data management and stewardship. Sci Data 3(1):1–9. (DOI: 10.1038/sdata.2016.18)
      Chicco D (2017) Ten quick tips for machine learning in computational biology. BioData Mining 10(1):35. (DOI: 10.1186/s13040-017-0155-3)
      Barnes N (2010) Publish your computer code: it is good enough. Nature 467(7317):753. (DOI: 10.1038/467753a)
      Brazma A, Parkinson H, Sarkans U et al (2003) ArrayExpress–a public repository for microarray gene expression data at the EBI. Nucleic Acids Res 31(1):68–71. (DOI: 10.1093/nar/gkg091)
      Tomczak K, Czerwińska P, Wiznerowicz M (2015) The Cancer Genome Atlas (TCGA): an immeasurable source of knowledge. Contemp Oncol 19(1A):A68.
      Wang Z, Gerstein M, Snyder M (2009) RNA-Seq: a revolutionary tool for transcriptomics. Nat Rev Genet 10(1):57–63. (DOI: 10.1038/nrg2484)
      Saliba AE, Westermann AJ, Gorski SA et al (2014) Single-cell RNA-Seq: advances and future challenges. Nucleic Acids Res 42(14):8845–8860. (DOI: 10.1093/nar/gku555)
      Grüning B, Dale R, Sjödin A et al (2018) Bioconda: sustainable and comprehensive software distribution for the life sciences. Nat Methods 15(7):475–476. (DOI: 10.1038/s41592-018-0046-7)
    • Contributed Indexing:
      Keywords: Bioconductor; Gene Expression Omnibus; Gene expression; Microarray; R programming language
    • Entry Date(s):
      Date Created: 20211213 Date Completed: 20220124 Latest Revision: 20220124
    • Update Code:
      20240105
    • DOI:
      10.1007/978-1-0716-1839-4_12
    • PMID:
      34902129