Datasets - Computational Cancer Genomics

Genomic projects focused on rare cancers are limited by the scarce availability of high-quality biological material suitable for such studies. This translates into small sample series that are usually underpowered to draw meaningful conclusions. Thus, facilitating the integration of independent datasets into larger sample series is of the utmost importance. We provide a full spectrum of data and tools to maximize reuse potential for a wide range of users: raw sequencing reads, data notes and processed data, interactive computational notebooks, interactive tumor maps, and reproducible bioinformatics pipelines. We have also reprocessed available raw sequencing datasets to homogenize independently published series using our reproducible bioinformatics pipelines.

Our raw data are available on the European Genome-phenome Archive (EGA). To get access, please contact us at ccg@iarc.who.int and fill in the Data Access Agreement (DAA) template of the project for which you would like to get the raw data (available below in the corresponding project section).
Once access has been granted, data can be downloaded using the EGA Python client (see detailed instructions here and video tutorial here).
If you use our data, please acknowledge the Rare Cancers Genomics initiative in the acknowledgments section, for example: « The results shown here are in part based upon data generated by the Rare Cancers Genomics initiative (www.rarecancersgenomics.com) ».
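
As a minimal sketch of the download step, the snippet below assembles a command line for the EGA Python client (`pyega3`) for a single dataset accession. The helper name `build_fetch_command`, the credentials file name, and the output directory are hypothetical, and the `-cf` and `--output-dir` options are assumed to match your installed version of the client; check them against the detailed instructions linked above. The command is only built and printed, not run.

```python
import re
import shlex

# EGA dataset accessions are "EGAD" followed by 11 digits, e.g. EGAD00001005087.
DATASET_ID = re.compile(r"EGAD\d{11}")


def build_fetch_command(dataset_id, credentials_file, output_dir):
    """Return the argument list for a pyega3 dataset download (not executed).

    Hypothetical helper: it only validates the accession format and
    assembles the arguments; it does not contact EGA.
    """
    if not DATASET_ID.fullmatch(dataset_id):
        raise ValueError(f"not an EGA dataset accession: {dataset_id!r}")
    return [
        "pyega3",
        "-cf", credentials_file,    # JSON file holding the EGA username and password
        "fetch", dataset_id,
        "--output-dir", output_dir,  # assumed option name; verify with `pyega3 fetch -h`
    ]


if __name__ == "__main__":
    # Example with the LungNENomics whole genome and exome sequencing dataset.
    cmd = build_fetch_command("EGAD00001005087", "credentials.json", "lungnenomics_wgs")
    print(shlex.join(cmd))
```

Building the argument list separately from running it makes it easy to review the exact command before starting a large transfer; the list can then be passed to `subprocess.run` once access has been granted.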

LungNENomics

Study EGAS00001003699 from paper: Alcala et al. 2019 Nat Comms

  • Whole genome and exome sequencing dataset: EGAD00001005087
  • RNA-sequencing dataset: EGAD00010001719
  • Infinium EPIC 850K DNA methylation beadchip dataset: EGAD00010001720

Download LungNENomics DAA template

MESOMICS

Study EGAS00001004812 from paper: Mangiante et al. 2023 Nat Genet

  • Whole genome sequencing dataset: EGAD00001007023
  • RNA-sequencing dataset: EGAD00001007024
  • Infinium EPIC 850K DNA methylation beadchip dataset: EGAD00010002053

Download MESOMICS DAA template
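
For scripted workflows, the study and dataset accessions listed above can be gathered into a small lookup table. The sketch below is one possible layout, not an official catalogue: the names `DATASETS`, `dataset_ids`, and `malformed_accessions` are hypothetical, and the format check only assumes that EGA study and dataset accessions consist of `EGAS` or `EGAD` followed by 11 digits.

```python
import re

# Accessions copied from the LungNENomics and MESOMICS listings above.
DATASETS = {
    "LungNENomics": {
        "study": "EGAS00001003699",
        "datasets": {
            "Whole genome and exome sequencing": "EGAD00001005087",
            "RNA-sequencing": "EGAD00010001719",
            "Infinium EPIC 850K DNA methylation": "EGAD00010001720",
        },
    },
    "MESOMICS": {
        "study": "EGAS00001004812",
        "datasets": {
            "Whole genome sequencing": "EGAD00001007023",
            "RNA-sequencing": "EGAD00001007024",
            "Infinium EPIC 850K DNA methylation": "EGAD00010002053",
        },
    },
}

# Assumed accession format: EGAS (study) or EGAD (dataset) plus 11 digits.
ACCESSION = re.compile(r"EGA[SD]\d{11}")


def dataset_ids(project):
    """Return the dataset accessions of one project, in listing order."""
    return list(DATASETS[project]["datasets"].values())


def malformed_accessions(catalog=DATASETS):
    """Return every accession in the catalog that fails the format check."""
    bad = []
    for entry in catalog.values():
        for accession in [entry["study"], *entry["datasets"].values()]:
            if not ACCESSION.fullmatch(accession):
                bad.append(accession)
    return bad


if __name__ == "__main__":
    print(dataset_ids("MESOMICS"))
    print(malformed_accessions())
```

The format check catches copy-and-paste errors such as a dropped digit before any request is sent to the archive; it does not confirm that an accession exists or that access has been granted.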

LungNENomics

Organoids:

MESOMICS

  • GigaScience (Di Genova et al. 2023)
  • GigaDB
  • GitHub mesomics data notes

We developed an interactive web application to explore the European Prospective Investigation into Cancer and Nutrition (EPIC) rare cancers data.
The EPIC study is one of the largest cohort studies in the world, with more than half a million participants recruited across 10 western European countries (23 centers) and followed for almost 15 years. Detailed information on diet, lifestyle characteristics, anthropometric measurements and medical history was collected at recruitment (1992-1999). Biological samples including plasma, serum, leukocytes and erythrocytes were also collected at baseline from the majority of individuals. They are stored in the International Agency for Research on Cancer (IARC)’s biobank and at EPIC collaborating centers. More than 9 million aliquots are hosted, constituting one of the largest biobanks in the world for biochemical and genetic investigations on cancer and other chronic diseases.
Our rare cancer database, which can be explored at https://epic-rare-cancers-explorer.opendata.iarc.who.int/, includes 8851 rare cancer cases: 5688 in females and 3163 in males.

Tumor map is an interactive browser that allows biologists, who may not have computational expertise, to richly explore the results of high-throughput cancer genomics experiments on thousands of patient samples.