
April/May Public Studies#2288

Open
Rima-Waleed wants to merge 6 commits into master from april26_release

Rima-Waleed commented Apr 9, 2026

| # | Study_ID | Testing Instance Link | Sample Count |
|---|----------|-----------------------|--------------|
| 1 | nfib_ctf_biobank_2025 | https://private.cbioportal.mskcc.org/study/summary?id=nfib_ctf_biobank_2025 | 68 |
| 2 | hdcn_msk_2025 | https://private.cbioportal.mskcc.org/study/summary?id=hdcn_msk_2025 | 1,270 |
| 3 | rectal_radiation_msk_2024 | https://private.cbioportal.mskcc.org/study/summary?id=rectal_radiation_msk_2024 | 48 |
| 4 | pancan_metaprism_2023 | https://private.cbioportal.mskcc.org/study/summary?id=pancan_metaprism_2023 | 1,031 |
| 5 | ucec_msk_dacruzpaula_2025 | https://private.cbioportal.mskcc.org/study/summary?id=ucec_msk_dacruzpaula_2025 | 11 |
| 6 | gist_msk_2025 | https://private.cbioportal.mskcc.org/study/summary?id=gist_msk_2025 | 183 |
| 7 | pancan_pdmr_2025 | https://www.cbioportal.org/study/summary?id=pancan_pdmr_2025 | 6,272 |
| | Total Samples | | 8,883 |
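
As a quick cross-check of the total, the per-study counts from the table can be summed directly. This is only a sketch of the arithmetic; the dictionary name is an illustrative choice and the numbers are copied from the table.

```python
# Per-study sample counts copied from the table in this comment.
sample_counts = {
    "nfib_ctf_biobank_2025": 68,
    "hdcn_msk_2025": 1270,
    "rectal_radiation_msk_2024": 48,
    "pancan_metaprism_2023": 1031,
    "ucec_msk_dacruzpaula_2025": 11,
    "gist_msk_2025": 183,
    "pancan_pdmr_2025": 6272,
}

# Sum the counts and compare with the reported total of 8,883.
total = sum(sample_counts.values())
print(f"{total:,} samples across {len(sample_counts)} studies")
```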

Notes:

| Study_ID | Missing Data | Notes |
|----------|--------------|-------|
| hdcn_msk_2025 | Missing clinical data | Pending author response |
| gist_msk_2025 | Missing clinical data | Pending author response |
| ucec_msk_dacruzpaula_2025 | Missing clinical data | Author preparing data |

rmadupuri commented May 20, 2026


1. rectal_radiation_msk_2024:

  • Update study name to Secondary Rectal Cancers (MSK, JAMA Netw Open 2025)
  • Update description to reflect that this is a secondary rectal cancer cohort (≥5 yr post prostate RT)
  • Is the cancer type READ or COADREAD?
  • Normalize Sample_Type: it should only be Primary, Metastasis, or Local Recurrence. For WES Recapture samples, should it be the same as the corresponding IMPACT sample?
    *Pending author response
  • Normalize Sample_Class: it should only be Tumor, Cell line, or Xenograft.
    *Pending author response
  • Cap Age at Rectal Cancer Diagnosis at 89.
  • Gene alteration frequencies differ slightly between the paper and the portal (TP53: 77%, APC: 48.40%, KRAS: 39%, SMAD4: 25.80%, FBXW7: 13%, PIK3CA: 6%). Is this due to filtering?
    (After filtering out the WES samples and then selecting for oncogenic somatic alterations, the frequencies in the paper match the frequencies reported in the public portal.)
  • Mutational signatures p-value profile is not loaded to the portal. profile_name is all lower case.
  • No CNA segment data for P-0054770-T01-IM6 and P-0066749-T01-IM7?
  • Do WES Recapture samples undergo SV and CNA calling? The paper doesn't report CNA/SV calling, and there is no CNA/SV data for WES samples. (Yes, the CNA and SV data from the WES/research IMPACT cases weren't included, as they are processed using a different pipeline.)
  • CNA, CNASeq and SV case lists include WES samples. Need to double-check whether that's correct.
  • 7-Mar gene symbol in the mutations file (likely a spreadsheet date conversion of a MARCH-family symbol such as MARCH7).
  • Supplemental files also have the data on the primary rectal cancer cohort. Is there a reason for not including it in the portal?
  • Supp eTable 10 has additional clinical elements: OS/DFS outcome data, neoadjuvant therapy, timeline/treatment events, radiation data, etc. We can create a timeline and add the additional elements to the clinical files.
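
Several of the checks above (allowed Sample_Type and Sample_Class values, the age cap at 89, and date-mangled gene symbols such as 7-Mar) lend themselves to a small script. The following is a minimal sketch, not the curation pipeline's actual validator: the function names are hypothetical, and truncating fractional ages to whole years is an assumption.

```python
import re

# Controlled vocabularies as stated in the review comments above.
ALLOWED_SAMPLE_TYPES = {"Primary", "Metastasis", "Local Recurrence"}
ALLOWED_SAMPLE_CLASSES = {"Tumor", "Cell line", "Xenograft"}
AGE_CAP = 89

# Matches symbols a spreadsheet has turned into dates, e.g. "7-Mar" or "1-Sep".
DATE_LIKE_SYMBOL = re.compile(
    r"^\d{1,2}-(Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)$",
    re.IGNORECASE,
)


def cap_age(value, cap=AGE_CAP):
    """Cap a numeric age at `cap`; non-numeric values (e.g. "NA") pass through.

    Assumption: fractional ages are truncated to whole years.
    """
    try:
        age = int(float(value))
    except (TypeError, ValueError, OverflowError):
        return value
    return str(min(age, cap))


def invalid_values(values, allowed):
    """Return the distinct values that fall outside the allowed vocabulary."""
    return sorted(set(values) - set(allowed))


def date_like_symbols(symbols):
    """Return the distinct gene symbols that look like spreadsheet dates."""
    return sorted({s for s in symbols if DATE_LIKE_SYMBOL.match(s)})


# Usage with made-up values:
print(cap_age("92"))
print(invalid_values(["Primary", "primary", "Metastasis"], ALLOWED_SAMPLE_TYPES))
print(date_like_symbols(["TP53", "7-Mar", "APC"]))
```

Reporting out-of-vocabulary values rather than rewriting them keeps decisions such as the WES Recapture mapping with the curator until the authors respond.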

2. gist_msk_2025:

  • ctDNA samples are missing from the panel matrix
    *Pending author response
  • Do the ACCESS samples have DMP IDs?
    *Pending author response
  • Mention the phase II trial and trial-related data in the study name and description
  • Serial ctDNA VAF timelines: Figures 3A–H show ctDNA VAF over time, but no timeline data file is present. Possible to include in portal? There are 5 timepoints for the ctDNA samples analyzed.
    *Pending author response
  • Description says "Targeted sequencing of 42 gastrointestinal stromal tumor samples", but the study contains 183 total samples (42 IMPACT + 141 ctDNA).
    *Pending author response
  • meta_cna: Typo in profile_description: "MK-IMPACT" should be "MSK-IMPACT". Update stable_id from gistic to cna.
  • meta_cna_hg19_seg.txt: description reads "Somatic CNA data … from TCGA", but the data is from MSK-IMPACT.
  • Missing patients: The paper reports 42 patients enrolled and 31 patients with ctDNA data. Only 24 patient records exist in the patient clinical file. Patients whose data exists only as ctDNA (C-series IDs) have no entries in data_clinical_patient.txt.
    *Author response: samples did not have suitable data for inclusion
  • Patient file uses GENDER; normalize to SEX.
  • Possible to get clinical data from authors? (Table 1 shows the stats)
    *Pending author response
  • cna, cnaseq, sv case lists: were ctDNA samples profiled? No data available.
    *Pending author response
  • Missing PubMed ID and citation in portal link
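
The ID cross-checks and meta-file issues listed above can be spotted mechanically. The sketch below assumes meta files are plain `key: value` lines and that sample and patient IDs have already been read into lists; the function names and example IDs are hypothetical.

```python
def missing_from(reference_ids, candidate_ids):
    """Return IDs present in reference_ids but absent from candidate_ids."""
    return sorted(set(reference_ids) - set(candidate_ids))


def parse_meta(text):
    """Parse 'key: value' lines of a meta file into a dict."""
    meta = {}
    for line in text.splitlines():
        if ":" in line:
            key, value = line.split(":", 1)
            meta[key.strip()] = value.strip()
    return meta


def meta_issues(meta):
    """Flag the specific meta-file problems raised in this review."""
    issues = []
    text = " ".join(
        meta.get(field, "") for field in ("profile_description", "description")
    )
    if "MK-IMPACT" in text:
        issues.append('typo: "MK-IMPACT" should be "MSK-IMPACT"')
    if "TCGA" in text:
        issues.append("description mentions TCGA; confirm the data source")
    if meta.get("stable_id") == "gistic":
        issues.append("stable_id is gistic; review asks for cna")
    return issues


# Usage with made-up IDs and a made-up meta file:
clinical_samples = ["S1", "S2", "C1"]
panel_matrix_samples = ["S1", "S2"]
print(missing_from(clinical_samples, panel_matrix_samples))

example_meta = parse_meta(
    "stable_id: gistic\nprofile_description: Copy number data from MK-IMPACT."
)
print(meta_issues(example_meta))
```

The same `missing_from` helper can compare patient IDs referenced by the sample file against the records in data_clinical_patient.txt.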

3. pancan_pdmr_2025:

  • Rename study to Patient-Derived Models Repository (NCI, 2025)
  • The description says all models are xenografts. Is that correct? Can Sample Type be added to the clinical data, indicating whether models are PDX, PDC, PDO, etc.?
  • Cancer Type Detailed for one sample is NA
  • Missing mRNA zscores
  • What normalization was applied to the mRNA data (are these really RSEM values)?
  • What do the "Available from NCI PDMR" and "CTC-Derived Model" attributes mean? Can we add a description?
  • What does Germline Availability correspond to? That the patients have a matched germline sample, or that the samples have germline variants?
    *Pending author response
  • How were the patient and sample IDs mapped between the portal and the PDMR website?
  • Missing RNA case list
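
For the missing RNA case list, one option is to derive it from the header of the expression matrix. The sketch below assumes a tab-delimited matrix whose leading columns are `Hugo_Symbol` and `Entrez_Gene_Id`, and it assumes `rna_seq_mrna` as the case list suffix; both should be checked against the study files and the cBioPortal file format documentation. The function names and sample IDs are hypothetical.

```python
def sample_ids_from_header(header_line, id_columns=("Hugo_Symbol", "Entrez_Gene_Id")):
    """Extract sample IDs from the header row of an expression matrix."""
    columns = header_line.rstrip("\n").split("\t")
    return [c for c in columns if c not in id_columns]


def build_case_list(study_id, suffix, name, description, sample_ids):
    """Render the text of a case list file for the given samples."""
    lines = [
        f"cancer_study_identifier: {study_id}",
        f"stable_id: {study_id}_{suffix}",
        f"case_list_name: {name}",
        f"case_list_description: {description}",
        "case_list_ids: " + "\t".join(sample_ids),
    ]
    return "\n".join(lines) + "\n"


# Usage with a made-up header row:
header = "Hugo_Symbol\tEntrez_Gene_Id\tSAMPLE_A\tSAMPLE_B\n"
ids = sample_ids_from_header(header)
case_list_text = build_case_list(
    "pancan_pdmr_2025",
    "rna_seq_mrna",  # assumed suffix, verify before use
    "Samples with mRNA data",
    f"Samples with mRNA expression data ({len(ids)} samples)",
    ids,
)
print(case_list_text)
```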

4. hdcn_msk_2025:

  • Missing PubMed ID and citation in portal link
  • Paper mentions 498 patients, portal includes 639 patients. Is the portal cohort a larger registry?
    *Pending author response
  • meta_cna: update gistic to cna type. Typo in profile_description: "MK-IMPACT" should be "MSK-IMPACT".
  • Remove SV variants with no gene info from the SV file
  • Study folder has timeline files in the /additional_files folder. Are they needed?
  • Study description is misleading: study includes all histiocytosis subtypes (not only RAF-independent MEK mutants), also includes 500 ctDNA (MSK-ACCESS) samples
  • Update citation to Diamond et al. Cancer Cell 2026
  • Update GENDER -> SEX
  • Add sample_class: Tumor/cfDNA
    *Pending author response
  • SV case list: were ctDNA samples profiled?
    *Pending author response
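
Two of the items above (dropping SV rows with no gene info and renaming GENDER to SEX) are mechanical edits. The sketch below assumes the SV file is tab-delimited with `Site1_Hugo_Symbol` and `Site2_Hugo_Symbol` columns and that the clinical file has `#`-prefixed metadata rows above the attribute row; the function names are hypothetical, and the column names should be verified against the actual files.

```python
import csv
import io


def drop_sv_rows_without_genes(
    tsv_text, gene_columns=("Site1_Hugo_Symbol", "Site2_Hugo_Symbol")
):
    """Drop SV rows where every gene column is empty or "NA".

    Returns the filtered TSV text and the number of rows dropped.
    """
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    out = io.StringIO()
    writer = csv.DictWriter(
        out,
        fieldnames=reader.fieldnames,
        delimiter="\t",
        lineterminator="\n",
        extrasaction="ignore",
    )
    writer.writeheader()
    dropped = 0
    for row in reader:
        genes = [(row.get(col) or "").strip() for col in gene_columns]
        if any(g and g.upper() != "NA" for g in genes):
            writer.writerow(row)
        else:
            dropped += 1
    return out.getvalue(), dropped


def rename_clinical_attribute(text, old="GENDER", new="SEX"):
    """Rename an attribute in the first non-comment row of a clinical file."""
    lines = text.split("\n")
    for i, line in enumerate(lines):
        if line.startswith("#"):
            continue  # skip metadata rows
        lines[i] = "\t".join(new if c == old else c for c in line.split("\t"))
        break
    return "\n".join(lines)


# Usage with made-up rows:
sv_text = (
    "Sample_Id\tSite1_Hugo_Symbol\tSite2_Hugo_Symbol\n"
    "S1\tBRAF\t\n"
    "S2\t\t\n"
    "S3\tNA\tNA\n"
)
filtered, n_dropped = drop_sv_rows_without_genes(sv_text)
print(n_dropped)

clinical = "#Patient ID\tGender\nPATIENT_ID\tGENDER\nP1\tFemale"
print(rename_clinical_attribute(clinical))
```

Only the attribute row is renamed here; the `#` display-name rows are left untouched and may need a separate manual update.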

@ritikakundra


review_ucec_msk_dacruzpaula_2025.html
review_pancan_metaprism_2023.html
review_nfib_ctf_biobank_2025.html

@Rima-Waleed @BabyASatravada I ran a validation check using Claude. Not every point will apply, but the reports are a useful checklist to review against.
