We have developed the CancerGenes resource to simplify the process of gene selection and prioritization in large collaborative projects. CancerGenes combines gene lists annotated by experts with information from key public databases.

Further information on the resource architecture and list generation methods is available in the companion paper (Higgins et al., 2007) and in this website's methods section. Aaron Gabow is the primary architect and engineer of the current, updated resource.

Gene lists in the CancerGenes resource come from various sources and have been mapped to UCSC canonical gene ids. The gene list sources are described below.

Cancer Cell Map Pathways
We downloaded 10 BioPAX-formatted, cancer-related pathways from MSKCC's Cancer Cell Map website, extracted the protein accessions, and mapped them to UCSC gene ids.

Cancer Reviews
We downloaded supplementary data, or manually curated gene symbols, from four recently published reviews of cancer genes, and mapped them to UCSC gene ids. The citations for these publications follow:

  • Cancer Review: Futreal et al. 2004 = Futreal PA, Coin L, Marshall M, Down T, Hubbard T, Wooster R, Rahman N, Stratton MR. (2004) A census of human cancer genes. Nat Rev Cancer. Mar;4(3):177-83.
  • Cancer Review: Hahn and Weinberg 2002 = Hahn WC, Weinberg RA. (2002) Modeling the molecular circuitry of cancer. Nat Rev Cancer. May;2(5):331-41.
  • Cancer Review: Mitelman 2000 = Mitelman F. (2000) Recurrent chromosome aberrations in cancer. Mutat Res. Apr;462(2-3):247-53.
  • Cancer Review: Vogelstein and Kinzler 2004 = Vogelstein B, Kinzler KW. (2004) Cancer genes and the pathways they control. Nat Med. Aug;10(8):789-99.

Entrez Queries
We performed a series of queries to gene-centric Entrez databases at NCBI's eUtils site, extracted gene ids and mapped these to UCSC gene ids. The gene list titles and corresponding Entrez queries are as follows:

  • Entrez Query: Oncogene = union of two separate Entrez Gene and OMIM searches
    Gene search: "oncogene"[All Fields] AND "homo sapiens"[ORGN] AND "current only"[Filter] NOT "ras oncogene family"[Gene/Protein Name]
    OMIM search: "oncogene"[Title] OR "protooncogene"[Title] AND "has locus"[Properties], and then displayed Gene links
  • Entrez Query: Phosphatase = Gene search: (cd00047 OR pfam04387 OR pfam00102 OR smart00404 OR smart00194 OR pfam01451 OR cd00115 OR smart00195 OR pfam00782 OR cd00127 OR pfam06617 OR cd01530) AND "homo sapiens"[ORGN] AND "current only"[Filter]
  • Entrez Query: Protein Kinase = Gene search: "protein kinase"[GO] OR cd00192[Domain Name] OR "serine/threonine kinase"[GO] AND "homo sapiens"[ORGN] AND "current only"[Filter] NOT pseudogene[All Fields] NOT hypothetical
  • Entrez Query: Tumor Suppressor = Gene search: "tumor suppressor"[All Fields] AND "homo sapiens"[ORGN] AND "current only"[Filter]
  • Entrez Query: Tyrosine Kinase = Gene search: cd00192[Domain Name] AND "homo sapiens"[ORGN] AND "current only"[Filter]
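As an illustration only, one of the queries above could be submitted to the eUtils esearch endpoint roughly as follows. This is a minimal sketch, not the code used to build the resource: the function names, the `retmax` value, and the sample response are assumptions made for the example, and no network request is made here.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# Public NCBI E-utilities esearch endpoint.
EUTILS_ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"

# Query text copied from the "Entrez Query: Tumor Suppressor" list above.
TUMOR_SUPPRESSOR_QUERY = (
    '"tumor suppressor"[All Fields] AND "homo sapiens"[ORGN] '
    'AND "current only"[Filter]'
)


def build_esearch_url(term, db="gene", retmax=10000):
    """Return an esearch URL for `term` (hypothetical helper; retmax is an assumed cap)."""
    params = {"db": db, "term": term, "retmax": retmax}
    return EUTILS_ESEARCH + "?" + urlencode(params)


def parse_id_list(xml_text):
    """Extract the ids from the <IdList> element of an esearch XML response."""
    root = ET.fromstring(xml_text)
    return [id_node.text for id_node in root.findall("./IdList/Id")]


url = build_esearch_url(TUMOR_SUPPRESSOR_QUERY)

# Made-up response fragment with placeholder ids, used only to show the parsing step.
sample_response = (
    "<eSearchResult><Count>2</Count>"
    "<IdList><Id>1001</Id><Id>1002</Id></IdList></eSearchResult>"
)
gene_ids = parse_id_list(sample_response)
```

The resulting Entrez Gene ids would then still need to be mapped to UCSC gene ids, as described above.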

Sanger Cancer Gene Census (CGC)
We downloaded the supplementary table from the Sanger Institute's Cancer Gene Census website. For each of the following codes in the column labeled "Mutation Type", we extracted the Entrez Gene ids of the matching genes and entered them into the gene list with the corresponding title:

  • Sanger CGC: Translocation = T
  • Sanger CGC: Missense mutation = Mis
  • Sanger CGC: Frameshift mutation = F
  • Sanger CGC: Nonsense mutation = N
  • Sanger CGC: Splicing mutation = S
  • Sanger CGC: Large deletion = D
  • Sanger CGC: Amplification = A
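The code-to-list assignment above can be sketched as follows. This is a hedged example rather than the actual pipeline: it assumes a tab-delimited table, an Entrez Gene id column named `GeneID` (a hypothetical name), and comma-separated codes in the "Mutation Type" column. The sample rows are invented placeholders.

```python
import csv
import io
from collections import defaultdict

# Mutation type codes and gene list titles, as listed above.
CODE_TO_TITLE = {
    "T": "Sanger CGC: Translocation",
    "Mis": "Sanger CGC: Missense mutation",
    "F": "Sanger CGC: Frameshift mutation",
    "N": "Sanger CGC: Nonsense mutation",
    "S": "Sanger CGC: Splicing mutation",
    "D": "Sanger CGC: Large deletion",
    "A": "Sanger CGC: Amplification",
}


def build_cgc_lists(table_text, id_column="GeneID", type_column="Mutation Type"):
    """Group Entrez Gene ids by mutation type code.

    `id_column` is an assumed column name; adjust it to the real table header.
    """
    lists = defaultdict(set)
    reader = csv.DictReader(io.StringIO(table_text), delimiter="\t")
    for row in reader:
        gene_id = (row.get(id_column) or "").strip()
        if not gene_id:
            continue  # skip rows without an Entrez Gene id
        # Assumption: a gene may carry several codes, separated by commas.
        for code in (row.get(type_column) or "").split(","):
            title = CODE_TO_TITLE.get(code.strip())
            if title:
                lists[title].add(gene_id)
    return {title: sorted(ids) for title, ids in lists.items()}


# Invented placeholder rows, for illustration only.
sample_table = (
    "GeneID\tMutation Type\n"
    "1001\tT, Mis\n"
    "1002\tD, Mis, N\n"
    "1003\tA, T\n"
)
cgc_lists = build_cgc_lists(sample_table)
```

Codes that are not in the mapping are ignored, so a gene only enters the seven lists named above.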

Sanger Catalogue of Somatic Mutations in Cancer (COSMIC)
We downloaded the latest complete COSMIC data table from the Sanger FTP site and extracted Entrez Gene ids from the first column of that table. These ids were mapped to UCSC gene ids and entered into the gene list entitled "Sanger COSMIC: Somatic mutations." COSMIC also has an interactive website.
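A minimal sketch of the extract-and-map step is shown below. It assumes a tab-delimited table with a header row and an in-memory lookup from Entrez Gene ids to UCSC gene ids. The helper names, the lookup table, and all ids are hypothetical placeholders, not real data.

```python
def first_column_ids(table_text, delimiter="\t", has_header=True):
    """Return the unique, non-empty values of the first column, in file order."""
    lines = table_text.splitlines()
    if has_header:
        lines = lines[1:]
    ids, seen = [], set()
    for line in lines:
        if not line.strip():
            continue  # ignore blank lines
        value = line.split(delimiter)[0].strip()
        if value and value not in seen:
            seen.add(value)
            ids.append(value)
    return ids


def map_ids(entrez_ids, entrez_to_ucsc):
    """Split ids into (mapped UCSC ids, Entrez ids with no mapping)."""
    mapped, unmapped = [], []
    for gene_id in entrez_ids:
        if gene_id in entrez_to_ucsc:
            mapped.append(entrez_to_ucsc[gene_id])
        else:
            unmapped.append(gene_id)
    return mapped, unmapped


# Placeholder table and lookup, for illustration only.
sample_table = "gene_id\tsample\n1001\ts1\n1002\ts2\n1001\ts3\n1003\ts4\n"
placeholder_lookup = {"1001": "ucsc_id_A", "1002": "ucsc_id_B"}

entrez_ids = first_column_ids(sample_table)
ucsc_ids, unmapped_ids = map_ids(entrez_ids, placeholder_lookup)
```

Keeping the unmapped ids separate makes it easy to review genes that have no UCSC canonical id rather than dropping them silently.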

Prostate Cancer List (MSKCC)
We downloaded the BioPAX-formatted androgen receptor pathway from MSKCC's Cancer Cell Map and modified it based on changes seen in copy number data and in mutation data from gene sequencing. We also manually curated data from several references, the principal of which were:

Schröder FH. Progress in understanding androgen-independent prostate cancer (AIPC): a review of potential endocrine-mediated mechanisms. Eur Urol. 2008 Jun;53(6):1129-37.

Tomlins SA, Mehra R, Rhodes DR, Cao X, Wang L, Dhanasekaran SM, Kalyana-Sundaram S, Wei JT, Rubin MA, Pienta KJ, Shah RB, Chinnaiyan AM. Integrative molecular concept modeling of prostate cancer progression. Nat Genet. 2007 Jan;39(1):41-51.

Ergün A, Lawrence CA, Kohanski MA, Brennan TA, Collins JJ. A network biology approach to prostate cancer. Mol Syst Biol. 2007;3:82. Epub 2007 Feb 13.
