Gene Panels & Activity Scores
ExposoGraph ships with curated reference gene panels and CPIC-standardized activity scores from the CarcinoGenomic Platform (Tables 3–6).
Each seed gene now includes structured database provenance, and the package
exposes a formal source manifest through
CURATION_SOURCE_MANIFEST.
The manuscript-aligned primary curation sources are IARC, KEGG, PharmVar,
CPIC, CTD, GTEx, and PubMed. ExposoGraph also uses NCBI Gene and ClinPGx as
supporting implementation sources for stable identifiers and pharmacogene
coverage.
The curated KEGG pathway catalog exposed through
REFERENCE_KEGG_PATHWAYS currently tracks
hsa00980 (xenobiotic metabolism by cytochrome P450), hsa00140
(steroid hormone biosynthesis), hsa05204 (chemical carcinogenesis - DNA
adducts), and hsa05208 (chemical carcinogenesis - reactive oxygen species).
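As a rough illustration, the four pathway identifiers listed above can be held in a plain mapping and checked against the KEGG human pathway ID shape (`hsa` followed by five digits). The dictionary and helper names below are hypothetical stand-ins built from the text; the actual structure of `REFERENCE_KEGG_PATHWAYS` may differ.

```python
import re
from typing import Optional

# Hypothetical stand-in for the catalog. IDs and names come from the text above;
# the real REFERENCE_KEGG_PATHWAYS object may be structured differently.
kegg_pathways = {
    "hsa00980": "xenobiotic metabolism by cytochrome P450",
    "hsa00140": "steroid hormone biosynthesis",
    "hsa05204": "chemical carcinogenesis - DNA adducts",
    "hsa05208": "chemical carcinogenesis - reactive oxygen species",
}

# KEGG human pathway IDs use the organism prefix "hsa" plus a five-digit number.
KEGG_HUMAN_PATHWAY_ID = re.compile(r"hsa\d{5}")


def is_human_pathway_id(pathway_id: str) -> bool:
    """Return True if the string has the shape of a KEGG human pathway ID."""
    return KEGG_HUMAN_PATHWAY_ID.fullmatch(pathway_id) is not None


def pathway_name(pathway_id: str) -> Optional[str]:
    """Look up a tracked pathway name, or None if the ID is not in the catalog."""
    return kegg_pathways.get(pathway_id)


for pathway_id, name in kegg_pathways.items():
    print(pathway_id, is_human_pathway_id(pathway_id), name)
```

A shape check like this only catches malformed identifiers; it does not confirm that an ID exists in KEGG.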
Terminology:
- Tier refers to panel priority or inclusion level.
- Phase is reserved for Phase I, II, and III metabolism/transport labels.
- DNA repair genes are classified by repair class such as BER, NER, or Direct Reversal.
Tier 1: Core Panel (13 genes)
The core carcinogen-metabolizing enzyme panel:
| Gene | Classification | Role | Detail |
|---|---|---|---|
| CYP1A1 | Phase I | Activation | PAH diol-epoxide formation; AhR-inducible; extrahepatic expression |
| CYP1A2 | Phase I | Activation | HCA and aromatic amine activation; AhR-inducible; hepatic |
| CYP1B1 | Phase I | Activation | PAH and estrogen activation; 4-OH-estradiol formation |
| CYP2A6 | Phase I | Activation | NNK and nitrosamine activation; nicotine metabolism |
| CYP2E1 | Phase I | Activation | Small-molecule carcinogen activation; ethanol, benzene, NDMA |
| CYP3A4 | Phase I | Activation | Aflatoxin B1 8,9-epoxidation; broad substrate range |
| GSTM1 | Phase II | Detoxification | GSH conjugation of PAH diol-epoxides; null polymorphism common |
| GSTT1 | Phase II | Detoxification | GSH conjugation of small electrophiles; null polymorphism common |
| GSTP1 | Phase II | Detoxification | GSH conjugation of BPDE and PAH diol-epoxides |
| NAT2 | Phase II | Mixed | N-acetylation of aromatic amines; rapid vs slow acetylator |
| EPHX1 | Phase II | Mixed | Microsomal epoxide hydrolysis; activation and detoxification |
| UGT1A1 | Phase II | Detoxification | Glucuronidation of PAH metabolites and bilirubin |
| NQO1 | Phase II | Detoxification | Two-electron quinone reduction; prevents ROS from redox cycling |
Tier 2: Extended Panel (23 genes)
Additional Phase I, II, III, and DNA repair genes:
| Gene | Classification | Role | Detail |
|---|---|---|---|
| SULT1A1 | Phase II | Detoxification | Sulfation of phenolic compounds, HCA intermediates |
| NAT1 | Phase II | Mixed | O-acetylation of aromatic/heterocyclic amines in peripheral tissues |
| UGT2B7 | Phase II | Detoxification | Glucuronidation of steroid hormones, carcinogen metabolites |
| UGT2B17 | Phase II | Detoxification | Glucuronidation of testosterone, DHT, and related androgen metabolites |
| SULT1E1 | Phase II | Detoxification | High-affinity estrogen sulfotransferase; estradiol and catechol-estrogen inactivation |
| COMT | Phase II | Detoxification | O-methylation of catechol estrogens; limits redox-cycling estrogen metabolites |
| ABCB1 | Phase III | Transport | P-gp; ATP-driven efflux of hydrophobic xenobiotics |
| ABCC2 | Phase III | Transport | MRP2; export of GSH/glucuronide/sulfate conjugates |
| ABCG2 | Phase III | Transport | BCRP; efflux of PAHs, PhIP, porphyrins |
| XRCC1 | DNA Repair (BER) | Repair | BER scaffold protein; oxidative/alkylation DNA damage |
| XPC | DNA Repair (NER) | Repair | GG-NER damage sensor; bulky DNA adducts from PAHs |
| ERCC2/XPD | DNA Repair (NER) | Repair | NER helicase; unwinds DNA at damage sites |
| OGG1 | DNA Repair (BER) | Repair | 8-oxoguanine DNA glycosylase; oxidative DNA damage |
| MGMT | DNA Repair (Direct Reversal) | Repair | Direct reversal of O6-alkylguanine; suicidal repair enzyme |
| CYP2C9 | Phase I | Mixed | Oxidation of PAH metabolites; major drug-metabolizing CYP |
| CYP2C19 | Phase I | Mixed | Minor procarcinogen activation; nitrosamine metabolism |
| CYP2D6 | Phase I | Mixed | Minor NNK activation; dual PGx/carcinogen-risk reporting |
| CYP2A13 | Phase I | Activation | Primary lung NNK-metabolizing CYP; tobacco-smoke activation |
| CYP17A1 | Phase I | Activation | Steroid 17alpha-hydroxylase/17,20-lyase; androgen precursor synthesis |
| SRD5A1 | Phase I | Activation | 5alpha-reductase type 1; peripheral testosterone-to-DHT conversion |
| SRD5A2 | Phase I | Activation | 5alpha-reductase type 2; major DHT-forming enzyme in prostate tissues |
| CYP19A1 | Phase I | Activation | Aromatase; converts androgen precursors to estrogens |
| AKR1C3 | Phase I | Mixed | Local androgen/estrogen activation with quinone-reductase overlap |
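The Classification labels in the panel tables either name a phase (for example `Phase II`) or combine `DNA Repair` with a repair class in parentheses. The sketch below shows one way such labels could be split and grouped. The `Classification` tuple and `parse_classification` helper are hypothetical names introduced for illustration, not part of the ExposoGraph API, and the rows are a small subset copied from the Tier 2 table.

```python
from collections import defaultdict
from typing import NamedTuple, Optional


class Classification(NamedTuple):
    category: str                # "Phase I", "Phase II", "Phase III", or "DNA Repair"
    repair_class: Optional[str]  # e.g. "BER"; None for metabolism/transport genes


def parse_classification(label: str) -> Classification:
    """Split 'DNA Repair (BER)' into its category and repair class.

    Labels without a parenthesized suffix are returned unchanged with
    repair_class set to None.
    """
    label = label.strip()
    if label.endswith(")") and "(" in label:
        category, _, rest = label.partition("(")
        return Classification(category.strip(), rest[:-1].strip())
    return Classification(label, None)


# A few (gene, classification) rows copied from the Tier 2 table.
rows = [
    ("ABCB1", "Phase III"),
    ("XRCC1", "DNA Repair (BER)"),
    ("XPC", "DNA Repair (NER)"),
    ("ERCC2/XPD", "DNA Repair (NER)"),
    ("OGG1", "DNA Repair (BER)"),
    ("MGMT", "DNA Repair (Direct Reversal)"),
]

# Group only the DNA repair genes by their repair class.
genes_by_repair_class = defaultdict(list)
for gene, label in rows:
    parsed = parse_classification(label)
    if parsed.repair_class is not None:
        genes_by_repair_class[parsed.repair_class].append(gene)

print(dict(genes_by_repair_class))
```

Keeping the repair class as a separate field avoids overloading the word "Phase" for DNA repair genes, which matches the terminology used on this page.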
Activity Scores
ExposoGraph currently ships activity-score tables for 18 genes. These tables mix two evidence classes:
- guideline-backed pharmacogene resources, primarily surfaced through ClinPGx/PharmVar
- research-use, literature-derived mappings for carcinogen metabolism and DNA repair genes
They should therefore be treated as referenced curation aids, not as a full
clinical pharmacogenomics engine. Each gene now has supporting evidence
metadata available through
ACTIVITY_SCORE_METADATA.
Each per-allele entry has:
- allele: star allele or variant name
- value: numeric activity score (0.0 = no function, 1.0 = normal, 2.0 = ultrarapid; intermediate values such as 0.25 or 0.5 fall between these anchors)
- phenotype: functional interpretation
- confidence: High or Moderate
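To make the entry shape concrete, here is a minimal validation sketch based only on the four fields described above. The `validate_entry` helper and its rules (non-negative numeric value, confidence restricted to High or Moderate) are assumptions for illustration; ExposoGraph may enforce different or additional constraints. The sample entry uses the `*10` row from the CYP2D6 example on this page.

```python
from typing import Any, List, Mapping

REQUIRED_FIELDS = ("allele", "value", "phenotype", "confidence")
ALLOWED_CONFIDENCE = {"High", "Moderate"}


def validate_entry(entry: Mapping[str, Any]) -> List[str]:
    """Return a list of problems found in one per-allele entry (empty if none)."""
    problems = []
    for field in REQUIRED_FIELDS:
        if field not in entry:
            problems.append(f"missing field: {field}")

    if "value" in entry:
        value = entry["value"]
        # bool is a subclass of int, so it is rejected explicitly.
        is_number = isinstance(value, (int, float)) and not isinstance(value, bool)
        if not is_number or value < 0:
            problems.append("value must be a non-negative number")

    if "confidence" in entry and entry["confidence"] not in ALLOWED_CONFIDENCE:
        problems.append("confidence must be High or Moderate")

    return problems


# Sample entry mirroring the *10 row of the CYP2D6 example.
example = {
    "allele": "*10",
    "value": 0.25,
    "phenotype": "NM (lowest); IM",
    "confidence": "High",
}
print(validate_entry(example))
print(validate_entry({"allele": "*1", "value": -1, "confidence": "Low"}))
```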
Example: CYP2D6
| Allele | Score | Phenotype | Confidence |
|---|---|---|---|
| *1, *2, *35 | 1.0 | Normal Metabolizer | High |
| *9, *17, *29, *41 | 0.5 | NM (lower range) | High |
| *10 | 0.25 | NM (lowest); IM | High |
| *3, *4, *5, *6, *40 | 0.0 | Poor Metabolizer | High |
| *1x2, *2x2 (dupl.) | 2.0 | Ultrarapid Metabolizer | High |
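Per-allele values like these are commonly combined into a diplotype activity score by adding the values of the two alleles. The sketch below does that for CYP2D6 and then maps the sum to a phenotype using the CPIC consensus thresholds for this gene (0 = poor, above 0 and below 1.25 = intermediate, 1.25 to 2.25 = normal, above 2.25 = ultrarapid). This page does not state whether ExposoGraph performs this translation, so the function names and the use of these thresholds are assumptions for illustration only, not a clinical tool.

```python
# Per-allele values copied from the CYP2D6 table on this page.
CYP2D6_ALLELE_VALUES = {
    "*1": 1.0, "*2": 1.0, "*35": 1.0,
    "*9": 0.5, "*17": 0.5, "*29": 0.5, "*41": 0.5,
    "*10": 0.25,
    "*3": 0.0, "*4": 0.0, "*5": 0.0, "*6": 0.0, "*40": 0.0,
    "*1x2": 2.0, "*2x2": 2.0,
}


def diplotype_activity_score(allele_a: str, allele_b: str) -> float:
    """Sum the two per-allele values; raises KeyError for an unlisted allele."""
    return CYP2D6_ALLELE_VALUES[allele_a] + CYP2D6_ALLELE_VALUES[allele_b]


def cyp2d6_phenotype(score: float) -> str:
    """Map a diplotype score to a phenotype.

    Assumption: CPIC consensus CYP2D6 thresholds, which are not stated on this page.
    """
    if score == 0:
        return "Poor Metabolizer"
    if score < 1.25:
        return "Intermediate Metabolizer"
    if score <= 2.25:
        return "Normal Metabolizer"
    return "Ultrarapid Metabolizer"


for pair in [("*1", "*1"), ("*1", "*4"), ("*4", "*4"), ("*1", "*10"), ("*1x2", "*1")]:
    score = diplotype_activity_score(*pair)
    print("/".join(pair), score, cyp2d6_phenotype(score))
```

Because every value in the table is a multiple of 0.25, the sums are exact in floating point and the threshold comparisons behave as written.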
The full set covers 18 genes. See
ACTIVITY_SCORES for the score table and
ACTIVITY_SCORE_METADATA for the references.
Usage
from ExposoGraph import (
    build_full_panel,
    get_activity_scores,
    get_activity_score_references,
)

# Load all 36 genes (Tier 1 + Tier 2) into a KnowledgeGraph
kg = build_full_panel()

# Look up scores for a specific gene; "or []" guards against genes that
# have no score table, mirroring the reference lookup below
scores = get_activity_scores("CYP1A1") or []
for entry in scores:
    print(f"{entry['allele']}: {entry['value']} - {entry['phenotype']}")

# Inspect the supporting references (PMID if present, else the record ID)
for ref in get_activity_score_references("CYP1A1") or []:
    print(ref["source_db"], ref.get("pmid") or ref.get("record_id"), ref["url"])