Gene Panels & Activity Scores

ExposoGraph ships with curated reference gene panels and CPIC-standardized activity scores from the CarcinoGenomic Platform (Tables 3–6).

Each seed gene now includes structured database provenance, and the package exposes a formal source manifest through CURATION_SOURCE_MANIFEST. The manuscript-aligned primary curation sources are IARC, KEGG, PharmVar, CPIC, CTD, GTEx, and PubMed. ExposoGraph also uses NCBI Gene and ClinPGx as supporting implementation sources for stable identifiers and pharmacogene coverage.

The curated KEGG pathway catalog exposed through REFERENCE_KEGG_PATHWAYS currently tracks hsa00980 (xenobiotic metabolism by cytochrome P450), hsa00140 (steroid hormone biosynthesis), hsa05204 (chemical carcinogenesis - DNA adducts), and hsa05208 (chemical carcinogenesis - reactive oxygen species).

Terminology:

  • Tier refers to panel priority or inclusion level.

  • Phase is reserved for Phase I, II, and III metabolism/transport labels.

  • DNA repair genes are classified by repair class such as BER, NER, or Direct Reversal.
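One way to keep these three terms from being mixed up in code is to give each its own type. The sketch below is illustrative only: `Phase`, `RepairClass`, and `GeneLabel` are hypothetical names, not ExposoGraph classes, and the rule that a gene carries either a phase or a repair class (never both) is an assumption read off the panel tables.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Phase(Enum):
    """Metabolism/transport phase labels."""
    PHASE_I = "Phase I"
    PHASE_II = "Phase II"
    PHASE_III = "Phase III"


class RepairClass(Enum):
    """DNA repair classes named in the terminology list."""
    BER = "BER"
    NER = "NER"
    DIRECT_REVERSAL = "Direct Reversal"


@dataclass(frozen=True)
class GeneLabel:
    symbol: str
    tier: int  # panel priority (1 = core, 2 = extended), not a metabolic phase
    phase: Optional[Phase] = None
    repair_class: Optional[RepairClass] = None

    def __post_init__(self) -> None:
        # Assumption: exactly one of phase / repair_class is set per gene.
        if (self.phase is None) == (self.repair_class is None):
            raise ValueError("set exactly one of phase or repair_class")

    @property
    def classification(self) -> str:
        """Render the label the way the panel tables display it."""
        if self.phase is not None:
            return self.phase.value
        return f"DNA Repair ({self.repair_class.value})"


print(GeneLabel("CYP1A1", tier=1, phase=Phase.PHASE_I).classification)
print(GeneLabel("MGMT", tier=2, repair_class=RepairClass.DIRECT_REVERSAL).classification)
```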

Tier 1: Core Panel (13 genes)

The core carcinogen-metabolizing enzyme panel:

| Gene | Classification | Role | Detail |
|------|----------------|------|--------|
| CYP1A1 | Phase I | Activation | PAH diol-epoxide formation; AhR-inducible; extrahepatic expression |
| CYP1A2 | Phase I | Activation | HCA and aromatic amine activation; AhR-inducible; hepatic |
| CYP1B1 | Phase I | Activation | PAH and estrogen activation; 4-OH-estradiol formation |
| CYP2A6 | Phase I | Activation | NNK and nitrosamine activation; nicotine metabolism |
| CYP2E1 | Phase I | Activation | Small-molecule carcinogen activation; ethanol, benzene, NDMA |
| CYP3A4 | Phase I | Activation | Aflatoxin B1 8,9-epoxidation; broad substrate range |
| GSTM1 | Phase II | Detoxification | GSH conjugation of PAH diol-epoxides; null polymorphism common |
| GSTT1 | Phase II | Detoxification | GSH conjugation of small electrophiles; null polymorphism common |
| GSTP1 | Phase II | Detoxification | GSH conjugation of BPDE and PAH diol-epoxides |
| NAT2 | Phase II | Mixed | N-acetylation of aromatic amines; rapid vs slow acetylator |
| EPHX1 | Phase II | Mixed | Microsomal epoxide hydrolysis; activation and detoxification |
| UGT1A1 | Phase II | Detoxification | Glucuronidation of PAH metabolites and bilirubin |
| NQO1 | Phase II | Detoxification | Two-electron quinone reduction; prevents ROS from redox cycling |
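The composition of the core panel can be checked with a quick tally. The sketch below copies the Gene, Classification, and Role columns from the Tier 1 table into a local list (`CORE_PANEL` is a hypothetical name, not an ExposoGraph export) and counts genes per role and per phase.

```python
from collections import Counter

# (gene, classification, role) rows transcribed from the Tier 1 table.
CORE_PANEL = [
    ("CYP1A1", "Phase I", "Activation"),
    ("CYP1A2", "Phase I", "Activation"),
    ("CYP1B1", "Phase I", "Activation"),
    ("CYP2A6", "Phase I", "Activation"),
    ("CYP2E1", "Phase I", "Activation"),
    ("CYP3A4", "Phase I", "Activation"),
    ("GSTM1", "Phase II", "Detoxification"),
    ("GSTT1", "Phase II", "Detoxification"),
    ("GSTP1", "Phase II", "Detoxification"),
    ("NAT2", "Phase II", "Mixed"),
    ("EPHX1", "Phase II", "Mixed"),
    ("UGT1A1", "Phase II", "Detoxification"),
    ("NQO1", "Phase II", "Detoxification"),
]

# Count how many genes fall under each role and each phase label.
role_counts = Counter(role for _, _, role in CORE_PANEL)
phase_counts = Counter(phase for _, phase, _ in CORE_PANEL)

print(len(CORE_PANEL), dict(role_counts), dict(phase_counts))
```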

Tier 2: Extended Panel (23 genes)

Additional Phase I, II, III, and DNA repair genes:

| Gene | Classification | Role | Detail |
|------|----------------|------|--------|
| SULT1A1 | Phase II | Detoxification | Sulfation of phenolic compounds, HCA intermediates |
| NAT1 | Phase II | Mixed | O-acetylation of aromatic/heterocyclic amines in peripheral tissues |
| UGT2B7 | Phase II | Detoxification | Glucuronidation of steroid hormones, carcinogen metabolites |
| UGT2B17 | Phase II | Detoxification | Glucuronidation of testosterone, DHT, and related androgen metabolites |
| SULT1E1 | Phase II | Detoxification | High-affinity estrogen sulfotransferase; estradiol and catechol-estrogen inactivation |
| COMT | Phase II | Detoxification | O-methylation of catechol estrogens; limits redox-cycling estrogen metabolites |
| ABCB1 | Phase III | Transport | P-gp; ATP-driven efflux of hydrophobic xenobiotics |
| ABCC2 | Phase III | Transport | MRP2; export of GSH/glucuronide/sulfate conjugates |
| ABCG2 | Phase III | Transport | BCRP; efflux of PAHs, PhIP, porphyrins |
| XRCC1 | DNA Repair (BER) | Repair | BER scaffold protein; oxidative/alkylation DNA damage |
| XPC | DNA Repair (NER) | Repair | GG-NER damage sensor; bulky DNA adducts from PAHs |
| ERCC2/XPD | DNA Repair (NER) | Repair | NER helicase; unwinds DNA at damage sites |
| OGG1 | DNA Repair (BER) | Repair | 8-oxoguanine DNA glycosylase; oxidative DNA damage |
| MGMT | DNA Repair (Direct Reversal) | Repair | Direct reversal of O6-alkylguanine; suicidal repair enzyme |
| CYP2C9 | Phase I | Mixed | Oxidation of PAH metabolites; major drug-metabolizing CYP |
| CYP2C19 | Phase I | Mixed | Minor procarcinogen activation; nitrosamine metabolism |
| CYP2D6 | Phase I | Mixed | Minor NNK activation; dual PGx/carcinogen-risk reporting |
| CYP2A13 | Phase I | Activation | Primary lung NNK-metabolizing CYP; tobacco-smoke activation |
| CYP17A1 | Phase I | Activation | Steroid 17alpha-hydroxylase/17,20-lyase; androgen precursor synthesis |
| SRD5A1 | Phase I | Activation | 5alpha-reductase type 1; peripheral testosterone-to-DHT conversion |
| SRD5A2 | Phase I | Activation | 5alpha-reductase type 2; major DHT-forming enzyme in prostate tissues |
| CYP19A1 | Phase I | Activation | Aromatase; converts androgen precursors to estrogens |
| AKR1C3 | Phase I | Mixed | Local androgen/estrogen activation with quinone-reductase overlap |
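The Classification column mixes plain phase labels with repair labels of the form "DNA Repair (BER)". If those labels ever need to be split back into a category and a repair class, a small parser is enough. This is a minimal sketch; `Classification` and `parse_classification` are hypothetical helpers, and whether ExposoGraph stores the label as a single string is an assumption.

```python
import re
from typing import NamedTuple, Optional


class Classification(NamedTuple):
    category: str                # e.g. "Phase II" or "DNA Repair"
    repair_class: Optional[str]  # e.g. "BER"; None for phase labels


# A category, optionally followed by a parenthesised repair class.
_LABEL = re.compile(r"^(?P<category>[^()]+?)(?:\s*\((?P<repair>[^()]+)\))?$")


def parse_classification(label: str) -> Classification:
    """Split a table label into its category and optional repair class."""
    match = _LABEL.match(label.strip())
    if match is None:
        raise ValueError(f"unrecognised classification label: {label!r}")
    return Classification(match.group("category").strip(), match.group("repair"))


print(parse_classification("Phase III"))
print(parse_classification("DNA Repair (Direct Reversal)"))
```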

Activity Scores

ExposoGraph currently ships activity-score tables for 18 genes. These tables mix two evidence classes:

  • guideline-backed pharmacogene resources, primarily surfaced through ClinPGx/PharmVar

  • research-use literature-derived mappings for carcinogen metabolism and DNA repair genes

They should therefore be treated as referenced curation aids, not as a full clinical pharmacogenomics engine. Each gene now has supporting evidence metadata available through ACTIVITY_SCORE_METADATA.

Each per-allele entry has:

  • allele: Star allele or variant name

  • value: Numeric activity score (0.0 = no function, 1.0 = normal, 2.0 = ultrarapid; intermediate values such as 0.25 and 0.5 indicate decreased function)

  • phenotype: Functional interpretation

  • confidence: High or Moderate
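The four fields above can be written down as a typed record. The sketch below is an assumption about shape only: `AlleleScore` and `validate_entry` are hypothetical names, and the real entries may be plain dictionaries or another type with additional keys.

```python
from typing import Literal, TypedDict


class AlleleScore(TypedDict):
    allele: str       # star allele or variant name
    value: float      # numeric activity score
    phenotype: str    # functional interpretation
    confidence: Literal["High", "Moderate"]


_REQUIRED = {"allele", "value", "phenotype", "confidence"}


def validate_entry(entry: dict) -> AlleleScore:
    """Check that an entry carries the four documented fields with sane values."""
    missing = _REQUIRED - entry.keys()
    if missing:
        raise ValueError(f"missing fields: {sorted(missing)}")
    if entry["confidence"] not in ("High", "Moderate"):
        raise ValueError(f"unexpected confidence: {entry['confidence']!r}")
    if entry["value"] < 0:
        raise ValueError("activity score must be non-negative")
    return AlleleScore(
        allele=entry["allele"],
        value=float(entry["value"]),
        phenotype=entry["phenotype"],
        confidence=entry["confidence"],
    )


# Example row taken from the CYP2D6 table in this section.
row = validate_entry(
    {"allele": "*10", "value": 0.25, "phenotype": "NM (lowest); IM", "confidence": "High"}
)
print(row)
```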

Example: CYP2D6

| Allele | Score | Phenotype | Confidence |
|--------|-------|-----------|------------|
| *1, *2, *35 | 1.0 | Normal Metabolizer | High |
| *9, *17, *29, *41 | 0.5 | NM (lower range) | High |
| *10 | 0.25 | NM (lowest); IM | High |
| *3, *4, *5, *6, *40 | 0.0 | Poor Metabolizer | High |
| *1x2, *2x2 (dupl.) | 2.0 | Ultrarapid Metabolizer | High |

The full set covers 18 genes. See ACTIVITY_SCORES for the score table and ACTIVITY_SCORE_METADATA for the references.
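To show how per-allele values like those in the CYP2D6 example are typically combined, the sketch below sums the two allele scores of a diplotype and maps the total to a phenotype. The allele values come from the table above; the cut-offs (0 = poor, above 0 and below 1.25 = intermediate, 1.25 to 2.25 = normal, above 2.25 = ultrarapid) follow the CPIC consensus for CYP2D6 and are an assumption here, since this section does not state how ExposoGraph itself derives diplotype phenotypes. The names `CYP2D6_ALLELE_VALUES`, `diplotype_score`, and `cyp2d6_phenotype` are hypothetical.

```python
# Per-allele values transcribed from the CYP2D6 example table.
CYP2D6_ALLELE_VALUES = {
    "*1": 1.0, "*2": 1.0, "*35": 1.0,
    "*9": 0.5, "*17": 0.5, "*29": 0.5, "*41": 0.5,
    "*10": 0.25,
    "*3": 0.0, "*4": 0.0, "*5": 0.0, "*6": 0.0, "*40": 0.0,
    "*1x2": 2.0, "*2x2": 2.0,
}


def diplotype_score(allele_a: str, allele_b: str) -> float:
    """Sum the activity values of the two alleles in a diplotype."""
    return CYP2D6_ALLELE_VALUES[allele_a] + CYP2D6_ALLELE_VALUES[allele_b]


def cyp2d6_phenotype(score: float) -> str:
    """Map a summed score to a phenotype (assumed CPIC consensus cut-offs)."""
    if score == 0:
        return "Poor Metabolizer"
    if score < 1.25:
        return "Intermediate Metabolizer"
    if score <= 2.25:
        return "Normal Metabolizer"
    return "Ultrarapid Metabolizer"


for pair in [("*1", "*1"), ("*1", "*4"), ("*4", "*4"), ("*1x2", "*1")]:
    total = diplotype_score(*pair)
    print("/".join(pair), total, cyp2d6_phenotype(total))
```

This is research-use arithmetic only and, like the score tables themselves, is not a substitute for a clinical pharmacogenomics engine.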

Usage

from ExposoGraph import (
    build_full_panel,
    get_activity_scores,
    get_activity_score_references,
)

# Load all 36 genes into a KnowledgeGraph
kg = build_full_panel()

# Look up scores for a specific gene
scores = get_activity_scores("CYP1A1")
for entry in scores:
    print(f"{entry['allele']}: {entry['value']} ({entry['phenotype']})")

# Inspect the supporting references
for ref in get_activity_score_references("CYP1A1") or []:
    print(ref["source_db"], ref.get("pmid") or ref.get("record_id"), ref["url"])