D2: Traceable AI Recommendations | ICAD Principles

What does this mean?

When a decision is made on governed data (whether by a human signing a Certificate of Analysis, an AI system flagging a stability deviation, or an automated workflow releasing a batch), D2 requires that the decision be traceable backward through the entire ICAD sequence: from the decision (D), through the analysis (A) that informed it and the contextualized data (C) the analysis operated on, to the original instrument measurements (I) that produced the data.

This is not a summary or explanation. It is a full provenance chain, navigable by scientists and machines, extending the C3 lineage graph through the analytical layer (A3) into the decision layer.

The regulatory imperative

The FDA's 2023 discussion paper on AI/ML in drug development states that "sponsors should be able to describe and explain the AI/ML model's function, including input data, output data, and logic." The European Medicines Agency (EMA) reflection paper on AI in the pharmaceutical lifecycle (2023) states that "the use of AI/ML methods should be transparent and the impact on the quality, safety and efficacy of medicinal products should be traceable."

Traceability is not a nice-to-have. It is a condition for regulatory acceptance of AI-driven decisions in pharmaceutical manufacturing, quality control, and regulatory submission.

Operational test

For any AI-generated recommendation:

(1) Can you identify which data points the AI considered?
(2) Can you verify that each data point has complete I→C→A lineage?
(3) Can you identify the analytical model or algorithm that produced the recommendation (A3)?
(4) Can you articulate the decision logic: why this recommendation and not an alternative?

If any of these questions requires reverse-engineering from the AI's output rather than following a recorded provenance chain, D2 is not satisfied.
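The four questions can be read as checks against recorded metadata. The following is a minimal sketch, assuming a recommendation record that stores its data points, model identifier, and decision logic. The class names, field names, and the `d2_operational_test` function are hypothetical and are not part of any ICAD specification.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set

# Layers that must be recorded for every data point (instrument, context, analysis).
LINEAGE_LAYERS = {"I", "C", "A"}


@dataclass
class DataPoint:
    point_id: str
    # Layers for which a lineage record exists, e.g. {"I", "C", "A"}.
    recorded_layers: Set[str] = field(default_factory=set)


@dataclass
class Recommendation:
    rec_id: str
    data_points: List[DataPoint] = field(default_factory=list)  # question 1
    model_id: Optional[str] = None                              # question 3
    decision_logic: Optional[str] = None                        # question 4


def d2_operational_test(rec: Recommendation) -> List[str]:
    """Return the failed checks; an empty list means all four questions pass."""
    failures = []
    if not rec.data_points:
        failures.append("1: no recorded data points")
    incomplete = [p.point_id for p in rec.data_points
                  if not LINEAGE_LAYERS <= p.recorded_layers]
    if incomplete:
        failures.append(f"2: incomplete I->C->A lineage for {incomplete}")
    if not rec.model_id:
        failures.append("3: no recorded model or algorithm")
    if not rec.decision_logic:
        failures.append("4: no recorded decision logic")
    return failures
```

Every check reads only what was recorded with the recommendation. Nothing is inferred from the AI's output, which is the distinction the operational test draws.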

Explainability by construction, not post-hoc

D2 requires traceability by construction — built into the system architecture — not post-hoc explainability bolted on after the fact. The distinction matters:

Post-hoc explainability: An AI model produces an opaque output. A separate "explainability module" generates an approximate explanation (e.g., SHAP values, attention maps). The explanation is an interpretation of the model, not a trace of its actual computation.
Traceability by construction: The AI system records, at each step of its computation, which data it accessed, which models it invoked, which intermediate results it produced, and how it arrived at its recommendation. The trace is the actual computation path, not an interpretation.

D2 requires the second approach. In a regulated environment, approximate explanations are insufficient. The auditor requires the actual provenance chain.
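One way to picture traceability by construction is a wrapper that records each computation step at the moment it runs. The sketch below is illustrative only: the `Trace` and `TraceStep` classes, the step names, and the numeric values are assumptions made for the example, not a prescribed implementation.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Any, Callable, Dict, List


@dataclass
class TraceStep:
    name: str
    inputs: Dict[str, Any]
    output: Any


@dataclass
class Trace:
    steps: List[TraceStep] = field(default_factory=list)

    def run(self, name: str, fn: Callable[..., Any], **inputs: Any) -> Any:
        """Execute one step and record its inputs and output as it happens."""
        output = fn(**inputs)
        self.steps.append(TraceStep(name, dict(inputs), output))
        return output

    def to_json(self) -> str:
        """Serialize the recorded steps so a reviewer or a machine can read them."""
        return json.dumps([asdict(s) for s in self.steps], indent=2)


def mean(values):
    return sum(values) / len(values)


def exceeds(value, limit):
    return value > limit


# Illustrative values only; they do not come from any real study.
trace = Trace()
avg = trace.run("mean_purity_drop", mean, values=[0.5, 1.0, 1.5])
flag = trace.run("compare_to_limit", exceeds, value=avg, limit=0.75)
```

Because every step goes through `run`, the trace is a record of what was actually computed, in order. A post-hoc explainer would instead have to approximate this path from the final output.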

Example — AI-flagged stability deviation

An AI monitoring system flags a stability trend deviation for a biologic product. The D2 provenance chain shows: the AI identified the deviation by comparing the current 12-month purity trajectory against the historical model (A3, model version ST-v2.1) → the model was trained on 2,847 governed purity measurements from 42 batches (A2 governed comparison, method equivalence verified) → each measurement has full C3 lineage to SEC-HPLC instruments at three sites → the flagged batch's data was integrated via the integration pipeline (I1–I2) within 30 minutes of each instrument run. The quality reviewer can follow this chain from flag to instrument without leaving the system.
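The chain in this example can be modeled as a small directed graph that is walked backward from the flag. This is a sketch under stated assumptions: the `Node` class, the node identifiers, and both helper functions are hypothetical. Only the model version ST-v2.1 and the three-site SEC-HPLC setup come from the example above.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Node:
    node_id: str
    layer: str                      # one of "D", "A", "C", "I"
    description: str
    parents: Tuple[str, ...] = ()   # ids of upstream nodes


def trace_back(start: str, graph: Dict[str, Node]) -> List[Node]:
    """Depth-first walk from a decision node to everything upstream of it."""
    seen, order, stack = set(), [], [start]
    while stack:
        node_id = stack.pop()
        if node_id in seen:
            continue
        seen.add(node_id)
        node = graph[node_id]       # a KeyError here means the chain is broken
        order.append(node)
        stack.extend(reversed(node.parents))
    return order


def reaches_instruments(start: str, graph: Dict[str, Node]) -> bool:
    """True if every upstream branch terminates at an instrument (I) node."""
    return all(n.layer == "I" for n in trace_back(start, graph) if not n.parents)


# Hypothetical identifiers for the stability example.
nodes = [
    Node("flag-001", "D", "Stability trend deviation flag", ("model-ST-v2.1",)),
    Node("model-ST-v2.1", "A", "Historical purity model, version ST-v2.1",
         ("purity-dataset",)),
    Node("purity-dataset", "C", "Governed purity measurements with C3 lineage",
         ("sec-hplc-site-1", "sec-hplc-site-2", "sec-hplc-site-3")),
    Node("sec-hplc-site-1", "I", "SEC-HPLC instrument, site 1"),
    Node("sec-hplc-site-2", "I", "SEC-HPLC instrument, site 2"),
    Node("sec-hplc-site-3", "I", "SEC-HPLC instrument, site 3"),
]
graph = {n.node_id: n for n in nodes}

for node in trace_back("flag-001", graph):
    print(f"[{node.layer}] {node.description}")
```

A branch that ends anywhere other than an instrument node, or that refers to a node missing from the graph, indicates a gap in the provenance chain.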

Contextualization as explainability

The I→C→A sequence is an explainability architecture. When every input to a decision has scientific context (C1), reconciled identity (C2), complete lineage (C3), and machine-readable structure (C4), the decision’s inputs are self-explaining. An AI recommendation based on “426 purity measurements from 8 batches, acquired via SEC-HPLC method MET-042 at three sites, governing study STB-2024-001” is explainable not because of a post-hoc SHAP analysis, but because the contextualized inputs carry the explanation with them.
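To illustrate how contextualized inputs carry their own explanation, the sketch below derives a summary sentence directly from the context fields attached to each measurement. The `Measurement` fields, the `describe_inputs` function, and the batch and site identifiers are assumptions made for illustration. The method and study identifiers are taken from the example above.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Measurement:
    value: float
    attribute: str    # what was measured, e.g. "purity"
    batch_id: str
    technique: str    # e.g. "SEC-HPLC"
    method_id: str
    site: str
    study_id: str


def describe_inputs(measurements: List[Measurement]) -> str:
    """Build the explanation from recorded context, not from the model."""
    def uniq(attr: str) -> List[str]:
        return sorted({getattr(m, attr) for m in measurements})

    attributes = "/".join(uniq("attribute"))
    techniques = "/".join(uniq("technique"))
    methods = "/".join(uniq("method_id"))
    studies = "/".join(uniq("study_id"))
    n_batches = len(uniq("batch_id"))
    n_sites = len(uniq("site"))
    return (f"{len(measurements)} {attributes} measurements from {n_batches} batches, "
            f"acquired via {techniques} method {methods} at {n_sites} sites, "
            f"governing study {studies}")


# Illustrative records; values, batch ids, and site names are made up.
inputs = [
    Measurement(98.7, "purity", "B1", "SEC-HPLC", "MET-042", "site-1", "STB-2024-001"),
    Measurement(98.5, "purity", "B1", "SEC-HPLC", "MET-042", "site-2", "STB-2024-001"),
    Measurement(98.9, "purity", "B2", "SEC-HPLC", "MET-042", "site-2", "STB-2024-001"),
]
print(describe_inputs(inputs))
```

The summary contains nothing that is not already attached to the inputs. If a context field were missing, the description could not be produced, which exposes the gap instead of hiding it behind an approximation.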

Relationship to other principles

D2 is the capstone of the ICAD lineage chain that begins at I2 (provenance at ingestion), continues through C3 (data lineage), extends through A3 (model traceability), and culminates here at the decision layer. D2 also enables D4 (feedback loop) — you can only improve the decision system by analyzing its traceable historical decisions and their outcomes.

D2: Every decision is traceable to its source data and decision logic
