A4: Insights are reproducible

Insights are reproducible — any analyst, any site, same governed dataset produces equivalent results within governed tolerances.

What does this mean?

A4 defines the validation criterion for the entire Analyze phase. Reproducibility operates at three tiers:

  • Data reproducibility is exact: the same query on the same versioned dataset returns the same data. This is an infrastructure requirement that must be absolute.
  • Analytical reproducibility is validated: the same statistical or ML method applied to the same data produces results within validated acceptance criteria, acknowledging that floating-point arithmetic, non-deterministic algorithms, and hardware differences introduce bounded variation.
  • Scientific reproducibility is practical: two analysts reach the same scientific conclusion, meaning the same OOS root cause, the same stability trend direction, and the same shelf-life prediction within the prediction interval.

Regulatory guidance, including the FDA/EMA Guiding Principles of Good AI Practice in Drug Development, adopts a risk-based approach to computational reproducibility, not a requirement for bit-exact identity.
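
The difference between exact and validated reproducibility can be illustrated with a minimal Python sketch. The helper name `within_acceptance` and the tolerance value are illustrative assumptions; real acceptance criteria would come from validation of the analytical method.

```python
import math

# Hypothetical helper: the name and the default tolerance are illustrative
# assumptions, not values defined by A4.
def within_acceptance(result_a: float, result_b: float, rel_tol: float = 1e-9) -> bool:
    """Return True when two results agree within a relative tolerance."""
    return math.isclose(result_a, result_b, rel_tol=rel_tol)

# The same three values added in two different orders, as can happen when
# hardware or libraries schedule a reduction differently.
site_a = (0.1 + 0.2) + 0.3
site_b = 0.1 + (0.2 + 0.3)

bit_exact = site_a == site_b                    # False: the results differ in the last bit
equivalent = within_acceptance(site_a, site_b)  # True: well inside the tolerance
```

The two sums are not bit-identical, yet they are equivalent for any realistic acceptance criterion, which is the situation the analytical tier is meant to cover.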

Reproducibility failures in scientific data analysis almost never originate in the statistical method. They originate in data selection, context interpretation, and environmental differences. Two analysts querying "impurity data for Compound X at 25°C/60% RH" get different results because they use different data filters, include different method versions, or operate on databases with different replication states. A4 eliminates these sources of irreproducibility by requiring that analysis operates on a single, governed, versioned dataset.

The reproducibility requirement

A4 requires:

  • Dataset versioning: the governed dataset used for any analysis is frozen and version-controlled. If the underlying data changes (new measurements added, corrections applied), this is a new version with its own lineage.
  • Query determinism: the same query parameters, applied to the same dataset version, return the same result set — regardless of which site or instance executes the query.
  • Computation reproducibility: non-determinism in computation is bounded and documented, not necessarily eliminated. Sources of variation — random seeds, hardware architecture, library versions, GPU scheduling — are recorded as part of analysis provenance (A3). Results are validated against acceptance criteria appropriate to the analytical method, not required to be bit-exact.
  • Environment specification: the computational environment (software version, library versions, hardware architecture where relevant) is recorded as part of the analysis provenance (A3).
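
As a rough sketch of how these requirements might be recorded in practice, the following Python example fingerprints a dataset version and assembles a provenance record. The function names, the record schema, and the sample rows are all hypothetical placeholders chosen for illustration; A4 does not prescribe a specific implementation.

```python
import hashlib
import json
import platform

def dataset_fingerprint(rows):
    """Hash rows in a canonical order so the digest does not depend on storage order."""
    canonical = sorted(json.dumps(row, sort_keys=True) for row in rows)
    digest = hashlib.sha256()
    for line in canonical:
        digest.update(line.encode("utf-8"))
        digest.update(b"\n")
    return digest.hexdigest()

def capture_environment():
    """Record parts of the runtime environment that can affect results."""
    return {
        "python_version": platform.python_version(),
        "implementation": platform.python_implementation(),
        "machine": platform.machine(),
        "system": platform.system(),
    }

def build_provenance(dataset_version, rows, query_params, random_seed):
    """Assemble a provenance record for one analysis run (hypothetical schema)."""
    return {
        "dataset_version": dataset_version,
        "dataset_sha256": dataset_fingerprint(rows),
        "query_params": dict(sorted(query_params.items())),
        "random_seed": random_seed,
        "environment": capture_environment(),
    }

# Placeholder rows, not real measurements.
rows_site_a = [
    {"batch": "B001", "result": 98.2},
    {"batch": "B002", "result": 97.5},
]
rows_site_b = list(reversed(rows_site_a))  # same content, different storage order

record = build_provenance("v1", rows_site_a, {"product": "X", "method": "DIS-001"}, 42)
same_content = dataset_fingerprint(rows_site_a) == dataset_fingerprint(rows_site_b)
```

Because the fingerprint is computed over a canonical ordering, two sites holding the same dataset version obtain the same digest, while any added or corrected measurement changes it and therefore signals a new version.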

Operational test

Run an analytical computation at two different sites, on the same governed dataset version, using the same analysis parameters, and compare the results. If they differ by more than the acceptance criteria for the method, identify the source of divergence. Common sources are database replication lag (data was not yet synchronized), query filter differences (local site variables), software version mismatch, and floating-point precision differences. A4 is satisfied when the governed infrastructure eliminates the data and query sources of divergence and holds any remaining computational variation within documented acceptance criteria.
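
The comparison step of this test could be sketched as follows. The run-record fields, the function name `diagnose_divergence`, the tolerance, and the sample values are assumptions made for illustration, not part of the principle.

```python
import math

def diagnose_divergence(run_a, run_b, rel_tol):
    """Return the reasons two runs are not equivalent (empty list if they are)."""
    sources = []
    if run_a["dataset_version"] != run_b["dataset_version"]:
        sources.append("dataset version mismatch")
    if run_a["query_params"] != run_b["query_params"]:
        sources.append("query filter difference")
    if run_a["software_version"] != run_b["software_version"]:
        sources.append("software version mismatch")
    if not math.isclose(run_a["result"], run_b["result"], rel_tol=rel_tol):
        sources.append("result outside acceptance tolerance")
    return sources

# Placeholder run records, not real results.
run_site_1 = {
    "dataset_version": "v1",
    "query_params": {"method": "DIS-001"},
    "software_version": "1.0",
    "result": 85.0,
}
# Same inputs, with a tiny floating-point difference in the result.
run_site_2 = dict(run_site_1, result=85.0 + 1e-12)
# A site that queried a stale replica of the data.
run_site_3 = dict(run_site_1, dataset_version="v0", result=84.2)

ok = diagnose_divergence(run_site_1, run_site_2, rel_tol=1e-9)
stale = diagnose_divergence(run_site_1, run_site_3, rel_tol=1e-9)
```

In this sketch, `ok` is an empty list because the only difference is bounded floating-point variation, while `stale` names both the dataset version mismatch and the out-of-tolerance result that follows from it.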

Reproducibility extends across tool boundaries: another analyst must be able to recreate the analysis from governed inputs and a documented analytical method — not from an ad-hoc data extraction and an undocumented tool workspace.

Reproducibility versus repeatability

In analytical chemistry, repeatability refers to agreement between measurements made under the same conditions (same analyst, same instrument, same day). Reproducibility refers to agreement under different conditions (different analysts, different instruments, different sites). A4 applies the reproducibility concept to data analysis: different analysts and different sites, working from the same governed data, obtain equivalent results within governed tolerances.

This is a higher bar than most organizations currently meet. In practice, analytical results vary across sites because of data access differences, not analytical differences. Two sites running the same statistical analysis on "the same data" often operate on subtly different data sets — because their LIMS instances replicate on different schedules, their data warehouses are refreshed at different intervals, or their local data exports use different filters.

Example — cross-site OOS investigation

An out-of-specification (OOS) investigation requires root-cause analysis using dissolution data from three manufacturing sites. Under A4, the investigation team queries the governed dataset for "all dissolution results for Product X, Method DIS-001, Batches B001–B050, 25°C/60% RH condition." This query returns an identical result set whether executed by an analyst in New Jersey, a QC manager in Cork, or a regulatory affairs specialist in Tokyo, because they are all querying the same versioned dataset, with the same reconciled master data (C2), the same method identity, and the same context filters. The root-cause analysis is reproducible because the input data is identical.
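
A query of this kind might be expressed as an immutable specification applied to the versioned dataset, as in the minimal sketch below. The `QuerySpec` class, the field names, the in-memory rows, and the second method identifier DIS-002 are hypothetical and serve only to illustrate the idea; a governed platform would supply its own query interface.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class QuerySpec:
    """Immutable query parameters shared by every site (hypothetical schema)."""
    product: str
    method: str
    condition: str
    batches: frozenset

def run_query(rows, spec):
    """Filter rows by the spec and return them in a deterministic order."""
    selected = [
        row for row in rows
        if row["product"] == spec.product
        and row["method"] == spec.method
        and row["condition"] == spec.condition
        and row["batch"] in spec.batches
    ]
    return sorted(selected, key=lambda row: (row["batch"], row["sample"]))

spec = QuerySpec(
    product="X",
    method="DIS-001",
    condition="25°C/60% RH",
    batches=frozenset(f"B{n:03d}" for n in range(1, 51)),  # B001 to B050
)

def make_row(method, batch, result):
    # Placeholder rows, not real dissolution results.
    return {"product": "X", "method": method, "condition": "25°C/60% RH",
            "batch": batch, "sample": 1, "result": result}

dataset_v1 = [
    make_row("DIS-001", "B002", 88.0),
    make_row("DIS-001", "B001", 91.0),
    make_row("DIS-002", "B001", 90.0),  # different method: excluded
    make_row("DIS-001", "B051", 87.0),  # batch outside the range: excluded
]

# Two sites may hold the rows in a different physical order.
result_site_1 = run_query(dataset_v1, spec)
result_site_2 = run_query(list(reversed(dataset_v1)), spec)
identical = result_site_1 == result_site_2
```

Freezing the specification and sorting the output means the result set depends only on the dataset version and the query parameters, not on where or in what storage order the query is executed.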

Relationship to other principles

A4 is the quality assurance principle for the Analyze phase. It validates that A1 (context prerequisite), A2 (governed comparison), and A3 (traceable models) are working correctly — because reproducibility is the observable outcome when all three are in place. A4 also creates the trust foundation for Decide (D): AI-enabled decisions can only be trusted when the analytical inputs to those decisions are reproducible.