I1: Every scientific output is captured at the point of creation

Every scientific output is captured at the point of creation, not exported manually, not aggregated after the fact.

What does this mean?

In most pharmaceutical R&D organizations, instrument data leaves its source system through manual export — a scientist saves a chromatogram as a CSV, copies a mass spectrum to a network drive, or emails a plate reader output to a colleague. Each of these actions introduces a gap in provenance. The timestamp of the export is not the timestamp of the measurement. The file on the network drive has no link to the run parameters, operator identity, or instrument calibration state at the time of acquisition.

Principle I1 requires that data capture occurs at the point of creation. The integration pipeline connects directly to the instrument's data output, whether that is a file system watch on a chromatography data system (CDS) results folder, an OPC-UA or SiLA 2 service interface, a vendor API, or a direct database connection. The data enters the governed pipeline with its original creation timestamp, in the format the instrument produced it, before any human has the opportunity to modify, rename, or relocate it.

Point-of-creation capture extends beyond instruments and applies wherever scientific records originate. When an electronic lab notebook (ELN) entry is captured at the moment of creation, with a governed timestamp, operator identity, and full provenance, it establishes the evidentiary chain required for intellectual property protection: priority dates, proof of inventorship, and documented experimental rationale. Manual re-entry or after-the-fact aggregation breaks this chain.

Why this matters in regulated environments

21 CFR Part 11 and EU GMP Annex 11 govern electronic records in regulated environments, and regulators' data integrity guidance summarizes the expected attributes of those records as ALCOA: attributable, legible, contemporaneous, original, and accurate. Point-of-creation capture addresses "attributable," "contemporaneous," and "original" by design: the system records data at the moment it is created, with the operator identity and source system provenance that establish who created the record and under what conditions.

Manual export introduces a temporal gap. If a scientist exports results two hours after a run, and the Laboratory Information Management System (LIMS) records the export timestamp as the data timestamp, the regulatory record is inaccurate. If a file is renamed during transfer, the original filename, which often encodes run parameters, is lost. These are not theoretical problems; they are a recurring cause of the data integrity findings cited in FDA warning letters.

The technical requirement

Point-of-creation capture requires an integration architecture that can:

  • Monitor instrument output locations (file shares, databases, service endpoints) continuously, not on a batch schedule
  • Detect new data within seconds of creation, but initiate ingestion only once the output is complete. Some instruments write to a file for minutes or hours during data acquisition, so file existence alone is not sufficient; the pipeline must use completion signals, file-lock monitoring, or instrument-specific readiness indicators
  • Preserve the original file in its native format alongside any converted or derived output
  • Record the instrument identifier, operator (where available), acquisition timestamp, and source system version as first-class metadata
  • Operate without requiring the scientist to perform any export, save, or transfer action
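
The requirements above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not a reference implementation: it assumes file-based instrument output, uses size and modification-time stability as a generic stand-in for a vendor completion signal, and treats the file's modification time as the acquisition timestamp. The names `wait_until_stable`, `capture`, and `CaptureRecord`, and all field names, are hypothetical.

```python
import hashlib
import time
from dataclasses import dataclass
from datetime import datetime, timezone
from pathlib import Path
from typing import Optional


@dataclass(frozen=True)
class CaptureRecord:
    """Metadata stored alongside the untouched native file (hypothetical schema)."""
    path: str
    sha256: str
    size_bytes: int
    acquired_at: str            # ISO 8601, UTC; taken from file mtime in this sketch
    instrument_id: str
    operator: Optional[str]     # None when the source system does not expose it


def wait_until_stable(path: Path, quiet_seconds: float = 5.0,
                      poll: float = 1.0, timeout: float = 3600.0) -> bool:
    """Return True once size and mtime have been unchanged for quiet_seconds.

    Generic fallback only: a vendor completion signal or lock check is
    preferable where the instrument provides one. Returns False on timeout.
    """
    deadline = time.monotonic() + timeout
    last = None
    stable_since = time.monotonic()
    while time.monotonic() < deadline:
        st = path.stat()
        current = (st.st_size, st.st_mtime_ns)
        now = time.monotonic()
        if current != last:
            # File is new to us or still changing: restart the quiet period.
            last, stable_since = current, now
        elif now - stable_since >= quiet_seconds:
            return True
        time.sleep(poll)
    return False


def capture(path: Path, instrument_id: str,
            operator: Optional[str] = None) -> CaptureRecord:
    """Fingerprint the native file without modifying it and record its metadata."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            digest.update(chunk)
    st = path.stat()
    acquired = datetime.fromtimestamp(st.st_mtime, tz=timezone.utc).isoformat()
    return CaptureRecord(
        path=str(path),
        sha256=digest.hexdigest(),
        size_bytes=st.st_size,
        acquired_at=acquired,
        instrument_id=instrument_id,
        operator=operator,
    )
```

A watcher would call `wait_until_stable` on each newly detected file and pass it to `capture` only when that returns True. Where the instrument or its audit trail exposes the acquisition time and operator directly, those values would be preferable to the file-system fallbacks used here.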

Operational test

If a scientist must perform any manual step between instrument acquisition and data appearing in the governed pipeline, I1 is not satisfied. The integration is not yet at point-of-creation.

What this does not mean

I1 does not require real-time streaming of raw instrument signals. It requires capture of the completed output — the result file, the processed spectrum, the measurement record — at the point the instrument produces it. The distinction matters: real-time signal streaming serves process analytical technology (PAT) use cases; I1 serves data governance.

I1 also does not prohibit scientists from continuing to export data manually for their own analysis. It requires that the governed pipeline does not depend on manual export as its ingestion mechanism.

Example — Chromatography Data System

A chromatography data system (CDS) writes result files to a configured output directory upon sequence completion. An integration pipeline monitors that directory, detects the new result set within 30 seconds, ingests the native .raw files with their original timestamps, and registers the acquisition in the governed index with links to the instrument serial number, column lot, mobile phase composition, and operator ID from the CDS audit trail.
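
As a rough sketch of what the registered acquisition in this example might look like, the Python below models one index entry and checks the 30-second detection window. The class name `AcquisitionEntry`, its field names, and every value in the example are illustrative placeholders, not a real CDS or index schema.

```python
import json
from dataclasses import asdict, dataclass
from datetime import datetime, timedelta, timezone
from typing import Tuple


@dataclass(frozen=True)
class AcquisitionEntry:
    """One governed-index entry for a CDS result set (hypothetical schema)."""
    native_files: Tuple[str, ...]   # paths to the preserved native files
    acquired_at: datetime           # from the CDS, not from an export
    detected_at: datetime           # when the pipeline saw the result set
    instrument_serial: str
    column_lot: str
    mobile_phase: str
    operator_id: str

    def detection_latency(self) -> timedelta:
        """Time between acquisition and detection by the pipeline."""
        return self.detected_at - self.acquired_at

    def to_json(self) -> str:
        """Serialize the entry with ISO 8601 timestamps for the index."""
        payload = asdict(self)
        payload["acquired_at"] = self.acquired_at.isoformat()
        payload["detected_at"] = self.detected_at.isoformat()
        payload["detection_latency_s"] = self.detection_latency().total_seconds()
        return json.dumps(payload, sort_keys=True)


# Placeholder values for illustration only.
acquired = datetime(2024, 3, 1, 14, 0, 0, tzinfo=timezone.utc)
entry = AcquisitionEntry(
    native_files=("seq_0042/inj_001.raw",),
    acquired_at=acquired,
    detected_at=acquired + timedelta(seconds=12),
    instrument_serial="SERIAL-PLACEHOLDER",
    column_lot="LOT-PLACEHOLDER",
    mobile_phase="60:40 water:acetonitrile",
    operator_id="operator-placeholder",
)

# The example's detection window: new result sets are seen within 30 seconds.
assert entry.detection_latency() <= timedelta(seconds=30)
```

Keeping `acquired_at` and `detected_at` as separate fields means the measurement time is never overwritten by the ingestion time, and the latency between them can be monitored.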

Antipattern

A scientist exports chromatography results to Excel every Friday afternoon and uploads the spreadsheet to SharePoint, where a scheduled job scrapes the folder overnight. The data enters the governed system up to a week after generation, with no link to run parameters, no original file format, and a creation timestamp that reflects the export, not the measurement.

Relationship to other principles

I1 is the first step in the ICAD sequence. Without point-of-creation capture, I2 (native format preservation) has no original format to preserve — only an export artifact. I3 (industrialized integration) compounds the value of I1 by making each new instrument connection reusable. And every downstream principle — Contextualize, Analyze, Decide — depends on the provenance chain that I1 initiates.