I4: New data sources are onboarded in days, not months

New data sources are onboarded in days, not months — the integration pipeline is a factory, not a project.

What does this mean?

I4 is the operational consequence of I3. When integration infrastructure is industrialized — when converter patterns, validation modules, workflow orchestration, regression suites, and metadata schemas accumulate across builds — the time to connect a new instrument collapses from months to days. AI-assisted converter generation handles the format-specific work; the engineer reviews and validates. This is not an aspiration; it is a measurable outcome of the factory model.

In a project-based integration model, onboarding a new instrument involves requirements gathering (2–4 weeks), development (4–8 weeks), validation (2–4 weeks), and deployment (1–2 weeks). Total: 3–6 months elapsed (the phases alone sum to 9–18 weeks). In a factory model, AI drafts the converter, the engineer validates, and the factory's accumulated infrastructure handles the rest. Total: days to low single-digit weeks.

Why speed matters

Speed is not a convenience metric. It is an operational requirement driven by two realities of pharmaceutical R&D:

  1. Instrument deployment outpaces integration capacity. Capital equipment procurement cycles are 8–16 weeks. If integration takes 3–6 months beyond procurement, new instruments operate outside the governed data pipeline for their first months of use — producing data that must be retroactively integrated or manually managed.
  2. Portfolio-scale operations demand breadth. A global pharma R&D organization operates thousands of instruments across dozens of sites. At 3–6 months per integration, full coverage is a multi-year program. At days per integration, it becomes achievable within a single fiscal planning cycle.

Operational test

Measure the elapsed time from "new instrument arrives at the site" to "first data point appears in the governed pipeline." If this exceeds two weeks for an instrument type that already has a technique-family integration, I4 is not satisfied.
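
The test above can be sketched as a small check. This is a minimal illustration, not a prescribed implementation: the `OnboardingRecord` type and its field names are hypothetical, and "two weeks" is read here as 14 calendar days, which is an assumption.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional

# "Two weeks" from the operational test, read as 14 calendar days (assumption).
I4_THRESHOLD = timedelta(weeks=2)


@dataclass(frozen=True)
class OnboardingRecord:
    """Hypothetical record of one instrument's onboarding milestones."""
    instrument_id: str
    arrived_at: datetime              # new instrument arrives at the site
    first_governed_data_at: datetime  # first data point appears in the governed pipeline
    technique_family_exists: bool     # a technique-family integration already exists


def satisfies_i4(record: OnboardingRecord) -> Optional[bool]:
    """Return True or False for the I4 test, or None when the test does not apply."""
    if not record.technique_family_exists:
        # Novel technique family: the two-week test is not the applicable measure.
        return None
    elapsed = record.first_governed_data_at - record.arrived_at
    return elapsed <= I4_THRESHOLD


# Usage: a plate reader with an existing integration, onboarded in 10 days.
fast = OnboardingRecord("reader-01", datetime(2024, 3, 4), datetime(2024, 3, 14), True)
slow = OnboardingRecord("reader-02", datetime(2024, 3, 4), datetime(2024, 3, 25), True)
novel = OnboardingRecord("new-tech-01", datetime(2024, 3, 4), datetime(2024, 4, 30), False)
results = [satisfies_i4(r) for r in (fast, slow, novel)]
```

Returning `None` for a novel technique family keeps that case out of the pass/fail count rather than recording it as a failure.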

The configuration boundary

Factory-speed onboarding depends on a distinction between engineering and configuration. Because AI-assisted factories can handle most native vendor formats, the boundary between the two shifts toward configuration:

  • Engineering is required when the factory encounters a fundamentally new technique family, communication protocol, or data architecture it has never handled. This is I3 territory — building new reusable components. Engineering timelines of weeks are acceptable for genuinely novel technique families.
  • Configuration is required when the factory already has components for the technique family and AI-assisted generation can draft the converter from sample output. An engineer validates the mapping, the regression suite confirms correctness, and the instrument is onboarded. Configuration timelines of days are the I4 requirement.

Because AI-assisted generation handles most format variations within a technique family, the configuration boundary moves earlier in a factory's lifecycle than it would with purely manual engineering. After the first few builds within a technique, subsequent instruments from different vendors are typically configuration tasks — not engineering projects. I4 applies to the configuration case; the engineering case is governed by I3's compounding requirement.
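
The boundary described above can be expressed as a simple decision rule. The sketch below is illustrative only: `FactoryCatalog`, `InstrumentProfile`, and `classify` are hypothetical names, and treating a missing sample output as an engineering case is an assumption rather than something the principle states.

```python
from dataclasses import dataclass
from enum import Enum


class OnboardingPath(Enum):
    ENGINEERING = "engineering"      # I3 territory, timelines of weeks
    CONFIGURATION = "configuration"  # I4 territory, timelines of days


@dataclass(frozen=True)
class FactoryCatalog:
    """Hypothetical inventory of what the factory has already handled."""
    technique_families: frozenset
    protocols: frozenset
    data_architectures: frozenset


@dataclass(frozen=True)
class InstrumentProfile:
    """Hypothetical description of an instrument awaiting onboarding."""
    technique_family: str
    protocol: str
    data_architecture: str
    has_sample_output: bool  # AI-assisted drafting works from sample output


def classify(instrument: InstrumentProfile, catalog: FactoryCatalog) -> OnboardingPath:
    """Apply the engineering/configuration boundary to one instrument."""
    novel = (
        instrument.technique_family not in catalog.technique_families
        or instrument.protocol not in catalog.protocols
        or instrument.data_architecture not in catalog.data_architectures
    )
    # Assumption: without sample output the converter cannot be drafted,
    # so the case is routed to engineering.
    if novel or not instrument.has_sample_output:
        return OnboardingPath.ENGINEERING
    return OnboardingPath.CONFIGURATION


# Usage with illustrative catalog entries.
catalog = FactoryCatalog(
    technique_families=frozenset({"plate_reader"}),
    protocols=frozenset({"file_drop"}),
    data_architectures=frozenset({"single_file"}),
)
known = InstrumentProfile("plate_reader", "file_drop", "single_file", True)
unknown = InstrumentProfile("mass_spec", "file_drop", "single_file", True)
paths = [classify(known, catalog), classify(unknown, catalog)]
```

As the catalog grows across builds, more instruments fall on the configuration side of this rule, which is the shift the paragraph above describes.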

Example — multi-site instrument fleet

A pharmaceutical company acquires 8 new plate readers from two vendors for four R&D sites. The integration factory already has: a plate reader technique converter, validation rules for dose-response curve completeness and well layout checks, and a regression test suite from prior plate reader builds. AI-assisted generation drafts converters for each vendor's native format from sample output; the engineer validates the mappings. Onboarding all 8 instruments requires configuring site-specific output directories, operator groups, and laboratory identifiers. All 8 are producing governed data within 10 business days of installation.
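
The site-specific configuration step in this example might look like the following sketch. All identifiers, paths, and naming patterns are hypothetical, and the even split of one reader per vendor per site (2 vendors × 4 sites = 8 instruments) is an assumption; the example does not say how the readers are distributed.

```python
from dataclasses import dataclass
from itertools import product
from typing import List, Sequence


@dataclass(frozen=True)
class InstrumentConfig:
    """Hypothetical per-instrument settings layered on the shared converter."""
    instrument_id: str
    vendor: str
    site: str
    output_directory: str
    operator_group: str
    laboratory_id: str


# Placeholder names; the real vendors and sites are not given in the example.
VENDORS = ["vendor_a", "vendor_b"]
SITES = ["site_1", "site_2", "site_3", "site_4"]


def build_configs(vendors: Sequence[str], sites: Sequence[str]) -> List[InstrumentConfig]:
    """Generate one configuration per (site, vendor) pair."""
    configs = []
    for site, vendor in product(sites, vendors):
        configs.append(
            InstrumentConfig(
                instrument_id=f"{site}-{vendor}-plate-reader",
                vendor=vendor,
                site=site,
                output_directory=f"/data/{site}/plate_reader/{vendor}",
                operator_group=f"{site}-plate-reader-operators",
                laboratory_id=f"LAB-{site.upper()}",
            )
        )
    return configs


configs = build_configs(VENDORS, SITES)
```

Only these per-instrument values differ between the 8 readers; the technique converter, validation rules, and regression suite are shared, which is why the work is configuration rather than engineering.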

Relationship to other principles

I4 is the culmination of the Integrate sequence. I1 (capture at source), I2 (preserve native format), and I3 (industrialize the build) create the conditions that make I4 possible. Without these foundations, fast onboarding requires cutting corners: skipping provenance, losing native formats, or building throwaway integrations. Achieving I4 at both speed and quality is possible only because I1–I3 are already in place.

I4 also creates the precondition for Contextualize (C). The faster instruments are integrated, the sooner their data enters the contextualization pipeline — and the sooner the organization's scientific context graph grows richer.