A real-time, AI-ready architecture converting fragmented health data into a continuously learning, sovereign platform for clinical, operational, and research intelligence.
Google Cloud Reference Architecture
Clinical outcomes, clinician corrections, and user interactions feed back through the pipeline — retraining models, updating knowledge graphs, and improving AI accuracy over time. Every data point makes the system smarter.
Electronic Health Record systems communicating via HL7 Version 2 messaging — the dominant real-time clinical data source feeding the unified pipeline.
Inpatient & Ambulatory EHR
Bridges, Care Everywhere
Millennium platform
Real-time feeds via HCI
Expanse / 6.x
NPR & DR interfaces
TouchWorks, Sunrise
Ambulatory & acute
Cloud-native ambulatory
athenaClinicals APIs
VA VistA, DoD Genesis, regional & specialty EHRs
| HL7v2 Segment | Trigger Event | FHIR R4 Resource | Key Fields Mapped |
|---|---|---|---|
| PID | ADT^A01/A04/A08 | Patient | MRN, name, DOB, gender, address, telecom, identifiers |
| PV1 + PV2 | ADT^A01/A02/A03 | Encounter | Class, location, period, participant (attending), status |
| ORC + OBR | ORM^O01 / OML^O21 | ServiceRequest | Code, requester, status, priority, specimen requirements |
| OBX | ORU^R01 | Observation | LOINC code, value, units, reference range, interpretation |
| OBR (Radiology) | ORU^R01 | DiagnosticReport | Study code, results, conclusion, imaging references |
| RXE / RXA | RDE^O11 / RAS^O17 | MedicationRequest / MedicationAdministration | Drug (RxNorm), dose, route, frequency, prescriber |
| DG1 | ADT^A01/A03 | Condition | ICD-10 code, category (encounter/problem-list), onset |
| AL1 | ADT^A01/A08 | AllergyIntolerance | Substance, reaction, severity, clinical status |
| SCH + AIS | SIU^S12 | Appointment | DateTime, participant, location, status, serviceType |
| TXA + OBX | MDM^T02 | DocumentReference | Type, author, date, content (base64 / URL), status |
| IN1 + IN2 | ADT^A01 | Coverage | Payor, subscriber, group, period, type |
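
As an illustration of the first row of the mapping table, the sketch below converts a single PID segment into a minimal FHIR R4 Patient. It assumes default HL7v2 delimiters and maps only PID-3, PID-5, PID-7, and PID-8. The function name `pid_to_patient` and the sample message are hypothetical, and a production mapping (for example in Healthcare Data Engine) would cover far more fields, repetitions, and escape sequences.

```python
from datetime import datetime

# HL7v2 administrative sex codes (PID-8) to FHIR administrative gender.
GENDER_MAP = {"M": "male", "F": "female", "O": "other", "U": "unknown"}


def pid_to_patient(pid_segment: str, field_sep: str = "|", comp_sep: str = "^") -> dict:
    """Convert one PID segment into a minimal FHIR R4 Patient dict (illustrative only)."""
    fields = pid_segment.split(field_sep)
    if fields[0] != "PID":
        raise ValueError("expected a PID segment")

    def field(n: int) -> str:
        # PID-n sits at index n because index 0 holds the segment name.
        return fields[n] if n < len(fields) else ""

    mrn = field(3).split(comp_sep)[0]   # PID-3.1: identifier value
    name = field(5).split(comp_sep)     # PID-5: family^given^...
    dob = field(7)[:8]                  # PID-7: YYYYMMDD, optionally followed by a time

    patient = {
        "resourceType": "Patient",
        "identifier": [{"type": {"text": "MRN"}, "value": mrn}],
        "name": [{"family": name[0], "given": name[1:2]}],
        "gender": GENDER_MAP.get(field(8), "unknown"),
    }
    if dob:
        patient["birthDate"] = datetime.strptime(dob, "%Y%m%d").date().isoformat()
    return patient


# Hypothetical sample segment, not taken from any real system.
example = "PID|1||123456^^^HOSP^MR||DOE^JANE||19800215|F"
patient = pid_to_patient(example)
```

Keeping the segment parser free of any cloud dependency makes it easy to unit-test each per-source mapping before it is wired into the streaming pipeline.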
HL7v2 messages streamed as they occur for near-zero-latency ingestion. Supports ADT, ORU, ORM events with sub-second delivery.
EHRs with FHIR R4 APIs (Epic USCDI, Cerner Ignite) push resources directly, bypassing HL7v2 translation.
Rhapsody, Mirth Connect, or InterSystems HealthShare handles routing, filtering, and protocol translation before cloud ingestion.
Initial data migration and periodic bulk refreshes using FHIR $export or flat-file extracts for historical backfill.
HL7v2 versions 2.3 through 2.8 coexist. Z-segments (custom extensions) vary per vendor and site, requiring per-source mapping.
MRN fragmentation across facilities. Requires MPI (Master Patient Index) or EMPI resolution before deduplication in the lakehouse.
Local codes vs. standard terminologies (LOINC, SNOMED, RxNorm). Healthcare Data Engine handles mapping but requires curation.
Out-of-order delivery and duplicate messages. Pipeline must handle idempotency, sequencing, and late-arriving corrections (A08).
Every message contains PHI. Must enforce encryption in transit (TLS/MLLP-S), at rest (CMEK), and de-identification for research.
EHR downtimes require message queuing and replay. Dead-letter queues and reconciliation jobs let the pipeline recover missed messages instead of silently losing them.
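
The idempotency and sequencing requirement above can be sketched in a few lines. This is a minimal in-memory illustration, assuming the message control ID (MSH-10) is unique per source and that MSH-7 timestamps are fixed-width strings that sort chronologically. The class names are hypothetical, and a real pipeline would keep this state in a durable store rather than in process memory.

```python
from dataclasses import dataclass, field


@dataclass
class PatientState:
    """Latest applied state for one patient (hypothetical structure)."""
    last_event_time: str = ""                     # MSH-7 style timestamp, YYYYMMDDHHMMSS
    demographics: dict = field(default_factory=dict)


class IdempotentApplier:
    """Applies updates at most once per control ID and never lets older events win."""

    def __init__(self) -> None:
        self.seen_ids: set[str] = set()           # control IDs (MSH-10) already processed
        self.state: dict[str, PatientState] = {}

    def apply(self, control_id: str, patient_id: str, event_time: str, demographics: dict) -> str:
        if control_id in self.seen_ids:
            return "duplicate"                    # redelivery: safe to ignore
        self.seen_ids.add(control_id)
        current = self.state.setdefault(patient_id, PatientState())
        if event_time < current.last_event_time:
            return "stale"                        # arrived out of order; newer data already applied
        current.last_event_time = event_time
        current.demographics.update(demographics)
        return "applied"


applier = IdempotentApplier()
r1 = applier.apply("MSG001", "P1", "20240101120000", {"city": "Austin"})
r2 = applier.apply("MSG001", "P1", "20240101120000", {"city": "Austin"})   # duplicate delivery
r3 = applier.apply("MSG000", "P1", "20240101110000", {"city": "Dallas"})   # late, older event
r4 = applier.apply("MSG002", "P1", "20240101130000", {"city": "Houston"})  # newer A08-style update
```

Stale messages are still recorded as seen, so a later redelivery of the same message is classified as a duplicate rather than re-evaluated.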
Medical imaging systems communicating via DICOM protocol — the primary source for radiology, cardiology, and pathology pixel data feeding the unified pipeline.
Centricity / Edison
Enterprise imaging archive
IntelliSpace PACS
Multi-modality support
syngo.via / teamplay
AI-ready platform
Hyland / Fuji / IBM
Long-term image storage
CT, MRI, US, XR, PET
Mammo, Path slides
Cardiology CVIT, Derm
Ophthalmology, Dental
| Tag | Name | Level | Notes |
|---|---|---|---|
| (0010,0020) | PatientID | Patient | MRN; critical for cross-system matching |
| (0020,000D) | StudyInstanceUID | Study | Globally unique study identifier |
| (0020,000E) | SeriesInstanceUID | Series | Groups images by acquisition sequence |
| (0008,0018) | SOPInstanceUID | Instance | Unique per image/object |
| (0008,0060) | Modality | Series | CT, MR, US, CR, DX, PT, MG, SM |
| (0008,0020) | StudyDate | Study | Date of imaging examination |
| (0008,0090) | ReferringPhysicianName | Study | Ordering clinician name |
| (0018,0015) | BodyPartExamined | Series | CHEST, HEAD, ABDOMEN, etc. |
| (7FE0,0010) | PixelData | Instance | Bulk pixel data; largest element |
| (0002,0010) | TransferSyntaxUID | Meta | Encoding: Explicit VR, JPEG2000, etc. |
| DICOM Source | FHIR R4 Resource | Key Fields Mapped |
|---|---|---|
| Study | ImagingStudy | StudyInstanceUID, modality list, numberOfSeries/Instances, started, endpoint |
| Structured Report (SR) | DiagnosticReport / Observation | Coded findings, measurements, conclusion, performer |
| Patient Tags | Patient | PatientID, name, DOB, gender mapped to FHIR Patient resource |
| Order (AccessionNumber) | ServiceRequest | Accession, requested procedure, referring physician, priority |
| Series | ImagingStudy.series | Modality, body site, laterality, number of instances, UID |
| Instance | ImagingStudy.series.instance | SOPClass, instance number, WADO-RS endpoint for retrieval |
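
The Study, Series, and Instance rows above can be illustrated by grouping instance-level DICOM attributes into one FHIR R4 ImagingStudy. The sketch below assumes the attributes have already been extracted into plain dictionaries keyed by DICOM keyword. The function name and the sample UIDs are hypothetical, and the required `subject` reference and the WADO-RS endpoint are omitted for brevity.

```python
from collections import defaultdict

DCM_SYSTEM = "http://dicom.nema.org/resources/ontology/DCM"


def build_imaging_study(instances: list[dict]) -> dict:
    """Group instance-level DICOM attributes into one ImagingStudy dict (illustrative only)."""
    study_uids = {inst["StudyInstanceUID"] for inst in instances}
    if len(study_uids) != 1:
        raise ValueError("expected instances from exactly one study")

    # Series membership is defined by SeriesInstanceUID (0020,000E).
    by_series: dict[str, list[dict]] = defaultdict(list)
    for inst in instances:
        by_series[inst["SeriesInstanceUID"]].append(inst)

    series = []
    for series_uid, members in by_series.items():
        series.append({
            "uid": series_uid,
            "modality": {"system": DCM_SYSTEM, "code": members[0]["Modality"]},
            "numberOfInstances": len(members),
            "instance": [
                {
                    "uid": m["SOPInstanceUID"],
                    "sopClass": {"system": "urn:ietf:rfc:3986",
                                 "code": f"urn:oid:{m['SOPClassUID']}"},
                }
                for m in members
            ],
        })

    return {
        "resourceType": "ImagingStudy",
        "status": "available",
        "identifier": [{"system": "urn:dicom:uid",
                        "value": f"urn:oid:{study_uids.pop()}"}],
        "numberOfSeries": len(series),
        "numberOfInstances": len(instances),
        "series": series,
    }


# Hypothetical sample metadata: one CT study with two series and three images.
CT_IMAGE_STORAGE = "1.2.840.10008.5.1.4.1.1.2"
instances = [
    {"StudyInstanceUID": "1.2.3", "SeriesInstanceUID": "1.2.3.1", "SOPInstanceUID": "1.2.3.1.1",
     "Modality": "CT", "SOPClassUID": CT_IMAGE_STORAGE},
    {"StudyInstanceUID": "1.2.3", "SeriesInstanceUID": "1.2.3.1", "SOPInstanceUID": "1.2.3.1.2",
     "Modality": "CT", "SOPClassUID": CT_IMAGE_STORAGE},
    {"StudyInstanceUID": "1.2.3", "SeriesInstanceUID": "1.2.3.2", "SOPInstanceUID": "1.2.3.2.1",
     "Modality": "CT", "SOPClassUID": CT_IMAGE_STORAGE},
]
study = build_imaging_study(instances)
```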
Native DICOMweb (STOW-RS, WADO-RS, QIDO-RS) direct to Cloud Healthcare API DICOM Store. RESTful, standards-based.
For legacy PACS using C-STORE/C-FIND. DIMSE proxy translates traditional DICOM networking to DICOMweb for cloud ingestion.
Bulk DICOM archive migration. Upload Part 10 files to Cloud Storage, then import into DICOM Store via batch job.
Pub/Sub notifications on new DICOM instances trigger downstream pipelines: metadata extraction, de-identification, AI inference.
CT/MR studies can be 1+ GB. Requires chunked uploads, resumable transfers, and efficient network utilization for cloud migration.
Lossy vs. lossless compression (JPEG2000, JPEG-LS). Lossy acceptable for viewing but not for AI training or primary diagnosis.
Patient name/DOB baked into pixel data (ultrasound overlays, scanned docs). Requires OCR-based pixel scrubbing for de-identification.
Ultrasound clips, cardiac cine MRI, fluoroscopy — multi-frame objects need special handling for storage, viewing, and AI processing.
Vertex AI inference requires pixel extraction, normalization, and pre-processing. Transfer syntax conversion may be needed.
Patients imaged at multiple facilities. StudyInstanceUIDs differ; requires MPI matching and study linking across PACS systems.
Laboratory Information Systems generating orders, specimens, and results via HL7v2 and FHIR — the highest-volume discrete clinical data source.
Enterprise LIS
Chemistry, Heme, Micro, BB
Integrated with Epic EHR
AP & CP modules
Oracle Health LIS
General & Anatomic Path
MEDITECH integrated lab
Expanse & legacy
Quest Diagnostics, LabCorp
Send-out results via HL7v2
i-STAT, glucometers, ABG
Bedside testing, rapid results
| HL7v2 Segment | FHIR R4 Resource | Code System | Key Fields Mapped |
|---|---|---|---|
| OBR | DiagnosticReport | LOINC | Panel code, status, effectiveDateTime, performer, conclusion |
| OBX | Observation | LOINC | Code, valueQuantity, referenceRange, interpretation (H/L/A), status |
| SPM | Specimen | SNOMED | Type, collection dateTime, source site, condition, container |
| ORC + OBR | ServiceRequest | LOINC / local | Code, requester, priority, status, authoredOn, specimen requirements |
| OBX (Micro) | Observation (component) | SNOMED | Organism, antibiotic, MIC value, interpretation (S/I/R) |
| OBX (AP narrative) | DiagnosticReport.presentedForm | LOINC | Pathology report text (synoptic/narrative), attachment |
ORU^R01 results streamed in real time. ORM^O01 orders captured for order-result linkage. Sub-second delivery for critical values.
Modern LIS platforms expose FHIR R4 endpoints. DiagnosticReport and Observation resources pulled or pushed directly.
Quest, LabCorp, and specialty reference labs return results via HL7v2 or FHIR. Routed through interface engine for normalization.
Point-of-care devices (i-STAT, glucometers) transmit results via device middleware to Pub/Sub for real-time capture.
Result status F → C (corrected). Pipeline must handle updates, maintain audit trail of original vs. corrected values.
Culture results arrive over days: preliminary → organism ID → susceptibilities. Must link all updates to single order.
Real-time alerting on critical values (K+ > 6.5, Hgb < 7). Delta checks detect instrument errors. Pipeline must support < 5 min latency.
Chemistry/heme = discrete numeric. Pathology = narrative text. NLP/embeddings required for AP reports to be AI-queryable.
Local test codes must map to LOINC for interoperability. 60-80% auto-mapped; remainder requires manual curation. Ongoing maintenance.
Ranges differ by lab, instrument, age, sex. Must capture per-result ranges, not global defaults, for accurate interpretation.
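
The critical-value and delta-check requirements above can be sketched as two small predicates. The potassium and hemoglobin thresholds come from the text; the units (mmol/L and g/dL), the analyte keys, and the 50% delta threshold are assumptions made for illustration, since real rules are set per laboratory and per analyte.

```python
from typing import Optional

# Thresholds from the text; units are assumed (potassium in mmol/L, hemoglobin in g/dL).
CRITICAL_RULES = {
    "potassium": lambda v: v > 6.5,
    "hemoglobin": lambda v: v < 7.0,
}


def is_critical(analyte: str, value: float) -> bool:
    """Return True when the value breaches the critical rule for a known analyte."""
    rule = CRITICAL_RULES.get(analyte)
    return bool(rule and rule(value))


def delta_check(previous: Optional[float], current: float,
                max_relative_change: float = 0.5) -> bool:
    """Return True when the change from the prior result is suspiciously large.

    The 50% default is a hypothetical placeholder, not a clinical recommendation.
    """
    if previous is None or previous == 0:
        return False  # no usable baseline to compare against
    return abs(current - previous) / abs(previous) > max_relative_change


alerts = [is_critical("potassium", 6.8), is_critical("hemoglobin", 9.2)]
suspect = delta_check(previous=4.0, current=8.5)
```

A result that is both critical and delta-flagged is a candidate for instrument or specimen error, so an alerting design may want to surface both signals rather than suppress either.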
Consumer wearables, remote patient monitoring devices, and hospital IoT sensors generating continuous time-series health data at massive scale.
Dexcom G7, Abbott Libre
Reading every 5 min, 288/day
Apple Watch, AliveCor, Zio Patch
ECG, rhythm detection
Fitbit, Garmin, Oura
Steps, sleep, HRV, calories
BP cuffs, pulse ox, scales
Cellular-connected home devices
Philips IntelliVue, GE CARESCAPE
HR, SpO2, BP, temp, 1/sec
Alaris, Baxter, Draeger
Drug rates, vent settings, alarms
Dexcom, Fitbit, Withings expose REST APIs. Cloud Functions poll or receive webhooks, normalize, and push to Pub/Sub.
Bedside monitors → local IoT gateway (Capsule, Bernoulli) → HL7v2 or MQTT to Pub/Sub for real-time streaming.
Patient-facing apps write FHIR Observation resources (BP, weight, glucose) directly to Cloud Healthcare API FHIR Store.
Batch export from device platforms (Fitbit data export, CGM CSV downloads). Cloud Storage → Dataflow batch processing.
Motion artifacts, poor sensor contact, environmental interference. Requires signal quality scoring and filtering before clinical use.
Bluetooth dropouts, Wi-Fi dead zones, cellular coverage gaps. Must handle store-and-forward with gap reconciliation.
Device clocks drift. Multiple devices per patient with different time sources. Must normalize to UTC with known accuracy.
90%+ of monitor alarms are non-actionable. AI must filter noise, detect true deterioration patterns, and suppress false positives.
Wearable adherence drops over time. Missing data windows must be flagged, not treated as normal. Engagement tracking needed.
99%+ of readings are normal. Storage cost optimization via tiered storage (hot/warm/cold) and intelligent downsampling is essential.
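
Flagging missing data windows, rather than treating them as normal, can be sketched for a sensor with a fixed cadence such as the 5-minute CGM readings mentioned earlier. The function name, the one-minute tolerance, and the sample timestamps are assumptions for illustration.

```python
from datetime import datetime, timedelta


def find_gaps(timestamps: list[datetime],
              expected_interval: timedelta = timedelta(minutes=5),
              tolerance: timedelta = timedelta(minutes=1)) -> list[tuple[datetime, datetime]]:
    """Return (last_seen, next_seen) pairs where the spacing exceeds the expected cadence."""
    ordered = sorted(timestamps)
    gaps = []
    for prev, cur in zip(ordered, ordered[1:]):
        if cur - prev > expected_interval + tolerance:
            gaps.append((prev, cur))
    return gaps


# Hypothetical readings: regular 5-minute cadence with one 20-minute dropout.
start = datetime(2024, 1, 1, 8, 0)
readings = [start + timedelta(minutes=m) for m in (0, 5, 10, 30, 35)]
gaps = find_gaps(readings)

# Daily coverage relative to the expected 288 readings per day at a 5-minute cadence.
coverage = len(readings) / 288
```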
Sequencing platforms and bioinformatics pipelines generating variant calls, gene expression profiles, and pharmacogenomic data for precision medicine.
Short-read sequencing
WGS, WES, targeted panels
Long-read HiFi sequencing
Structural variants, phasing
Real-time long-read
Rapid turnaround, portable
GATK, BWA-MEM2, DRAGEN
Alignment + variant calling
Sample tracking, clinical interpretation, reporting
PGx platforms (CPIC)
Drug-gene interaction testing
| Genomic Source | FHIR R4 Resource | Key Fields Mapped |
|---|---|---|
| VCF Variant | Observation (variant) | Gene, DNA change (HGVS), protein change, zygosity, allele frequency |
| Sequence Data | MolecularSequence | Reference sequence, coordinate system, quality scores, repository |
| PGx Star Alleles | Observation (haplotype) | Gene (CYP2D6), allele name (*1/*4), metabolizer phenotype |
| PGx Recommendation | Task (medication-recommendation) | Drug, action (adjust dose/avoid), evidence level, CPIC guideline |
| Clinical Report | DiagnosticReport (genetics) | Conclusion, variant list, interpretation (P/LP/VUS/LB/B), performer |
| Panel / Test Order | ServiceRequest | Panel code, specimen, requester, reason (condition), priority |
FASTQ and BAM files uploaded to Cloud Storage with lifecycle policies. Multi-region for durability, Nearline/Archive for cost optimization.
Cloud Batch runs GATK/DeepVariant workflows. Auto-scaling VMs, preemptible instances for cost. WDL/Nextflow orchestration.
Variant Transforms loads VCF into BigQuery. Join with ClinVar, gnomAD for annotation. SQL-based variant filtering and cohort queries.
Hail on Dataproc for large-scale cohort analysis: GWAS, burden tests, PCA. Scales to millions of variants across thousands of samples.
Single WGS = 100+ GB raw. 50K samples/year = 5+ PB. Requires tiered storage, compression (CRAM), and efficient transfer.
WGS alignment + calling: 2-24 hours per sample. Requires auto-scaling compute (Cloud Batch) and spot/preemptible instances for cost.
40-60% of variants are VUS (Variants of Uncertain Significance). Requires ongoing reclassification as databases update.
As reference databases (ClinVar, gnomAD) update, prior results need re-annotation. Must maintain pipeline versioning and audit trail.
Incidental findings, right not to know, family implications. Consent management and result disclosure policies vary by institution.
Reference genomes biased toward European ancestry. gnomAD coverage varies by population. Equity implications for variant calling accuracy.
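
The storage figure quoted above follows from simple arithmetic, sketched here so the estimate can be rerun with local numbers. It uses decimal units (1 PB = 1,000,000 GB) and counts raw data only; the function name is hypothetical and no compression or tiering savings are modeled.

```python
GB_PER_PB = 1_000_000  # decimal units: 1 PB = 1,000,000 GB


def raw_storage_pb(samples_per_year: int, gb_per_sample: float, years: int = 1) -> float:
    """Raw sequencing storage in petabytes, before compression or tiering."""
    return samples_per_year * gb_per_sample * years / GB_PER_PB


annual_pb = raw_storage_pb(50_000, 100)               # figures from the text
three_year_pb = raw_storage_pb(50_000, 100, years=3)  # cumulative if nothing is deleted
```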
Administrative claims data and Social Determinants of Health providing the financial, utilization, and social context layer for population health and equity analytics.
Availity, Change Healthcare
X12 837/835 transaction hub
CMS Blue Button 2.0 (FHIR)
State Medicaid feeds
UHC, Anthem, Aetna, Cigna
EDI 837P/837I feeds
Regional health exchanges
ADT notifications, claims
Demographics, income, education
By FIPS / ZIP / tract
Food desert research atlas
Low access / low income tracts
County health estimates
Social Vulnerability Index
Area Deprivation Index
Housing, social services
| Source | FHIR R4 Resource | IG / Profile | Key Fields Mapped |
|---|---|---|---|
| X12 837 | Claim | CARIN BB | Type, provider, diagnosis, procedure, total, item lines |
| X12 835 (ERA) | ExplanationOfBenefit | CARIN BB | Payment, adjudication, adjustments, patient responsibility |
| X12 270/271 | Coverage | Da Vinci PDex | Payor, subscriber, group, period, type, beneficiary |
| Payer / Provider | Organization | US Core | NPI, name, type, address, active status |
| SDoH Screening | QuestionnaireResponse | Gravity SDOH | Questionnaire ref, items, answers, authored date |
| SDoH Need | Condition | Gravity SDOH | Category (sdoh), code (Z-code), clinicalStatus, evidence |
| SDoH Referral | ServiceRequest / Task | Gravity SDOH | Category, code, status, requester, performer (CBO), for (patient) |
X12 837/835 or CSV flat files from clearinghouses. Batch upload to Cloud Storage, parsed by Dataflow, loaded into BigQuery.
CMS Blue Button 2.0, payer Patient Access APIs. ExplanationOfBenefit resources pulled via FHIR R4 into Cloud Healthcare API.
Census/ACS, ADI, SVI, USDA datasets loaded into BigQuery. Joined to patient records by FIPS code, ZIP, or census tract.
AHC-HRSN / PRAPARE screening responses flow from EHR via HL7v2 or FHIR. Z-codes captured in ADT/DG1 segments.
30-90 day delay from service to paid claim. Denials, resubmissions, and adjustments create multiple versions. Must handle retroactive changes.
Z-code capture is 5-10%. Screening adoption is uneven. Geocoded indices are proxies, not individual-level data. Gaps in rural areas.
Patient addresses may be PO boxes, shelters, or outdated. Census tract assignment requires geocoding services and address standardization.
Area-level deprivation (ADI) indicates risk; individual screening identifies need. Both are needed, but they answer different questions: area-level risk does not equal an individual's experience.
Patients have claims across multiple payers (Medicare + commercial). No universal patient ID. Requires probabilistic matching and deduplication.
Patients may not consent to sharing social needs data. Sensitive categories (domestic violence, substance use). Must respect preferences.
The unified ingestion pipeline: healthcare-native APIs, event-driven messaging, Apache Beam processing, and clinical data harmonization on GCP.
One topic per data type enables independent scaling, filtering, and consumer isolation.
| Feature | Configuration | Purpose |
|---|---|---|
| Exactly-Once Delivery | enable_exactly_once_delivery: true | Prevents redelivery of acknowledged messages (pull subscriptions only) |
| Message Ordering | ordering_key: patient_id | In-order per patient for ADT events |
| Dead-Letter Topics | max_delivery_attempts: 5 | Failed messages routed for triage |
| Push Subscriptions | push_endpoint: Cloud Run URL | Low-latency alert triggers |
| Pull Subscriptions | ack_deadline: 60s | Dataflow streaming consumption |
| Retention | message_retention: 7d | Replay window for reprocessing |
Immutable distributed datasets — each step produces a new PCollection
Element-wise transforms — HL7v2 parsing, FHIR mapping, validation
Fixed (1-min), sliding (5-min/1-min), session (30-min gap) windows
Event-time progress tracking — handle late data with allowed lateness
Broadcast lookup tables — terminology maps, facility configs
Failed elements routed to BigQuery error table + DLQ topic
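
The windowing and late-data behavior listed above can be illustrated without the Beam SDK. The sketch below is plain Python that mimics how an event timestamp is assigned to fixed and sliding windows, and when a window is past its allowed lateness. It is not Beam's API, and the function names are hypothetical.

```python
def fixed_window(ts: float, size: float = 60.0) -> tuple[float, float]:
    """Return the [start, end) fixed window containing event time ts (seconds)."""
    start = ts - (ts % size)
    return (start, start + size)


def sliding_windows(ts: float, size: float = 300.0, period: float = 60.0) -> list[tuple[float, float]]:
    """Return every [start, end) sliding window of the given size and period containing ts."""
    windows = []
    start = ts - (ts % period)          # latest window start at or before ts
    while start > ts - size:            # the window still covers ts
        windows.append((start, start + size))
        start -= period
    return sorted(windows)


def is_too_late(window_end: float, watermark: float, allowed_lateness: float) -> bool:
    """An element is dropped once the watermark passes window end plus allowed lateness."""
    return watermark > window_end + allowed_lateness


one_minute = fixed_window(125.0)
five_minute = sliding_windows(125.0)
```

With a 5-minute size and a 1-minute period, each event falls into five overlapping windows, which is why sliding windows multiply downstream state and output volume.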
| Capability | Detail | Output |
|---|---|---|
| Patient Matching (EMPI) | Probabilistic + deterministic matching on name, DOB, SSN, MRN | Golden patient_id |
| FHIR Harmonization | Normalize heterogeneous FHIR into canonical R4 profiles | Conformant FHIR bundles |
| Terminology Normalization | Map local codes → SNOMED CT, LOINC, RxNorm, ICD-10 | Standard coded values |
| Data Quality Rules | Completeness, validity, consistency checks per resource type | Quality score + flags |
| Longitudinal Assembly | Merge records across sources into single patient timeline | Unified patient record |
| De-identification | Safe Harbor / Expert Determination for research datasets | De-identified FHIR |
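
To make the patient-matching row concrete, here is a minimal sketch that combines a deterministic rule with a weighted probabilistic score. The field names, weights, and use of `difflib` for name similarity are assumptions for illustration; they do not describe how Healthcare Data Engine or any specific EMPI product scores candidates.

```python
from difflib import SequenceMatcher

# Hypothetical weights; a real EMPI tunes these against labeled match data.
WEIGHTS = {"name": 0.4, "dob": 0.4, "ssn_last4": 0.2}


def name_similarity(a: str, b: str) -> float:
    """Case-insensitive string similarity in [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def match_score(a: dict, b: dict) -> float:
    """Return 1.0 on a deterministic MRN match, else a weighted similarity score."""
    # Deterministic rule: same MRN issued by the same facility.
    if a.get("mrn") and a.get("mrn") == b.get("mrn") and a.get("facility") == b.get("facility"):
        return 1.0
    # Probabilistic fallback: weighted agreement across demographic fields.
    score = WEIGHTS["name"] * name_similarity(a["name"], b["name"])
    score += WEIGHTS["dob"] * (a["dob"] == b["dob"])
    ssn_agrees = bool(a.get("ssn_last4")) and a.get("ssn_last4") == b.get("ssn_last4")
    score += WEIGHTS["ssn_last4"] * ssn_agrees
    return score


# Hypothetical records from two facilities.
rec_a = {"mrn": "A100", "facility": "north", "name": "Jane Doe", "dob": "1980-02-15", "ssn_last4": "1234"}
rec_b = {"mrn": "B200", "facility": "south", "name": "Jane Do", "dob": "1980-02-15", "ssn_last4": "1234"}
rec_c = {"mrn": "C300", "facility": "south", "name": "John Smith", "dob": "1975-07-01"}

likely_same = match_score(rec_a, rec_b)
likely_different = match_score(rec_a, rec_c)
```

Scores between a high auto-link threshold and a low auto-reject threshold would typically go to manual review; those thresholds are a policy decision and are not shown here.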
Immutable landing zone in BigQuery and Cloud Storage. Source-of-truth copies for audit, compliance, and reprocessing.
Data written once, never modified. Append-only ingestion preserves original fidelity.
Exact copy of upstream data. All downstream zones derive from raw — enables full recompute.
Every record timestamped with ingestion metadata. Supports HIPAA audit and regulatory review.
When transformation logic changes, replay from raw. No need to re-extract from source systems.
| Column | Type | Description |
|---|---|---|
| resource_type | STRING | Patient, Encounter, Observation, Condition, etc. |
| id | STRING | FHIR resource ID (server-assigned UUID) |
| meta_last_updated | TIMESTAMP | Server-side last modified timestamp |
| meta_version_id | STRING | Resource version for optimistic concurrency |
| resource_json | JSON | Full FHIR R4 resource payload |
| source_fhir_store | STRING | Cloud Healthcare API FHIR store path |
| ingestion_timestamp | TIMESTAMP | Pipeline ingestion time (partition key) |
| Column | Type | Description |
|---|---|---|
| message_id | STRING | Unique message control ID (MSH-10) |
| message_type | STRING | ADT, ORM, ORU, SIU, MDM, etc. |
| trigger_event | STRING | A01, A03, O01, R01, etc. |
| sending_facility | STRING | MSH-4 sending facility identifier |
| sending_application | STRING | MSH-3 sending application name |
| raw_message | STRING | Original pipe-delimited HL7v2 message |
| parsed_segments | JSON | Structured JSON of all segments (MSH, PID, PV1, OBX...) |
| message_datetime | TIMESTAMP | MSH-7 message date/time |
| ingestion_timestamp | TIMESTAMP | Pipeline arrival time (partition key) |
| Column | Type | Description |
|---|---|---|
| claim_id | STRING | Payer-assigned claim identifier |
| claim_type | STRING | Professional (837P), Institutional (837I), Dental (837D) |
| service_date_from | DATE | Service start date |
| service_date_to | DATE | Service end date |
| dx_codes | ARRAY<STRING> | ICD-10-CM diagnosis codes (primary + secondary) |
| px_codes | ARRAY<STRING> | CPT/HCPCS procedure codes |
| billed_amount | NUMERIC | Total billed amount |
| allowed_amount | NUMERIC | Payer-allowed amount |
| payer_name | STRING | Insurance payer identifier |
| raw_x12 | STRING | Original X12 transaction content |
| ingestion_timestamp | TIMESTAMP | Pipeline arrival time (partition key) |
Prefix structure: gs://project-raw/{source}/{type}/{YYYY}/{MM}/{DD}/
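
The prefix structure above can be generated with a small helper. This is a sketch only: the function name is hypothetical, and the bucket name `project-raw` is taken from the example prefix as a placeholder.

```python
from datetime import date


def raw_prefix(source: str, data_type: str, arrival: date, bucket: str = "project-raw") -> str:
    """Build the date-partitioned landing prefix for one source and data type."""
    return f"gs://{bucket}/{source}/{data_type}/{arrival:%Y/%m/%d}/"


# Hypothetical source and type values.
prefix = raw_prefix("pacs-main", "dicom", date(2024, 3, 7))
```

Zero-padded date components keep prefixes lexicographically sortable, which simplifies lifecycle rules and date-range listing.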
source_system: originating system ID
data_type: dicom, genomics, document
phi_flag: true/false
ingestion_date: ISO 8601 arrival date
retention_class: hot, warm, cold

| Check Type | Tool | Example Rule |
|---|---|---|
| Schema Validation | Dataplex Data Quality | All required columns present, correct types |
| Completeness | Dataplex Data Quality | patient_id NOT NULL, message_type NOT NULL |
| Duplicate Detection | Dataform assertion | COUNT(DISTINCT message_id) = COUNT(*) |
| Freshness Monitoring | Dataplex Data Quality | MAX(ingestion_timestamp) within last 15 minutes |
| Range Validation | Dataplex Data Quality | ingestion_timestamp between source_time and NOW() |
| Volume Anomaly | Cloud Monitoring | Daily row count within 2 stddev of trailing 30-day mean |
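
The volume anomaly rule in the last row can be expressed directly. The sketch below is plain Python rather than a Cloud Monitoring configuration; the function name and sample counts are hypothetical, and it uses the sample standard deviation of the trailing window.

```python
from statistics import mean, stdev


def is_volume_anomaly(trailing_counts: list[int], today_count: int, max_sigma: float = 2.0) -> bool:
    """Flag today's row count if it is more than max_sigma deviations from the trailing mean."""
    mu = mean(trailing_counts)
    sigma = stdev(trailing_counts)
    if sigma == 0:
        return today_count != mu      # perfectly flat history: any change is notable
    return abs(today_count - mu) > max_sigma * sigma


# Hypothetical trailing 30 days of daily row counts.
trailing = [1000, 1010, 990] * 10
normal_day = is_volume_anomaly(trailing, 1005)
outage_day = is_volume_anomaly(trailing, 500)
```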
Every raw dataset/bucket registered as a Dataplex asset within the healthcare lake. Auto-discovery scans for new tables.
Automated tagging: source system, data classification (PHI/PII/public), ingestion date, owner team, retention policy.
BigQuery policy tags on PHI columns (SSN, name, DOB). Data Catalog taxonomy enforces access via IAM.
Dataplex lineage captures raw → curated → enriched provenance. Integrated with Dataform DAGs.
Normalized, deduplicated, quality-controlled healthcare data ready for analytics and downstream enrichment.
Standard terminologies (SNOMED, LOINC, RxNorm, ICD-10). Consistent schemas across sources.
EMPI-resolved patient identity. One golden record per patient, encounter, observation.
Dataform assertions + Dataplex DQ rules enforce integrity. Quality score per record.
Flat, queryable tables optimized for BigQuery. Partitioned and clustered for performance.
| Column | Type | Description |
|---|---|---|
| patient_id | STRING | EMPI-resolved universal patient identifier |
| mrns | ARRAY<STRUCT> | All known MRNs [{mrn, facility, active}] |
| given_name | STRING | Patient first name (best-known) |
| family_name | STRING | Patient last name (best-known) |
| date_of_birth | DATE | Date of birth |
| gender | STRING | Administrative gender |
| race | STRING | OMB race category |
| ethnicity | STRING | OMB ethnicity category |
| address | STRUCT | Primary address (line, city, state, zip) |
| primary_pcp | STRING | Primary care provider NPI |
| risk_scores | STRUCT | {hcc_score, lace_score, cci_score} |
| last_encounter_date | DATE | Most recent encounter date |
| insurance | ARRAY<STRUCT> | Active coverage [{payer, plan, member_id, type}] |
| is_deceased | BOOLEAN | Deceased flag |
| updated_at | TIMESTAMP | Last curated-zone update timestamp |
| Column | Type | Description |
|---|---|---|
| encounter_id | STRING | Unique encounter identifier |
| patient_id | STRING | FK to patient_master |
| encounter_type | STRING | ambulatory, emergency, inpatient, virtual |
| encounter_class | STRING | AMB, EMER, IMP, VR (FHIR class codes) |
| facility_id | STRING | Facility / location identifier |
| department | STRING | Department name |
| admit_date | TIMESTAMP | Admission or check-in time |
| discharge_date | TIMESTAMP | Discharge or check-out time |
| attending_npi | STRING | Attending provider NPI |
| diagnoses | ARRAY<STRUCT> | [{icd10, description, rank, type}] |
| procedures | ARRAY<STRUCT> | [{cpt, description, date}] |
| disposition | STRING | Discharge disposition code |
| Column | Type | Description |
|---|---|---|
| observation_id | STRING | Unique observation identifier |
| patient_id | STRING | FK to patient_master |
| encounter_id | STRING | FK to encounters (nullable for ambulatory) |
| loinc_code | STRING | LOINC observation code |
| display_name | STRING | Human-readable observation name |
| value_numeric | FLOAT64 | Numeric result (if applicable) |
| value_text | STRING | Text result (if non-numeric) |
| units | STRING | UCUM unit of measure |
| reference_range | STRING | Normal reference range |
| abnormal_flag | STRING | H, L, HH, LL, A, N |
| effective_date | TIMESTAMP | Clinically relevant date/time |
| source_system | STRING | Originating system identifier |
| Rule Type | Tool | Example |
|---|---|---|
| Not Null | Dataform assertion | patient_id, encounter_id, loinc_code must be non-null |
| Valid Range | Dataform assertion | Heart rate 20-300, temp 90-110F, SpO2 50-100% |
| Referential Integrity | Dataform assertion | All encounter.patient_id exists in patient_master |
| Code System Validation | Dataform assertion | loinc_code matches LOINC reference table |
| Completeness Score | Dataplex DQ | % of required fields populated per record |
| Timeliness | Dataplex DQ | Curated table refresh < 30 min after raw arrival |
All curated tables partitioned on primary date column (admit_date, effective_date, updated_at). Enables efficient time-range queries.
Clustering on patient_id collocates patient data for fast $everything-style queries across encounters, observations, conditions.
Pre-computed aggregations: active_patients, recent_admissions, pending_results. Auto-refreshed by BigQuery.
BigQuery BI Engine reservations on high-traffic curated tables for sub-second Looker dashboard queries.
ML features, embeddings, cohorts, and research marts — the AI-ready layer of the healthcare lakehouse.
Pre-computed risk scores, utilization metrics, temporal aggregations ready for model training and inference.
Vector representations of clinical notes, imaging, and lab panels for semantic search and similarity.
Pre-built patient cohorts for clinical trials, quality measures, and population health programs.
Disease-specific and operational data marts optimized for analytics and Looker dashboards.
| Entity Type | Key Features | Online Serving | Offline Serving |
|---|---|---|---|
| patient | risk_scores, demographics, utilization_30d, med_count, last_a1c, insurance_type | < 10ms (Bigtable) | BigQuery export |
| encounter | los_hours, icu_flag, diagnosis_count, procedure_count, ed_to_admit_min | < 10ms (Bigtable) | BigQuery export |
| provider | panel_size, avg_los, readmit_rate, specialty, quality_scores | < 10ms (Bigtable) | BigQuery export |
| Table | Key Columns | Embedding Model | Dimensions |
|---|---|---|---|
| clinical_note_embeddings | note_id, patient_id, encounter_id, embedding_vector, model_version, note_type | Med-PaLM / Gemini | 768 / 1024 |
| imaging_embeddings | study_id, series_id, patient_id, embedding_vector, modality, body_part | Med-PaLM Vision | 1024 |
| lab_panel_embeddings | patient_id, panel_date, embedding_vector, panel_type, lab_count | Custom Vertex AI | 256 |
| patient_summary_embeddings | patient_id, embedding_vector, summary_date, model_version | Gemini | 768 |
| Column | Type | Description |
|---|---|---|
| cohort_id | STRING | Unique cohort identifier |
| cohort_name | STRING | Human-readable name (e.g., "T2DM A1c > 9") |
| criteria_definition | JSON | Structured inclusion/exclusion criteria |
| patient_ids | ARRAY<STRING> | Matching patient IDs |
| patient_count | INT64 | Cohort size |
| creation_date | TIMESTAMP | When cohort was computed |
| irb_number | STRING | Associated IRB protocol (if research) |
| refresh_schedule | STRING | daily, weekly, one-time |
| created_by | STRING | Requesting user / team |
Tumor registry, staging, treatment lines, genomic variants, outcomes by regimen.
Echo metrics, cath lab data, LVEF trends, HF readmissions, anticoagulation adherence.
A1c trajectories, insulin dosing, complication rates, eye/foot exam compliance.
ED wait times, OR utilization, bed turnover, discharge delays, staffing ratios.
30-day readmission rates by DRG, payer, provider. Risk-stratified cohorts.
Risk stratification tiers, care gaps (screenings, vaccines), SDoH indices, HEDIS measures.
| Component | Detail | Storage |
|---|---|---|
| Feature Snapshots | Point-in-time feature values at prediction timestamp | BigQuery (versioned) |
| Label Tables | readmission_30d, mortality_inpatient, sepsis_onset, deterioration_6h | BigQuery |
| Train/Val/Test Splits | Temporal split (train < 2024, val = 2024-H1, test = 2024-H2) | BigQuery + GCS |
| Dataset Versioning | Dataplex lineage tracks dataset provenance per model version | Dataplex metadata |
| Data Cards | Dataset documentation: size, demographics, label distribution, known biases | Vertex AI Metadata |
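
The temporal split in the table can be sketched as a function of the prediction date. The boundaries follow the table (train before 2024, validation in the first half of 2024, test in the second half); the function name and the `unassigned` label for later dates are assumptions.

```python
from datetime import date


def assign_split(prediction_date: date) -> str:
    """Assign a row to train, val, or test by prediction date to avoid temporal leakage."""
    if prediction_date < date(2024, 1, 1):
        return "train"
    if prediction_date < date(2024, 7, 1):
        return "val"        # 2024-H1
    if prediction_date < date(2025, 1, 1):
        return "test"       # 2024-H2
    return "unassigned"     # outside the documented split ranges


splits = [assign_split(d) for d in (date(2023, 12, 31), date(2024, 1, 1),
                                    date(2024, 6, 30), date(2024, 7, 1))]
```

Splitting on time rather than at random keeps future information out of training, which matters for labels such as 30-day readmission that depend on events after the prediction timestamp.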
Convert all healthcare data types into dense vector representations enabling semantic search, similarity matching, and cross-modality reasoning across the clinical data ecosystem.
| Model | Data Type | Dimensions | Use Case |
|---|---|---|---|
| text-embedding-005 | General text | 768 | Clinical notes, discharge summaries, guidelines |
| text-multilingual-embedding-002 | Multilingual text | 768 | Patient-facing materials, consent forms |
| Med-PaLM Embeddings | Clinical text | 768 | H&P notes, radiology/pathology reports, medical Q&A |
| Health AI Developer Foundations | Medical imaging | 1024 | X-ray, CT, pathology slide embeddings |
| Custom Fine-Tuned (Vertex AI Training) | Domain-specific | 768/1024 | Org-specific terminology, specialty notes, lab panels |
| multimodalembedding@001 | Image + text | 1408 | Cross-modal search: text query → image results |
Scheduled and event-driven embedding generation for all new and updated clinical data via Dataflow orchestration.
On-demand embedding for new documents and agent queries via Vertex AI online prediction endpoints.
Incremental updates to vector indices as new embeddings arrive, ensuring near-real-time search availability.
| Service | Vector Capability | Search Algorithm | Best For | Latency |
|---|---|---|---|---|
| BigQuery | VECTOR type, VECTOR_SEARCH() | Cosine / Dot Product / Euclidean | Analytical queries, cohort similarity, SQL joins with vectors | Seconds (analytical) |
| Vertex AI Vector Search | Managed ANN index, deployed endpoints | ScaNN (Scalable Nearest Neighbors) | Low-latency serving, real-time agent retrieval, RAG pipeline | < 10ms (p99) |
| AlloyDB | pgvector extension, ANN index | IVFFlat / HNSW | Transactional + vector hybrid, app-embedded search | < 50ms |
| Spanner | K-Nearest Neighbors (approx) | Cosine distance built-in | Global-scale transactional with vector search | < 20ms |
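
For intuition about what the services above compute, the sketch below runs an exact, brute-force cosine-similarity search over a tiny in-memory corpus. It is deliberately not an approximate nearest-neighbor index such as ScaNN or HNSW, and the corpus keys and vectors are hypothetical.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query: list[float], corpus: dict[str, list[float]], k: int = 3) -> list[str]:
    """Return the keys of the k most similar vectors, best first (exact search)."""
    scored = [(cosine_similarity(query, vec), key) for key, vec in corpus.items()]
    scored.sort(reverse=True)
    return [key for _, key in scored[:k]]


# Hypothetical 2-dimensional embeddings; real ones have hundreds of dimensions.
corpus = {"note_a": [1.0, 0.0], "note_b": [0.0, 1.0], "note_c": [1.0, 1.0]}
nearest = top_k([1.0, 0.1], corpus, k=2)
```

Exact search scans every vector, so its cost grows linearly with corpus size; the managed ANN indexes trade a small amount of recall for much lower latency at scale.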
Track distribution shift in embedding space over time. Alert when new data deviates significantly from training distribution.
Track model_id + model_version per vector. Support side-by-side versions during migration. Vertex AI Model Registry for lineage.
On model update, trigger batch re-embedding of existing corpus. Dataflow job with BigQuery source, write-back with new model_version.
Compare embedding quality across models using retrieval precision/recall on curated eval sets. Vertex AI Experiments for tracking.
UMAP / t-SNE projections for visualization and debugging. Stored as 2D/3D coordinates for dashboarding in Looker.
Embedding expiration aligned with source data retention policies. Automated cleanup via BigQuery scheduled queries.
Connect structured and unstructured healthcare data to LLMs via retrieval-augmented generation for grounded, accurate clinical AI with full citation and source attribution.
| Data Store | Source | Search Mode | Content Indexed |
|---|---|---|---|
| Clinical Corpus | BigQuery | Semantic + Keyword (Blended) | Clinical notes, discharge summaries, radiology/pathology reports |
| Guidelines Corpus | Cloud Storage | Semantic | Clinical pathways, protocols, formulary rules, order set docs |
| FHIR Store | Cloud Healthcare API | FHIR Search + Semantic | Structured patient data (conditions, meds, observations, encounters) |
| Research Corpus | Cloud Storage / URLs | Semantic + Faceted | PubMed abstracts, internal publications, trial protocols |
Respect section boundaries in clinical notes. Each chunk maps to a logical section (HPI, Assessment, Plan, ROS). Preserves clinical context within chunks.
Fixed-size chunks (512 tokens) with 64-token overlap for unstructured documents. The overlap reduces, but does not eliminate, context loss at chunk boundaries.
Every chunk retains: patient_id, encounter_date, note_type, author, section_name. Enables filtered retrieval by patient, date range, or note type at query time.
Parent-child chunk structure: document summary (parent) + section chunks (children). Search children, return parent context for richer grounding.
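
A minimal sketch of the fixed-size strategy with overlap and per-chunk metadata, assuming the document has already been tokenized into a list of strings. `Chunk` and `chunk_tokens` are illustrative names. A real pipeline would use the embedding model's own tokenizer.

```python
from dataclasses import dataclass

@dataclass
class Chunk:
    tokens: list[str]
    metadata: dict

def chunk_tokens(tokens: list[str], metadata: dict,
                 size: int = 512, overlap: int = 64) -> list[Chunk]:
    """Split tokens into windows of `size` that share `overlap` tokens."""
    if not 0 <= overlap < size:
        raise ValueError("overlap must be >= 0 and smaller than size")
    step = size - overlap
    chunks: list[Chunk] = []
    for start in range(0, len(tokens), step):
        window = tokens[start:start + size]
        # Every chunk carries the source metadata plus its own position.
        chunks.append(Chunk(window, dict(metadata, chunk_index=len(chunks))))
        if start + size >= len(tokens):
            break  # the last window already reached the end of the document
    return chunks
```

Copying fields such as `patient_id` and `note_type` onto each chunk is what makes filtered retrieval possible at query time.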
| Capability | GCP Service | Configuration | Impact |
|---|---|---|---|
| Search-based grounding | Vertex AI Grounding API | dynamic_retrieval_config | Connects Gemini to Vertex AI Search results at inference time |
| Citation generation | Gemini + Grounding | grounding_metadata in response | Every claim in response linked to source document + chunk |
| Hallucination reduction | Grounding score threshold | grounding_score ≥ 0.7 | Reject or flag low-confidence answers; fall back to "I don't know" |
| Retrieval parameters | Vertex AI Search API | top_k=10, relevance_threshold=0.5 | Tune precision/recall trade-off per use case |
| Multi-turn context | Vertex AI Conversation | follow_up_search enabled | Maintain retrieval context across multi-turn clinical dialogues |
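
The grounding-score threshold in the table can be sketched as a simple gate applied after generation. `GroundedAnswer` is a hypothetical container, not a Vertex AI response type, and requiring at least one citation is an added assumption.

```python
from dataclasses import dataclass, field

@dataclass
class GroundedAnswer:
    text: str
    grounding_score: float          # assumed to be in [0, 1]
    citations: list[str] = field(default_factory=list)

FALLBACK = "I don't know based on the available records."

def gate_answer(answer: GroundedAnswer, min_score: float = 0.7) -> str:
    """Return the answer only when it is well grounded and cited."""
    if answer.grounding_score >= min_score and answer.citations:
        return answer.text
    return FALLBACK
```

The 0.7 default mirrors the table. The right value would need tuning per use case.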
Query across a patient's full longitudinal record. Retrieve relevant notes, labs, and meds to answer clinician questions with citations.
Retrieve guidelines matching patient context (conditions, labs, meds). Generate concordance assessment and recommended actions.
Match clinical data to payer medical necessity criteria. Auto-generate supporting documentation from patient record.
Find evidence for clinical questions by searching PubMed and internal research corpus. Summarize findings with study citations.
Each indexed chunk inherits access permissions from source. Vertex AI Search enforces ACLs at retrieval time based on user identity.
User role (physician, nurse, admin) determines retrievable document types. Enforced via IAM + custom metadata filters on search queries.
Output guardrails prevent PHI leakage in responses to unauthorized users. DLP API integration for real-time PII/PHI detection.
Every retrieval logged: who queried, what was retrieved, what was returned. Cloud Audit Logs + BigQuery for compliance reporting.
Access policies tied to purpose (treatment, payment, operations, research). HIPAA minimum necessary enforced at query scope.
Emergency override for restricted records with mandatory justification logging and post-access review workflow.
Encode medical ontologies, clinical pathways, and operational rules as a graph to validate and contextualize AI reasoning with structured medical knowledge.
| Service | Type | Query Language | Best For |
|---|---|---|---|
| Neo4j Aura on GCP | Managed graph DB (GCP Marketplace) | Cypher | Full ontology encoding, multi-hop traversals, pathway validation |
| Spanner Graph | Graph layer on Cloud Spanner | GQL (ISO Graph Query Language) | Global-scale, strongly consistent graph + relational hybrid |
| Memorystore (Redis) | In-memory cache | Redis commands (key lookups) | Cached results of frequent traversals, low-latency lookups at inference |
| BigQuery + Graph Analytics | Analytical graph | SQL + GRAPH_PATH() | Batch graph analytics on large-scale clinical datasets |
Encode pathways (sepsis bundle, ACS protocol, diabetes management) as directed graphs with decision nodes, time constraints, and required actions.
Medication → ingredient → interaction edges with severity levels (critical, major, moderate, minor). Used at inference to validate AI medication recommendations.
Time-zero recognition → lactate draw (30 min) → blood cultures (before abx) → broad-spectrum antibiotics (1 hr) → fluid resuscitation (30 mL/kg if hypotensive) → reassess.
Chest pain → 12-lead ECG (10 min) → troponin draw → STEMI pathway (cath lab activation) or NSTEMI pathway (risk stratification) → anticoagulation → cardiology consult.
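
A pathway with time constraints can be sketched as a list of steps checked against observed completion times. The deadlines below come from the sepsis bundle above. The `Step` structure and `check_pathway` function are illustrative assumptions, not the graph schema itself.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class Step:
    name: str
    deadline_min: Optional[int] = None  # minutes after time zero
    before: Optional[str] = None        # step that must come later

SEPSIS_BUNDLE = [
    Step("lactate_draw", deadline_min=30),
    Step("blood_cultures", before="antibiotics"),
    Step("antibiotics", deadline_min=60),
]

def check_pathway(steps: list[Step], completed: dict[str, int]) -> list[str]:
    """completed maps step name -> minutes since time zero."""
    violations = []
    for step in steps:
        done_at = completed.get(step.name)
        if done_at is None:
            violations.append(f"{step.name}: not done")
            continue
        if step.deadline_min is not None and done_at > step.deadline_min:
            violations.append(f"{step.name}: late")
        if step.before is not None:
            other = completed.get(step.before)
            if other is not None and done_at > other:
                violations.append(f"{step.name}: must precede {step.before}")
    return violations
```

In a graph database the same constraints would live on nodes and edges. The list form here only shows the validation logic.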
| Access Method | GCP Service | Protocol | Use Case |
|---|---|---|---|
| Direct graph queries | Neo4j Aura (Bolt) | Bolt protocol / Cypher | Complex traversals, ontology exploration, ad-hoc queries |
| REST endpoints | Cloud Run | HTTPS / JSON | Agent tool calls: validate_medication, check_pathway, lookup_code |
| Cached lookups | Memorystore (Redis) | Redis protocol | Frequent traversals cached: drug interactions, code lookups |
| Agent tool integration | Vertex AI Agent Builder | Tool / Function Calling | Graph queries exposed as callable tools for Gemini agents |
| Batch analytics | BigQuery + Dataflow | SQL + Graph export | Bulk ontology analysis, mapping coverage reports |
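
A `validate_medication` tool call could follow medication → ingredient → interaction edges as in the sketch below. The in-memory dictionaries stand in for graph queries, and the single interaction entry is illustrative sample data, not a clinical reference.

```python
# Hypothetical stand-ins for graph edges (illustrative data only).
INGREDIENTS = {
    "warfarin_5mg_tab": {"warfarin"},
    "aspirin_81mg_tab": {"aspirin"},
    "lisinopril_10mg_tab": {"lisinopril"},
}
INTERACTIONS = {
    frozenset({"warfarin", "aspirin"}): "major",
}

def validate_medication(proposed: str, active_meds: list[str]) -> list[dict]:
    """Return interaction findings between a proposed med and active meds."""
    findings = []
    for med in active_meds:
        for a in INGREDIENTS.get(proposed, set()):
            for b in INGREDIENTS.get(med, set()):
                severity = INTERACTIONS.get(frozenset({a, b}))
                if severity:
                    findings.append({"with": med, "severity": severity})
    return findings
```

An agent would typically block or escalate on `critical` and `major` findings and surface the rest as warnings.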
SNOMED CT (US Edition) is released twice a year, RxNorm monthly, and ICD-10 annually. Automated pipelines ingest new releases and update graph nodes/edges.
Every ontology update creates a versioned snapshot. Enables rollback and point-in-time queries. Stored in Cloud Storage as Neo4j dumps.
Pathway updates require clinical committee review. Approval workflow in Cloud Workflows with human-in-the-loop before graph promotion.
Every node/edge tracks: source ontology, version, last_updated, provenance. Queryable for audit and compliance.
Scheduled Cypher queries detect orphan nodes, broken relationships, and circular hierarchies. Alerts via Cloud Monitoring.
Maintain MAPS_TO edges across ontologies (SNOMED↔ICD-10, LOINC↔CPT). Validate mapping coverage on each release cycle.
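
The integrity checks for orphan nodes and circular hierarchies can be sketched outside the graph database as follows. This is an illustration of the logic only, assuming edges are `(child, parent)` pairs of an IS_A hierarchy. In practice these would be scheduled Cypher queries.

```python
def find_orphans(nodes: list[str], edges: list[tuple[str, str]]) -> list[str]:
    """Nodes that appear in no edge at all."""
    linked = {node for edge in edges for node in edge}
    return sorted(set(nodes) - linked)

def has_cycle(edges: list[tuple[str, str]]) -> bool:
    """Depth-first search for a cycle along child -> parent edges."""
    parents_of: dict[str, list[str]] = {}
    for child, parent in edges:
        parents_of.setdefault(child, []).append(parent)
    in_progress, done = set(), set()

    def visit(node: str) -> bool:
        in_progress.add(node)
        for parent in parents_of.get(node, []):
            if parent in in_progress:
                return True  # walked back into the current path
            if parent not in done and visit(parent):
                return True
        in_progress.discard(node)
        done.add(node)
        return False

    return any(node not in done and visit(node) for node in list(parents_of))
```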
Intelligent data fabric providing consistent, policy-controlled access to distributed healthcare data while appearing unified to AI agents and users.
Patient records, encounters, observations, conditions, medications. FHIR-native views.
DICOM metadata, radiology reports, pathology slides. Linked to clinical context.
ADT census, scheduling, staffing, supply chain, billing. Real-time event streams.
De-identified cohorts, OMOP CDM tables, trial registries. IRB-controlled access.
VCF files, variant annotations, pharmacogenomics panels. Stored in Cloud Storage + BigQuery.
| Access Pattern | GCP Service | Consumers | Use Cases |
|---|---|---|---|
| FHIR R4 REST | Cloud Healthcare API | EHR apps, SMART-on-FHIR, CDS Hooks | Patient read/write, clinical data exchange |
| REST / GraphQL | Cloud Run + Hasura/Apollo | Internal apps, dashboards | Flexible queries over BigQuery curated views |
| Search & RAG | Vertex AI Search | AI agents, clinician search | Semantic search across clinical documents + notes |
| Feature Serving | Vertex AI Feature Store | ML models, prediction agents | Low-latency feature vectors for real-time inference |
| External API Gateway | Apigee | External partners, HIEs, payers | Rate limiting, auth, analytics for external consumers |
| Role | IAM Binding | Access Scope | Mapped Group |
|---|---|---|---|
| Clinician Viewer | roles/healthcare.fhirResourceReader | Own patients, assigned unit | grp-cardiology, grp-oncology, etc. |
| Researcher Analyst | roles/bigquery.dataViewer | De-identified datasets only | grp-research-approved |
| Operations Admin | roles/bigquery.dataEditor | Operational tables, dashboards | grp-ops-managers |
| AI Agent Service Account | roles/aiplatform.user + custom | Scoped per agent type, purpose-bound | sa-clinical-agent@proj.iam |
| External Partner | roles/healthcare.fhirResourceReader | Specific FHIR resources via Apigee | grp-external-payer-feeds |
| Attribute | Source | Examples |
|---|---|---|
| User Role | IAM + Google Groups | clinician, researcher, ops-admin, AI-agent |
| Purpose | Request header / token claim | treatment, payment, operations, research |
| Data Sensitivity | Dataplex tags + DLP classification | PHI, de-identified, public, restricted |
| Patient Consent | FHIR Consent resource | opt-in research, restrict substance-abuse records |
| Context | IAM Conditions | Time of day, IP range, device posture |
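
The attributes in the table combine into an access decision roughly as sketched below. The role-to-purpose mapping and the research rule are simplified assumptions for illustration. Real enforcement would sit in IAM Conditions, Dataplex tags, and FHIR Consent checks.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class AccessRequest:
    role: str
    purpose: str
    sensitivity: str
    patient_consented_to_research: bool = False

# Hypothetical mapping of roles to the purposes they may assert.
ALLOWED_PURPOSES = {
    "clinician": {"treatment"},
    "researcher": {"research"},
    "ops-admin": {"operations", "payment"},
}

def is_allowed(req: AccessRequest) -> bool:
    """Deny by default; allow only known role/purpose combinations."""
    if req.purpose not in ALLOWED_PURPOSES.get(req.role, set()):
        return False
    if req.purpose == "research":
        # Assumed rule: de-identified data, or PHI with opt-in consent.
        if req.sensitivity == "de-identified":
            return True
        return req.sensitivity == "PHI" and req.patient_consented_to_research
    return True
```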
Threat detection, vulnerability scanning, compliance posture management
IAM Recommender: least-privilege suggestions, unused role alerts
Network traffic analysis, anomaly detection, forensic investigation
Looker dashboard: PHI findings, de-id coverage, inspection job status
Centralized security analytics, correlation rules, incident response
Autonomous AI agents that monitor, reason, and recommend within clinical workflows — integrated into EHR, always grounded, always auditable.
Recommendations suppressed below configurable confidence threshold. Low-confidence outputs routed to human review queue.
Every clinical recommendation validated against curated Knowledge Graph (drug interactions, contraindications, guidelines).
High-risk actions (medication changes, code blue alerts, diagnosis) require clinician confirmation before execution.
Every recommendation includes reasoning chain: evidence sources, feature contributions, guideline references.
Clinician overrides logged with reason. Override patterns analyzed for model improvement and safety signal detection.
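
The confidence and high-risk rules above can be sketched as a routing function. The action names follow the examples in the text. The 0.8 threshold and the `Route` enum are assumptions.

```python
from enum import Enum

class Route(Enum):
    HUMAN_REVIEW = "human_review"                  # suppressed, queued for review
    CLINICIAN_CONFIRMATION = "clinician_confirmation"
    DELIVER = "deliver"

HIGH_RISK_ACTIONS = {"medication_change", "code_blue_alert", "diagnosis"}

def route_recommendation(action: str, confidence: float,
                         threshold: float = 0.8) -> Route:
    """Decide how a recommendation reaches (or does not reach) the clinician."""
    if confidence < threshold:
        return Route.HUMAN_REVIEW
    if action in HIGH_RISK_ACTIONS:
        return Route.CLINICIAN_CONFIRMATION
    return Route.DELIVER
```

Checking confidence first means a low-confidence, high-risk suggestion never reaches the clinician as a confirmation prompt.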
| Integration Pattern | Standard | Use Case | Direction |
|---|---|---|---|
| App Launch | SMART-on-FHIR | Agent UI embedded in EHR context (patient, encounter) | EHR → Agent |
| Decision Support | CDS Hooks | patient-view, order-select, order-sign hook triggers | EHR → Agent → EHR |
| Event Subscription | FHIR Subscriptions | New lab result, admission, medication order triggers agent | EHR → Agent |
| Notification Write-Back | FHIR CommunicationRequest | In-basket messages, alerts, task assignments to care team | Agent → EHR |
| Documentation Write-Back | FHIR DocumentReference | AI-generated notes posted back for clinician review | Agent → EHR |
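
For the CDS Hooks pattern, an agent answers a hook call with a JSON body containing cards. The sketch below builds a minimal response with the card fields the CDS Hooks specification requires (`summary`, `indicator`, `source`). The helper name and default source label are hypothetical.

```python
import json

VALID_INDICATORS = {"info", "warning", "critical"}

def build_cds_response(summary: str, detail: str, indicator: str = "info",
                       source_label: str = "Clinical AI Agent") -> dict:
    """Build a CDS Hooks response containing a single card."""
    if indicator not in VALID_INDICATORS:
        raise ValueError(f"indicator must be one of {sorted(VALID_INDICATORS)}")
    if len(summary) >= 140:
        raise ValueError("summary must be under 140 characters")
    return {
        "cards": [{
            "summary": summary,
            "indicator": indicator,
            "detail": detail,
            "source": {"label": source_label},
        }]
    }

response = build_cds_response(
    "Elevated sepsis risk score", "Consider lactate draw and blood cultures.",
    indicator="warning")
body = json.dumps(response)  # what the service would return to the EHR
```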
| Model | Platform | Role | Use Cases |
|---|---|---|---|
| Gemini | Vertex AI | Reasoning & generation | Agent orchestration, note generation, differential diagnosis |
| Med-PaLM | Vertex AI | Clinical Q&A | Medical knowledge retrieval, clinical question answering |
| Custom ML Models | Vertex AI Training | Specialized prediction | Deterioration, readmission, sepsis, LOS prediction |
| Ensemble Scoring | Vertex AI Endpoints | Combined inference | Multi-model consensus for high-stakes clinical decisions |
AI agents optimizing hospital operations — bed management, staffing, throughput, supply chain, and revenue cycle.
| Data Source | Feed Type | GCP Ingestion | Agents Consuming |
|---|---|---|---|
| ADT Feed (Real-time Census) | HL7v2 ADT^A01-A03 | Cloud Healthcare API → Pub/Sub → BigQuery | Bed, Throughput, Staffing |
| Scheduling Systems | SIU messages / API | Dataflow → BigQuery | Throughput, Staffing |
| HR / Timekeeping | Batch / API (Kronos, Workday) | Cloud Storage → Dataflow → BigQuery | Staffing |
| Materials Management | ERP API (Infor, SAP) | Cloud Run connector → BigQuery | Supply Chain |
| Billing / Claims | 837/835 EDI, DFT | Dataflow → BigQuery | Revenue Cycle |
| Patient Satisfaction | Survey API (Press Ganey) | Cloud Functions → BigQuery | Throughput, All |
Real-time ops command center: census, throughput, staffing, revenue cycle KPIs. Role-based views for CNO, CMO, CFO.
Alerts to charge nurses, bed managers, department directors via mobile (Firebase Cloud Messaging).
EVS cleaning triggers, patient transport requests, equipment setup — auto-generated on discharge/transfer events.
Purchase orders, inventory adjustments, staffing schedule changes pushed to ERP systems via Cloud Run connectors.
Target 40% reduction in boarding hours through predictive bed assignment and discharge acceleration.
Target 5-10% improvement in prime-time OR utilization via case duration prediction and turnover optimization.
Target 15-25% reduction in premium labor (agency, overtime) through predictive staffing and float pool optimization.
Target 20-30% reduction in expired supplies through demand-driven par levels and expiration alerts.
Target 2-5% increase in net revenue via charge capture improvement, denial prevention, and faster A/R collection.
AI agents that accelerate clinical research — cohort discovery, literature analysis, trial matching, and population health pattern recognition.
| Access Tier | Data Type | Controls | Use Case |
|---|---|---|---|
| De-identified (Safe Harbor) | 18 identifiers removed via DLP | Open to approved researchers | Cohort discovery, feasibility, population analytics |
| Limited Dataset | Dates + zip3 retained | DUA required, IRB-approved | Longitudinal studies, temporal pattern analysis |
| Honest Broker | Re-linkable via broker only | Broker intermediary, audit trail | Multi-source data linkage, registry enrollment |
| Synthetic Data | Generated via Vertex AI | No restrictions | Model development, algorithm testing, education |
| Identified (PHI) | Full patient data | IRB + patient consent + CISO approval | Interventional trials, direct patient contact |
| System | Integration | GCP Connector | Purpose |
|---|---|---|---|
| REDCap | REST API | Cloud Run connector | Electronic data capture for prospective studies |
| i2b2 / OMOP CDM | BigQuery views | Native BigQuery tables | Standard research data models, OHDSI tool compatibility |
| OHDSI Tools (Atlas, Achilles) | WebAPI | Cloud Run + BigQuery OMOP | Cohort definitions, data quality, characterization |
| SAS / R / Python | BigQuery connectors | bigrquery, pandas-gbq, SAS/ACCESS | Statistical analysis in researcher's preferred tool |
| ClinicalTrials.gov | REST API | Cloud Functions → BigQuery | Trial eligibility criteria ingestion for matching |
| PubMed | E-utilities API | RAG index (Vertex AI Search) | Literature search, evidence retrieval for agents |
Ontologies (SNOMED, LOINC, RxNorm, ICD-10) for query expansion and concept mapping.
36M+ biomedical abstracts indexed in Vertex AI Search for RAG-powered literature retrieval.
400K+ trial records with structured eligibility criteria for automated patient matching.
Local table schemas, field definitions, valid values. Ensures accurate text-to-SQL generation.
Standardized phenotype definitions for reproducible cohort queries across institutions.
Access granted per approved research protocol. Treatment data vs. research data separated at IAM and VPC-SC level.
Every query logged in BigQuery audit tables: who, what, when, which dataset, under which IRB protocol.
Column-level access: researchers see only fields required by their protocol. Enforced via BigQuery policy tags.
Automated risk assessment before data export. K-anonymity and l-diversity checks via Cloud DLP.
FHIR Consent resources integrated: patients opting out of research excluded from query results automatically.
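
The k-anonymity check before export can be illustrated locally: a dataset is k-anonymous when every combination of quasi-identifier values is shared by at least k rows. This sketch is a plain-Python stand-in for the risk analysis the text assigns to Cloud DLP. The column names are assumptions.

```python
from collections import Counter

def k_anonymity(rows: list[dict], quasi_identifiers: list[str]) -> int:
    """Size of the smallest group sharing the same quasi-identifier values."""
    groups = Counter(tuple(row[q] for q in quasi_identifiers) for row in rows)
    return min(groups.values()) if groups else 0

def safe_to_export(rows: list[dict], quasi_identifiers: list[str],
                   k: int = 5) -> bool:
    """Block the export when any group is smaller than k."""
    return k_anonymity(rows, quasi_identifiers) >= k

cohort = [
    {"zip3": "941", "age_band": "40-49", "sex": "F"},
    {"zip3": "941", "age_band": "40-49", "sex": "F"},
    {"zip3": "100", "age_band": "70-79", "sex": "M"},
]
```

Here the third row forms a group of one, so the cohort is only 1-anonymous and would fail any check with k greater than 1.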
Real-time operational, clinical, and executive analytics powered by BigQuery, surfaced through Looker and Looker Studio across the health system.
Managed Looker instance or Looker Core on GKE
Git-managed models on top of BigQuery curated & enriched zones
Lightweight self-service dashboards & ad-hoc reports
Embedded analytics in EHR portals & custom apps
In-memory acceleration, sub-second query response
ML predictions surfaced as Looker metrics (risk scores)
| LookML Component | BigQuery Target | Purpose | Key Details |
|---|---|---|---|
| model: clinical | bq_curated.clinical_* | Clinical domain explores | Encounters, conditions, observations, medications |
| model: operations | bq_curated.ops_* | Operational metrics | Census, throughput, capacity, staffing tables |
| model: finance | bq_curated.rev_cycle_* | Revenue cycle & cost | Claims, charges, payments, denials, A/R aging |
| model: research | bq_enriched.research_* | De-identified research cohorts | Cohort tables, genomic summaries, trial enrollment |
| derived_table (PDT) | bq_scratch.pdt_* | Expensive computed metrics | Readmission flags, risk scores, rolling aggregates |
| access_filter | user_attributes | Row-level security | Filter by facility_id, department, user role |
| aggregate_awareness | bq_curated.agg_* | Query acceleration | Pre-aggregated daily/weekly/monthly rollups |
BigQuery streaming buffer ingests events in real time. Looker dashboards auto-refresh at configurable intervals for operational views.
Threshold-based alerts (e.g., ED boarding > 4h) delivered via email, Slack, or PagerDuty. Scheduled report PDFs for leadership.
System-level → facility → unit → patient-level drill paths. Cross-dashboard linking for root cause analysis.
Vertex AI predictions (risk scores, demand forecasts, readmission probability) written to BigQuery and surfaced as Looker metrics.
| Layer | Mechanism | GCP Service | Details |
|---|---|---|---|
| Authentication | SSO / SAML 2.0 | Google Workspace / Cloud Identity | Federated login, MFA enforced |
| Looker Roles | role-based access | Looker IAM Groups | Admin, developer, viewer, embed-user roles |
| Model Access | model_set | LookML project | Users see only permitted models (clinical, finance, etc.) |
| Row-Level Security | access_filter | BigQuery + LookML | User sees only their facility/department data |
| Content Access | folder permissions | Looker folders/boards | Dashboard visibility controlled by folder ACLs |
| Query Guardrails | query cost limits | BigQuery Reservations | Slot-based quotas, per-user query byte limits |
Embed interactive dashboards directly in EHR portals, custom web apps, and patient portals with SSO pass-through.
Trigger downstream workflows from dashboard data: generate outreach lists, push to CRM, create tasks in care management systems.
Build custom React-based applications hosted inside Looker for specialized workflows (e.g., clinical registry management).
Lightweight self-service dashboards for business users who need ad-hoc exploration without LookML complexity.
Surface AI insights and platform capabilities directly within the clinician's EHR workflow — zero context switching.
App launched from EHR context
Patient + encounter pre-populated
App launched independently
User selects patient context
patient/*.read, user/*.read
launch/patient, launch/encounter
Patient ID, Encounter ID
User identity & role
Registered with EHR vendor
Client ID, redirect URIs, scopes
Short-lived access tokens
Refresh token rotation
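
The launch parameters above come together in the authorization request an app sends at the start of a SMART App Launch. The sketch below builds that URL. The endpoint, client ID, and redirect URI in the test are placeholders, and PKCE parameters are omitted for brevity.

```python
import secrets
from typing import Optional
from urllib.parse import urlencode

def build_authorize_url(authorize_endpoint: str, client_id: str,
                        redirect_uri: str, fhir_base_url: str,
                        scopes: list[str], launch: Optional[str] = None,
                        state: Optional[str] = None) -> str:
    """Assemble the OAuth 2.0 authorization request for a SMART app."""
    params = {
        "response_type": "code",
        "client_id": client_id,
        "redirect_uri": redirect_uri,
        "scope": " ".join(scopes),
        "state": state or secrets.token_urlsafe(16),  # CSRF protection
        "aud": fhir_base_url,                         # FHIR server being accessed
    }
    if launch is not None:
        # EHR launch: opaque context token handed to the app by the EHR.
        params["launch"] = launch
    return f"{authorize_endpoint}?{urlencode(params)}"
```

A standalone launch omits `launch` and requests `launch/patient` so the user can pick the patient during authorization.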
BigQuery curated data exposed as standard FHIR endpoints. The EHR reads enriched/computed data as if it were a native FHIR server.
AI recommendations require clinician approval before writing back. Audit trail and undo capability enforced on every write.
| EHR Vendor | App Marketplace | SMART Support | CDS Hooks | Key Notes |
|---|---|---|---|---|
| Epic | App Orchard / Gallery | Full (Hyperdrive web) | Supported | USCDI v3, Bulk FHIR, embedded via Hyperspace/Hyperdrive |
| Oracle Health (Cerner) | code Console | Full (Ignite APIs) | Supported | Ignite FHIR R4, Millennium HL7v2 feeds; code Console is the counterpart to open.epic |
| MEDITECH | Greenfield | SMART R1 (expanding) | Limited | Expanse FHIR R4, Greenfield SMART for Expanse web |
| athenahealth | Marketplace | FHIR R4 | Roadmap | Cloud-native, strong API-first approach, REST APIs |
| Use Case | FHIR Resource | Source | Safety Controls |
|---|---|---|---|
| Risk score documentation | Observation | Vertex AI prediction | Clinician approval required, audit log |
| Care plan creation | CarePlan | AI agent recommendation | Human review, undo within 24h |
| Order recommendation | ServiceRequest | CDS Hook suggestion | Clinician must sign, no auto-ordering |
| Note generation | DocumentReference | Ambient documentation | Clinician edits & co-signs before commit |
| Problem list update | Condition | Diagnostic decision support | Suggestion only, clinician confirms |
Patient-facing digital experiences grounded in the AI platform — safe, personalized, and accessible across all channels.
Agent never provides a diagnosis. Symptom triage routes to appropriate care level (ER, urgent care, PCP, self-care) with disclaimers.
Agent cannot prescribe, adjust, or recommend stopping medications. All medication queries reference existing prescription data only.
Keywords (chest pain, suicidal, can't breathe) trigger immediate 911/crisis line redirect. No further conversation on emergency topics.
Clinical concerns beyond agent scope escalated to nurse triage line or provider message. Patient can request human at any time.
All answers grounded in patient's FHIR data and vetted content (MedlinePlus, institutional patient education). Hallucination detection active.
Every clinical response includes disclaimer: "This is not medical advice. Contact your provider for medical decisions." Confidence score shown.
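
The emergency keyword guardrail can be sketched as a pre-check that runs before any agent reasoning. The patterns cover only the three examples named above, and the reply text is a placeholder. Plain keyword matching will miss paraphrases, so a real system would likely pair it with a classifier.

```python
import re

# Patterns for the example keywords only; a real list would be far longer.
EMERGENCY_PATTERNS = [
    re.compile(pattern, re.IGNORECASE)
    for pattern in (r"\bchest pain\b", r"\bsuicid", r"\bcan'?t breathe\b")
]
EMERGENCY_REPLY = "If this is an emergency, call 911 or a crisis line now."

def triage_message(text: str) -> dict:
    """Return an escalation decision before the agent sees the message."""
    normalized = text.replace("\u2019", "'")  # treat curly apostrophes as straight
    if any(p.search(normalized) for p in EMERGENCY_PATTERNS):
        return {"escalate": True, "reply": EMERGENCY_REPLY}
    return {"escalate": False, "reply": None}
```

When `escalate` is true, the conversation would stop at the redirect message rather than continue on the emergency topic.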
| Layer | GCP Service | Purpose | Details |
|---|---|---|---|
| Frontend | Firebase Hosting | Web portal (React / Flutter Web) | CDN-backed, HTTPS, responsive design |
| Mobile | Flutter (iOS + Android) | Cross-platform native app | Push notifications via Firebase Cloud Messaging |
| API Backend | Cloud Run | Serverless APIs | Auto-scaling, min instances for latency |
| FHIR Data | Cloud Healthcare API | Patient records (US Core FHIR) | FHIR R4, SMART scopes, consent-aware |
| Conversational AI | Dialogflow CX + Vertex AI | NLU + reasoning | Multi-turn, multilingual, context-aware |
| Session State | Firestore | Conversation history & context | Real-time sync, TTL-based expiration |
| Async Tasks | Cloud Tasks + Pub/Sub | Notifications, reminders, background jobs | Scheduled medication reminders, follow-ups |
| Authentication | Google Identity Platform | Patient login (OIDC) | MFA, ID proofing, social login, SMS OTP |
| Translation | Cloud Translation API | Multi-language support | 140+ languages, medical term-aware |
Real-time translation of portal content and agent conversations. Multilingual Gemini handles complex medical term translation.
Medical jargon automatically simplified to patient-friendly language. Reading level targeting (6th-8th grade).
Screen reader compatible, keyboard navigable, high contrast mode, resizable text. Tested with assistive technologies.
SMS and voice channel fallback for patients without smartphones or reliable internet. Caregiver proxy access with verified authorization.
All data encrypted in transit (TLS 1.3) and at rest (CMEK). BAA with Google Cloud. PHI access logged and auditable.
Opt-in required for AI features. Granular preferences: AI assistant on/off, data sharing, research participation.
FHIR $export for patient data download. Machine-readable format (FHIR JSON, C-CDA). Compliant with 21st Century Cures Act.
Age-appropriate access. Guardian proxy with verified authorization. Adolescent confidentiality rules per state law.
Patient controls data sharing scope: within health system only, HIE participation, research opt-in/out. Preferences enforced at API layer.
Automated breach detection (Cloud DLP, Security Command Center). Notification workflows per HIPAA Breach Notification Rule (60-day window).