How to Add an AI Scribe to Your EHR: Build vs. Partner

Key Takeaways

- Adding an AI scribe to an EHR means wiring four pieces: audio capture, a documentation engine, a clinician review-and-sign step, and write-back into the chart.
- The real decision is build vs. partner. Building means owning the note-generation layer (templates, guardrails, evaluation, and clinician UX) as a multi-year investment; partnering makes the scribe a feature you ship in weeks.
- Technically, the cleanest integration is a SMART on FHIR launch in encounter context, writing the signed note and structured data back as FHIR resources. A documentation API or partnership slots into that pattern.
- Keep a clinician in the loop by design: render the draft note in your UI, make the human the one who signs, and only then write back to the chart.
- Put a signed BAA, encryption, least-privilege access, and clear retention/training terms in place for every vendor that touches PHI before you go live.

At Twofold, we work with EHR and platform teams adding documentation, and the conversation almost always lands on the same fork: do we build the AI scribe ourselves, or partner for it? This guide lays out both paths honestly: what each one actually costs, how the integration works, and how to decide.

Transcription is the easy part. The hard, ongoing part is turning a conversation into a defensible clinical note and getting it safely into the chart. That's the work this article is really about.

The two paths: build vs. partner

Building means you own the documentation engine end to end. Partnering means you integrate a finished-note engine and focus your effort on the EHR-side experience. Both are legitimate; they just suit different products.

Build it yourself

You integrate (or train) medical ASR, build the note‑generation layer — summarization, specialty templates, safety guardrails — own the BAA chain, and maintain models and templates as clinical language drifts. You get maximum control at the cost of a multi‑year, multi‑discipline investment.

Partner and embed

You integrate one API or SDK that returns a finished note, white-label the experience, and rely on the vendor's security posture under a BAA you sign with them. Your engineering shrinks to capture, a review UI, and write-back. You trade some control for a feature you can ship in weeks.

The integration architecture

However you source the engine, the EHR‑side architecture is similar. The diagram below shows the shape: your EHR holds patient and encounter context, your platform captures audio and shows the draft note, and a documentation API returns the note plus structured data — which you write back to the chart.

Figure: How an AI scribe connects to an EHR via a SMART on FHIR launch. Three cards run left to right: "EHR: patient and encounter context (FHIR)"; an arrow labeled "context" to "Your platform: captures audio, shows the draft note"; and an arrow labeled "audio" to "Documentation API: returns note plus structured data." A dashed "FHIR write-back" line loops from the API back to the EHR, with the resources DocumentReference, Condition (ICD-10), MedicationStatement, and Encounter listed below.

Launch in context with SMART on FHIR

Launch the scribe as a SMART on FHIR app inside the patient and encounter context. That single step gives the documentation engine the grounding it needs — who the patient is, what the encounter is — and keeps the clinician in one workflow instead of juggling tabs.
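As a rough illustration of the launch step, the sketch below builds the authorization redirect a SMART on FHIR app sends after the EHR opens it with `iss` and `launch` parameters. The endpoint URLs, client ID, and scope string are hypothetical placeholders: a real app would discover the authorize endpoint from the EHR's SMART configuration, and the scopes that are actually granted depend on the EHR and the SMART version it supports.

```python
import base64
import hashlib
import secrets
from urllib.parse import parse_qs, urlencode, urlparse


def build_authorize_url(authorize_endpoint, client_id, redirect_uri, iss, launch):
    """Build the authorization redirect for a SMART on FHIR EHR launch.

    Returns (url, state, code_verifier). Keep state and code_verifier
    server-side to validate the callback and redeem the code.
    """
    state = secrets.token_urlsafe(16)
    code_verifier = secrets.token_urlsafe(48)  # 64 chars, within PKCE's 43-128
    digest = hashlib.sha256(code_verifier.encode("ascii")).digest()
    code_challenge = base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")
    params = {
        "response_type": "code",
        "client_id": client_id,
        "redirect_uri": redirect_uri,
        "launch": launch,  # opaque handle the EHR passed to the app
        # Assumed scope set; adjust to what your EHR supports.
        "scope": "launch openid fhirUser patient/Condition.read "
                 "patient/MedicationStatement.read patient/DocumentReference.write",
        "state": state,
        "aud": iss,  # the FHIR base URL the EHR launched the app against
        "code_challenge": code_challenge,
        "code_challenge_method": "S256",
    }
    return f"{authorize_endpoint}?{urlencode(params)}", state, code_verifier


# Hypothetical values; the EHR supplies iss and launch when it opens the app.
url, state, verifier = build_authorize_url(
    authorize_endpoint="https://ehr.example.org/oauth2/authorize",
    client_id="scribe-app",
    redirect_uri="https://scribe.example.org/callback",
    iss="https://ehr.example.org/fhir",
    launch="abc123",
)
print(parse_qs(urlparse(url).query)["scope"][0])
```

After the clinician's session is authorized, the token response is where the patient and encounter context typically arrives, which is what lets the scribe know who and what it is documenting.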

Read context, then capture audio

Pull relevant context (problems, medications, recent encounters) via FHIR so the note is grounded in the chart, then capture or stream the encounter audio from your client with consent and metadata attached.
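The context reads can be sketched as plain FHIR search URLs. This is a minimal example that assumes a FHIR R4 server at a hypothetical base URL; the function name and the choice of filters (active problems, active medications, the five most recent encounters) are illustrative, and a given EHR may support a different subset of search parameters.

```python
from urllib.parse import urlencode


def context_queries(fhir_base, patient_id, recent_encounters=5):
    """Return FHIR search URLs for the chart context that grounds a note."""
    base = fhir_base.rstrip("/")
    searches = {
        "problems": ("Condition", {"patient": patient_id, "clinical-status": "active"}),
        "medications": ("MedicationStatement", {"patient": patient_id, "status": "active"}),
        "encounters": ("Encounter", {"patient": patient_id, "_sort": "-date",
                                     "_count": recent_encounters}),
    }
    return {
        name: f"{base}/{resource_type}?{urlencode(params)}"
        for name, (resource_type, params) in searches.items()
    }


# Hypothetical FHIR base URL and patient ID.
for name, query_url in context_queries("https://ehr.example.org/fhir/", "123").items():
    print(name, query_url)
```

Each URL would be fetched with the access token from the launch; keeping the queries narrow limits how much PHI leaves the chart for any one encounter.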

Return a note, review, then write back

The engine returns a structured note and data. Render it for clinician review, let the human edit and sign, and only then write back to the chart — typically as FHIR resources like DocumentReference (the note), Condition (problems), MedicationStatement (medications), and Encounter.
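To make the write-back concrete, here is a minimal sketch of packaging a signed note as a FHIR R4 DocumentReference. The function name, the IDs, and the choice of the LOINC "Progress note" code are assumptions for illustration; the right document type, required profile fields, and whether notes go in as plain text or another format depend on the target EHR.

```python
import base64
import json


def build_document_reference(note_text, patient_id, encounter_id, practitioner_id, signed_at):
    """Package a clinician-signed note as a FHIR R4 DocumentReference dict."""
    encoded = base64.b64encode(note_text.encode("utf-8")).decode("ascii")
    return {
        "resourceType": "DocumentReference",
        "status": "current",
        "docStatus": "final",  # only set after the clinician has signed
        "type": {
            "coding": [
                # Assumed note type; pick the code your EHR expects.
                {"system": "http://loinc.org", "code": "11506-3", "display": "Progress note"}
            ]
        },
        "subject": {"reference": f"Patient/{patient_id}"},
        "date": signed_at,
        "author": [{"reference": f"Practitioner/{practitioner_id}"}],
        "content": [{"attachment": {"contentType": "text/plain", "data": encoded}}],
        "context": {"encounter": [{"reference": f"Encounter/{encounter_id}"}]},
    }


# Hypothetical IDs and note text.
doc = build_document_reference(
    note_text="Subjective: cough for 3 days.\nAssessment: viral URI.",
    patient_id="123",
    encounter_id="456",
    practitioner_id="789",
    signed_at="2024-01-15T10:30:00Z",
)
print(json.dumps(doc, indent=2))
```

The resulting JSON would be sent as the body of a create request to the EHR's DocumentReference endpoint; structured data such as Condition or MedicationStatement would follow the same build-then-post shape.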

What you build either way

Even with a partner handling the note engine, the EHR side is yours to build and own. Budget for it:

- Audio capture that's reliable on real networks: buffer locally, normalize sample rates, attach encounter metadata.
- A clinician review UI that shows the draft note (ideally beside its source) and makes signing fast and deliberate.
- FHIR write-back that maps the note and structured data to the right resources and handles failures gracefully.
- Audit trails linking the final note to the audio and transcript it came from.
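One way to approach the audit-trail item is to store a content hash of each artifact alongside the signed note, so the note can later be tied back to the exact audio and transcript it came from. The sketch below is an assumption about how such a record might look, not a prescribed schema; the field names and the `build_audit_record` helper are hypothetical.

```python
import hashlib
import json
from datetime import datetime, timezone


def sha256_hex(data: bytes) -> str:
    """Hex SHA-256 digest used as a tamper-evident fingerprint."""
    return hashlib.sha256(data).hexdigest()


def build_audit_record(encounter_id, audio, transcript, draft_note, signed_note, signer_id):
    """Link a signed note to its source audio, transcript, and draft by hash."""
    return {
        "encounter_id": encounter_id,
        "signer_id": signer_id,
        "recorded_at": datetime.now(timezone.utc).isoformat(),
        "audio_sha256": sha256_hex(audio),
        "transcript_sha256": sha256_hex(transcript.encode("utf-8")),
        "draft_note_sha256": sha256_hex(draft_note.encode("utf-8")),
        "signed_note_sha256": sha256_hex(signed_note.encode("utf-8")),
        "edited_before_signing": draft_note != signed_note,
    }


# Hypothetical inputs; real audio would be the captured recording bytes.
record = build_audit_record(
    encounter_id="456",
    audio=b"\x00\x01fake-audio-bytes",
    transcript="Patient reports cough for three days.",
    draft_note="Subjective: cough for 3 days.",
    signed_note="Subjective: dry cough for 3 days.",
    signer_id="practitioner-789",
)
print(json.dumps(record, indent=2))
```

Storing hashes rather than copies keeps the audit record itself free of note content, though the identifiers in it should still be handled as PHI-adjacent data.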

Keeping a clinician in the loop

The most important design decision is that a human signs the note. A generated note is a draft until a licensed clinician takes accountability for it. Render the draft clearly, make low‑confidence sections obvious, never auto‑finalize, and write back to the chart only after the clinician signs. Capture edits — the diff between draft and signed note is the most honest signal of where the engine falls short.
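A small sketch of that rule in code: write-back refuses to run until a clinician has signed, and the draft-versus-signed diff is captured with the standard library's `difflib`. The `NoteDraft` class, `write_back` function, and `send` callback are hypothetical names for illustration; `send` stands in for whatever actually posts to the chart.

```python
import difflib
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class NoteDraft:
    draft_text: str
    signed_text: Optional[str] = None
    signed_by: Optional[str] = None

    def sign(self, clinician_id: str, final_text: str) -> None:
        """Record the clinician's final text and identity."""
        self.signed_text = final_text
        self.signed_by = clinician_id

    @property
    def is_signed(self) -> bool:
        return self.signed_by is not None and self.signed_text is not None


def edit_diff(note: NoteDraft) -> List[str]:
    """Unified diff between the generated draft and the signed note."""
    if not note.is_signed:
        return []
    return list(difflib.unified_diff(
        note.draft_text.splitlines(),
        note.signed_text.splitlines(),
        fromfile="draft",
        tofile="signed",
        lineterm="",
    ))


def write_back(note: NoteDraft, send: Callable[[str], object]) -> object:
    """Send the note to the chart only if a clinician has signed it."""
    if not note.is_signed:
        raise PermissionError("note must be signed by a clinician before write-back")
    return send(note.signed_text)


note = NoteDraft("Subjective: cough for 3 days.\nBP 120/80.")
note.sign("practitioner-42", "Subjective: cough for 3 days.\nBP 130/85.")
print("\n".join(edit_diff(note)))
```

Putting the check inside `write_back` rather than in the UI means no code path can finalize a note by accident, and the stored diffs give the evaluation team a running record of what clinicians had to correct.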

HIPAA, PHI, and your BAA

Every component that touches audio, a transcript, or a note is in scope for PHI: the documentation engine, your hosting, your logging, even analytics. Sign a BAA with every vendor behind those components, encrypt in transit and at rest, enforce least-privilege access, and log access end to end. Pin down retention and training terms in writing: how long audio and notes are kept, and whether your data trains the vendor's models. Our take on the security posture this requires is on our security page.

A pragmatic recommendation

If a clinical scribe is your core differentiator and you'll fund it for years, building gives you maximum control. For most EHRs and platforms, though, the scribe is a feature users expect — and the note‑generation layer is a large, ongoing investment that isn't your core product. In that case, partnering ships the feature in weeks and keeps the clinical layer maintained for you.

That's the gap our partner program is built to fill — a white‑label scribe you embed in your EHR — and the same engine is available as a medical speech-to-text and documentation API if you'd rather build the surface yourself.

Sources & further reading

Twofold — Partner & white-label program

Twofold — Medical speech-to-text & documentation API

HL7 — FHIR

SMART on FHIR — App launch

HHS — HIPAA for Professionals

Frequently asked questions

What does it take to add an AI scribe to an EHR?

At a minimum you're wiring together four things, whether you build the engine or partner for it:
- Audio capture in your client: recording or streaming the encounter with consent and metadata.
- A documentation engine that turns that audio into a structured clinical note.
- A clinician review-and-sign step inside your UI before anything is final.
- Write-back into the chart: saving the note and structured data to the patient record, often via FHIR.

Should we build our own AI scribe or partner with one?

Decide by whether the scribe is your core product or a feature your users expect:
- Build if documentation is your differentiator and you'll invest in templates, guardrails, and clinician UX for years.
- Partner if you want to ship the feature in weeks and keep the note layer maintained for you.

Most EHRs treat the scribe as a feature, not a product, which is why a white-label partnership is usually the faster, cheaper path.

How does an AI scribe integrate with an EHR technically?

The cleanest pattern is a SMART on FHIR app launched in context, with FHIR for read and write:
- Launch the scribe in the patient/encounter context so it knows who and what it's documenting.
- Read context (patient, problems, medications) via FHIR to ground the note.
- Write the signed note and structured data back as FHIR resources: DocumentReference, Condition, MedicationStatement, Encounter.

What about HIPAA, PHI, and a BAA when adding an AI scribe?

Every component touching audio or notes is in scope for PHI. Before real data flows, confirm:
- A signed BAA with the scribe vendor and any sub-processors in the path.
- Encryption in transit and at rest, least-privilege access, and end-to-end audit logging.
- Retention and training terms in writing: how long audio and notes are kept, and whether your data trains the vendor's models.

How long does it take to add an AI scribe to an EHR?

It depends entirely on the path you choose:
- Partner / white-label: typically weeks. You integrate capture, a review UI, and FHIR write-back around a finished-note API.
- Build it yourself: typically quarters. You also build and maintain the note-generation layer, templates, and guardrails.

If speed to a shippable feature matters, partnering is usually the shorter road; see our partner program.