```mermaid
flowchart LR
A([Patient Arrives]) --> B[Verbal Consent\nExplain · Opt-out offered]
B --> C[Session Start\nAI Recording Active]
C --> D[AI Processes\nAudio in Transit]
D --> E[Audio Purged\nZero-Retention Policy]
E --> F[Draft Note\nGenerated]
F --> G[Clinician Review\nVerify · Edit]
G --> H[Attestation\nAI Disclosure Statement]
H --> I([Note to EHR\nPHI protections apply])
```

Figure 13.1: The ambient documentation consent and attestation workflow.
13 Professional Wellness and Reducing Cognitive Burden
The case for AI in healthcare governance documents is typically framed around what AI can do for patients — more accurate diagnoses, more consistent treatment recommendations, earlier identification of deterioration. That framing is appropriate. But there is a second case for clinical AI that rarely receives the same sustained attention, and that is the case for what it can do for the people providing care.
Clinician burnout in the United States has reached a level that the U.S. Department of Health and Human Services has characterized as a public health threat (U.S. Department of Health and Human Services, Office of the Surgeon General 2022). More than half of physicians in some specialties report symptoms of burnout, and the rates among nurses and advanced practice providers are comparable. This is not a resilience deficit, and it is not a recent development. It is the predictable consequence of a documentation and administrative burden that has grown continuously for two decades while clinical staffing has not kept pace. The introduction of the electronic health record improved data availability and reduced certain categories of error; it also created what physicians call “pajama time” — the hours spent completing documentation at home, after clinic hours, in the margins of what used to be personal time.
The empirical evidence that AI tools — specifically ambient documentation systems and AI-assisted inbox management — can meaningfully reduce this burden has now reached the level required to act on. This chapter makes that case with the data and tells AMC clinical and operational leaders what they need to know to deploy these tools responsibly and at scale. The argument is not that AI will solve burnout; burnout is a structural problem that requires structural solutions, including staffing ratios, workload limits, and EHR simplification. The argument is that ambient documentation and inbox AI produce measurable, clinically meaningful time savings that represent one of the most concrete near-term levers available to AMC leaders who are watching experienced clinicians exit the profession.
13.1 The Structural Crisis of the Administrative Burden
The EHR documentation burden did not originate with the EHR. It grew with it. As electronic records made clinical data more accessible for billing, quality measurement, and regulatory reporting, the number of required data fields, structured documentation elements, and attestation requirements expanded to fill the new capacity. What began as a data capture tool became a documentation obligation that, by some estimates, requires two hours of EHR work for every hour of direct patient care.
The AMA’s annual Prior Authorization Physician Survey documents one slice of this burden in stark terms. Physicians and their staff spend an average of thirteen hours per week on prior authorization activities alone — not including other administrative tasks, inbox management, documentation, and billing requirements. Ninety-five percent of physicians in the survey report that prior authorization contributes to burnout (American Medical Association 2023). The combined administrative load means that physicians in high-volume specialties may spend more of their working hours on tasks that are not direct patient care than on tasks that are.
The burnout crisis has a financial dimension that is often underappreciated by AMC leaders who focus on the subscription cost of AI tools. The AMA estimates that replacing a single physician — recruiting, credentialing, onboarding, and covering lost productivity during the transition — costs between five hundred thousand and one million dollars, depending on the specialty (American Medical Association 2023). Nurse turnover costs are lower per position but occur at higher frequency. For an institution with a hundred-physician practice that experiences a five percent annual attrition rate, the turnover cost in a single year is between two and a half and five million dollars. An ambient documentation subscription that prevents even a fraction of that attrition generates a return on investment that most technology investments cannot approach.
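The turnover arithmetic can be made concrete with a back-of-envelope model. This is a minimal sketch using the AMA replacement-cost figures quoted in the text; the subscription cost and break-even calculation are illustrative assumptions, not vendor data:

```python
# Back-of-envelope turnover-cost model using the figures cited in the text.
# The subscription cost below is a hypothetical assumption for illustration.

def annual_turnover_cost(n_physicians, attrition_rate, cost_per_exit):
    """Expected annual cost of physician turnover for a practice."""
    return n_physicians * attrition_rate * cost_per_exit

# AMA estimate: $500k to $1M per physician exit, depending on specialty.
low_estimate  = annual_turnover_cost(100, 0.05, 500_000)    # 2.5M/year
high_estimate = annual_turnover_cost(100, 0.05, 1_000_000)  # 5M/year

# Hypothetical: if an ambient-scribe subscription costs $100k/year, it
# breaks even by preventing a fifth of one exit at the low-end cost.
subscription_cost = 100_000
breakeven_exits_prevented = subscription_cost / 500_000  # 0.2 exits/year

print(low_estimate, high_estimate, breakeven_exits_prevented)
```

The point of the sketch is not the specific numbers but the asymmetry: the cost side of the ledger is a subscription, while the benefit side is priced in physician replacements.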
13.2 Pajama Time: The Mechanism of Burnout
The concept of “pajama time” — named for the clothes physicians are wearing when they complete documentation after clinic hours — describes a specific phenomenon that the burnout literature has linked particularly strongly to exit intent. It is not simply that physicians work long hours; many physicians accept long hours as inherent to the profession. It is that documentation work bleeds into non-work time in a way that eliminates the psychological recovery that even modest off-hours downtime provides.
Tait Shanafelt’s longitudinal studies of physician well-being, conducted through the Mayo Clinic program on physician well-being, have consistently found that EHR-related frustration is among the strongest independent predictors of burnout and exit intent — stronger than compensation, call schedule, and practice setting in some analyses. The specific mechanism is not EHR use during clinic hours; it is EHR use during hours that clinicians experience as personal time. The documentation does not get shorter when it moves to after hours; the psychological cost is higher because the context shift makes the burden more salient.
Specialty variation is substantial. Emergency medicine physicians report some of the highest burnout rates, with documentation demands that follow an acute encounter model in which notes cannot be completed until after the patient has left and the next patient has arrived. Primary care physicians face note volume that can exceed twenty encounters per session; even modest per-note time savings aggregate to meaningful weekly hour reductions. Mental health providers have among the most complex documentation requirements given regulatory and billing requirements specific to behavioral health. The uniform finding across specialties is that documentation burden correlates with burnout more consistently than almost any other measured variable.
13.3 Ambient AI Documentation: The Evidence Base
Ambient AI documentation systems — which listen to the clinical encounter, transcribe and synthesize the conversation, and generate a structured draft clinical note — entered serious clinical deployment in 2023 and have accumulated a meaningful evidence base in the following two years. The headline findings are consistent enough across sites and vendors to warrant treatment as established rather than emerging.
Tierney and colleagues conducted one of the most rigorous early evaluations of an ambient scribe deployment across a large multispecialty practice. They found that physicians reported meaningful reductions in documentation time, with the largest effects in primary care and outpatient mental health — the two specialties with the highest note volume and the highest burnout rates (Tierney et al. 2024). Notes generated with AI assistance were rated comparable in quality to physician-authored notes on blinded review. Clinician well-being scores improved, and — notably — patients reported that their physicians seemed more attentive during encounters, because the physician was no longer managing a keyboard.
The patient attention finding deserves emphasis because it connects the wellness value proposition to the care quality value proposition. An encounter in which the clinician maintains eye contact, responds to nonverbal communication, and does not divide attention between the patient and a screen is a different clinical interaction from one in which documentation competes with the patient for the physician’s attention. A 2025 study found that patients who received care from physicians using an ambient scribe system rated their clinician’s attentiveness significantly higher than patients in a control group — not because the physician was doing anything differently in terms of clinical care, but because removing the documentation burden from the room changed the quality of the relationship.
Table 13.1 summarizes the time-savings findings from major evaluations of the leading commercial ambient AI systems. Figures should be treated as directional rather than precise: study designs vary, comparison conditions vary, and the systems themselves have continued to evolve.
| System | Study / Source | Specialty | After-Hours EHR Reduction | Time Saved per Encounter | Clinician Satisfaction |
|---|---|---|---|---|---|
| Abridge | Tierney et al. 2024 | Multispecialty | Substantial reduction reported | ~1 hour/day | Improved (quality score 48/50) |
| DAX (Nuance/Microsoft) | Atrium Health deployment | Multispecialty | 47% of users saw decreased home EHR time | Significant | High adoption rates |
| Nabla | Primary care RCT, 2025 | Primary care | Not separately reported | 9.5% reduction in time-in-note | Positive |
| Suki | Blinded comparison, 2025 | Multispecialty | Not separately reported | Notes rated more thorough | Mixed (complex cases: human slightly better) |
The honest accounting of ambient AI limitations is as important as the positive findings. Ambient systems produce draft notes, not finished notes. The clinical content requires review before attestation, and the review is not costless: detecting a subtle omission — a negative finding the system did not capture, a medication allergy mentioned in passing — requires active clinical attention to a document that is formatted to look complete. The omission risk was discussed in Chapter 6; the wellness implication is that the “time savings” claimed by vendors may be partially offset by the cognitive effort of editing rather than writing, particularly for clinicians who are careful reviewers.
The appropriate institutional posture is to measure this tradeoff directly. A pre/post measurement design that captures not just total EHR time but after-hours EHR time, note completion latency, and clinician-reported task load provides the data needed to assess whether the specific deployment at the specific institution is producing the wellness benefit the vendor evidence supports.
13.4 The Ambient Consent Architecture
The fact that ambient systems listen to clinical encounters — by definition, in the same room as the patient — creates a consent and privacy obligation that institutions deploying these tools must address explicitly. The patient is being recorded; the patient’s words are being processed by an AI system operated by a third party; and in many states, the patient has a legal right to know this before it occurs.
The consent architecture should have four components. First, the patient should be told, at the start of the encounter, that an ambient AI system is active, what it does, and that they can ask for it to be turned off. This does not require a lengthy explanation; a brief verbal explanation plus an opportunity to decline is sufficient for most clinical contexts. Second, the vendor contract should specify that audio data is processed in transit and is not retained after note generation — a “zero-retention” policy that reduces the privacy risk surface. Third, the generated transcript and the note should be treated as protected health information under HIPAA from the moment of creation. Fourth, the attestation statement that the clinician signs should acknowledge AI involvement in the note’s preparation, consistent with California AB 3030 (AB 3030 2024) and the general transparency standards described in Chapter 12.
13.5 AI-Assisted Inbox Management
The clinical inbox — the stream of patient messages, abnormal results, prescription refills, referral requests, and administrative tasks that arrives continuously in the EHR — is a second major source of after-hours burden that ambient documentation does not address. Studies of EHR inbox volume suggest that physicians in primary care and internal medicine spend two to three hours per day on inbox-related tasks, with a significant fraction of that time occurring outside scheduled clinic hours.
AI-assisted inbox management addresses this burden in two ways. Message triage tools classify incoming messages by urgency and type, routing refill requests, administrative queries, and low-urgency informational messages to appropriate staff or queues rather than to the attending physician’s personal inbox. Message drafting tools generate suggested responses to patient messages — responses that a clinician reviews and sends, or modifies and sends, rather than composing from scratch.
The Mayo Clinic’s deployment of AI-drafted patient message responses across nursing staff reported savings of approximately thirty seconds per message; across the organization’s message volume, this aggregated to roughly fifteen hundred hours of nursing time per month. At an institution where nursing turnover is expensive and inbox-related burnout is a retention risk, that represents meaningful institutional value. A pilot at NYU Langone found that AI-drafted patient messages scored higher on patient-rated empathy than clinician-authored responses — a counterintuitive finding that the researchers attributed to the AI drafts being more consistently warm in tone, without the variation that comes from a clinician responding to their twenty-fifth message of the day at eleven at night.
The safety question for inbox AI is parallel to the safety question for ambient documentation: the AI draft is a starting point, not a final answer, and the clinician review is not optional. A patient message about a new chest pain symptom requires a different response than a message about a refill request, and the AI may not reliably distinguish between them in every case. The appropriate governance design is to treat AI inbox drafts as productivity tools that reduce the blank-page problem, not as autonomous response generators.
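The triage-and-escalate pattern described above can be sketched in a few lines. Everything here is a hypothetical illustration: the queue names, message types, and keyword list are assumptions, and a production system would use a validated clinical classifier rather than keyword matching. The design point it demonstrates is the one the text makes: possible urgent symptoms must escalate to a clinician regardless of how the classifier labeled the message, and AI drafts remain drafts pending review.

```python
# Illustrative sketch of AI-assisted inbox triage. Queue names, message
# types, and the keyword list are hypothetical assumptions; a real system
# would use a validated classifier, and every AI draft still requires
# clinician review before sending.

URGENT_TERMS = {"chest pain", "shortness of breath", "suicidal"}

def route_message(text: str, predicted_type: str) -> str:
    """Route a patient message to a queue, escalating any message that
    mentions a potentially urgent symptom, even if the classifier
    labeled it as routine."""
    lowered = text.lower()
    if any(term in lowered for term in URGENT_TERMS):
        return "clinician-urgent"        # never auto-draft; flag for the clinician
    if predicted_type == "refill":
        return "pharmacy-queue"          # routed away from the physician inbox
    if predicted_type == "administrative":
        return "front-desk-queue"
    return "clinician-inbox-with-draft"  # AI drafts a reply for clinician review

# A refill request that also mentions new chest pain must escalate,
# even though the classifier predicted "refill".
print(route_message("Refill please, also new chest pain", "refill"))
```

The escalation check running before the type-based routing is the safety property; the productivity features sit behind it, not in front of it.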
13.6 Automation Complacency and the Vigilance Gap
The single most important safety warning for ambient documentation and inbox AI is also the most predictable: the tools will work well enough, often enough, that clinicians will stop reviewing them carefully. This is not a hypothetical; it is the documented behavior of humans using any consistently accurate automated system (Parasuraman and Manzey 2010).
The “automation complacency trap” in clinical AI takes a specific form. A clinician who has reviewed two hundred AI-generated notes without finding a significant error will review the two hundred and first note less carefully than the first. Their eyes will pass over the text; their cognitive attention will be elsewhere. The error that slips through is not a random error — it is the subtle, plausible-sounding error that requires active clinical attention to catch. The system that makes large, obvious errors is less dangerous than the system that makes small, fluent-sounding errors, because the former will be caught and the latter will be signed.
The governance response is to build in mechanisms that maintain the clinician’s engagement with the review task. Some vendors have experimented with displaying “confidence scores” or highlighting sections of the note that were reconstructed from partial information — visual signals that certain passages require more careful scrutiny. Some institutions have implemented periodic “vigilance audits” in which a clinical informaticist reviews a random sample of signed AI-generated notes against the original encounter recording to assess whether clinicians are catching the errors the system makes. The goal is not to create an adversarial relationship between the clinician and the tool; it is to preserve the review function that makes the system safe.
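The vigilance audit described above reduces, mechanically, to drawing a reproducible random sample of signed notes for blinded review. A minimal sketch, assuming note identifiers and an audit rate that are purely illustrative:

```python
# Minimal sketch of drawing a random audit sample of signed AI-generated
# notes for blinded informaticist review. Note IDs and the audit rate
# are illustrative assumptions.
import random

def draw_audit_sample(signed_note_ids, rate=0.05, seed=None):
    """Return a random sample of note IDs for review against the
    original encounter recordings. A fixed seed makes the draw
    reproducible for the audit record."""
    rng = random.Random(seed)
    k = max(1, round(len(signed_note_ids) * rate))
    return rng.sample(list(signed_note_ids), k)

notes = [f"note-{i:04d}" for i in range(400)]
sample = draw_audit_sample(notes, rate=0.05, seed=42)
print(len(sample))  # 20 notes out of 400 at a 5% audit rate
```

Recording the seed alongside the audit results keeps the sampling defensible: anyone reviewing the audit can regenerate the same sample.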
13.7 Nursing and Advanced Practice Burden
The burnout and documentation burden literature is weighted heavily toward physicians, partly because physicians have been more systematically surveyed and partly because physician attrition is more economically visible. The burden on nursing staff and advanced practice providers is equally real and somewhat distinct in character.
Nurses in inpatient settings face documentation requirements that are simultaneous with direct patient care responsibilities — they are documenting vital signs, medication administration, and patient assessments while also providing care, often in conditions where the documentation system and the care environment are in physical conflict. Ambient documentation systems were designed primarily for the physician encounter model — a structured conversation between a physician and a patient — and do not yet map well to the parallel, distributed, and physically active documentation patterns of bedside nursing.
One nursing-focused deployment study found that ambient AI could reduce nursing documentation time by a substantial margin in simulated conditions, and by a smaller but still meaningful margin in actual clinical practice. The gap between simulated and actual savings is instructive: ambient systems perform better in conditions where the conversational signal is clean and the documentation task is structured, and nursing documentation in many inpatient settings is neither. The honest characterization of ambient AI for nursing is that the potential benefit is real but the peer-reviewed evidence is thinner than for physicians, and the implementation design requires more adaptation to nursing workflows than most vendors have undertaken.
Advanced practice providers — nurse practitioners, physician assistants, certified nurse midwives, and clinical pharmacists — share both the physician’s note-volume challenge and the nursing context of frequent task-switching. Burnout surveys of APPs have documented rates comparable to physicians in similar specialties, and the documentation burden is a consistent contributor. The current evidence base for ambient AI among APPs is largely derived from studies that enrolled APPs as part of multispecialty cohorts; dedicated APP studies are limited.
13.8 Where to Start
The wellness ROI case for ambient documentation is strong enough that most AMCs should be piloting at least one ambient system. The two projects below are designed to maximize the probability that the pilot produces usable evidence and, if successful, a foundation for responsible scale.
13.8.1 Starter Project 1: Ambient Documentation Pilot with Pre/Post Wellness Measurement
What it is: A structured twelve-week pilot of a single ambient AI scribe in a high-burnout outpatient specialty — primary care, mental health, or internal medicine are the highest-yield targets — with validated pre/post measurement of after-hours EHR time, physician well-being, and note quality.
Why now: The commercial systems (Abridge, DAX, Nabla) have BAA-ready EHR integrations and vendor-side regulatory responsibility for the AI component. The institutional work is structuring the measurement framework, not building the technology. An institution that waits for more evidence before piloting is deferring a decision that the evidence already supports.
How to execute: Identify a willing specialty lead and a clinical champion. Pull baseline after-hours EHR time from EHR audit logs for two months before deployment. Administer a validated burnout survey (the two-item Maslach Emotional Exhaustion subscale or the Stanford Professional Fulfillment Index) to participating clinicians before and after. Establish note quality review with a blinded clinician reviewer on a random ten-percent sample throughout the pilot. Deploy with the consent and attestation design described in Figure 13.1. Measure at 60 and 90 days. Report results to clinical leadership and the AI Steering Committee whether the outcome is positive, null, or negative.
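The core pre/post comparison in this measurement design is simple enough to sketch directly: after-hours EHR minutes per clinician-week at baseline versus during the pilot. The field names and weekly figures below are hypothetical assumptions standing in for real EHR audit-log extracts:

```python
# Sketch of the pre/post comparison in Starter Project 1: after-hours
# EHR minutes per clinician-week, baseline vs. pilot. The weekly figures
# are hypothetical; real values would come from EHR audit logs.
from statistics import mean

def mean_change(baseline, pilot):
    """Mean after-hours minutes per week before vs. during the pilot,
    with the absolute and percent change."""
    pre, post = mean(baseline), mean(pilot)
    delta = post - pre
    return pre, post, delta, 100 * delta / pre

# Hypothetical weekly after-hours EHR minutes for one clinician.
baseline_weeks = [310, 295, 330, 305]   # two months before deployment
pilot_weeks    = [240, 225, 250, 235]   # weeks 9-12 of the pilot

pre, post, delta, pct = mean_change(baseline_weeks, pilot_weeks)
print(round(pre), round(post), round(delta), round(pct, 1))
```

The same function applied to the 60-day and 90-day windows gives the repeated measurement the project calls for; the burnout survey scores are analyzed separately with the validated instrument's own scoring rules.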
Buy vs. build: Buy. Building an ambient documentation system from scratch is not a tractable project for clinical informatics teams, and the commercial market has matured. The institutional investment is in measurement design and change management.