9 Business Operations
At most academic medical centers, the first place generative AI takes hold is not the clinic or the research lab. It is the finance department’s revenue cycle team, the HR business partner drafting job descriptions after hours, the communications writer who discovered that a rough prompt produces a passable first draft in thirty seconds. Business operations — the finance, HR, supply chain, IT, marketing, philanthropy, facilities, and legal functions that keep an AMC running — are where AI adoption tends to move fastest and where institutional governance tends to lag furthest behind.
This gap is partly structural. Clinical AI requires FDA clearance or institutional validation before it can touch a patient. Research AI is constrained by IRB requirements and publication ethics. Business-operations AI faces none of these formal gatekeepers, which makes it both the easiest domain to start in and the one most likely to produce a compliance incident before the institution has thought through its policies.
The argument of this chapter is simple: business operations is the right place to build early institutional capability with generative AI, and the regulatory exposure is higher than most operations leaders assume. Getting both of those things right at the same time is possible — it requires a governance-first procurement approach, a clear data classification framework, and the discipline to deploy an enterprise-grade platform before employees have reason to route around it.
9.1 A Domain Unlike the Others
The four domains of this framework — education, research, clinical, and business operations — differ not just in what they do but in the regulatory regimes that govern what AI can do inside them. Clinical AI operates under the most demanding oversight: every tool that influences a clinical decision is potentially a medical device subject to FDA regulation, and deployment without local validation is a patient-safety risk. Research AI is constrained by the norms of research integrity, the requirements of IRBs for human-subjects data, and the intellectual-property rules of journals and funders. Educational AI sits inside a web of academic integrity policies and accreditation requirements.
Business operations has fewer of these formal guardrails, which creates a false sense of safety.
Median U.S. hospital operating margins reached 1.3 percent in 2025 and turned negative at the start of 2026, with labor accounting for 84 percent of total hospital expenses (Kaufman Hall 2026). The nursing workforce gap approaches 500,000 vacancies nationally, with 40 percent of currently employed nurses reporting plans to leave the profession by 2029 (American Hospital Association 2025). Operations leaders reaching for AI-assisted administrative tools against that backdrop are making a reasonable call. The compliance obligations attached to those tools do not diminish because no clinical oversight body is in the room.
The absence of FDA clearance requirements does not mean an absence of legal risk. Three areas carry particular exposure.
The first is employment. Automated systems used to screen, rank, or evaluate employees and candidates are subject to anti-discrimination law regardless of whether they involve AI. The EEOC’s Strategic Enforcement Plan for fiscal years 2024–2028 explicitly names algorithmic discrimination in employment as an enforcement priority (U.S. Equal Employment Opportunity Commission 2024). New York City’s Local Law 144, which took effect in July 2023, mandates annual independent bias audits for any automated employment decision tool used in hiring or promotion for New York City-based roles (Local Law 144 2021). Colorado’s SB 24-205, signed in 2024, imposes “reasonable care” requirements on developers and deployers of high-risk AI systems, including those used in employment contexts (SB 24-205 2024). An AMC with employees or applicants in these jurisdictions — which includes most major academic medical centers — needs to account for these requirements before deploying any HR automation tool.
The second is consumer protection and marketing. The Federal Trade Commission’s Operation AI Comply, announced in September 2024, documented enforcement actions against companies making deceptive claims about AI capabilities in consumer-facing contexts (Federal Trade Commission 2024). Patient-facing communications that overstate AI accuracy or imply diagnostic capability without adequate disclosure carry both FTC and state consumer protection risk.
The third is HIPAA — specifically, the risk that an employee uses a business-operations AI tool in a way that incidentally exposes protected health information. A revenue cycle coordinator who pastes a patient name and insurance ID into a public AI chatbot to draft a denial appeal letter has created a potential HIPAA breach, regardless of how routine the business context feels. The HHS Office for Civil Rights has stated that nondiscrimination requirements extend to algorithmic decision support tools in administrative contexts under the Section 1557 final rule (U.S. Department of Health and Human Services, Office for Civil Rights 2024). The business case for an enterprise-grade AI platform is partly a compliance case.
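A concrete version of that compliance case is a gateway-side screen that blocks prompts containing PHI-shaped strings before they reach any model. The sketch below is purely illustrative: the MRN and member-ID patterns are hypothetical placeholders (real formats vary by institution and payer), and pattern matching of this kind supplements, rather than replaces, an enterprise DLP service.

```python
import re

# Illustrative patterns only. Real PHI detection belongs in an enterprise
# DLP service; these MRN and member-ID formats are hypothetical.
PHI_PATTERNS = {
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "mrn": re.compile(r"\bMRN[:\s#]*\d{6,10}\b", re.IGNORECASE),  # hypothetical MRN format
    "member_id": re.compile(r"\b[A-Z]{3}\d{9}\b"),                # hypothetical insurance ID format
    "dob": re.compile(r"\b\d{1,2}/\d{1,2}/\d{4}\b"),
}

def screen_prompt(prompt: str) -> list[str]:
    """Return the names of PHI-like patterns found in a prompt, empty if none."""
    return [name for name, pattern in PHI_PATTERNS.items() if pattern.search(prompt)]

hits = screen_prompt("Appeal denial for John Doe, MRN 00482913, DOB 04/12/1961")
if hits:
    raise ValueError(f"Prompt blocked pending review; possible PHI: {hits}")
```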
9.2 What Business Operations AI Actually Does
Generative AI’s footprint in AMC business operations is already broad. The use cases vary considerably in maturity and risk, and a realistic function-by-function account is more useful than a general endorsement.
Finance and revenue cycle. The revenue cycle is among the most labor-intensive administrative functions in any hospital, and also among the most data-structured — claims have standard formats, denial reasons follow defined codes, and appeal letters follow recognizable templates. AI tools are most mature here. Denial management, prior authorization drafting, and coding review are all areas where commercial AI-enabled platforms have demonstrated measurable improvement in denial overturn rates and coder productivity. The risk is vendor lock-in and the assumption that an AI platform’s performance in another institution’s EHR will replicate in yours. Local validation against your payer mix and documentation patterns is necessary before attributing cost savings to the tool.
Financial planning and analysis benefits more from AI-assisted modeling and scenario generation than from automation. An FP&A analyst who can ask a language model to explain the variance in a budget line, generate competing budget scenarios from a shared set of assumptions, or draft the narrative for a board presentation will work faster and produce better prose — but the model will not catch a misclassified expense or a projection built on an invalid assumption. AI augments the analyst; it does not replace the financial judgment.
Human resources. The clearest near-term opportunity is in drafting: job descriptions, offer letters, standard policies, and employee Q&A responses. These are tasks that consume significant HR staff time and produce documents that are largely similar from one iteration to the next. An enterprise AI tool configured to follow the institution’s style, terminology, and legal requirements (pay equity language, for example) can reduce the time from request to first draft dramatically.
The clearest risk is in screening. Any tool that ranks candidates, filters applications, or produces scores used in hiring decisions is potentially an automated employment decision tool under the legal definitions of NYC Local Law 144 and related state laws. The bias-audit requirement applies. An institution that deploys an AI-based screening tool without an independent audit and a documented remediation process is exposed. The safer path in 2025–2026 is to use AI for drafting and research tasks — generating interview question banks, summarizing employment law updates, drafting performance review templates — while keeping screening and ranking decisions human-driven.
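For orientation, the core statistic an LL144-style bias audit reports is the impact ratio: each category's selection rate divided by the selection rate of the most-selected category. Here is a minimal sketch, assuming a simple pass/fail screen; a real audit is conducted by an independent auditor and also covers intersectional categories and scored (non-binary) tools.

```python
from collections import Counter

def impact_ratios(outcomes: list[tuple[str, bool]]) -> dict[str, float]:
    """Selection-rate impact ratios per demographic category.

    outcomes: (category, selected) pairs from the screening tool's decisions.
    Impact ratio = category selection rate / highest category selection rate.
    """
    totals, selected = Counter(), Counter()
    for category, was_selected in outcomes:
        totals[category] += 1
        selected[category] += was_selected
    rates = {c: selected[c] / totals[c] for c in totals}
    best = max(rates.values())
    return {c: rate / best for c, rate in rates.items()}

ratios = impact_ratios([("A", True), ("A", True), ("A", False),
                        ("B", True), ("B", False), ("B", False)])
# A ratio well below 1.0 for any category (0.8 under the classic
# four-fifths rule of thumb) flags the tool for closer review.
print(ratios)  # {'A': 1.0, 'B': 0.5}
```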
Supply chain and contracting. Contract review and vendor risk assessment are natural fits for generative AI: the task requires reading large volumes of structured text, extracting specific terms, and comparing them against standards. AI tools that summarize contract terms, flag non-standard clauses, and generate redlines against a baseline template are available from multiple vendors and are being used in healthcare procurement. The risk here is primarily one of accuracy and audit trail. A contract reviewed by AI and approved by a human who did not re-read the AI’s summary is a contract reviewed by no one in any legally meaningful sense. The tool produces a starting point; a qualified human closes it.
IT operations. Ticket triage, incident summarization, code review, and internal documentation generation are all areas where IT teams at major AMCs are already using AI, often through GitHub Copilot, Microsoft 365 Copilot, or custom deployments on internal codebases. The productivity gains for experienced developers and system administrators are real and well-documented. The risk for IT operations specifically is prompt injection — the vulnerability by which malicious input to an AI system can cause it to execute unintended actions. Any AI tool that can invoke shell commands, API calls, or database queries based on natural-language input must be treated as a security surface, not just a productivity tool.
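One concrete mitigation is to treat every model-proposed action as untrusted input and validate it against a deliberately narrow allowlist before anything executes. A minimal sketch follows; the command and argument sets are placeholders, and a production version would also log every refusal.

```python
import shlex
import subprocess

# Commands the AI assistant may trigger, with the arguments permitted for
# each. Everything else is refused, no matter how the model (or text
# injected into its context) phrases the request.
ALLOWED_COMMANDS = {"systemctl": {"status"}, "df": {"-h"}, "uptime": set()}

def run_model_proposed_command(proposal: str) -> str:
    """Execute a model-proposed shell command only if it matches the allowlist."""
    parts = shlex.split(proposal)
    if not parts:
        raise PermissionError("empty command")
    cmd, args = parts[0], set(parts[1:])
    if cmd not in ALLOWED_COMMANDS or not args <= ALLOWED_COMMANDS[cmd]:
        raise PermissionError(f"command not in allowlist: {proposal!r}")
    return subprocess.run(parts, capture_output=True, text=True, timeout=30).stdout
```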
Marketing and communications. Content drafting, translation, accessibility reformatting, and social media response generation are areas where the productivity case is straightforward and the risk is manageable. The FTC’s enforcement posture means that AI-generated content making clinical or research claims needs a human review step before publication. The deeper issue is brand and voice consistency: a communications team that uses AI to draft a hundred pieces of content without a style framework will produce a hundred pieces that sound subtly generic. A deliberate style guide and a review process that treats AI output as a first draft, not a final product, addresses most of this.
Philanthropy and development. Donor research summarization, draft cultivation letters, and grant prospect identification are all tasks that consume significant staff time at AMC development offices and are tractable for AI. The risk is relationship sensitivity: a donor who receives a letter that was obviously AI-generated from a public database — especially one that contains a factual error about their history with the institution — is a donor at risk of disengagement. AI-generated development communications need a personal review step before they leave the institution.
Facilities and legal. Work-order routing, maintenance request summarization, and facilities policy Q&A are low-complexity AI use cases with low risk. Legal is more sensitive: AI tools for contract drafting and policy research are useful, but anything that constitutes legal advice or professional judgment remains the responsibility of licensed counsel.
Table 9.1 summarizes the function-by-function landscape.
| Function | Mature use cases | Risk level | Key constraint |
|---|---|---|---|
| Revenue cycle | Denial drafting, coding review | Medium | Local validation required |
| HR | Job-description drafting, policy Q&A | Medium–High | Bias audit if screening |
| Supply chain | Contract review, redlining | Medium | Accuracy + audit trail |
| IT operations | Ticket triage, code review | Medium | Prompt injection risk |
| Marketing | Content drafting, translation | Low–Medium | FTC and brand review |
| Philanthropy | Donor research, letter drafting | Low | Relationship sensitivity |
| Facilities | Work-order routing, policy Q&A | Low | Minimal |
| Legal | Contract research, drafting | Medium | Professional judgment stays human |
9.3 The Regulatory Layer
Most AMC operations leaders have read about HIPAA. Fewer have encountered the employment AI laws, the FTC’s AI enforcement posture, or the specific nondiscrimination requirements that apply to algorithmic decision tools. A useful organizing frame is that the regulatory exposure in business operations AI falls into three areas: employment decisions, patient-facing communications, and data handling.
The employment area is the most actively litigated. The EEOC has been explicit that Title VII, the ADA, and the ADEA apply to automated employment decisions, and that employers cannot shift liability to a third-party AI vendor (U.S. Equal Employment Opportunity Commission 2024). NYC Local Law 144 requires employers using automated employment decision tools for NYC-based roles to conduct and publish annual independent bias audits and to notify candidates before the tool is used (Local Law 144 2021). Colorado’s SB 24-205 requires “deployers” of high-risk AI systems (which include employment AI) to implement risk management programs, conduct impact assessments, and notify consumers when they are subject to an algorithmic decision (SB 24-205 2024). An AMC that operates in New York or Colorado, or that is a federal contractor, faces compliance obligations that are already in effect. Several other states — Illinois, Texas, Utah — have enacted or proposed similar legislation.
The Federal Trade Commission’s “Operation AI Comply” matters for AMC marketing teams in particular. The FTC’s enforcement actions targeted companies that made false or unsubstantiated AI capability claims in customer-facing contexts, including claims that AI had reviewed or validated health-related content (Federal Trade Commission 2024). Patient communications that describe AI tools as “reviewed by clinical AI” or “AI-verified” without substantiated validation claims invite FTC scrutiny. The safer approach is to describe what the AI actually did (drafted, organized, translated) rather than making accuracy claims.
The HIPAA exposure in business operations is subtler than the exposure in clinical domains, but it is real. Revenue cycle, HR (benefit eligibility, medical leave records), and even marketing (patient testimonials, program enrollment data) are all areas where patient information can appear in business-operations workflows. The HHS Section 1557 nondiscrimination rule makes explicit that the nondiscrimination requirements extend to algorithmic tools used in the administration of health programs, not just to clinical decisions (U.S. Department of Health and Human Services, Office for Civil Rights 2024). A revenue cycle AI tool that systematically produces less aggressive appeal letters for Medicaid patients than for commercially insured patients is potentially a Section 1557 violation, regardless of whether that disparity was intentional.
The OMB’s Memorandum M-24-10 on advancing AI governance, while directed at federal agencies, establishes the federal government’s expectations for institutions receiving significant federal funding — which describes every AMC (Office of Management and Budget 2024). Its requirements around rights-impacting and safety-impacting AI systems, and its emphasis on impact assessments and transparency, set a standard that AMC AI governance programs should be prepared to meet or justify deviating from.
9.4 The Shadow-IT Problem
The single most common business-operations AI event at an AMC today is not a deliberate deployment: it is an employee using a consumer AI tool — ChatGPT, Claude.ai, Gemini — with institutional data, without authorization, because no better option is readily available.
The dynamics are predictable. Administrative staff are under pressure to produce more documentation with fewer resources. Consumer AI tools are free, fast, and effective for exactly the drafting, summarizing, and reformatting tasks that dominate their workload. The institutional alternatives — if they exist — require tickets, approvals, and onboarding processes that take weeks. The employee uses the consumer tool once, finds it dramatically faster, and makes it a habit.
The problem is not that employees are careless. The problem is that the institution has not given them a compliant, convenient alternative. The solution is not a prohibition — prohibition without an alternative produces the same behavior plus an incentive to hide it. The solution is a gateway: an institutionally managed AI platform that is as easy to reach as the consumer tools, covered by appropriate data handling agreements, and integrated into the workflows employees actually use.
This is the core argument for enterprise AI tenancy, which is addressed in the next section. The point here is that the governance problem and the security problem in business operations AI are, at bottom, the same problem: the institution has not built a path of least resistance that is also a path of least risk.
9.5 Enterprise AI Platforms
The practical procurement question for most AMCs in 2025–2026 is not whether to build a large language model, but which enterprise platform to deploy and how to configure it. Three platforms dominate the healthcare market: Microsoft 365 Copilot, Google Workspace AI, and ChatGPT Enterprise. Anthropic’s Claude for Enterprise is a smaller but growing option. Each offers a different trade-off between integration depth, data handling controls, and administrative overhead.
Microsoft 365 Copilot is the obvious choice for institutions already running Microsoft 365, which describes most AMCs. It runs inside the existing Microsoft tenant, inherits existing Entra ID (formerly Azure Active Directory) identity and access management, and operates under Microsoft’s existing HIPAA Business Associate Agreement. The data does not leave the tenant for model training. The limitation is that Copilot’s performance is tightly coupled to the quality of the institution’s Microsoft 365 data hygiene — it surfaces documents, emails, and Teams conversations that are accessible to the user, which means poorly organized SharePoint sites and unclassified document libraries produce low-quality responses. Copilot is a tool that amplifies whatever information architecture the institution already has.
Google Workspace AI integrates Gemini models into Gmail, Docs, Sheets, and Meet for institutions on Google Workspace. Google offers a BAA for covered entities and expressly excludes customer data from training for Workspace customers. Like Copilot, it inherits the existing identity infrastructure. Its strongest use cases in AMC business operations are in drafting and document summarization.
ChatGPT Enterprise gives users access to OpenAI’s GPT-4-class models within an isolated organizational tenant, with a commitment that customer data is not used for model training. OpenAI began offering a HIPAA BAA to eligible enterprise customers in 2024 (see Table 9.2); without a signed BAA, use must be limited to data that does not constitute protected health information. For many business-operations tasks this is not a constraint — job descriptions, vendor correspondence, and internal policy drafts rarely contain PHI — but revenue cycle and HR use cases that might touch patient data require a signed BAA or a compliant alternative.
Claude for Enterprise (Anthropic) offers similar data handling commitments and organizational controls. It has been adopted at a smaller number of healthcare institutions, but its comparative strength in long-context tasks (reading and summarizing long contracts or policy documents) makes it worth evaluating for legal and supply chain applications specifically.
Table 9.2 compares the platforms on the dimensions that matter most for an AMC deployment decision.
| Platform | HIPAA BAA | Data residency | No training on customer data | Identity integration | Best fit |
|---|---|---|---|---|---|
| Microsoft 365 Copilot | Yes | US available | Yes | Entra ID (SAML/OIDC) | M365-native AMCs |
| Google Workspace AI | Yes | US available | Yes (Workspace customers) | Google Workspace SSO | Google Workspace AMCs |
| ChatGPT Enterprise (OpenAI) | Available (2024+) | US | Yes | SAML SSO | PHI-containing workflows with signed BAA |
| Claude for Enterprise (Anthropic) | Available | US | Yes | SAML/OIDC | Long-context tasks; contract review; policy Q&A |
The buy-versus-build question in business operations has a clearer answer than in clinical or research domains. Building a custom large language model for revenue cycle or HR tasks is not a reasonable investment for any AMC. Even fine-tuning an open-weight model on institutional data requires infrastructure, ML engineering capacity, and an ongoing maintenance commitment that very few AMC IT organizations have. The right question is not build-versus-buy, but which enterprise platform to configure and how deeply to integrate it into existing workflows. The exception is retrieval-augmented generation (RAG) over internal document repositories — this is closer to configuration than model development, it runs on commodity infrastructure, and it produces meaningful value for internal policy Q&A, contract search, and HR knowledge bases.
9.6 Governance as Procurement
The NIST AI Risk Management Framework (National Institute of Standards and Technology 2023) organizes AI risk management into four functions: Govern (establish policies and roles), Map (identify and categorize risks), Measure (analyze and assess), and Manage (prioritize and treat). In the context of business operations AI procurement, these functions translate to a procurement process that is itself the governance mechanism.
The key insight is that most AMCs already have vendor risk management processes: security reviews, legal review of contract terms, privacy impact assessments for tools handling patient data. The task is not to build new infrastructure from scratch but to add AI-specific steps to existing workflows. The additions are modest: a bias audit requirement for any tool used in employment screening, a PHI-exposure assessment for any tool handling revenue cycle or HR data, a data-training prohibition clause for any AI vendor agreement, and an audit log requirement for all enterprise AI tool usage.
Figure 9.1 illustrates a workable procurement flow for business-operations AI.
```mermaid
flowchart TD
    A[Unit requests AI tool] --> B{Is it generative AI?}
    B -- No --> Z[Standard IT procurement]
    B -- Yes --> C{Does it touch PHI?}
    C -- Yes --> D[Privacy/HIPAA review<br/>BAA required]
    C -- No --> E{Used in hiring/<br/>promotion decisions?}
    D --> E
    E -- Yes --> F[Bias audit required<br/>NYC LL144 / CO SB 24-205 check]
    E -- No --> G[Security review<br/>Data classification check]
    F --> G
    G --> H{Enterprise platform<br/>already available?}
    H -- Yes --> I[Configure existing tenant<br/>no new vendor]
    H -- No --> J[Vendor evaluation<br/>data training prohibition required]
    I --> K[Pilot with defined<br/>success metrics]
    J --> K
    K --> L[Domain lead approval<br/>AISC notification]
    L --> M[Deployment with<br/>audit logging]
    M --> N[Quarterly review]
```

Figure 9.1: Procurement flow for business-operations AI.
A few principles govern this flow. First, the preference for existing enterprise platforms over new vendors is explicit. The institution already has a Microsoft or Google tenant, already has SSO configured, already has data handling agreements in place. Using those tenants for a new use case is not a procurement event; it is a configuration decision. The governance overhead is lower, the data handling is already contracted, and the identity integration is already done. Second, any tool used in employment decisions requires a bias audit regardless of where in the institution it is used and regardless of whether the vendor claims their product is “bias-free.” Third, every deployment ends with a defined success metric and a quarterly review cadence — not because every business-operations AI tool is high-stakes, but because without review, tools that are not working accumulate rather than being replaced.
The ISO/IEC 42001 Artificial Intelligence Management System standard (ISO/IEC 42001 2023) provides an international governance framework for AI management that complements the NIST AI RMF. Institutions seeking to demonstrate AI governance maturity to partners, accreditors, or regulators will find that a 42001-aligned management system maps well to NIST and requires no duplication of effort.
9.7 What Success Looks Like
Measuring the value of business-operations AI is harder than it sounds, not because the tools produce no value but because the value shows up in forms that are difficult to isolate. Time savings are rarely captured systematically; productivity improvements appear as caseload increases, not headcount reductions; quality improvements in documents produced are hard to quantify. The AMC that waits for a clean ROI calculation before deploying AI will still be waiting when its administrative staff have long since found their own tools.
A workable measurement approach focuses on three levels. At the output level, track task-level time-to-completion for the specific workflows where AI is deployed — time from denial receipt to appeal letter, time from job requisition to posted description, time from contract receipt to first redline. At the quality level, track human review override rates (what fraction of AI outputs are substantially changed by the reviewing human) and error rates. At the compliance level, track the number of unauthorized AI tool uses detected and the trend over time.
None of these metrics requires a formal ROI model. Each requires only that the deployment was scoped, that baseline measurements were taken before AI was introduced, and that someone is responsible for tracking the numbers after deployment. The institutional discipline to take the baseline measurement is, in practice, the hardest part.
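As one illustration of how little tooling this requires, all three levels can be summarized from a simple usage log. The column names below are hypothetical; the point is that a spreadsheet-grade schema is enough.

```python
import csv
from statistics import median

def pilot_metrics(log_path: str) -> dict[str, float]:
    """Summarize a pilot log with hypothetical columns:
    baseline_minutes, task_minutes, substantially_edited (0/1)."""
    with open(log_path, newline="") as f:
        rows = list(csv.DictReader(f))
    return {
        # Quality level: fraction of AI outputs substantially changed by a human.
        "override_rate": sum(int(r["substantially_edited"]) for r in rows) / len(rows),
        # Output level: median minutes saved versus the pre-AI baseline.
        "median_minutes_saved": median(
            float(r["baseline_minutes"]) - float(r["task_minutes"]) for r in rows
        ),
        "tasks_logged": len(rows),
    }
```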
9.8 Where to Start: Two Starter Projects
The preceding sections describe a governance framework and a set of use cases. This section describes what to actually do in the next ninety days. The goal is not to deploy everything — it is to deploy something well, learn from it, and build institutional capability that the next deployment can inherit.
9.8.1 Project 1: Enterprise AI Gateway Deployment
What it is. Deploy the AI capabilities available under your existing Microsoft 365 or Google Workspace agreement to a defined group of business-operations users — a pilot cohort of 20–50 people across finance, HR, and communications. Configure it within the existing tenant (no new vendor, no new security review), establish a simple usage policy, enable audit logging, and run for 90 days.
What you need to start. A Microsoft 365 E3/E5 tenant (Copilot is licensed as a per-user add-on, so budget add-on seats for the pilot cohort) or a Google Workspace edition with Gemini features, an IT administrator with tenant configuration access, a data classification policy that tells users what data they can put into the tool, and a willing domain lead in one of the three pilot functions. Nothing else is required.
What done looks like. After 90 days: audit logs are running, the pilot cohort has a shared understanding of what the tool can and cannot be used for, at least three specific workflow improvements have been documented with before/after time measurements, and no unauthorized PHI events have occurred. The 90-day report goes to the AI steering committee (AISC) with a recommendation on whether to expand.
Build or buy? Configure. This is not a procurement event. It is turning on a capability already licensed and paying for a governance and usage framework around it.
9.8.2 Project 2: Internal Policy Q&A Chatbot (RAG over Internal Documents)
What it is. Build a retrieval-augmented generation system over a defined corpus of institutional policies — HR policies, revenue cycle coding guidelines, facilities maintenance procedures, or any high-traffic internal knowledge base. Users ask questions in natural language; the system retrieves the relevant policy text and generates an answer, with the source document cited.
Why this one. Policy Q&A is high volume, low stakes, and naturally auditable. The current alternative — searching a SharePoint site or calling HR — is time-consuming and produces inconsistent answers. The AI system can be evaluated straightforwardly: does it retrieve the right policy section, and does its answer match what the policy actually says?
What you need to start. A curated set of 50–100 policy documents in a shared drive, access to an embedding API (Azure OpenAI embeddings or Google Vertex AI embeddings are available within the existing enterprise tenant), a vector store (pgvector on an existing PostgreSQL instance, or a cloud-native option), and one developer with two to four weeks of focused time. No specialized ML expertise is required; RAG over a small document corpus is well within the capability of a competent generalist developer following published patterns.
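To make the shape of the work concrete, here is a minimal sketch of the retrieval loop, assuming the OpenAI Python SDK (an Azure OpenAI deployment differs mainly in client construction), psycopg with a pgvector table named policy_chunks populated by a separate indexing pass, and placeholder model and connection names.

```python
# Minimal RAG loop: embed the question, retrieve the nearest policy chunks
# from pgvector, and ask the model to answer from those chunks only.
# Assumes a policy_chunks(doc_title, chunk_text, embedding vector(1536))
# table already populated; model names and the DSN are placeholders.
import psycopg
from openai import OpenAI

client = OpenAI()

def answer(question: str, k: int = 4) -> str:
    emb = client.embeddings.create(
        model="text-embedding-3-small", input=question
    ).data[0].embedding
    vec = "[" + ",".join(map(str, emb)) + "]"  # pgvector literal format
    with psycopg.connect("dbname=policies") as conn:
        rows = conn.execute(
            "SELECT doc_title, chunk_text FROM policy_chunks "
            "ORDER BY embedding <=> %s::vector LIMIT %s",
            (vec, k),
        ).fetchall()
    excerpts = "\n\n".join(f"[{title}]\n{text}" for title, text in rows)
    reply = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {"role": "system",
             "content": "Answer only from the policy excerpts provided and "
                        "cite the document title in brackets. If the excerpts "
                        "do not answer the question, say so."},
            {"role": "user", "content": f"{excerpts}\n\nQuestion: {question}"},
        ],
    )
    return reply.choices[0].message.content
```

The system prompt that confines answers to the retrieved excerpts and requires a cited title is what makes the accuracy target below testable: a grader can check each answer against the source it cites.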
Build or buy? Build. This is one case in business operations where building is clearly preferable: the corpus is institution-specific, the commercial alternatives for internal policy Q&A are expensive, and the technical complexity is low. A developer who builds this system gains skills and understanding that transfer directly to the next RAG project — research literature summarization, clinical protocol Q&A, EHR documentation search.
What done looks like. A working system that answers 80% of test questions correctly against the policy corpus, with cited sources, running at acceptable latency. A defined process for adding new policies and reindexing. Usage logs showing which questions are being asked most frequently (which is itself valuable institutional intelligence about where policy documentation is unclear).