
The Hidden Ethics of Second Opinions in the Age of AI: Who Is Responsible When the Model Is Wrong?

When AI enters the diagnostic chain, the “second opinion” becomes a shared judgment across clinician, institution, and vendor, and responsibility gets harder to locate when the model is wrong. The real ethical question is not whether AI can advise care, but who owns the consequences when human trust, workflow design, and software output collide.

Author

Dr. Sina Bari, MD

Physician | Writer | Medical Executive | Stanford Medicine

Published

June 10, 2026

Reviewed

June 10, 2026

Last Tuesday, I sat in a quiet exam room while a colleague scrolled through an AI-generated radiology summary on a laptop balanced between two coffee cups. The patient in front of us had already been through one outside read, and now the machine was being treated like a second consultant: brisk, confident, and slightly wrong in a way that was easy to miss if you were moving too fast. I watched my colleague pause, then say, “I don’t know if I trust this one.”

When AI acts like a second opinion, responsibility still sits with the clinician and institution that choose to rely on it, but vendors and deployers share ethical exposure when the workflow is designed to make the model feel more certain than it is. The hard problem is accountability across the whole chain, because the failure often comes from the handoff, not just the prediction.

For a clinician, the safest standard is simple: treat the model as advisory, document the rationale for acceptance or rejection, and never allow AI output to replace active human review of the case.

I used to think the ethics of AI in medicine would mostly hinge on accuracy. If the model was right often enough, I assumed the responsibility question would be relatively straightforward. Then I started seeing how easily a “helpful” second opinion changes the behavior of the first opinion, and how quickly a team can begin to defer to a confident screen rather than a careful mind. Now I think the deeper issue is responsibility architecture: who designed the workflow, who approved the use case, who monitored drift, and who gets blamed when the output is wrong.

That question matters far beyond medicine. The same tension shows up whenever a human expert is asked to remain legally and morally responsible while software quietly reshapes the decision path beneath them. Readers who want the credentials behind this perspective can see the credentials page for Dr. Sina Bari, a clinician-entrepreneur focused on the intersection of medicine and innovation.

What a second opinion means when the second opinion is software

In the old model, a second opinion was legible. It came from a second radiologist, a consultant, or a colleague across the hall: someone with a name, a training history, and a signature. If they disagreed, the chart showed the disagreement. If they were wrong, liability was at least conceptually traceable. AI complicates that because it can arrive before the clinician’s own uncertainty has fully formed, and it can do so with a tone of certainty that exceeds its epistemic grounding.

The World Health Organization has warned that AI in health care requires human oversight, transparency, and accountability to protect safety and trust, and the point is not decorative governance. It is a practical recognition that once a model is part of the clinical chain, responsibility does not disappear into the interface. It becomes distributed, and distributed responsibility is where medicine can become ethically lazy if nobody names the handoffs.

I have seen this in meetings with vendors, too. The demo starts with a simple promise: earlier recognition, fewer misses, faster throughput. Then the implementation discussion turns to thresholds, exception handling, audit logs, and who actually gets notified when the model flags an outlier. That is where the romance of automation meets the dull, necessary labor of accountability. The dull part is the real part.

The model is only one actor in a larger chain

The most common mistake I see is treating AI error as if it were a single-event failure. In practice, an AI-assisted decision usually reflects a chain of choices. The vendor chose the training set. The hospital chose the tool. The clinician chose when to trust it. The institution chose whether to monitor false positives, false negatives, and subgroup performance over time. When harm occurs, the question becomes not just what the model said, but who created the conditions that made overreliance likely.
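To make the monitoring point concrete, here is a minimal Python sketch of what tracking false positives, false negatives, and subgroup performance might involve. It is an illustration under stated assumptions, not a validated method: the function name, the tuple layout, and the idea that every case eventually receives a confirmed outcome are all hypothetical.

```python
from collections import defaultdict


def subgroup_error_rates(cases):
    """Return false-negative and false-positive rates per subgroup.

    `cases` is an iterable of (subgroup, model_flagged, confirmed_positive)
    tuples. This layout is an assumption made for illustration only.
    """
    counts = defaultdict(lambda: {"tp": 0, "fp": 0, "tn": 0, "fn": 0})
    for subgroup, flagged, positive in cases:
        if flagged and positive:
            key = "tp"  # model flagged a real finding
        elif flagged and not positive:
            key = "fp"  # model flagged something that was not there
        elif not flagged and positive:
            key = "fn"  # model missed a real finding
        else:
            key = "tn"  # model correctly stayed quiet
        counts[subgroup][key] += 1

    rates = {}
    for subgroup, c in counts.items():
        positives = c["tp"] + c["fn"]
        negatives = c["tn"] + c["fp"]
        rates[subgroup] = {
            # None means the subgroup had no cases of that kind yet,
            # so no rate can be reported for it.
            "false_negative_rate": c["fn"] / positives if positives else None,
            "false_positive_rate": c["fp"] / negatives if negatives else None,
        }
    return rates
```

Reporting rates per subgroup rather than one pooled figure is the design choice that matters here: a tool can look acceptable overall while missing far more findings in one patient population.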

That framing is consistent with a 2026 narrative review on governing healthcare AI in the real world, which argues that fairness, transparency, and human oversight must coexist rather than compete. The review’s value is that it does not treat governance as a symbolic afterthought. It treats it as a safety mechanism embedded in deployment.

Legal scholars are moving in the same direction. In From Accountability to Liability: How Accountability Mechanisms Shape the Non-Contractual Liability of Generative AI in the EU and China, the authors compare accountability frameworks across jurisdictions and show how legal responsibility shifts when an AI system becomes a decision support tool rather than a passive product. That matters in medicine because health systems often behave as though procurement is a neutral act. It is not. Procurement is a moral decision about where risk will land.

What I would not do

I would not let an AI second opinion appear in the chart as though it were an equivalent consultant note. I would not allow a model’s output to be auto-forwarded to patients without clinician review. I would not defend a workflow that hides the uncertainty score, the confidence calibration, or the known failure modes of the system. If the output is going to influence care, the human reading it should understand exactly how it was generated, what it is good at, and where it has already failed.

That is especially true in imaging, where the output can look authoritative even when it is built on brittle patterns. A 2026 paper on malpractice in machine learning for medical imaging, published in Radiography, examines legal and ethical responses to algorithmic error and underscores how misleading outputs can become clinically consequential when users assume the machine has already “checked” the work. What matters most in that paper is not the case count or the review scope. It is the scale of trust being transferred.

Why responsibility feels blurry in practice

Blur is the problem. Once AI enters the room, everyone can plausibly argue that someone else had the better vantage point. The clinician says the tool was advisory. The institution says it trained clinicians to use judgment. The vendor says the model was validated within specified limits. The patient, meanwhile, hears one message: the system seemed confident, so somebody probably knew.

That mismatch between confidence and certainty is where ethics becomes operational. A 2026 white paper from SAGES on risk and liability in AI deployment for surgery makes the same point in surgical language, emphasizing that AI systems can support decision-making while still leaving the surgeon and institution responsible for integration, oversight, and adverse-event review. I read that as a warning against moral outsourcing. A tool can help. It cannot absorb blame.

Clinical experience has taught me that teams often confuse usefulness with innocence. A model that saves time can still distort responsibility if it nudges clinicians toward less skeptical reading. I have made that mistake myself in early pilot work, assuming that a good interface would naturally produce good judgment. It does not. Good interfaces can produce faster mistakes just as easily as faster care.

The self-correction I had to make

I used to believe that the right ethical stance was simple skepticism toward the machine. Then I worked with clinicians who were already overwhelmed, and I saw that blanket distrust is not a safety strategy. It can drive people to ignore valuable signals because the system feels abstract or politically fashionable. Now I think the better ethic is disciplined curiosity: demand evidence, demand logs, demand monitoring, but also learn enough to use the tool well when it does add value.

That position is less dramatic than either hype or rejection. It is also harder. It requires the clinician to stay intellectually active after the software is purchased. It requires the institution to fund audits after the press release. It requires the vendor to remain accountable after the demo ends.

In that sense, AI second opinions expose a familiar professional temptation: the wish to offload uncertainty. Medicine has always flirted with that temptation. Now the software is simply more polished.

What accountability should look like

If we want to keep AI second opinions ethically usable, responsibility has to be assigned in layers. The clinician remains responsible for the final clinical judgment. The institution remains responsible for vetting the tool, training users, monitoring performance, and responding to drift. The vendor remains responsible for the quality of the system, the transparency of its limitations, and the honesty of its validation claims.

That layered view aligns with the MENA-region comparative analysis of clinical AI liability in Medical Archives, which compares hospital, physician, supplier, and vendor responsibility across different deployment contexts. The point is useful because it breaks the lazy habit of searching for a single villain after the fact. In clinical life, blame is rarely that clean. Duty is cleaner than blame. Responsibility can be assigned before harm.

There is also a governance lesson here. The 2026 framework paper Keeping an Eye on AI: A Framework for Effective Human Oversight of AI Systems argues for structured oversight rather than passive monitoring. That distinction matters. Passive monitoring says, “We will notice if something goes wrong.” Structured oversight says, “We will design for noticing, escalation, and correction before harm propagates.” In medicine, only one of those is serious.
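One way to picture the difference is a rule that decides in advance what a warning sign triggers. The Python sketch below is a hypothetical illustration, not something taken from the framework paper: it uses the clinician override rate as the signal, and the thresholds, the minimum case count, and the action names are all assumptions chosen for the example.

```python
def review_action(overrides, total, warn_at=0.10, pause_at=0.25, min_cases=20):
    """Map a clinician override rate to a predefined oversight action.

    `overrides` is how many AI outputs clinicians rejected out of `total`
    reviewed cases. All thresholds are illustrative assumptions.
    """
    if total < 0 or overrides < 0 or overrides > total:
        raise ValueError("overrides must be between 0 and total")
    if total < min_cases:
        # Too few cases to act on a rate; keep collecting.
        return "insufficient_data"

    rate = overrides / total
    if rate >= pause_at:
        return "pause_tool"
    if rate >= warn_at:
        return "escalate_to_review"
    return "continue"
```

For example, `review_action(6, 50)` returns `"escalate_to_review"` under these assumed thresholds. The numbers matter less than the fact that escalation and correction are named before anything goes wrong.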

The clinical question I ask now

When I evaluate an AI tool, the first question I ask is not whether it is accurate in the brochure. I ask what happens at the edge cases, who sees the error first, and whether the workflow makes it easier to challenge the model than to defer to it. If nobody can answer those questions clearly, the system is not ready for anything more than a lab bench or a shadow mode.

I have also become more interested in the human cognition side of the problem. A 2026 preprint on grounding clinical AI competency in the clinical world model and skill-mix framework argues that safe use depends on how clinicians interpret, cross-check, and integrate machine output with their own mental models. That sounds academic until you watch a busy doctor silently accept the path of least resistance because the screen looks polished and the patient is waiting. Competence is not just technical literacy. It is the habit of staying responsible under pressure.

The part nobody likes to say out loud

AI second opinions can create moral cover. That is the dangerous part. A physician can feel less alone, an administrator can feel more modern, and a vendor can feel more essential, while the actual patient still needs someone willing to own the uncertainty in the room. The software may reduce cognitive load. It can also reduce the felt burden of dissent. That is a liability long before it becomes a lawsuit.

The broader safety literature reinforces this concern. The 2026 International AI Safety Report is not a medical paper, but it is helpful because it frames safety as a systems property, not just a model property. That matters in health care, where a single model can look safe in isolation and unsafe once it is wired into triage, imaging, messaging, and billing. Systems fail at the seams.

There is a temptation to believe that better models will solve the ethics problem. I do not think they will. Better models will still leave us with the same human questions about disclosure, oversight, liability, and professional identity. They may reduce error rates. They will not abolish responsibility.

Coming back to the exam room

By the time I left that clinic last Tuesday, the AI summary had been set aside. We re-read the images, compared the outside report, and documented why we disagreed with part of the machine’s interpretation. The patient never needed the jargon, only the truth that a confident screen had not replaced our judgment. That felt old-fashioned in the best possible way.

I think about that moment often because it clarified the real ethics of second opinions in the age of AI. Responsibility should not migrate to the model just because the model is fast, fluent, or statistically impressive. The human still has to read, decide, explain, and stand behind the decision. The institution still has to design a system that makes that possible. The vendor still has to own the limits of the product.

What changed for me is not whether AI belongs in medicine. It does. What changed is my tolerance for ambiguity about who is accountable when it is wrong. The answer has to be visible before the error, not invented after it.

FAQ

What happens if a hospital deploys an AI triage tool without clinician oversight?

The hospital inherits a major patient-safety and liability problem because the tool can shape access, urgency, and downstream workup without a human clinician catching misclassifications. In practice, that means false reassurance, delayed escalation, or unnecessary alarms can all become institutional failures rather than isolated software bugs.

Who is responsible when an AI model gives the wrong diagnosis?

Responsibility is usually shared across the clinician, the institution, and the vendor, but the clinician and health system remain the most immediate duty holders in real care. The vendor may bear product and disclosure obligations, yet the treating team still has to review, interpret, and document the final decision.

How should clinicians document disagreement with an AI second opinion?

They should document the model’s recommendation, the specific reasons it was not followed, and the human evidence used to override it. That record matters clinically and legally because it shows active reasoning rather than passive reliance.
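As a rough illustration of that record, the Python sketch below captures the three elements named above and refuses to save a note that omits the reasoning. The class and field names are hypothetical and do not correspond to any real charting system or standard.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass(frozen=True)
class AIDisagreementNote:
    """Hypothetical record of a clinician declining an AI recommendation."""

    model_recommendation: str
    clinician_decision: str
    reasons_not_followed: list
    supporting_evidence: list
    # Timestamp in UTC so the record is unambiguous across sites.
    recorded_at: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc)
    )

    def __post_init__(self):
        # A note without reasons or evidence would show passive
        # disagreement, not active reasoning, so reject it.
        if not self.reasons_not_followed:
            raise ValueError("at least one reason is required")
        if not self.supporting_evidence:
            raise ValueError("at least one piece of evidence is required")
```

Making the reasons and evidence mandatory is the point of the sketch: the structure forces the reasoning to be written down at the moment of disagreement rather than reconstructed later.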

What is Dr. Sina Bari's approach to AI in clinical decision-making?

Dr. Sina Bari’s approach is to treat AI as a narrow advisory tool, not a surrogate for clinical judgment. The practical standard is careful oversight, transparent limitations, and a workflow that keeps the physician responsible for the final call.

Can a patient ask whether AI was used in their care?

Yes, and they should be able to get a clear answer about whether AI contributed to interpretation, triage, or documentation. Patients deserve enough information to understand how a recommendation was formed, especially when software may have influenced urgency or diagnosis.
