Your own Artificial Intelligence assistant: personalized, private, and built for clinical thinking

A primer for anesthesiologists on creating private, localized Artificial Intelligence assistants using small language models and retrieval augmented generation to protect patient data and improve clinical relevance.

This article outlines how generative Artificial Intelligence is being adapted for clinical use in anesthesia and regional anesthesia, and why localized solutions are gaining traction. The author describes generative models such as ChatGPT and cites estimates of the scale of cloud-based large language models, including reported figures of 24,200 gigabytes of training data and proposals of up to 1 trillion parameters, which illustrate why these models require large cloud infrastructure. Interest in applying these systems in anesthesia ranges from creating anesthesia plans and managing labor analgesia to summarizing intraoperative reports, but three consistent limitations are highlighted: patient confidentiality when using commercial cloud services, variable accuracy of responses, and the risk of hallucinations.

To address confidentiality and clinical fidelity, the article advocates running small language models locally and coupling them with retrieval augmented generation (RAG). Small language models are described as distilled, quantized versions of larger models, commonly ranging from 1 to 70 billion parameters, with a practical sweet spot around 7 to 14 billion for consumer-grade hardware. The piece summarizes hardware tiers and trade-offs for running SLMs, and explains that clinicians can prefill local RAG vector databases with domain-specific resources (PDFs, videos, guidelines, and hospital procedures) to inject contemporaneous medical information into model outputs. A use case from Liverpool Hospital is detailed: a curated chest wall analgesia database connected to a Qwen3 30B mixture-of-experts model, run with Ollama and Open WebUI, plus an adaptive memory layer and carefully designed system prompts to improve citation and traceability.
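The retrieval step at the heart of a setup like this can be sketched in a few lines. The following is a minimal, illustrative sketch only: it stands in a toy bag-of-words similarity for the real embedding model and vector database a local RAG stack (e.g. one built around Ollama) would use, and the document snippets are invented examples, not the Liverpool Hospital database.

```python
from collections import Counter
import math

# Toy document store standing in for a RAG vector database.
# In a real deployment each entry would be an embedding of a chunk
# from the curated PDFs, guidelines, and hospital procedures.
DOCS = [
    "erector spinae plane block for rib fracture analgesia",
    "serratus anterior plane block for chest wall surgery",
    "labor epidural analgesia management protocol",
]

def embed(text: str) -> Counter:
    """Toy 'embedding': bag-of-words counts (stand-in for a real model)."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query: str, k: int = 1) -> list[str]:
    """Return the k stored chunks most similar to the query."""
    q = embed(query)
    ranked = sorted(DOCS, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:k]

# The retrieved chunk(s) are then prepended to the prompt sent to the
# local model, so the answer is grounded in the curated material
# rather than in the model's training data alone.
context = retrieve("chest wall analgesia after rib fracture")
print(context[0])
```

The design point is that retrieval happens entirely on local hardware: no patient-related query ever leaves the machine, which is precisely the confidentiality argument the article makes for localized RAG.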

The article also surveys broader clinical and academic applications, such as drafting documents that include patient data, literature synthesis, multilingual chatbots, and code generation. It introduces the concept of agentic Artificial Intelligence, where autonomous agents can select models, optimize prompts, or perform fact checking to reduce hallucinations and improve trustworthiness. The author closes by noting this is Part 1 of a series and promises practical setup guidance in Part 2.


JEDEC outlines LPDDR6 expansion for data centers

JEDEC has previewed planned updates to LPDDR6 aimed at pushing the memory standard beyond mobile devices and into selected data center and accelerated computing use cases. The roadmap includes higher-capacity packaging options, flexible metadata support, 512 GB densities, and a new SOCAMM2 module standard.

TSMC debuts A13 process technology

TSMC has introduced its A13 process at its 2026 North America Technology Symposium as a tighter version of A14 aimed at next-generation Artificial Intelligence, high performance computing, and mobile designs. The company positions the node as a more compact and efficient option with backward-compatible design rules for faster migration.

Google unveils eighth-generation Tensor Processing Units

Google introduced its eighth generation of custom Tensor Processing Units, with separate designs for training and inference. The new TPU 8t and TPU 8i are aimed at large-scale model training, serving, and agentic workloads.
