Alaska court system’s Artificial Intelligence probate chatbot exposes limits of government tech

Alaska’s courts spent more than a year building an Artificial Intelligence chatbot to guide people through probate, only to find that hallucinations, tone issues, and testing burdens made deployment far harder than expected.

Alaska’s court system has spent more than a year developing a generative Artificial Intelligence chatbot, the Alaska Virtual Assistant, to help residents navigate probate, the legal process of transferring a deceased person’s property. The effort has exposed how difficult it is for government agencies to apply Artificial Intelligence safely in high-stakes settings. The project was initially scoped as a brief effort, with one consultant saying it “was supposed to be a three-month project,” but it has stretched to “well over a year and three months” because the team insisted on extensive due diligence to avoid harmful mistakes. Court leaders, including administrative director Stacey Marz, concluded that unlike other technology projects, where a minimum viable product can be improved over time, a probate chatbot must meet a much higher standard of accuracy, because people could act on incomplete or wrong information in ways that seriously damage an individual, family, or estate.

The Alaska Virtual Assistant was envisioned as a low-cost, cutting-edge counterpart to the state’s family law helpline, replicating the kind of guided self-help a human facilitator provides on issues such as divorce or protective orders. Initially funded by a grant from the National Center for State Courts and built by Tom Martin, a lawyer and founder of the law-focused Artificial Intelligence company LawDroid, the chatbot forced the team to confront design choices about personality, tone, and behavior that have become central to modern Artificial Intelligence systems. Early user testing revealed that an overly empathetic style, including repeated condolences to grieving users, frustrated people who said they were tired of hearing “I’m sorry for your loss,” leading the team to strip such language from the chatbot’s responses. More seriously, the team battled hallucinations in which the system confidently invented details, such as telling people to seek help from a non-existent law school in Alaska, even though the chatbot was supposed to draw only from a curated court probate knowledge base.
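
The article does not describe how LawDroid actually wired the Alaska Virtual Assistant to its curated materials, but the failure it recounts, confident answers that stray from the curated court probate knowledge base, is the characteristic weakness of retrieval-grounded chatbots. The sketch below is a minimal illustration of that general pattern under stated assumptions, not the court’s implementation: the sample passages, the keyword-overlap retriever, and the model name are placeholders.

    # Illustrative sketch only: the article does not describe LawDroid's actual
    # architecture. This shows one common way to constrain a chatbot to a curated
    # knowledge base: retrieve the most relevant curated passages and instruct the
    # model to answer only from them, refusing when the material does not cover
    # the question.
    from openai import OpenAI  # assumes the OpenAI Python SDK is installed

    # Hypothetical passages standing in for the court's probate knowledge base.
    PROBATE_KNOWLEDGE_BASE = [
        "Form P-110 is used to ask the court to appoint a personal representative.",
        "Informal probate can be used when there is a valid will and no disputes.",
        "The self-help center explains court procedures but does not give legal advice.",
    ]

    def retrieve(question: str, passages: list[str], k: int = 2) -> list[str]:
        """Rank curated passages by naive keyword overlap with the question."""
        q_words = set(question.lower().split())
        scored = sorted(passages,
                        key=lambda p: len(q_words & set(p.lower().split())),
                        reverse=True)
        return scored[:k]

    def answer(question: str) -> str:
        """Answer a probate question using only retrieved curated passages."""
        context = "\n".join(retrieve(question, PROBATE_KNOWLEDGE_BASE))
        client = OpenAI()
        response = client.chat.completions.create(
            model="gpt-4o-mini",  # placeholder model name
            messages=[
                {"role": "system", "content": (
                    "Answer ONLY from the provided court probate materials. "
                    "If the materials do not answer the question, say you do not know "
                    "and refer the user to the court's self-help resources. "
                    "Do not offer condolences or legal advice.")},
                {"role": "user", "content": f"Materials:\n{context}\n\nQuestion: {question}"},
            ],
            temperature=0,  # reduces variability but does not by itself prevent hallucinations
        )
        return response.choices[0].message.content

Even with this kind of grounding and an explicit refusal instruction, a model can still ignore the context it was given, which is consistent with the team’s experience of hallucinations slipping through and its reliance on human review before launch.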

To measure whether the Alaska Virtual Assistant was ready, the project team created a bank of 91 probate-related questions, such as which form to use to transfer a deceased relative’s car title, but running the full set and manually reviewing every answer proved too time-consuming given the stakes. They eventually narrowed the evaluation to just 16 questions that blended previously missed items, complex scenarios, and basic questions expected to be common, allowing more manageable human review of the chatbot’s performance. Cost considerations also shaped the effort: newer Artificial Intelligence model iterations have sharply reduced usage fees, and one test setup showed that 20 Alaska Virtual Assistant queries would cost only about 11 cents, which the team regards as crucial for cash-strapped courts. At the same time, reliance on evolving systems such as OpenAI’s GPT models means the court will need regular checks and prompt or model updates rather than treating the chatbot as a set-and-forget tool. Despite the delays, recurring hallucinations, and scaled-back expectations compared to human facilitators, the Alaska Virtual Assistant is scheduled to launch in late January. Marz expressed cautious optimism that future model improvements could eventually boost both accuracy and completeness, even as she described the current process as extremely labor-intensive, a far cry from the widespread generative Artificial Intelligence hype about revolutionizing self-help and democratizing access to the courts.
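
As a rough illustration of what a scaled-down test bank with a per-query cost estimate might look like, here is a minimal harness sketch. It is not the court’s tooling: the sample questions, the per-token rates, and the review-sheet format are placeholder assumptions, and in practice the token counts would come from the usage metadata returned with each model response.

    # Illustrative evaluation harness; the court's actual test bank, review
    # workflow, and pricing are not published, so everything below is a placeholder.
    import csv
    from typing import Callable, Tuple

    TEST_QUESTIONS = [
        "Which form do I use to transfer my late mother's car title?",      # basic, expected-common question
        "What happens if the estate's debts are larger than its assets?",   # placeholder complex scenario
        # ...the remaining questions would mix previously missed items and common requests
    ]

    # Placeholder per-million-token rates; actual pricing depends on the model and changes over time.
    INPUT_RATE_PER_M = 0.15
    OUTPUT_RATE_PER_M = 0.60

    def run_eval(
        chatbot: Callable[[str], Tuple[str, int, int]],  # returns (answer, prompt_tokens, completion_tokens)
        questions: list[str],
        out_path: str = "review_sheet.csv",
    ) -> float:
        """Run each question through the chatbot, write a sheet for human review,
        and return an estimated total cost in dollars."""
        total_cost = 0.0
        with open(out_path, "w", newline="") as f:
            writer = csv.writer(f)
            writer.writerow(["question", "answer", "reviewer_verdict", "notes"])
            for q in questions:
                reply, prompt_toks, completion_toks = chatbot(q)
                total_cost += (prompt_toks * INPUT_RATE_PER_M +
                               completion_toks * OUTPUT_RATE_PER_M) / 1_000_000
                writer.writerow([q, reply, "", ""])  # verdict and notes left blank for a human reviewer
        return total_cost

    if __name__ == "__main__":
        # Dummy stand-in so the harness runs end to end without an API key.
        dummy = lambda q: ("Placeholder answer for review.", 500, 150)
        print(f"Estimated cost for {len(TEST_QUESTIONS)} queries: ${run_eval(dummy, TEST_QUESTIONS):.4f}")

Writing the answers out with blank verdict columns mirrors the labor-intensive step the court could not avoid: a human still has to read every response, which is why shrinking the bank from 91 questions to 16 mattered.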


