Legal artificial intelligence in the large language model era reshapes data, modeling and evaluation

Large language models are transforming legal artificial intelligence from narrow task solvers into core components of data pipelines, modeling frameworks and evaluation workflows across key legal tasks.

Legal artificial intelligence, the use of artificial intelligence technologies to automate legal tasks, is entering a new phase as large language models become central to how legal data is processed, how models are built and how systems are evaluated. Beyond improving accuracy on benchmark tasks, large language models now assume multiple roles across the lifecycle of legal natural language processing, from data construction and augmentation to inference-time reasoning and post hoc assessment of system behavior.

The survey introduces a role-based schema that categorizes the involvement of large language models along three main perspectives: data, modeling and evaluation. On the data side, large language models are used to generate synthetic legal questions, summaries, rationales and annotations, to assist in data cleaning, to support data augmentation, and to help structure unorganized legal text into machine-usable formats, thereby addressing the scarcity and imbalance of labeled legal corpora. On the modeling side, large language models function as task solvers through prompting and fine-tuning, as reasoning engines that incorporate legal principles such as syllogistic or deontic logic, as components in hybrid neuro-symbolic or retrieval-augmented architectures, and as collaborative partners to smaller domain models. On the evaluation side, large language models are enlisted to provide relevance judgments in retrieval, to annotate legal factors or argument structures, to assess hallucinations and factuality in generated content, and to enable more nuanced qualitative analyses of legal system performance.
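One way to picture the role-based schema is as a small lookup structure that groups roles under the three perspectives. The sketch below is only an illustration: the role labels are hypothetical shorthand for the roles described above, not the survey's own terminology, and the list is deliberately incomplete.

```python
from dataclasses import dataclass
from enum import Enum


class Perspective(Enum):
    """The three perspectives named in the schema."""
    DATA = "data"
    MODELING = "modeling"
    EVALUATION = "evaluation"


@dataclass(frozen=True)
class Role:
    name: str  # hypothetical label, chosen here for illustration only
    perspective: Perspective


# A partial, illustrative selection of the roles described in the text.
ROLES = [
    Role("synthetic_data_generator", Perspective.DATA),
    Role("data_cleaner", Perspective.DATA),
    Role("task_solver", Perspective.MODELING),
    Role("reasoning_engine", Perspective.MODELING),
    Role("relevance_judge", Perspective.EVALUATION),
    Role("hallucination_assessor", Perspective.EVALUATION),
]


def roles_by_perspective(roles):
    """Group role names under their perspective, keeping input order."""
    grouped = {p: [] for p in Perspective}
    for role in roles:
        grouped[role.perspective].append(role.name)
    return grouped
```

Calling `roles_by_perspective(ROLES)` returns a dictionary with one entry per perspective, which is roughly how a reader might tag a given study with the role its language model plays.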

Using this schema, the survey systematically reviews work in three major categories of legal tasks: legal classification, legal retrieval and legal generation. In legal classification, large language models support judgment prediction, topic and rule classification, contract clause analysis and multi-label document tagging, often through prompt learning or as part of multi-stage frameworks. In legal retrieval, they enhance case law and statute retrieval by reasoning over implicit concepts, collaborating with graph-based methods, and serving as judges or rerankers in competitions such as COLIEE. In legal generation, they power abstractive summarization, question answering, judgment and court view generation, regulatory summarization and simulation of courtroom debates. A detailed quantitative comparison of effectiveness across roles and tasks indicates that the impact of large language models depends strongly on their assigned role and the characteristics of the underlying legal task, highlighting both the promise and the limitations of current approaches and pointing to open challenges in robustness, interpretability, ethics and domain adaptation.
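The reranker role mentioned for legal retrieval can be sketched in a few lines. This is a minimal illustration under stated assumptions, not a method taken from the survey or from any COLIEE system: `rerank` and `toy_judge` are hypothetical names, and `toy_judge` is a simple term-overlap stand-in for what would in practice be a relevance score returned by a language model.

```python
from typing import Callable, List, Tuple


def rerank(
    query: str,
    candidates: List[str],
    judge: Callable[[str, str], float],
    top_k: int = 3,
) -> List[Tuple[str, float]]:
    """Score each candidate with the judge and return the top_k by score."""
    scored = [(doc, judge(query, doc)) for doc in candidates]
    # Python's sort is stable, so ties keep their original retrieval order.
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


def toy_judge(query: str, doc: str) -> float:
    """Stand-in for a model-based relevance judgment (assumption):
    the fraction of query terms that also appear in the document."""
    q_terms = set(query.lower().split())
    d_terms = set(doc.lower().split())
    return len(q_terms & d_terms) / len(q_terms) if q_terms else 0.0


if __name__ == "__main__":
    cases = [
        "contract breach and damages award",
        "criminal sentencing guidelines",
        "lease contract renewal",
    ]
    for doc, score in rerank("breach of contract damages", cases, toy_judge, top_k=2):
        print(f"{score:.2f}  {doc}")
```

Keeping the judge as a plain callable means the overlap heuristic could be swapped for a prompt-based scorer without changing the reranking logic.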


