Large language model feature engineering reshapes insurance pricing

Large language models are giving actuaries new ways to engineer pricing features from both structured and unstructured data, but they also introduce fresh governance, bias and fairness challenges.

Insurance pricing is emerging as a rich field for applying large language models in feature engineering, allowing actuaries to generate new predictive variables from existing data and external sources. Feature engineering is framed as adding new columns to datasets that better describe each observation, with large language models expanding what is possible by embedding their own learned knowledge and by processing unstructured inputs such as text and images. Despite concerns about hallucination, the inherent validation steps in model fitting provide a safeguard, shifting the key risks from data scarcity and anti-selection toward governance, bias and operational controls.

The approach is broken into four main types of large language model-derived features. First, factual descriptors use models as scalable domain experts to assign ordinal risk groupings or answer detailed questions about attributes, such as model-specific car features, across thousands of factor levels in seconds instead of hours. Second, subjective descriptors extract broad, socially informed judgments, for example identifying “boy racer” cars that carry higher risk but have no explicit, stable list, replacing weeks of manual sentiment analysis. Third, interaction-style features classify observations across combinations of existing variables, flagging high-risk patterns and helping close the interaction gap between generalised linear models and tree-based machine learning methods. Fourth, multimodal models can distil large volumes of unstructured external data, such as property images similar to Google Street View, into rich signals about roof condition, maintenance, surroundings or even lifestyle proxies that traditional pricing models cannot easily capture.
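
The third feature type can be sketched in a few lines of Python. This is a minimal illustration only: the variable names (`age_band`, `power_band`), their levels, and the rule inside `flag_combination` are all assumptions made here so the example runs offline; in practice that function would wrap a call to a large language model and parse its reply.

```python
from __future__ import annotations

from itertools import product

# Hypothetical factor levels; the article does not name specific variables.
AGE_BANDS = ["17-24", "25-49", "50+"]
POWER_BANDS = ["low", "medium", "high"]


def flag_combination(age_band: str, power_band: str) -> int:
    """Stand-in for a model call that returns 1 for a high-risk pattern.

    The fixed rule below is an assumption for illustration, not a finding.
    """
    return int(age_band == "17-24" and power_band == "high")


def build_interaction_table() -> dict[tuple[str, str], int]:
    """Classify every combination once and keep the result as a lookup."""
    return {
        (age, power): flag_combination(age, power)
        for age, power in product(AGE_BANDS, POWER_BANDS)
    }


def add_flag(policies: list[dict], table: dict[tuple[str, str], int]) -> list[dict]:
    """Return copies of the policy rows with the interaction flag appended."""
    return [
        {**p, "high_risk_combo": table[(p["age_band"], p["power_band"])]}
        for p in policies
    ]


interaction_table = build_interaction_table()
policies = [
    {"policy_id": 1, "age_band": "17-24", "power_band": "high"},
    {"policy_id": 2, "age_band": "50+", "power_band": "high"},
]
flagged = add_flag(policies, interaction_table)
```

Because the flag is a single column, it could enter a generalised linear model as one extra term rather than a full set of interaction coefficients, which is one plausible reading of how such features narrow the gap with tree-based methods.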

Implementation starts with clear thinking about true underlying risk factors, such as driving ability or propensity to take risks in motor insurance, and then crafting prompts that tie new features intuitively to those factors. In practice, actuaries send factor levels to an application programming interface with prescribed response scales, then convert the results into mapping tables that can be merged into modelling datasets, as illustrated by a car model example where a copilot tool outputs risk, “boy racer” likelihood and coolness scores. Static mappings are usually preferred for cost and speed, with real-time scoring reserved for cases where new levels appear frequently, such as addresses. New features are tested with standard statistical validation and dropped if they do not improve performance, though incorrect classification at the individual level can increase price volatility.

The method also demands stringent ethical and legal scrutiny: letting models infer features from names or personal information is flagged as unacceptable, and there is explicit concern that stereotypes and protected class differences embedded in training data will taint features like perceived speeding propensity. As pricing sophistication and segmentation increase, affordability pressure on higher-risk groups is expected to rise, likely inviting more regulatory attention, while reliance on external data vendors may fall as internal teams use large language model tooling. At the same time, opaque, biased large language model-derived factors are identified as a significant operational risk for fair pricing, even as they offer a potential lifeline for traditional generalised linear models by enriching their feature space.
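
The mapping-table workflow described above (send each factor level to a model, constrain the reply to a prescribed scale, then merge the scores into the modelling data) might look roughly like the following Python sketch. Everything specific here is an assumption for illustration: the 1 to 5 scale, the JSON reply format, the placeholder car names, and `fake_llm_response`, which stands in for a real API call so the example runs without network access.

```python
from __future__ import annotations

import json

SCALE = range(1, 6)  # assumed prescribed response scale: integers 1 to 5
FEATURES = ("risk", "boy_racer", "coolness")


def fake_llm_response(car_model: str) -> str:
    """Stand-in for an API call; returns JSON text as a model might."""
    canned = {  # invented placeholder levels and scores
        "Hatchback A": {"risk": 2, "boy_racer": 1, "coolness": 2},
        "Coupe B": {"risk": 4, "boy_racer": 5, "coolness": 4},
    }
    return json.dumps(canned[car_model])


def parse_scores(raw: str) -> dict[str, int]:
    """Parse a reply and reject any score outside the prescribed scale."""
    data = json.loads(raw)
    scores = {}
    for name in FEATURES:
        value = data[name]
        if type(value) is not int or value not in SCALE:
            raise ValueError(f"{name}={value!r} is outside the 1-5 scale")
        scores[name] = value
    return scores


def build_mapping(levels: list[str]) -> dict[str, dict[str, int]]:
    """Score each distinct factor level once to form a static mapping table."""
    return {level: parse_scores(fake_llm_response(level)) for level in levels}


def merge_features(rows: list[dict], mapping: dict[str, dict[str, int]]) -> list[dict]:
    """Left-join the mapping onto policy rows; unseen levels get None."""
    missing = dict.fromkeys(FEATURES)
    return [{**row, **mapping.get(row["car_model"], missing)} for row in rows]


mapping = build_mapping(["Hatchback A", "Coupe B"])
rows = [
    {"policy_id": 1, "car_model": "Coupe B"},
    {"policy_id": 2, "car_model": "Unknown C"},
]
modelling_rows = merge_features(rows, mapping)
```

Scoring each distinct level once and joining by key reflects the static-mapping preference noted above; the `None` values for unseen levels mark where a refresh of the table, or real-time scoring, would be needed.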
