Dutch researcher advances artificial intelligence for hidden structured data insights

Dutch researcher Madelon Hulsebos is developing table representation learning techniques that allow artificial intelligence systems to understand what tables mean, aiming to unlock long-overlooked structured data and democratise access to insights across organisations.

Organisations are sitting on large quantities of structured data in relational databases and spreadsheets that remain underused because specialists still spend much of their time on repetitive work such as cleaning tables, extracting features and linking datasets. Dutch researcher Madelon Hulsebos, based at the Centrum Wiskunde & Informatica (CWI) in the Netherlands, is tackling this by developing “table representation learning”, a method that enables artificial intelligence to interpret what tables mean rather than simply search them by column names. After a PhD at the University of Amsterdam and postdoctoral work at the University of California, Berkeley, she now leads the Table Representation Learning Lab at CWI, guiding a team of three PhD students, two postdocs and six master’s students.

Backed by an NWO AiNed Fellowship Grant under the National Growth Fund programme, Hulsebos launched the DataLibra project, which runs from 2024 to 2029 and aims to build practical tools that make querying organisational data as simple as a web search. She argues that artificial intelligence can lower the barrier by allowing users to ask questions in natural language rather than learning programming, business intelligence tools and relational database concepts. The challenge is that each system uses different column names and logic, which limits traditional techniques such as SQL and pattern matching, so her work focuses on models that generalise from context in order to identify and combine relevant tables, moving from basic information retrieval to what she terms “insight retrieval”. Hulsebos stresses that full automation is not the goal, because users must be able to understand and explain why a specific answer was produced, making transparency, iteration and robustness central design requirements.
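The difference between matching on column names and generalising from context can be illustrated with a toy sketch. This is not Hulsebos’s method: fuzzy token overlap over pooled column names and sample values is a crude stand-in for learned table representations, and every table name and value below is invented for illustration.

```python
# Toy contrast between column-name lookup and context-aware table
# retrieval. Fuzzy token overlap is a crude stand-in for learned table
# representations; all table names and data here are invented.

def tokens(text):
    """Lowercase word tokens of length >= 3 (drops noise like 'nm', '&')."""
    return {t for t in str(text).lower().replace("_", " ").split() if len(t) >= 3}

def table_context(table):
    """Pool column names AND sample cell values into one token bag."""
    bag = set()
    for col, values in table.items():
        bag |= tokens(col)
        for v in values:
            bag |= tokens(v)
    return bag

def score(question, bag):
    """Count question tokens that fuzzily match any context token."""
    return sum(any(b in q or q in b for b in bag) for q in tokens(question))

def retrieve(question, tables):
    """Return the table whose pooled context best matches the question."""
    return max(tables, key=lambda name: score(question, table_context(tables[name])))

tables = {
    "tbl_a": {"cust_nm": ["Acme BV", "Jansen & Co"], "rev_eur": [120000, 87000]},
    "tbl_b": {"emp_id": [101, 102], "dept": ["sales", "hr"]},
}

# Cryptic names like "cust_nm" and "rev_eur" defeat exact keyword
# search; fuzzy matching over the pooled context still finds the table.
print(retrieve("which customer had the highest revenue", tables))  # tbl_a
```

Real table representation learning replaces the token heuristic with embeddings trained on tables, but the retrieval shape, question in, relevant tables out, is the same.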

Hulsebos sees table representation learning as a way to automate the commonly cited “80% data work and 20% modelling” split in data science, freeing experts to focus on more critical questions while also empowering non-specialists to query relational databases directly in plain language. She is sceptical of many current vendor claims about artificial intelligence powered analytics, pointing to benchmarks where success rates are often zero and highlighting the need for systems that can justify their outputs rather than simply generate confident responses. The importance of context and explanation came into focus in a recent collaboration with the United Nations Humanitarian Data Centre on detecting sensitive data in humanitarian datasets. Together with master’s student Liang Telkamp, she developed two mechanisms: one that reasons over full data context to reduce false positives, and a “retrieve then detect” approach that dynamically links datasets to relevant policies and protocols so that assessments change as conflicts or situations evolve. Quality Assessment Officers at the UN found the contextualised explanations from large language models particularly valuable for navigating long information sharing protocols, and Telkamp’s work was recognised with the Amsterdam AI Thesis Award.
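The “retrieve then detect” idea can be sketched in miniature: first retrieve the policy rules that apply to a dataset’s context, then run detection only against those rules, so the same column is judged differently as the situation changes. This is an illustrative reconstruction, not the UN system; the policy rules, contexts and column names below are all invented.

```python
# Minimal "retrieve then detect" sketch: link a dataset to the policy
# rules relevant to its context, then flag columns against only those
# rules. Policies, contexts and columns are invented for illustration.

POLICIES = [
    {"context": "active_conflict",
     "sensitive": {"gps", "location", "village", "phone"}},
    {"context": "stable",
     "sensitive": {"phone"}},
]

def retrieve_rules(dataset_context):
    """Retrieval step: fetch the rule set matching the dataset's context."""
    for policy in POLICIES:
        if policy["context"] == dataset_context:
            return policy["sensitive"]
    return set()

def detect(columns, dataset_context):
    """Detection step: flag columns named in the retrieved rules, with
    the matched context attached as a minimal explanation."""
    rules = retrieve_rules(dataset_context)
    return {c: f"restricted under '{dataset_context}' policy"
            for c in columns if c in rules}

cols = ["household_id", "village", "phone"]
print(detect(cols, "active_conflict"))  # village and phone flagged
print(detect(cols, "stable"))           # only phone flagged
```

Because the rules are retrieved at query time rather than baked into the detector, reclassifying a region from “stable” to “active_conflict” changes the assessment without retraining anything, which is the dynamic behaviour the article describes.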

For Hulsebos, the UN project illustrates the broader organisational problem of making data both accessible and comprehensible, including understanding sensitivities before information is published on data sharing portals that could feed model training sets. She wants to surface unknown datasets and combinations so people can uncover insights they did not realise were possible, reducing the need to route every question through business intelligence or data science teams. In her view, dependence on dashboards and SQL queries introduces delays until a delivered insight is no longer timely, so she focuses on artificial intelligence powered systems that shorten “speed to insight” by allowing everyone from sales staff to CEOs to query data directly. Concrete tools are in development: one PhD student is building components to automate dataset retrieval and support structured query language generation, with first open source versions expected within the next two months. An earlier tool, DataScout, created during her time at the University of California, Berkeley, already showed in user studies that task based search with large language models helped data scientists find relevant datasets faster than traditional keyword based data platforms, addressing situations where gathering the right data for a machine learning model could otherwise take two weeks to a month.


