Artificial intelligence faces book shortage as libraries weigh data partnerships

Artificial intelligence models are hungry for more data, but access to books is becoming a new bottleneck for chatbot training.

Artificial intelligence systems, particularly those powering modern chatbots, rely on vast amounts of text to improve their conversational skills and their grasp of human knowledge. Historically, much of this text came from the open internet: blog posts, forums, social media, and websites. As these sources are increasingly exhausted or restricted, the focus is shifting toward books, which offer richer, professionally edited material that is in many cases under copyright.

This hunger for text has driven technology companies to pursue partnerships with libraries and publishers. Books offer a trove of knowledge that differs from the brief, conversational tone of internet text. For artificial intelligence developers, access to structured, high-quality book data can enhance the depth and reliability of chatbots, making them more useful and authoritative. However, these efforts raise issues around copyright, ethics, and access. Libraries, the traditional stewards of knowledge, now find themselves at the intersection of competing interests: supporting educational technology while defending author rights and public access.

Negotiations between data-hungry firms and libraries are complex, often involving questions about licensing, privacy, and the fair compensation of authors. As the industry wrestles with these challenges, the decisions made in the next few years could reshape how books are consumed, how artificial intelligence learns, and how society balances information access with creator rights. The future of artificial intelligence knowledge, it seems, depends as much on the willingness of libraries and publishers to collaborate as it does on technological advances themselves.
