Get text embeddings on Vertex AI

Use Vertex AI's text embeddings API to generate dense vector representations for semantic search, retrieval, and other AI tasks. The API supports gemini-embedding-001 and two smaller embedding models.

This document explains how to create text embeddings with Vertex AI's text embeddings API and how to use them in retrieval and vector search workflows. The service produces dense vectors that capture meaning rather than direct word matches, which makes them useful for semantic search, question answering, and similarity ranking. The vectors are normalized, so cosine similarity, dot product, and Euclidean distance all produce the same similarity ranking.
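To see why normalization makes the three metrics interchangeable for ranking, note that for unit vectors the cosine similarity equals the dot product, and the squared Euclidean distance equals 2 minus twice the dot product. The following is a minimal sketch using only the Python standard library. The three-dimensional vectors are made up for illustration and are not real embeddings.

```python
import math

def normalize(v):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cosine(a, b):
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def rank(query, docs, score, reverse):
    """Return document indices ordered from most to least similar."""
    return sorted(range(len(docs)), key=lambda i: score(query, docs[i]), reverse=reverse)

# Made-up vectors standing in for real embeddings (assumption for illustration).
query = normalize([0.9, 0.1, 0.3])
docs = [normalize(v) for v in ([0.8, 0.2, 0.4], [0.1, 0.9, 0.2], [0.5, 0.5, 0.5])]

by_dot = rank(query, docs, dot, reverse=True)             # higher is closer
by_cosine = rank(query, docs, cosine, reverse=True)       # higher is closer
by_euclid = rank(query, docs, euclidean, reverse=False)   # lower is closer

print(by_dot, by_cosine, by_euclid)
```

All three lists come out in the same order, so the choice of metric can be driven by what the vector store computes most cheaply.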

Vertex AI supports several embedding models. The flagship model is `gemini-embedding-001`, which produces up to 3072-dimensional vectors and is designed for state-of-the-art performance across English, multilingual, and code tasks. Two smaller models, `text-embedding-005` and `text-multilingual-embedding-002`, produce up to 768-dimensional vectors and specialize in English and multilingual tasks respectively. Note that `gemini-embedding-001` supports one instance per request. Also note the service banner: starting April 29, 2025, Gemini 1.5 Pro and Gemini 1.5 Flash models are not available to projects with no prior usage of those models, including new projects.

The API enforces request limits to protect reliability and performance. Each call can include up to 250 input texts, and the total input is capped at 20,000 tokens; exceeding that cap returns a 400 error. Each individual input text is limited to 2,048 tokens. Longer inputs are silently truncated by default, although you can disable silent truncation by setting `autoTruncate` to false. You can also reduce storage and compute costs by specifying `output_dimensionality` to produce smaller embedding vectors, which often retain much of their utility while saving space.
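One way to stay under the per-request limits is to group inputs into batches on the client side before calling the API. The sketch below is illustrative only: `estimate_tokens` is a hypothetical stand-in that counts whitespace-separated words, whereas a real client would need the model's own token count. Since `gemini-embedding-001` is described as taking one instance per request, this kind of batching presumably matters mainly for the smaller models.

```python
MAX_TEXTS_PER_REQUEST = 250      # per-request text limit stated above
MAX_TOKENS_PER_REQUEST = 20_000  # per-request token cap stated above

def estimate_tokens(text):
    """Hypothetical stand-in for a real tokenizer: counts whitespace-separated words."""
    return len(text.split())

def make_batches(texts, count_tokens=estimate_tokens):
    """Group texts into batches that respect the text-count and token limits.

    A single text whose estimate exceeds the token cap is still emitted,
    alone in its own batch, so the caller can decide how to handle it.
    """
    batches, current, current_tokens = [], [], 0
    for text in texts:
        tokens = count_tokens(text)
        too_many_texts = len(current) >= MAX_TEXTS_PER_REQUEST
        too_many_tokens = current_tokens + tokens > MAX_TOKENS_PER_REQUEST
        if current and (too_many_texts or too_many_tokens):
            batches.append(current)
            current, current_tokens = [], 0
        current.append(text)
        current_tokens += tokens
    if current:
        batches.append(current)
    return batches

short_batches = make_batches(["a b c"] * 600)
long_batches = make_batches([" ".join(["word"] * 5000)] * 5)
print([len(b) for b in short_batches], [len(b) for b in long_batches])
```

The first call is bounded by the 250-text limit and the second by the 20,000-token cap, which shows that both limits need to be checked together.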

Practical integration is demonstrated with the Gen AI SDK for Python, including example environment variables and a sample `embed_content` call that requests embeddings for multiple strings and passes optional metadata such as a title and task type. After generating embeddings, you can persist them in a vector database such as Vertex AI Vector Search for low-latency retrieval as your dataset grows. The documentation also links to deeper resources, including model reference pages, supported languages, rate limits, batch prediction guides, tuning tips, and the research behind the embedding models.
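The following sketch shows roughly what such a call could look like. It is not the documentation's own sample. The helper name `embed_texts` is hypothetical, and it takes the client as a parameter so that it can be exercised here with a fake client instead of real credentials. It sends one text per call because `gemini-embedding-001` is described above as supporting one instance per request.

```python
from types import SimpleNamespace

def embed_texts(client, texts, model="gemini-embedding-001",
                task_type="RETRIEVAL_DOCUMENT", title=None,
                output_dimensionality=None):
    """Return one embedding (a list of floats) per input text.

    `client` is expected to expose `models.embed_content(...)`, as the
    Gen AI SDK client does; the config is passed as a plain dict.
    """
    config = {"task_type": task_type}
    if title is not None:
        config["title"] = title
    if output_dimensionality is not None:
        config["output_dimensionality"] = output_dimensionality
    vectors = []
    for text in texts:
        # One text per request, matching the stated limit for gemini-embedding-001.
        response = client.models.embed_content(model=model, contents=text, config=config)
        vectors.append(list(response.embeddings[0].values))
    return vectors

class FakeModels:
    """Test double (not part of any SDK): returns constant vectors of the requested size."""
    def embed_content(self, model, contents, config):
        dims = config.get("output_dimensionality", 4)
        return SimpleNamespace(
            embeddings=[SimpleNamespace(values=[float(len(contents))] * dims)]
        )

fake_client = SimpleNamespace(models=FakeModels())
vectors = embed_texts(fake_client, ["hello", "vector search"], output_dimensionality=3)
print(vectors)
```

With the real SDK installed, the same helper would be called as `embed_texts(genai.Client(), [...])` after `from google import genai`, with `GOOGLE_GENAI_USE_VERTEXAI`, `GOOGLE_CLOUD_PROJECT`, and `GOOGLE_CLOUD_LOCATION` set in the environment. The returned vectors could then be written to a vector store alongside the IDs of the source texts.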
