
Automating keyword generation with BigQuery, Vertex AI, and Algolia

At Native Instruments, we create virtual instruments and effects — the building blocks of modern music production. Our sounds power everything from cinematic scores to electronic tracks, but here’s the catch: our catalog isn’t just violins or pianos. It’s over a thousand virtual instruments, textures, and effects, many of which defy simple description.

That’s a challenge when your users rely on search to find “their sound.” You can’t search for what you can’t describe. In this post, I’ll show you how we built a solution. I also presented this topic at Algolia DevCon.

The challenge: When metadata fails music

Take our virtual instrument Electric Mint. It emulates a 1960s solid-body electric guitar and comes with hundreds of patterns. Now, imagine a producer hears a funky riff and searches our site. They might type funk, guitar, or surf rock. But they’ll never think to type “Electric Mint.”

Our old metadata model — product name, description, and category — couldn’t bridge that gap. The result? Frustrated users and a no-result rate hovering around 20%.

We needed a way to make our catalog discoverable by intuition, not just by exact terms.

The idea: Let AI listen and describe

The insight was simple: our manual tagging process could never scale or capture the richness of our products. But AI could.

We built an automated keyword generation pipeline that uses BigQuery, Vertex AI, and Algolia Connectors to generate rich, descriptive keywords.

The result? A 10x reduction in no-result pages: from 20% to just 2%. 

Step 1: Building the foundation in BigQuery

Our data already lived in multiple systems — web shop, CMS, shared files. We centralized all of it into BigQuery, our data warehouse, to enable structured transformations and joins.

Using dbt (data build tool), we described our data transformations as SQL models. Think of dbt as “infrastructure as code” for your data.

We cleaned, renamed, and normalized product data to create a consistent, ready-to-transform dataset. This became the staging layer for everything downstream, including our AI keyword generation.
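To make the cleanup step concrete, here is a minimal sketch of what "clean, rename, and normalize" can look like for a single product row. Our pipeline expresses this in dbt SQL models, not Python; the Python below only illustrates the idea. The raw column names (`sku`, `title`, `long_desc`, `cat`) and the `StagedProduct` shape are hypothetical, not our actual schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StagedProduct:
    """One cleaned product row, as a staging model might expose it."""
    product_id: str
    name: str
    description: str
    category: str


# Hypothetical mapping from raw source column names to staging names.
COLUMN_RENAMES = {
    "sku": "product_id",
    "title": "name",
    "long_desc": "description",
    "cat": "category",
}

STAGING_COLUMNS = ("product_id", "name", "description", "category")


def normalize_product(raw: dict) -> StagedProduct:
    """Rename columns, collapse whitespace, and lower-case the category."""
    renamed = {COLUMN_RENAMES.get(key, key): value for key, value in raw.items()}
    # " ".join(text.split()) trims the ends and collapses inner whitespace runs.
    cleaned = {
        column: " ".join(str(renamed.get(column, "")).split())
        for column in STAGING_COLUMNS
    }
    cleaned["category"] = cleaned["category"].lower()
    return StagedProduct(**cleaned)
```

Keeping the rename map as data rather than logic mirrors what a staging model does: one place to see how each source column maps into the warehouse.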

Step 2: Generating AI keywords with Vertex AI

This is where the magic happened. Using Vertex AI’s BigQuery integration, we created a transformation that calls ML.GENERATE_TEXT() on our dataset, essentially prompting the Gemini model directly inside BigQuery.

Our prompt looked like this:

“Generate 15 keywords that can be used for text search on our website.

The keywords should be descriptive and include competitor products, related instruments, artists, genres, movies, albums, and generally related terms that would make it easy to find the product. Output just the keywords separated by commas.”

That’s it. Simple, powerful, and context-rich.
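As a rough sketch of how a prompt like this can be wired into BigQuery, the helper below builds a query that calls `ML.GENERATE_TEXT` once per product row. The function name `build_keyword_query`, the column names, and the way product details are appended to the prompt are all assumptions made for illustration; the model must already exist as a BigQuery remote model, and its name is passed in by the caller.

```python
KEYWORD_PROMPT = (
    "Generate 15 keywords that can be used for text search on our website. "
    "The keywords should be descriptive and include competitor products, "
    "related instruments, artists, genres, movies, albums, and generally "
    "related terms that would make it easy to find the product. "
    "Output just the keywords separated by commas."
)


def build_keyword_query(model: str, source_table: str) -> str:
    """Return BigQuery SQL that prompts the model once per product row.

    `model` and `source_table` are fully qualified names. The columns
    product_id, name, and description are assumptions for this sketch.
    """
    # Escape single quotes so the prompt stays a valid SQL string literal.
    prompt_literal = KEYWORD_PROMPT.replace("'", "\\'")
    return f"""
SELECT
  product_id,
  ml_generate_text_llm_result AS llm_keywords
FROM ML.GENERATE_TEXT(
  MODEL `{model}`,
  (
    SELECT
      product_id,
      CONCAT('{prompt_literal}', ' Product: ', name, '. ', description) AS prompt
    FROM `{source_table}`
  ),
  STRUCT(TRUE AS flatten_json_output)
)
""".strip()
```

`ML.GENERATE_TEXT` reads the input column named `prompt` and, with `flatten_json_output` enabled, returns the generated text in `ml_generate_text_llm_result`, which the sketch renames to `llm_keywords`.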

Each product record got its own set of AI-generated keywords, which we stored in a new column, llm_keywords. For Electric Mint, the model added terms like Stratocaster, surf rock, Pulp Fiction, and Jimi Hendrix. Suddenly, the search felt alive.
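Model output is free text, so it is worth tidying before it reaches the index. The sketch below is one possible cleanup, not necessarily what our pipeline does: it splits the comma-separated response, trims stray whitespace and punctuation, and drops case-insensitive duplicates. The helper name `parse_keywords` is hypothetical.

```python
def parse_keywords(llm_output: str) -> list[str]:
    """Split a comma-separated model response into clean, unique keywords."""
    seen: set[str] = set()
    keywords: list[str] = []
    for part in llm_output.split(","):
        # Collapse whitespace, then trim quotes and trailing periods.
        keyword = " ".join(part.split()).strip(" .\"'")
        key = keyword.casefold()
        if keyword and key not in seen:
            seen.add(key)
            keywords.append(keyword)
    return keywords
```

For example, `parse_keywords("Stratocaster, surf rock, Pulp Fiction, Jimi Hendrix, surf Rock,")` keeps the first four keywords and drops both the duplicate and the empty trailing entry.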

And because the model runs in-context, it even generates localized keywords. Not just translations, but culturally relevant equivalents in French, German, and beyond.

Step 3: Syncing to Algolia with connectors

Finally, we published our enriched dataset to Algolia using the BigQuery Connector.

Within seconds, Algolia indexed every field, including our new llm_keywords column, and made them all searchable.
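The connector handles the sync for us, so no code is needed for this step. Still, it helps to picture what lands in the index. The sketch below shapes one enriched row as an Algolia record and shows an optional `searchableAttributes` setting. The attribute names other than `objectID` and the priority order are assumptions for illustration, not our production configuration.

```python
def to_algolia_record(row: dict) -> dict:
    """Shape one enriched row as an Algolia record.

    Algolia identifies records by `objectID`; the other attribute names
    mirror the hypothetical staging columns used in this sketch.
    """
    return {
        "objectID": str(row["product_id"]),
        "name": row["name"],
        "description": row["description"],
        "category": row["category"],
        # A list lets each keyword match as its own value.
        "llm_keywords": [
            keyword.strip()
            for keyword in row["llm_keywords"].split(",")
            if keyword.strip()
        ],
    }


# Optional index settings: attributes are listed in priority order, so a match
# on the product name outranks a match on a generated keyword.
INDEX_SETTINGS = {
    "searchableAttributes": ["name", "category", "llm_keywords", "description"],
}
```

Ranking generated keywords below the product name is a design choice worth considering: it keeps exact product searches precise while still letting intuitive terms surface the product.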

Now, when users type funk, Hendrix, or surf rock, Electric Mint surfaces instantly. That’s the magic of hybrid human/AI discovery.

Results: From 20% to 2% no-result rate

This pipeline didn’t just improve search; it transformed how users interact with our catalog.

We went from 1 in 5 searches returning nothing to 1 in 50. Engagement metrics followed: more clicks, more time on site, and better conversion.
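The arithmetic behind those figures is simple enough to check directly. This small sketch uses exact fractions to avoid floating-point rounding; the function name `no_result_rate` is hypothetical, and the inputs are the rounded ratios quoted above, not raw search logs.

```python
from fractions import Fraction


def no_result_rate(empty_searches: int, total_searches: int) -> Fraction:
    """Share of searches that returned no results, as an exact fraction."""
    if total_searches <= 0:
        raise ValueError("total_searches must be positive")
    return Fraction(empty_searches, total_searches)


before = no_result_rate(1, 5)    # 1 in 5 searches, i.e. 20%
after = no_result_rate(1, 50)    # 1 in 50 searches, i.e. 2%
reduction_factor = before / after  # how many times smaller the rate became
```

Here `reduction_factor` works out to 10, which is the 10x reduction quoted earlier.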

And all of it was automated, scalable, multilingual, and repeatable.

What’s next

Our AI keyword generation pipeline is now part of our standard content flow. Every new product automatically receives descriptive keywords that evolve with the catalog.

We’re also exploring how this framework can support semantic and hybrid search, blending textual and vector signals for even deeper relevance.

If you’re struggling with abstract or hard-to-describe products — whether that’s sounds, fashion, or art — this approach can help you go from unsearchable to unmissable.

Cheers!
