Algolia takes steps to prevent bias by curating training data and employing rigorous validation processes. We use state-of-the-art public datasets that minimize the risk of bias during pre-production and apply strict curation to customer data in production to detect anomalies, such as bot attacks or power user activity. Furthermore, Algolia has ethical use policies to ensure that AI models are only used for approved purposes, such as producing search results and recommendations, while preventing misuse.
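As a simplified illustration of this kind of event-stream curation (not Algolia's actual pipeline), the sketch below flags user tokens whose traffic volume looks like bot or power-user activity; the thresholds and field names are hypothetical.

```python
from collections import Counter

# Hypothetical thresholds for illustration only -- not Algolia's actual values.
MAX_EVENTS_PER_HOUR = 500     # above this rate, traffic looks automated
MAX_SHARE_OF_TRAFFIC = 0.05   # a single "power user" dominating the stream

def flag_anomalous_users(events):
    """Return userTokens whose hourly event volume looks like bot or
    power-user activity, so they can be excluded from training data.

    `events` is an iterable of dicts with a "userToken" key, covering
    one hour of traffic.
    """
    counts = Counter(e["userToken"] for e in events)
    total = sum(counts.values())
    return {
        token
        for token, n in counts.items()
        if n > MAX_EVENTS_PER_HOUR or n / total > MAX_SHARE_OF_TRAFFIC
    }
```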
"Algolia AI” refers to a suite of advanced AI-powered features designed to enhance search relevance and recommendations. These include NeuralSearch, Dynamic Re-Ranking, Dynamic Synonym Suggestion, Query Categorization, AI Personalization, and certain Recommend features (such as "Frequently Bought Together" and "Looking Similar").
Algolia AI leverages a wide range of technologies, from simple, proven techniques (such as regressions or collaborative filtering) to cutting-edge models (such as large language models, or "LLMs"). Deep learning models, including LLMs, are typically pre-trained by trusted corporations such as Google or Microsoft. Depending on the specific needs of our customers, models may be fine-tuned for better quality and relevance, for example to excel in the e-commerce space.
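To make the simpler end of that spectrum concrete, here is a toy item-item collaborative-filtering sketch in the spirit of "Frequently Bought Together": it recommends the items that co-occur most often in past orders. It illustrates the general technique only, not Algolia's implementation, and all data in it is made up.

```python
from collections import Counter, defaultdict
from itertools import combinations

def frequently_bought_together(orders, top_k=3):
    """Toy item-item collaborative filtering based on co-purchase counts.

    `orders` is a list of sets of objectIDs purchased together. For each
    item, return the `top_k` items it most often co-occurs with.
    """
    co_counts = defaultdict(Counter)
    for order in orders:
        for a, b in combinations(sorted(order), 2):
            co_counts[a][b] += 1
            co_counts[b][a] += 1
    return {
        item: [other for other, _ in counter.most_common(top_k)]
        for item, counter in co_counts.items()
    }

orders = [{"tv", "hdmi-cable"}, {"tv", "hdmi-cable", "soundbar"}, {"soundbar", "hdmi-cable"}]
print(frequently_bought_together(orders))
```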
Algolia AI uses your subscriber data (such as Search Requests and Events, as defined in our Algolia Glossary) to fine-tune models for your specific use case. This data helps Algolia provide more relevant search results and recommendations to your end-users. However, Algolia ensures strict privacy and data security protocols, including anonymization and encryption, to safeguard your subscriber data, as further described in our Security Measures.
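For context, an "Event" in this sense is a small payload describing end-user behavior, sent to Algolia's Insights API. The sketch below shows roughly what such a payload looks like; the credentials, index name, query ID, and object IDs are placeholders, and the exact fields your integration sends may differ.

```python
import requests

APP_ID = "YOUR_APP_ID"    # placeholder credentials
API_KEY = "YOUR_API_KEY"

# A click event tied to a pseudonymous userToken rather than an IP or email.
event = {
    "eventType": "click",
    "eventName": "Product Clicked",
    "index": "products",
    "userToken": "anonymous-user-42",
    "queryID": "YOUR_QUERY_ID",   # returned by the search that preceded the click
    "objectIDs": ["9780545139700"],
    "positions": [1],
}

resp = requests.post(
    "https://insights.algolia.io/1/events",
    headers={"X-Algolia-Application-Id": APP_ID, "X-Algolia-API-Key": API_KEY},
    json={"events": [event]},
)
resp.raise_for_status()
```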
Yes, but only within your specific production environment. Algolia uses your data to train the AI models that serve your search and recommendation features. However, your data is not used to train models for other customers, and your training data remains isolated.
The nature of Algolia AI services is that they rely on customer data to provide tailored results. Therefore, it is not possible to opt out of model training since your data is essential for fine-tuning AI models to fit your specific needs.
No, Algolia does not share your data with other customers. All data used for training and model fine-tuning is strictly siloed, ensuring that your data is only used for the services provided to you and is not shared across customers.
Algolia enforces privacy protections through strict compliance with global data privacy regulations, including GDPR. We apply data minimization techniques and pseudonymization (such as replacing IP addresses with userTokens), and we ensure that personal data (PII) is removed or anonymized before being used in any AI training processes. Our AI models do not contain PII, except in the case of user-tailored features like AI Personalization.
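As an illustration of pseudonymization (not Algolia's actual scheme), a keyed hash can turn a raw identifier such as an IP address into a stable userToken without retaining the original value:

```python
import hashlib
import hmac

SECRET_KEY = b"server-side-secret"  # hypothetical key, stored and rotated server-side

def pseudonymize(raw_identifier: str) -> str:
    """Map a raw identifier (e.g. an IP address) to a stable pseudonymous token."""
    digest = hmac.new(SECRET_KEY, raw_identifier.encode(), hashlib.sha256)
    return "user-" + digest.hexdigest()[:16]

print(pseudonymize("203.0.113.42"))  # prints a token, never the IP itself
```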
Algolia secures AI models through robust encryption, both in transit and at rest (AES-256 encryption). We use strict role-based access control (RBAC) and least-privilege principles to limit access to sensitive data. Customer data is logically segregated in the cloud, and models are trained in isolation. Our security protocols, including regular auditing, follow industry-leading standards such as SOC 2, ISO 27001, and ISO 27017.
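For readers unfamiliar with what "AES-256 at rest" means in practice, the snippet below shows the general pattern using the Python cryptography library; it is purely illustrative and says nothing about Algolia's actual key management.

```python
import os
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

key = AESGCM.generate_key(bit_length=256)   # 256-bit key => AES-256
aesgcm = AESGCM(key)

plaintext = b'{"objectID": "9780545139700", "name": "example record"}'
nonce = os.urandom(12)                      # must be unique per encryption
ciphertext = aesgcm.encrypt(nonce, plaintext, None)

# Only a holder of the key (and the nonce) can recover the record.
assert aesgcm.decrypt(nonce, ciphertext, None) == plaintext
```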
Yes. Algolia offers a suite of generative AI tools (“Generative Experiences”) that help our customers build new web experiences for their end users. Below are answers to commonly asked questions about Algolia's Generative Experiences.
What is Algolia's Generative Experiences?
Algolia's Generative Experiences refers to tools, including Guides, that enable Algolia customers to apply generative AI capabilities to their Algolia records. Generative Experiences is currently powered by third-party large language model (LLM) providers via an API.
Can third-party LLM providers train on my data?
No. Our agreements with third-party LLM providers do not permit them to train their models on data that belongs to our customers. Algolia may, however, use your data to fine-tune LLMs provided by third parties in order to provide Generative Experiences tools.
Are Algolia Generative Experiences inputs and outputs considered Subscriber Data?
When using Generative Experiences, inputs provided by customers and the resulting outputs are Subscriber Data. Although Generative Experiences uses leading LLMs to generate content, generative AI technology is still developing, and outputs can contain inaccuracies, inconsistencies, or other errors. We recommend that customers implement human review of any output generated by Generative Experiences.
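One way to act on that recommendation, sketched here with hypothetical names and data, is to route every generated output through a review queue and only publish items a person has approved:

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class GeneratedGuide:
    """Hypothetical wrapper for one piece of LLM-generated content."""
    record_id: str
    text: str

@dataclass
class ReviewQueue:
    pending: List[GeneratedGuide] = field(default_factory=list)
    published: List[GeneratedGuide] = field(default_factory=list)

    def submit(self, guide: GeneratedGuide) -> None:
        # Never publish generated text directly; hold it for a human reviewer.
        self.pending.append(guide)

    def approve(self, record_id: str) -> None:
        # Called once a reviewer has checked the output for inaccuracies.
        for guide in list(self.pending):
            if guide.record_id == record_id:
                self.pending.remove(guide)
                self.published.append(guide)

queue = ReviewQueue()
queue.submit(GeneratedGuide("9780545139700", "Generated buying guide text..."))
queue.approve("9780545139700")   # goes live only after human sign-off
```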