Skip to main content

How to Optimize Algolia Search for a Docusaurus i18n Site

When implementing Algolia DocSearch on a multi-language (i18n) site built with Docusaurus, a common challenge is that search results contain a mix of multiple languages. This often leads to a poor user experience, such as English pages appearing in search results when a user is searching in Japanese.

This article serves as a guide, providing concrete steps to solve this problem and achieve an optimal search experience that aligns with the site's display language.

What you'll achieve with this guide
  • Display search results that correspond to the site's current language (Japanese/English).
  • Automatically route content to language-specific Algolia indexes based on the URL path (/ or /en/).
  • Improve search accuracy by configuring indexes tailored to the characteristics of each language (Japanese and English).

Reference Materials

The Challenge: Issues with the Default Setup

When integrating Docusaurus's i18n feature with Algolia using the standard configuration, content from all languages is indexed into a single Algolia index. This causes the following problems:

ChallengeProblem with a Single IndexSolution with Language-Specific Indexes
Mixed Search ResultsJapanese and English content are displayed without distinction, confusing users.Only content matching the current display language is shown accurately.
Reduced Search AccuracyOptimization for language-specific characteristics (word separation, plurals, etc.) is difficult.Accuracy is improved with settings like ignorePlurals tailored to each language.
Complex ManagementAll language data is centralized, making language-specific analysis and adjustments difficult.Data separation by language simplifies management and analysis.
Impact on User Experience

Search results with mixed languages hinder users from finding the information they need and can be a major factor in them leaving the site.

The Solution: Separating into Language-Specific Indexes

The key to solving this problem is to separate content into language-specific indexes using the URL structure.

The implementation proceeds in two steps:

  1. Customize the Algolia Crawler: Identify the URL path (/ or /en/) and define crawl targets for each language. Create dedicated indexes for each language (yoursite_ja, yoursite_en) and apply optimized settings to each.
  2. Configure the Docusaurus Frontend: Enable the contextualSearch option to automatically link the display language to the corresponding Algolia index.

1. Configure Language-Specific Indexes in the Algolia Crawler

First, edit the Algolia crawler configuration file to define a separate crawl process (actions) for each language. This will route content to two different indexes based on the URL path.

Managing the Configuration File

This configuration can be edited directly in the Algolia Crawler dashboard or managed locally as a config.json or .js file and applied via the CLI. The following is an example in JavaScript file format.

crawler-config.js
new Crawler({
// --- Basic Configuration ---
appId: "YOUR_APP_ID",
apiKey: "YOUR_CRAWLER_API_KEY", // Admin API key for the crawler, issued by Algolia
rateLimit: 8,
maxUrls: 5000,
schedule: "at 21:10 on Tuesday",
indexPrefix: "", // Leave empty, as this is controlled on the Docusaurus side

// --- Crawling Configuration ---
startUrls: [
"https://your-docusaurus-site.com/",
"https://your-docusaurus-site.com/en/",
],
discoveryPatterns: ["https://your-docusaurus-site.com/**"],
sitemaps: ["https://your-docusaurus-site.com/sitemap.xml"],

// --- Language-Specific Indexing Rules (Crucial Part) ---
actions: [
// ▼▼▼ Configuration for Japanese Content ▼▼▼
{
indexName: "yoursite_ja",
pathsToMatch: [
"https://your-docusaurus-site.com/**",
"!https://your-docusaurus-site.com/en/**",
],
recordExtractor: ({ $, helpers }) => {
return helpers.docsearch({
recordProps: {
lvl0: {
selectors: ".menu__link.menu__link--sublist.menu__link--active, .navbar__item.navbar__link--active",
defaultValue: "ドキュメンテーション", // Documentation
},
lvl1: ["header h1", "article h1"],
lvl2: "article h2",
lvl3: "article h3",
lvl4: "article h4",
lvl5: "article h5, article td:first-child",
content: "article p, article li, article td:last-child",
lang: { defaultValue: "ja" },
},
aggregateContent: true,
recordVersion: "v3",
});
},
},
// ▼▼▼ Configuration for English Content ▼▼▼
{
indexName: "yoursite_en",
pathsToMatch: ["https://your-docusaurus-site.com/en/**"],
recordExtractor: ({ $, helpers }) => {
return helpers.docsearch({
recordProps: {
lvl0: {
selectors: ".menu__link.menu__link--sublist.menu__link--active, .navbar__item.navbar__link--active",
defaultValue: "Documentation",
},
lvl1: ["header h1", "article h1"],
lvl2: "article h2",
lvl3: "article h3",
lvl4: "article h4",
lvl5: "article h5, article td:first-child",
content: "article p, article li, article td:last-child",
lang: { defaultValue: "en" },
},
aggregateContent: true,
recordVersion: "v3",
});
},
},
],

// --- Initial Settings for Each Language Index ---
initialIndexSettings: {
// ▼▼▼ Search settings for the Japanese index ('yoursite_ja') ▼▼▼
yoursite_ja: {
attributesForFaceting: ["type", "lang", "docusaurus_tag"],
ignorePlurals: false, // Ignoring plurals is not necessary for Japanese
minWordSizefor1Typo: 4,
minWordSizefor2Typos: 8,
searchableAttributes: ["unordered(hierarchy.lvl0)","unordered(hierarchy.lvl1)","unordered(hierarchy.lvl2)","unordered(hierarchy.lvl3)","unordered(hierarchy.lvl4)","content"],
customRanking: ["desc(weight.pageRank)","desc(weight.level)","asc(weight.position)"],
},
// ▼▼▼ Search settings for the English index ('yoursite_en') ▼▼▼
yoursite_en: {
attributesForFaceting: ["type", "lang", "docusaurus_tag"],
ignorePlurals: true, // Ignoring plurals improves accuracy for English
minWordSizefor1Typo: 3,
minWordSizefor2Typos: 7,
searchableAttributes: ["unordered(hierarchy.lvl0)","unordered(hierarchy.lvl1)","unordered(hierarchy.lvl2)","unordered(hierarchy.lvl3)","unordered(hierarchy.lvl4)","content"],
customRanking: ["desc(weight.pageRank)","desc(weight.level)","asc(weight.position)"],
},
},
});

Key Configuration Points

  • actions: This is the core of the configuration. The pathsToMatch property uses URL patterns to determine which content goes to which indexName.
    • Japanese: Targets the entire site (/**) while excluding the English path (!/en/**).
    • English: Targets only the English path (/en/**).
  • initialIndexSettings: Applies language-specific settings to each index (yoursite_ja, yoursite_en). Adjusting ignorePlurals and the typo tolerance (minWordSizefor...Typos) is key to improving search quality.

2. Dynamically Switch Indexes in Docusaurus

Next, configure the Docusaurus frontend to automatically reference the appropriate index based on the browsing language. This is done simply by modifying the algolia configuration in docusaurus.config.ts.

docusaurus.config.ts
import type { Config } from '@docusaurus/types';

const config: Config = {
// ... (Basic site configuration)

i18n: {
defaultLocale: 'ja',
locales: ['ja', 'en'],
},

themeConfig: {
// ...
algolia: {
appId: 'YOUR_APP_ID',
apiKey: 'YOUR_SEARCH_ONLY_API_KEY', // Set the "Search-Only" API key
indexName: 'yoursite', // Specify the common prefix for the index name

// This one line is the key
contextualSearch: true,
},
},
};

export default config;
The Role of contextualSearch: true

Enabling this option allows Docusaurus to automatically detect the current language context (ja or en). It then constructs the final index name (yoursite_ja or yoursite_en) by appending the locale ID as a suffix to the indexName prefix (yoursite) and sends the query.

  • When browsing a Japanese page → Searches yoursite_ja
  • When browsing an English page → Searches yoursite_en

Conclusion

By applying these settings, you can dramatically improve the Algolia search experience on your Docusaurus i18n site.

  • Optimal Search Experience: Users can search only the content in the language they are currently viewing, receiving accurate, noise-free results.
  • High Search Quality: Index settings optimized for each language provide more relevant search results.
  • Excellent Scalability: When adding new languages in the future, you can easily scale by adding a similar pattern to the crawler configuration.