Vincent Koc adds Japanese, Spanish, and Portuguese query expansion to OpenClaw's full-text search — enabling memory recall to work properly across languages by filtering language-specific stop words.
OpenClaw's memory system uses full-text search (FTS) to recall previous conversations, decisions, and context. When you ask your AI assistant "what did we decide about the project last week?", FTS finds relevant snippets to include in the prompt.
The problem: FTS relies on query expansion — breaking queries into meaningful tokens. In English, you filter out "the", "a", "is". But Japanese, Spanish, and Portuguese have their own stop words that were being included in searches, producing poor recall.
Two related commits landed today, with a third change pending review:
| Commit | Language | Change |
|---|---|---|
| 21cbf59 | Japanese | Add query expansion support for FTS (#23156) |
| 35b162a | Spanish, Portuguese | Add stop words (#23710) |
| #23717 | Arabic | FTS query expansion filtering (pending) |
The implementation adds language-specific stop word lists that get filtered during query expansion. For Japanese, this is particularly complex because the language doesn't use spaces between words — requiring different tokenization strategies.
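As a rough illustration of the idea (not OpenClaw's actual code), the sketch below filters stop words during query expansion. The `expand_query` function, the `STOP_WORDS` table, and the tiny word lists are all hypothetical. The Japanese branch splits on changes of script (kanji, katakana, hiragana) as a crude stand-in for a real tokenizer, since there are no spaces to split on.

```python
import re
import unicodedata

# Tiny illustrative samples only; real stop-word lists are much longer.
STOP_WORDS = {
    "en": {"the", "a", "is", "what", "did", "we", "about"},
    "es": {"el", "la", "los", "las", "de", "que", "en", "sobre", "qué"},
    "pt": {"o", "a", "os", "as", "de", "que", "em", "sobre"},
    "ja": {"は", "が", "を", "に", "の", "で", "と", "も"},
}

# Japanese has no spaces, so split into runs of a single script instead:
# kanji, katakana, hiragana, or ASCII letters/digits.
JA_RUNS = re.compile(
    r"[\u4e00-\u9fff]+|[\u30a0-\u30ff]+|[\u3040-\u309f]+|[a-z0-9]+"
)


def expand_query(query: str, lang: str) -> list[str]:
    """Tokenize a query and drop the stop words for the given language."""
    # Normalize width/composition variants so lookups are consistent.
    text = unicodedata.normalize("NFKC", query).lower()
    if lang == "ja":
        tokens = JA_RUNS.findall(text)
    else:
        # \w is Unicode-aware in Python 3, so accented letters are kept.
        tokens = re.findall(r"\w+", text)
    stop = STOP_WORDS.get(lang, set())
    return [t for t in tokens if t not in stop]


print(expand_query("¿Qué decidimos sobre el proyecto?", "es"))
print(expand_query("先週のプロジェクトの決定", "ja"))
```

With these sample lists, the Spanish query reduces to `decidimos` and `proyecto`, and the Japanese query to `先週`, `プロジェクト`, and `決定`. Script-run splitting breaks down on inflected verbs and mixed kanji/hiragana words, which is why production systems typically rely on a morphological analyzer or n-gram tokenization instead.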
This work reflects a broader pattern: as AI assistants move from English-first demos to global production use, every subsystem needs internationalization. Memory search failing in Japanese means Japanese users get a degraded experience — their assistant forgets things.
OpenClaw's memory architecture uses a combination of full-text search (FTS), semantic similarity, and recency.
All three need to work across languages. FTS is getting fixed now. Semantic similarity already handles multiple languages via embedding models. Recency is language-agnostic.
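To make the interaction between the three signals concrete, here is a minimal sketch of one way they could be blended into a single ranking score. The source does not describe how OpenClaw actually combines them, so the `Candidate` type, the weights, and the half-life decay are all assumptions chosen for illustration.

```python
from dataclasses import dataclass

SECONDS_PER_DAY = 86400.0


@dataclass
class Candidate:
    text: str
    fts_score: float       # keyword match in [0, 1]; depends on query expansion
    semantic_score: float  # embedding similarity in [0, 1]
    created_at: float      # Unix timestamp of the memory


def recency_score(created_at: float, now: float, half_life_days: float = 7.0) -> float:
    """Exponential decay: the score halves every `half_life_days`."""
    age_days = max(0.0, (now - created_at) / SECONDS_PER_DAY)
    return 0.5 ** (age_days / half_life_days)


def blended_score(c: Candidate, now: float,
                  w_fts: float = 0.4, w_sem: float = 0.4, w_rec: float = 0.2) -> float:
    """Hypothetical weighted sum of the three signals."""
    return (w_fts * c.fts_score
            + w_sem * c.semantic_score
            + w_rec * recency_score(c.created_at, now))
```

Under this kind of blend, recency depends only on timestamps, so it behaves the same in every language. The FTS term, by contrast, drops toward zero whenever stop words swamp the query, which drags down the overall rank of otherwise relevant memories.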
Arabic support is already in review (#23717), and the pattern is now established for other languages. The challenge varies by language family.
Expect more language additions as the community expands. The infrastructure is now in place.