Search API meets Typesense: Luca Lusso at DrupalCon Barcelona
DrupalCon Barcelona has just finished, and as you can imagine, AI was one of the most discussed topics. One of the advantages AI can bring to our websites is helping customers find the right piece of content or the right product.
My presentation at the conference was about providing users with the advanced search experience they expect nowadays.
Typesense as a Search Engine for Drupal
Complex Drupal websites typically use an external search engine to index and retrieve information about content or products. Current solutions are valid and used on dozens of websites. Still, some are complex to set up and operate (like Solr or Elasticsearch), while others can become very expensive (SaaS solutions like Algolia).
Typesense is an open-source search engine written in C++ that is very easy to deploy, fast, and developer-friendly. There's even an add-on for DDEV to start a local instance quickly.
To prepare for the talk, I scraped 25k forum posts from drupal.org, migrated them into a Drupal 11 website, and indexed them in both Typesense and Solr. I indexed the data into Typesense using a new module that I maintain, Search API Typesense.
The first thing I demonstrated to the audience was how fast Typesense can be. Solr is fast (WebProfiler measures ~180ms to render a page), but every search and every facet selection triggers a full page reload.
With Typesense, the first search returns in ~8ms, and every subsequent interaction with the SERP reloads only the result set rather than the whole page.
Typesense supports a search-as-you-type approach, where the results are retrieved while the user is typing the search query; this provides a better user experience, as results are returned quickly.
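As a rough illustration of what search-as-you-type means at the HTTP level, here is a minimal sketch that builds the search URL a client could request on each keystroke. The host, the "forum_posts" collection and the "title" and "body" fields are assumptions made for this example, not names taken from the demo site. The q, query_by, prefix and per_page parameters are standard Typesense search parameters.

```typescript
// Sketch only: host, collection and field names are hypothetical.
interface SearchAsYouTypeOptions {
  host: string;        // e.g. a local Typesense instance
  collection: string;  // name of the Typesense collection
  queryBy: string[];   // fields to search in
  perPage?: number;    // number of results per request
}

// Build the URL for one keystroke. An empty input falls back to "*",
// which Typesense treats as "match everything".
function buildSearchUrl(typed: string, opts: SearchAsYouTypeOptions): string {
  const text = typed.trim();
  const params: Array<[string, string]> = [
    ["q", text === "" ? "*" : text],
    ["query_by", opts.queryBy.join(",")],
    ["prefix", "true"], // treat the last word as a prefix while the user is still typing
    ["per_page", String(opts.perPage === undefined ? 10 : opts.perPage)],
  ];
  const query = params
    .map(([key, value]) => `${encodeURIComponent(key)}=${encodeURIComponent(value)}`)
    .join("&");
  return `${opts.host}/collections/${encodeURIComponent(opts.collection)}/documents/search?${query}`;
}

const demoOptions: SearchAsYouTypeOptions = {
  host: "http://localhost:8108",
  collection: "forum_posts",
  queryBy: ["title", "body"],
};

// Simulate a user typing "drupal" one character at a time.
for (const partial of ["d", "dr", "dru", "drup", "drupa", "drupal"]) {
  console.log(buildSearchUrl(partial, demoOptions));
}
```

A real client would send each request with a search-only API key in the X-TYPESENSE-API-KEY header and would usually debounce keystrokes to avoid sending more requests than needed.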
Module's Features
During the rest of the talk, I showed all the Typesense features the Search API Typesense module implements.
Directly from the Drupal UI, you can manage API keys, configure stopword sets, and insert synonyms.
A notable feature of Typesense, usually found in high-end paid services, is the ability to push content for marketing purposes. You may want to show a specific node at a specific position in the SERP when a user searches for a specific keyword, regardless of the current ranking. Typesense calls this feature "curation", and with Search API Typesense, you can configure it directly from Drupal.
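To make curation more concrete, here is a minimal sketch of a rule that pins one document to the first position for a given keyword. The rule/includes shape follows the Typesense overrides format as I understand it, so check it against the documentation for your server version. The keyword and the node IDs are made up, and applyPins is only a local illustration of the effect, not how Typesense implements it.

```typescript
// Sketch only: keyword, document IDs and positions are made-up examples.
interface Pin {
  id: string;       // document ID as stored in the Typesense collection
  position: number; // 1-based position in the results
}

interface CurationRule {
  rule: { query: string; match: "exact" | "contains" };
  includes: Pin[];
}

// Build the JSON body for a rule that pins documents when the query matches exactly.
function buildCuration(keyword: string, pins: Pin[]): CurationRule {
  return {
    rule: { query: keyword, match: "exact" },
    includes: pins.slice().sort((a, b) => a.position - b.position),
  };
}

// Local illustration of the effect: pinned IDs take their positions,
// everything else keeps its organic order around them.
function applyPins(ranked: string[], pins: Pin[]): string[] {
  const result = ranked.filter((id) => !pins.some((pin) => pin.id === id));
  const ordered = pins.slice().sort((a, b) => a.position - b.position);
  for (const pin of ordered) {
    const index = Math.min(Math.max(pin.position - 1, 0), result.length);
    result.splice(index, 0, pin.id);
  }
  return result;
}

const organic = ["node-10", "node-11", "node-12", "node-42"];
const curation = buildCuration("migrate", [{ id: "node-42", position: 1 }]);

console.log(JSON.stringify(curation));
console.log(applyPins(organic, curation.includes));
```

With this rule, "node-42" moves to the top for the keyword "migrate", whatever its organic ranking was.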
Of course, AI is the other hot topic. Internally, Typesense is also a vector database, and it can use an LLM to generate a vector representation of the content we send to it. A vector (simply put) is a list of floating-point numbers that tries to represent the semantic meaning of a text. Once you have vectors for your content, you can ask Typesense for documents that are semantically similar to the search query. More than that, you can perform a hybrid search and find documents that match some keywords or are semantically similar.
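To give an intuition of what "semantically similar" means in practice, here is a minimal sketch that ranks documents by cosine similarity between their vectors and the vector of the query. Cosine similarity is a common choice for comparing embeddings; which metric a given Typesense setup uses is something to verify. The three-dimensional vectors and the titles are toy values invented for this example, since real embeddings have hundreds of dimensions.

```typescript
// Cosine similarity: 1 means same direction, 0 means unrelated (orthogonal).
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length === 0 || a.length !== b.length) {
    throw new Error("Vectors must be non-empty and have the same length");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) {
    return 0; // a zero vector carries no direction, treat it as unrelated
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

interface EmbeddedDoc {
  title: string;
  vector: number[];
}

// Return a new array with the documents closest to the query first.
function rankBySimilarity(query: number[], docs: EmbeddedDoc[]): EmbeddedDoc[] {
  return docs
    .slice()
    .sort((x, y) => cosineSimilarity(query, y.vector) - cosineSimilarity(query, x.vector));
}

// Toy data: made-up vectors standing in for real embeddings.
const docs: EmbeddedDoc[] = [
  { title: "Cache tuning", vector: [0, 0.2, 0.9] },
  { title: "Install a module", vector: [0.9, 0.1, 0] },
];
const queryVector = [1, 0, 0];

for (const doc of rankBySimilarity(queryVector, docs)) {
  console.log(doc.title, cosineSimilarity(queryVector, doc.vector).toFixed(3));
}
```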
Search API Typesense allows you to configure which fields are used to generate the vector, which LLM to use, and how to split your text into chunks. But we can go even further and use Typesense as a RAG engine. RAG, or Retrieval-Augmented Generation, is a technique used to converse with an LLM. You insert a search query, and Typesense uses it to retrieve a set of contents. Then, it sends an API request to an LLM with the text from the retrieved content and a prompt like "You are an assistant for question-answering. You can only make conversations based on the provided context...". The response can then be used to print the generated answer and the set of contents used to create it. These prompts are part of what Typesense calls "conversation models", and you can manage them using a UI provided by the Search API Typesense module.
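The two ideas in this paragraph, chunking and RAG, can be sketched in a few lines. This is a generic illustration under my own assumptions and not the code used by Typesense or by the module: chunkWords splits a text into overlapping word windows, and buildRagPrompt assembles the text that would be sent to the LLM from a system prompt, the retrieved contents and the user's question. Both function names, the chunk sizes and the sample texts are hypothetical.

```typescript
// Split a text into chunks of `size` words, each sharing `overlap` words
// with the previous chunk so that sentences cut at a boundary keep some context.
function chunkWords(text: string, size: number, overlap: number): string[] {
  if (size <= 0 || overlap < 0 || overlap >= size) {
    throw new Error("Expected size > 0 and 0 <= overlap < size");
  }
  const words = text.split(/\s+/).filter((word) => word.length > 0);
  const chunks: string[] = [];
  const step = size - overlap;
  for (let start = 0; start < words.length; start += step) {
    chunks.push(words.slice(start, start + size).join(" "));
    if (start + size >= words.length) {
      break; // the last window already reached the end of the text
    }
  }
  return chunks;
}

// Assemble the prompt: instructions first, then the retrieved context, then the question.
function buildRagPrompt(systemPrompt: string, contexts: string[], question: string): string {
  const numbered = contexts.map((context, index) => `[${index + 1}] ${context}`).join("\n");
  return `${systemPrompt}\n\nContext:\n${numbered}\n\nQuestion: ${question}`;
}

// Made-up content standing in for documents retrieved by the search engine.
const retrieved = chunkWords(
  "Drush is a command line tool for Drupal that can clear caches and run updates",
  8,
  2,
);

console.log(
  buildRagPrompt(
    "You are an assistant for question-answering. You can only make conversations based on the provided context.",
    retrieved,
    "How do I clear the cache?",
  ),
);
```

The overlap is a trade-off: larger values preserve more context across chunk boundaries but produce more chunks to embed and store.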
Building a SERP
The last thing I showed in the presentation was the options a developer can use to build a search results page. Unlike the standard Search API workflow, the Search API Typesense module doesn't provide a Views integration, mainly because Typesense is intended for use with a JavaScript client.
Luckily, the folks at Algolia have developed an open-source JavaScript library to build complex search UIs called InstantSearch.js.
InstantSearch.js provides widgets to render a search box, a paged list of results, different types of facets, and much more. The Search API Typesense module has a reference implementation of such a search experience in the index configuration, so you can interact with the search engine to test your configurations.
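For readers who want to build their own SERP, here is a minimal sketch of how InstantSearch.js is usually connected to Typesense through the typesense-instantsearch-adapter package. This is not the module's reference implementation. The API key, host, collection name ("forum_posts"), field names and container selectors are all placeholders. The runnable part only builds the adapter options; the browser wiring is shown as comments because it needs the two npm packages and a DOM.

```typescript
// Sketch only: key, host, collection and field names are placeholders.
interface TypesenseNode {
  host: string;
  port: number;
  protocol: "http" | "https";
}

interface AdapterOptions {
  server: { apiKey: string; nodes: TypesenseNode[] };
  additionalSearchParameters: { query_by: string };
}

// Build the options object expected by the Typesense InstantSearch adapter.
// Use a search-only API key here, since this configuration is sent to the browser.
function buildAdapterOptions(
  searchOnlyKey: string,
  node: TypesenseNode,
  queryBy: string[],
): AdapterOptions {
  return {
    server: { apiKey: searchOnlyKey, nodes: [node] },
    additionalSearchParameters: { query_by: queryBy.join(",") },
  };
}

const adapterOptions = buildAdapterOptions(
  "search-only-key",
  { host: "localhost", port: 8108, protocol: "http" },
  ["title", "body"],
);
console.log(JSON.stringify(adapterOptions, null, 2));

// In a browser bundle, the wiring would look roughly like this (not executed here):
//
//   import instantsearch from "instantsearch.js";
//   import { searchBox, hits, refinementList } from "instantsearch.js/es/widgets";
//   import TypesenseInstantSearchAdapter from "typesense-instantsearch-adapter";
//
//   const adapter = new TypesenseInstantSearchAdapter(adapterOptions);
//   const search = instantsearch({
//     searchClient: adapter.searchClient,
//     indexName: "forum_posts",
//   });
//   search.addWidgets([
//     searchBox({ container: "#searchbox" }),
//     refinementList({ container: "#forum-facet", attribute: "forum" }),
//     hits({ container: "#hits" }),
//   ]);
//   search.start();
```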
Conclusions
Just before the talk, I released the first beta version of the module. It's production-ready, but I want to implement some missing features before releasing the first stable version. Most notably, I want to add support for geo search, collection aliases, search presets, and analytics.
The people in the room seemed satisfied with the talk, as demonstrated by the questions they asked me. Typesense can be a useful tool for improving the search experience on Drupal.
To discover more, here's a link to the slides: https://www.slideshare.net/slideshow/drupal-module-search-api-meets-typesense/272032130