Search API meets Typesense: Luca Lusso at DrupalCon Barcelona

Search API meets Typesense: Luca Lusso at DrupalCon Barcelona
Comment

DrupalCon Barcelona has just finished, and as you can imagine, AI was one of the most discussed topics. One of the advantages AI can bring to our websites as customers is helping us find the right piece of content or the right product.

My presentation at the conference was about providing users with the advanced search experience they expect nowadays.

Luca Lusso
Luca Lusso

Typesense as a Search Engine for Drupal

Complex Drupal websites typically use an external search engine to index and retrieve information about content or products. Actual solutions are valid and used on dozens of websites. Still, when they are complex to set up and operate (like Solr or Elasticsearch), others may become very expensive (SaaS solutions like Algolia).

Typesense is an open-source search engine written in C++ that is very easy to deploy, fast, and developer-friendly. There's even an add-on for DDEV to start a local instance quickly.

To prepare for the talk, I scraped 25k forum posts from drupal.org, migrated them into a Drupal 11 website, and indexed them to Typesense and Solr. I index the data into Typesense using a new module that I maintain, Search API Typesense.

The first thing I demonstrated to the audience was how fast Typesense can be. While Solr is fast (WebProfiler measures ~180ms to render a page), every search and every time a facet is selected, triggers a full page reload.

With Typesense, the first search returns in ~8ms, and every subsequent interaction with the SERP reloads the results set.

Typesense supports a search-as-you-type approach, where the results are retrieved while the user is typing the search query; this provides a better user experience, as results are returned quickly.

Module's Features

During the rest of the talk, I showed all the Typesense features the Search API Typesense module implements.

Directly from the Drupal UI, you can manage API keys, configure stopword sets, and insert synonyms.

A notable feature of Typesense, usually found in high-end paid services, is the ability to push content for marketing purposes. You may want to show a specific node at a specific position in the SERP when a user searches for a specific keyword, regardless of the current ranking. Typesense calls this feature "curation", and with Search API Typesense, you can configure it directly from Drupal.

Of course, AI is the other hot topic. Internally, Typesense is also a vector database, and it can use an LLM to generate a vector representation of the contents we send to it. A vector (speaking easily) is a list of float numbers that try to represent the semantic meaning of a text. When you have vectors for content, you can ask Typesense for documents that are semantically similar to the search query. More than that, you can perform a hybrid search and find documents that match some keywords or are semantically similar.

Search API Typesense allows you to configure which fields to generate the vector, which LLM to use, and how to split your text into chunks. But we can go even further and use Typesense as a RAG engine. RAG, or Retrieval-Augmented Generation, is a technique used to converse with an LLM. You insert a search query, and Typesense uses it to retrieve a set of contents. Then, it sends an API to an LLM with the text from the retrieved content and a question like "You are an assistant for question-answering. You can only make conversations based on the provided context...". The response can then be used to print the generated answer and the set of contents used to create it. The questions are called "conversation models" in Typesense, and you can manage them using a UI provided by the Search API Typesense module.

Luca Lusso
Luca Lusso at DrupalCon Barcelona 2024

Building a SERP

The last thing I showed in the presentation was the options a developer can use to build a search results page. Unlike standard Search API workflow, the Search API Typesense module doesn't provide a View integration, mainly because Typesense is intended for use with a JavaScript client.

Luckily, the folks at Algolia have developed an open-source JavaScript library to build complex search UIs called InstantSearch.js.

InstantSearch.js provides widgets to render a search box, a paged list of results, different types of facets, and many more. The search API Typesense module has a reference implementation of such a search experience in the index configuration, so you can interact with the search engine to test your configurations.

Conclusions

Just before the talk, I released the first beta version of the module. It's production-ready, but I want to implement some missing features before releasing the first stable version. Most notably, I want to add support for geo search, collection aliases, search preset, and analytics.

The people in the room seemed satisfied with the talk, as demonstrated by the questions they asked me. Typesense can be a useful tool for improving the search experience on Drupal.

To discover more, here's a link to the slides: https://www.slideshare.net/slideshow/drupal-module-search-api-meets-typesense/272032130

Note: The vision of this web portal is to help promote news and stories around the Drupal community and promote and celebrate the people and organizations in the community. We strive to create and distribute our content based on these content policy. If you see any omission/variation on this please let us know in the comments below and we will try to address the issue as best we can.

Related Organizations

Related Events

Advertisement Here

Upcoming Events

Latest Opportunities

Advertisement Here