Drupal 10's Cache API: How It's Setting New Standards in Web Speed

Recently, I stumbled upon an eye-opening article that placed Drupal at the top of PHP framework performance charts, especially with the latest version of PHP (8.3). This revelation sparked a question: why is Drupal, a platform known for its complexity and scale, outperforming its peers in speed and efficiency?

I wrote about this in a LinkedIn post, which garnered much attention and comments, so I thought it would be good to expand on the concept and add more details. If you want to know why Drupal 10 sets new standards for web performance, read on.

Image source: PHP Benchmarks: Real-World Speed Tests for Versions 8.1, 8.2, and 8.3

Drupal Is Big

At first glance, it seems counterintuitive. The typical Drupal 10 application, even without any advanced or custom code, is undoubtedly one of the most complex Symfony applications in the world. If you want proof, check out how many composer.json files are in Drupal core - 154 as of version 10.2. A more thorough investigation would show that many of those files are small and either only indicate requirements for PHP versions or require many of the same libraries in different combinations. Still, it's a great indicator of how advanced and complex Drupal is before you even start building.

When the modern OOP rewrite of core happened during Drupal 8's development, I remember how the community found and patched a ton of bugs in PHP and Symfony. The scope of the project was larger than anyone had really tried to do before in PHP, and so we found a lot of edge cases that no one had seen. This is a great example of how the open-source community can collaborate across projects to improve things.

So, the question is this - how does such a colossal framework manage to rival, or even surpass, the performance of streamlined platforms like Symfony and Laravel, which are many times smaller, or even get close to frameworks like CodeIgniter, which are fine-tuned for speed?

While there are undoubtedly several factors at play, my instincts tell me that a major contributor to this feat is Drupal's sophisticated caching layer - one of the best-kept secrets of the framework and an unsung hero.

Caching Is Hard

Before we get too much further into how Drupal does its thing, let's take a moment to understand why caching is such a problem in the first place. Actually, caching itself is easy - that is, just storing a copy of something. It is knowing when and how to invalidate (or delete) a cache that is hard.

There are only two hard things in Computer Science: cache invalidation, naming things, and off-by-one errors.

Phil Karlton, former Netscape developer

Imagine you're doing a school project on the solar system. You find a great book in the library and use it for your project. The next day, you need the book again, but instead of going to the library, you find it on your desk. That's like caching—keeping something handy so you can quickly use it again without going through the whole search process.

Now, what if that book gets updated with new information about a newly discovered planet? You'll miss out on the latest info if you keep using your old book. This is where 'cache invalidation' comes in. It's like someone telling you, "Hey, your book is outdated. Here's the new version." In computer terms, it means updating the stored (or cached) information so that you always have the most current data.

But the tricky part is knowing exactly when to replace the old information with new information. If you do it too often, it's like someone constantly swapping your book, even for tiny updates, which can be annoying and unnecessary. But if you do it too rarely, you might miss out on important new information.

Understanding the Web Application Caching Problem

In the world of web applications, this is a big deal. Apps and sites need to show the latest information, like news updates or product prices, but they also need to load fast. Balancing these two goals (keeping the website speedy with caching and ensuring the information is up-to-date with cache invalidation) is a real brain teaser.

This is even tougher when you talk about a system like Drupal (or any modern CMS) because the content itself is usually dynamically generated. That leads to a few problems to solve:

  • Building the Content Takes Time: The server has to receive the request for the content, figure out if you have access to it (authentication and access checks), make the proper call to the database to get the right content, make modifications as needed, generate the results as HTML, and then deliver the response to the client. The whole process may only take a second or two, but that adds up over time. This is not only annoying to users but can also hurt SEO.
  • Building the Content Doesn't Scale: Generating a page for one user may take only a second or two, but if the server has to do that for every request, it can quickly max out the ability of the server or database to respond. Under load, a 1-2 second build stretches to 5, 10, 30, or 60 seconds or more for each page. Eventually, the server crashes from overload. No matter how big, powerful, or numerous your servers are, even modest amounts of traffic can knock you out of commission… and that's an expensive proposition.
  • Saving the Content Takes Care: The basic idea is to save the generated response so the subsequent request can use the saved version instead of rebuilding it. However, then we have to determine the best way to save it, the fastest way to return it, and a way to organize all of these cached or saved pages in the first place. This is actually the easiest problem to solve, and even the most primitive of frameworks have some basic caching strategy.
  • Clearing the Cache Is the Real Issue: Even when we have a way to save the results, we need to delete or clear those results to get the updated version to the user. Ideally, we want this done with as little delay as possible and without requiring the developer to spend much time and energy managing the process.

So, we understand that “cache invalidation” is a complex problem, but why? Typically, this is done with time-based caches (cache expiration), which means you will always have out-of-date content somewhere. It may be ok for some applications and websites if a page has “stale” content for a few hours or more. However, modern users and organizations expect data to be up-to-date and timely, and in some cases, that may be a legal requirement.
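To make the stale-content problem concrete, here is a minimal sketch of a time-based cache in Python. It is a toy model, not Drupal code: the `ExpiringCache` class, the fake clock, and the product page are all made up for illustration.

```python
import time


class ExpiringCache:
    """Toy time-based cache: an entry is trusted until its TTL runs out."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock   # injectable so the example can fake time
        self._items = {}     # key -> (value, expires_at)

    def set(self, key, value):
        self._items[key] = (value, self.clock() + self.ttl)

    def get(self, key):
        entry = self._items.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self.clock() >= expires_at:
            del self._items[key]   # expired: the caller must rebuild
            return None
        return value


# A fake clock keeps the example deterministic.
now = [0.0]
cache = ExpiringCache(ttl_seconds=3600, clock=lambda: now[0])

price_in_db = "$10"
cache.set("product-page", f"Widget costs {price_in_db}")

price_in_db = "$12"    # the source data changes...
now[0] = 1800          # ...30 minutes into a 1-hour TTL
stale_page = cache.get("product-page")     # still the old $10 page

now[0] = 3600          # the TTL has elapsed
expired_page = cache.get("product-page")   # None, so the page is rebuilt
```

The cache has no idea the price changed, so for the rest of the hour it keeps serving the old page. Shortening the TTL narrows that window, but only by rebuilding more often.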

Another approach is to “clear the cache” regularly, which is the basic strategy that earlier versions of Drupal (7 and below) took. This reduces the effectiveness of the caching strategy since the cache has to be rebuilt more often. That means a higher number of slow page loads, more server impact, and potential scalability limits.

The best solution would be to only clear the parts of the cache that need it when there are changes to be pushed out. It would be like having a magical book that updates itself with the right information at the right time, ensuring you always have the quickest and most accurate information for your project.

Drupal 10 Is Incredibly Smart

So, when we talk about Drupal 10 and its excellent caching system, we really appreciate how it solves one of the biggest challenges in computer science and makes it easy for us. That is the most impressive part of all. Without doing anything special or taking any specific steps, Drupal 10 provides one of the most sophisticated cache invalidation systems available in the world… right out of the box.

It. Just. Works.

However, Drupal's prowess isn't just in caching page responses. It goes further by caching dynamic views at both the query and output levels, with adjustable settings for each. The brilliance of its cache tagging system allows for nuanced cache clearing, not just for individual items or bins but for related cache tags as well.

Imagine updating an image and having the system intelligently clear the cache for all content and views that utilize that image without purging the entire cache. This level of precision is a game-changer.
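The image scenario can be sketched with a small tag-aware cache. This is a conceptual model in Python, not Drupal's implementation (Drupal tracks tag invalidations differently under the hood). The `TaggedCache` class and the cache IDs are hypothetical; the tag names follow Drupal's `entity_type:id` convention.

```python
from collections import defaultdict


class TaggedCache:
    """Toy cache where each entry lists the things it depends on."""

    def __init__(self):
        self._items = {}                      # cache id -> value
        self._ids_by_tag = defaultdict(set)   # tag -> cache ids using it

    def set(self, cid, value, tags=()):
        self._items[cid] = value
        for tag in tags:
            self._ids_by_tag[tag].add(cid)

    def get(self, cid):
        return self._items.get(cid)

    def invalidate_tags(self, tags):
        # Drop every entry that declared a dependency on any given tag.
        for tag in tags:
            for cid in self._ids_by_tag.pop(tag, set()):
                self._items.pop(cid, None)


cache = TaggedCache()
cache.set("page:/about", "<html>About us</html>", tags=["node:1", "media:7"])
cache.set("view:gallery", "<ul>gallery</ul>", tags=["media:7", "media:8"])
cache.set("page:/contact", "<html>Contact</html>", tags=["node:2"])

# The image stored as media item 7 is replaced.
cache.invalidate_tags(["media:7"])

about = cache.get("page:/about")       # gone: it embedded the image
gallery = cache.get("view:gallery")    # gone: the view listed the image
contact = cache.get("page:/contact")   # untouched
```

Only the two entries that declared a dependency on the image are rebuilt. Everything else stays warm.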

Moreover, Drupal doesn't stop at unauthenticated users. It extends its advanced caching capabilities to authenticated users, seamlessly managing a blend of common cache, authenticated content, and personalized data.
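One way to picture that blend is a cached page "shell" with a placeholder that is filled in per user. The sketch below is a deliberately simplified Python model, not Drupal's render API. The placeholder token and function names are invented for this example.

```python
PLACEHOLDER = "{{user_greeting}}"

shell_cache = {}
shell_builds = 0   # counts how often the expensive part runs


def build_shell():
    """Expensive, shared part of the page (menus, layout, common content)."""
    global shell_builds
    shell_builds += 1
    return f"<nav>Home | Courses</nav><main>{PLACEHOLDER}</main>"


def render_page(username):
    # The shell is built once and reused for every logged-in user.
    if "shell" not in shell_cache:
        shell_cache["shell"] = build_shell()
    # Only the small personalized fragment is generated per request.
    greeting = f"Welcome back, {username}"
    return shell_cache["shell"].replace(PLACEHOLDER, greeting)


page_a = render_page("alice")
page_b = render_page("bob")
```

Both users get their own greeting, yet the shared markup was built only once. Personal data never enters the shared cache entry, which is what keeps one user's content from leaking to another.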

Building a Drupal application means you're not just creating a website. You're inheriting a state-of-the-art content management and response caching system, pre-configured and ready for action. And for those who delve deeper, the system offers everything you need for tuning and optimization, with custom code effortlessly integrating into this robust framework.

A Technical Look at Drupal's Caching API

Drupal's caching system is a masterpiece of engineering, designed to handle the complexities of large-scale, dynamic web applications. At its core, this system is built on several key components that work in harmony to deliver this amazing performance.

  1. Cache Contexts: These are the backbone of Drupal's intelligent caching strategy. Cache contexts provide a way to vary cached content based on specific conditions or scenarios. For instance, different users might see different cached content based on their roles or geographic locations. This flexibility allows Drupal to deliver personalized experiences without sacrificing speed.
  2. Cache Tags: Drupal's cache tagging system is an elegant approach to cache invalidation. Each cache item is tagged with identifiers related to its content. When a piece of content is updated, only the cache items with matching tags are invalidated. This targeted approach prevents the need for wholesale cache clearing, which is a common pain point in less advanced systems.
  3. Architecture: Drupal's caching architecture is a layered system, allowing for different levels of caching. Each layer is optimized for specific use cases, from page caching for anonymous users to dynamic page caches for authenticated users. Additionally, integrating external caching systems like Varnish or Memcached further enhances Drupal's ability to scale and perform under heavy loads.
  4. Query and Output Level Caching for Views: One of Drupal's standout features is its ability to cache views at both the query and output levels. This means that both the database queries and the rendered output of views are cached. Such granularity ensures that Drupal can efficiently manage resources, reducing database load and speeding up content delivery.
  5. Advanced Caching for Authenticated Users: Traditionally, caching for authenticated users has been challenging due to the dynamic nature of personalized content. However, Drupal's advanced caching mechanisms handle this elegantly, caching common elements while dynamically generating user-specific content. This ensures that even logged-in users enjoy fast loading times.
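As a rough illustration of the first item, cache contexts can be thought of as extra ingredients in the cache key. The Python sketch below is an assumption-laden model rather than Drupal's code. `build_cache_key` and `render_price_block` are hypothetical helpers, and the role is simplified to a single string. The context name `user.roles` is borrowed from Drupal.

```python
cache = {}
render_count = 0   # how many times the block was actually built


def build_cache_key(base_id, contexts, request):
    """Fold the value of each declared context into the cache key."""
    parts = [base_id]
    for name in sorted(contexts):
        parts.append(f"[{name}]={request[name]}")
    return ":".join(parts)


def render_price_block(request):
    global render_count
    key = build_cache_key("price_block", ["user.roles"], request)
    if key not in cache:
        render_count += 1
        price = "8.00" if request["user.roles"] == "wholesaler" else "10.00"
        cache[key] = f"<div>Price: {price}</div>"
    return cache[key]


first = render_price_block({"user.roles": "wholesaler"})
second = render_price_block({"user.roles": "wholesaler"})   # cache hit
third = render_price_block({"user.roles": "customer"})      # new variation
```

Three requests produce two cache entries, one per role. The block is cached once for every variation that matters and for nothing else.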

As if this wasn’t good enough, there are dozens of community modules that will extend, augment, or let you adjust the caching system to suit your needs. From there, you can also create your own custom modules to let you do pretty much anything you want.

Did you know that you can configure Drupal to not only clear its own cache, but also clear “downstream” caches like Varnish, Memcached, or your CDN? This means that Drupal’s amazing system can effectively take advantage of and manage additional layers of caching with ease.
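Downstream purging usually works by sending the tags along with the response so the proxy or CDN can remember them. Here is a minimal Python sketch of that idea. The `EdgeCache` class and the `Cache-Tags` header name are placeholders for illustration, since real proxies and CDNs each use their own header names and purge APIs.

```python
class EdgeCache:
    """Toy stand-in for a reverse proxy or CDN node."""

    def __init__(self):
        self._responses = {}   # url -> (body, set of tags)

    def store(self, url, body, headers):
        # Remember which tags the origin attached to this response.
        tags = set(headers.get("Cache-Tags", "").split())
        self._responses[url] = (body, tags)

    def lookup(self, url):
        entry = self._responses.get(url)
        return entry[0] if entry else None

    def purge_tag(self, tag):
        # Called by the origin when something carrying this tag changes.
        stale = [url for url, (_, tags) in self._responses.items() if tag in tags]
        for url in stale:
            del self._responses[url]
        return sorted(stale)


edge = EdgeCache()
edge.store("/news", "<html>News</html>", {"Cache-Tags": "node:5 node_list"})
edge.store("/about", "<html>About</html>", {"Cache-Tags": "node:9"})

purged = edge.purge_tag("node:5")   # the origin reports that node 5 changed
```

After the purge, `/news` is fetched fresh from the origin on the next request, while `/about` keeps being served from the edge.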

In essence, Drupal's caching system is not just about storing and retrieving data. It's about cleverly determining what to cache, how to cache it, and when to invalidate it. This intelligent approach is a significant factor in why Drupal sites, despite their complexity, can outperform simpler platforms.

Examples of Drupal Cache Performance

Now we have a good idea about how Drupal’s caching system works, but what does it look like in practice? Let's examine a few examples to understand how this system will help.

  • e-Commerce Site Performance: Imagine an e-commerce site built on Drupal Commerce. The site experiences high traffic, especially during sales events. Thanks to Drupal's cache tagging system, when a product's price is updated, only the pages displaying that product need their cache cleared.

    Additionally, you could cache prices differently for different user roles. So, a wholesaler would see lower prices than a general public consumer. This targeted approach ensures the site remains fast and responsive, even when handling frequent content updates.
     
  • News Portal with Dynamic Content: Consider a news portal that publishes articles frequently. Drupal's caching system allows for the efficient handling of dynamic content. For instance, the homepage, which aggregates the latest news, benefits from Drupal's query and output level caching for views. This means that the homepage loads quickly for users, despite the constant updates, as only small parts of the page need to be refreshed.

    As new stories are published or updated, the cache can clear just the blocks on the homepage that have changed without needing to rebuild the entire page. This is a handy feature, but as your traffic increases, it can be a requirement to keep the site up, especially when breaking news can overwhelm more basic sites.
     
  • Personalized User Experiences: A university's Drupal-based website provides personalized information to students and faculty. The advanced caching for authenticated users ensures that while common elements like navigation menus are cached, personalized content like course schedules and announcements is dynamically generated. This results in a seamless and fast browsing experience for each user.

    This also means that each user only sees the information they can access. Students wouldn’t be able to see faculty-only data. This seems like a basic feature, but caching authenticated and restricted data without accidentally leaking it to other users is an incredibly difficult problem to solve. Drupal makes it easy.
     
  • High-traffic Blog: A popular blog built on Drupal handles sudden spikes in traffic with ease. The layered caching architecture, including page caching for anonymous users, ensures that the site remains available and performs well under heavy load, such as when a blog post goes viral.

    In fact, combining Drupal cache with a CDN is an easy way to create an “unkillable” site. The cached HTML served from the edge is effectively an “SSG” (statically generated site). This is widely considered the most performant and resilient website delivery pattern.
     
  • Multilingual Corporate Website: A corporate website that serves content in multiple languages benefits immensely from cache contexts. The site can cache pages in different languages and serve the appropriate version based on the user's language preference without any performance penalty.

    This also means that the cache is cleared not just as the content changes but also as the language version of the content changes. You can publish updates to the Spanish version of a page without impacting the performance of the Japanese version.
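The news portal example is worth sketching, because it shows how page-level and block-level caching cooperate. The Python model below is hypothetical and much simpler than Drupal's render pipeline. The block names, tags, and helper functions are all invented. The idea it demonstrates is that a page inherits the tags of the blocks it contains.

```python
cache = {}        # cache id -> (html, set of tags)
render_log = []   # records every block that had to be rebuilt

BLOCKS = [("top-stories", "node:1"), ("sports", "node:2"), ("weather", "node:3")]


def render_block(name, tag):
    cid = f"block:{name}"
    if cid not in cache:
        render_log.append(name)   # expensive work happens here
        cache[cid] = (f"<section>{name}</section>", {tag})
    return cache[cid]


def render_homepage():
    if "page:home" not in cache:
        parts = [render_block(name, tag) for name, tag in BLOCKS]
        html = "".join(block_html for block_html, _ in parts)
        # The page inherits ("bubbles up") the tags of everything inside it.
        tags = set().union(*(block_tags for _, block_tags in parts))
        cache["page:home"] = (html, tags)
    return cache["page:home"][0]


def invalidate(tag):
    for cid in [c for c, (_, tags) in cache.items() if tag in tags]:
        del cache[cid]


render_homepage()       # cold cache: all three blocks are built
render_homepage()       # warm cache: nothing is rebuilt
invalidate("node:2")    # a sports story is updated
homepage = render_homepage()
```

After the update, the homepage is reassembled, but only the sports block is re-rendered. The other two blocks come straight from the cache.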

The fact that Drupal also delivers an advanced multi-language and localization system is another amazing topic worthy of a separate article.

These examples demonstrate how Drupal's caching system contributes to the performance and scalability of websites and enhances the user experience by delivering content efficiently and reliably. It's a testament to Drupal's ability to handle complex, high-traffic websites easily.

It also shows how this framework can easily be adapted to serve the needs of different types of applications and websites. That is only possible because the system is so robust and carefully designed.

Caching as a Core Feature

As we've explored the concepts behind Drupal's caching system, it's clear that this feature is not just an add-on but a core component driving Drupal's exceptional performance. From cache contexts and tags to its sophisticated architecture, Drupal's caching system stands out as an elegant example of efficiency and intelligence.

The real-world examples we've explored demonstrate the benefits of this system. Whether it's an e-commerce site managing frequent updates, a news portal handling dynamic content, or a multilingual corporate website serving diverse user groups, Drupal's caching system proves its worth time and again. It ensures that sites remain fast, responsive, and reliable, regardless of complexity or traffic volume.

Moreover, the system's ease of use is a testament to Drupal's commitment to providing powerful tools accessible to developers of all skill levels. The 'It. Just. Works.' philosophy isn't just a catchy phrase; it's a reality for those who build and maintain Drupal sites. This approach demystifies the often-overwhelming concept of web caching, making it a seamless part of the development process.

Drupal's caching system is an excellent example of how advanced technology can be harnessed to solve real-world problems. It balances complexity with usability, ensuring that both developers and end-users reap the benefits of a fast, efficient, and scalable web experience. As Drupal continues to evolve, its caching system remains a key factor in its enduring popularity and effectiveness.

In a world where PHP and Drupal are sometimes dismissed as relics of the past, they stand as testaments to the value of software evolution. Drupal 10 is amazing because it has over twenty years of continuous work by thousands of developers. The system has continued to improve and advance over time, allowing us access to an incredible framework today.

All that knowledge, power, and effort for an open-source project that remains free for anyone to use. That may be the most amazing thing of all.
