Caching the dynamically generated HTML of your website can greatly improve its Time to First Byte, decrease the load on your web servers and reduce traffic costs, but it can also lead to stale content on frequently changing sites. In this post I focus on improving the Time to First Byte while keeping the content fresh, with an effective cache duration of a single page load.

Caching can be done at the web server / reverse proxy level, for example with NGINX, or, as in this post, at the CDN level with Cloudflare Workers. Doing it with NGINX, my favorite web server, is easier since you don't have to write a bunch of JavaScript. Using the Cloudflare cache has two advantages: your site stays reachable even if your web server / host goes offline, and the cache is closer to your visitors, which reduces the distance the network packets have to travel through the internet.

You can cache HTML with Cloudflare without a worker by creating a page rule and setting the Cache Level to Cache Everything, but this offers limited possibilities: you have to cache for at least two hours, you can't bypass the cache on a specific cookie without paying $200 a month for the Business plan, and you can't set your own cache key without the Enterprise plan.

I'm not a JavaScript developer, so my code is probably not as clean and beautiful as it could be, but it works, and it does so reliably: it powers the website of one of Germany's biggest tutoring companies, https://www.schuelerhilfe.de/ (my current employer), with millions of requests per month. Thanks go to my brother, who is a developer, for helping me with some issues at the beginning.

Cloudflare Workers is free for up to 100,000 requests per day. This count includes all non-HTML requests as well, since all requests go through the worker. If you need more, the paid plan is also quite affordable.

Overview

The worker I created can do the following:

  • Cache Content-Type text/html when the status code matches a specific number
  • Remove any Set-Cookie header, to avoid logging in multiple users as the first user whose response was cached, and other fun stuff
  • Remove tracking query parameters like ?gclid from the cache key to improve the cache hit ratio
  • Bypass the cache on specific cookies, URL paths or query parameters
  • Respond with a stale version if the backend is down or is producing errors

My worker uses the Cloudflare Workers Cache API, not the cache argument of the fetch function. The fetch function needs the appropriate paid plan for things like custom cache keys and bypassing the cache on a cookie.

You can find the current version of my worker in my GitHub repo: https://github.com/stephan13360/cloudflare-worker/tree/master/cache

The sections below walk through the worker as it looked at the time of writing.

Details

The worker starts in the addEventListener callback and first checks whether the request is a POST request; if it is, the callback returns. When the callback returns without calling event.respondWith, the request is forwarded to the origin as if there were no worker. Cloudflare should not cache POST requests by default anyway; I just want to make sure.

addEventListener("fetch", (event) => {
  try {
    let request = event.request;
    // bypass cache on POST requests
    if (request.method.toUpperCase() === "POST") return;
    // bypass cache on specific cookies, URL paths, or query parameters
    if (checkBypassCache(request)) return;
    return event.respondWith(handleRequest(event));
  } catch (err) {
    return new Response(err.stack || err);
  }
});

After that, the checkBypassCache function is called. It consists of three loops that check whether the cookies, URL path or query parameters include anything specified in the arrays at the top of the script. Using these arrays you can adjust when the worker bypasses the cache. In my case I bypass everything that goes to my blog backend at /ghost/, and I also bypass the cache when I'm logged in, which sets a ghost-admin-api-session cookie. In addition, I can bypass the cache by setting the cookie no_worker_cache=true. I use this second cookie with my website monitoring tool, so I get an alert when my blog is down; without the bypass the monitor would only see the cached version.
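The three loops described above could look roughly like the following. This is a minimal sketch, not the code from the repository: only BYPASS_PATH is named in this post, so BYPASS_COOKIE, BYPASS_QUERY and the "preview" value are assumptions made for illustration, and the cookie check is a simple substring match.

```javascript
// Assumed configuration arrays. Only BYPASS_PATH is named in the post;
// BYPASS_COOKIE, BYPASS_QUERY and the "preview" entry are hypothetical.
const BYPASS_COOKIE = ["ghost-admin-api-session", "no_worker_cache=true"];
const BYPASS_PATH = ["/ghost/"];
const BYPASS_QUERY = ["preview"];

// Returns true when the request should skip the worker cache.
// `request` only needs a `url` string and a `headers.get()` method.
function checkBypassCache(request) {
  const url = new URL(request.url);
  const cookies = request.headers.get("Cookie") || "";

  // 1. Cookies: substring match against the raw Cookie header
  for (const cookie of BYPASS_COOKIE) {
    if (cookies.includes(cookie)) return true;
  }
  // 2. URL path
  for (const path of BYPASS_PATH) {
    if (url.pathname.includes(path)) return true;
  }
  // 3. Query parameters: match on the parameter name
  for (const query of BYPASS_QUERY) {
    if (url.searchParams.has(query)) return true;
  }
  return false;
}
```

A substring match on the Cookie header is loose (a cookie whose name merely contains one of the strings would also match), which is acceptable here because a false positive only means one more uncached request.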

If nothing matches, the request is passed to the handleRequest function.

async function handleRequest(event) {
  try {
    let request = event.request;
    let cacheUrl = new URL(request.url);
    cacheUrl = await removeCampaignQueries(cacheUrl);
    let cacheRequest = new Request(cacheUrl, request);
    let cache = caches.default;

    // Get response from origin and update the cache
    let originResponse = getOrigin(
      event,
      request,
      cache,
      cacheRequest,
      cacheUrl
    );
    event.waitUntil(originResponse);

    let response = await cache.match(cacheRequest);
    // Use cache response when available, otherwise use origin response
    if (!response) response = await originResponse;
    return response;
  } catch (err) {
    return new Response(err.stack || err);
  }
}

By default Cloudflare uses the full URL (the cache key) to match the request to the cached object. If the query parameters are in a different order, or if one parameter includes a unique ID, the request will not match a cached object. So if your visitor comes from Facebook or Google Ads and has an fbclid or gclid parameter with a unique ID, they will never get a cache hit. To improve the cache hit ratio, the removeCampaignQueries function removes common tracking query parameters from the cache key, NOT from the actual URL, and the result is saved as cacheUrl.

async function removeCampaignQueries(url) {
  let deleteKeys = [];

  // Collect matching keys first; deleting while iterating would skip entries
  for (const key of url.searchParams.keys()) {
    if (key.match(TRACKING_QUERY)) {
      deleteKeys.push(key);
    }
  }

  deleteKeys.forEach((k) => url.searchParams.delete(k));

  return url;
}
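To make the effect on the cache key concrete, here is a small standalone sketch. The TRACKING_QUERY pattern is not shown in this post, so the regular expression below is an assumption, and stripTrackingParams is a hypothetical synchronous variant written only for this example.

```javascript
// Assumed pattern; the real TRACKING_QUERY lives at the top of the worker.
const TRACKING_QUERY = /^(utm_.*|gclid|fbclid)$/;

// Hypothetical helper: returns a copy of the URL without tracking parameters.
function stripTrackingParams(input) {
  const url = new URL(input);
  // Snapshot the keys before deleting, so iteration is not disturbed
  const trackingKeys = [...url.searchParams.keys()].filter((key) =>
    TRACKING_QUERY.test(key)
  );
  trackingKeys.forEach((key) => url.searchParams.delete(key));
  return url;
}

const visitorUrl =
  "https://example.com/page?id=7&gclid=abc123&utm_source=newsletter";
console.log(stripTrackingParams(visitorUrl).href);
// prints "https://example.com/page?id=7"
```

Two visitors arriving with different gclid values therefore share one cache entry, while the request that goes to the origin still carries the original parameters.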

I then call the getOrigin function to retrieve the current version from the actual web server. This request to the origin is asynchronous, so it does not block further execution of the worker. While the worker fetches the current version, it tries to get the cached version from the cache with await cache.match(cacheRequest). If a cached version is available, the worker returns it to the client immediately without waiting for the origin request. If no cached version is available, the worker waits for the promise of getOrigin to be fulfilled and returns the current version to the client.

Regardless of whether a cached version was available, the getOrigin function continues in the background. The line event.waitUntil(originResponse); makes sure that the worker does not stop until the promise is fulfilled.
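The serve-from-cache-while-refreshing flow can be tried outside of Cloudflare with a small simulation. Everything in this sketch is a stand-in: fakeCache replaces caches.default, fakeOrigin replaces the fetch to the web server, and serve is a hypothetical, simplified counterpart of handleRequest.

```javascript
// Stand-ins for the edge cache and the origin (not Cloudflare APIs).
const fakeCache = new Map();
let version = 0;

// Simulates a slow origin that produces a new version on every request
// and stores the result in the cache once it is done.
function fakeOrigin(key) {
  return new Promise((resolve) => {
    setTimeout(() => {
      version += 1;
      const body = `${key} v${version}`;
      fakeCache.set(key, body);
      resolve(body);
    }, 20);
  });
}

async function serve(key) {
  // Start the origin request first, without awaiting it
  const refresh = fakeOrigin(key);
  const cached = fakeCache.get(key);
  if (cached !== undefined) {
    // Cache hit: answer immediately, the refresh finishes in the background
    return { body: cached, source: "cache", refresh };
  }
  // Cache miss: the visitor has to wait for the origin
  return { body: await refresh, source: "origin", refresh };
}

async function demo() {
  const first = await serve("/home"); // origin, "/home v1"
  const second = await serve("/home"); // cache, still "/home v1"
  await second.refresh; // the cache now holds "/home v2"
  const third = await serve("/home"); // cache, "/home v2"
  return [first, second, third].map((r) => `${r.source}: ${r.body}`);
}
```

The second visitor gets the version generated by the first request, and the third visitor gets the version generated by the second: content is never older than one page load, which is the trade-off this worker makes.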

async function getOrigin(event, request, cache, cacheRequest) {
  try {
    // Get response from origin. Declared with let so it is not an implicit
    // global shared between concurrent requests.
    let originResponse = await fetch(request);

    // use normal cloudflare cache for non html files
    if (!originResponse.headers?.get("Content-Type")?.includes("text/html"))
      return originResponse;

    // must use Response constructor to inherit all of response's fields
    originResponse = new Response(originResponse.body, originResponse);

    if (CACHE_ON_STATUS.includes(originResponse.status)) {
      // Delete cookie header so HTML can be cached
      originResponse.headers.delete("Set-Cookie");

      // Overwrite Cache-Control header so HTML can be cached
      originResponse.headers.set(
        "Cache-Control",
        "public, s-maxage=604800, max-age=0"
      );

      // waitUntil runs even after response has been sent
      event.waitUntil(cache.put(cacheRequest, originResponse.clone()));

      return originResponse;
    } else {
      return originResponse;
    }
  } catch (err) {
    return new Response(err.stack || err);
  }
}

The getOrigin function fetches the current version with the built-in fetch function. It then checks whether the Content-Type is text/html; if it is not, it returns the fetched version immediately. Static resources like images, JavaScript or stylesheets are still cached this way, since the fetch function uses Cloudflare's cache. So all assets that are not text/html are cached the same way as if no worker were used.

Next the worker checks whether the origin status code matches any of the numbers in the CACHE_ON_STATUS array at the beginning of the worker. If it does not, the worker returns the current version without caching it. This stops the worker from caching 5xx errors.

If the status code matches, the worker deletes any Set-Cookie headers. Cloudflare will not cache text/html when a Set-Cookie header is present, so problems like multiple users getting the same login token won't happen anyway, but you would not be able to cache responses that set unnecessary cookies. Deleting the Set-Cookie header sounds dumb, and you may ask why you would want this. For my Ghost blog I wouldn't need it, since it only sets cookies on the backend login page, but over the years I have encountered different CMSs that set cookies on every page request, which is often unnecessary. Typo3, which runs on https://www.schuelerhilfe.de/, is such a case. If possible, disable unnecessary Set-Cookie headers at the origin, but deleting them also works. Make sure you understand why your site is setting cookies and check whether you need them or not. You will definitely need them on pages where people log in, so be sure to bypass the cache there, for example with BYPASS_PATH.

Next we set / overwrite the Cache-Control header so the Cache API knows how long it should cache the response. Most CMSs prevent caching of text/html, for example by sending a no-cache Cache-Control header. We set an s-maxage of one week, which is only honored by the Cloudflare cache and ignored by clients, and a max-age of 0 for the clients, since we want them to request the HTML every time, as always. Cloudflare's cache is not infinitely big, and objects that have not been requested for a while will be evicted. In my tests objects stayed in the cache for 12-24 hours.
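To double-check what the header value means, the directives can be split apart and converted. parseCacheControl is a hypothetical helper written for this post and deliberately simplified: it ignores quoted values and other edge cases of the Cache-Control grammar.

```javascript
// Hypothetical, simplified parser: "a, b=1" -> { a: true, b: 1 }
function parseCacheControl(header) {
  const directives = {};
  for (const part of header.split(",")) {
    const [name, value] = part.trim().split("=");
    if (!name) continue;
    // Directives without a value (like "public") are stored as true
    directives[name.toLowerCase()] = value === undefined ? true : Number(value);
  }
  return directives;
}

const parsed = parseCacheControl("public, s-maxage=604800, max-age=0");
console.log(parsed["s-maxage"] / 86400); // prints 7 (days for shared caches)
console.log(parsed["max-age"]); // prints 0 (browsers ask again every time)
```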

The worker then saves the origin response in the cache and also returns it to the handleRequest function, in case the cache was empty until now.

Testing

Two response headers let you check the status of the cache.

The cf-cache-status header shows you whether a response has been delivered from the cache: HIT means it was delivered from the cache, while DYNAMIC (for HTML content) or MISS (for other static content) indicates it was not.

The age header gives you the time in seconds since this content was cached. Since we refresh the cache on each page view, this counter resets every time we hit reload. When the origin is down, the age header does not reset, since the cache could not be refreshed from the origin.

We can also see our cache-control header here.
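If you prefer a script over the browser's developer tools, the same headers can be read with fetch. This is a minimal sketch: describeCacheHeaders and checkUrl are hypothetical helper names, and checkUrl assumes a runtime with a global fetch, such as a current Node.js version or a browser console.

```javascript
// Summarizes the cache-related headers of a response.
// `headers` only needs a get() method, like response.headers.
function describeCacheHeaders(headers) {
  const status = headers.get("cf-cache-status");
  const age = headers.get("age");
  return {
    servedFromCache: status === "HIT",
    status: status || "(missing)",
    // Seconds since the object was cached, or null if there is no age header
    ageSeconds: age == null ? null : Number(age),
    cacheControl: headers.get("cache-control") ?? null,
  };
}

// Fetches a page and reports how it was served.
async function checkUrl(url) {
  const response = await fetch(url);
  return describeCacheHeaders(response.headers);
}
```

Calling checkUrl twice in a row on a cached page should report servedFromCache as true with a small ageSeconds value on the second call, since every request refreshes the cached copy.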

Conclusion

As mentioned at the beginning, this worker is used to improve the Time to First Byte. It does so by caching the HTML on Cloudflare's edge, where all static files are also cached, and delivering the HTML from the cache without waiting for the origin. The longer your backend needs to generate the HTML, the more benefit you will see. For example, the Typo3 on https://www.schuelerhilfe.de/ takes around two seconds, a delay which visitors will not see if the requested page is in the cache. With the number of visitors we get, most pages will always have a cached version available.

It also helps keep the site online when the origin has problems and is either generating error pages or is completely offline.

But unlike other caches, this will not save you bandwidth or reduce load on the origin, since every request is still passed to the origin. There is always a trade-off between content freshness and cost savings.