
Less Load — More Requests: The Art of API Caching

Jul 25, 2025 · 13 min read

Hi! My name is Dima, I’m a Backend Developer at Doubletapp. In this article, I’ll talk about API caching (using Django Ninja as an example): how it benefits the business and when it makes sense to implement it.


Why You Might Need Caching

As your product grows and user numbers increase, every repeated request to the server adds unnecessary load. Even if a user simply refreshes a page, or multiple users make the same request, the server processes each request from scratch — consuming resources each time.

Now imagine this: you can handle several times more user requests simultaneously, without scaling your infrastructure or rewriting your core product. How? By using caching — a technique that “remembers” identical requests and reduces server load.

Why These Tips Apply Beyond Django

Although the examples in this article use Django Ninja, the caching principles are universal and can be applied to any technology — be it Flask, FastAPI, Express.js, Laravel, or Spring.

That’s because caching relies on standard mechanisms and layers of interaction on the web:

  • HTTP protocol: headers like Cache-Control, ETag, Last-Modified, and response codes like 304 Not Modified work the same across all languages and frameworks.
  • Key-value stores: tools like Redis, Memcached, and other in-memory databases are widely used in modern applications.
  • Client-side caching: browsers and intermediary proxies (e.g., Cloudflare, Nginx) interpret HTTP headers the same way, regardless of the server’s technology.

So even if you’re using a different stack, the advice in this article will still be relevant and easy to adapt to your project. Django Ninja is used here simply as a concrete and understandable example.

Overview of Cache Types


Effective API caching can be implemented at various levels — from the browser to server-side storage. Each type of cache solves its own task and can be used either independently or in combination. Let’s look at each type separately.

Server-Side Cache (Key-Value Stores)

Server-side cache is a temporary in-memory storage on the backend side of the application, using tools like Redis, Memcached, or similar key-value solutions. It’s one of the most flexible and controllable caching strategies, since it’s fully managed by the developer.

In practice, this means storing the results of API calls, intermediate computations, or prepared data in memory, so that repeated calls don’t require doing the same work again.
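This "compute once, reuse" idea works independently of any framework. Here is a minimal in-process sketch of the cache-aside pattern, where a plain dict stands in for Redis or Memcached (the helper name `get_or_compute` is illustrative, not a library API):

```python
import time

# In-process stand-in for Redis/Memcached: maps key -> (value, expiry time)
_store = {}

def get_or_compute(key, compute, ttl=60):
    """Return a cached value if still fresh; otherwise compute, store, return."""
    entry = _store.get(key)
    if entry is not None:
        value, expires_at = entry
        if time.monotonic() < expires_at:
            return value  # cache hit: skip the expensive work
    value = compute()
    _store[key] = (value, time.monotonic() + ttl)
    return value
```

The same shape applies whether `compute` wraps an SQL query, a serialized JSON response, or an external API call.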

Pros:

  • Instant access to in-memory data
  • Flexible TTL (time-to-live) and key management
  • Scales well and integrates easily with any stack
  • Suitable for caching all kinds of data — from serialized JSON responses to SQL query results.

Cons:

  • May require synchronization in distributed systems
  • Requires monitoring data volume to avoid memory overuse

Server-side caching integrates well with frameworks like Django Ninja via the built-in django.core.cache module.

Example: endpoint-level caching

from django.core.cache import cache
from functools import wraps
from ninja import Router
import hashlib
import json

router = Router()

def cache_response(timeout=60):
    def decorator(func):
        @wraps(func)
        async def wrapper(request, *args, **kwargs):
            # Build a cache key based on URL and query parameters
            key_source = f"{request.path}?{json.dumps(dict(request.GET), sort_keys=True)}"
            key = "api_cache:" + hashlib.md5(key_source.encode()).hexdigest()

            # Try to retrieve from cache (check against None so that
            # falsy results like {} or [] still count as hits)
            cached = cache.get(key)
            if cached is not None:
                return cached

            # Call the handler and store the result in the cache
            response = await func(request, *args, **kwargs)
            cache.set(key, response, timeout)
            return response
        return wrapper
    return decorator

@router.get("/public-data")
@cache_response(timeout=300)  # cache for 5 minutes
async def public_data(request):
    return {"data": "expensive computation result"}

This approach works well for GET requests with deterministic results; it should not be applied to POST, PUT, or DELETE handlers, which change server state.
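When a write endpoint changes the underlying data, it needs to evict the entry the decorator created, which is easiest if the key-building logic lives in a shared helper. A small sketch (the helper name `build_cache_key` is ours, not part of Django Ninja):

```python
import hashlib
import json

def build_cache_key(path, params):
    """Rebuild the same key the caching decorator uses, so a write handler
    can call cache.delete(build_cache_key(path, params)) after an update.
    sort_keys=True makes the key independent of query-parameter order."""
    key_source = f"{path}?{json.dumps(dict(params), sort_keys=True)}"
    return "api_cache:" + hashlib.md5(key_source.encode()).hexdigest()
```

Because the key is deterministic, the read and write paths agree on it without sharing any state beyond this function.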

Caching ORM Queries

If your API relies heavily on ORM queries, you can further optimize performance by caching the SQL queries themselves. One way to do this is with the django-cachalot library, which automatically caches ORM query results and reuses them.

Installation:

pip install django-cachalot

Setup:

Add it to INSTALLED_APPS:

INSTALLED_APPS = [
    ...
    'cachalot',
]

Make sure to configure a cache backend (e.g., Redis):

CACHES = {
    'default': {
        'BACKEND': 'django.core.cache.backends.redis.RedisCache',
        'LOCATION': 'redis://127.0.0.1:6379/1',
    }
}

Usage example:

from myapp.models import Product

def get_products():
    # This query will be automatically cached
    return list(Product.objects.filter(is_active=True).order_by("name"))

django-cachalot automatically invalidates the cache when data in the database changes, and it works well for frequently repeated SELECT queries. This is a great way to optimize your API without rewriting business logic.

A note of caution: avoid using django-cachalot if a table is updated more than 50 times per minute. It won’t break, but performance will degrade, and the library may start slowing down your project instead of speeding it up. See the project documentation for details.

Cache Invalidation, Keys, and TTL

When caching API responses, it’s not enough to just store data — you also need to manage the lifecycle of that data. Cache by itself isn’t “smart” — it doesn’t know when the data on the server has become outdated or changed. So, it’s your responsibility as a developer to define how, when, and based on what criteria cache entries should be cleared or refreshed.

This involves two main aspects:

  • Invalidation — explicitly removing stale data when the system state changes (e.g., after updating a DB record).
  • TTL (Time To Live) — automatic expiration of cache entries so that they don’t live in memory forever.

If these aspects are not well-designed, caching turns from a helper into a source of bugs: users see outdated data, the system becomes overloaded with unnecessary requests, and debugging becomes chaotic.

Client-Side Cache (Browser, Proxies)


Client-side caching happens on the user’s side — in the browser, in a mobile app, or in intermediate HTTP proxy servers (such as local CDNs). It helps avoid repeated requests to the server if the requested data was previously retrieved and is still considered fresh.

This type of cache is controlled via standard HTTP headers set by the server in its response.

Key headers include:

  • Cache-Control: controls caching rules and freshness (e.g., max-age, no-cache, public, private)
  • ETag: a hash or version of the resource used to compare the old and new copies
  • Last-Modified: timestamp indicating the last change of the resource
  • Expires: a fixed date/time until which the cached data is considered valid.

If the browser receives a response with these headers, it may skip making a new request entirely if the cached resource is still fresh.

Advantages:

  • Faster data loading on the client side
  • Reduced number of API calls
  • Easy to use — entirely based on HTTP standards.

Limitations:

  • Suitable only for public and stable data
  • Not recommended for personalized or sensitive content without additional flags (private, no-store)
  • Misconfigured headers may lead the client to use outdated data.

Client-side caching is especially useful for metadata, reference data, public collections, and other rarely changing resources. When configured properly, it enables instant responses and saves bandwidth without burdening the server.

It’s also effective for periodically updated data that doesn’t require real-time accuracy — such as rankings, statistics, aggregates, or analytics, which can be cached for several minutes or even hours without degrading the user experience.

HTTP Headers: Cache-Control, Expires


Let’s take a closer look at the main headers used to control client-side caching.

Cache-Control

This is the primary header for controlling cache behavior. It specifies who can cache the response, for how long, and under what conditions.

Common directives:

  • public — the response can be cached by any intermediary (browser, proxy, CDN)
  • private — only the client (e.g., browser) may cache the response
  • no-cache — requires revalidation with the server on every request
  • no-store — forbids caching entirely (e.g., for tokens or sensitive data)
  • max-age=3600 — marks the response as fresh for 3600 seconds

Example:

Cache-Control: public, max-age=300

Expires

This header specifies a fixed date/time until which the response is considered fresh.

Example:

Expires: Sat, 12 Jul 2025 12:00:00 GMT

If both Cache-Control and Expires are set, Cache-Control takes precedence.

Using these headers in combination gives you full control over caching — from basic expiration to advanced validation and updating mechanisms that minimize unnecessary traffic.
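In server code these header values are plain strings, so they are easy to build with the standard library. A small sketch that produces a matching Cache-Control/Expires pair (the helper name `freshness_headers` is ours):

```python
from datetime import datetime, timedelta, timezone
from wsgiref.handlers import format_date_time

def freshness_headers(max_age):
    """Return a Cache-Control header plus a consistent Expires value
    (formatted as an HTTP-date, e.g. 'Sat, 12 Jul 2025 12:00:00 GMT')."""
    expires = datetime.now(timezone.utc) + timedelta(seconds=max_age)
    return {
        "Cache-Control": f"public, max-age={max_age}",
        "Expires": format_date_time(expires.timestamp()),
    }
```

Keeping the two values derived from the same `max_age` avoids the classic bug where Cache-Control and Expires disagree.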

Conditional HTTP Requests


Conditional requests are a mechanism that allows the client to check whether a resource has changed on the server since it was last retrieved. If the resource remains unchanged, the server responds with status code 304 Not Modified and omits the response body — saving bandwidth and improving performance. This is especially useful for frequently accessed but rarely updated data like product lists, public profiles, metadata, categories, etc.

How It Works

The client stores a “version” of the resource (e.g., an ETag or Last-Modified timestamp). On the next request, it sends that version back using the If-None-Match or If-Modified-Since header.

  • If the data hasn’t changed — the server returns 304 Not Modified.
  • If the data has changed — the server responds with a regular 200 OK and the updated content.
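The handshake above can be sketched without any framework (all names here are illustrative; the returned tuple stands in for a real HTTP response):

```python
import hashlib
import json

def make_etag(data):
    """Content-derived ETag: any change to the payload yields a new tag."""
    digest = hashlib.md5(json.dumps(data, sort_keys=True).encode()).hexdigest()
    return f'"{digest}"'

def respond(data, request_headers):
    """Return (status, headers, body) following the conditional-request flow."""
    etag = make_etag(data)
    if request_headers.get("If-None-Match") == etag:
        return 304, {"ETag": etag}, None   # unchanged: omit the body
    return 200, {"ETag": etag}, data       # changed, or no stored version
```

The client simply echoes back the last ETag it saw; the server does the comparison and decides whether a body is needed.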

When to Use Conditional Requests

Conditional requests pay off when data is read far more often than it changes, as with the product lists, profiles, and metadata mentioned above.

Benefits

  • Reduced bandwidth usage (especially on slow connections)
  • Lower server load
  • Faster responses (less data transferred over the network)
  • Compatible with proxies, browsers, and CDNs.

ETag

ETag (Entity Tag) is a unique identifier for a version of a resource. When a client receives a response with an ETag, it can include that value in an If-None-Match header on subsequent requests. If the content hasn’t changed, the server returns a 304 Not Modified status, and the body is omitted.

When to Use

Use ETag when:

  • You need more precise change detection than Last-Modified provides (e.g., down to the byte or logical state)
  • You want to version the resource based on content rather than timestamps.

ETag is particularly useful when precision matters: files, configurations, binary resources, nested JSON structures, and anything that can change “silently” from a timestamp perspective.

Example with Django:

import hashlib, json
from django.http import HttpResponse, JsonResponse
from django.utils.http import quote_etag
from ninja import NinjaAPI

api = NinjaAPI()

def generate_etag(data):
    return quote_etag(
        hashlib.md5(json.dumps(data, sort_keys=True).encode()).hexdigest()
    )

@api.get("/status")
def get_status(request):
    data = {"version": "1.2.3", "uptime": "120h"}

    etag = generate_etag(data)
    if request.headers.get("If-None-Match") == etag:
        # A 304 response must not carry a body
        response = HttpResponse(status=304)
        response["ETag"] = etag
        return response

    response = JsonResponse(data)
    response["ETag"] = etag
    return response

Initial request and server response:

GET /api/products/42 HTTP/1.1

HTTP/1.1 200 OK
ETag: "v42-abcdef"
Content-Type: application/json

{
    "id": 42,
    "name": "Product A",
    "price": 199.00
}

Client’s repeat request using If-None-Match:

GET /api/products/42 HTTP/1.1
If-None-Match: "v42-abcdef"

Server response if data hasn’t changed:

HTTP/1.1 304 Not Modified

Server response if data has changed:

HTTP/1.1 200 OK
ETag: "v43-xyz123"
Content-Type: application/json

{
    "id": 42,
    "name": "Product A",
    "price": 189.00
}

Last-Modified

The Last-Modified header tells the client when the resource was last changed. It works in conjunction with the If-Modified-Since header, which the client includes in subsequent requests to check whether the cached version is still up to date. If not modified, the server returns 304 Not Modified and omits the response body — saving resources and bandwidth.

This mechanism is simpler and easier to implement than ETag, and it’s a great fit for caching public, infrequently updated resources like articles, profiles, products, or news.

When to Use

Use Last-Modified when:

  • You have a reliable updated_at field that gets updated on every change
  • You don’t need millisecond-level precision

Example in Django:

from django.http import HttpResponse, JsonResponse
from django.utils.http import http_date, parse_http_date_safe
from myapp.models import Article
from ninja import Router

router = Router()

@router.get("/articles/{article_id}")
def get_article(request, article_id: int):
    article = Article.objects.get(id=article_id)
    last_modified = article.updated_at

    # Check the If-Modified-Since header. HTTP dates have second precision,
    # so truncate the model timestamp before comparing.
    since = request.headers.get("If-Modified-Since")
    if since:
        since_dt = parse_http_date_safe(since)
        if since_dt is not None and int(last_modified.timestamp()) <= since_dt:
            # A 304 response must not carry a body
            response = HttpResponse(status=304)
            response["Last-Modified"] = http_date(last_modified.timestamp())
            return response

    response = JsonResponse({
        "id": article.id,
        "title": article.title,
        "content": article.content,
    })
    response["Last-Modified"] = http_date(last_modified.timestamp())
    return response

Example server response:

HTTP/1.1 200 OK
Content-Type: application/json
Last-Modified: Thu, 10 Jul 2025 12:45:00 GMT

{
    "id": 42,
    "title": "How Last-Modified Works",
    "content": "This is an example article with conditional request support."
}

Client’s follow-up request:

GET /api/articles/42 HTTP/1.1
If-Modified-Since: Thu, 10 Jul 2025 12:45:00 GMT

Server response if the data hasn’t changed:

HTTP/1.1 304 Not Modified

If the data has changed, the server will return the updated content along with a new Last-Modified timestamp.

Intermediate Caching (CDN, Reverse Proxy)


Intermediate caching occurs between the client and your server. It’s implemented using a CDN (Content Delivery Network) or a reverse proxy — most commonly with tools like Nginx, Varnish, Cloudflare, Fastly, etc.

This is one of the most effective layers of caching because it offloads traffic from your API server before the request even reaches it. Responses for frequently requested resources (such as public APIs, images, or JSON collections) can be cached right at the edge, closer to the user.

When to Use Intermediate Caching

You should consider CDN or reverse proxy caching when:

  • You serve public and frequently requested data (e.g., directories, catalogs)
  • You want to reduce latency for users worldwide

Advantages

  • Reduces server load
  • Fast responses (no need to proxy to the API backend)
  • Lower latency for geographically distant users
  • Easy integration with cloud-based infrastructure.

Disadvantages

  • Cache invalidation can be more complex (especially when data updates frequently)
  • Not suitable for personal or sensitive data without private / no-store directives
  • Requires precise HTTP header configuration — misconfigurations can lead to caching the wrong content or not caching at all
  • More difficult to test and debug.

Example: Caching API Responses with Nginx

If your API serves JSON responses, you can configure Nginx to cache them like this:

proxy_cache_path /tmp/nginx_cache levels=1:2 keys_zone=api_cache:10m inactive=5m max_size=100m;

server {
    listen 80;

    location /api/ {
        proxy_pass http://127.0.0.1:8000;

        proxy_cache api_cache;
        proxy_cache_valid 200 302 10m;  # cache 200/302 responses for 10 minutes
        proxy_cache_valid 404 1m;       # cache 404 responses for 1 minute
        proxy_cache_key "$scheme$request_method$host$request_uri";

        add_header X-Cache-Status $upstream_cache_status;

        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    }
}

What’s happening here:

  • proxy_cache_path — defines where to store the cache and how it behaves
  • proxy_cache — enables caching for this location
  • proxy_cache_valid — sets TTL for different status codes
  • proxy_cache_key — generates a cache key (query params included by default)
  • X-Cache-Status — helps with debugging (returns HIT, MISS, BYPASS).

For responses to be cacheable, your API must explicitly allow it — either via HTTP headers (e.g., Cache-Control: public, max-age=600) or by design. If your responses include Cache-Control: no-store or private, Nginx (and most CDNs) will not cache them.

Also keep in mind that POST requests are not cached by default: Nginx’s proxy_cache_methods directive defaults to GET and HEAD only.

Intermediate caching is a powerful optimization tool, especially under heavy load and with geographically distributed users. Like any tool, it requires careful configuration and understanding of its limitations. A well-tuned reverse proxy can serve the majority of your traffic without touching your application — ensuring instant responses and great scalability.

Conclusion

Caching is one of the most effective tools for scaling and speeding up APIs — without adding server resources or rewriting application logic. It not only reduces load on your database and server, but also dramatically improves response times — especially when every millisecond counts.

In this article, we covered caching at all levels:

  • Server-side — using key-value stores and manual invalidation
  • Client-side — via HTTP headers and conditional requests (ETag, Last-Modified)
  • Intermediate — with CDNs and reverse proxies.

The key is not to cache everything blindly, but to thoughtfully choose a strategy for each scenario. Consider how often the data changes, how critical it is, what kind of clients are consuming it, and what performance requirements you’re working with.

A little cache is fast. Great caching is an art.

We implement caching in products with various architectures and loads — from designing the strategy to fine-tuning and testing. Get in touch, and we’ll craft a solution that fits seamlessly into your infrastructure.
