Implementing Rate Limiting in .NET with Redis Easily


Introduction

In today’s API-driven world, protecting your backend from overload and abuse has become essential. Whether you are running a public API service or an internal microservice, rate limiting is a critical technique for keeping your system stable, online, and cost-effective.

.NET has introduced native rate-limiting features in recent versions, but when you’re working in a distributed environment — where multiple API servers need to share limits — you need something more powerful. That’s where Redis comes in. In this post, we will look at what rate limiting is, why Redis is an excellent choice for implementing it, and how you can integrate Redis-based rate limiting into your APIs.

If you’re new to Redis or want to explore its various use cases, such as caching, feel free to check out my previous post on implementing distributed caching with Redis.

Understanding Rate Limiting

Before we jump into code, it is important to understand what rate limiting and throttling mean and why they are essential in modern APIs.

What Is Rate Limiting?

Rate limiting sets the maximum number of requests that a client (user, application, or IP) can make within a defined time period – for example, 100 requests per minute per IP. Once the limit is hit, any additional requests in the same time period are blocked, usually with a 429 Too Many Requests response.


Common Use Cases

Here’s where rate limiting shines:

  • Protecting APIs from abuse: Prevent DDoS attacks, brute-force login attempts, or scraping bots.
  • Ensuring fair usage: All clients should be given a fair share of API capacity, especially in multi-tenant or public APIs.
  • Controlling costs in paid APIs: Limit free-tier users to a certain number of requests and offer higher limits on premium plans.

High-Level Strategies

There are several popular strategies for implementing rate limiting:

  • Fixed Window: Counts requests within a fixed time window (e.g., per minute). Simple, but can be unfair at window edges.
  • Sliding Window: Tracks requests over a rolling window for smoother limits and better fairness.
  • Token Bucket: Clients “spend” tokens to make requests, and tokens refill at a steady rate. Allows short bursts while controlling the overall rate.
  • Leaky Bucket: Similar to a token bucket but with a queue – incoming requests are put into a “bucket” (queue). If the bucket overflows, requests are rejected. The bucket “leaks” requests at a steady rate, smoothing out bursts.

Important Note: You don’t have to implement all these algorithms yourself! Many libraries (and Redis patterns) already support them — we’ll cover this later.

Native .NET Rate Limiting

With the release of .NET 7 and continuing improvements in .NET 8, Microsoft introduced a built-in RateLimiter API that makes adding rate limiting to ASP.NET Core applications easier and more streamlined than ever before.

This is a big step forward because, in earlier versions of .NET, you had to rely on custom code or third-party libraries to implement even basic rate limiting. Now, you can add it natively using middleware and configuration.

Quick Intro to .NET RateLimiter

The .NET RateLimiter integrates directly into the ASP.NET Core middleware pipeline.

It provides several ready-to-use limiting strategies, including:

  • Fixed window limiter: Limit a number of requests per fixed time window (e.g., 100 requests per minute).
  • Sliding window limiter: Smoothly enforce limits over a rolling window.
  • Token bucket limiter: Allow occasional bursts while maintaining a steady rate over time.
  • Concurrency limiter: Limit the number of concurrent requests in-flight.

Here is an example of applying the native .NET RateLimiter (adapted from the official documentation):

// Add native rate limiting
builder.Services.AddRateLimiter(options =>
{
    options.GlobalLimiter = PartitionedRateLimiter.Create<HttpContext, string>(httpContext =>
    {
        return RateLimitPartition.GetFixedWindowLimiter(
            partitionKey: httpContext.User.Identity?.Name ?? httpContext.Request.Headers.Host.ToString(),
            factory: partition => new FixedWindowRateLimiterOptions
            {
                AutoReplenishment = true,
                PermitLimit = 10,
                QueueLimit = 0,
                Window = TimeSpan.FromMinutes(1)
            });
    });
});
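
One easy thing to miss: configuring the limiter is not enough. The middleware must also be added to the pipeline, and rejected requests return 503 by default unless you set the rejection status code. A minimal sketch completing the setup above:

// Inside the AddRateLimiter options above: return 429 instead of the default 503
options.RejectionStatusCode = StatusCodes.Status429TooManyRequests;

var app = builder.Build();

// Without this call the limiter is configured but never enforced
app.UseRateLimiter();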

Strengths

The built-in .NET RateLimiter brings several benefits:

  • Out-of-the-box support: You don’t need third-party packages — it’s baked right into ASP.NET Core.
  • Multiple strategies available: Choose from different algorithms depending on your needs.
  • Fine-grained control: Apply limits per endpoint, per user, or per route.
  • Minimal configuration required: With just a few lines, you can protect your API from abusive traffic.

Limitations

While the native rate limiter is powerful, it has some important limitations:

  • Local only (single instance): It works in-memory within one application instance. If you’re running multiple servers behind a load balancer, each instance enforces limits independently — meaning there’s no global enforcement across the whole system.
  • No shared counters or state: There’s no built-in mechanism to synchronize limits across multiple servers.
  • No persistent storage: Counters reset if the application restarts or scales down.

When You Might Need Redis Over Built-in Solutions

For simple scenarios — like internal APIs or small services running on a single server — the native .NET rate limiter is often enough.

But when you move into distributed systems or need global limits across multiple app instances, you’ll run into its limitations. This is where Redis comes in.

Redis acts as a centralized, distributed store for rate-limiting data:

  • It synchronizes counters across all servers.
  • It applies global per-user, per-IP, or per-API-key limits.
  • It ensures consistency even if some app nodes go down or restart.
  • It supports advanced patterns (like sliding windows or token buckets) with atomic operations via Lua scripts.

In short, Redis unlocks distributed rate limiting — something the native .NET limiter can’t provide out of the box.

Why Redis for Distributed Rate Limiting

As I mentioned earlier, the native .NET rate limiter works well – but only in a single-instance context. Once you deploy your API across multiple instances or containers, or behind a load balancer, you quickly run into a critical problem: each instance has its own memory and its own counters!

This basically means that a client can easily bypass your limits just by sending requests to different instances.

Advantages of Using Redis

Redis solves the distributed rate-limiting challenge by acting as a centralized, high-performance data store that all API instances can communicate with.

Here’s why Redis is such a great fit:

  • Shared state across servers: Redis lets all your API instances share the same counters and enforcement logic, ensuring consistent limits across the board.
  • In-memory speed: Redis is incredibly fast. Because it operates entirely in memory, it can handle thousands of reads and writes per second with minimal latency — perfect for high-throughput APIs.
  • Built-in TTL (Time to Live): Redis allows you to set expiration times on keys, which is ideal for time-based rate limits. You don’t need to manually clean up old counters — Redis handles it automatically.
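
As a quick illustration of the last point, a counter key can be created with a TTL and Redis will delete it on its own once the window passes (a minimal sketch using StackExchange.Redis; the key name is just an example):

var db = connectionMultiplexer.GetDatabase();

// Create a counter that Redis will delete automatically after 60 seconds
await db.StringSetAsync("rate_limit:demo", 1, expiry: TimeSpan.FromSeconds(60));

var ttl = await db.KeyTimeToLiveAsync("rate_limit:demo"); // counts down toward zero
// Once the TTL elapses, the key is gone – no manual cleanup required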

Implementing Rate Limiting with Redis in .NET

Now that we understand the theory, let’s put it into practice by building a simple fixed window rate limiter using Redis and .NET.

In this approach, we’ll:

  • Track the number of requests a user makes during a fixed time window (e.g., 1 minute).
  • Block any requests beyond the allowed threshold (e.g., 100 requests per minute).

Step 1: Start Redis

I will run Redis locally using Docker:

docker run -d -p 6379:6379 redis

But you can also use a managed Redis service or a local installation if you prefer.

Step 2: Add StackExchange.Redis

We’ll use the popular StackExchange.Redis client:

dotnet add package StackExchange.Redis

Then, register the connection in your Program.cs:

builder.Services.AddSingleton<IConnectionMultiplexer>(
    ConnectionMultiplexer.Connect("localhost:6379"));
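
In a real application you will likely want to read the connection string from configuration instead of hard-coding it. A minimal sketch, assuming a hypothetical Redis:ConnectionString entry in appsettings.json:

builder.Services.AddSingleton<IConnectionMultiplexer>(_ =>
    ConnectionMultiplexer.Connect(
        builder.Configuration["Redis:ConnectionString"] ?? "localhost:6379"));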

Step 3: Implementing Rate Limiting

Basic Rate Limiting – Fixed Window Rate Limiting

Create a new class for the rate limiter that integrates with Redis and handles the rate-limiting logic.

public class RedisRateLimiter
{
    private readonly IDatabase _redis;
    private readonly int _limit;
    private readonly TimeSpan _window;

    public RedisRateLimiter(IConnectionMultiplexer connectionMultiplexer, int limit, TimeSpan window)
    {
        _redis = connectionMultiplexer.GetDatabase();
        _limit = limit;
        _window = window;
    }

    public async Task<bool> IsAllowedAsync(string key)
    {
        var redisKey = $"rate_limit:{key}";
        var count = await _redis.StringIncrementAsync(redisKey);

        if (count == 1)
        {
            // First request, set expiration for the window
            await _redis.KeyExpireAsync(redisKey, _window);
        }

        return count <= _limit;
    }
}
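
The middleware below resolves the limiter from dependency injection, so register it in Program.cs first (the limit of 100 requests per minute is just an example value):

builder.Services.AddSingleton(provider =>
    new RedisRateLimiter(
        provider.GetRequiredService<IConnectionMultiplexer>(),
        limit: 100,
        window: TimeSpan.FromMinutes(1)));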

Use it as middleware:

app.Use(async (context, next) =>
{
    var limiter = context.RequestServices.GetRequiredService<RedisRateLimiter>();
    var clientIp = context.Connection.RemoteIpAddress?.ToString() ?? "unknown";

    if (!await limiter.IsAllowedAsync(clientIp))
    {
        context.Response.StatusCode = 429;
        await context.Response.WriteAsync("Too many requests. Try again later.");
        return;
    }

    await next();
});

Advanced Patterns with Redis

While the fixed window algorithm is simple and easy to implement, it comes with limitations – especially in edge cases like bursts at window boundaries or when fairness is critical.

Let’s explore two more advanced and flexible patterns you can implement with Redis to achieve production-grade rate limiting.

Sliding Window Using Sorted Sets

The sliding window algorithm solves the common problem of “bursting” at window edges in a fixed window limiter.

Example problem:

If you allow 100 requests per minute, a user could send 100 requests at 12:00:59 and 100 more at 12:01:01 — effectively sending 200 requests in just 2 seconds.

Sliding window avoids this by enforcing limits over the last N seconds (e.g., the last 60 seconds), regardless of where the window “starts.”

Sorted Sets: Redis sorted sets (ZSETs) are a powerful data structure that stores unique elements ordered by a floating-point score — most commonly, a timestamp. This makes them perfect for use cases that require time-based tracking, like sliding window rate limiting. In our implementation, we store each request with the current timestamp as its score. Then, we use Redis commands like ZREMRANGEBYSCORE to remove outdated entries and ZCARD to count how many requests remain in the current window — all of which can be done with high efficiency.

There are a couple of ways to implement this algorithm. One approach uses only .NET, which is simple and often sufficient — but it’s not fully atomic. This means that in high-concurrency environments, race conditions can occur if multiple requests are processed simultaneously. The other approach uses Lua, a lightweight, high-level scripting language known for its simplicity and flexibility. Redis supports scripting with Lua, allowing us to execute multiple commands atomically in a single step — ensuring consistency, even under heavy load.

Let’s see how to implement this algorithm with and without Lua.

Option 1 – Implementation without Lua

The implementation of a sliding window without using Lua looks like this:

public class SlidingWindowRateLimiter
{
    private readonly IDatabase _redis;
    private readonly ILogger<SlidingWindowRateLimiter> _logger;
    private readonly int _limit;
    private readonly TimeSpan _window;

    public SlidingWindowRateLimiter(IConnectionMultiplexer connectionMultiplexer, ILogger<SlidingWindowRateLimiter> logger, int limit, TimeSpan window)
    {
        _redis = connectionMultiplexer.GetDatabase() ?? throw new InvalidOperationException("Unable to get Redis database.");
        _logger = logger ?? throw new ArgumentNullException(nameof(logger));
        _limit = limit;
        _window = window;
    }

    public async Task<bool> IsAllowedAsync(string key)
    {
        var redisKey = $"rate_limit:{key}";
        var now = DateTimeOffset.UtcNow.ToUnixTimeMilliseconds();
        var windowStart = now - (long)_window.TotalMilliseconds;

        try
        {
            // Step 1: Remove expired timestamps that fall outside the sliding window
            await _redis.SortedSetRemoveRangeByScoreAsync(redisKey, 0, windowStart);

            // Step 2: Count the current entries (requests) in the window
            var currentCount = await _redis.SortedSetLengthAsync(redisKey);
            if (currentCount >= _limit)
            {
                _logger.LogWarning("Rate limit exceeded for key: {Key}. Current count: {CurrentCount}, Limit: {Limit}", key, currentCount, _limit);
                return false;
            }

            // Step 3: Add the new request with its timestamp as the score
            var requestId = $"{now}-{Guid.NewGuid()}";  // unique member prevents duplicate entries
            await _redis.SortedSetAddAsync(redisKey, requestId, now);

            // Step 4: Set expiration for the sliding window
            await _redis.KeyExpireAsync(redisKey, _window);

            _logger.LogInformation("Request allowed for key: {Key}. Current count: {CurrentCount}, Limit: {Limit}", key, currentCount + 1, _limit);
            return true;
        }
        catch (Exception ex)
        {
            _logger.LogError(ex, "Error while evaluating the sliding window rate limit for key {Key}.", key);
            return false; // fail-closed: block the request if Redis is unavailable
        }
    }
}

Then register the new service:

builder.Services.AddSingleton<SlidingWindowRateLimiter>(provider =>
{
    var redis = provider.GetRequiredService<IConnectionMultiplexer>();
    var logger = provider.GetRequiredService<ILogger<SlidingWindowRateLimiter>>();
    return new SlidingWindowRateLimiter(redis, logger, 10, TimeSpan.FromMinutes(1));
});

And use it:

app.Use(async (context, next) =>
{
    var rateLimiter = context.RequestServices.GetRequiredService<SlidingWindowRateLimiter>();

    // Use client IP or user ID
    var clientKey = context.Connection.RemoteIpAddress?.ToString() ?? "unknown";

    if (!await rateLimiter.IsAllowedAsync(clientKey))
    {
        context.Response.StatusCode = 429;
        context.Response.Headers["Retry-After"] = "60"; // Optional
        await context.Response.WriteAsync("Too many requests. Try again later.");
        return;
    }

    await next();
});

This implementation uses Redis sorted sets to store and evaluate request timestamps. It enforces a sliding window by removing outdated requests and adding new ones, while counting the active ones within the window.

Limitation: Not atomic – race conditions can occur if multiple requests arrive simultaneously. For example, two concurrent requests can both see a count just below the limit and both be admitted, briefly exceeding the threshold.

Option 2 – Implementation with Lua

Here is the same rate limiter implemented with a Lua script. It guarantees that the window trimming, request counting, insertion, and TTL setting are done as a single atomic operation — no race conditions, no inconsistencies under high concurrency.

public class SlidingWindowLuaRateLimiter
{
    private readonly IDatabase _redis;
    private readonly ILogger<SlidingWindowLuaRateLimiter> _logger;
    private readonly LuaScript _script;
    private readonly int _limit;
    private readonly TimeSpan _window;

    public SlidingWindowLuaRateLimiter(
        IConnectionMultiplexer connectionMultiplexer, ILogger<SlidingWindowLuaRateLimiter> logger, int limit, TimeSpan window)
    {
        _redis = connectionMultiplexer.GetDatabase();
        _logger = logger;
        _limit = limit;
        _window = window;

        // Atomic Lua script: removes old requests, counts existing ones,
        // adds new one if limit not exceeded, and sets expiry
        _script = LuaScript.Prepare(@"
        local key = KEYS[1]
        local now = tonumber(ARGV[1])
        local window = tonumber(ARGV[2])
        local limit = tonumber(ARGV[3])
        local expire = tonumber(ARGV[4])

        redis.call('ZREMRANGEBYSCORE', key, 0, now - window)
        local count = redis.call('ZCARD', key)
        if count >= limit then
            return 0
        else
            redis.call('ZADD', key, now, now .. '-' .. math.random())
            redis.call('EXPIRE', key, expire)
            return 1
        end
    ");
    }

    /// <summary>
    /// Atomically determines whether a request is allowed.
    /// </summary>
    public async Task<bool> IsAllowedAsync(string key)
    {
        var redisKey = new RedisKey[] { $"rate_limit:{key}" };
        var now = DateTimeOffset.UtcNow.ToUnixTimeMilliseconds();
        var args = new RedisValue[]
        {
            now,                                // ARGV[1] = current timestamp
            (long)_window.TotalMilliseconds,    // ARGV[2] = window size
            _limit,                             // ARGV[3] = request limit
            (int)_window.TotalSeconds           // ARGV[4] = TTL
        };

        try
        {
            var result = (int)await _redis.ScriptEvaluateAsync(_script.OriginalScript, redisKey, args);
            return result == 1;
        }
        catch (Exception ex)
        {
            _logger.LogError(ex, "SlidingWindowLuaRateLimiter failed for key {Key}", key);
            return true; // Optionally fail open
        }
    }
}

Then register the new service:

builder.Services.AddSingleton(provider =>
{
    var redis = provider.GetRequiredService<IConnectionMultiplexer>();
    var logger = provider.GetRequiredService<ILogger<SlidingWindowLuaRateLimiter>>();
    return new SlidingWindowLuaRateLimiter(redis, logger, 10, TimeSpan.FromMinutes(1));
});

And use it:

app.Use(async (context, next) =>
{
    var rateLimiter = context.RequestServices.GetRequiredService<SlidingWindowLuaRateLimiter>();

    // Use client IP or user ID
    var clientKey = context.Connection.RemoteIpAddress?.ToString() ?? "unknown";

    if (!await rateLimiter.IsAllowedAsync(clientKey))
    {
        context.Response.StatusCode = 429;
        context.Response.Headers["Retry-After"] = "60"; // Optional
        await context.Response.WriteAsync("Too many requests. Try again later.");
        return;
    }

    await next();
});

Token Bucket Using Counters and Timestamps

The token bucket algorithm allows clients to make requests at a steady rate — but also permits short bursts as long as there are enough “tokens” in the bucket.

The implementation of the token bucket looks like this:

public class TokenBucketRateLimiter
{
    private readonly IDatabase _redis;
    private readonly ILogger<TokenBucketRateLimiter> _logger;
    private readonly LuaScript _luaScript;
    private readonly int _bucketCapacity;
    private readonly double _refillRatePerSecond;

    public TokenBucketRateLimiter(
        IConnectionMultiplexer connectionMultiplexer,
        ILogger<TokenBucketRateLimiter> logger,
        int bucketCapacity,
        double refillRatePerSecond)
    {
        _redis = connectionMultiplexer.GetDatabase();
        _logger = logger;
        _bucketCapacity = bucketCapacity;
        _refillRatePerSecond = refillRatePerSecond;

        // Lua script:
        // KEYS[1] = token key
        // KEYS[2] = timestamp key
        // ARGV[1] = current timestamp (ms)
        // ARGV[2] = bucket capacity
        // ARGV[3] = refill rate per second
        _luaScript = LuaScript.Prepare(@"
        local tokens_key = KEYS[1]
        local timestamp_key = KEYS[2]
        local now = tonumber(ARGV[1])
        local capacity = tonumber(ARGV[2])
        local refill_rate = tonumber(ARGV[3])

        local last_tokens = tonumber(redis.call('GET', tokens_key) or capacity)
        local last_refill = tonumber(redis.call('GET', timestamp_key) or now)

        local elapsed = now - last_refill
        local refill = math.floor(elapsed * refill_rate / 1000)
        local tokens = math.min(capacity, last_tokens + refill)

        if tokens <= 0 then
            return 0
        else
            tokens = tokens - 1
            redis.call('SET', tokens_key, tokens)
            redis.call('SET', timestamp_key, now)
            redis.call('PEXPIRE', tokens_key, 60000)
            redis.call('PEXPIRE', timestamp_key, 60000)
            return 1
        end
    ");
    }

    public async Task<bool> IsAllowedAsync(string key)
    {
        var now = DateTimeOffset.UtcNow.ToUnixTimeMilliseconds();
        var redisKeys = new RedisKey[]
        {
            new RedisKey($"token_bucket:{key}:tokens"),
            new RedisKey($"token_bucket:{key}:timestamp")
        };
        var redisArgs = new RedisValue[]
        {
            now,
            _bucketCapacity,
            _refillRatePerSecond
        };

        try
        {
            var result = (int)await _redis.ScriptEvaluateAsync(_luaScript.OriginalScript, redisKeys, redisArgs);
            return result == 1;
        }
        catch (Exception ex)
        {
            _logger.LogError(ex, "TokenBucketRateLimiter failed for key {Key}", key);
            return true; // fail-open strategy
        }
    }
}

This implementation of the token bucket algorithm ensures rate control with burst tolerance. Here’s how:

  • Each user/client has a bucket with a limited number of tokens (e.g., 10).
  • Tokens are refilled over time at a configurable rate (e.g., 1 token per second).
  • Every incoming request checks how many tokens exist:
    • If tokens are available, one is consumed and the request proceeds.
    • If no tokens remain, the request is rejected (429).

We store:

  • The current token count (Redis string key)
  • The last refill timestamp (Redis string key)
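
Registration and middleware usage mirror the sliding window limiter. A minimal sketch (the capacity and refill rate are example values):

builder.Services.AddSingleton(provider =>
{
    var redis = provider.GetRequiredService<IConnectionMultiplexer>();
    var logger = provider.GetRequiredService<ILogger<TokenBucketRateLimiter>>();

    // Example values: a bucket of 10 tokens, refilled at 1 token per second
    return new TokenBucketRateLimiter(redis, logger, bucketCapacity: 10, refillRatePerSecond: 1);
});

You can then gate requests with the same app.Use middleware pattern shown for the sliding window limiters.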

Gotchas and Things to Be Aware Of

  1. Clock Skew
    Redis-based rate limiters rely on timestamps. If you’re running across multiple regions or systems with inconsistent clocks, rate enforcement can become unreliable. Make sure your servers are synced via NTP.
  2. Key Expiry Management
    If keys aren’t expired properly (especially in fixed/sliding window logic), they can pile up and increase Redis memory usage. Always set TTLs with each request or through Lua.
  3. Fail-Open vs. Fail-Closed
    When Redis is down, should your API block all requests (fail-closed) or allow them (fail-open)? Choose based on your risk model — fail-open may be more user-friendly but risks abuse. One way to make the choice explicit is sketched below.
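
A minimal sketch of making the fallback behavior configurable (the _failOpen flag and EvaluateLimitAsync helper are hypothetical, not part of any library):

public async Task<bool> IsAllowedAsync(string key)
{
    try
    {
        // Evaluate the limit in Redis as in the implementations above
        // (EvaluateLimitAsync is a hypothetical helper wrapping that logic)
        return await EvaluateLimitAsync(key);
    }
    catch (RedisConnectionException ex)
    {
        _logger.LogError(ex, "Redis unavailable while rate limiting {Key}", key);

        // _failOpen is a hypothetical flag injected via the constructor:
        // true  -> allow traffic when Redis is down (fail-open)
        // false -> block traffic when Redis is down (fail-closed)
        return _failOpen;
    }
}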

Conclusion

Rate limiting and throttling are no longer optional features — they are foundational to building secure, fair, and resilient APIs. While .NET provides solid built-in support for local rate limiting, scaling to distributed environments requires more sophisticated strategies.

We explored and implemented three powerful techniques:

  • Fixed Window, which is simple and fast
  • Sliding Window, which improves fairness and accuracy
  • Token Bucket, which offers flexible rate control with burst tolerance

Thanks for stopping by –

Happy Coding

