Kafka Consumer Auto-Commit: Why 'At-Least-Once' Is Often Misunderstood


If you search online, you’ll often see statements like:

“Kafka consumers provide at-least-once delivery when auto-commit is enabled.”

While this statement is not entirely wrong, it is dangerously incomplete. Many production issues happen because developers take this at face value without understanding when offsets are committed and what exactly “at-least-once” means in practice.

In this article, we’ll look at what actually happens inside a Kafka consumer when auto-commit is enabled, why failures can still cause data loss, and when auto-commit is truly safe to use.


Why Auto-Commit Feels Safe

By default, the Kafka Java consumer has:

  • enable.auto.commit = true
  • auto.commit.interval.ms = 5000
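These defaults can be spelled out explicitly. A minimal sketch using `java.util.Properties`; the broker address and group id are placeholders, while the property keys are the standard Kafka client configuration names:

```java
import java.util.Properties;

public class AutoCommitDefaults {
    public static Properties consumerDefaults() {
        Properties props = new Properties();
        // Placeholder connection settings for illustration
        props.setProperty("bootstrap.servers", "localhost:9092");
        props.setProperty("group.id", "example-group");
        // The two defaults above, written out explicitly
        props.setProperty("enable.auto.commit", "true");
        props.setProperty("auto.commit.interval.ms", "5000");
        return props;
    }

    public static void main(String[] args) {
        Properties props = consumerDefaults();
        System.out.println("enable.auto.commit=" + props.getProperty("enable.auto.commit"));
        System.out.println("auto.commit.interval.ms=" + props.getProperty("auto.commit.interval.ms"));
    }
}
```

Passing a Properties object like this to a KafkaConsumer constructor gives exactly the out-of-the-box commit behavior described above, whether or not you set the two keys yourself.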

This gives a comforting impression:

  • Kafka periodically commits offsets
  • If a consumer crashes, it restarts
  • Messages should be reprocessed

So it feels like at-least-once delivery is guaranteed.

But the real question is:

At least once relative to what? Polling? Processing? Business logic?

Kafka only tracks polling. It does not track what your application does after that.


What Kafka Actually Commits

Kafka commits offsets, not messages.

An offset simply means:

“The next record the consumer should read.”

When auto-commit is enabled, the consumer periodically commits the latest offsets returned by poll(), regardless of whether your application has finished processing those records.

Kafka does not know:

  • Whether you processed the record
  • Whether processing succeeded or failed
  • Whether your database write completed

From Kafka’s perspective, once poll() returns records, those offsets are eligible for commit.


The Actual Timeline (Critical to Understand)

Let’s walk through a realistic scenario:

  1. Consumer calls poll()
  2. Kafka returns records with offsets 100–120
  3. The auto-commit timer fires
  4. Offset 121 (the next offset to read, per the definition above) is committed
  5. Your application is still processing records 100–120
  6. The consumer crashes (OOM, JVM kill, container restart)

What happens next?

  • Kafka sees 121 as the committed position
  • On restart, the consumer resumes from 121
  • Records 100–120 are never re-read

From your application’s point of view, those messages are effectively lost.
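The timeline above can be sketched as a plain-Java simulation. No broker or client library is involved; the offsets, the commit timing, and the crash point are all modeled directly, just to make the ordering concrete:

```java
import java.util.ArrayList;
import java.util.List;

public class AutoCommitLossDemo {
    // Step 2: offsets returned by a single poll() call.
    static List<Integer> poll() {
        List<Integer> polled = new ArrayList<>();
        for (int offset = 100; offset <= 120; offset++) polled.add(offset);
        return polled;
    }

    // Steps 3-4: the timer commits the position after the last polled
    // record, regardless of how far processing has gotten.
    static int autoCommit(List<Integer> polled) {
        return polled.get(polled.size() - 1) + 1;
    }

    public static void main(String[] args) {
        List<Integer> polled = poll();
        int committed = autoCommit(polled);

        // Steps 5-6: processing dies partway through the batch.
        List<Integer> processed = new ArrayList<>();
        for (int offset : polled) {
            if (offset == 105) break; // simulated crash
            processed.add(offset);
        }

        // On restart the consumer resumes from the committed position:
        // offsets 105-120 were never processed and are never re-read.
        System.out.println("processed before crash: " + processed);
        System.out.println("restart resumes from: " + committed);
    }
}
```

The committed position (121) never looked at the `processed` list, which is the whole problem.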


Why Is This Still Called “At-Least-Once”?

Because Kafka’s guarantee is scoped narrowly.

Kafka guarantees:

  • Records whose offsets have not yet been committed will be delivered to poll() at least once
  • The guarantee ends the moment an offset is committed, whether or not the record was processed

Kafka does not guarantee:

  • At-least-once processing
  • At-least-once database writes
  • At-least-once business side effects

This distinction is often overlooked.


The Real-World Problem

In real systems:

  • Processing may be asynchronous
  • There are database writes, API calls, retries
  • Processing can take seconds or even minutes

Auto-commit is time-based, not processing-based.

So the larger the gap between:

  • Polling the record
  • Finishing business processing

the higher the risk of data loss if the consumer crashes.

This is why teams sometimes observe:

  • Missing records
  • Inconsistent aggregates
  • Silent data loss after restarts

In most of these incidents, Kafka behaved exactly as documented; the misunderstanding was in how its guarantees were interpreted.


When Auto-Commit Is Actually Safe

Auto-commit can be acceptable when:

  1. Processing is very fast
  2. Processing is idempotent
  3. Losing a small number of records is acceptable
  4. The consumer does not maintain critical state

Typical examples:

  • Metrics collection
  • Log aggregation
  • Monitoring events
  • Best-effort analytics

In these cases, simplicity may outweigh strict correctness.


When You Should Avoid Auto-Commit

Avoid auto-commit when:

  • You write to a database
  • You update business state
  • You perform non-idempotent operations
  • You require strong delivery guarantees

In these situations, manual offset management provides better control:

  1. Process the record
  2. Ensure processing succeeds
  3. Commit the offset explicitly

It adds complexity, but it aligns offset commits with business success.
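The three steps above can be sketched in the same simulated style. Again there is no broker; in a real consumer, the commit step would be an explicit commitSync() call after processing succeeds:

```java
import java.util.ArrayList;
import java.util.List;

public class ManualCommitDemo {
    // Commit position: the next offset to read. It advances only
    // after a record has been fully processed.
    static int committedOffset = 100;
    static List<Integer> processed = new ArrayList<>();

    static void run(boolean crashAt105) {
        for (int offset = committedOffset; offset <= 120; offset++) {
            if (crashAt105 && offset == 105) return; // crash before commit
            processed.add(offset);        // 1-2. process and confirm success
            committedOffset = offset + 1; // 3. commit explicitly, afterwards
        }
    }

    public static void main(String[] args) {
        run(true);   // first attempt dies mid-batch at offset 105
        run(false);  // restart resumes from the committed offset
        System.out.println("committed position: " + committedOffset);
        System.out.println("records processed: " + processed.size());
    }
}
```

Because the commit only moves after processing succeeds, a crash causes the restart to re-read and re-process records rather than skip them: possible duplicates instead of loss. (Committing per record is shown for clarity; real systems usually commit per batch, which widens the duplicate window slightly.)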


Manual Commit Is Not Magic Either

Even with manual commits:

  • Duplicates can still happen
  • Rebalances can interrupt processing
  • Commits can fail or be delayed

Kafka gives delivery guarantees, not business correctness guarantees.

Production systems should still be designed with:

  • Idempotent processing
  • Clear retry strategies
  • Proper failure handling
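Idempotent processing is the standard defense against those duplicates. A minimal sketch, assuming each record carries a unique business key; the in-memory set here stands in for whatever dedup store a real system would use:

```java
import java.util.HashSet;
import java.util.Set;

public class IdempotentHandler {
    private final Set<String> seenKeys = new HashSet<>();
    private int total = 0;

    // Applying the same record twice leaves the state unchanged.
    public void handle(String key, int amount) {
        if (!seenKeys.add(key)) return; // duplicate delivery: skip
        total += amount;
    }

    public int total() { return total; }

    public static void main(String[] args) {
        IdempotentHandler handler = new IdempotentHandler();
        handler.handle("order-1", 10);
        handler.handle("order-2", 5);
        handler.handle("order-1", 10); // redelivered after a rebalance
        System.out.println(handler.total()); // 15, not 25
    }
}
```

With a handler like this, a redelivery caused by a rebalance or a failed commit is harmless, which is what makes at-least-once delivery safe to build on.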

Key Takeaways

  • Kafka commits offsets, not processing results
  • Auto-commit is tied to poll(), not your business logic
  • “At-least-once” does not mean “processed at least once”
  • Auto-commit is fine for best-effort use cases
  • For critical systems, explicit offset control is safer

Understanding this early can prevent subtle production bugs later.


What’s Next?

Now that we understand how offset commits affect at-least-once delivery, the next step is understanding what happens during consumer group rebalancing.

👉 Read: Kafka Eager vs Cooperative Rebalancing Explained


If you found this useful and want to share your thoughts, this article is also published on Dev.to where discussions are more active. You can read it there and leave a comment if you’d like:

https://dev.to/rajeev_a954661bb78eb9797f/kafka-consumer-auto-commit-why-at-least-once-is-often-misunderstood-15hn

I always appreciate feedback and different perspectives.