Kafka Consumer Auto-Commit: Why 'At-Least-Once' Is Often Misunderstood
If you search online, you’ll often see statements like:
“Kafka consumers provide at-least-once delivery when auto-commit is enabled.”
While this statement is not entirely wrong, it is dangerously incomplete. Many production issues happen because developers take this at face value without understanding when offsets are committed and what exactly “at-least-once” means in practice.
In this article, we’ll look at what actually happens inside a Kafka consumer when auto-commit is enabled, why failures can still cause data loss, and when auto-commit is truly safe to use.
Why Auto-Commit Feels Safe
By default, the Kafka Java consumer has:
```
enable.auto.commit = true
auto.commit.interval.ms = 5000
```
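These defaults can be spelled out explicitly when building the consumer configuration, rather than relied on implicitly. A minimal sketch (the broker address and group id below are placeholders):

```java
import java.util.Properties;

public class DefaultConsumerConfig {
    // Make the two auto-commit defaults visible instead of implicit.
    static Properties consumerProps() {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // placeholder broker
        props.put("group.id", "demo-group");              // placeholder group id
        props.put("enable.auto.commit", "true");          // the default
        props.put("auto.commit.interval.ms", "5000");     // the default: every 5 seconds
        return props;
    }

    public static void main(String[] args) {
        Properties props = consumerProps();
        System.out.println(props.getProperty("auto.commit.interval.ms")); // 5000
    }
}
```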
This gives a comforting impression:
- Kafka periodically commits offsets
- If a consumer crashes, it restarts
- Messages should be reprocessed
So it feels like at-least-once delivery is guaranteed.
But the real question is:
At least once relative to what? Polling? Processing? Business logic?
Kafka only tracks polling. It does not track what your application does after that.
What Kafka Actually Commits
Kafka commits offsets, not messages.
An offset simply means:
“The next record the consumer should read.”
When auto-commit is enabled, the consumer periodically commits the latest offsets returned by poll(), regardless of whether your application has finished processing those records.
Kafka does not know:
- Whether you processed the record
- Whether processing succeeded or failed
- Whether your database write completed
From Kafka’s perspective, once poll() returns records, those offsets are eligible for commit.
The Actual Timeline (Critical to Understand)
Let’s walk through a realistic scenario:
- Consumer calls `poll()`
- Kafka returns records with offsets 100–120
- Auto-commit timer fires
- Offset 120 is committed
- Your application is still processing records
- The consumer crashes (OOM, JVM kill, container restart)
What happens next?
- Kafka sees offset 120 as committed
- On restart, the consumer resumes from 121
- Records 100–120 are never re-read
From your application’s point of view, those messages are effectively lost.
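The arithmetic of this failure mode can be sketched as a toy model, with no real Kafka client involved. The numbers follow the scenario above; the "crash point" at offset 104 is an assumption for illustration:

```java
// Toy model of the timeline above -- plain longs, no Kafka client involved.
public class AutoCommitTimeline {

    // Auto-commit records the position after the last polled record,
    // independent of how far processing has actually gotten.
    static long committedOffset(long lastPolledOffset) {
        return lastPolledOffset + 1;
    }

    // Records between the last fully processed offset and the last polled
    // offset are skipped on restart: the consumer resumes at the commit.
    static long lostRecords(long lastPolledOffset, long lastProcessedOffset) {
        return lastPolledOffset - lastProcessedOffset;
    }

    public static void main(String[] args) {
        long lastPolled = 120;    // poll() returned offsets 100-120
        long lastProcessed = 104; // crash hits mid-processing (illustrative)

        System.out.println("resume from " + committedOffset(lastPolled));             // 121
        System.out.println("records lost " + lostRecords(lastPolled, lastProcessed)); // 16
    }
}
```

The gap between `lastProcessed` and `lastPolled` is exactly the window that auto-commit leaves exposed.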
Why Is This Still Called “At-Least-Once”?
Because Kafka’s guarantee is scoped narrowly.
Kafka guarantees:
- Records returned by `poll()` will be delivered at least once to the consumer
- This holds only as long as offsets are not committed ahead of what has actually been polled
Kafka does not guarantee:
- At-least-once processing
- At-least-once database writes
- At-least-once business side effects
This distinction is often overlooked.
The Real-World Problem
In real systems:
- Processing may be asynchronous
- There are database writes, API calls, retries
- Processing can take seconds or even minutes
Auto-commit is time-based, not processing-based.
So the larger the gap between:
- Polling the record
- Finishing business processing
the higher the risk of data loss if the consumer crashes.
This is why teams sometimes observe:
- Missing records
- Inconsistent aggregates
- Silent data loss after restarts
Kafka usually behaves correctly — the misunderstanding is in how the guarantees are interpreted.
When Auto-Commit Is Actually Safe
Auto-commit can be acceptable when:
- Processing is very fast
- Processing is idempotent
- Losing a small number of records is acceptable
- The consumer does not maintain critical state
Typical examples:
- Metrics collection
- Log aggregation
- Monitoring events
- Best-effort analytics
In these cases, simplicity may outweigh strict correctness.
When You Should Avoid Auto-Commit
Avoid auto-commit when:
- You write to a database
- You update business state
- You perform non-idempotent operations
- You require strong delivery guarantees
In these situations, manual offset management provides better control:
- Process the record
- Ensure processing succeeds
- Commit the offset explicitly
It adds complexity, but it aligns offset commits with business success.
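A minimal sketch of that loop with the Java client follows. The topic name, broker address, and `process` step are placeholders, and `enable.auto.commit` must be set to `false`; this is one common shape of the pattern, not the only one:

```java
Properties props = new Properties();
props.put("bootstrap.servers", "localhost:9092"); // placeholder
props.put("group.id", "orders-service");          // placeholder
props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
props.put("enable.auto.commit", "false");         // take over offset management

try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
    consumer.subscribe(List.of("orders")); // hypothetical topic
    while (true) {
        ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(500));
        for (ConsumerRecord<String, String> record : records) {
            process(record); // your business logic; must succeed before commit
        }
        // Commit only after every record in this batch was processed.
        // If we crash before this line, the batch is redelivered
        // (duplicates, not loss) -- which is what at-least-once
        // processing actually means.
        consumer.commitSync();
    }
}
```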
Manual Commit Is Not Magic Either
Even with manual commits:
- Duplicates can still happen
- Rebalances can interrupt processing
- Commits can fail or be delayed
Kafka gives delivery guarantees, not business correctness guarantees.
Production systems should still be designed with:
- Idempotent processing
- Clear retry strategies
- Proper failure handling
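One common way to make duplicates harmless is to track, per partition, the highest offset already applied and skip anything at or below it. A minimal in-memory sketch (names are illustrative; a real system would persist this watermark in the same transaction as the state change):

```java
import java.util.HashMap;
import java.util.Map;

// Minimal sketch of idempotent processing: remember the highest offset
// already applied per partition, and treat redeliveries as no-ops.
public class IdempotentHandler {
    private final Map<Integer, Long> appliedUpTo = new HashMap<>(); // partition -> offset
    long total = 0; // toy business state (e.g., a running sum)

    // Returns true if the record changed state, false if it was a duplicate.
    boolean apply(int partition, long offset, long amount) {
        Long seen = appliedUpTo.get(partition);
        if (seen != null && offset <= seen) {
            return false; // already applied: a redelivery must not double-count
        }
        total += amount;                    // the side effect...
        appliedUpTo.put(partition, offset); // ...and the watermark, ideally atomically
        return true;
    }

    public static void main(String[] args) {
        IdempotentHandler handler = new IdempotentHandler();
        handler.apply(0, 100, 50);
        boolean changed = handler.apply(0, 100, 50); // same record redelivered
        System.out.println(handler.total + " " + changed); // 50 false
    }
}
```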
Key Takeaways
- Kafka commits offsets, not processing results
- Auto-commit is tied to `poll()`, not your business logic
- “At-least-once” does not mean “processed at least once”
- Auto-commit is fine for best-effort use cases
- For critical systems, explicit offset control is safer
Understanding this early can prevent subtle production bugs later.
What’s Next?
Now that we understand how offset commits affect at-least-once delivery, the next step is understanding what happens during consumer group rebalancing.
👉 Read: Kafka Eager vs Cooperative Rebalancing Explained
If you found this useful and want to share your thoughts, this article is also published on Dev.to where discussions are more active. You can read it there and leave a comment if you’d like:
I always appreciate feedback and different perspectives.