Kafka Retries and Idempotent Producers Explained: Avoid Duplicates and Ensure Reliable Delivery


In the previous article, "Kafka Producer Acks Explained: Replicas, ISR, and Write Guarantees", we discussed when a producer considers a write successful and how acknowledgment settings affect durability and availability.

But even with correct acknowledgment settings, one important problem still remains:

What happens when a write fails in Kafka?

Or even more interesting:

What happens when Kafka thinks a write failed, but it actually succeeded?

This is where Kafka retries and idempotent producers become critical.


The Real Problem: Uncertain Failures in Distributed Systems

In distributed systems, failures are not always clear.

Consider this scenario:

  1. Producer sends a message to the leader.
  2. Leader writes the message successfully.
  3. Leader sends acknowledgment.
  4. Network issue occurs → acknowledgment is lost.

From Kafka’s perspective:

  • Broker: Write succeeded
  • Producer: Write failed

Now the producer retries.

👉 The same message gets written again.

This leads to duplicate messages in Kafka, even though the system behaved correctly.
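The four-step scenario above can be sketched as a toy model (not real Kafka client code): the broker stores the write, the acknowledgment is "lost", and a naive producer retries, storing the message twice.

```python
# Toy model of a lost acknowledgment -- illustrative only, not the Kafka client.
broker_log = []

def broker_write(message, ack_lost=False):
    """Append the message to the broker log; optionally 'lose' the ack."""
    broker_log.append(message)
    return None if ack_lost else "ack"

# First attempt: the write succeeds on the broker, but the ack is lost.
ack = broker_write("order-42", ack_lost=True)

# The producer sees no ack, assumes failure, and retries.
if ack is None:
    broker_write("order-42")

print(broker_log)  # ['order-42', 'order-42'] -- a duplicate
```

Both sides behaved correctly, yet the log now contains the message twice.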


Kafka Retries Explained

Kafka producers support automatic retries to handle transient failures.

Kafka Retry Configuration

retries=3              # number of retry attempts (modern clients default to Integer.MAX_VALUE)
retry.backoff.ms=100   # wait 100 ms between attempts

How Kafka Retries Work

  1. Producer sends a record.
  2. If it receives an error (or timeout), it retries.
  3. This continues until:
    • Retry count is exhausted, or
    • The send succeeds

When Do Kafka Retries Trigger?

Retries typically happen in scenarios like:

  • Temporary network failures
  • Leader broker not available
  • NOT_ENOUGH_REPLICAS
  • REQUEST_TIMED_OUT

These are recoverable errors, making retries useful.


Problem with Kafka Retries: Duplicate Messages

Retries improve reliability but introduce a major issue:

Duplicate message production

Why?

Because the producer cannot always distinguish between:

  • A failed write
  • A successful write with lost acknowledgment

So retrying can result in:

Message A → written
Retry Message A → written again


Message Ordering Issues with Retries

Retries can also impact ordering.

Example:

  • Message A is sent
  • Message B is sent
  • A fails and is retried later

Now B might be written before A's retry succeeds.

👉 This can break ordering guarantees.
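A toy illustration of this reordering (not real client code): with multiple requests in flight, A fails transiently, B lands, and then A's retry lands after B.

```python
# Simulate reordering caused by a retried send -- illustrative only.
log = []
fail_first_A = {"pending": True}

def broker_write(message):
    # Simulate a transient failure on A's first attempt only.
    if message == "A" and fail_first_A["pending"]:
        fail_first_A["pending"] = False
        raise TimeoutError("transient failure")
    log.append(message)

for message in ["A", "B"]:
    try:
        broker_write(message)
    except TimeoutError:
        pass  # A failed; B was already in flight and succeeds first

broker_write("A")  # A's retry lands after B
print(log)  # ['B', 'A'] -- ordering is broken
```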

Kafka controls this using:

max.in.flight.requests.per.connection

But retries alone cannot guarantee correctness.


Kafka Idempotent Producer

To solve duplicate messages in Kafka, we use:

Idempotent Producer


What is Idempotence in Kafka?

Idempotence means:

Sending the same message multiple times results in it being written only once.

In Kafka:

👉 Even if retries happen, duplicate messages are not stored.


How Kafka Idempotent Producer Works

Kafka ensures idempotency using:

1. Producer ID (PID)

Each producer gets a unique identifier from the broker.


2. Sequence Numbers

  • Each message has a sequence number per partition
  • Broker tracks the latest sequence number

Duplicate Detection

On retry:

  • Same sequence number is sent
  • Broker detects duplicate
  • Duplicate message is discarded
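The PID + sequence-number mechanism can be sketched like this. It is a simplified model (the real broker tracks a small window of recent sequence numbers per producer, not just the latest), but it shows why a retried send is discarded rather than appended.

```python
# Simplified model of broker-side duplicate detection -- not broker code.
last_seq = {}        # (pid, partition) -> highest sequence number accepted
partition_log = []

def broker_append(pid, partition, seq, message):
    key = (pid, partition)
    if seq <= last_seq.get(key, -1):
        return "duplicate-discarded"   # already appended this sequence number
    last_seq[key] = seq
    partition_log.append(message)
    return "ack"

r1 = broker_append(pid=7, partition=0, seq=0, message="order-42")
# Retry of the same (pid, seq) pair after a lost ack:
r2 = broker_append(pid=7, partition=0, seq=0, message="order-42")

print(r1, r2)          # ack duplicate-discarded
print(partition_log)   # ['order-42'] -- written only once
```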

Enable Idempotent Producer in Kafka

enable.idempotence=true

This single setting is enough to enable duplicate protection. Since Kafka 3.0, it is the default for the Java producer.


Important Kafka Config Changes with Idempotence

When idempotence is enabled, Kafka automatically enforces:

acks=all
retries=Integer.MAX_VALUE
max.in.flight.requests.per.connection=5
# Note: For idempotent producers, this number should be ≤5 to preserve ordering and ensure no duplicates

Why These Settings Matter

  • acks=all → ensures durability
  • retries=∞ → safe retry mechanism
  • limited in-flight requests → preserves ordering

Scope of Idempotent Producer

What It Guarantees

  • No duplicate messages per partition
  • Safe retries
  • Ordering guarantees (with correct config)

What It Does Not Guarantee

  • No duplicates across producers
  • No duplicates across restarts
  • End-to-end exactly-once processing

For that, Kafka provides transactions.


Kafka Retries vs Idempotent Producer

| Feature     | Without Idempotence   | With Idempotence |
|-------------|-----------------------|------------------|
| Retries     | Can create duplicates | Safe             |
| Reliability | Moderate              | High             |
| Ordering    | Can break             | Preserved        |

A recommended producer configuration combining both:

acks=all
enable.idempotence=true
retries=Integer.MAX_VALUE
retry.backoff.ms=100
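The same configuration in dict form, as used by the confluent-kafka Python client (the client choice is an assumption; the config key names are the standard Kafka producer settings):

```python
# Producer configuration enabling idempotent, retry-safe delivery.
# The broker address is hypothetical.
config = {
    "bootstrap.servers": "localhost:9092",  # hypothetical broker address
    "enable.idempotence": True,             # turns on PID + sequence numbers
    "acks": "all",                          # implied by idempotence
    "retries": 2147483647,                  # Integer.MAX_VALUE
    "retry.backoff.ms": 100,
}
```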

This setup ensures:

  • High reliability
  • No duplicate messages
  • Strong durability guarantees

When to Use Idempotent Producer

Use idempotent producers in:

  • Payment systems
  • Order processing
  • Inventory management
  • Critical event-driven systems

In modern Kafka setups:

👉 It should almost always be enabled.


Closing Thoughts

Kafka retries are essential for handling transient failures, but they introduce the risk of duplicate messages.

Idempotent producers eliminate this risk by making retries safe.

Together, they ensure:

  • Reliable message delivery
  • No duplication
  • Strong consistency at the producer level

Summary

Kafka retries help recover from failures but can cause duplicate messages.

Idempotent producers solve this by ensuring messages are written exactly once per partition.

  • Retries improve fault tolerance
  • Idempotence ensures correctness
  • Together they enable reliable Kafka pipelines

If you found this useful and want to share your thoughts, this article is also published on Dev.to where discussions are more active. You can read it there and leave a comment if you’d like:

https://dev.to/rajeev_a954661bb78eb9797f/kafka-retries-and-idempotent-producers-explained-avoid-duplicates-and-ensure-reliable-delivery-gj7

I always appreciate feedback and different perspectives.