Summary of Kafka Fundamentals: Understanding Segments, Commit Log, and Retention Policy

Each partition in Kafka is stored as a sequence of segments, and each segment holds a contiguous run of that partition's messages.

New messages are appended to the partition's active (most recent) segment.
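
As a rough illustration of the append behavior described above, here is a minimal in-memory sketch. The `Segment` and `Partition` classes and the `segment_bytes` parameter are hypothetical names chosen for this example, not Kafka's API. The sketch assumes a new segment is started when the active one would exceed a size limit, which is a simplification of what a real broker does.

```python
from dataclasses import dataclass, field


@dataclass
class Segment:
    base_offset: int                      # offset of the first message in this segment
    messages: list = field(default_factory=list)
    size_bytes: int = 0


class Partition:
    """Toy model of a partition as an ordered list of segments (not Kafka's API)."""

    def __init__(self, segment_bytes: int):
        self.segment_bytes = segment_bytes          # assumed per-segment size limit
        self.segments = [Segment(base_offset=0)]    # the last entry is the active segment
        self.next_offset = 0

    def append(self, payload: bytes) -> int:
        active = self.segments[-1]
        # Start a new segment if this message would push the active one past the limit.
        if active.messages and active.size_bytes + len(payload) > self.segment_bytes:
            active = Segment(base_offset=self.next_offset)
            self.segments.append(active)
        active.messages.append(payload)
        active.size_bytes += len(payload)
        offset = self.next_offset
        self.next_offset += 1
        return offset


partition = Partition(segment_bytes=10)
offsets = [partition.append(b"aaaa") for _ in range(5)]
base_offsets = [s.base_offset for s in partition.segments]
```

With a 10-byte limit and 4-byte messages, each segment in this toy model holds two messages before a new one is started.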

The commit log in Kafka is the append-only log that stores the actual data written by producers.

The retention policy in Kafka determines how long data should be kept before deletion.

There are two types of retention policies: size-based and time-based.

A cleaner process runs in the background, periodically checking which data has outlived the retention policy and deleting it.
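
The time-based check can be sketched as follows. This is an illustrative model only: the `Segment` class and the `expired_segments` function are hypothetical names, and the sketch assumes that whole segments (not individual messages) are removed and that the active segment is never deleted, which simplifies what a real broker does.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    base_offset: int
    last_timestamp_ms: int   # timestamp of the newest message in the segment


def expired_segments(segments, now_ms, retention_ms):
    """Return closed segments whose newest message is older than the retention window."""
    closed = segments[:-1]   # assumption: the active (last) segment is left alone
    return [s for s in closed if now_ms - s.last_timestamp_ms > retention_ms]


segments = [
    Segment(base_offset=0, last_timestamp_ms=0),
    Segment(base_offset=100, last_timestamp_ms=4_000),
    Segment(base_offset=200, last_timestamp_ms=5_000),   # active segment
]
to_delete = expired_segments(segments, now_ms=6_000, retention_ms=3_000)
```

Here only the first segment qualifies: its newest message is 6,000 ms old, beyond the 3,000 ms window, while the second is only 2,000 ms old.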

The default retention period in Kafka is 168 hours (7 days).
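
For reference, the arithmetic behind that default can be checked in a few lines. The variable names are illustrative; the 168-hour figure is the one given above.

```python
RETENTION_HOURS = 168  # default retention period stated above

retention_days = RETENTION_HOURS / 24             # hours -> days
retention_ms = RETENTION_HOURS * 60 * 60 * 1000   # hours -> milliseconds

print(retention_days)  # 7.0
print(retention_ms)    # 604800000
```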

A new segment is created once the active one fills up, and the oldest segments are deleted to keep the partition within its size limit.
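
Size-based cleanup can be sketched in the same spirit. The function `enforce_size_limit` and its parameters are hypothetical, and the rule used here (drop the oldest segments until the total fits under the limit, always keeping the newest one) is a simplification; the exact rule a real broker applies may differ in detail.

```python
def enforce_size_limit(segment_sizes, retention_bytes):
    """Drop oldest segments until the total size fits within retention_bytes.

    segment_sizes is ordered oldest first. The newest (active) segment is
    always kept in this sketch, even if it alone exceeds the limit.
    """
    kept = list(segment_sizes)
    deleted = []
    while len(kept) > 1 and sum(kept) > retention_bytes:
        deleted.append(kept.pop(0))   # remove the oldest segment first
    return kept, deleted


kept, deleted = enforce_size_limit([100, 100, 100, 50], retention_bytes=200)
```

In this example the partition starts at 350 bytes; removing the two oldest 100-byte segments brings it down to 150 bytes, back under the 200-byte limit.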

Notable Quotes

04:08 — « "But for how long should that data remain, let us tell you. There is a retention policy." »
04:31 — « "What is the second policy? The second policy is time based policy." »
05:03 — « "The cleaner process keeps running in the background, it keeps checking whether the retention time of which message has ended." »
05:35 — « "The size which is within our limit should again come within our limit." »

Category

Educational