Apache Kafka is an open-source message broker project to provide a unified, high-throughput, low-latency platform for handling real-time data feeds. Kafka is a distributed, partitioned, replicated commit log service.
Bigdata Ingestion Software
Publish and subscribe to streams of records, similar to a message queue or enterprise messaging system.
Store streams of records in a fault-tolerant durable way.
Process streams of records as they occur.
Building real-time streaming data pipelines that reliably get data between systems or applications
Building real-time streaming applications that transform or react to the streams of data
Apache Kafka is an open-source message broker project to provide a unified, high-throughput, low-latency platform for handling real-time data feeds. Kafka is a distributed, partitioned, replicated commit log service. It provides the functionality of a messaging system, but with a unique design. Kafka has a modern cluster-centric design that offers strong durability and fault-tolerance guarantees
Kafka is designed to allow a single cluster to serve as the central data backbone for a large organization. It can be elastically and transparently expanded without downtime. Data streams are partitioned and spread over a cluster of machines to allow data streams larger than the capability of any single machine and to allow clusters of co-ordinated consumers.
Apache Kafka is widely adopted for use cases ranging from collecting user activity data, logs, application metrics, stock ticker data and device instrumentation. Its key strength is its ability to make high volume data available as a real-time stream for consumption in systems with very different requirements—from batch systems like Hadoop, to real-time systems that require low-latency access, to stream processing engines that transform the data streams as they arrive.