Confluent Platform Based on Kafka with Improved Enterprise Security
Confluent Platform 2.0 is based on an updated Apache Kafka 0.9 core. The new release represents a big step forward in the maturation of the technology, adding features that enable secure multi-tenant operation, simplify development and maintenance of applications that produce or consume data in Kafka, and provide high-throughput, scalable data integration with a wide array of data sources. These capabilities are designed to meet the needs of a growing wave of enterprise customers.
Created by Apache Kafka committers from Confluent, LinkedIn and other members of the vibrant Kafka community, Kafka 0.9 provides the critical foundation for the next wave of Apache Kafka adoption. Kafka is a wildly popular open source system for managing real-time streams of data from websites, applications and sensors. It is now used as fundamental infrastructure by thousands of companies, ranging from LinkedIn, Netflix and Uber to Cisco and Goldman Sachs.
Confluent Platform is built upon Apache Kafka to solve the challenges of data integration by providing a central stream data pipeline. It turns an organization’s data into readily available low-latency streams, and further acts as a buffer between systems that might produce or consume such data at different rates. The founders of Confluent created Kafka while at LinkedIn to help cope with the very large-scale data ingestion and processing requirements of the business networking service.
Confluent Platform 2.0 boasts new features to address the needs of large enterprises that must handle highly sensitive personal, financial or medical data and operate multi-tenant environments with strict quality-of-service standards. Details include:
Data Encryption: Encrypts traffic over the wire using SSL.
Authentication and Authorization: Allows access control with permissions that can be set on a per-user or per-application basis.
Quality of Service: Configurable quotas allow throttling of reads and writes to ensure quality of service.
Kafka Connect: A new connector-driven data integration feature that facilitates large-scale, real-time data import and export for Kafka. It lets developers integrate a wide range of data sources with Kafka without writing code, and supports 24/7 production environments with automatic fault tolerance, transparent high-capacity scale-out and centralized management, making more efficient use of IT staff.
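These capabilities are driven by broker and client configuration. As an illustrative sketch (the file paths, password and quota values below are placeholders, not shipped defaults):

```properties
# Client side: encrypt traffic to the brokers with SSL
security.protocol=SSL
ssl.truststore.location=/var/private/ssl/client.truststore.jks
ssl.truststore.password=changeit

# Broker side: enable per-user/per-application access control
authorizer.class.name=kafka.security.auth.SimpleAclAuthorizer

# Broker side: default produce/consume quotas, in bytes per second
quota.producer.default=10485760
quota.consumer.default=20971520
```

Similarly, a Kafka Connect source or sink is described entirely in configuration. This example, based on the file connector that ships with Kafka, streams lines of a local file (a hypothetical /tmp/test.txt) into a topic:

```properties
name=local-file-source
connector.class=org.apache.kafka.connect.file.FileStreamSourceConnector
tasks.max=1
file=/tmp/test.txt
topic=connect-test
```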
With Kafka, developers can build a centralized data pipeline enabling microservices or enterprise data integration, such as high-capacity ingestion routes for Apache Hadoop or traditional data warehouses, or as a foundation for advanced stream processing using engines like Apache Spark, Storm or Samza. Confluent Platform 2.0 includes a range of new developer-friendly features:
New Java consumer: Simplifies the development, deployment, scaling and maintenance of consumer applications that use Kafka.
Supported C client: A fully featured producer and consumer implementation.
Over 500 individual performance, operational and management improvements.
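A minimal sketch of the new Java consumer API is shown below. The broker address, group id and topic name are assumptions for illustration, and the example requires the kafka-clients library plus a running broker:

```java
import java.util.Arrays;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class ConsumerExample {
    public static void main(String[] args) {
        Properties props = new Properties();
        // Broker address and consumer group are placeholders for this sketch.
        props.put("bootstrap.servers", "localhost:9092");
        props.put("group.id", "example-group");
        props.put("key.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            // Group membership, partition assignment and failover are
            // handled by the broker-side coordinator in 0.9.
            consumer.subscribe(Arrays.asList("my-topic"));
            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(100);
                for (ConsumerRecord<String, String> record : records)
                    System.out.printf("offset=%d value=%s%n",
                            record.offset(), record.value());
            }
        }
    }
}
```

Unlike the older consumer, this single class replaces the separate "high-level" and "simple" consumers, with group coordination moved into the brokers.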
"Our industry is in the middle of a transition away from slow, batch data integration to real-time streams that make data available in milliseconds,” said Jay Kreps, CEO and co-founder of Confluent, and co-creator of Kafka. “Confluent Platform makes it easy for customers to embrace streaming data as a platform for data integration, real-time analytics and streaming data applications."
Kafka’s influence and impact continue to grow. Today it powers every part of LinkedIn’s business, where it has scaled to more than 1.1 trillion messages per day, and it drives Microsoft’s Bing, Ads and Office operations at more than 1 trillion messages per day.
"Companies know how important stream processing is for their ability to create reactive customer environments—and make their own business decisions moment to moment—but it's a difficult thing to do," said Doug Henschen, vice president and principal analyst at Constellation Research. "Kafka is becoming much more popular, as evidenced by the growing number of tools that are out there, and it anchors a stream processing ecosystem that is changing how businesses process data."
As demand for Kafka and Confluent Platform accelerates, organizations across all sectors are looking to evolve their data architectures to enable real-time stream processing. Confluent is the only company that enables businesses to take full advantage of Kafka, from initial development through ongoing operation.
“The Netflix Data Pipeline service handles more than 700 billion events each day to help deliver great experiences to our members around the world, and Apache Kafka is a key infrastructure component,” said Peter Bakas, Director of Engineering, Cloud Platform Engineering at Netflix. “As we continue to scale, we're excited to be working with the Confluent team to keep driving innovation in the Apache Kafka community.”
"At CA Technologies, Kafka plays a crucial role in making our data available as realtime streams, given its unique combination of reliability, throughput, and scalability,” said Bob Cotton, Senior Principal Software Engineer, CA Technologies, Agile Business Unit. “Confluent's expertise with Kafka really helps to enable successful production deployments, and we're excited by the new capabilities in Confluent Platform 2.0."