Syncsort Simplifies Integration of Streaming Data in Apache Spark, Kafka and Hadoop

Syncsort’s new capabilities include native integration with Apache Spark and Apache Kafka, allowing organizations to access and integrate enterprise-wide data with streams from real-time sources. The new release of Syncsort’s industry-leading data integration software, DMX-h v9, greatly simplifies Spark application development, allowing applications to leverage the increasing power of a rapidly evolving Big Data technology stack. DMX-h v9 also securely integrates batch and real-time data streams from Kafka, mainframe, relational databases, and unstructured sources in the same data pipeline feeding Apache Hadoop and Spark.

“Organizations need to access a complex ecosystem of enterprise-wide data sources to evaluate patterns, predict trends and gain competitive advantage,” said John Tripier, Senior Director of Business Development at Databricks. “We’re delighted that Syncsort is further expanding Spark’s enterprise capabilities by increasing the number of real-time and batch data sources it can connect to, such as the mainframe.”

“Adoption of Apache Kafka is accelerating due to the importance of stream processing in a growing number of real-time use cases,” said Jabari Norton, Vice President Business Development for Confluent. “Syncsort was one of the first companies we included in our new partner program due to their ability to create and continually enhance a simplified data pipeline between Kafka and other enterprise data sources, such as the mainframe.”

With the delivery of Spark support in DMX-h v9, customers can take the same jobs initially designed for MapReduce and run them natively in Spark. They can run the jobs in Spark by simply changing the execution framework from a drop-down menu in the graphical user interface, without requiring any rewriting or recompiling. This capability dramatically simplifies the process of moving applications from standalone server environments and from Hadoop MapReduce to Spark.

In addition, Syncsort’s DMX-h ships with Intelligent Execution, which dynamically plans for the applications at run-time based on the chosen compute framework, future-proofing the applications as the Big Data technology stack evolves.
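The idea of defining a job once and choosing the execution framework at run time can be illustrated with a minimal Python sketch. This is not DMX-h's implementation or API: the job, the step functions, and the two "frameworks" (a sequential runner and a thread-pool runner) are all hypothetical stand-ins, chosen only to show that the job definition stays unchanged while the runner is swapped.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Any, Callable, Dict, List

Record = Dict[str, Any]
Step = Callable[[Record], Record]


# Hypothetical per-record transformation steps (not part of DMX-h).
def normalize_name(rec: Record) -> Record:
    return {**rec, "name": str(rec["name"]).strip().title()}


def add_total(rec: Record) -> Record:
    return {**rec, "total": rec["qty"] * rec["price"]}


# The job is defined once, independently of how it will be executed.
JOB: List[Step] = [normalize_name, add_total]


def apply_steps(rec: Record, steps: List[Step]) -> Record:
    for step in steps:
        rec = step(rec)
    return rec


def run_sequential(records: List[Record], steps: List[Step]) -> List[Record]:
    return [apply_steps(r, steps) for r in records]


def run_threaded(records: List[Record], steps: List[Step]) -> List[Record]:
    # pool.map preserves input order, so results match the sequential runner.
    with ThreadPoolExecutor(max_workers=4) as pool:
        return list(pool.map(lambda r: apply_steps(r, steps), records))


# Stand-ins for execution frameworks; the names are assumptions.
FRAMEWORKS = {"sequential": run_sequential, "threaded": run_threaded}


def run_job(records: List[Record], steps: List[Step], framework: str) -> List[Record]:
    """Run the same job definition on whichever framework is selected."""
    return FRAMEWORKS[framework](records, steps)
```

Switching from `run_job(records, JOB, "sequential")` to `run_job(records, JOB, "threaded")` changes only the framework argument, which loosely mirrors selecting a different execution framework without rewriting the job.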

These new capabilities build on Syncsort’s previously announced contributions to Spark-packages, which allow enterprises to include historical transactional data along with real-time data sources such as mobile and the Internet of Things (IoT). Syncsort’s DMX-h supports Spark on both YARN and Mesos.

Syncsort’s integration with the Kafka distributed messaging system allows users to leverage DMX-h’s easy-to-use graphical interface to subscribe to, transform and enrich enterprise-wide data coming from real-time Kafka queues. DMX-h can now also publish these enriched datasets back to Kafka, simplifying the creation of real-time analytical applications by cleansing, pre-processing and transforming data in motion.
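The subscribe, cleanse, enrich and publish flow described above can be sketched in a few lines of Python. This is an illustration only, not DMX-h or a Kafka client: `InMemoryTopic` is a hypothetical in-memory stand-in for a Kafka topic, and the `CUSTOMERS` lookup table and event fields are assumptions made up for the example.

```python
import json
from collections import deque
from typing import Any, Deque, Dict, Optional


class InMemoryTopic:
    """Hypothetical stand-in for a Kafka topic: a FIFO queue of serialized messages."""

    def __init__(self) -> None:
        self._messages: Deque[bytes] = deque()

    def publish(self, value: bytes) -> None:
        self._messages.append(value)

    def poll(self) -> Optional[bytes]:
        return self._messages.popleft() if self._messages else None


# Assumed reference data used for enrichment.
CUSTOMERS = {"c1": {"segment": "retail"}, "c2": {"segment": "corporate"}}


def cleanse(event: Dict[str, Any]) -> Optional[Dict[str, Any]]:
    """Drop events missing required fields; normalize the rest."""
    if not event.get("customer_id") or event.get("amount") is None:
        return None
    return {
        "customer_id": str(event["customer_id"]).strip().lower(),
        "amount": float(event["amount"]),
    }


def enrich(event: Dict[str, Any]) -> Dict[str, Any]:
    """Attach the customer segment from the reference data."""
    profile = CUSTOMERS.get(event["customer_id"], {})
    return {**event, "segment": profile.get("segment", "unknown")}


def process(source: InMemoryTopic, sink: InMemoryTopic) -> int:
    """Consume every pending message, then publish the cleansed, enriched result."""
    published = 0
    while (raw := source.poll()) is not None:
        event = cleanse(json.loads(raw))
        if event is None:
            continue  # invalid events are skipped, not published
        sink.publish(json.dumps(enrich(event)).encode("utf-8"))
        published += 1
    return published
```

In a real deployment the two topics would be backed by a Kafka consumer and producer; the in-memory queue is used here only so the flow can be run and checked without a broker.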

Examples of customer use cases supported by new capabilities in DMX-h include:

Internet Analytics Company: Analyze and find “outliers” from billions of new digital events per month from high volume sources including IoT and mobile.

Healthcare Organization: Enable emergency units to analyze vital signs in real time to determine whether patients are in danger.

Financial Institution: Near real-time updates for online banking through customer events processed and integrated via Kafka data bus.

Insurance Company: Manage all customer application use cases using Spark, on a single platform.

“Many of our financial, telecommunications, insurance and healthcare customers need an easy way to gather, transform and distribute batch and streaming data coming from multiple enterprise data sources, including Kafka, for advanced analytics in Hadoop and Spark,” said Tendü Yoğurtçu, General Manager of Syncsort’s Big Data business. “The new capabilities we are delivering today meet those needs while continuing to provide best-in-class, secure access to mainframe data from within the fastest growing data platforms.”