Kudu is a columnar storage manager developed for the Apache Hadoop platform. Kudu shares the common technical properties of Hadoop ecosystem applications: it runs on commodity hardware, is horizontally scalable, and supports highly available operation.
Category
Column-oriented DBMS
Features
• In-memory columnar execution path
• Advanced in-process tracing capabilities
• Extensive metrics support
• Watchdog threads which check for latency outliers
• Columnar storage allows efficient encoding and compression
• Lazy data materialization and predicate pushdown
License
• Open source
Price
• Open source
Pricing
Subscription
Free Trial
Available
Users Size
Small (<50 employees), Medium (50 to 1000 employees), Enterprise (>1001 employees)
Company
Apache Kudu
What is best?
• In-memory columnar execution path
• Advanced in-process tracing capabilities
• Extensive metrics support
• Watchdog threads which check for latency outliers
What are the benefits?
• Strong performance for running sequential and random workloads simultaneously
• High availability: reads can be serviced by read-only follower tablets, even in the event of a leader tablet failure
• Data compression: fulfill queries while reading even fewer blocks from disk
• Integrated: take advantage of the broader Hadoop ecosystem
PAT Rating™ (Editor Rating / Aggregated User Rating)
Ease of use: 7.6 / 6.4
Features & Functionality: 7.6 / 7.1
Advanced Features: 7.6 / 7.3
Integration: 7.6 / 7.4
Performance: 7.6 / 7.7
Customer Support: 7.6 / n/a
Implementation: n/a
Renew & Recommend: n/a
Bottom Line
Apache Kudu is a free, open-source, column-oriented data store in the Apache Hadoop ecosystem.
Editor Rating: 7.6
Aggregated User Rating: 7.2 (7 ratings)
Kudu internally organizes its data by column rather than row. Columnar storage allows efficient encoding and compression. With techniques such as run-length encoding, differential encoding, and vectorized bit-packing, Kudu is as fast at reading the data as it is space-efficient at storing it. Columnar storage also dramatically reduces the amount of data I/O required to service analytic queries. Using techniques such as lazy data materialization and predicate pushdown, Kudu can perform drill-down and needle-in-a-haystack queries over billions of rows and terabytes of data in seconds.
Kudu is implemented in C++, so it can scale easily to large amounts of memory per node. With an in-memory columnar execution path, Kudu achieves good instruction-level parallelism using SIMD operations from the SSE4 and AVX instruction sets.
Kudu gives architects the flexibility to address a wider variety of use cases without exotic workarounds. It is designed to excel at use cases that require the combination of random reads/writes and the ability to do fast analytic scans, which previously required the creation of complex Lambda architectures. Engineered to take advantage of next-generation hardware and in-memory processing, Kudu lowers query latency significantly for Apache Impala (incubating) and Apache Spark. Kudu targets support for families of applications that are difficult or impossible to implement on current generation Hadoop storage technologies.
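To make the ideas of run-length encoding, predicate pushdown, and lazy materialization more concrete, here is a minimal Python sketch. It is a toy illustration only, not Kudu's implementation or API: the function names (`rle_encode`, `matching_row_ids`, `scan`) and the sample table are hypothetical and exist purely for this example. The point it tries to show is that a filter can be evaluated once per run of an encoded column, and that other columns are read only for the rows that survive the filter.

```python
from itertools import groupby


def rle_encode(values):
    """Run-length encode a column into (value, run_length) pairs."""
    return [(value, sum(1 for _ in run)) for value, run in groupby(values)]


def matching_row_ids(encoded, predicate):
    """Evaluate the predicate once per run instead of once per row."""
    row_ids = []
    start = 0
    for value, length in encoded:
        if predicate(value):
            row_ids.extend(range(start, start + length))
        start += length
    return row_ids


def scan(table, filter_column, predicate, projected_columns):
    """Filter on one encoded column, then materialize only the
    projected columns, and only for the rows that matched."""
    encoded = rle_encode(table[filter_column])
    row_ids = matching_row_ids(encoded, predicate)
    return [{col: table[col][i] for col in projected_columns} for i in row_ids]


# Hypothetical sample data stored column by column.
table = {
    "host": ["a", "a", "a", "b", "b", "c"],
    "metric": ["cpu", "cpu", "mem", "cpu", "mem", "cpu"],
    "value": [10, 12, 40, 7, 35, 99],
}

result = scan(table, "host", lambda h: h == "b", ["value"])
print(result)
```

In this sketch the `host` column compresses from six values to three runs, the predicate runs three times rather than six, and the `metric` column is never touched because it is neither filtered nor projected. A real columnar engine applies the same ideas at the level of disk blocks and vectorized batches.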