Sign in to see all reviews and comparisons. It's Free!
Kudu is a columnar storage manager developed for the Apache Hadoop platform. Kudu shares the common technical properties of Hadoop ecosystem applications: it runs on commodity hardware, is horizontally scalable, and supports highly available operation.
• In-memory columnar execution path • Advanced in-process tracing capabilities • Extensive metrics support • Watchdog threads which check for latency outliers • Columnar storage allows efficient encoding and compression • Lazy data materialization and predicate pushdown
• Open source
• Open source
Small (<50 employees), Medium (50 to 1000 Enterprise (>1001 employees)
What is best?
• In-memory columnar execution path • Advanced in-process tracing capabilities • Extensive metrics support • Watchdog threads which check for latency outliers
What are the benefits?
• Strong performance for running sequential and random workloads simultaneously • High availability: Reads can be serviced by read-only follower tablets, even in the event of a leader tablet failure • Data Compression: Fulfill queries while reading even fewer blocks from disk • Integrated: Take advantage of the broader Hadoop ecosystem
Aggregated User Rating
Ease of use
Features & Functionality
Renew & Recommend
Apache Kudu is a free and open source column-oriented data store of the Apache Hadoop ecosystem.
Aggregated User Rating
You have rated this
Kudu is a columnar storage manager developed for the Apache Hadoop platform. Kudu shares the common technical properties of Hadoop ecosystem applications: it runs on commodity hardware, is horizontally scalable, and supports highly available operation. Kudu internally organizes its data by column rather than row. Columnar storage allows efficient encoding and compression. With techniques such as run-length encoding, differential encoding, and vectorized bit-packing, Kudu is as fast at reading the data as it is space-efficient at storing it. Columnar storage also dramatically reduces the amount of data IO required to service analytic queries. Using techniques such as lazy data materialization and predicate pushdown, Kudu can perform drill-down and needle-in-a-haystack queries over billions of rows and terabytes of data in seconds. Kudu is implemented in C++, so it can scale easily to large amounts of memory per node. With an in-memory columnar execution path, Kudu achieves good instruction-level parallelism using SIMD operations from the SSE4 and AVX instruction sets. Kudu gives architects the flexibility to address a wider variety of use cases without exotic workarounds. Kudu is designed to excel at use cases that require the combination of random reads / writes and the ability to do fast analytic scans—which previously required the creation of complex Lambda architectures. Engineered to take advantage of next-generation hardware and in-memory processing, Kudu lowers query latency significantly for Apache Impala (incubating) and Apache Spark. Kudu targets support for families of applications that are difficult or impossible to implement on current generation Hadoop storage technologies.
PAT RESEARCH is a B2B discovery platform which provides Best Practices, Buying Guides, Reviews, Ratings, Comparison, Research, Commentary, and Analysis for Enterprise Software and Services. We provide Best Practices, PAT Index™ enabled product reviews and user review comparisons to help IT decision makers such as CEO’s, CIO’s, Directors, and Executives to identify technologies, software, service and strategies.