Automating Data Preparation and Data Blending by ClearStory
ClearStory announced a breakthrough advancement to its industry-first data inference and Intelligent Data Harmonization capabilities called Infinite Data Overlap Detection (IDOD). ClearStory’s Spark-based, business-ready analytics solution now detects and infers data patterns and customer-specific data types for all values of all data types in every source that a user connects to as part of an analysis. The benefit to organizations is more automation in data discovery and even faster blending of sources, which further eliminates traditional data modeling complexities and speeds business insights.
ClearStory’s IDOD advancement addresses exactly the types of complexities that are prevalent across Global 2000 organizations. The new ability to blend and harmonize highly dimensional “categorical value” data sources addresses the root cause of the biggest delays and IT challenges in speeding business insights. Organizations benefit from more precise insights on large, complex data sources, including ones with a high degree of customer-specific information, which is common in companies across industries and contributes to a rise in data analysis complexity.
ClearStory’s new, large-scale IDOD capability is used to determine how complex data from multiple sources should be blended, viewed, and visualized on the fly. IDOD plays the role of data modeling advisor to business users, enabling them to blend data together and discover insights quickly, without data modeling expertise or days or weeks of manual effort. ClearStory’s approach replaces traditional methods of manually matching data and data relationships across diverse sources. Traditional approaches are not sustainable now that businesses urgently need to see insights faster.
Highlights of the new Infinite Data Overlap Detection (IDOD) capability include:
Smarter Data Inference: Detects and infers the overlap of categorical values for all data types across hundreds, millions, or even billions of unique values for attributes across all the source data being analyzed.
Infinite Types: No limits on how many unique custom data types, custom dimensions, or values can be recognized in each source for data inference and data harmonization.
Extensibility: New data types can be easily patterned and plugged into the capability for increased automation of vertical industries’ custom data types. This brings a powerful way to address vertical-specific and customer-specific data nuances and complexities. Further, it allows the ClearStory system to learn customer-specific attributes that further accelerate reaching insights and significantly reduce manual and complex data modeling.
Granular Data Scoring and Data Relationships: Detailed granular scores are calculated for each custom data type, and the values within are used to determine the right way to automatically blend data sources together into a holistic, harmonized view. Even data sources with hundreds of millions of unique values per attribute can be intelligently inferred and automatically scored and matched to enable users to reach fast, meaningful insights.
Simple User Experience: As in all areas of the ClearStory Data solution, ease of use and an intuitive user experience are of the utmost importance. With IDOD, ClearStory extends its data inference and Intelligent Data Harmonization™ user interface and experience to surface the power of the new, advanced processing engine in a simple, user-friendly way so users can be self-sufficient on even complex data sources.
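The announcement does not disclose how IDOD computes its scores. As a rough illustration of the general idea of scoring categorical value overlap between sources, the following minimal Python sketch compares the distinct values of every column pair across two small in-memory sources and ranks likely blend keys. The function names, the containment-style score, and the 0.5 threshold are all assumptions made for illustration, not ClearStory's method or API.

```python
from itertools import product


def overlap_score(left_values, right_values):
    """Share of the smaller distinct-value set that also appears in the other set.

    Hypothetical scoring rule chosen for illustration; returns 0.0 to 1.0.
    """
    left, right = set(left_values), set(right_values)
    if not left or not right:
        return 0.0
    return len(left & right) / min(len(left), len(right))


def suggest_blend_keys(source_a, source_b, threshold=0.5):
    """Score every column pair and return those at or above threshold, best first.

    Each source is a dict mapping a column name to a list of values.
    The threshold is an arbitrary illustrative cutoff.
    """
    scored = [
        (col_a, col_b, overlap_score(vals_a, vals_b))
        for (col_a, vals_a), (col_b, vals_b) in product(
            source_a.items(), source_b.items()
        )
    ]
    candidates = [c for c in scored if c[2] >= threshold]
    return sorted(candidates, key=lambda c: c[2], reverse=True)


# Two toy sources with differently named columns that share values.
sales = {
    "store_id": ["S1", "S2", "S3", "S4"],
    "region": ["West", "East", "West", "South"],
}
stores = {
    "id": ["S1", "S2", "S3", "S5"],
    "area": ["West", "East", "North"],
}

for col_a, col_b, score in suggest_blend_keys(sales, stores):
    print(f"{col_a} <-> {col_b}: {score:.2f}")
```

In this toy example the column pairs `store_id`/`id` and `region`/`area` are proposed as blend candidates even though their names differ, because the match is driven by the values themselves. A system operating on billions of unique values would presumably need distributed or approximate set comparison rather than in-memory sets.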
“ClearStory’s introduction of automated, machine-based advancements in data preparation, discovery and data harmonization continues to build on its Spark-based core IP to process large-scale data at high speeds,” said Dr. Tim Howes, CTO of ClearStory Data. “By adding the advanced IDOD capability to automatically recognize infinite categories, values and granularities in data sources, we speed the cycle of data to insights by addressing a significant pain point that enterprises across all industries face today: the intricate, tedious task and massive time sink caused by manual data wrangling on large, complex data.”
ClearStory Data’s new capabilities are offered as a core part of the ClearStory solution, and customers can use them as part of their standard offering. For more advanced users, the data extensibility feature can also be made available as a premium API-based service.