ClearStory Automated Intelligent Data Harmonization: A Spark-Based Solution
ClearStory Data announced advancements to its Spark-based Data Inference and Intelligent Data Harmonization capabilities, adding further machine-based discovery and scoring of data sources and their contents. The scoring drives fine-tuned recommendations for users who otherwise struggle to wrangle data and combine disparate sources to answer new questions. The innovation computes and tracks alternate, and in some cases richer, data blending and harmonization strategies so users can quickly identify and pick the data sources and elements that answer business questions accurately. The benefit to organizations is up to 20 times faster data discovery when determining the optimal overlap between large, disparate data sets, along with the ability to scale data discovery and preparation across more complex data sources and more users.
Global 2000 enterprises today are looking to combine complex, multi-structured data from multiple sources (including existing data repositories, on-premises and SaaS applications, and third-party data) in order to answer pressing business questions. When combining three, four, six or more sources at one time, a single complex source in the mix is enough to turn the entire data prep and blending process into a labor-intensive headache that leaves many business stakeholders dissatisfied with the process and results. ClearStory provides the speed and flexibility that businesses need to ask and answer questions quickly based on data that is constantly evolving and in flux.
Traditional data analysis solutions were not designed for the speed and flexibility that most enterprises now demand, or for the profile and skills of today’s business users. Organizations struggle with accessing, preparing, combining and analyzing data because of the complexity within the data sources and the skill sets of those users, who are typically not data scientists or expert analysts. ClearStory’s business-optimized usability, coupled with an intelligent, machine-based approach to discovering and combining data even on large, complex sources, ensures a fast, consistent and simple experience so anyone can be self-sufficient in combining data and answering questions.
Key components of ClearStory’s Intelligent Data Harmonization advances include:
Granular Scoring for Alternate Blending Recommendations: The advances in Data Harmonization allow discovery and assessment of how data sources can be combined, and generate more granular alternate scores. The ClearStory system recommends distinct blending options based on these granular scores, enabling users to select the system-driven recommendation, or select alternate strategies guided by the system in a business-friendly user model.
Seamless Coupling for Infinite Data Overlap Detection (IDOD): ClearStory’s Intelligent Data Harmonization™ advancements let users extend and expand on the recently announced Infinite Data Overlap Detection (IDOD) capability, which provides the ability to infer and harmonize data sets across unlimited categorical values for all data types. Organizations in different industries and verticals can take advantage of IDOD to automate blending of industry and company-specific data sets quickly and accurately.
Faster Performance and Scale across More Complex Data and More Users: The new harmonization advancements speed discovery, scoring and recommendations by up to 20 times when combining many sources that each may have many attributes and dimensions. Performance and scale are especially critical for organizations that have many users simultaneously accessing the same sources and harmonizing data across them to answer different questions.
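The announcement does not describe how ClearStory computes its scores, so the following is only an illustrative sketch of the general idea behind scoring alternate blending options. It assumes a simple Jaccard overlap between the distinct values of two columns as the score; the function names `jaccard` and `rank_blend_keys` and the sample data are hypothetical and are not part of any ClearStory API.

```python
from itertools import product


def jaccard(a, b):
    """Share of distinct values that two columns have in common (0.0 to 1.0)."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def rank_blend_keys(left, right):
    """Score every column pair across two sources and rank the pairs.

    left, right: dict mapping a column name to its list of values.
    Returns (score, left_column, right_column) tuples, best score first.
    """
    scored = [
        (jaccard(lv, rv), lc, rc)
        for (lc, lv), (rc, rv) in product(left.items(), right.items())
    ]
    return sorted(scored, key=lambda t: t[0], reverse=True)


# Hypothetical sample sources, invented for illustration only.
sales = {"region": ["east", "west", "north"], "sku": ["a1", "b2", "c3"]}
stores = {"area": ["east", "west", "south"], "manager": ["kim", "lee", "ola"]}

ranking = rank_blend_keys(sales, stores)
for score, left_col, right_col in ranking:
    print(f"{left_col} <-> {right_col}: {score:.2f}")
```

In this sketch the top-ranked pair plays the role of the system-driven recommendation, and the remaining pairs stand in for the alternate strategies a user could pick instead. A production system would likely also weigh data types, cardinality and value formats rather than raw overlap alone.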
“With our new advances that deliver faster, automated data preparation and granular harmonization, we’ve added more data science smarts inside our Spark-based platform,” said Tim Howes, CTO of ClearStory Data. “Everyday business users and non-expert analysts can now benefit from less frustration and time spent wrangling with large, complex data sets. Our fast data discovery, harmonization usability, and more granular recommendations let users spend more time answering urgent business questions, reaching meaningful insights, and iterating on-the-fly.”