SAS Data Management helps clean and organize data

SAS has updated its SAS Data Management suite, including SAS Data Loader for Hadoop, SAS Federation Server and SAS Event Stream Processing, to help data scientists spend less time preparing data and more time discovering value in it. “Data scientists and business analysts spend too much time cleaning and organizing data; they should be mining insights from it,” said Matthew Magne, Global Product Marketing Manager for Data Management at SAS. “These updates take the hassle out of data preparation, delivering faster access for organizations to the data they want, when and how they want it.”

The Town of Cary, N.C., uses SAS to integrate data from multiple sources, including its IoT-based water-conservation project Aquastar. By monitoring residents' water usage, Aquastar helps save resources and money. If the system detects abnormally high water usage, which can indicate a problem with an appliance or faucet, Aquastar immediately alerts the homeowner.

“The integration capabilities of SAS Data Integration Studio and SAS Data Management provide the needed infrastructure to rapidly analyze the data for our Internet of Things-based water meter initiative,” said Town of Cary Business Analyst Janelle Bailey. “We use SAS to blend data from different sources for a holistic view across multiple silos. As a result, we create triggers to identify higher-than-expected water usage which helps us to reduce cost, conserve water and improve the citizen experience.”
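The kind of trigger described above can be illustrated with a minimal sketch. It is not the Town of Cary's or SAS's actual logic: the z-score rule, the threshold of 3.0, and the names `is_abnormally_high`, `usage_alerts`, `readings_by_meter` and `contacts` are all assumptions made for illustration. The sketch blends two hypothetical sources, meter readings and homeowner contacts, and flags meters whose latest reading is far above their own history.

```python
from statistics import mean, stdev


def is_abnormally_high(history, latest, z_threshold=3.0):
    """Return True when `latest` is far above the meter's own history.

    Assumption: "abnormally high" means more than `z_threshold` sample
    standard deviations above the historical mean.
    """
    if len(history) < 2:
        return False  # not enough data to estimate a baseline
    baseline = mean(history)
    spread = stdev(history)
    if spread == 0:
        return latest > baseline  # perfectly flat history: any increase stands out
    return (latest - baseline) / spread > z_threshold


def usage_alerts(readings_by_meter, contacts, z_threshold=3.0):
    """Join readings with contact data and list the meters that need an alert."""
    alerts = []
    for meter_id, readings in readings_by_meter.items():
        if not readings:
            continue
        *history, latest = readings  # last value is the newest reading
        if is_abnormally_high(history, latest, z_threshold):
            alerts.append((meter_id, contacts.get(meter_id, "unknown"), latest))
    return alerts


if __name__ == "__main__":
    # Hypothetical sample data, in arbitrary usage units.
    readings = {"m1": [10, 11, 9, 10, 12, 50], "m2": [10, 11, 9, 10, 12, 11]}
    contacts = {"m1": "homeowner-1@example.com"}
    for meter_id, contact, value in usage_alerts(readings, contacts):
        print(f"Alert {contact}: meter {meter_id} reported {value}")
```

A per-meter baseline is used here so that a household with normally high usage is not flagged just for being above the town-wide average; a production system would likely also account for seasonality.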

Updates to the SAS Data Management suite span data cleansing, preparation, streaming and virtualization. They include:

• SAS Data Loader for Hadoop can run data quality functions in memory on Apache Spark; merge multiple tables in-cluster and run faster Cloudera Impala queries; and integrate with additional Hadoop distributions.

• SAS Federation Server has improved security with dynamic data masking and better accuracy with on-demand data quality.

• SAS Event Stream Processing has streamlined integration with Apache Camel and with YARN on Hadoop, and increased accuracy with in-stream data quality and machine learning.

• SAS/ACCESS to Amazon Redshift helps organizations create and execute Amazon Redshift code from SAS with both implicit conversion of SAS queries to Amazon Redshift SQL and explicit pass-through support.
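To make the dynamic data masking mentioned in the list above more concrete, here is a minimal sketch of the general idea: the stored data stays intact, and sensitive fields are masked at read time depending on who is asking. This is not SAS Federation Server's API or configuration; the functions `mask_value` and `read_row`, the role names, and the keep-last-four rule are all hypothetical.

```python
def mask_value(value, keep_last=4, mask_char="*"):
    """Hide all but the last `keep_last` characters of a value."""
    text = str(value)
    if len(text) <= keep_last:
        return mask_char * len(text)  # too short to reveal anything safely
    cut = len(text) - keep_last
    return mask_char * cut + text[cut:]


def read_row(row, role, sensitive_fields, privileged_roles=frozenset({"admin"})):
    """Return a copy of `row`, masking sensitive fields for non-privileged roles.

    The source row is never modified, so masking is applied per request
    ("dynamically") rather than baked into the stored data.
    """
    if role in privileged_roles:
        return dict(row)
    return {
        key: mask_value(value) if key in sensitive_fields else value
        for key, value in row.items()
    }


if __name__ == "__main__":
    # Hypothetical record and roles.
    record = {"name": "Ann", "account": "1234567890"}
    print(read_row(record, "analyst", {"account"}))
    print(read_row(record, "admin", {"account"}))
```

Because the mask is computed on each read, the same table can serve analysts and administrators without maintaining separate redacted copies.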