City of Chicago automates delivery of open data sets using Pentaho
City of Chicago has developed an OpenData ETL Utility Kit with Pentaho Data Integration, enabling the automated delivery of the city’s massive, open data sets from sources like traffic congestion, crime data, water quality statistics and car permit data. This data is driving the development of initiatives and applications that make life easier, healthier and more efficient for Chicago citizens.
The City of Chicago Department of Innovation and Technology (DoIT) was introduced in 2008 to add innovation to the charter of the former Department of Business and Information Services. As the central information technology organization for the City, DoIT provides a number of technology and telecommunications services to departments, the Mayor, Aldermen, other city agencies, residents, businesses and tourists. The department also supports City innovation by providing departments with tools to deliver valuable new processes and services, designing and implementing the systems that deliver them, and applying technology where appropriate to automate them.
Chicago has built a leading open data program that relies on automation to deliver updated data. While the program was initially created to increase transparency and accountability for Chicago’s government, multiple departments and developers have begun launching programs that enrich the lives of Chicago’s residents. For example, developers created Rack It, an iOS app that lets people find bike racks throughout the city. The SweepAround.us app uses street sweeping records to notify residents when to move their cars and avoid street cleaning tickets.
The OpenData ETL Utility Kit provides several utilities and a framework to help governments deploy automated ETLs using the open-source Pentaho Data Integration (Kettle) software. The toolkit loads data from a database and uploads it to a Socrata data portal, integrates with an SMTP server to e-mail administrators the outcome of ETL scripts, handles deployment issues that arise when multiple operating systems are used during development, and includes utilities that let administrators quickly analyze ETL log files for diagnostics.
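The log-analysis and e-mail alert steps described above can be illustrated with a minimal Python sketch. It is not taken from the OpenData ETL Utility Kit itself: the function names `summarize_log` and `build_alert`, the addresses, and the assumption that a failed run writes log lines containing the word "ERROR" are all hypothetical, chosen only to show the general idea.

```python
from email.message import EmailMessage


def summarize_log(log_text):
    """Count log lines and collect the ones flagged as errors.

    Assumption: a failed step writes a line containing the token "ERROR".
    """
    lines = log_text.splitlines()
    errors = [line for line in lines if "ERROR" in line]
    return {"lines": len(lines), "errors": errors, "ok": not errors}


def build_alert(etl_name, summary, sender, recipient):
    """Build (but do not send) an e-mail reporting the outcome of one ETL run."""
    status = "SUCCESS" if summary["ok"] else "FAILURE"
    msg = EmailMessage()
    msg["Subject"] = f"[ETL {status}] {etl_name}"
    msg["From"] = sender
    msg["To"] = recipient
    body = [f"{summary['lines']} log lines, {len(summary['errors'])} error(s)."]
    body.extend(summary["errors"])  # include the offending lines for diagnostics
    msg.set_content("\n".join(body))
    return msg


if __name__ == "__main__":
    # Hypothetical log excerpt; real log lines may be formatted differently.
    sample = "step A - Finished processing\nstep B - ERROR : connection refused"
    alert = build_alert("demo_dataset", summarize_log(sample),
                        "etl@example.org", "admin@example.org")
    print(alert["Subject"])
```

To actually deliver the message, an administrator could pass it to `smtplib.SMTP(host).send_message(msg)` from the Python standard library, with `host` set to the local SMTP server.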
“Pentaho allows us to automate the delivery of hundreds of data sets to the public, which can be used for accountability, application development, or even research,” said Tom Schenk, Chief Data Officer at City of Chicago. “Pentaho Data Integration has allowed us to develop an open-source framework that can be used by the hundreds of other cities with open data portals. Their commitment to an open source innovation model has helped us grow efficiently and share our work with other governments.”
The complexity and quantity of these data sets led the City of Chicago to embed Pentaho Data Integration to deliver hundreds of the city’s data sets to the public. The Pentaho ETL capabilities transform and refine the data, and the resulting framework allows any department in any city to deploy automated ETLs using the Pentaho software. Other cities are being encouraged to use the same tools that the City of Chicago has created, so that they too can start to address societal challenges through the lens of data.
“Pentaho makes it easy and cost-effective for the City of Chicago to orchestrate the process of preparing large, diverse, blended data sets to build these next generation applications and initiatives for the city,” said Donna Prlich, Vice President Product Marketing at Pentaho. “The work we’re seeing from the team is transformational - this is the true power of connecting people and data.”