Review

Pandas

Overview

Synopsis

Pandas is an open source, BSD-licensed library providing high-performance, easy-to-use data structures and data analysis tools for the Python programming language.

Category

Data Mining Software Free

Features

•New .agg() API for Series/DataFrame, similar to the groupby-rolling-resample APIs
•Integration with the feather-format, including a new top-level pd.read_feather() and DataFrame.to_feather() method
•The .ix indexer has been deprecated
•Panel has been deprecated
•Addition of an IntervalIndex and Interval scalar type
•Improved user API when accessing levels in .groupby()
•Improved support for UInt64 dtypes
•A new orient for JSON serialization, orient='table', that uses the Table Schema spec
•Experimental support for exporting DataFrame.style formats to Excel
•Window Binary Corr/Cov operations now return a MultiIndexed DataFrame rather than a Panel, as Panel is now deprecated
•Support for S3 handling now uses s3fs
•Google BigQuery support now uses the pandas-gbq library
•Switched the test framework to use pytest
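
A few of the features listed above can be tried in a handful of lines. The following is a minimal sketch, not taken from the pandas release notes; the column names and values are made up for illustration, and it assumes a pandas version in which these features are available (0.20 or later).

```python
import json

import pandas as pd

# Illustrative data; the column names "a" and "b" are arbitrary.
df = pd.DataFrame({"a": [1, 2, 3], "b": [4.0, 5.0, 6.0]})

# .agg() on a DataFrame: apply several reductions in one call.
# The result has one row per reduction and one column per input column.
summary = df.agg(["sum", "min"])

# Interval scalar (closed on the right by default) and an IntervalIndex.
interval = pd.Interval(0, 5)
bins = pd.interval_range(start=0, end=3)  # three unit-width intervals

# JSON serialization with orient='table' emits a Table Schema description
# ("schema") alongside the records ("data").
payload = json.loads(df.to_json(orient="table"))

print(summary)
print(bins)
print(sorted(payload.keys()))
```

The feather, S3, and BigQuery items are left out of the sketch because they depend on optional packages or network access.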

License

Open Source

Price

Free

Pricing

Subscription

Free Trial

Available

Users Size

Small (<50 employees), Medium (50 to 1000 employees), Enterprise (>1001 employees)

Website

pandas

Company

pandas

What is best?

•Improved user API when accessing levels in .groupby()
•Improved support for UInt64 dtypes
•A new orient for JSON serialization, orient='table', that uses the Table Schema spec
•Experimental support for exporting DataFrame.style formats to Excel
•Window Binary Corr/Cov operations now return a MultiIndexed DataFrame rather than a Panel, as Panel is now deprecated
•Support for S3 handling now uses s3fs
•Google BigQuery support now uses the pandas-gbq library
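
As a rough illustration of the .groupby() item, the sketch below groups by an index level name and a column name in the same call. The frame, its level names ("first", "second"), and its columns are hypothetical, chosen only for the example.

```python
import pandas as pd

# Hypothetical frame with a two-level row index and two columns.
index = pd.MultiIndex.from_arrays(
    [["x", "x", "y", "y"], ["one", "two", "one", "two"]],
    names=["first", "second"],
)
df = pd.DataFrame({"A": [1, 1, 2, 2], "B": [10, 20, 30, 40]}, index=index)

# Index level names and column names can be mixed as grouping keys:
# "second" is an index level, "A" is a column.
totals = df.groupby(["second", "A"])["B"].sum()

# Grouping by an index level name alone works the same way.
by_first = df.groupby("first")["B"].sum()

print(totals)
print(by_first)
```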

What are the benefits?

• Perform fast, efficient data manipulation
• Access easy-to-use data structures
• Intelligent alignment of data
• Access hierarchical axis indexing of data
• Create domain-specific time offsets
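
Two of these benefits, hierarchical axis indexing and domain-specific time offsets, can be sketched briefly. The data, the level names, and the Sunday-to-Thursday work week below are assumptions made for the example, not something described on this page.

```python
import pandas as pd

# Hierarchical (MultiIndex) rows: each value is keyed by quarter and region.
sales = pd.Series(
    [5, 7, 3, 9],
    index=pd.MultiIndex.from_product(
        [["2017Q1", "2017Q2"], ["north", "south"]],
        names=["quarter", "region"],
    ),
)

# Select everything for one outer key; the result is indexed by region.
q1 = sales.loc["2017Q1"]

# Take a cross-section on the inner level; the result is indexed by quarter.
north = sales.xs("north", level="region")

# A domain-specific offset: one business day in a Sunday-to-Thursday week.
workday = pd.offsets.CustomBusinessDay(weekmask="Sun Mon Tue Wed Thu")

# 2017-09-07 is a Thursday, so the next working day skips Friday and Saturday.
next_day = pd.Timestamp("2017-09-07") + workday

print(q1)
print(north)
print(next_day)
```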

PAT Rating™

                           Editor Rating   Aggregated User Rating
Ease of use                7.6             8.3
Features & Functionality   7.6             8.1
Advanced Features          7.6             8.1
Integration                7.6             8.3
Performance                7.6             —
Customer Support           7.6             —

Implementation

5.7

Renew & Recommend

—

Bottom Line

Intelligent data alignment and integrated handling of missing data: gain automatic label-based alignment in computations and easily manipulate messy data into an orderly form.
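
Label-based alignment and missing-data handling can be seen with two small Series whose indexes only partly overlap. This is a minimal sketch with made-up labels and values.

```python
import pandas as pd

s1 = pd.Series([1.0, 2.0, 3.0], index=["a", "b", "c"])
s2 = pd.Series([10.0, 20.0, 30.0], index=["b", "c", "d"])

# Arithmetic aligns on labels, not positions. Labels present in only one
# operand ("a" and "d") produce NaN in the result.
total = s1 + s2

# Treat a missing operand as 0 instead of propagating NaN.
filled = s1.add(s2, fill_value=0)

# Or drop the labels that could not be matched.
clean = total.dropna()

print(total)
print(filled)
print(clean)
```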

7.6

Editor Rating

7.7

Aggregated User Rating

4 ratings

Pandas is an open source, BSD-licensed library providing high-performance, easy-to-use data structures and data analysis tools for the Python programming language. Pandas is a NumFOCUS sponsored project.

NumFOCUS sponsorship helps ensure the success of pandas development as a world-class open-source project, and makes it possible to donate to the project. The best way to get pandas is to install it via conda. Builds for osx-64, linux-64, linux-32, win-64, and win-32 are available for Python 2.7, Python 3.4, and Python 3.5. The 0.20 release is a major release from 0.19.2 and includes a number of API changes, deprecations, new features, enhancements, and performance improvements, along with a large number of bug fixes.

We recommend that all users upgrade to this version. Pandas has changed the internal structure and layout of the codebase. This can affect imports that are not from the top-level pandas.* namespace. Check the API Changes and deprecations before updating. Version 0.20.1 contains one additional change for backwards-compatibility with downstream projects using pandas’ hashing routines.
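
One way to stay clear of the internal reorganization is to import only from the top-level namespace and the public pandas.api subpackage. The sketch below assumes pandas.api.types is available in the installed version; the frame and its column names are illustrative.

```python
import pandas as pd
from pandas.api.types import is_numeric_dtype  # public helper, not pandas.core.*

# Illustrative frame with one numeric and one string column.
df = pd.DataFrame({"n": [1, 2, 3], "s": ["a", "b", "c"]})

# Pick out numeric columns using only public entry points.
numeric_cols = [col for col in df.columns if is_numeric_dtype(df[col])]

print(numeric_cols)
```

Code that reaches into private modules such as pandas.core or pandas.tslib is the kind of import the paragraph above warns about.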

The 0.19.x series includes some small regression fixes, bug fixes, and performance improvements, along with compatibility with Python 3.6 and the addition of a pandas cheat sheet. Python has long been great for data munging and preparation, but less so for data analysis and modeling. pandas helps fill this gap, enabling users to carry out their entire data analysis workflow in Python without having to switch to a more domain-specific language like R.

Combined with the excellent IPython toolkit and other libraries, the environment for doing data analysis in Python excels in performance, productivity, and the ability to collaborate. pandas doesn’t implement significant modeling functionality outside of linear and panel regression; for this, look to statsmodels and scikit-learn. More work is still needed to make Python a first-class statistical modeling environment.

1 Review

Ernest Thormina
September 10, 2017 at 11:33 am

Easy-to-use data structures, and data analysis tools
Company size: Small (<50)
User Role: Super User
User Industry: Entertainment

Rating
Ease of use: 8.3
Features & Functionality: 8.1
Advanced Features: 8.1
Integration: 8.3

ADDITIONAL INFORMATION
Pandas is an open source, BSD-licensed library providing high-performance, easy-to-use data structures and data analysis tools. It enables users to carry out their entire data analysis workflow in Python without having to switch to a more domain-specific language like R. Pandas also offers a fast and efficient DataFrame object for data manipulation with integrated indexing. It provides tools for reading and writing data between in-memory data structures and different formats: CSV and text files, Microsoft Excel, SQL databases, and the fast HDF5 format.

With Pandas, you can align data intelligently and handle missing data in an integrated way: computations gain automatic label-based alignment, and unstructured data can be manipulated into an orderly form. Pandas facilitates flexible reshaping and pivoting of data sets as well as intelligent label-based slicing, fancy indexing, and subsetting of large data sets. Columns can be inserted into and deleted from data structures, which makes them size-mutable. Users can also aggregate or transform data with a powerful engine that allows split-apply-combine operations to be performed on data sets, and can merge and join data sets with high performance.

Pandas' hierarchical axis indexing provides an intuitive way of working with high-dimensional data in a lower-dimensional data structure. Its time series functionality supports date range generation and frequency conversion, moving window statistics, moving window linear regressions, date shifting, and lagging. It even lets users create domain-specific time offsets and join time series without losing any data.
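
The time series features named in this review can be sketched on a tiny daily series. The dates and values below are made up for illustration, and the calls shown are only one way to do each step.

```python
import pandas as pd

# Date range generation: six consecutive days.
idx = pd.date_range("2017-01-01", periods=6, freq="D")
ts = pd.Series([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], index=idx)

# Frequency conversion: sum the daily values into two-day bins.
two_day = ts.resample("2D").sum()

# Moving window statistic: three-day rolling mean. The first two entries
# are NaN because the window is not yet full.
rolling = ts.rolling(window=3).mean()

# Lagging: shift the values forward by one day.
lagged = ts.shift(1)

print(two_day)
print(rolling)
print(lagged)
```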