TrustRadius: an HG Insights company

pandas

Score10 out of 10

3 Reviews and Ratings

What is pandas?

pandas is an open-source Python package designed for data analysis and manipulation. According to the vendor, it provides fast, flexible, and expressive data structures that make working with relational or labeled data easy and intuitive. pandas is said to be suitable for companies of all sizes, from small startups to large enterprises. It is widely used by data scientists, data analysts, financial analysts, statisticians, and social scientists across various industries.

Key Features

Data structures: According to the vendor, pandas provides two main data structures, Series and DataFrame, which allow for efficient and organized handling of data.

Easy handling of missing data: pandas is said to offer easy handling of missing data, represented as NaN, in both floating-point and non-floating-point data, ensuring accurate analysis and computations.

Size mutability: According to the vendor, columns can be easily inserted and deleted from DataFrame and higher dimensional objects, allowing for size mutability and flexibility in data manipulation.

Automatic and explicit data alignment: Users can choose to explicitly align objects to a set of labels or let pandas automatically align the data for computations, ensuring accurate and consistent results, as claimed by the vendor.

Powerful group by functionality: pandas is said to provide powerful group by functionality for split-apply-combine operations on data sets, allowing for both aggregating and transforming data, making complex data analysis tasks more accessible, according to the vendor.

Conversion of data from other Python and NumPy structures: According to the vendor, pandas simplifies the conversion of ragged, differently-indexed data from other Python and NumPy data structures into DataFrame objects, enabling seamless integration with existing data sources.

Intelligent label-based slicing and indexing: pandas is said to offer intelligent label-based slicing, fancy indexing, and subsetting of large data sets, making it easy to extract specific data for analysis and visualization purposes, as claimed by the vendor.

Merging and joining data sets: According to the vendor, pandas provides intuitive merging and joining capabilities, allowing for the combination of multiple data sources and enabling comprehensive analysis from diverse data sets.

Flexible reshaping and pivoting of data sets: According to the vendor, users can easily reshape and pivot data sets with pandas, enabling efficient data restructuring for further analysis and exploration.

Hierarchical labeling of axes: According to the vendor, pandas supports hierarchical labeling of axes, allowing for multiple labels per tick, providing enhanced data visualization and analysis capabilities.

Categories & Use Cases

Top Performing Features

  • Automatic Data Format Detection

    Automatic detection of data formats and schemas

    Category average: 9.2

  • Connect to Multiple Data Sources

    Ability to connect to a wide variety of data sources including data lakes or data warehouses for data ingestion

    Category average: 8.7

  • Extend Existing Data Sources

    Use R or Python to create custom connectors for any APIs or databases

    Category average: 8.9

Areas for Improvement

  • MDM Integration

    Integration with MDM and metadata dictionaries

    Category average: 7.8

analytics implemented with pandas are great performance debugging tools

Use Cases and Deployment Scope

We use pandas in our analytics framework to calculate and analyze performance metrics of the operational data. It is mostly about response time for various APIs and resource consumption.

Pros

  • It is easy to do statistical analysis
  • It is easy to clean the data
  • It is easy to produce graphs and charts to visualize

Cons

  • There are a lot of libraries and ways to do visualization. Sometimes it is very confusing.
  • Error handling can be a challenge. Sometimes the error messages do not provide valuable clues for the debugging.
  • In our case, there are a bunch of different frameworks and libraries working together. I would rather work with one framework, well tuned for my use case

Return on Investment

  • Performance debugging was time consuming and mostly poorly automated exploratory process. Once we started use pandas for these tasks, it really moved the needle. Pandas are instrumental to provide actionable insights. As a result we were able to improve notably cloud software resource utilization and performance
  • Analytics implemented with pandas allow us to detect and. address problems in our APIs before they are notable to our customers

Usability

Alternatives Considered

Splunk AppDynamics, Splunk IT Service Intelligence (ITSI), Dynatrace and New Relic

Other Software Used

Splunk AppDynamics, Dynatrace, Splunk Enterprise