Polars Pivot

Polars is a fast DataFrame library written in Rust, with bindings for Python and other languages, that offers functionality similar to pandas. One of its useful features is the pivot operation, which reshapes data from a long format into a wide one. Let’s look at the concept of pivoting and walk through an example.

What is Pivoting?

Pivoting is a data transformation technique that reshapes a dataset by turning the unique values of one column into new columns, so that a long table becomes a wide one. The reverse operation turns columns back into rows. Pivoting is commonly used to summarize, aggregate, or rearrange tabular data.

Polars Pivot Example

Let’s say we have a dataset containing information about sales transactions. Each row represents a single transaction, and the columns contain details such as the date, product name, quantity sold, and sales amount.

    Date        Product  Quantity  Sales Amount
    2022-01-01  A        10        100.00
    2022-01-01  B        20        200.00
    2022-01-02  A        15        150.00
    2022-01-02  B        25        250.00

In this example, we can use the Polars library to pivot the dataset and obtain a new representation where the products become columns, and the sales amount values are filled in accordingly:

    Date        A       B
    2022-01-01  100.00  200.00
    2022-01-02  150.00  250.00

As you can see, each unique product value (A and B) becomes a separate column, and the sales amount values are filled in the corresponding cells based on the date and product.
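To make the reshaping concrete, here is a minimal sketch of the same transformation using only Rust's standard library, without Polars. The function name `pivot_wide` and the `Row` type alias are hypothetical names chosen for this illustration. A cell is `None` when a date and product combination never appears in the input.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// One transaction from the long table: (date, product, sales amount).
type Row<'a> = (&'a str, &'a str, f64);

/// Reshape long rows into a wide table: one entry per date, one cell per product.
/// Returns the column headers and, for each date, the cells in header order.
fn pivot_wide(rows: &[Row<'_>]) -> (Vec<String>, BTreeMap<String, Vec<Option<f64>>>) {
    // The distinct product values become the new column headers (sorted).
    let products: BTreeSet<&str> = rows.iter().map(|r| r.1).collect();
    let headers: Vec<String> = products.iter().map(|p| p.to_string()).collect();

    let mut table: BTreeMap<String, Vec<Option<f64>>> = BTreeMap::new();
    for &(date, product, amount) in rows {
        // Find which output column this product maps to.
        let col = headers.iter().position(|h| h.as_str() == product).unwrap();
        let cells = table
            .entry(date.to_string())
            .or_insert_with(|| vec![None; headers.len()]);
        // No aggregation here: a later duplicate simply overwrites the cell.
        cells[col] = Some(amount);
    }
    (headers, table)
}

fn main() {
    let rows = [
        ("2022-01-01", "A", 100.0),
        ("2022-01-01", "B", 200.0),
        ("2022-01-02", "A", 150.0),
        ("2022-01-02", "B", 250.0),
    ];
    let (headers, table) = pivot_wide(&rows);
    println!("Date {}", headers.join(" "));
    for (date, cells) in &table {
        println!("{date} {cells:?}");
    }
}
```

Each input row contributes exactly one cell: the date selects the output row and the product selects the output column.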

How to Perform Pivot with Polars?

To achieve this pivot operation with Polars, you can use the `pivot` function provided by the library. This function requires three main parameters:

  • The column to use as the pivot index (e.g., ‘Date’ in our example).
  • The column to use as the pivot column (e.g., ‘Product’ in our example).
  • The column to aggregate and fill values (e.g., ‘Sales Amount’ in our example).

Here is an example code snippet that demonstrates how to perform the pivot operation using Polars:


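    // Assumes a recent Polars release with the `pivot` feature enabled in Cargo.toml.
    // The signature of `pivot` has changed between releases, so check the docs for your version.
    use polars::prelude::pivot::{pivot, PivotAgg};
    use polars::prelude::*;

    fn main() -> PolarsResult<()> {
        // Dates are quoted strings: an unquoted 2022-01-01 would be integer subtraction.
        let df = df!(
            "Date" => ["2022-01-01", "2022-01-01", "2022-01-02", "2022-01-02"],
            "Product" => ["A", "B", "A", "B"],
            "Quantity" => [10, 20, 15, 25],
            "Sales Amount" => [100.00, 200.00, 150.00, 250.00]
        )?;

        // Argument order assumed here: on, index, values, sort_columns, agg_fn, separator.
        let pivoted_df = pivot(
            &df,
            ["Product"],            // on: unique values become the new columns
            Some(["Date"]),         // index: one output row per date
            Some(["Sales Amount"]), // values: fill the cells
            false,                  // do not sort the new columns by name
            Some(PivotAgg::Sum),    // sum values that share a (Date, Product) pair
            None,                   // default separator for generated column names
        )?;

        println!("{:?}", pivoted_df);
        Ok(())
    }

This code builds a DataFrame with the same data as our example and then calls `pivot` to spread the ‘Product’ values into columns, with one row per ‘Date’. The `PivotAgg::Sum` argument specifies that sales amounts sharing the same date and product are summed. Because the signature of `pivot` has changed between Polars releases, treat the argument order shown here as an assumption and check the documentation for the version you use.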
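The aggregation step only matters when several rows land in the same output cell. The following sketch, written with the standard library only and not with the Polars API, illustrates the idea: values are grouped by (date, product) and each group is reduced by a caller-supplied function. The names `pivot_agg`, `sum`, and `mean` are hypothetical, and the extra transaction for product A is invented for the illustration.

```rust
use std::collections::BTreeMap;

/// Group values by (date, product), then reduce each group with `agg`.
fn pivot_agg<F>(rows: &[(&str, &str, f64)], agg: F) -> BTreeMap<(String, String), f64>
where
    F: Fn(&[f64]) -> f64,
{
    // Collect every value that lands in the same output cell.
    let mut groups: BTreeMap<(String, String), Vec<f64>> = BTreeMap::new();
    for &(date, product, amount) in rows {
        groups
            .entry((date.to_string(), product.to_string()))
            .or_default()
            .push(amount);
    }
    // Reduce each group to a single cell value.
    groups.into_iter().map(|(key, values)| (key, agg(&values))).collect()
}

fn sum(values: &[f64]) -> f64 {
    values.iter().sum()
}

fn mean(values: &[f64]) -> f64 {
    sum(values) / values.len() as f64
}

fn main() {
    // Hypothetical data: product A was sold twice on 2022-01-01.
    let rows = [
        ("2022-01-01", "A", 100.0),
        ("2022-01-01", "A", 50.0),
        ("2022-01-01", "B", 200.0),
    ];
    let summed = pivot_agg(&rows, sum);
    let averaged = pivot_agg(&rows, mean);
    let key = ("2022-01-01".to_string(), "A".to_string());
    println!("sum = {}, mean = {}", summed[&key], averaged[&key]);
}
```

With a sum aggregation the cell for product A on 2022-01-01 holds 150, while a mean aggregation would give 75. In the original example every (date, product) pair is unique, so any aggregation returns the single value unchanged.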

The `pivot` function returns a new DataFrame (`pivoted_df` in this case) containing the pivoted representation of the data. You can further perform operations on this DataFrame or export it to different file formats as needed.

That’s how you can utilize Polars to perform pivoting operations on your data. It offers a flexible and efficient way to reshape and transform datasets according to your requirements.
