Pandas: Average every n rows
Pandas is a popular Python library used for data manipulation and analysis. It provides a convenient way to perform operations on tabular data, including calculating averages.
To average every n rows in a pandas DataFrame, you can use the rolling() method in combination with the mean() method, and then keep only the last result of each block of n rows.
Example:
Suppose we have a DataFrame with the following data:
Index | Value |
---|---|
0 | 10 |
1 | 20 |
2 | 30 |
3 | 40 |
We want to average every 2 rows of the DataFrame. Here’s how you can do it:
import pandas as pd
# Create the DataFrame
data = {'Value': [10, 20, 30, 40]}
df = pd.DataFrame(data)
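# Average every 2 rows: take the rolling mean over 2 rows, then keep
# every second result (positions 1, 3, ...) so the windows do not overlap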
averaged_df = df['Value'].rolling(2).mean().iloc[1::2]
print(averaged_df)
The output will be:
1    15.0
3    35.0
Name: Value, dtype: float64
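The rolling(2) method creates a rolling window of size 2, so each result is computed from the current row and the row before it. The mean() method calculates the average of each window. Because consecutive windows overlap, the rolling mean has one value per row: NaN, 15.0, 25.0, 35.0. The first value is NaN because the first row has no previous row to complete the window. Finally, iloc[1::2] keeps every second result starting from the second row, which selects only the non-overlapping windows (rows 0 and 1, then rows 2 and 3). This gives the average of every 2 rows.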
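To make the overlap visible, the short sketch below prints the intermediate rolling result before and after slicing. It reuses the DataFrame from the example; the variable names rolled and picked are only illustrative.

```python
import pandas as pd

df = pd.DataFrame({'Value': [10, 20, 30, 40]})

# One rolling mean per row; the first window is incomplete, so it is NaN
rolled = df['Value'].rolling(2).mean()
print(rolled.tolist())  # [nan, 15.0, 25.0, 35.0]

# Keep positions 1, 3, ...: the windows covering rows (0, 1) and (2, 3)
picked = rolled.iloc[1::2]
print(picked.tolist())  # [15.0, 35.0]
```

The value 25.0 at position 2 is the mean of rows 1 and 2, a window that straddles two blocks, which is why it is skipped.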
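You can adjust the window size as per your requirement, but the slice must change with it. For example, if you want to average every 3 rows, use rolling(3).mean().iloc[2::3]; in general, use rolling(n).mean().iloc[n-1::n]. Note that if the number of rows is not a multiple of n, the leftover rows at the end are not included in any average.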
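As a rough sketch, the pattern can be wrapped in a small helper that works for any block size. The function names average_every_n and average_every_n_grouped are hypothetical, not part of pandas. The second function shows a different technique, grouping rows by block number with groupby(), which is a common alternative when the leftover rows at the end should be averaged too.

```python
import numpy as np
import pandas as pd


def average_every_n(values: pd.Series, n: int) -> pd.Series:
    """Mean of consecutive, non-overlapping blocks of n rows (rolling-based)."""
    # Keep the last row of each full block: positions n-1, 2n-1, ...
    return values.rolling(n).mean().iloc[n - 1::n]


def average_every_n_grouped(values: pd.Series, n: int) -> pd.Series:
    """Alternative: label each row with its block number and group on it."""
    block = np.arange(len(values)) // n  # 0, 0, 0, 1, 1, 1, 2, ... for n = 3
    return values.groupby(block).mean()


s = pd.Series([10, 20, 30, 40, 50, 60, 70], name='Value')
print(average_every_n(s, 3).tolist())          # [20.0, 50.0]
print(average_every_n_grouped(s, 3).tolist())  # [20.0, 50.0, 70.0]
```

With 7 rows and n = 3, the rolling version drops the final row, while the groupby version returns it as a block of its own. Which behaviour is preferable depends on how incomplete blocks should be treated.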