pandas ffill with condition
The pandas ffill method has no condition parameter of its own, but you can restrict forward filling to certain rows by combining it with boolean indexing or by applying a lambda function to the column.
Let’s take a look at some examples to understand it better:
Example 1: Filling missing values based on a condition
Suppose we have a DataFrame with a column called ‘age’ and we want to fill missing values in ‘age’ with the previous valid value only if the ‘name’ column equals ‘John’.
import pandas as pd
import numpy as np
data = {'name': ['John', 'John', 'Mike', 'Mike', 'John'],
        'age': [25, np.nan, np.nan, 30, np.nan]}
df = pd.DataFrame(data)
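# Using boolean indexing: forward-fill the whole column,
# then write the filled values back only into John's rows
is_john = df['name'] == 'John'
df.loc[is_john, 'age'] = df['age'].ffill()[is_john]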
print(df)
Output:
   name   age
0  John  25.0
1  John  25.0
2  Mike   NaN
3  Mike  30.0
4  John  30.0
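The boolean-indexing pattern from Example 1 can be wrapped in a small helper. The sketch below is one possible way to do it, not an established pandas API: the function name ffill_where is hypothetical, and it relies on Series.mask to swap in forward-filled values only where the condition is True. The last lines show a related variant, groupby(...).ffill(), which fills each name from that name's own earlier rows instead of from the whole column.

```python
import numpy as np
import pandas as pd


def ffill_where(df, column, condition):
    """Return a copy of df in which `column` is forward-filled
    only in the rows where `condition` is True.

    Note: `ffill_where` is a hypothetical helper name, not part of pandas.
    """
    filled = df[column].ffill()  # fill candidates taken from the whole column
    out = df.copy()
    # Series.mask replaces values where the condition is True with `filled`.
    out[column] = df[column].mask(condition, filled)
    return out


df = pd.DataFrame({
    "name": ["John", "John", "Mike", "Mike", "John"],
    "age": [25, np.nan, np.nan, 30, np.nan],
})

result = ffill_where(df, "age", df["name"] == "John")
print(result)

# Variant: fill every name, but only from that name's own earlier rows.
per_name = df.groupby("name")["age"].ffill()
print(per_name)
```

With the helper, John's last row receives 30.0 (the most recent value in the column), while the groupby variant gives it 25.0 (John's own most recent value). Which behaviour is right depends on what "previous valid value" should mean for your data.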
Example 2: Filling missing values based on a lambda function
In this example, suppose we have a DataFrame with a column called ‘rating’. We forward-fill the column first and then keep a value only if it is greater than or equal to 3. Every other entry, including an existing rating below 3, becomes NaN.
data = {'name': ['John', 'Mike', 'John', 'Mike', 'John'],
        'rating': [4.5, np.nan, 2.0, np.nan, 3.5]}
df = pd.DataFrame(data)
# Forward-fill first, then use apply with a lambda to keep only values >= 3
df['rating'] = df['rating'].ffill().apply(lambda x: x if pd.notnull(x) and x >= 3 else np.nan)
print(df)
Output:
   name  rating
0  John     4.5
1  Mike     4.5
2  John     NaN
3  Mike     NaN
4  John     3.5
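In the output above, the observed rating of 2.0 in row 2 is gone. If you would rather keep existing values and apply the threshold only to the values used for filling, something like the following sketch may work. It assumes the same data and the cutoff of 3 from Example 2; the threshold variable and the rating_filled column name are illustrative choices, not part of the original example.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "name": ["John", "Mike", "John", "Mike", "John"],
    "rating": [4.5, np.nan, 2.0, np.nan, 3.5],
})

threshold = 3  # assumed cutoff, taken from Example 2

# Candidate fill value for every row: the most recent valid rating.
previous = df["rating"].ffill()

# Discard candidates that fall below the threshold.
allowed = previous.where(previous >= threshold)

# fillna only touches rows that were missing, so the observed 2.0 survives.
df["rating_filled"] = df["rating"].fillna(allowed)
print(df)
```

Here row 1 is filled with 4.5, row 2 keeps its original 2.0, and row 3 stays NaN because the value before it (2.0) is below the threshold.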