pd.to_numeric: unable to parse string

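The pd.to_numeric() function in pandas converts a scalar, list, tuple, one-dimensional array, or Series to a numeric data type. To convert a DataFrame, it is applied one column at a time. By default, if the function encounters a value that it cannot parse as a number, it raises a ValueError with a message such as: Unable to parse string "3a" at position 2.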

To better understand this, let’s consider the following examples:

Example 1:

Suppose we have a pandas Series with both numeric and non-numeric values:

import pandas as pd

data = pd.Series(['1', '2.5', '3a', '4.2'])
print(data)

The output will be:

0      1
1    2.5
2     3a
3    4.2
dtype: object

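If we try to convert this Series to numeric using pd.to_numeric() with its default settings:

numeric_data = pd.to_numeric(data)

the call fails with:

ValueError: Unable to parse string "3a" at position 2

The strings '1', '2.5' and '4.2' are valid numbers, but '3a' is not, so the whole conversion is rejected. To convert the values that can be parsed and mark the rest as missing, pass errors='coerce':

numeric_data = pd.to_numeric(data, errors='coerce')
print(numeric_data)

The output will be:

0    1.0
1    2.5
2    NaN
3    4.2
dtype: float64

As we can see, the function converted '1', '2.5' and '4.2' to numbers and assigned NaN (Not a Number) to '3a', the value it couldn't parse. Because NaN is a floating-point value, the resulting dtype is float64.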
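To make the difference between the two modes concrete, here is a minimal sketch that runs both on the same Series. The variable name error_message is only illustrative, and the exact wording of the exception text may differ between pandas versions.

```python
import pandas as pd

data = pd.Series(['1', '2.5', '3a', '4.2'])

# Default behaviour (errors='raise'): the unparseable '3a' triggers a ValueError.
try:
    pd.to_numeric(data)
    error_message = None
except ValueError as exc:
    error_message = str(exc)
print(error_message)

# errors='coerce' converts what it can and substitutes NaN for the rest.
numeric_data = pd.to_numeric(data, errors='coerce')
print(numeric_data)
```

Catching the ValueError is useful when a bad value should stop processing, while errors='coerce' suits cases where missing values are handled later.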

Example 2:

Let’s consider another example with a DataFrame:

data = pd.DataFrame({'A': ['1', '2', '3'],
                     'B': ['4', '5', '6a']})
print(data)

The output will be:

   A   B
0  1   4
1  2   5
2  3  6a

pd.to_numeric() does not accept a DataFrame directly, so we apply it to each column and pass errors='coerce':

numeric_data = data.apply(pd.to_numeric, errors='coerce')
print(numeric_data)

The output will be:

   A    B
0  1  4.0
1  2  5.0
2  3  NaN

In this example, we used the apply() function along with pd.to_numeric() to apply the conversion to each column of the DataFrame. The errors='coerce' parameter is used to replace any non-numeric value with NaN.
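The column-by-column conversion can be sketched as follows. The names df and numeric_df are illustrative, and the dtype checks use pandas' type helpers rather than assuming an exact dtype name.

```python
import pandas as pd
from pandas.api.types import is_float_dtype, is_integer_dtype

df = pd.DataFrame({'A': ['1', '2', '3'],
                   'B': ['4', '5', '6a']})

# pd.to_numeric works on one-dimensional input, so apply it column by column.
# Extra keyword arguments given to apply() are forwarded to pd.to_numeric.
numeric_df = df.apply(pd.to_numeric, errors='coerce')

# Column A parsed cleanly and stays integer; column B holds a NaN, so it is float.
a_is_int = is_integer_dtype(numeric_df['A'])
b_is_float = is_float_dtype(numeric_df['B'])
print(numeric_df)
print(a_is_int, b_is_float)
```

Each column gets its own dtype, so only the columns that contained unparseable values end up as floats.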

By default, pd.to_numeric() raises a ValueError if it encounters a non-numeric value. However, by setting errors='coerce', it will instead replace such values with NaN.

This allows us to handle the conversion gracefully without encountering errors that might disrupt our workflow.
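Coercing silently hides which inputs were bad, so it can help to inspect them afterwards. The following is one possible approach, not something pandas does for you: compare the original Series with the coerced one. The names failed and bad_values are hypothetical.

```python
import pandas as pd

data = pd.Series(['1', '2.5', '3a', '4.2'])
converted = pd.to_numeric(data, errors='coerce')

# A value failed to parse if it was present in the input but is NaN afterwards.
failed = converted.isna() & data.notna()
bad_values = data[failed]
print(bad_values)
```

The resulting Series keeps the original index labels, which makes it easy to find and fix the offending rows in the source data.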
