The `pd.to_numeric()` function in pandas converts values to a numeric data type. It accepts a scalar, list, tuple, 1-D array, or Series; to convert a DataFrame, apply it to each column. By default, if the function encounters a value that it cannot parse as a number, it raises a ValueError.
To better understand this, let’s consider the following examples:
Example 1:
Suppose we have a pandas Series with both numeric and non-numeric values:
```python
import pandas as pd

data = pd.Series(['1', '2.5', '3a', '4.2'])
print(data)
```
The output will be:
```
0      1
1    2.5
2     3a
3    4.2
dtype: object
```
If we convert this Series to numeric using `pd.to_numeric()` with `errors='coerce'`:
```python
numeric_data = pd.to_numeric(data, errors='coerce')
print(numeric_data)
```
The output will be:
```
0    1.0
1    2.5
2    NaN
3    4.2
dtype: float64
```
As we can see, the function converted the strings `'1'`, `'2.5'`, and `'4.2'` to numbers, but it could not parse `'3a'`. Because we passed `errors='coerce'`, that value was replaced with NaN (Not a Number). Without this argument, the same call would raise a ValueError.
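Once unparseable values have been turned into NaN, it is often useful to see which original entries were affected. The following is a minimal sketch using the same Series; the names `converted` and `failed` are illustrative and not part of pandas.

```python
import pandas as pd

data = pd.Series(['1', '2.5', '3a', '4.2'])

# Coerce unparseable strings to NaN.
converted = pd.to_numeric(data, errors='coerce')

# Entries that were present in the original but are NaN after conversion
# are the ones that could not be parsed.
failed = data[converted.isna() & data.notna()]
print(failed)
```

The `data.notna()` condition keeps values that were already missing in the input from being reported as conversion failures.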
Example 2:
Let’s consider another example with a DataFrame:
```python
data = pd.DataFrame({'A': ['1', '2', '3'],
                     'B': ['4', '5', '6a']})
print(data)
```
The output will be:
```
   A   B
0  1   4
1  2   5
2  3  6a
```
Because `pd.to_numeric()` works on one-dimensional data, we convert this DataFrame by applying it to each column:
```python
numeric_data = data.apply(pd.to_numeric, errors='coerce')
print(numeric_data)
```
The output will be:
```
   A    B
0  1  4.0
1  2  5.0
2  3  NaN
```
In this example, we used the DataFrame's `apply()` method along with `pd.to_numeric()` to run the conversion on each column. The `errors='coerce'` argument is forwarded to `pd.to_numeric()` and replaces any value that cannot be parsed with NaN.
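The output above shows column A as integers and column B as floats. A short sketch that inspects the resulting dtypes makes the reason visible: NaN is a floating-point value, so a column that receives one is stored as float64, while a column that converts cleanly to whole numbers stays an integer type.

```python
import pandas as pd

data = pd.DataFrame({'A': ['1', '2', '3'],
                     'B': ['4', '5', '6a']})

# Extra keyword arguments to apply() are passed on to pd.to_numeric.
numeric_data = data.apply(pd.to_numeric, errors='coerce')

# Column A parsed cleanly; column B contains a NaN and is therefore float.
print(numeric_data.dtypes)
```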
By default, `pd.to_numeric()` raises a ValueError if it encounters a value it cannot parse. Setting `errors='coerce'` replaces such values with NaN instead.
This lets the conversion complete without an exception interrupting the workflow, at the cost of having to handle the resulting missing values afterwards.
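The two behaviours can be compared side by side. This is a minimal sketch on the Series from Example 1; the `raised` flag and the `coerced` name are illustrative only.

```python
import pandas as pd

data = pd.Series(['1', '2.5', '3a', '4.2'])

# Default behaviour (errors='raise'): the unparseable value stops the conversion.
try:
    pd.to_numeric(data)
    raised = False
except ValueError as exc:
    raised = True
    print(f"Conversion failed: {exc}")

# errors='coerce': the conversion completes and the bad value becomes NaN.
coerced = pd.to_numeric(data, errors='coerce')
print(coerced)
```

Catching the exception is a reasonable choice when bad input should be reported and fixed at the source, while coercion suits cases where missing values are acceptable downstream.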