Pandas Read Excel Formula as NaN

When pandas reads an Excel file that contains formulas, the formula cells sometimes come back as NaN instead of the expected calculated values. The usual cause is that the file holds the formulas but no stored results for them, and knowing this makes the problem much easier to diagnose.

Formula Evaluation

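pandas does not evaluate the formulas in Excel files when reading them. An .xlsx file stores each formula together with the result that was last calculated by the application that saved the file, and pandas reads only that stored result. If the file was written by a tool that does not calculate formulas (for example, a script using openpyxl) and was never opened and saved in Excel, there is no stored result, and pandas assigns NaN (Not a Number) to those cells in the DataFrame.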

Example:

Consider an Excel file named “data.xlsx” that was generated by a script and never opened in Excel. It has a header row followed by one data row:

A B C
10 5 =A2+B2

When reading this file using pandas, the resulting DataFrame would look like:

      A   B    C
  0  10   5  NaN
  

Notice that the formula “=A2+B2” is not evaluated. The file has no stored result for cell C2, so the calculated value of 15 is missing and pandas reports NaN.
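The situation can be reproduced with a minimal sketch. It assumes pandas and openpyxl are installed, and it writes the workbook to a temporary folder rather than using a real “data.xlsx”. Because the sheet has a header row, the formula sits in row 2 and refers to A2 and B2.

```python
import os
import tempfile

import pandas as pd
from openpyxl import Workbook

# Build the workbook with openpyxl, which stores the formula text
# but never calculates a result for it.
wb = Workbook()
ws = wb.active
ws.append(["A", "B", "C"])      # header row (row 1)
ws.append([10, 5, "=A2+B2"])    # data row (row 2)

# Temporary location used only for this illustration.
path = os.path.join(tempfile.mkdtemp(), "data.xlsx")
wb.save(path)

# pandas finds no stored result for C2, so the cell should come back as NaN.
df = pd.read_excel(path, engine="openpyxl")
print(df)
print(df["C"].isna().all())
```

Opening the same file in Excel and saving it should store the result of the formula, after which the same read_excel call is expected to return 15 in column C.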

Solution: Using openpyxl

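The openpyxl library is the engine pandas uses to read .xlsx files. Installing it does not make pandas evaluate formulas: openpyxl returns the result that was stored in the file when it was last saved. To get the calculated values, open the workbook in Excel (or another spreadsheet application that recalculates formulas), save it, and then read it with pandas using the openpyxl engine.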

Example:

Here is an example that reads an Excel file with formulas using pandas and openpyxl, after the file has been opened and saved in Excel:

    import pandas as pd

    # Install openpyxl if not already installed
    # pip install openpyxl

    # read_excel returns the results stored in the file; it does not
    # evaluate the formulas itself
    df = pd.read_excel("data.xlsx", engine="openpyxl")
    print(df)
  

Running this code on a file whose formulas were calculated and saved by Excel would produce the following DataFrame:

       A   B   C
    0  10   5  15
  

The DataFrame now contains 15, the stored result of the formula in column C, instead of NaN.
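To see what a file actually holds, openpyxl can also be used directly. The sketch below assumes a workbook written by openpyxl itself (so no results are stored) and saved to a temporary folder. It loads the file twice: with the default settings, which expose the formula text, and with data_only=True, which exposes the stored result.

```python
import os
import tempfile

from openpyxl import Workbook, load_workbook

# Write a small workbook with one formula; openpyxl stores no result for it.
wb = Workbook()
ws = wb.active
ws.append(["A", "B", "C"])
ws.append([10, 5, "=A2+B2"])
path = os.path.join(tempfile.mkdtemp(), "data.xlsx")
wb.save(path)

# Default load: formula cells return the formula text.
formula_view = load_workbook(path)
formula_text = formula_view.active["C2"].value

# data_only=True: formula cells return the stored result,
# or None when the file has no stored result.
value_view = load_workbook(path, data_only=True)
cached_value = value_view.active["C2"].value

print(formula_text)
print(cached_value)
```

If the second value is None, the file has not been calculated by a spreadsheet application, and pandas will report NaN for that cell.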

Additional Considerations

The openpyxl engine reads the .xlsx and .xlsm formats. Legacy .xls and binary .xlsb files need a different engine (such as xlrd or pyxlsb), so check the file format before relying on this approach. Additionally, some formula cells may still produce NaN even after the file has been saved in Excel, for example when the formula evaluates to an error such as #DIV/0!, or when it depends on an external workbook that was not available when the file was last calculated.
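Before relying on a DataFrame, it can help to check whether a workbook has formula cells without stored results. The following is a rough sketch, not a complete validator: find_uncached_formulas is a hypothetical helper name, and the demo file is generated in a temporary folder for illustration.

```python
import os
import tempfile

from openpyxl import Workbook, load_workbook


def find_uncached_formulas(path):
    """Return (sheet, cell, formula) for formula cells with no stored result."""
    formulas = load_workbook(path)                  # formula text
    values = load_workbook(path, data_only=True)    # stored results
    missing = []
    for ws in formulas.worksheets:
        cached = values[ws.title]
        for row in ws.iter_rows():
            for cell in row:
                # data_type "f" marks a formula cell in openpyxl
                if cell.data_type == "f" and cached[cell.coordinate].value is None:
                    missing.append((ws.title, cell.coordinate, cell.value))
    return missing


# Demo workbook written by openpyxl, so its formula has no stored result.
wb = Workbook()
ws = wb.active
ws.append(["A", "B", "C"])
ws.append([10, 5, "=A2+B2"])
path = os.path.join(tempfile.mkdtemp(), "data.xlsx")
wb.save(path)

missing = find_uncached_formulas(path)
print(missing)
```

An empty list suggests every formula has a stored result. Any entries point to cells that pandas would read as NaN until the file is recalculated and saved.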
