Pandas Read Excel with Formatting
Pandas is a powerful data manipulation library in Python that can read and write data in many formats, including Excel files. When pandas reads an Excel file, `read_excel()` loads only the cell values into a DataFrame; cell formatting such as fonts, fill colors, and borders is not carried over. Where the style and formatting of the data are important, they have to be read separately, for example with openpyxl, the library pandas uses as its engine for `.xlsx` files.
Example
Let’s consider an example where we have an Excel file named “data.xlsx” with some formatted data in it.
data.xlsx:
City | Temperature (°C) | Humidity (%) |
---|---|---|
Tokyo | 25 | 60 |
New York | 20 | 45 |
London | 15 | 50 |
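To follow along without an existing workbook, a file like this could be generated with openpyxl. The sketch below is only an assumption about how `data.xlsx` might be formatted: the helper name `build_sample_workbook`, the bold header row, and the yellow fill on one cell are invented for illustration and are not specified in this article.

```python
from openpyxl import Workbook
from openpyxl.styles import Font, PatternFill


def build_sample_workbook(path="data.xlsx"):
    """Write the sample table to `path` with some illustrative formatting."""
    wb = Workbook()
    ws = wb.active
    ws.title = "Sheet1"

    rows = [
        ["City", "Temperature (°C)", "Humidity (%)"],
        ["Tokyo", 25, 60],
        ["New York", 20, 45],
        ["London", 15, 50],
    ]
    for row in rows:
        ws.append(row)

    # Hypothetical styling: bold header row, yellow fill on Tokyo's temperature.
    for cell in ws[1]:
        cell.font = Font(bold=True)
    ws["B2"].fill = PatternFill(
        fill_type="solid", start_color="FFFF00", end_color="FFFF00"
    )

    wb.save(path)
    return path
```

Calling `build_sample_workbook()` with no arguments writes `data.xlsx` to the current directory.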
To read this Excel file using pandas, we can use the `pandas.read_excel()` function. The `read_excel()` function returns a DataFrame object containing the cell values from the Excel file.
```python
import pandas as pd

# Read the cell values from the Excel file (formatting is not loaded)
df = pd.read_excel('data.xlsx', sheet_name='Sheet1', engine='openpyxl')

# Display the DataFrame
print(df)
```
The above code reads the Excel file “data.xlsx” using `read_excel()` and assigns it to the DataFrame object `df`. The `sheet_name` parameter specifies the name of the sheet to read from (in this case, “Sheet1”). The `engine` parameter is set to “openpyxl” to utilize the openpyxl engine for reading the Excel file.
Executing the code and printing the DataFrame will give the following output:
```
       City  Temperature (°C)  Humidity (%)
0     Tokyo                25            60
1  New York                20            45
2    London                15            50
```
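As seen in the output, the DataFrame retains only the cell values from the Excel file. Any formatting applied in the workbook, such as colors and font styles, is not part of the DataFrame.
Please note that formatting information is stored in the workbook as cell properties, which `read_excel()` does not return. These properties can be read by opening the same file with openpyxl directly, which exposes attributes such as `font`, `fill`, and `number_format` on each cell. Even then, only the formatting applied to individual cells is available this way, and not the overall look and feel of the worksheet; for example, effects produced by conditional formatting are not reflected in those per-cell properties.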
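One way to get both the values and the per-cell formatting is to read the file twice: once with pandas for the data and once with openpyxl for the styles. The following is a minimal sketch under that assumption. The function name `read_values_and_styles` and the choice of which style attributes to collect are hypothetical; the openpyxl attributes used (`cell.font`, `cell.fill`, `cell.number_format`) are part of its cell API.

```python
import pandas as pd
from openpyxl import load_workbook


def read_values_and_styles(path, sheet_name="Sheet1"):
    """Return (DataFrame of values, DataFrame of per-cell style info)."""
    # pandas returns the cell values only.
    values = pd.read_excel(path, sheet_name=sheet_name, engine="openpyxl")

    # openpyxl exposes the style attributes stored on each cell.
    ws = load_workbook(path)[sheet_name]
    records = []
    for row in ws.iter_rows(min_row=2):  # skip the header row
        for cell in row:
            fill = cell.fill
            color = fill.fgColor
            # Only report an RGB string for solid-style fills with an RGB color;
            # theme and indexed colors are left as None in this sketch.
            has_rgb = fill.fill_type is not None and color.type == "rgb"
            records.append({
                "cell": cell.coordinate,
                "value": cell.value,
                "bold": bool(cell.font.bold),
                "fill_rgb": color.rgb if has_rgb else None,
                "number_format": cell.number_format,
            })
    styles = pd.DataFrame(records)
    return values, styles
```

For the sample file, `values, styles = read_values_and_styles("data.xlsx")` would give the same DataFrame as before in `values`, plus one row per data cell in `styles` describing its font weight, fill color, and number format. Colors are reported by openpyxl as aRGB hex strings, so a six-digit color may come back with a two-character alpha prefix.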