Pandas Sub-Columns
In pandas, "sub-columns" is an informal name for hierarchical (multi-level) column headers in a DataFrame. pandas represents them with a MultiIndex on the columns.
Creating Sub-Columns
You create sub-columns by assigning a MultiIndex to the DataFrame's columns. Each column label then becomes a tuple with one entry per header level, such as ('Group1', 'A').
Here’s an example:
import pandas as pd
# Create a DataFrame with sub-columns
data = {'A': [1, 2, 3],
        'B': [4, 5, 6],
        'C': [7, 8, 9],
        'D': [10, 11, 12]}
df = pd.DataFrame(data)
# Create sub-columns
df.columns = pd.MultiIndex.from_tuples([('Group1', 'A'), ('Group1', 'B'), ('Group2', 'C'), ('Group2', 'D')])
# Display the DataFrame
df
This will result in the following DataFrame:
  Group1    Group2
       A  B      C   D
0      1  4      7  10
1      2  5      8  11
2      3  6      9  12
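When every group has the same set of sub-columns, the tuples do not have to be listed by hand. The following is a minimal sketch using pd.MultiIndex.from_product; the sub-column labels 'X' and 'Y', the level names 'group' and 'field', and the variable name df_product are assumptions chosen for illustration, not part of the example above.

```python
import pandas as pd

# Build every (group, field) combination instead of listing tuples by hand.
# The labels and level names here are illustrative assumptions.
columns = pd.MultiIndex.from_product(
    [["Group1", "Group2"], ["X", "Y"]],
    names=["group", "field"],
)

# One inner list per row; values are placed in column order.
df_product = pd.DataFrame(
    [[1, 4, 7, 10], [2, 5, 8, 11], [3, 6, 9, 12]],
    columns=columns,
)

print(df_product.columns.tolist())
```

from_tuples remains the better fit when the groups have different sub-columns, as in the 'A'/'B' and 'C'/'D' layout above.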
Accessing Sub-Columns
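You can access a sub-column by passing its full label, a tuple with one entry per header level, to df.loc. df.iloc selects by integer position only, so it takes the column's position rather than a tuple of labels.
Here’s an example:
# Access sub-column 'A' from 'Group1' by label
df.loc[:, ('Group1', 'A')]
# Access sub-column 'C' from 'Group2' by position (third column, index 2)
df.iloc[:, 2]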
Output:
0    1
1    2
2    3
Name: (Group1, A), dtype: int64

0    7
1    8
2    9
Name: (Group2, C), dtype: int64
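Beyond picking a single sub-column, it is often useful to select a whole group or to select by the inner level only. The sketch below rebuilds the same DataFrame under the assumed name df_demo so that it runs on its own; the variable names group1, col_b, and third are also illustrative.

```python
import pandas as pd

# Rebuild the example DataFrame so this snippet is self-contained.
df_demo = pd.DataFrame(
    [[1, 4, 7, 10], [2, 5, 8, 11], [3, 6, 9, 12]],
    columns=pd.MultiIndex.from_tuples(
        [("Group1", "A"), ("Group1", "B"), ("Group2", "C"), ("Group2", "D")]
    ),
)

# Selecting by the top-level label returns a DataFrame holding every
# sub-column of that group, with only the inner level left as columns.
group1 = df_demo["Group1"]

# xs takes a cross-section along the columns by a label on the inner level.
col_b = df_demo.xs("B", axis=1, level=1)

# iloc works by position: index 2 is the third column, ('Group2', 'C').
third = df_demo.iloc[:, 2]

print(group1)
print(col_b)
print(third)
```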
Modifying Sub-Columns
You can modify a sub-column the same way as any other DataFrame column: select it by its full tuple label and assign the new values. For example:
# Change values in sub-column 'C' from 'Group2'
df.loc[:, ('Group2', 'C')] = [15, 16, 17]
# Display the modified DataFrame
df
Output:
  Group1    Group2
       A  B      C   D
0      1  4     15  10
1      2  5     16  11
2      3  6     17  12
As you can see, the values in sub-column 'C' of 'Group2' are now 15, 16, and 17, while the other columns are unchanged.
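The same tuple-label assignment can also add a new sub-column to an existing group. This is a minimal sketch, assuming a hypothetical sub-column 'E' computed from 'C' and 'D'; the DataFrame is rebuilt under the assumed name df_demo so the snippet runs on its own.

```python
import pandas as pd

# Rebuild the example DataFrame so this snippet is self-contained.
df_demo = pd.DataFrame(
    [[1, 4, 7, 10], [2, 5, 8, 11], [3, 6, 9, 12]],
    columns=pd.MultiIndex.from_tuples(
        [("Group1", "A"), ("Group1", "B"), ("Group2", "C"), ("Group2", "D")]
    ),
)

# Overwrite an existing sub-column, as in the example above.
df_demo.loc[:, ("Group2", "C")] = [15, 16, 17]

# Assigning to a tuple label that does not exist yet adds a new sub-column.
# 'E' is a hypothetical name used only for illustration.
df_demo[("Group2", "E")] = df_demo[("Group2", "C")] + df_demo[("Group2", "D")]

print(df_demo)
```

The new column is appended at the end of the DataFrame, so it sits next to its group here only because 'Group2' is already the last group.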