Pandas Sub-Columns

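In pandas, "sub-columns" is an informal name for hierarchical (multi-level) column headers. The DataFrame's columns are stored as a MultiIndex, so each column is identified by a tuple of labels, one per level, such as ('Group1', 'A').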

Creating Sub-Columns

To create sub-columns, assign a pd.MultiIndex to the DataFrame's columns. MultiIndex.from_tuples builds the index from a list of tuples, where each tuple gives the labels for one column from the top level down.

Here’s an example:

import pandas as pd

# Create a DataFrame with ordinary (single-level) columns
data = {'A': [1, 2, 3],
        'B': [4, 5, 6],
        'C': [7, 8, 9],
        'D': [10, 11, 12]}
df = pd.DataFrame(data)

# Replace the flat labels with two-level (group, sub-column) labels
df.columns = pd.MultiIndex.from_tuples(
    [('Group1', 'A'), ('Group1', 'B'), ('Group2', 'C'), ('Group2', 'D')]
)

# Display the DataFrame
df

This will result in the following DataFrame:

   Group1    Group2   
        A  B      C   D
0     1  4      7  10
1     2  5      8  11
2     3  6      9  12
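
The same layout can also be built in one step rather than by relabeling an existing DataFrame. The sketch below shows one possible alternative: it passes a dict keyed by (group, sub-column) tuples to the DataFrame constructor, which pandas turns into a two-level column index. The name df_alt is chosen here purely for illustration.

```python
import pandas as pd

# Keys are (group, sub-column) tuples; pandas builds a MultiIndex from them
df_alt = pd.DataFrame({
    ('Group1', 'A'): [1, 2, 3],
    ('Group1', 'B'): [4, 5, 6],
    ('Group2', 'C'): [7, 8, 9],
    ('Group2', 'D'): [10, 11, 12],
})

# Inspect the column structure
print(df_alt.columns.nlevels)   # number of header levels
print(df_alt.columns.tolist())  # each column label is a tuple
```

Which form to use is mostly a matter of taste: relabeling suits data that already exists with flat headers, while tuple keys keep the grouping visible next to the values.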

Accessing Sub-Columns

You can access a sub-column by label with df.loc, passing a tuple that gives one label per level. df.iloc selects by integer position only, so it takes the column's position rather than its labels.

Here’s an example:

# Access sub-column 'A' from 'Group1' by label
df.loc[:, ('Group1', 'A')]

# Access sub-column 'C' from 'Group2' by position (third column, index 2)
df.iloc[:, 2]

Output:

0    1
1    2
2    3
Name: (Group1, A), dtype: int64

0    7
1    8
2    9
Name: (Group2, C), dtype: int64
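
Besides selecting a single sub-column, it is often useful to select a whole group or to pick a label from the second level regardless of group. The following is a minimal sketch that rebuilds the example DataFrame so it runs on its own; the variable names group1, col_c, and level_a are illustrative.

```python
import pandas as pd

# Rebuild the example DataFrame with two-level column headers
df = pd.DataFrame(
    [[1, 4, 7, 10], [2, 5, 8, 11], [3, 6, 9, 12]],
    columns=pd.MultiIndex.from_tuples(
        [('Group1', 'A'), ('Group1', 'B'), ('Group2', 'C'), ('Group2', 'D')]
    ),
)

# Select every sub-column under 'Group1'; the result has plain columns A and B
group1 = df['Group1']

# Select by integer position: position 2 is ('Group2', 'C')
col_c = df.iloc[:, 2]

# Select by second-level label only, across all groups
level_a = df.xs('A', axis=1, level=1)

print(group1)
print(col_c)
print(level_a)
```

Selecting by the top-level label drops that level from the result, so group1 behaves like an ordinary single-level DataFrame.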

Modifying Sub-Columns

You can assign to a sub-column the same way you would to any other DataFrame column, using the full tuple as the column key. For example:

# Change values in sub-column 'C' from 'Group2'
df.loc[:, ('Group2', 'C')] = [15, 16, 17]

# Display the modified DataFrame
df

Output:

   Group1    Group2   
        A  B      C   D
0     1  4     15  10
1     2  5     16  11
2     3  6     17  12

As you can see, the values in sub-column ‘C’ from ‘Group2’ have been modified.
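
A tuple key can also appear on the left-hand side to add a new sub-column under an existing group. This is a minimal sketch, assuming a new sub-column named 'Sum' under 'Group1' (a hypothetical name introduced here for illustration); it rebuilds the original example DataFrame so it runs on its own.

```python
import pandas as pd

# Rebuild the example DataFrame with two-level column headers
df = pd.DataFrame(
    [[1, 4, 7, 10], [2, 5, 8, 11], [3, 6, 9, 12]],
    columns=pd.MultiIndex.from_tuples(
        [('Group1', 'A'), ('Group1', 'B'), ('Group2', 'C'), ('Group2', 'D')]
    ),
)

# Assigning to a tuple key that does not exist yet adds a new sub-column
df[('Group1', 'Sum')] = df[('Group1', 'A')] + df[('Group1', 'B')]

# The new column is appended at the end; sorting the columns regroups it
df = df.sort_index(axis=1)

print(df)
```

Sorting is optional, but without it the new column is displayed after 'Group2' rather than next to the other 'Group1' sub-columns.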
