How to Groupby and Aggregate Multiple Columns in Pandas
Answer
Use .groupby() with .agg() and pass a dictionary specifying which aggregation to apply to each column. You can apply different functions to different columns, or multiple functions to the same column.
Why This Happens
Real analysis rarely needs just one aggregation. You want the sum of sales, the average order value, and the count of transactions, all grouped by customer. The .agg() method lets you do this in one operation instead of several separate groupby calls.
Solution
The rule: use .agg() with a dictionary for different functions per column, or named aggregation syntax for clean output column names.
import pandas as pd
df = pd.DataFrame({
    'customer': ['A', 'A', 'B', 'B', 'B'],
    'sales': [100, 150, 200, 50, 300],
    'quantity': [2, 3, 5, 1, 4]
})
# Different aggregations per column
df.groupby('customer').agg({
    'sales': 'sum',
    'quantity': 'mean'
})
# Multiple aggregations on the same column
df.groupby('customer').agg({
    'sales': ['sum', 'mean', 'max'],
    'quantity': ['sum', 'count']
})
# With named output columns (cleaner)
df.groupby('customer').agg(
    total_sales=('sales', 'sum'),
    avg_sales=('sales', 'mean'),
    total_qty=('quantity', 'sum'),
    order_count=('quantity', 'count')
).reset_index()
# Custom aggregation functions
df.groupby('customer').agg({
    'sales': lambda x: x.max() - x.min()  # range (max minus min)
})
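# Group by multiple columns
# The sample df above has no 'region' column, so a hypothetical one is added here for illustration
df['region'] = ['East', 'East', 'West', 'West', 'East']
df.groupby(['customer', 'region']).agg({'sales': 'sum'})
Better Workflow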
Zerve persists your groupby results at the cell level, so you can experiment with different aggregations without re-running your entire data pipeline. Try a different grouping, see the result, and iterate; your previous states are always there.