
How to Groupby and Aggregate Multiple Columns in Pandas

Answer

Use .groupby() with .agg() and pass a dictionary specifying which aggregation to apply to each column. You can apply different functions to different columns, or multiple functions to the same column.

Why This Happens

Real analysis rarely needs just one aggregation. You often want the sum of sales, the average order value, and a count of transactions, all grouped by customer. The .agg() method computes all of these in one operation instead of several separate groupbys.
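
To make the contrast concrete, here is a minimal sketch that computes those three statistics both ways. The orders frame and the output column names (total_sales, avg_order_value, transactions) are illustrative assumptions, not part of any real dataset.

```python
import pandas as pd

# Hypothetical order data, for illustration only
orders = pd.DataFrame({
    'customer': ['A', 'A', 'B', 'B', 'B'],
    'sales': [100, 150, 200, 50, 300],
})

# Separate groupbys: one pass per statistic, then stitch the results together
total = orders.groupby('customer')['sales'].sum()
avg = orders.groupby('customer')['sales'].mean()
count = orders.groupby('customer')['sales'].count()
separate = pd.concat(
    [total, avg, count],
    axis=1,
    keys=['total_sales', 'avg_order_value', 'transactions'],
)

# One .agg() call that yields the same three columns
combined = orders.groupby('customer').agg(
    total_sales=('sales', 'sum'),
    avg_order_value=('sales', 'mean'),
    transactions=('sales', 'count'),
)

print(separate)
print(combined)
```

Both frames hold the same numbers; the single .agg() call simply avoids the intermediate Series and the manual concat step.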

Solution

The rule: use .agg() with a dictionary for different functions per column, or named aggregation syntax for clean output column names.

import pandas as pd

df = pd.DataFrame({
    'customer': ['A', 'A', 'B', 'B', 'B'],
    'sales': [100, 150, 200, 50, 300],
    'quantity': [2, 3, 5, 1, 4]
})

# ✅ Different aggregations per column
df.groupby('customer').agg({
    'sales': 'sum',
    'quantity': 'mean'
})

# ✅ Multiple aggregations on same column (result has two-level MultiIndex columns)
df.groupby('customer').agg({
    'sales': ['sum', 'mean', 'max'],
    'quantity': ['sum', 'count']
})

# ✅ With named output columns (cleaner)
df.groupby('customer').agg(
    total_sales=('sales', 'sum'),
    avg_sales=('sales', 'mean'),
    total_qty=('quantity', 'sum'),
    order_count=('quantity', 'count')
).reset_index()

# ✅ Custom aggregation functions
df.groupby('customer').agg({
    'sales': lambda x: x.max() - x.min()  # range
})

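# ✅ Groupby multiple columns
# The sample df above has no 'region' column, so add illustrative values first
df['region'] = ['East', 'West', 'East', 'East', 'West']
df.groupby(['customer', 'region']).agg({'sales': 'sum'})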
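
The dictionary-of-lists form above returns two-level column labels such as ('sales', 'sum'), which can be awkward to reference later. One common way to flatten them, sketched here under the assumption that an underscore-joined naming scheme (sales_sum, quantity_count, and so on) suits your pipeline, is to join each pair of labels:

```python
import pandas as pd

df = pd.DataFrame({
    'customer': ['A', 'A', 'B', 'B', 'B'],
    'sales': [100, 150, 200, 50, 300],
    'quantity': [2, 3, 5, 1, 4]
})

# Dict-of-lists aggregation: columns come back as a two-level MultiIndex
multi = df.groupby('customer').agg({
    'sales': ['sum', 'mean', 'max'],
    'quantity': ['sum', 'count']
})

# Join each (column, function) pair into one flat name, e.g. 'sales_sum'
flat = multi.copy()
flat.columns = ['_'.join(pair) for pair in multi.columns]

# Move 'customer' from the index back into a regular column
flat = flat.reset_index()

print(flat.columns.tolist())
```

If you control the aggregation call itself, named aggregation gives flat names directly and avoids this extra step; the flattening approach is mostly useful when the list of functions is built dynamically.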

Better Workflow

Zerve persists your groupby results at the cell level, so you can experiment with different aggregations without re-running your entire data pipeline. Try a different grouping, see the result, iterate — your previous states are always there.