Using Pandas GroupBy Transform Function


In this example we demonstrate how to use the groupby transform function from Pandas and add the outputs to the original dataframe.

We first create a dataframe, players, that contains the points and assists of 3 players over different games. We want to add two columns to it, total_points and total_assists, that hold each player's totals on every row belonging to that player. To do this we use groupby transform.

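Groupby transform applies a function to each column within each group and returns a result with the same index (and therefore the same length) as the original dataframe, where every row holds the value computed for its group. The groups are defined by the column(s) we pass to the groupby function. The result is a dataframe when several columns are transformed and a series when a single column is selected.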
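To make the difference in shape concrete, here is a minimal sketch that contrasts agg with transform on a small frame. The data mirrors the players example, and the variable names per_group and per_row are illustrative only.

```python
import pandas as pd

# Illustrative data: points scored by 3 players over different games
players = pd.DataFrame({'player': ['A', 'A', 'B', 'B', 'C'],
                        'points': [22, 27, 8, 12, 16]})

# agg collapses each group to a single row, indexed by player
per_group = players.groupby('player')['points'].agg('sum')

# transform broadcasts each group's result back onto every row of that group,
# so the result shares the index of the original dataframe
per_row = players.groupby('player')['points'].transform('sum')

print(len(per_group))    # 3
print(per_row.tolist())  # [49, 49, 20, 20, 16]
```

Because per_row is aligned with the original index, it can be assigned straight back as a new column without a merge.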

In the example below, we show two equivalent ways to sum the points and assists columns for each player: one using a lambda function and one using a regular function. The outputs of groupby transform are then assigned to the new columns total_points and total_assists in the players dataframe. Finally, we show how to apply groupby transform to the points column only.

import pandas as pd

players = pd.DataFrame({'player': ['A', 'A', 'B', 'B', 'C'],
                        'points': [22, 27, 8, 12, 16],
                        'assists': [1, 2, 7, 4, 11]})

# 1. GroupBy Transform using Lambda
# Select the source columns explicitly so the result always has exactly two columns
players[['total_points', 'total_assists']] = players.groupby('player')[['points', 'assists']].transform(lambda x: x.sum())

# 2. GroupBy Transform using a function
def get_totals(x):
    return x.sum()

# Without the column selection this would also transform the total_* columns added in step 1
players[['total_points', 'total_assists']] = players.groupby('player')[['points', 'assists']].transform(get_totals)

# 3. GroupBy Transform for a single column
players['total_points'] = players.groupby('player')['points'].transform(lambda x: x.sum())
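For a built-in aggregation such as sum, transform also accepts the function name as a string, which typically avoids the overhead of calling a Python lambda per group. The sketch below is one possible variant on the same data, not the only way to do it. It attaches the results by name using add_prefix and join, rather than by column position; the intermediate variable totals is illustrative.

```python
import pandas as pd

players = pd.DataFrame({'player': ['A', 'A', 'B', 'B', 'C'],
                        'points': [22, 27, 8, 12, 16],
                        'assists': [1, 2, 7, 4, 11]})

# Per-player sums, broadcast back to every row of the original dataframe
totals = players.groupby('player')[['points', 'assists']].transform('sum')

# Rename points -> total_points, assists -> total_assists, then join on the index
players = players.join(totals.add_prefix('total_'))

print(players['total_points'].tolist())   # [49, 49, 20, 20, 16]
print(players['total_assists'].tolist())  # [3, 3, 11, 11, 11]
```

Joining by name means the new columns stay correctly labelled even if the order of the selected columns changes.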