Programming

  minte9
LearnRemember



Pandas / Group Rows


Group By

Groupby is one of the most powerful feature in pandas.
 
""" Grouping rows by values

Groupby is one of the most powerful feature in pandas.
We can group rows according to some shared value.

We can also group by a first column, the group that grouping 
by a second column.
"""

import pandas as pd
import pathlib

DIR = pathlib.Path(__file__).resolve().parent
df = pd.read_csv(DIR / '../titanic.csv')

T1 = df.groupby('Sex').count().to_markdown()
T2 = df.groupby('Sex').mean(numeric_only=True).to_markdown()
T3 = df.groupby('Sex')['Survived'].count().to_markdown()
T4 = df.groupby(['Sex', 'Survived'])['Age'].mean().to_markdown()

print("Grouped by Sex:"); print(T1, "\n")
print("Grouped by Sex averages:"); print(T2, "\n")
print("Grouped by Sex (count only Survived):"); print(T3, "\n")
print("Group by two columns (average):"); print(T4, "\n")

"""
Grouped by Sex:
| Sex    |   Name |   PClass |   Age |   Survived |   SexCode |
|:-------|-------:|---------:|------:|-----------:|----------:|
| female |    462 |      462 |   288 |        462 |       462 |
| male   |    851 |      851 |   468 |        851 |       851 | 

Grouped by Sex averages:
| Sex    |     Age |   Survived |   SexCode |
|:-------|--------:|-----------:|----------:|
| female | 29.3964 |   0.666667 |         1 |
| male   | 31.0143 |   0.166863 |         0 | 

Grouped by Sex (count only Survived):
| Sex    |   Survived |
|:-------|-----------:|
| female |        462 |
| male   |        851 | 

Group by two columns (average):
|               |     Age |
|:--------------|--------:|
| ('female', 0) | 24.9014 |
| ('female', 1) | 30.8671 |
| ('male', 0)   | 32.3208 |
| ('male', 1)   | 25.9519 | 
"""





References