ML Pandas Merge Data
Merging brings data from several sources into one place. pd.merge() performs an inner join by default; pass the 'how' parameter to request an outer join.
pd.merge(A, B, on='id')
pd.merge(A, B, on='id', how='outer')
Merge
p56 In the real world, data usually comes from multiple sources.
""" Merging DataFrames
In the real world, data usually comes from multiple sources,
and sometimes we need to bring it all together in one place.
pd.merge() does an inner join by default; for an outer join, pass how='outer'.
"""
import pandas as pd  # note: to_markdown() also needs the optional 'tabulate' package

# Employees have ids 1-4; sales exist for ids 3-6, so only ids 3 and 4 overlap.
employees = pd.DataFrame()
employees['id_employee'] = [1, 2, 3, 4]
employees['name'] = ['John', 'Mary', 'Bob', 'Michael']
sales = pd.DataFrame()
sales['id_employee'] = [3, 4, 5, 6]
sales['total_sales'] = [10000, 20000, 30000, 40000]

# Inner join (default): keep only the ids present in both DataFrames
T = pd.merge(employees, sales, on='id_employee')
print(T.to_markdown())
# | | id_employee | name | total_sales |
# |---:|--------------:|:--------|--------------:|
# | 0 | 3 | Bob | 10000 |
# | 1 | 4 | Michael | 20000 |

# Outer join (how='outer'): keep every id from both DataFrames;
# values missing on one side are filled with NaN
T = pd.merge(employees, sales, on='id_employee', how='outer')
print(T.to_markdown())
# | | id_employee | name | total_sales |
# |---:|--------------:|:--------|--------------:|
# | 0 | 1 | John | nan |
# | 1 | 2 | Mary | nan |
# | 2 | 3 | Bob | 10000 |
# | 3 | 4 | Michael | 20000 |
# | 4 | 5 | nan | 30000 |
# | 5 | 6 | nan | 40000 |

# left_on / right_on name the key column in each DataFrame, which is useful
# when the two key columns have different names. Here both are 'id_employee',
# so the result is the same as the inner join above.
T = pd.merge(employees, sales, left_on='id_employee', right_on='id_employee')
print(T.to_markdown())
# | | id_employee | name | total_sales |
# |---:|--------------:|:--------|--------------:|
# | 0 | 3 | Bob | 10000 |
# | 1 | 4 | Michael | 20000 |
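The example above covers inner and outer joins. As a complementary sketch, the block below reuses the same toy data to show three things the page does not spell out: a left join (how='left'), the indicator=True option of pd.merge, and left_on / right_on with key columns that actually differ. The column name seller_id and the variable names left, audit, sales_renamed and by_key are hypothetical, introduced only for illustration.

```python
import pandas as pd

# Same toy data as in the example above, built from dicts for brevity.
employees = pd.DataFrame({
    "id_employee": [1, 2, 3, 4],
    "name": ["John", "Mary", "Bob", "Michael"],
})
sales = pd.DataFrame({
    "id_employee": [3, 4, 5, 6],
    "total_sales": [10000, 20000, 30000, 40000],
})

# Left join: keep every employee, even those with no matching sales row.
# Employees without sales get NaN in total_sales.
left = pd.merge(employees, sales, on="id_employee", how="left")

# indicator=True adds a "_merge" column recording where each row came from:
# "left_only", "right_only" or "both".
audit = pd.merge(employees, sales, on="id_employee", how="outer", indicator=True)

# Hypothetical variant: the key column is named differently in the second frame.
# "seller_id" is an assumed name, not part of the original data.
sales_renamed = sales.rename(columns={"id_employee": "seller_id"})
by_key = pd.merge(employees, sales_renamed,
                  left_on="id_employee", right_on="seller_id")

print(left)
print(audit[["id_employee", "_merge"]])
print(by_key.columns.tolist())
```

When the key columns have different names, both are kept in the result, so one of them can be dropped afterwards if it is redundant.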
Last update: 46 days ago