Python Pandas-Merging Using the ‘how’ Argument

Aanu of Tech
The Startup
Published in
3 min readSep 4, 2020

--

In reality, Data Science projects often involves gathering information from variety of sources which might require data from multiple tables. Therefore, in order to conduct analysis, there is a need to join the tables. Merging in Python Pandas is a very effective way to successfully carry out this operation.

This tutorial aims to take you through the step by step process of different methods of merging data frames with pandas library in Python using the “how” argument.

First, Import datasets and convert to tables

import pandas as pd

df = pd.read_csv(‘link.csv’)
#convert year column to dateobject
df[‘Year’] = pd.to_datetime(df[‘Year’], format = ‘%Y’).dt.year
#groupby country
df= pd.DataFrame(df1.groupby([‘Country’, ‘Year’],as_index=False).sum())
print(df)

df1 = pd.read_excel (r’path were file is stored.xlsx’)
print(df1)

So, we have two tables: df and df1

df columns= Country, Year and Value

df1 columns= Country Name, Country Code, Year and value

In order to merge both tables, a primary key is needed. Notice that the column that signifies Country has different names for both tables. Therefore, let’s rename table df1 ‘Country Name’ column to ‘Country’ and group by the column.

Now, merging both dataframes

Notice that we merged left, this implies that the table above contain only rows that match with table df only. That is, any extra countries contained in table df1 that is not in table df is not included in the above table df2

Now let’s use right, inner and outer merge

The right merge removed all rows that do not match with the table at the right hand side. i.e, df1
THE RIGHT MERGE REMOVES ROWS THAT DO NOT MATCH WITH THE TABLE AT THE RIGHT HAND SIDE. I.E, DF1
THE OUTER MERGE RETAIN ALL ROWS

The inner merge remove rows that do not match in both dataframes. This is the default pandas merge in Python if you do not specify the kind of merge you want.

Kindly let me know of any comments, suggestions or questions you might have :)

--

--