Debug School

rakesh kumar

Different methods to combine DataFrames in pandas

pandas offers three main ways to combine DataFrames: concat, merge, and join. Each one is explained below with an example and its expected output.

concat

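The concat function stacks DataFrames along one axis: rows (axis=0, the default) or columns (axis=1). It keeps the original index labels unless told otherwise, which is why the row labels 0 and 1 repeat in the first output below.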

import pandas as pd

# Example DataFrames
df1 = pd.DataFrame({'A': ['A0', 'A1'], 'B': ['B0', 'B1']})
df2 = pd.DataFrame({'A': ['A2', 'A3'], 'B': ['B2', 'B3']})

# Concatenating along rows (axis=0)
result_row = pd.concat([df1, df2])
print("Concatenate along rows:")
print(result_row)

# Concatenating along columns (axis=1)
result_col = pd.concat([df1, df2], axis=1)
print("\nConcatenate along columns:")
print(result_col)

Output:

Concatenate along rows:

    A   B
0  A0  B0
1  A1  B1
0  A2  B2
1  A3  B3

Concatenate along columns:

    A   B   A   B
0  A0  B0  A2  B2
1  A1  B1  A3  B3
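The repeated row labels in the first output are often unwanted. As a small sketch using the same two DataFrames, the ignore_index and keys parameters of pd.concat are two common ways to deal with them (the key labels 'first' and 'second' are made up for illustration):

```python
import pandas as pd

df1 = pd.DataFrame({'A': ['A0', 'A1'], 'B': ['B0', 'B1']})
df2 = pd.DataFrame({'A': ['A2', 'A3'], 'B': ['B2', 'B3']})

# ignore_index=True discards the original row labels and renumbers from 0
renumbered = pd.concat([df1, df2], ignore_index=True)
print(renumbered.index.tolist())

# keys= tags each input with a label, producing a two-level (Multi)Index
# ('first' and 'second' are illustrative labels, not part of the original example)
labelled = pd.concat([df1, df2], keys=['first', 'second'])
print(labelled.loc['second'])
```

Use ignore_index=True when the old labels carry no meaning, and keys when you need to trace each row back to the DataFrame it came from.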

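merge

The merge function combines DataFrames on the values of one or more common columns (or on their indexes), in the style of a SQL join. The default is an inner join, and overlapping non-key column names receive the suffixes _x and _y.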

# Example DataFrames
df1 = pd.DataFrame({'key': ['K0', 'K1'], 'value': ['V0', 'V1']})
df2 = pd.DataFrame({'key': ['K0', 'K1'], 'value': ['V2', 'V3']})

# Merging based on the 'key' column
result = pd.merge(df1, df2, on='key')
print("Merge based on 'key' column:")
print(result)

Output:

Merge based on 'key' column:
  key value_x value_y
0  K0      V0      V2
1  K1      V1      V3
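The example above has the same keys on both sides, so the join type makes no difference. The sketch below uses hypothetical DataFrames (left and right, with one unmatched key each) to show how the how, suffixes, and indicator parameters of pd.merge change the result:

```python
import pandas as pd

# Hypothetical data: 'K2' exists only on the left, 'K3' only on the right
left = pd.DataFrame({'key': ['K0', 'K1', 'K2'], 'value': ['V0', 'V1', 'V2']})
right = pd.DataFrame({'key': ['K0', 'K1', 'K3'], 'value': ['V3', 'V4', 'V5']})

# Default how='inner': only keys present in both DataFrames survive
inner = pd.merge(left, right, on='key')
print(inner)

# how='outer' keeps every key; missing values become NaN.
# suffixes= renames the clashing 'value' columns,
# indicator=True adds a '_merge' column saying where each row came from.
outer = pd.merge(left, right, on='key', how='outer',
                 suffixes=('_left', '_right'), indicator=True)
print(outer)
```

The other accepted values for how include 'left' and 'right', which keep all keys from one side only.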

join

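The join method combines DataFrames on their indexes. It is called on a DataFrame (df1.join(df2)) and performs a left join by default, so every row of the calling DataFrame is kept.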

# Example DataFrames with shared index
df1 = pd.DataFrame({'A': ['A0', 'A1'], 'B': ['B0', 'B1']}, index=['K0', 'K1'])
df2 = pd.DataFrame({'C': ['C0', 'C1'], 'D': ['D0', 'D1']}, index=['K0', 'K1'])

# Joining based on the index
result = df1.join(df2)
print("Join based on index:")
print(result)

Output:

Join based on index:

     A   B   C   D
K0  A0  B0  C0  D0
K1  A1  B1  C1  D1
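When the indexes do not match exactly, the join type matters. The following sketch uses hypothetical data (an extra 'K2' row and an orders DataFrame, neither from the original example) to show the how parameter and the on parameter, which matches a column of the calling DataFrame against the index of the other one:

```python
import pandas as pd

# Hypothetical data: 'K2' has no counterpart in df2
df1 = pd.DataFrame({'A': ['A0', 'A1', 'A2']}, index=['K0', 'K1', 'K2'])
df2 = pd.DataFrame({'C': ['C0', 'C1']}, index=['K0', 'K1'])

# Default how='left': all rows of df1 are kept, 'K2' gets NaN in column C
left_join = df1.join(df2)
print(left_join)

# how='inner': only index labels present in both DataFrames are kept
inner_join = df1.join(df2, how='inner')
print(inner_join)

# on='key': look up each value of the caller's 'key' column in df2's index
orders = pd.DataFrame({'key': ['K1', 'K0', 'K1'], 'qty': [1, 2, 3]})
with_c = orders.join(df2, on='key')
print(with_c)
```

If both DataFrames share a column name, join raises an error unless lsuffix or rsuffix is given.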

Key Differences

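  1. concat stacks DataFrames along an axis (rows or columns) and aligns them on the labels of the other axis; it does not match on column values.
  2. merge combines DataFrames by matching the values of specified columns (or indexes), like a SQL join; it is an inner join by default.
  3. join combines DataFrames by matching their indexes; it is a left join by default.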
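To make the relationship concrete, here is a minimal sketch reusing the DataFrames from the join example. It assumes both DataFrames have identical indexes and no overlapping column names; under that assumption the three approaches should produce the same result:

```python
import pandas as pd

df1 = pd.DataFrame({'A': ['A0', 'A1'], 'B': ['B0', 'B1']}, index=['K0', 'K1'])
df2 = pd.DataFrame({'C': ['C0', 'C1'], 'D': ['D0', 'D1']}, index=['K0', 'K1'])

# Index-based left join via the DataFrame method
via_join = df1.join(df2)

# The same join spelled out with merge
via_merge = pd.merge(df1, df2, left_index=True, right_index=True, how='left')

# Column-wise concatenation aligns on the row index
via_concat = pd.concat([df1, df2], axis=1)

# DataFrame.equals compares labels, values, and dtypes
print(via_join.equals(via_merge))
print(via_join.equals(via_concat))
```

Once the indexes differ, the results diverge: concat with axis=1 keeps the union of labels by default, while join keeps only the labels of the calling DataFrame.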