Explain the different types of imputers in ML

In machine learning, imputers handle missing data by replacing each missing value with an estimate. There are several types of imputers, each estimating missing values from the available data in its own way. Here are some common ones, with examples using scikit-learn:

Mean Imputer:
The mean imputer replaces missing values with the mean of the available values for that feature.

import pandas as pd
from sklearn.impute import SimpleImputer

# Example DataFrame with missing values
data = pd.DataFrame({'A': [1, 2, None, 4, 5], 'B': [10, None, 30, 40, 50]})

# Create a mean imputer instance
mean_imputer = SimpleImputer(strategy='mean')

# Fit and transform the data
data_imputed = mean_imputer.fit_transform(data)

print(data_imputed)
# Output: [[1. 10.]
#          [2. 32.5]
#          [3. 30.]
#          [4. 40.]
#          [5. 50.]]
# Column means over the observed values: A = (1+2+4+5)/4 = 3, B = (10+30+40+50)/4 = 32.5
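
To make the mechanics concrete, here is a minimal sketch that reproduces mean imputation by hand with pandas and compares it with what the fitted imputer learned. It assumes the same example DataFrame as above; the variable names (column_means, manual, learned_means) are only illustrative.

```python
import numpy as np
import pandas as pd
from sklearn.impute import SimpleImputer

# Same example DataFrame as above
data = pd.DataFrame({'A': [1, 2, None, 4, 5], 'B': [10, None, 30, 40, 50]})

# Column means over the observed values only (pandas skips NaN by default)
column_means = data.mean()

# Manual imputation: fill each column's gaps with that column's mean
manual = data.fillna(column_means)

# The fitted imputer stores the per-column fill values in statistics_
imputer = SimpleImputer(strategy='mean').fit(data)
learned_means = imputer.statistics_

print(learned_means)
```

The fill values are learned in fit and reused in transform, so the same fitted imputer can be applied to new rows later.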

Median Imputer:
The median imputer replaces missing values with the median of the available values for that feature.

from sklearn.impute import SimpleImputer

# Create a median imputer instance
median_imputer = SimpleImputer(strategy='median')

# Fit and transform the data
data_imputed = median_imputer.fit_transform(data)

print(data_imputed)
# Output: [[1. 10.]
#          [2. 35.]
#          [3. 30.]
#          [4. 40.]
#          [5. 50.]]
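
A case where the median and the mean disagree is a feature with an outlier. The sketch below uses a small made-up column (not the DataFrame above) with one extreme value and one gap, and compares the two fill values.

```python
import numpy as np
from sklearn.impute import SimpleImputer

# Hypothetical single feature: three small values, one outlier, one missing value
X = np.array([[1.0], [2.0], [3.0], [100.0], [np.nan]])

# Fill value chosen by each strategy for the last (missing) row
mean_fill = SimpleImputer(strategy='mean').fit_transform(X)[-1, 0]
median_fill = SimpleImputer(strategy='median').fit_transform(X)[-1, 0]

print(mean_fill, median_fill)
```

The mean is pulled toward the outlier (26.5), while the median stays with the bulk of the data (2.5), which is why the median is often preferred for skewed features.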

Most Frequent Imputer:
The most frequent imputer replaces missing values with the most frequent value (mode) of the available values for that feature.

from sklearn.impute import SimpleImputer

# Create a most frequent imputer instance
mode_imputer = SimpleImputer(strategy='most_frequent')

# Fit and transform the data
data_imputed = mode_imputer.fit_transform(data)

print(data_imputed)
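# Output: [[1. 10.]
#          [2. 10.]
#          [1. 30.]
#          [4. 40.]
#          [5. 50.]]
# Every value in this data occurs exactly once, so there is no single mode;
# scikit-learn breaks the tie by using the smallest value (1 for A, 10 for B).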
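
The most frequent strategy is more natural on categorical data, where a mean or median is not defined. The following is a small sketch on a made-up color column (an assumption for illustration, not part of the DataFrame above), plus a numeric column that shows the tie rule.

```python
import numpy as np
from sklearn.impute import SimpleImputer

# Hypothetical categorical feature with one missing entry
colors = np.array([['red'], ['blue'], ['red'], [np.nan], ['green']], dtype=object)

color_imputer = SimpleImputer(strategy='most_frequent')
colors_filled = color_imputer.fit_transform(colors)

# Numeric feature where every value occurs once: the smallest value is used
numbers = np.array([[3.0], [1.0], [2.0], [np.nan]])
numbers_filled = SimpleImputer(strategy='most_frequent').fit_transform(numbers)

print(colors_filled.ravel())
print(numbers_filled.ravel())
```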

KNN Imputer:
The KNN imputer replaces each missing value with the average of that feature over the k nearest neighbors of the sample. Distances between samples are computed only on the features that both samples have observed.

from sklearn.impute import KNNImputer

# Create a KNN imputer instance
knn_imputer = KNNImputer(n_neighbors=2)

# Fit and transform the data
data_imputed = knn_imputer.fit_transform(data)

print(data_imputed)
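# Output: [[1. 10.]
#          [2. 25.]
#          [x  30.]
#          [4. 40.]
#          [5. 50.]]
# 25 is the average of B=10 and B=40, the rows closest to A=2 (A=1 and A=4).
# x is 2.5 or 4.5, not 3: the nearest row by B is B=40 (A=4), while B=10 (A=1)
# and B=50 (A=5) are tied for second place, so the result depends on tie-breaking.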
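
To see where a KNN-imputed value comes from, the sketch below recomputes the missing B value of the second row by hand. It assumes the same values as the example DataFrame, uses scikit-learn's nan_euclidean_distances helper for the distances, and is a simplified illustration of the idea rather than the library's actual implementation.

```python
import numpy as np
from sklearn.impute import KNNImputer
from sklearn.metrics.pairwise import nan_euclidean_distances

# Same values as the example DataFrame, as a float array
X = np.array([[1, 10], [2, np.nan], [np.nan, 30], [4, 40], [5, 50]], dtype=float)

# Pairwise distances that skip missing coordinates; the result is NaN when
# two rows have no observed feature in common
dist = nan_euclidean_distances(X)

row, col = 1, 1  # the missing B value in the second row

# Donor candidates: other rows where B is observed and the distance is defined
donors = [i for i in range(len(X))
          if i != row and not np.isnan(X[i, col]) and not np.isnan(dist[row, i])]

# Keep the two closest donors and average their B values
nearest = sorted(donors, key=lambda i: dist[row, i])[:2]
manual_value = X[nearest, col].mean()

# Value produced by scikit-learn for the same cell
sklearn_value = KNNImputer(n_neighbors=2).fit_transform(X)[row, col]

print(nearest, manual_value, sklearn_value)
```

The third row (A missing, B=30) cannot serve as a donor here because it shares no observed feature with the second row, so its distance is undefined.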

Iterative Imputer:
The Iterative Imputer uses a multivariate approach to estimate missing values by modeling each feature with missing values as a function of other features.

# IterativeImputer is experimental; this import must come first to enable it
from sklearn.experimental import enable_iterative_imputer  # noqa: F401
from sklearn.impute import IterativeImputer

# Create an Iterative imputer instance
iterative_imputer = IterativeImputer()

# Fit and transform the data
data_imputed = iterative_imputer.fit_transform(data)

print(data_imputed)
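# Output (approximate; exact decimals can vary with the scikit-learn version):
# [[1.  10.]
#  [2.  ~20.]
#  [~3. 30.]
#  [4.  40.]
#  [5.  50.]]
# B is 10 times A in every complete row, so the fitted regressions
# impute values close to 20 for B and close to 3 for A.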
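
Because the iterative imputer models each feature from the others, its estimates follow the relationship between columns instead of a per-column statistic. The sketch below is a minimal check of that on the same values as above; max_iter and random_state are set explicitly only to make the run easier to repeat, and the printed decimals are not guaranteed to match across versions.

```python
import numpy as np
from sklearn.experimental import enable_iterative_imputer  # noqa: F401
from sklearn.impute import IterativeImputer

# Same values as the example DataFrame; B is 10 times A in every complete row
X = np.array([[1, 10], [2, np.nan], [np.nan, 30], [4, 40], [5, 50]], dtype=float)

# Default estimator (a Bayesian ridge regression), with explicit iteration cap and seed
imputer = IterativeImputer(max_iter=10, random_state=0)
X_imputed = imputer.fit_transform(X)

imputed_b = X_imputed[1, 1]  # B for the row with A=2
imputed_a = X_imputed[2, 0]  # A for the row with B=30

print(round(imputed_b, 2), round(imputed_a, 2))
```

Compare this with the mean imputer, which would fill B with 32.5 for the same row regardless of the value of A.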

These are some of the common types of imputers used in machine learning. The choice of imputer depends on the type and distribution of the data, how many values are missing and how they relate to other features, and the specific characteristics of the problem you are trying to solve.
