Data Preprocessing Using Data Reduction Techniques In Python

import numpy as np
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
from sklearn.feature_selection import (VarianceThreshold, RFE, SelectFromModel,
                                       SelectKBest, f_classif, chi2,
                                       mutual_info_classif)
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import confusion_matrix, classification_report
from sklearn.datasets import load_iris

Principal Component Analysis (PCA)
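
The plotting code in this section reads from a DataFrame named finalDf that is not defined in the text shown here. As a minimal sketch, assuming the Iris data is loaded with load_iris, standardized, and reduced to two components, such a frame could be built as follows. The standardization step, the 'Iris-' prefix on the class names, and the intermediate variable names are assumptions chosen to match the column and target names used by the plot.

```python
import pandas as pd
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

iris = load_iris()

# PCA is sensitive to feature scale, so standardize first (assumed step).
X_scaled = StandardScaler().fit_transform(iris.data)

# Project the four original features onto two principal components.
pca = PCA(n_components=2)
components = pca.fit_transform(X_scaled)

# Assemble the frame the plotting code expects.
finalDf = pd.DataFrame(
    components,
    columns=['principal component 1', 'principal component 2'],
)
# Assumption: prefix the class names so they match 'Iris-setosa' etc.
finalDf['target'] = ['Iris-' + iris.target_names[i] for i in iris.target]

# Share of the total variance retained by each component.
print(pca.explained_variance_ratio_)
```

Printing explained_variance_ratio_ is a quick way to check how much information the two retained components keep before relying on the 2D plot.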

# finalDf is assumed to be a DataFrame with the columns
# 'principal component 1', 'principal component 2', and 'target'.
fig = plt.figure(figsize=(8, 8))
ax = fig.add_subplot(1, 1, 1)
ax.set_xlabel('Principal Component 1', fontsize=10)
ax.set_ylabel('Principal Component 2', fontsize=10)
ax.set_title('2 component PCA', fontsize=15)

targets = ['Iris-setosa', 'Iris-versicolor', 'Iris-virginica']
colors = ['r', 'g', 'b']
for target, color in zip(targets, colors):
    # Boolean mask selecting the rows that belong to the current class
    indicesToKeep = finalDf['target'] == target
    ax.scatter(finalDf.loc[indicesToKeep, 'principal component 1'],
               finalDf.loc[indicesToKeep, 'principal component 2'],
               c=color,
               s=50)
ax.legend(targets)
plt.show()


Univariate Feature Selection

  • Univariate feature selection picks the best features based on univariate statistical tests.
  • Each feature is tested against the target variable to check whether there is a statistically significant relationship between them.
  • When the relationship between one feature and the target variable is analyzed, the other features are ignored. That is why it is called 'univariate'.
  • Each feature gets its own test score.
  • Finally, all the test scores are compared, and the features with the top scores are selected.
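
The steps above can be sketched with scikit-learn's SelectKBest. This is a minimal illustration on the Iris dataset, not the article's own code: the choice of f_classif (the ANOVA F-test) as the scoring function and k=2 as the number of features to keep are assumptions made for the example.

```python
from sklearn.datasets import load_iris
from sklearn.feature_selection import SelectKBest, f_classif

iris = load_iris()
X, y = iris.data, iris.target
feature_names = iris.feature_names

# Score every feature against the target independently, then keep the top k.
selector = SelectKBest(score_func=f_classif, k=2)
X_selected = selector.fit_transform(X, y)

# Each feature has its own test score.
for name, score in zip(feature_names, selector.scores_):
    print(f"{name}: F = {score:.1f}")

# get_support() returns a boolean mask over the original features.
selected = [n for n, keep in zip(feature_names, selector.get_support()) if keep]
print("Selected:", selected)
```

On this dataset the two petal measurements receive the highest F-scores, so they are the features that remain in X_selected.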
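Scoring functions that scikit-learn provides for these tests:
  1. for classification: f_classif, chi2, mutual_info_classif
  2. for regression: f_regression, mutual_info_regression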
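
As a hedged illustration of how the scoring function depends on the task, the sketch below uses chi2 for a classification problem and f_regression for a regression problem. The synthetic regression data from make_regression, the value k=2, and the variable names are assumptions for the example only.

```python
from sklearn.datasets import load_iris, make_regression
from sklearn.feature_selection import SelectKBest, chi2, f_regression

# Classification: chi2 requires non-negative features, which Iris satisfies.
X_cls, y_cls = load_iris(return_X_y=True)
cls_selector = SelectKBest(score_func=chi2, k=2).fit(X_cls, y_cls)

# Regression: synthetic data with a continuous target (assumed setup).
X_reg, y_reg = make_regression(n_samples=200, n_features=5,
                               n_informative=2, random_state=0)
reg_selector = SelectKBest(score_func=f_regression, k=2).fit(X_reg, y_reg)

# Column indices of the features each selector keeps.
print("classification:", cls_selector.get_support(indices=True))
print("regression:", reg_selector.get_support(indices=True))
```

Passing a classification scorer to a regression target (or the reverse) does not match the assumptions of the statistical test, so the scorer should always be chosen to fit the type of the target variable.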



