Introduction to Kernel PCA with Python

Principal component analysis (PCA) is one of the most widely used methods for reducing dimensionality in unsupervised learning. One of the assumptions of PCA is that the data are linearly separable. Kernel PCA is a variant of PCA that can process nonlinear data and make it linearly separable.

If you’re wondering what linear separability is, the Python Machine Learning book we recently reviewed has a nice graphic illustrating it. Suppose we know that the data come from two groups. If the data are linearly separable, we can split them into the two groups with a straight line, as shown below. If they are not, a more complex function, such as a polynomial, may be required to partition the data. Since classical PCA computes each PC as a linear combination of the original features, it cannot separate nonlinear data.

Linear vs. nonlinear problem
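
As a rough numerical illustration of linear separability, the sketch below fits a linear classifier to one data set that a straight line can split and to one that it cannot. The use of make_blobs(), the blob centers, and LogisticRegression are assumptions made for this illustration and are not part of the original post.

```python
from sklearn.datasets import make_blobs, make_circles
from sklearn.linear_model import LogisticRegression

# Two well-separated blobs (illustrative centers): a straight line can split them.
X_lin, y_lin = make_blobs(n_samples=200, centers=[[-3, -3], [3, 3]],
                          cluster_std=1.0, random_state=1)

# A circle with a second group near its center: no straight line can split them.
X_circ, y_circ = make_circles(n_samples=200, random_state=1,
                              noise=0.1, factor=0.1)

# Training accuracy of a linear classifier on each data set.
acc_lin = LogisticRegression().fit(X_lin, y_lin).score(X_lin, y_lin)
acc_circ = LogisticRegression().fit(X_circ, y_circ).score(X_circ, y_circ)
print(f"linear data: {acc_lin:.2f}, circular data: {acc_circ:.2f}")
```

A linear classifier should score close to 1 on the blobs and much lower on the circular data, which is what "not linearly separable" means in practice.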

What happens when you apply regular PCA to a data set that is not linearly separable? And how can we handle such a data set? In this post, we will explore these questions with examples using sklearn.

Let’s start by loading all the packages needed to illustrate kernel PCA. We first import make_circles from sklearn’s datasets module to create nonlinear data. We then load the two classes that we will use to perform regular PCA and kernel PCA with sklearn.

from sklearn.datasets import make_circles
from sklearn.decomposition import PCA, KernelPCA
import matplotlib.pyplot as plt
import numpy as np
import seaborn as sns
import pandas as pd

To create nonlinear data, we will use make_circles() to generate circular data from two groups. Here we generate 200 data points from two groups: one group forms a circle, and the other consists of points concentrated near the center of that circle. The function make_circles() returns the data and the group assignment for each observation.

# Create linearly inseparable data
X, y = make_circles(n_samples=200, random_state=1, noise=0.1, factor=0.1)
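
To get a feel for what make_circles() returns, the following sketch inspects the shapes, the group sizes, and the average distance from the origin in each group. The variable names are illustrative only.

```python
import numpy as np
from sklearn.datasets import make_circles

X, y = make_circles(n_samples=200, random_state=1, noise=0.1, factor=0.1)

print(X.shape)         # one row per point, two coordinates per row
print(np.bincount(y))  # number of points in each group

# Distance of every point from the origin, averaged per group.
radius = np.linalg.norm(X, axis=1)
mean_radius_outer = radius[y == 0].mean()  # label 0: outer circle
mean_radius_inner = radius[y == 1].mean()  # label 1: points near the center
print(mean_radius_outer, mean_radius_inner)
```

The factor argument sets the size of the inner group relative to the outer circle, so a small value such as 0.1 squeezes the second group into the center.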

We will store the data in a Pandas data frame together with the group assignment variable.

df = pd.DataFrame(X)
df.columns = ['a', 'b']
df['y'] = y

We can use Seaborn’s scatterplot() function to visualize the nonlinearity of the data.

sns.scatterplot(data=df, x='a', y='b', hue=y)

As expected, in this example we see that we have data from two groups with a distinct nonlinear pattern.

Non-linear circular data for kernel PCA


PCA for non-linear data

Now let’s apply regular PCA to this nonlinear data and see what the PCs look like. We use sklearn’s PCA class to perform the PCA.

scikit_pca = PCA(n_components=2)
X_pca = scikit_pca.fit_transform(X)

To visualize the results of regular PCA, we make a scatter plot of PC1 against PC2. Let us first store the PCA results in a Pandas data frame together with the known group assignment.

pc_res = pd.DataFrame(X_pca)
pc_res.columns = ['pc1', 'pc2']
pc_res['y'] = y
pc_res.head()

The PCA plot looks very similar to the original data, and no straight line can separate the two groups.

sns.scatterplot(data=pc_res, x='pc1', y='pc2', hue=y)

PCA on non-linear circular data
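
One way to see why the PCA plot resembles the original data: with two input features and two components, PCA only centers and rotates (or reflects) the points. The sketch below checks this by comparing pairwise distances before and after the transform. The use of SciPy's pdist() is an assumption made for this check and does not appear in the original post.

```python
import numpy as np
from scipy.spatial.distance import pdist
from sklearn.datasets import make_circles
from sklearn.decomposition import PCA

X, y = make_circles(n_samples=200, random_state=1, noise=0.1, factor=0.1)
X_pca = PCA(n_components=2).fit_transform(X)

# If PCA is only a rigid motion here, every distance between two points
# is the same before and after the transform.
same_geometry = np.allclose(pdist(X), pdist(X_pca))
print(same_geometry)
```

Because the geometry is unchanged, a group that surrounds another group in the original data still surrounds it in the PC1 vs. PC2 plot.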

Dimensionality reduction with kernel PCA using scikit-learn

We now use the same data, but this time we apply kernel PCA using the KernelPCA() function in sklearn. The basic idea of kernel PCA is to use a kernel function to project the nonlinear data into a higher-dimensional space where the groups are linearly separable, and then to use regular PCA to reduce the dimensionality.
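
The idea can be sketched from scratch with NumPy. This is a minimal illustration, not sklearn's actual implementation: it builds an RBF kernel matrix, centers it, and uses its leading eigenvectors as the projected coordinates. The value gamma = 10 is an assumption chosen for the sketch, and the variable names are illustrative.

```python
import numpy as np
from sklearn.datasets import make_circles
from sklearn.decomposition import KernelPCA

X, y = make_circles(n_samples=200, random_state=1, noise=0.1, factor=0.1)
gamma = 10  # assumed RBF width parameter for this sketch

# 1. RBF kernel matrix: K[i, j] = exp(-gamma * ||x_i - x_j||^2)
sq_dists = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
K = np.exp(-gamma * sq_dists)

# 2. Center the kernel matrix, which centers the data in the implicit feature space
n = K.shape[0]
one_n = np.full((n, n), 1.0 / n)
K_centered = K - one_n @ K - K @ one_n + one_n @ K @ one_n

# 3. Eigendecomposition; eigh returns eigenvalues in ascending order, so reverse
eigvals, eigvecs = np.linalg.eigh(K_centered)
eigvals, eigvecs = eigvals[::-1], eigvecs[:, ::-1]

# 4. Coordinates of the training points on the top two components
X_manual = eigvecs[:, :2] * np.sqrt(eigvals[:2])

# Compare with sklearn; the sign of each component is arbitrary, so compare magnitudes
X_sklearn = KernelPCA(n_components=2, kernel="rbf", gamma=gamma).fit_transform(X)
matches = np.allclose(np.abs(X_manual), np.abs(X_sklearn), atol=1e-6)
print(matches)
```

Note that the high-dimensional space is never constructed explicitly. Only the kernel matrix of pairwise similarities is needed, which is what makes the approach practical.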

Here we use KernelPCA() with the RBF kernel (kernel="rbf") to perform kernel PCA.

kpca = KernelPCA(kernel="rbf",
                 fit_inverse_transform=True,
                 gamma=10,
                 n_components=2)
X_kpca = kpca.fit_transform(X)

Let’s store the results in a data frame as before.

kpca_res = pd.DataFrame(X_kpca)
kpca_res.columns = ['kpc1', 'kpc2']
kpca_res['y'] = y
kpca_res.head()

We can now visualize the PCs from kernel PCA with a scatter plot and clearly see that the data are linearly separable.

sns.scatterplot(data=kpca_res, x='kpc1', y='kpc2', hue=y)

PCA plot of nonlinear data with kernel PCA
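
The visual impression can also be checked numerically. As a rough sketch, we can fit a linear classifier to the regular PCA projection and to the kernel PCA projection and compare the training accuracies. LogisticRegression and the helper linear_accuracy() are assumptions introduced for this check and are not part of the original post.

```python
from sklearn.datasets import make_circles
from sklearn.decomposition import PCA, KernelPCA
from sklearn.linear_model import LogisticRegression

X, y = make_circles(n_samples=200, random_state=1, noise=0.1, factor=0.1)

X_pca = PCA(n_components=2).fit_transform(X)
X_kpca = KernelPCA(n_components=2, kernel="rbf", gamma=10).fit_transform(X)

def linear_accuracy(features, labels):
    """Training accuracy of a linear classifier on the given features."""
    return LogisticRegression().fit(features, labels).score(features, labels)

acc_pca = linear_accuracy(X_pca, y)
acc_kpca = linear_accuracy(X_kpca, y)
print(f"PCA: {acc_pca:.2f}, kernel PCA: {acc_kpca:.2f}")
```

A much higher accuracy on the kernel PCA projection indicates that a straight line separates the groups there, while no such line exists in the regular PCA projection.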

Gordon James