Polynomial Regression
What is Linear Regression:
Linear regression analysis is used to predict the value of one variable (the dependent variable) from the value of another variable (the independent variable) by fitting a straight line to the data.
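To make the idea concrete, here is a minimal sketch of fitting a straight line by ordinary least squares in plain Python. The helper name `fit_line` and the `hours`/`scores` data are made up for illustration and are not part of the dataset used later in this post.

```python
# Minimal sketch: fit y = slope * x + intercept by ordinary least squares.
# fit_line and the data below are hypothetical, chosen only for illustration.

def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # slope = covariance(x, y) / variance(x)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

hours = [1, 2, 3, 4, 5]
scores = [52, 54, 56, 58, 60]  # built as 2 * hours + 50

slope, intercept = fit_line(hours, scores)
predicted = slope * 6 + intercept  # prediction for an unseen value x = 6
print(slope, intercept, predicted)
```

Because the made-up points lie exactly on a line, the fitted slope and intercept recover the rule used to build them. Real data will scatter around the line instead.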
To learn more about linear regression, check out: Linear Regression
Multiple Linear Regression:
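Multiple linear regression is a regression model that estimates the relationship between a quantitative dependent variable and two or more independent variables using a linear equation. With two independent variables the fit is a plane rather than a single straight line, and with more it is a hyperplane.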
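As a rough illustration, the sketch below fits a model with two independent variables using scikit-learn's `LinearRegression`. The data is synthetic and generated from a known rule (an assumption made purely so the result can be checked).

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Synthetic data: two independent variables (columns) and one target.
rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(50, 2))

# Target built from a known linear rule: y = 3 * x1 - 2 * x2 + 5
y = 3 * X[:, 0] - 2 * X[:, 1] + 5

model = LinearRegression()
model.fit(X, y)

print(model.coef_)       # one coefficient per independent variable
print(model.intercept_)  # the constant term
```

The model learns one coefficient per column of `X`, so the fitted coefficients should land very close to 3 and -2, with an intercept near 5.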
Polynomial Regression:
In statistics, polynomial regression is a form of regression analysis in which the relationship between the independent variable x and the dependent variable y is modelled as an nth degree polynomial in x.
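Polynomial regression is useful when the data does not follow a straight line: the fitted curve bends to follow the data, whereas in linear regression the fitted line is straight. The model is still linear in its coefficients, which is why it can be trained with ordinary linear regression on the polynomial terms of x.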
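The difference can be sketched on a small curved dataset. The example below assumes noise-free data generated as y = x^2 (chosen only for illustration) and compares a straight-line fit with a degree-2 polynomial fit.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import PolynomialFeatures

# Synthetic curved data: y = x^2 with no noise (illustrative assumption).
x = np.linspace(-3, 3, 30).reshape(-1, 1)
y = (x ** 2).ravel()

# A straight line cannot follow the curve.
linear = LinearRegression().fit(x, y)
linear_r2 = linear.score(x, y)

# Expand x into the columns [1, x, x^2], then fit a linear model on them.
poly = PolynomialFeatures(degree=2)
x_poly = poly.fit_transform(x)
curved = LinearRegression().fit(x_poly, y)
curved_r2 = curved.score(x_poly, y)

print(linear_r2, curved_r2)
```

On this data the straight line explains almost none of the variation, while the polynomial model fits it almost perfectly. Note that the second model is still a `LinearRegression`: only the input columns changed.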
Practical
import pandas as pd
from sklearn.impute import SimpleImputer
import numpy as np
# 'path' is a placeholder: replace it with the location of your CSV file
df=pd.read_csv(r'path')
# Numerical
num_var=df.select_dtypes(include=['int64','float64']).columns
print(df[num_var])
im=SimpleImputer(strategy='mean')
im.fit(df[num_var])
df[num_var]=im.transform(df[num_var])
print(df[num_var].isnull().sum())
# Categorical
cat_var=df.select_dtypes(include='O').columns
imp=SimpleImputer(strategy='most_frequent')
imp.fit(df[cat_var])
df[cat_var]=imp.transform(df[cat_var])
print(df.isnull().sum().sum())
################# DATA PREPROCESSING ###########################################
# Keep only the numerical columns; the categorical ones are dropped, not encoded
df2=df.drop(columns=cat_var)
# print(df2)
################## DATA SPLITTING ################################################
# 'price' is the target; every other numerical column is a feature
X=df2.drop(columns='price')
y=df2['price']
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test=train_test_split(X, y, test_size=0.2, random_state=69)
################## FEATURE SCALING ####################################################
from sklearn.preprocessing import StandardScaler
sc=StandardScaler()
sc.fit(X_train)
X_train=sc.transform(X_train)
X_test=sc.transform(X_test)
############### Training- Polynomial Regression #############################################
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import PolynomialFeatures
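# Caution: degree=10 creates a very large number of terms when X has several columns
# and tends to overfit; a lower degree such as 2 or 3 is a common starting point
pl=PolynomialFeatures(degree=10)
pl.fit(X_train)
X_train_pl=pl.transform(X_train)
X_test_pl=pl.transform(X_test)
# Compare the shape before and after the polynomial expansion
print(X_train.shape, X_train_pl.shape)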
####### Linear Regression ##################################################
lr=LinearRegression()
lr.fit(X_train_pl, y_train)
# score() returns the R^2 (coefficient of determination) on the test set
print(lr.score(X_test_pl, y_test))
######### Prediction #############################################################
pre=lr.predict(X_test_pl)
print(pre)
print(y_test)
########## RMSE, MSE ################################################################
from sklearn.metrics import mean_squared_error
mse=mean_squared_error(y_test, pre)
rmse=np.sqrt(mse)
print(mse,rmse)
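The scaling, polynomial expansion and regression steps above can also be chained into a single scikit-learn pipeline, which makes it easy to compare several degrees. The sketch below is only an illustration: it uses synthetic data with three features as a stand-in for the real CSV, and the degrees 1, 2 and 10 are arbitrary choices.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures, StandardScaler

# Synthetic stand-in for the numerical columns of the dataset (an assumption).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = (1.0 + 2.0 * X[:, 0] - X[:, 1] ** 2 + 0.5 * X[:, 0] * X[:, 2]
     + rng.normal(scale=0.1, size=200))

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0)

results = {}
for degree in (1, 2, 10):
    # Scale, expand to polynomial terms, then fit a linear model on those terms.
    model = make_pipeline(StandardScaler(),
                          PolynomialFeatures(degree=degree),
                          LinearRegression())
    model.fit(X_train, y_train)
    # Number of columns produced by the polynomial expansion.
    n_terms = model.named_steps["polynomialfeatures"].n_output_features_
    test_r2 = model.score(X_test, y_test)
    results[degree] = (n_terms, test_r2)
    print(degree, n_terms, test_r2)
```

With three input features, degree 10 already expands to 286 columns, more than the 160 training rows in this sketch, so the high-degree model has enough freedom to memorize the training data. Comparing test scores across degrees is one simple way to pick a degree that generalizes.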
I would appreciate it if you could help make it better.