R^2 Regression

January 09, 2023

#101daysofcode

R^2 (R-squared) is a measure of how close the data are to the line of best fit given by our regression model.

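A score of 1 (100%) is the best possible score. A score of 0 (0%) means the model does no better than always predicting the mean, and scikit-learn's R^2 can even go negative for a model that does worse than that.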

Goodness of fit tells you how well your model has predicted the values.

In simple terms, R^2 shows how well the data fit the regression model (the goodness of fit).
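To make the score range concrete, here is a small sketch in plain Python. The helper name `r2` and the numbers are made up for illustration. It uses the residual form 1 - SS_residual / SS_total, which is the form scikit-learn documents for `r2_score`.

```python
def r2(y_true, y_pred):
    """R^2 = 1 - SS_residual / SS_total for two equal-length lists."""
    mean_y = sum(y_true) / len(y_true)
    ss_total = sum((y - mean_y) ** 2 for y in y_true)
    ss_residual = sum((y - p) ** 2 for y, p in zip(y_true, y_pred))
    return 1 - ss_residual / ss_total

y_true = [3.0, 5.0, 7.0, 9.0]                 # made-up target values

perfect = r2(y_true, [3.0, 5.0, 7.0, 9.0])    # predictions match exactly
baseline = r2(y_true, [6.0, 6.0, 6.0, 6.0])   # always predict the mean (6.0)
worse = r2(y_true, [9.0, 7.0, 5.0, 3.0])      # predictions in reverse order

print(perfect, baseline, worse)               # 1.0 0.0 -3.0
```

Perfect predictions give 1, predicting the mean every time gives 0, and predictions that are worse than the mean give a negative value.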

How to Calculate R-Squared

The formula for calculating R-squared is:

R^2 = SS_regression / SS_total

Where:

SS_regression is the sum of squares due to regression (explained sum of squares)
SS_total is the total sum of squares
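As a rough worked example, the sketch below computes R^2 as SS_regression / SS_total by hand for a one-feature least-squares line. The data and the helper `fit_line` are hypothetical, chosen only so the arithmetic is easy to follow. It also computes 1 - SS_residual / SS_total, which gives the same value for an ordinary least-squares fit with an intercept, evaluated on the data it was fitted to.

```python
def fit_line(x, y):
    """Least-squares slope and intercept for a single feature."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

x = [1, 2, 3, 4, 5]   # made-up feature values
y = [2, 4, 5, 4, 5]   # made-up target values

slope, intercept = fit_line(x, y)
y_hat = [intercept + slope * xi for xi in x]   # fitted values
mean_y = sum(y) / len(y)

ss_total = sum((yi - mean_y) ** 2 for yi in y)                  # total variation
ss_regression = sum((pi - mean_y) ** 2 for pi in y_hat)         # explained part
ss_residual = sum((yi - pi) ** 2 for yi, pi in zip(y, y_hat))   # unexplained part

r2_explained = ss_regression / ss_total
r2_residual = 1 - ss_residual / ss_total
print(round(r2_explained, 6), round(r2_residual, 6))   # 0.6 0.6
```

Here SS_total is 6, SS_regression is 3.6 and SS_residual is 2.4, so the line explains 60% of the variation in y.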

Practical

import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.impute import SimpleImputer

############## DATA CLEANING #####################################################
df=pd.read_csv(r'path') # replace 'path' with the location of your CSV file
# Numerical
num_var=df.select_dtypes(include=['int64','float64']).columns
print(df[num_var])
im=SimpleImputer(strategy='mean')
im.fit(df[num_var])
df[num_var]=im.transform(df[num_var])
print(df[num_var].isnull().sum())

# Categorical 
cat_var=df.select_dtypes(include='O').columns
imp=SimpleImputer(strategy='most_frequent')
imp.fit(df[cat_var])
df[cat_var]=imp.transform(df[cat_var])

print(df.isnull().sum().sum())

################# DATA PREPROCESSING ###########################################
# Drop the categorical columns so that only numeric features remain.
# Here you should try different methods of data preprocessing,
# and please let me know if you do.
df2=df.drop(columns=cat_var)
# print(df2)
################## DATA SPLITTING ################################################
X=df2.drop(columns='price')
y=df2['price']

X_train, X_test, y_train, y_test=train_test_split(X, y, test_size=0.2, random_state=69)

################## FEATURE SCALING ####################################################
from sklearn.preprocessing import StandardScaler
sc=StandardScaler()
sc.fit(X_train)
X_train=sc.transform(X_train)
X_test=sc.transform(X_test)

################### TRAINING ####################################################
from sklearn.linear_model import LinearRegression
lr=LinearRegression()
lr.fit(X_train, y_train)

# print(lr.coef_) # prints the feature coefficients that our model has learned


print(lr.intercept_)

################# PREDICTION ###############################################
pre=lr.predict(X_test)
print(pre) # the predicted values 
print(y_test) # the original values 

score=lr.score(X_test, y_test) # for a regressor, score() returns R^2 on the test set, not an accuracy
print(score*100) # R^2 expressed as a percentage

########### R-squared #####################################################
from sklearn.metrics import r2_score
y_pre=lr.predict(X_test)
r2=r2_score(y_test, y_pre) # same value as lr.score(X_test, y_test)
print(r2)
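One possible variation on the steps above, not the method used in this post: the sketch below puts the imputer, the scaler and the model into a scikit-learn pipeline, so the mean used for imputation and the scaling statistics are learned from the training split only. The data is synthetic (random numbers standing in for the numeric columns and the 'price' target), so the printed score says nothing about a real dataset. It also shows that `score()` and `r2_score` return the same R^2.

```python
import numpy as np
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the numeric features and the target (assumption).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + 0.5 * X[:, 2] + rng.normal(scale=0.5, size=200)
X[rng.random(X.shape) < 0.05] = np.nan   # blank out roughly 5% of values to impute

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=69
)

# The imputer and scaler are fitted on the training split only, inside the pipeline.
model = make_pipeline(
    SimpleImputer(strategy="mean"),
    StandardScaler(),
    LinearRegression(),
)
model.fit(X_train, y_train)

score = model.score(X_test, y_test)             # R^2 via the estimator
r2 = r2_score(y_test, model.predict(X_test))    # R^2 via sklearn.metrics
print(score, r2)                                # the two values match
```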
