Linear Regression

January 11, 2023


#101daysofcode

#day-3

Linear regression predicts an unknown value from known values by fitting a straight line to the data. For example, you can predict someone's height from their weight, or vice versa.
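To make the idea concrete, here is a minimal sketch of fitting a straight line with the ordinary least-squares formulas in plain Python. The weight and height numbers are made up purely for illustration, and `fit_line` is a hypothetical helper name, not part of any library.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # slope = covariance(x, y) / variance(x)
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    slope = cov_xy / var_x
    # the fitted line always passes through the point (mean_x, mean_y)
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Made-up example data: weight in kg (known) and height in cm (to predict)
weights = [50, 60, 70, 80]
heights = [155, 165, 175, 185]

slope, intercept = fit_line(weights, heights)

# Predict the height of someone who weighs 65 kg
predicted_height = slope * 65 + intercept
print(predicted_height)
```

A library such as scikit-learn does the same kind of fit for you, and also handles more than one input column.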

There are many libraries you could use to perform linear regression, but here I will be using scikit-learn.

Before applying linear regression, make sure your data is clean: it should not contain any missing values, and it must contain numerical values only. If it has categorical values, you can use categorical encoding to convert them into numbers.
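As a rough sketch of what that cleaning can look like, the snippet below fills missing values and one-hot encodes a categorical column with pandas. The `toy` DataFrame and its `city` and `bath` columns are hypothetical and are not taken from the dataset used in the practical part; one-hot encoding is only one of several possible encodings.

```python
import pandas as pd

# Hypothetical toy data with one missing value in each column
toy = pd.DataFrame({
    "city": ["Pune", "Delhi", None, "Pune"],
    "bath": [2.0, None, 3.0, 1.0],
})

# Fill gaps: mean for the numeric column, most frequent value for the categorical one
toy["bath"] = toy["bath"].fillna(toy["bath"].mean())
toy["city"] = toy["city"].fillna(toy["city"].mode()[0])

# One-hot encode the categorical column so that every column is numeric
encoded = pd.get_dummies(toy, columns=["city"], dtype=int)
print(encoded)
```

After this step every column is numeric and there are no missing values left, which is the state linear regression expects.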

Practical

import pandas as pd

from sklearn.model_selection import train_test_split

from sklearn.impute import SimpleImputer

############## DATA CLEANING #####################################################

df=pd.read_csv(r'D:\Coding\Python\Machine Learning\Algorithms\Bengaluru_House_Data.csv')

# Numerical

num_var=df.select_dtypes(include=['int64','float64']).columns

print(df[num_var])

im=SimpleImputer(strategy='mean')

im.fit(df[num_var])

df[num_var]=im.transform(df[num_var])

print(df[num_var].isnull().sum())

# Categorical 

cat_var=df.select_dtypes(include='O').columns

imp=SimpleImputer(strategy='most_frequent')

imp.fit(df[cat_var])

df[cat_var]=imp.transform(df[cat_var])

print(df.isnull().sum().sum())

################# DATA PREPROCESSING ###########################################

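df2=df.drop(columns=cat_var) # drop every categorical column so only numeric features remain

# print(df2) # dropping columns throws information away, so this is not a very effective way to

             # preprocess data; I recommend using a different method, such as categorical encoding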

################## DATA SPLITTING ###############################################

X=df2.drop(columns='price') # features: every remaining column except the target

y=df2['price'] # target: the value we want to predict

X_train, X_test, y_train, y_test=train_test_split(X, y, test_size=0.2, random_state=69)

################## FEATURE SCALING ####################################################

from sklearn.preprocessing import StandardScaler

sc=StandardScaler()

sc.fit(X_train)

X_train=sc.transform(X_train)

X_test=sc.transform(X_test)

################### TRAINING ####################################################

from sklearn.linear_model import LinearRegression

lr=LinearRegression()

lr.fit(X_train, y_train)

# print(lr.coef_) # prints the feature coefficients that our model has learned

print(lr.intercept_)

################# PREDICTION ###############################################

pre=lr.predict(X_test)

print(pre) # the predicted values 

print(y_test) # the original values 

score=lr.score(X_test, y_test) # R^2 (coefficient of determination) on the test set, not an accuracy percentage

print(score*100) # R^2 expressed as a percentage
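For a regressor, `lr.score` returns the coefficient of determination (R^2) rather than a classification-style accuracy. Here is a minimal sketch of how that number is computed, in plain Python. The `actual` and `predicted` values are made up for illustration, and `r2` is a hypothetical helper name.

```python
def r2(y_true, y_pred):
    """Coefficient of determination: 1 - (residual sum of squares / total sum of squares)."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))  # error left after the fit
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)             # error of always guessing the mean
    return 1 - ss_res / ss_tot

# Made-up values standing in for y_test and the model's predictions
actual = [10.0, 20.0, 30.0, 40.0]
predicted = [12.0, 18.0, 33.0, 37.0]

result = r2(actual, predicted)
print(result)
```

A value of 1 means the predictions match the actual values exactly, 0 means the model is no better than always predicting the mean, and the value can go negative when the model does worse than that.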
