Linear Regression Code in Python for absolute beginners.

A Brief Overview of the Linear Regression Algorithm

In simple Linear Regression, there is one independent variable (x) and one dependent variable (y). Linear Regression assumes that there is a linear relationship between the independent variable x and the dependent variable y.

The Linear Regression algorithm tries to fit the equation of a straight line to our data points:

y=mx+c

where, 

y= Dependent variable

x= Independent variable

m= Slope of the line

c= y-intercept
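To make the equation concrete, here is a minimal sketch that plugs numbers into y = mx + c. The values of m and c below are made up purely for illustration; in practice the machine learns them from the data.

```python
# Hypothetical slope and intercept, chosen only for illustration.
m = 2.0  # slope: y rises by 2 for every 1-unit increase in x
c = 1.0  # y-intercept: the value of y when x is 0


def predict(x):
    """Return y for a given x using the straight-line equation y = m*x + c."""
    return m * x + c


print(predict(0))  # 1.0
print(predict(3))  # 7.0
```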

Training our Machine

The machine is trained on the training data set and computes the equation of the line from those data points, i.e. it finds the values of the slope (m) and the y-intercept (c).
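One common way to find m and c is ordinary least squares, which is also the method scikit-learn's LinearRegression uses. The sketch below works it out by hand with NumPy on a tiny made-up data set, assuming the standard closed-form formulas for a single independent variable.

```python
import numpy as np

# Made-up training points that lie exactly on the line y = 2x + 1.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([3.0, 5.0, 7.0, 9.0, 11.0])

x_mean = x.mean()
y_mean = y.mean()

# Least-squares slope: how x and y vary together, divided by how x varies.
m = float(np.sum((x - x_mean) * (y - y_mean)) / np.sum((x - x_mean) ** 2))

# The fitted line always passes through the point (x_mean, y_mean).
c = float(y_mean - m * x_mean)

print(m, c)  # 2.0 1.0
```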

Predicting Dependent variable

Now that our machine is trained on the training data set, we can predict the values of the dependent variable on our testing data set and compare them against the actual values.

To check whether the machine is predicting accurately, we score the predictions against the actual values.

Code for Linear Regression in Jupyter Notebook using Python

Step1 Importing all required libraries
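The original code cell for this step is not available, so the following is only a plausible sketch. It assumes the two libraries most often used for this kind of workflow, NumPy and pandas; the later steps import their scikit-learn pieces where they are needed.

```python
import numpy as np   # numerical arrays and maths
import pandas as pd  # reading and handling tabular data

# Printing the versions is a quick check that both libraries are installed.
print(np.__version__)
print(pd.__version__)
```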


Step2 Reading data in Jupyter Notebook

Within ' ', enter the path to the data set on your system.

.head() will show the first 5 rows of the data set.
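A minimal sketch of this step, assuming the data is a CSV file read with pandas' read_csv. So that it runs anywhere, the example reads from an in-memory string instead of a file; the column names hours and score are hypothetical.

```python
import io

import pandas as pd

# Hypothetical CSV content standing in for a file on disk. With a real file,
# pass its path as a string to pd.read_csv instead of the StringIO object.
csv_text = """hours,score
1,12
2,19
3,31
4,38
5,52
6,59
7,71
"""

data = pd.read_csv(io.StringIO(csv_text))

# .head() returns the first 5 rows by default.
print(data.head())
```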


Step3 Separating Dependent and Independent Variable

The .drop function removes the column named within ' ' and stores the remaining data in x.

[[' ']] selects that same column and stores it in y.
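As a sketch of the separation described above, using a small made-up DataFrame in which score is the (hypothetical) dependent column:

```python
import pandas as pd

# Made-up data set; the column names are assumptions for illustration.
data = pd.DataFrame({
    "hours": [1, 2, 3, 4, 5],
    "score": [12, 19, 31, 38, 52],
})

# Independent variable(s): everything except the dependent column.
# axis=1 tells pandas to drop a column rather than a row.
x = data.drop("score", axis=1)

# Dependent variable: double brackets keep the result as a DataFrame.
y = data[["score"]]

print(x.columns.tolist())  # ['hours']
print(y.columns.tolist())  # ['score']
```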


Step4 Preparing training and testing data set


train_size tells the machine what fraction of the data is used as the training set; the remainder becomes the testing data set. The training data is chosen randomly, not in sequence.

Here train_size is 0.8, which means 80% of the data will be used for training the machine and the remaining 20% will be used for testing.
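The split can be sketched with scikit-learn's train_test_split. The data set and column names are made up, and random_state=0 is an added assumption that simply makes the random split repeatable.

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# Made-up data set with 10 rows; the column names are hypothetical.
data = pd.DataFrame({
    "hours": [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
    "score": [12, 19, 31, 38, 52, 59, 71, 78, 92, 99],
})
x = data.drop("score", axis=1)
y = data[["score"]]

# 80% of the rows go to training, the remaining 20% to testing.
x_train, x_test, y_train, y_test = train_test_split(
    x, y, train_size=0.8, random_state=0
)

print(len(x_train), len(x_test))  # 8 2
```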


Step5 Importing Linear Regression and creating an object of the class
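A minimal sketch of this step, assuming scikit-learn's LinearRegression class; the variable name model is a hypothetical choice.

```python
from sklearn.linear_model import LinearRegression

# Create an untrained model object. It knows nothing about the data
# until it is fitted in the next step.
model = LinearRegression()

print(type(model).__name__)  # LinearRegression
```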



Step6 Fitting training data set in our model
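Fitting can be sketched as below. The training data is made up so that it lies exactly on the line y = 3x + 2, which makes it easy to see that the learned slope (coef_) and y-intercept (intercept_) match m and c from the equation.

```python
import pandas as pd
from sklearn.linear_model import LinearRegression

# Made-up training data lying exactly on y = 3x + 2 (hypothetical columns).
x_train = pd.DataFrame({"hours": [1, 2, 3, 4, 5]})
y_train = pd.DataFrame({"score": [5, 8, 11, 14, 17]})

model = LinearRegression()

# fit() computes the slope and y-intercept from the training data.
model.fit(x_train, y_train)

print(model.coef_)       # learned slope m, approximately 3
print(model.intercept_)  # learned y-intercept c, approximately 2
```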



Step7 Predicting values of the dependent variable (y)
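A sketch of prediction on unseen data. The training and testing sets below are made up (all points lie on y = 3x + 2) and the names x_test, y_test, and y_pred are hypothetical.

```python
import pandas as pd
from sklearn.linear_model import LinearRegression

# Made-up training and testing data on the line y = 3x + 2.
x_train = pd.DataFrame({"hours": [1, 2, 3, 4]})
y_train = pd.DataFrame({"score": [5, 8, 11, 14]})
x_test = pd.DataFrame({"hours": [5, 6]})
y_test = pd.DataFrame({"score": [17, 20]})

model = LinearRegression()
model.fit(x_train, y_train)

# predict() applies the learned line to the x values it has not seen before.
y_pred = model.predict(x_test)

print(y_pred)         # predicted values, approximately 17 and 20
print(y_test.values)  # actual values, for comparison
```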




Step8 Calculating accuracy of predictions


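In scikit-learn, score returns the coefficient of determination (R²) of the model's predictions rather than a classification accuracy or precision. A value of 1.0 means a perfect fit, and a value near 0 means the model does no better than always predicting the mean of y.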
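The sketch below scores a model on synthetic data (a noisy version of y = 3x + 2, generated here only for illustration) and checks that score gives the same number as r2_score from sklearn.metrics.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split

# Synthetic data: a straight line plus a little random noise.
rng = np.random.default_rng(0)
x = np.linspace(0, 10, 50).reshape(-1, 1)
y = 3 * x + 2 + rng.normal(0, 1, size=x.shape)

x_train, x_test, y_train, y_test = train_test_split(
    x, y, train_size=0.8, random_state=0
)

model = LinearRegression()
model.fit(x_train, y_train)

# score() evaluates R^2 on the testing data.
r2 = model.score(x_test, y_test)

# The same value computed explicitly from the predictions.
r2_explicit = r2_score(y_test, model.predict(x_test))

print(r2, r2_explicit)
```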




Note:
This code is applicable when the data is in numeric format. If the data set has any categorical columns, they need to be encoded as numbers first, for example using a label encoder.

This is the easiest code to run the Linear Regression algorithm. If you need an example of fitting actual data, write in the comments and I will make a separate blog post for it.
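As a small sketch of that encoding step, assuming scikit-learn's LabelEncoder and a made-up list of colour names:

```python
from sklearn.preprocessing import LabelEncoder

# Made-up categorical values.
colours = ["red", "green", "blue", "green"]

encoder = LabelEncoder()

# Each distinct category gets an integer, assigned in sorted order.
encoded = encoder.fit_transform(colours)

print(encoder.classes_.tolist())  # ['blue', 'green', 'red']
print(encoded.tolist())           # [2, 1, 0, 1]
```

Keep in mind that a linear model will treat these integers as ordered quantities, so this works best when the categories have a natural order.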

Thanks For Reading!!
