Aditya Gireesh S

22070149003

Home price prediction using linear regression


Introduction

Home prices are predicted to keep an eye on price increasing over the next few years, but you can’t just trust your broker when it comes to home prices. To really know what you should be paying, it’s best to look at the data behind home prices themselves and then decide for yourself if this is the right time to buy or sell your home. That’s why you need our step-by-step guide on how to predict home prices using linear regression.

 

linear regression

One of the machine learning algorithms based on supervised learning is linear regression. It executes a regression operation. Regression uses independent variables to model a goal prediction value. It is mostly used to determine how variables and forecasting relate to one another. Regression models vary according to the number of independent variables they use and the type of relationship they take into account between the dependent and independent variables.


Diagram for example

The task of predicting a dependent variable's value (y) based on an independent variable(X) is carried out using linear regression. Therefore, x (input) and y (output) are found to be linearly related by this regression technique. Thus, the term "linear regression" was given.

 

 

Hypothesis function of linear regression is given by,

y = m*x + b

m = gradient or slope

x = independent variable

y = dependent variable

b = intercept.

 

Now, in stepwise manner let's see how linear regression is implemented for home price prediction.

Libraries

The libraries imported are pandas, NumPy, matplotlib and sklearn


Numerous effective methods for machine learning and statistical modelling, such as classification, regression, clustering, and dimensionality reduction, are included in the Sklearn package.

Pandas is mostly used for tabular data manipulation and analysis in Data Frames. Data can be imported into Pandas from a variety of file types, including Microsoft Excel, JSON, Parquet, SQL database tables, and comma-separated values.

One of the most potent Python libraries is NumPy. In the business, it is employed for array computing. Additionally, it will give a clear understanding of the typical mathematical operations.

For Python and its numerical extension NumPy, Matplotlib is a cross-platform data visualization and graphical charting package. As a result, it presents a strong open-source substitute for MATLAB.

Dataset

Once the data is saved in a CSV file, we will likely want to load and use it from time to time. You can do that with the Pandas read_csv() function:



We can use either xlsx file or csv file to import our dataset.

 

Inline function

The IPython environment supports drawing matplotlib figures thanks to the line magic command %matplotlib inline. The matplotlib plots will display underneath the cell in which the plot function was called for the remainder of the session once this command has been used in any cell.

As it distinguishes between plots of various cells, the matplotlib inline command is quite useful. The graph is plotted after the current cell for each cell; thus, the graphs of earlier cells are unaffected.



Here the x axis is assigned for area of the house and the y-axis is assigned for price. “plt.scatter” is used to plot the points on the graph.

 

Training

Here we first need to enter the linear regression object. As we have imported linear model from Sklearn,the linear object we enter will be “reg=linear_model.LinearRegression”.

Now we use “reg.fit()” command to fit our data frame.This command actually train the model. We should enter a 2D array i.e.., area for the first argument in the x-axis and we are going to enter price in the second argument for the y-axis.

So, we have a model ready to predict the home price according to the area of the house.




Prediction

The model predicts the home price using the following equation,

Price = m*area + b

To get the value of slope we can use the command “reg.coef_” and the value of intercept can be found out using the command “reg.intercept_”.



Now we are going to import a new excel sheet named “A” which imports the value of area needed to be predicted by our model. We are going to import five values of random value of area of house and use the command “reg.predict(A)” which will be assigned to the value “p”.

P= reg.predict(A) gives the prediction of home price for all the value of area imported in numerical terms.



Now we are going to use the scatter command which we have used before to visualize the prediction of the home price according to the imported value of the areas. 


This plot shows us the visual representation of the Linear Equation.

 

Conclusion

Hence, we have completed the home price prediction using linear regression and have also seen the way it is implemented in stepwise manner.







Comments