Linear Regression in Python (Curve Fit y=a+bx)

In this Python program, we implement Linear Regression Method using Least Square Method to find curve of best fit of type y=a+bx.

We first read n data points from user and then we implement linear regression in Python programming language as follow:

Python Source Code: Linear Regression


# This is naive approach, there are shortcut methods for doing it!
# Linear regression method in Python
# Fitting y = a + bx to given n data points
import numpy as np

# Reading value of n
n = int(input("How many data points? "))

# Creating numpy array x & y to store n data points
x = np.zeros(n)
y = np.zeros(n)

# Reading data
print("Enter data:")
for i in range(n):
    x[i] = float(input("x["+str(i)+"]= "))
    y[i] = float(input("y["+str(i)+"]= "))
    
# Finding required sum for least square methods
sumX,sumX2,sumY,sumXY = 0,0,0,0
for i in range(n):
    sumX = sumX + x[i]
    sumX2 = sumX2 + x[i]*x[i]
    sumY = sumY + y[i]
    sumXY = sumXY + x[i]*y[i]

# Finding coefficients a and b
b = (n*sumXY-sumX*sumY)/(n*sumX2-sumX*sumX)
a = (sumY - b*sumX)/n

# Displaying coefficients a, b & equation
print("\nCoefficients are:")
print("a: ", a)
print("b: ", b)
print("And equation is: y = %0.4f + %0.4f x" %(a,b))

Output

How many data points? 4
Enter data:
x[0]= 1
y[0]= 2
x[1]= 2
y[1]= 3
x[2]= 3
y[2]= 4
x[3]= 4
y[3]= 5

Coefficients are:
a:  1.0
b:  1.0
And equation is: y = 1.0000 + 1.0000 x