I have been reading a lot about linear regression on the internet, and everywhere people use the model y = w*x + b.
I have huge difficulties understanding why. I also think they are overcomplicating things with the loss function. If you check the code below, my prediction model has no constant term, and it still works just fine.
import numpy as np

X = np.random.randint(100, 1000, 100).reshape(50, 2)   # features: 50 samples, 2 columns
y = np.mean(X, axis=1).reshape(50, 1)                   # labels: row-wise mean of the features
w = np.random.random((2, 1))                            # weights, no bias term

for i in range(100000):
    model = X.dot(w)                           # predictions
    loss = model - y                           # residuals
    w = w - (X.T.dot(loss)) * 0.0000000001     # gradient descent step on the squared error
    if np.mean(np.abs(loss)) < 0.00001:
        break

prediction_input = np.array([[8880, 9000]])
print(prediction_input.dot(w))
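By the way, if I add a print(w) after the loop, the weights end up very close to [[0.5], [0.5]], which matches y being the plain average of the two feature columns:

print(w)   # ends up near [[0.5], [0.5]], i.e. the fit is y = 0.5*x1 + 0.5*x2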
The squared-error loss of a linear model is convex in the weights, so any local minimum is also the global one. I am sure there is a reason to include the constant b, but I don't get why. Is there any good explanation, please?
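For example, the only case I could come up with where b might matter is this sketch (I put it together myself, and the +100 offset in the labels is just an arbitrary number I picked): with labels shifted by a constant, the same bias-free update never gets the mean absolute loss anywhere near the 0.00001 threshold.

import numpy as np

X = np.random.randint(100, 1000, 100).reshape(50, 2)    # same kind of features as above
y = (np.mean(X, axis=1) + 100).reshape(50, 1)           # labels shifted by a constant offset
w = np.random.random((2, 1))                            # still no bias term

for i in range(100000):
    loss = X.dot(w) - y                                  # residuals
    w = w - (X.T.dot(loss)) * 0.0000000001               # same gradient step as before

print(np.mean(np.abs(loss)))   # flattens out well above 0.00001 instead of shrinking to zero
print(w)                       # weights drift above 0.5 trying to absorb the offset

Is this kind of shifted data the whole reason for including b, or is there more to it?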