On the XOR problem, no decision boundary that is linear in $x_1$ and $x_2$ can perfectly classify all 4 points; the best a linear boundary can do is classify 3 of the 4 correctly. The weights and bias of one decision boundary that classifies the most points correctly are:
weight = $[1, 1]$, bias = $-1.5$
This produces the decision boundary $x_1 + x_2 - 1.5 = 0$.
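As a quick sanity check on the weights and bias above, here is a minimal sketch that counts how many XOR points this boundary classifies correctly. The question does not say which side of the line is labelled class 1, so the sketch tries both orientations; the `count_correct` helper and the `positive_side` parameter are hypothetical names introduced only for this illustration.

```python
# XOR truth table: the four input points and their labels.
points = [(0, 0), (0, 1), (1, 0), (1, 1)]
labels = [0, 1, 1, 0]

# Weights and bias quoted in the question.
w = (1.0, 1.0)
b = -1.5


def score(p):
    """Signed value of w . x + b for a point p = (x1, x2)."""
    return w[0] * p[0] + w[1] * p[1] + b


def count_correct(positive_side):
    """Count correctly classified points.

    positive_side=+1 labels the side with score > 0 as class 1;
    positive_side=-1 labels the side with score < 0 as class 1.
    (The orientation is an assumption; the question leaves it open.)
    """
    correct = 0
    for p, y in zip(points, labels):
        pred = 1 if positive_side * score(p) > 0 else 0
        correct += int(pred == y)
    return correct


correct_as_given = count_correct(+1)
correct_flipped = count_correct(-1)
print(correct_as_given, correct_flipped)  # prints "1 3"
```

Under these assumptions, the line reaches 3 of 4 correct only when the side $x_1 + x_2 < 1.5$ is treated as class 1, which is equivalent to using weight $[-1, -1]$ and bias $1.5$ with the usual "positive score means class 1" convention.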
Is there any linear model (using only the features $x_1$ and $x_2$) that will converge to the decision boundary that classifies the points as well as possible? If so, what is the model, and why does it converge to this decision boundary?