Oh that link is a classic example of someone over-complicating statistics when it doesn't need to be!
Logistic regression is used in classification problems, i.e. predicting what category an observation falls into. So for example, predicting the weather tomorrow when the options are "hot/cold/rain". In the binary case, this reduces to a two-class classification problem and the example above becomes: predict the weather when the options are "hot/cold". The variable you are trying to predict is known as the 'response'; your prediction of it gets compared against the observed value, the 'truth'.
Now for a slight digression: a random variable X follows a Bernoulli distribution with parameter p if the following holds:
P(X = 1) = p
P(X = 0) = 1 - P(X = 1) = 1 - p
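If it helps to see that concretely, here's a quick numpy sketch (a toy example of my own, with an arbitrary p) showing that the empirical frequencies of a Bernoulli variable land near p and 1 - p:

```python
import numpy as np

p = 0.3                                         # hypothetical success probability
rng = np.random.default_rng(seed=0)
draws = rng.binomial(n=1, p=p, size=100_000)    # Bernoulli = Binomial with n = 1

print(draws.mean())       # ~0.3, an estimate of P(X = 1)
print(1 - draws.mean())   # ~0.7, an estimate of P(X = 0) = 1 - p
```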
Hopefully you will have seen before that in categorical prediction the assignment of labels is arbitrary: you could predict tomorrow's weather as 'hot' or 'cold', but equivalently you could predict it as '1' or '0', as long as you know that '1' corresponds to 'hot' and '0' corresponds to 'cold'.
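Just to make that mapping concrete (the choice of which label gets the 1 is entirely mine and entirely arbitrary):

```python
labels = ["hot", "cold", "hot", "hot", "cold"]
mapping = {"hot": 1, "cold": 0}   # could equally be {"hot": 0, "cold": 1}
encoded = [mapping[label] for label in labels]
print(encoded)                    # [1, 0, 1, 1, 0] -- this is what the model actually sees
```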
Logistic regression assumes the response is conditionally Bernoulli distributed given the values of the features.
This says that, given the features, the response follows a Bernoulli distribution, which means you only need to predict P(Weather = "hot") or P(Weather = "cold") but not both, because P(Weather = "hot") = 1 - P(Weather = "cold"). And the statement about conditionality just means that the logistic regression model uses a matrix of features to make its prediction (exactly the same setup as in linear regression).
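Here's a rough sklearn sketch (made-up data; the "two weather features" framing is just my own illustration) showing that the fitted model outputs a pair of probabilities that sum to 1, so predicting one of them determines the other:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Fake data: two features and a 0/1 label where 1 = "hot", 0 = "cold"
rng = np.random.default_rng(seed=0)
X = rng.normal(size=(200, 2))
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.5, size=200) > 0).astype(int)

model = LogisticRegression().fit(X, y)

# Each row is [P(y = 0), P(y = 1)] for one observation; the columns sum to 1
probs = model.predict_proba(X[:5])
print(probs)
print(probs.sum(axis=1))   # all 1.0
```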
What about binary logistic regression? Do we have to care about the error term when doing inference on coefficients?
It requires fewer assumptions than linear regression, but still a few. It does not require a linear relationship between the features and the raw response, normally distributed errors, or homoscedasticity; there is no additive error term at all, so inference on the coefficients comes from the Bernoulli likelihood rather than from assumptions about residuals (it does assume the log-odds are linear in the features, though). It does require independent observations (which is almost always an assumption) and, preferably, independent features or at least little multicollinearity (again, a common assumption).
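If you want to sanity-check the "little multicollinearity" part, one common approach is variance inflation factors. Here's a rough statsmodels sketch with made-up data (the columns and the usual ~5-10 rule of thumb are my own choices, not gospel):

```python
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.outliers_influence import variance_inflation_factor

rng = np.random.default_rng(seed=0)
x1 = rng.normal(size=500)
x2 = 0.9 * x1 + rng.normal(scale=0.3, size=500)   # deliberately correlated with x1
x3 = rng.normal(size=500)
X = sm.add_constant(np.column_stack([x1, x2, x3]))  # intercept column at index 0

# VIF per feature (skip the constant); values well above ~5-10 are often
# taken as a warning sign of problematic collinearity
for i, name in enumerate(["x1", "x2", "x3"], start=1):
    print(name, variance_inflation_factor(X, i))
```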