Problems
1. Suppose we are training a model using stochastic gradient descent. How do we know if we are converging to a solution?
2. Do gradient descent methods always converge to the same point?
3. What assumptions are required for linear regression? What if some of these assumptions are violated?
4. How do we train a logistic regression model? How do we interpret its coefficients?
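As a concrete setting for thinking about questions 1 and 4, here is a minimal sketch (all names, learning rates, and thresholds are illustrative assumptions, not a definitive recipe): logistic regression trained with plain SGD on synthetic data, using "the epoch-level loss stopped changing" as one simple convergence check.

```python
import numpy as np

# Illustrative sketch: logistic regression fit by SGD on synthetic data.
# Convergence is checked by watching the full-data log loss per epoch.
rng = np.random.default_rng(0)
n, d = 1000, 3
X = rng.normal(size=(n, d))
true_w = np.array([1.5, -2.0, 0.5])          # assumed "ground truth" weights
p = 1.0 / (1.0 + np.exp(-(X @ true_w)))
y = (rng.uniform(size=n) < p).astype(float)

w = np.zeros(d)
lr = 0.1                                     # assumed constant step size
prev_loss = np.inf
for epoch in range(200):
    for i in rng.permutation(n):
        pred = 1.0 / (1.0 + np.exp(-(X[i] @ w)))
        w -= lr * (pred - y[i]) * X[i]       # per-sample gradient of the log loss
    pred_all = 1.0 / (1.0 + np.exp(-(X @ w)))
    loss = -np.mean(y * np.log(pred_all + 1e-12)
                    + (1 - y) * np.log(1 - pred_all + 1e-12))
    if abs(prev_loss - loss) < 1e-5:         # one convergence criterion: tiny loss change
        break
    prev_loss = loss

# Each coefficient is the change in log-odds of y = 1 per unit change in that feature.
print(w)
```

With a constant step size, SGD's loss hovers near the optimum rather than settling exactly, which is why the check uses a tolerance on the loss change instead of demanding zero change; a decaying learning rate is another common choice.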