In response to 2)
Recall that a linear regression models a conditional mean. Therefore, an "individual coefficient" hypothesis for the $j$-th coefficient is a hypothesis about $\mathbb{E}[Y|X_j]$, while a hypothesis about "everything together" is a hypothesis about $\mathbb{E}[Y|X_1,X_2,...,X_J]$. Your hypotheses are therefore always, in a sense, conditional on each other. A hypothesis about a single coefficient is something like a marginal hypothesis, "averaged over" values of the other predictors; a hypothesis about everything together is a joint hypothesis. For that reason, hypotheses about individual coefficients based on pairwise relationships tend not to translate into good joint hypotheses.
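If it helps to see the distinction in code, here's a minimal sketch in Python with statsmodels (my choice of tools; the data and numbers are made up purely for illustration). The single-coefficient hypothesis and the joint hypothesis are literally different tests on the same fitted model:

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)

# Fabricated data, just to have something to fit.
n = 200
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
y = 1.0 + 0.3 * x1 + 0.3 * x2 + rng.normal(size=n)

X = sm.add_constant(np.column_stack([x1, x2]))  # columns: const, x1, x2
fit = sm.OLS(y, X).fit()

# Hypothesis about a single coefficient: a t-test on beta_1.
print(fit.t_test("x1 = 0"))

# Hypothesis about "everything together": a joint F-test on beta_1 and beta_2.
print(fit.f_test("x1 = 0, x2 = 0"))
```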
This is where the classic first-semester surprise comes from: you fit two univariate regressions with significant coefficients, but when you put the predictors together in a multiple regression, or add a third predictor, or an interaction, both become nonsignificant. Better yet is when they only become significant once you add the interaction. Bonus points if the interaction itself is nonsignificant.
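Here's a rough simulation of that surprise (again Python/statsmodels, with parameters I picked just to make the effect likely to show up): two correlated predictors that each look great on their own, neither of which tends to survive in the joint model, even though the joint F-test says the pair clearly matters.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)

# Two highly correlated predictors, both of which genuinely affect y.
n = 100
cov = [[1.0, 0.95], [0.95, 1.0]]
x1, x2 = rng.multivariate_normal([0.0, 0.0], cov, size=n).T
y = 1.0 + 0.5 * x1 + 0.5 * x2 + rng.normal(scale=2.0, size=n)

uni1 = sm.OLS(y, sm.add_constant(x1)).fit()
uni2 = sm.OLS(y, sm.add_constant(x2)).fit()
both = sm.OLS(y, sm.add_constant(np.column_stack([x1, x2]))).fit()

print("x1 alone, p-value: ", uni1.pvalues[1])   # typically "significant"
print("x2 alone, p-value: ", uni2.pvalues[1])   # typically "significant"
print("together, p-values:", both.pvalues[1:])  # often neither one is
print("joint F-test p-value:", both.f_pvalue)   # yet jointly they clearly matter
```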
Unless I misunderstood you and you're actually asking how to test an existing model. For that I defer to David Giles at his blog. The punch line is that you probably shouldn't test individual coefficients unless you have a substantive reason for doing so, but it's a fantastic post and everyone who ever plans to use multiple regression should read the whole thing.
It's also not meaningful to talk about correlation between the $\beta_j$s outside of a Bayesian context (although I got into a debate with another poster on a related subject). Correlation between the $\hat{\beta}_j$s is a different thing, and correlation between the $X_j$s is different still. All statistics packages take the former "into account" because, well, they explicitly compute the covariance matrix $\mathbb{V}[\hat{\beta}]$, and the standard errors are computed from its main diagonal, which is what you typically use to test hypotheses. The latter isn't a big deal in principle, except that highly correlated predictors will "steal" the magnitudes of their coefficients from each other, particularly if they are on different measurement scales, so it's very often good practice to center and re-scale your variables. If two predictors are perfectly correlated, you don't have a full-rank $X$ matrix and the regression is mathematically impossible. If they are very, very highly correlated, that's theoretically okay, but it will make your computer very unhappy and you will get numerical issues trying to invert $X^TX$.
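For concreteness, here's a quick sketch (same tools as above, illustrative only, with made-up variables on deliberately mismatched scales) of what "taking it into account" looks like, and of what scaling and near-collinearity do to the conditioning of $X^TX$:

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(2)
n = 200

# Predictors on very different scales, plus one that is almost a copy of another.
income = rng.normal(50_000, 10_000, size=n)        # dollars
rate = rng.normal(0.05, 0.01, size=n)              # a proportion
near_copy = income + rng.normal(size=n)            # nearly perfectly correlated with income

y = 2.0 + 1e-4 * income + 30.0 * rate + rng.normal(size=n)

X = sm.add_constant(np.column_stack([income, rate]))
fit = sm.OLS(y, X).fit()

# The package computes the full covariance matrix of the estimates...
V = fit.cov_params()
# ...and the reported standard errors are just the square roots of its diagonal.
print(np.allclose(np.sqrt(np.diag(V)), fit.bse))   # True

# Mismatched scales make X'X badly conditioned; centering and rescaling helps a lot.
Z = sm.add_constant(np.column_stack([(income - income.mean()) / income.std(),
                                     (rate - rate.mean()) / rate.std()]))
print(np.linalg.cond(X.T @ X), np.linalg.cond(Z.T @ Z))

# Near-perfect collinearity is "legal" but numerically miserable:
W = sm.add_constant(np.column_stack([income, rate, near_copy]))
print(np.linalg.cond(W.T @ W))                     # huge; the matrix is nearly singular
```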
I'd personally recommend not learning regression from Tibshirani & company, at least not at first. I have great respect for them and I hold dear my copy of Elements of Statistical Learning, but as a machine learning book it takes a very... machine-like approach to regression that in my opinion doesn't admit the kind of thinking needed to build a meaningful parametric model. My background is in economics, so I'll invariably recommend Wooldridge's Introductory Econometrics: A Modern Approach for what I think is a much more organic and intuitive approach to regression. There's a lot of stuff in there you don't need to know if you aren't working with, say, survey data, but there's nothing in there you don't want to know. Seeing regression built up from statistical principles, as well as the geometric/algebraic principles you get in Elements, is important for understanding it.