In a course I am following, the professor stated that conducting 10,000 tests would be normal in a high-dimensional setting, referring directly to regression analysis.
What is meant by that phrase?
How is a hypothesis defined in this context? Is each different combination of parameters entertained -- e.g. each set of coefficients in a logistic regression -- considered a separate hypothesis that gets tested?
However, are we truly testing hypotheses in the process of fitting parameters to the data? In practice we simply solve an optimization problem, typically by minimizing some loss function via gradient descent or another method. We don't do any statistical testing of any kind while fitting the parameters -- if I understand the algorithms correctly (please correct me).
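To illustrate what I mean, here is a minimal sketch in Python of fitting a logistic regression by plain gradient descent (the toy data, learning rate, and iteration count are made up for illustration). Nothing in the fitting loop computes a p-value or performs a test; testing, if any, would happen afterwards on the fitted coefficients:

    import numpy as np

    # Toy data: 100 observations, 5 features (entirely synthetic, just for illustration)
    rng = np.random.default_rng(0)
    X = rng.normal(size=(100, 5))
    true_beta = np.array([1.0, 0.0, -0.5, 0.0, 0.0])
    y = (rng.random(100) < 1 / (1 + np.exp(-(X @ true_beta)))).astype(float)

    def sigmoid(z):
        return 1 / (1 + np.exp(-z))

    # Plain gradient descent on the negative log-likelihood: no p-values appear anywhere.
    beta = np.zeros(X.shape[1])
    learning_rate = 0.1
    for _ in range(2000):
        grad = X.T @ (sigmoid(X @ beta) - y) / len(y)
        beta -= learning_rate * grad

    print(beta)  # just point estimates of the coefficients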
To provide more context, the lecture is about Multiple Hypothesis Testing within the framework of Statistical Inference, and the exact segment from the lecture is the following:
So, first we're going to talk about controlling the false positive rate. If p-values are correctly calculated, you can actually just use the p-values that you've calculated directly and call all p-values less than some threshold alpha, where alpha is between zero and one, significant. That will actually control the false positive rate at level alpha on average. In other words, the expected rate of false positives is less than alpha. So here's the problem with that. Suppose that you perform, say, 10,000 hypothesis tests. This seems a little bit extreme -- a large number of tests, maybe, for people that are doing just one or two regressions -- but in many high-dimensional settings or signal processing settings this is actually a reasonably small number of hypothesis tests that might be performed. And if you call all p-values less than 0.05 significant, say we set alpha equal to 0.05, then the expected number of false positives is just the total number of tests that you've performed times the false positive rate that you're controlling the error rate at. And so you get 500 false positives. So if you perform this many hypothesis tests and you get 500 significant results, it's pretty likely that they're mostly going to be made up of false positive results. So a question that immediately comes to mind is: how do we control a different error rate so that we avoid so many false positives?
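To check my understanding of the arithmetic in that segment: if all m = 10,000 nulls are true and the p-values are correctly calculated, each p-value is Uniform(0, 1) under its null, so the expected number falling below alpha = 0.05 is m * alpha = 500. A quick Python sketch of that situation (the all-nulls-true assumption is mine, matching the worst case the lecture seems to describe):

    import numpy as np

    rng = np.random.default_rng(1)
    m, alpha = 10_000, 0.05

    # Under a true null, a correctly calculated p-value is Uniform(0, 1).
    p_values = rng.random(m)

    false_positives = np.sum(p_values < alpha)
    print(false_positives)  # roughly m * alpha = 500 on average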