The decision threshold creates a trade-off between the number of positives that you predict and the number of negatives that you predict. This is tautological: raising the threshold can only decrease (or leave unchanged) the number of positives that you predict, and correspondingly increase the number of negatives that you predict.
The decision threshold is not a hyper-parameter in the sense of model tuning because it doesn't change the flexibility of the model.
The way you're thinking about the word "tune" in the context of the decision threshold is different from how hyper-parameters are tuned. Changing $C$ and other model hyper-parameters changes the model (e.g., the logistic regression coefficients will be different), while adjusting the threshold can only do two things: trade off TP for FN, and FP for TN. However, the model remains the same, because this doesn't change the coefficients. (The same is true for models which do not have coefficients, such as random forests: changing the threshold doesn't change anything about the trees.) So in a narrow sense, you're correct that finding the best trade-off among errors is "tuning," but you're wrong in thinking that changing the threshold is linked to other model hyper-parameters in a way that is optimized by GridSearchCV.
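To make the "TP for FN, and FP for TN" point concrete, here is a minimal sketch in plain Python. The `scores` and `labels` lists are made-up illustrative values standing in for the predicted probabilities of some already-fitted model, and `confusion_counts` is a hypothetical helper, not part of any library. The scores are never recomputed: only the cut-off moves.

```python
# Hypothetical predicted probabilities from an already-fitted model,
# with the true labels (1 = positive, 0 = negative). Illustrative data only.
scores = [0.05, 0.20, 0.35, 0.45, 0.55, 0.60, 0.80, 0.95]
labels = [0, 0, 1, 0, 1, 0, 1, 1]


def confusion_counts(scores, labels, threshold):
    """Return (TP, FN, FP, TN), predicting positive when score >= threshold."""
    tp = fn = fp = tn = 0
    for score, label in zip(scores, labels):
        predicted_positive = score >= threshold
        if label == 1 and predicted_positive:
            tp += 1
        elif label == 1:
            fn += 1
        elif predicted_positive:
            fp += 1
        else:
            tn += 1
    return tp, fn, fp, tn


# The scores (i.e., the model) stay fixed; only the threshold changes.
for threshold in (0.3, 0.5, 0.7):
    tp, fn, fp, tn = confusion_counts(scores, labels, threshold)
    print(f"threshold={threshold}: TP={tp} FN={fn} FP={fp} TN={tn}")
```

At every threshold, TP + FN equals the number of actual positives and FP + TN equals the number of actual negatives, so raising the threshold can only move observations from TP to FN and from FP to TN.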
Stated another way, changing the decision threshold reflects a choice on your part about how many false positives and false negatives you are willing to accept. Consider the hypothetical that you set the decision threshold to a completely implausible value like -1. All probabilities are non-negative, so with this threshold you will predict "positive" for every observation. From a certain perspective, this is great, because your false negative rate is 0.0. However, your false positive rate is also at the extreme of 1.0, so in that sense your choice of threshold at -1 is terrible.
The ideal, of course, is to have a TPR of 1.0 and an FPR of 0.0 (and therefore an FNR of 0.0, since FNR = 1 - TPR). But this is usually impossible in real-world applications, so the question then becomes "how much FPR am I willing to accept for how much TPR?" And this is the motivation for ROC curves.
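As a rough illustration, an ROC curve is just the set of (FPR, TPR) pairs you get by sweeping the threshold over fixed scores. The sketch below uses made-up `scores` and `labels` and a hypothetical helper `roc_point`. It is a hand-rolled illustration under those assumptions, not how any particular library computes the curve.

```python
# Illustrative data only: predicted probabilities and true labels.
scores = [0.05, 0.20, 0.35, 0.45, 0.55, 0.60, 0.80, 0.95]
labels = [0, 0, 1, 0, 1, 0, 1, 1]


def roc_point(scores, labels, threshold):
    """Return (FPR, TPR) when predicting positive for score >= threshold."""
    positives = sum(labels)
    negatives = len(labels) - positives
    tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= threshold)
    fp = sum(1 for s, y in zip(scores, labels) if y == 0 and s >= threshold)
    return fp / negatives, tp / positives


# Sweep from "everything is positive" (-1) to "nothing is positive" (1.1).
thresholds = [-1.0, 0.3, 0.5, 0.7, 1.1]
roc = [roc_point(scores, labels, t) for t in thresholds]

for t, (fpr, tpr) in zip(thresholds, roc):
    print(f"threshold={t}: FPR={fpr:.2f} TPR={tpr:.2f}")
```

The threshold of -1 lands at the corner (FPR, TPR) = (1, 1), a threshold above every score lands at (0, 0), and the thresholds in between trace out the trade-off you have to choose from.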