If I understand GLMs correctly, to fit a GLM I need to specify the particular transformation $f$ that ensures the conditional distribution of $f(Y)$ given $X$ is from the exponential family. (I also need to make sure $f$ is one of the transformations that's easy to work with from a computational perspective.)
But let's say I don't have much confidence about which $f$ to use; all I have is a dataset large enough that I can get a decent empirical distribution.
What would be a good approach in this situation?
EDIT: I'll try to address the comments from @dsaxton and @glen_b. Let's say I have a dataset of second-by-second heart rate for many people over many workout sessions. And let's say I want to be able to predict the heart rate of a random person given the amount of time elapsed since the start of the workout, perhaps accounting for a fixed effect for each person.
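
To make the setup concrete, here is a minimal sketch of what "a decent empirical distribution" could look like for this kind of data. Everything in it is an assumption for illustration: the data are synthetic, the shape of the heart-rate curve and the noise level are made up, and the names `simulate` and `empirical_quantiles_by_time` are hypothetical. Subtracting each person's mean is only a crude stand-in for a per-person fixed effect, not a fitted model.

```python
import math
import random
import statistics
from collections import defaultdict

random.seed(0)

def simulate(n_people=20, n_seconds=600):
    """Synthetic (person_id, seconds_elapsed, heart_rate) rows. Illustrative only."""
    rows = []
    for person in range(n_people):
        baseline = random.gauss(70, 8)  # made-up person-specific resting rate
        for t in range(n_seconds):
            rise = 60 * (1 - math.exp(-t / 120))  # made-up warm-up curve
            rows.append((person, t, baseline + rise + random.gauss(0, 5)))
    return rows

def empirical_quantiles_by_time(rows, bin_width=60):
    """Empirical 10th/50th/90th percentiles of heart rate per elapsed-time bin,
    after removing each person's mean level (crude fixed-effect adjustment)."""
    by_person = defaultdict(list)
    for person, _, hr in rows:
        by_person[person].append(hr)
    person_mean = {p: statistics.fmean(v) for p, v in by_person.items()}
    grand_mean = statistics.fmean([hr for _, _, hr in rows])

    binned = defaultdict(list)
    for person, t, hr in rows:
        # Re-center on the grand mean so values stay on the heart-rate scale.
        binned[t // bin_width].append(hr - person_mean[person] + grand_mean)

    summary = {}
    for b, values in sorted(binned.items()):
        deciles = statistics.quantiles(values, n=10)  # 9 cut points
        summary[b * bin_width] = (deciles[0], deciles[4], deciles[8])
    return summary

rows = simulate()
summary = empirical_quantiles_by_time(rows)
for start, (p10, p50, p90) in summary.items():
    print(f"{start:4d}s  p10={p10:6.1f}  median={p50:6.1f}  p90={p90:6.1f}")
```

The table this prints describes the conditional distribution of heart rate in each time bin without committing to any particular $f$ or distributional family, which is the situation the question is about.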