When choosing the number of knots for a GAM, one might want to take into account both the number of data points and the number of distinct values (increments) on the x-axis.
What if we have 100 increments on the x-axis with 1000 data points at each increment?
The info here says:
If they are not supplied then the knots of the spline are placed evenly throughout the covariate values to which the term refers: For example, if fitting 101 data with an 11 knot spline of x then there would be a knot at every 10th (ordered) x value.
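To check my reading of that sentence, here is a minimal base-R sketch of the "every 10th ordered x value" rule. It is only my own illustration of the quoted example, not mgcv's internal knot-placement code, and the names x, n_knots, idx and knots are made up for this purpose.

```r
set.seed(1)
x <- sort(runif(101))   # 101 ordered covariate values (made-up data)
n_knots <- 11           # number of knots in the quoted example

# Evenly spaced positions through the ordered values: 1, 11, 21, ..., 101
idx <- round(seq(1, length(x), length.out = n_knots))
knots <- x[idx]         # a knot at every 10th ordered x value

idx
knots
```

With 101 values and 11 knots, the first and last knots sit at the minimum and maximum of x, and consecutive knots are 10 ordered observations apart.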
So would 9 knots be a sensible starting point in this example? I am just not sure what range of knots would be suitable for this data set, since the model can be fitted with anything from a very small to a very large number of knots.
set.seed(1)
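# 100 distinct x values with 1000 observations at each, as described above
dat <- data.frame(y = rnorm(100000), x = rep(1:100, each = 1000))
library(ggplot2)  # method = "gam" also needs the mgcv package installed
ggplot(dat, aes(x = x, y = y)) +
  geom_point(size = 0.5) +
  stat_smooth(method = "gam",
              formula = y ~ s(x, bs = "cs", k = 9),  # k belongs inside s()
              col = "black")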
If k=25 provided a useful fit, would it be reasonable for this data?