I hope you can help me with this question:
I have time-series data (25 years) that I want to analyze for changes in seasonality over time. I am using linear regression, and my model includes year (as a continuous variable) and the date on which the nest was initiated as explanatory variables. My response variable (egg survival) was estimated as the proportion of eggs that survive the incubation period in successful nests, where a successful nest is one with at least one egg at the end of the reproductive season (# of eggs counted at the end of incubation / # of eggs counted at the beginning of incubation). One of the years has a small sample size (n = 35) compared to the rest (the other years range from 69 to 338). Should I delete it from my dataset? What can I do?
If yes: I am using year as a continuous variable (year = 0-24). Should I leave a gap in the numbering (for example, 0-7 and then continue with 9-24), or should I renumber the years as if the year with the small sample size didn't exist?
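To make the two numbering options concrete, here is a minimal sketch using synthetic data with an exactly linear trend. The trend value, the intercept, and the index of the dropped year are assumptions for illustration only, not values from the real dataset.

```python
# Illustrative only: synthetic yearly values, not the real nest data.

def ols_slope(x, y):
    """Ordinary least-squares slope of y regressed on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    return sxy / sxx

TRUE_SLOPE = -0.01   # assumed change in survival per calendar year
DROPPED_YEAR = 8     # hypothetical index of the small-sample year

# Option 1: keep the original year values and leave a gap (0-7, 9-24).
years_with_gap = [yr for yr in range(25) if yr != DROPPED_YEAR]
survival = [0.9 + TRUE_SLOPE * yr for yr in years_with_gap]

# Option 2: renumber as if the dropped year never existed (0-23).
years_renumbered = list(range(len(years_with_gap)))

slope_gap = ols_slope(years_with_gap, survival)
slope_renumbered = ols_slope(years_renumbered, survival)

print(f"slope with gap kept:    {slope_gap:.5f}")
print(f"slope after renumbering: {slope_renumbered:.5f}")
```

In this sketch, keeping the gap recovers the assumed per-year trend, while renumbering treats years 7 and 9 as one year apart and shifts the slope slightly. A continuous predictor does not need consecutive values, so the gap itself is not a problem for the regression.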
UPDATE: This is the plot of residuals vs. fitted values. According to AIC, my best model shows changes over time (the interaction is significant); however, the R² is only 0.02. Any advice?
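For reference, a linear model with a year-by-initiation-date interaction would presumably have the following form. The symbols are illustrative and assumed, not taken from the original analysis: $S_i$ is egg survival for nest $i$, $\text{year}_i$ runs from 0 to 24, and $\text{date}_i$ is the nest initiation date.

$$
S_i = \beta_0 + \beta_1\,\text{year}_i + \beta_2\,\text{date}_i + \beta_3\,(\text{year}_i \times \text{date}_i) + \varepsilon_i
$$

Under this form, a non-zero $\beta_3$ means the effect of initiation date on survival changes from year to year, and $R^2 = 0.02$ means the fitted terms account for about 2% of the variance in $S_i$.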
UPDATE2:
I applied robust linear regression with an exponential transformation, used weights of variance/n, and deleted an outlier. This is the best fit I could get. Can you please give me your opinion?
However, the robust regression does not make much difference. Using simple linear regression with weights and an exponential transformation (a log transformation is not possible because I have both positive and negative values), my R² improves slightly, to 0.04.
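As a point of comparison for the weighting step, here is a minimal sketch of weighted least squares with a single predictor. All numbers are made up for illustration. The sketch assumes inverse-variance weights (n / variance), which is the conventional choice when variance/n is the variance of each observation's mean; whether that matches the weights actually used above is an assumption.

```python
# Illustrative only: hypothetical values, not the real nest data.

def wls_fit(x, y, w):
    """Weighted least-squares intercept and slope for y = a + b*x."""
    sw = sum(w)
    mean_x = sum(wi * xi for wi, xi in zip(w, x)) / sw
    mean_y = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxy = sum(wi * (xi - mean_x) * (yi - mean_y) for wi, xi, yi in zip(w, x, y))
    sxx = sum(wi * (xi - mean_x) ** 2 for wi, xi in zip(w, x))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

def weighted_r2(x, y, w, intercept, slope):
    """Weighted R^2: 1 - weighted residual SS / weighted total SS."""
    sw = sum(w)
    mean_y = sum(wi * yi for wi, yi in zip(w, y)) / sw
    ss_res = sum(wi * (yi - (intercept + slope * xi)) ** 2
                 for wi, xi, yi in zip(w, x, y))
    ss_tot = sum(wi * (yi - mean_y) ** 2 for wi, yi in zip(w, y))
    return 1 - ss_res / ss_tot

# Hypothetical data: year index 1 has the small sample and an unusual value.
year = [0, 1, 2, 3, 4]
survival = [0.80, 0.60, 0.76, 0.74, 0.72]
n = [100, 35, 200, 150, 69]
variance = [0.04, 0.04, 0.04, 0.04, 0.04]

# Assumed inverse-variance weights: larger samples get more influence.
weights = [ni / vi for ni, vi in zip(n, variance)]

a_w, b_w = wls_fit(year, survival, weights)
a_u, b_u = wls_fit(year, survival, [1.0] * len(year))  # unweighted for comparison

print(f"weighted slope:   {b_w:.4f}, R^2 = {weighted_r2(year, survival, weights, a_w, b_w):.3f}")
print(f"unweighted slope: {b_u:.4f}")
```

With weights like these, the small-sample year is down-weighted rather than deleted, which is one alternative to removing it from the dataset.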