You didn't say it explicitly, but I would assume that all the questions measured the same thing. Without this assumption, the answer to one question doesn't have to tell us anything about how the person could answer another question.
The second issue is how to treat answers to questions that the respondent knew in advance. The simplest approach is to ignore such answers. Another approach is to add a dummy variable that signals to the model that the respondent knew the question.
You say that you are quite new to statistics, and this is a non-trivial problem, so it will require some additional research and learning on your side. An answer on a Q&A site like this one will be far from exhaustive, but let me try.
The question falls into the area of psychometrics, which has several models for this kind of problem. You could check the Item Response Theory (IRT) models, with the Rasch model being the simplest one. You would model the response $X_{ij}$ of the $i$-th person to the $j$-th question as
$$ P(X_{ij} = 1) = \frac{\exp(\theta_i - \beta_j)}{1+\exp(\theta_i - \beta_j)} $$
where the parameter $\theta_i$ is the "ability" of person $i$ and $\beta_j$ is the difficulty of question $j$, both estimated from the available data. Notice that you don't need every person's answers to all the questions to fit the model, because it predicts the response to each individual question.
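To make the formula concrete, here is a minimal sketch in Python that just evaluates it. The function name `rasch_probability` and the numeric values are made up for illustration; they do not come from any package.

```python
import math

def rasch_probability(theta: float, beta: float) -> float:
    """P(X_ij = 1) under the Rasch model for ability theta and difficulty beta."""
    # exp(t - b) / (1 + exp(t - b)) is the logistic function of (t - b)
    return 1.0 / (1.0 + math.exp(-(theta - beta)))

# Hypothetical values, chosen only for illustration
p_equal = rasch_probability(theta=0.0, beta=0.0)   # ability equals difficulty
p_easy = rasch_probability(theta=1.0, beta=-1.0)   # able person, easy question
p_hard = rasch_probability(theta=-1.0, beta=1.0)   # weaker person, hard question
```

When ability equals difficulty the probability is exactly one half; it rises above one half when $\theta_i > \beta_j$ and falls below it otherwise.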
The model could be made more complicated by considering also the known questions, but accounting for them differently, though you should probably start with something simpler.
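One possible way to write such an extension, as a sketch rather than a recommendation, is to add an indicator $k_{ij}$ that equals 1 when person $i$ knew question $j$ in advance and 0 otherwise. Both $k_{ij}$ and the coefficient $\gamma$ are notation I am introducing here for illustration:

$$ P(X_{ij} = 1) = \frac{\exp(\theta_i - \beta_j + \gamma k_{ij})}{1+\exp(\theta_i - \beta_j + \gamma k_{ij})} $$

Here $\gamma$ would be the shift in the log-odds of a correct answer due to knowing the question beforehand. For $k_{ij} = 0$ this reduces to the plain Rasch model.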
To fit such models you would need to use specialized software for IRT models (e.g. the `mirt` and `ltm` R packages) or could treat them as generalized mixed-effects models and use appropriate software for such (e.g. the `lme4` and `nlme` R packages, or `MixedModels.jl` for Julia).
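If you want to see what fitting involves before reaching for those packages, below is a rough sketch in plain Python. It is not what `mirt` or `lme4` do internally: it uses joint maximum likelihood with gradient ascent, plus a small ridge penalty that I added so that people with all-correct or all-wrong answers get finite estimates. The function names, the tuning values, and the data are all made up for illustration. Unobserved answers are coded as `None` and simply skipped.

```python
import math

def sigmoid(z: float) -> float:
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def fit_rasch(responses, n_iter=2000, lr=0.05, ridge=0.01):
    """Joint maximum-likelihood Rasch fit by gradient ascent (illustrative only).

    responses[i][j] is 1 (correct), 0 (incorrect), or None (not observed).
    """
    n_persons, n_items = len(responses), len(responses[0])
    theta = [0.0] * n_persons
    beta = [0.0] * n_items
    for _ in range(n_iter):
        # Gradient of the ridge penalty (an assumption of this sketch)
        grad_theta = [-ridge * t for t in theta]
        grad_beta = [-ridge * b for b in beta]
        for i, row in enumerate(responses):
            for j, x in enumerate(row):
                if x is None:
                    continue  # missing answers contribute nothing to the likelihood
                resid = x - sigmoid(theta[i] - beta[j])
                grad_theta[i] += resid
                grad_beta[j] -= resid
        theta = [t + lr * g for t, g in zip(theta, grad_theta)]
        beta = [b + lr * g for b, g in zip(beta, grad_beta)]
        # Fix the location of the scale: center the difficulties at zero.
        # Shifting theta by the same amount leaves theta_i - beta_j unchanged.
        mean_beta = sum(beta) / n_items
        beta = [b - mean_beta for b in beta]
        theta = [t - mean_beta for t in theta]
    return theta, beta

# Made-up data: 4 people, 4 questions, None marks an unobserved answer
responses = [
    [1, 1, 1, None],
    [1, 1, 0, 0],
    [1, 0, None, 0],
    [0, 0, 0, 0],
]
theta, beta = fit_rasch(responses)
```

With this toy data, the person who answered everything correctly ends up with the highest $\theta_i$ and the question nobody solved with the highest $\beta_j$. For real data I would still use one of the packages above, since they give proper estimates and standard errors.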
If you would like a trivial solution instead, you could just ignore the questions known by the respondents in advance and average the answers to all the other questions for the final score. Unlike IRT models, this wouldn't correct for the differing difficulties $\beta_j$ of the individual questions, but if you can assume that they are equally difficult, this should not be that big of an issue.
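The trivial approach could look like the sketch below. The function name `simple_score` and the example data are hypothetical; answers are coded as 1 (correct), 0 (incorrect), or `None` (not answered).

```python
def simple_score(answers, known_in_advance):
    """Fraction correct among answered questions not known in advance."""
    kept = [a for a, known in zip(answers, known_in_advance)
            if a is not None and not known]
    if not kept:
        return None  # nothing left to score
    return sum(kept) / len(kept)

# Made-up example: the third question was known in advance, the fifth unanswered
answers = [1, 0, 1, 1, None]
known = [False, False, True, False, False]
score = simple_score(answers, known)  # averages the answers 1, 0, 1
```

Note that the score is an average over a different set of questions for each person, which is exactly why the equal-difficulty assumption matters here.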