Why would I need both a validation set & a test set if I'm not selecting a model?

Question

I have a dataset with two features and one outcome. I was asked to separate the data into three parts such that 70% of the data is a training set, 20% is for validation and 10% for testing. The model will be linear regression.

Why would I need both a validation set and test set here? I am not selecting a type of model or tuning hyperparameters.

There is no model to choose between: it is a linear regression of the form $y = b_1a_1 + b_2a_2 + b_3$, with $b_1$, $b_2$, and $b_3$ estimated from the training set. I will then evaluate the model on the test set and report the error. So what is the validation set needed for?
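
For concreteness, here is a minimal Python sketch of the workflow described above: a 70/20/10 split, a least-squares fit of $y = b_1a_1 + b_2a_2 + b_3$ on the training part only, and an error reported on the test part. The data are synthetic, and the helper names (`split_70_20_10`, `solve3`, `fit_linear`, `mse`) and the "true" coefficients are assumptions made for illustration; they are not taken from the actual dataset or from MATLAB's `regress`.

```python
import random

def split_70_20_10(rows, seed=0):
    """Shuffle the rows and split them into 70% train, 20% validation, 10% test."""
    rows = list(rows)
    random.Random(seed).shuffle(rows)
    n_train = len(rows) * 7 // 10
    n_val = len(rows) * 2 // 10
    return rows[:n_train], rows[n_train:n_train + n_val], rows[n_train + n_val:]

def solve3(A, v):
    """Solve a 3x3 linear system by Gaussian elimination with partial pivoting."""
    n = 3
    M = [row[:] + [rhs] for row, rhs in zip(A, v)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def fit_linear(rows):
    """Least-squares estimate of [b1, b2, b3] via the normal equations."""
    X = [(a1, a2, 1.0) for a1, a2, _ in rows]  # constant column for b3
    y = [row[2] for row in rows]
    XtX = [[sum(x[i] * x[j] for x in X) for j in range(3)] for i in range(3)]
    Xty = [sum(x[i] * yi for x, yi in zip(X, y)) for i in range(3)]
    return solve3(XtX, Xty)

def mse(b, rows):
    """Mean squared error of y = b1*a1 + b2*a2 + b3 on the given rows."""
    return sum((b[0] * a1 + b[1] * a2 + b[2] - y) ** 2 for a1, a2, y in rows) / len(rows)

# Synthetic data (assumed for illustration): y = 2*a1 - 1*a2 + 0.5 + noise
rng = random.Random(42)
data = []
for _ in range(200):
    a1, a2 = rng.uniform(-5, 5), rng.uniform(-5, 5)
    data.append((a1, a2, 2.0 * a1 - 1.0 * a2 + 0.5 + rng.gauss(0, 0.1)))

train, val, test = split_70_20_10(data)
b = fit_linear(train)     # coefficients come from the training set only
test_mse = mse(b, test)   # the error that would be reported
print("coefficients:", b)
print("test MSE:", test_mse)
print("validation rows not used:", len(val))
```

In this sketch `val` is never touched, because nothing is selected or tuned between fitting and testing; that is exactly the situation the question is about.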

http://stats.stackexchange.com/questions/19048/what-is-the-difference-between-test-set-and-validation-set — Alex R., Aug 30 '15 at 02:39
Welcome to the site, @SwethaTanamala. I believe you will find the information you need in the linked thread. Please read it. If you still have a question afterwards, come back here & edit your Q to state what you've learned & what you still need to know. Then we can provide the information you need without simply duplicating material elsewhere that already didn't help you. — gung - Reinstate Monica, Aug 30 '15 at 02:55
@gung, I actually read the answer you posted, and I understood it. I have now edited the question, so please answer it. — Swetha Tanamala, Aug 30 '15 at 03:04
Can you clarify what you are asking that is distinct from the linked thread? I have trouble following your question. Can you state what you understand from there & what you still need to know? — gung - Reinstate Monica, Aug 30 '15 at 03:07
@gung In my data set there are two features and one response, and I need to fit a linear regression to it. I use regress (a built-in MATLAB function) to fit y = b0 + b1*a1 + b2*a2, so I get b0, b1, and b2 from the training set alone. Then I check the fit on my test set. So there is no need for a validation set, right? — Swetha Tanamala, Aug 30 '15 at 03:14
I think you are right: if you have only one model and no hyperparameters to tune (e.g. the capacity of a support vector machine), then training and test sets are sufficient, but only in that context. See http://stats.stackexchange.com/questions/168807/why-splitting-the-data-into-the-training-and-testing-set-is-not-enough/168815#168815 — , Aug 30 '15 at 08:10
