I need to evaluate the quality of an instrument based on measurements of the final products. I have two instruments: one known to be reliable, and another whose quality I have to evaluate. The measurements are pressures.
My plan is to measure the same products with both the reliable instrument and the unknown one. Each product will be measured first with the unknown instrument and then with the reliable one. The instruments do not affect each other's measurements; they are independent. The result will be a paired sample.
Without any other information, I plan to take about 35 measurements so that assuming normality is reasonable. (I know 30 is not a magic number for normality, and that the sample size should really be determined by the power of the chosen test, but I am not sure which test to adopt.) I will then take the difference between the two measurements of each product, which removes the variability between individual products, to obtain a vector of differences that reflects the variability between the two instruments.
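To make the differencing step concrete, here is a minimal sketch in Python. I have no real data yet, so the numbers are synthetic placeholders: the pressure level (100), the product-to-product spread (5) and the two instrument noise levels (0.2 and 0.5) are assumptions chosen only for illustration.

```python
import random
import statistics

random.seed(0)      # fixed seed so the illustration is reproducible
n_products = 35     # planned sample size

# Synthetic placeholder data (NOT real measurements): each product has its
# own "true" pressure, and each instrument adds its own independent noise.
true_pressure = [random.gauss(100.0, 5.0) for _ in range(n_products)]
reliable = [p + random.gauss(0.0, 0.2) for p in true_pressure]
unknown = [p + random.gauss(0.0, 0.5) for p in true_pressure]

# Paired differences: the product-specific pressure cancels out,
# leaving only the disagreement between the two instruments.
differences = [u - r for u, r in zip(unknown, reliable)]

mean_diff = statistics.mean(differences)     # estimate of systematic offset
var_diff = statistics.variance(differences)  # sample variance (n - 1 denominator)

print("mean of differences:", mean_diff)
print("variance of differences:", var_diff)
print("variance of raw unknown readings:", statistics.variance(unknown))
```

In this simulated setup the variance of the differences is much smaller than the variance of the raw readings, which is the effect I am hoping to exploit by pairing.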
What I would like to know is whether the measurements of the unknown instrument are significantly different from those of the good instrument. I thought about comparing the variance of the sample differences with an ideal population variance, to understand whether the variability in the data can be attributed to the different instruments or only to chance. Does this line of reasoning make sense? If so, can I use an F-test? Which test do you recommend?
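For concreteness, this is a rough sketch of the comparison I have in mind, not a claim that it is the right test. The function name `variance_ratio_statistic` and the `reference_variance` argument are hypothetical names I made up, and the reference value itself is something I would still have to justify. The statistic is (n - 1) s² / σ₀², where s² is the sample variance of the differences and σ₀² the ideal variance; as far as I understand, under normality it would be compared with a chi-square distribution with n - 1 degrees of freedom.

```python
import statistics


def variance_ratio_statistic(differences, reference_variance):
    """Return (n - 1) * s^2 / sigma0^2 for a list of paired differences.

    `reference_variance` (sigma0^2) is a hypothetical "ideal" variance
    that has to be supplied from outside the data.
    """
    n = len(differences)
    if n < 2:
        raise ValueError("need at least two differences")
    if reference_variance <= 0:
        raise ValueError("reference variance must be positive")
    sample_variance = statistics.variance(differences)  # n - 1 denominator
    return (n - 1) * sample_variance / reference_variance


# Toy numbers purely for illustration (not real measurements).
example_differences = [1.0, 2.0, 3.0, 4.0, 5.0]
stat = variance_ratio_statistic(example_differences, reference_variance=2.0)
print("test statistic:", stat)
```

A large value of the statistic would suggest that the differences vary more than the ideal variance allows, but this only addresses the spread, not a systematic offset between the instruments.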