Setup: Respondents provide answers on a slider ranging from -100 to 100. Some treat the slider as binary (see the "data_to_reject" and "data_ambivalent" examples below), meaning that they use only a very small number of values when making their choice.
My question: How can I identify respondents who treat the slider as binary? Ideally, I would like a test that provides a fairly reliable and solid cut-off I can use to include or exclude respondents. The examples below provide test cases. The first one, data_to_pass, should pass the test, as the responses are nicely spread out. The second one, data_to_reject, should definitely be rejected, as the respondent has only used -100 and 100 as responses. The third one, data_ambivalent, should be harder to classify with a statistical test and serves as a test case for more difficult situations.
I have already tried a few approaches. I looked at typical summary measures, including the mean, standard deviation, variance, range, etc., but I could not come up with a reliable test. I also tried testing for a bimodal distribution using Hartigan's dip test, but that also fails to distinguish reliably between the different cases (it fails for the first one). I also tried normality tests, namely the Kolmogorov-Smirnov and Shapiro-Wilk tests, both of which fall short. I also tried reflecting the scores (so that all scores are positive) and repeating the steps above (in retrospect, I am not sure why that should work, but still). Finally, I tried to see whether anything clever can be done with k-means clustering; I am not familiar with the method, and my explorations proved futile.
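For reference, here is a minimal sketch of the kind of checks described above (spread measures plus the two normality tests), using base R only. The vectors toy_spread and toy_binary and the helper check_responses are hypothetical names made up for this illustration; they are not the real data shown below. The dip test is left out because it needs an extra package.

```r
# Hypothetical toy vectors standing in for real respondents (not the data below)
set.seed(1)
toy_spread <- sample(-100:100, 200, replace = TRUE)      # uses the whole slider
toy_binary <- sample(c(-100, 100), 200, replace = TRUE)  # only the two endpoints

# Hypothetical helper: spread measures plus two normality-test p-values
check_responses <- function(x) {
  x <- x[!is.na(x)]  # drop missing responses
  c(
    mean      = mean(x),
    sd        = sd(x),
    variance  = var(x),
    range     = diff(range(x)),
    shapiro_p = shapiro.test(x)$p.value,
    # ks.test warns about ties, which are unavoidable with slider data
    ks_p      = suppressWarnings(ks.test(x, "pnorm", mean(x), sd(x))$p.value)
  )
}

res <- rbind(spread = check_responses(toy_spread),
             binary = check_responses(toy_binary))
round(res, 3)
```

Both toy vectors cover nearly the same range and both are far from normal, which illustrates why none of these numbers alone gave me an obvious cut-off.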
data_to_pass <- c(4, NA, 79, -33, 53, 26, -59, 50, 9, -67, 86, 36, 61, 43, -86, -25, NA, 36, 78, -36, -38, -49, 100, -26, 50, -74, 100, 48, -33, -80, 91, 46, -37, 51, -57, 46, 28, 44, -58, 66, 95, -26, 32, 88, 62, -34, -90, -55, 50, -53, -69, -50, 13, -36, -27, -12, -8, 70, -96, 14, -63, -27, 14, 81, 18, 71, 12, 47, 22, 94, -20, -66, -30, 88, -27, 30, -30, 64, -23, -85, 68, -73, 51, 26, 93, 26, 51, -48, 49, -36, 92, NA, -15, 48, -20, -77, -39, -62, -97, -48, 15, -22, 95, -55, 40, 47, 92, -56, -40, NA, 85, 61, -23, -41, -70, -61, 80, 21, -50, 22, 88, -47, 23, -45, -98, -18, 69, 9, 66, -16, 42, -84, -34, 89, -56, 35, -57, -94, -80, 31, 70, -53, -62, 24, 81, 26, -46, -49, -80, -13, 41, 41, 78, -21, 41, 24, 51, 25, 49, -47, -24, -55, -61, 20, 56, -23, -62, 73, 30, -37, 69, 90, 29, 27, 55, 39, -65, -59, -48, -80, 24, 59, 50, -11, 25, -62, 38, -52, 31, 68, -57, -52, 29, -63, -43, 14, -76, -66, 48, -94, 24, 30, 61, -80, -35, -35, 32, -34, -77, -54, 95, -17, -53, 54, 91, -47, 39, 75, -87, 52, 17, -37, 81, -44, -39, 53, -58, 39, -95, -31, 57, 55, 40, -32, -66, 69, 42, 22, 67, -94, -43, -64, -52, 58, 39, 75, -51, 91, 63, -38, -42, -67, -81, -24, -51, 82, 51, -39, -47, 17, 58, -88, -68, -37, 19, -78, -34, 48, -35, -71, -87, -51, 46, -64, 33, 76, 25, 41, 33, -44, 72, -55, -37, NA, 43, -21, -55, 48, 58, 32, 36, -66, 86, 55, 29, 33, -76, 64, -44, -32, -86, 59, -67, 47, -89, -56, -75, 30, -50, -36, 89, -50, 17, 75, 31, -90, 46, -23, -41, 49, -63, -91, -67, 58, 39, 86, 41, -50, -70, 75, -49, -14, -49, -53, 70, 92, 19, 61, 53, -36, -82, 22, -31, -91, -64, 32, -19, 92, -76, 46, -58, -95, 64, NA, -45, 86, -64, -51, 24, 30, 55, -63, 24, -18, 46, 33, 91, -57, -51, 69, -22, 90, -38, -82, 37, -87, -41, -58, -36, -87, -45, -51, 53, -57, 42, -64, 81, 39, 71, -41, 68, -99, 44, 38, 42, 64, -96, 25, -40, 57, -96, -65, -52, -48, 91, 42, -54, -75, -62, 82, -52, -80, -24, 30, -81, -63, -50, 33, 59, -67, -40, -97, -67, -72, 25, -71, -47, 57, -40, 81, -24, -59)
data_to_reject <- c
data_ambivalent <- c(NA, -49, -28, -100, 52, 53, -42, 100, 51, 57, -52, -100, -46, 98, -100, 17, -49, 53, 100, 100, -54, -58, -100, 100, 51, -53, 50, 30, -60, -100, -49, 51, -57, 53, -100, 32, -53, 10, 47, -33, -45, -100, 45, 100, 100, -64, 100, 100, -27, -100, 55, -100, 100, 100, 30, 61, -15, -100, 34, 53, -56, -100, 100, 51, -100, 53, -100, -23, 100, 100, -100, -23, -100, -51, 56, -18, 100, 100, 56, 15, 100, 100, 54, 56, -100, -16, 100, -34, -48, -60, 23, -53, 100, -53, 100, 58, -100, 16, 100, 100, 100, 56, -100, 63, 100, -100, -64, -100, 56, 100, -100, -100, 55, -100, 100, 100, -83, 100, 100, -78, 68, 100, 100, 49, 78, -100, 100, 100, -100, -98, -97, -100, -89, 100, -100, 100, 58, 47, -100, 100, 62, 100, 88, -100, 21, 64, 86, -100, 17, -100, -58, 100, 16, 100, -63, -100, 100, 100, 100, -59, -100, -17, 100, 100, 85, 87, 85, -58, 100, -54, -14, 73, 54, -43, 100, -54, -100, -67, -69, -100, -14, 100, -37, -100, -100, -100, -100, -100, -88, 76, 60, -100, 100, -100, -100, -100, -100, -69, 100, -57, 49, 100, -100, -100, -58, 100, -55, -100, -100, -18, 54, -73, -100, -88, 20, 100, -100, -100, 84, -100, -27, 100, -100, -35, NA, -100, 100, 38, 100, 100, NA, -31, 100, 100, 76, 71, -100, 100, 39, -47, 57, -100, -61, -100, -44, -100, 53, 86, 82, 59, 68, -100, -100, 100, 16, -67, 35, -100, 16, 54, -57, 60, 73, 62, 70, -17, -100, 100, 57, 100, -100, 100, 81, 17, -100, 17, 100, -28, 37, -100, 100, 100, -22, 59, 100, 100, -58, -100, 28, 100, -100, 100, -72, -82, -100, -100, 19, -100, 37, 100, 100, 100, -59, 61, -100, 26, -59, -53, -49, -21, -100, -19, 100, 14, -100, -18, 100, -97, -100, -18, -100, -57, -100, -63, 61, 60, -100, 100, -59, -100, 100, NA, -100, 100, 100, -100, -100, 100, -100, -100, -100, -100, -100, -61, -100, 100, -100, -17, -100, -100, 13, -100, 51, -47, -100, 50, 57, -100, -100, -100, 51, 100, -100, 100, 55, -100, 100, 100, 100, 14, -100, 21, 100, 15, 55, -18, 100, -16, 100, 44, -100, -100, 100, -100, 100, -100, -100, 100, 52, 100, 100, -44, -100, 100, 100, 97, 
-100, -100, -65, 100, -100, -100, 59, -100, -100, 100, 54, -51, -100, -100, 20, 57, 100, -50, -100, 62, 97, 100, -100, 100, 66, 100, -100, -100, 100, 100, -100, -100, 100, -100, -33, 100)
plot(data_to_pass)
plot(data_to_reject)
plot(data_ambivalent)