Let's say I want to generate 100 observations of two Likert-scaled, normally(?) distributed variables with 10 categories (1-10) and a Pearson correlation of, for example, ~0.8. I am aware that using Pearson correlation on categorical data is controversial, but I really need it that way, because my goal is to analyze how Pearson's r behaves if I change the distances between the Likert items. (Any literature on this, if it exists, would also be appreciated.)
So it would look something like this:
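repeat {
  # prob = c(?) is a placeholder: which category probabilities to use is part of the question
  x <- sample(1:10, 100, replace = TRUE, prob = c(?))
  y <- sample(1:10, 100, replace = TRUE, prob = c(?))
  # x and y must exist before cor() can be evaluated, so the check goes at the end
  if (cor(x, y) > 0.8) break
}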
Of course this is super inefficient, but you get the idea of what I want. I am also not sure about the distribution. Is there a common distribution or set of probabilities that typically occurs with Likert scales like:
strongly agree | undecided | disagree | strongly disagree
or should I just use a uniform distribution?
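To make the goal more concrete, here is a minimal sketch of one possible alternative to the rejection loop, not a definitive method: draw two correlated latent normal variables and cut them into 10 categories. The function name `simulate_likert`, the cut points in `cuts`, and the unequal spacing in `scores` are all assumptions made up for illustration. Note that discretizing usually shrinks the correlation a little, and with only 100 observations the sample value will scatter around the latent 0.8 rather than hit it exactly.

```r
# Sketch only: simulate_likert(), cuts and scores are illustrative assumptions.
set.seed(1)

simulate_likert <- function(n, rho, cuts) {
  # Two standard normal latent variables with population correlation rho
  z1 <- rnorm(n)
  z2 <- rho * z1 + sqrt(1 - rho^2) * rnorm(n)
  # Map each latent value to a category 1..(length(cuts) + 1)
  data.frame(
    x = findInterval(z1, cuts) + 1L,
    y = findInterval(z2, cuts) + 1L
  )
}

# 9 thresholds give 10 categories; equally spaced cuts make the
# middle categories most frequent (roughly bell-shaped)
cuts <- seq(-2, 2, length.out = 9)
d <- simulate_likert(n = 100, rho = 0.8, cuts = cuts)

# Pearson's r with the usual equally spaced scores 1..10
r_equal <- cor(d$x, d$y)

# Pearson's r after recoding the categories with hypothetical unequal distances
scores <- c(1, 2, 3, 4, 5, 7, 9, 12, 15, 20)
r_unequal <- cor(scores[d$x], scores[d$y])

print(c(equal = r_equal, unequal = r_unequal))
```

Changing `cuts` changes the category probabilities, and changing `scores` changes the distances between items, so both parts of the question can be varied independently.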