Questions tagged [continuous-data]

A random variable $X$ is called continuous if its set of possible values is uncountable, and the chance that it takes any particular value is zero ($\text{P}(X = x) = 0$ for every real number $x$). A random variable is continuous if and only if its cumulative probability distribution function is a continuous function.

From Mood et al. (page 60, 1974):

"A random variable $X$ is called continuous if there exists a function $f_{X}(\cdot)$ such that $F_{X}(x)=\int_{-\infty}^{x}f_{X}(u)\,du$ for every real number $x$. The cumulative distribution function $F_{X}(\cdot)$ of a continuous random variable $X$ is called absolutely continuous".

Mood, A. M., Graybill, F. A., & Boes, D. C. (1974). Introduction to the theory of statistics. (B. C. Harrinson & M. Eichberg, Eds.) (3rd ed., p. 564). McGraw-Hill, Inc.

Excerpt reference: Glossary of Statistical Terms from berkeley.edu

However, the term is also commonly used for variables that can take on a great many values, such as IQ.
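
The integral definition quoted above can be checked numerically. The following is a minimal sketch, assuming the standard normal distribution as the example (the distribution and the helper names `normal_pdf`, `normal_cdf`, `cdf_by_integration`, and `interval_probability` are illustrative choices, not part of the quoted source). It approximates $F_X(x)$ by integrating the density with the trapezoidal rule and shows that the probability of an interval around a point shrinks as the interval narrows, which is consistent with $\text{P}(X = x) = 0$.

```python
import math


def normal_pdf(u: float) -> float:
    """Density f_X(u) of the standard normal distribution."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def normal_cdf(x: float) -> float:
    """Closed-form F_X(x) for the standard normal, via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def cdf_by_integration(x: float, lower: float = -10.0, steps: int = 100_000) -> float:
    """Approximate F_X(x) as the integral of f_X from `lower` to x.

    Assumption: `lower = -10` stands in for minus infinity; the normal
    tail mass below that point is negligible.
    """
    if x <= lower:
        return 0.0
    h = (x - lower) / steps
    # Trapezoidal rule: half weight on the two endpoints, full weight inside.
    total = 0.5 * (normal_pdf(lower) + normal_pdf(x))
    for i in range(1, steps):
        total += normal_pdf(lower + i * h)
    return total * h


def interval_probability(x: float, eps: float) -> float:
    """P(x - eps < X <= x + eps), computed as F_X(x + eps) - F_X(x - eps)."""
    return normal_cdf(x + eps) - normal_cdf(x - eps)


if __name__ == "__main__":
    print("numerical F(1):  ", cdf_by_integration(1.0))
    print("closed-form F(1):", normal_cdf(1.0))
    for eps in (1e-1, 1e-3, 1e-6):
        print(f"P(|X| <= {eps:g}) =", interval_probability(0.0, eps))
```

The two values of $F_X(1)$ should agree closely, and the interval probabilities should fall roughly in proportion to the interval width, since for small widths the probability is approximately the density times the width.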

707 questions
146 votes, 6 answers

Correlations with unordered categorical variables

I have a dataframe with many observations and many variables. Some of them are categorical (unordered) and the others are numerical. I'm looking for associations between these variables. I've been able to compute correlation for numerical variables…
Clément F • 1,717 • 4 • 12 • 13
98 votes, 8 answers

What is the benefit of breaking up a continuous predictor variable?

I'm wondering what the value is in taking a continuous predictor variable and breaking it up (e.g., into quintiles), before using it in a model. It seems to me that by binning the variable we lose information. Is this just so we can model…
Tom • 1,511 • 1 • 12 • 17
97 votes, 1 answer

Correlation between a nominal (IV) and a continuous (DV) variable

I have a nominal variable (different topics of conversation, coded as topic0=0 etc) and a number of scale variables (DV) such as the length of a conversation. How can I derive correlations between the nominal and scale variables?
Paul Miller • 971 • 2 • 7 • 3
73 votes, 10 answers

What is the difference between discrete data and continuous data?

What is the difference between discrete data and continuous data?
Albort • 881 • 1 • 9 • 10
61 votes, 8 answers

Does it ever make sense to treat categorical data as continuous?

In answering this question on discrete and continuous data I glibly asserted that it rarely makes sense to treat categorical data as continuous. On the face of it that seems self-evident, but intuition is often a poor guide for statistics, or at…
38 votes, 5 answers

Clustering a dataset with both discrete and continuous variables

I have a dataset X which has 10 dimensions, 4 of which are discrete values. In fact, those 4 discrete variables are ordinal, i.e. a higher value implies a higher/better semantic. 2 of these discrete variables are categorical in the sense that for…
31 votes, 4 answers

Predicting with both continuous and categorical features

Some predictive modeling techniques are more designed for handling continuous predictors, while others are better for handling categorical or discrete variables. Of course there exist techniques to transform one type to another (discretization,…
29 votes, 2 answers

When should we discretize/bin continuous independent variables/features and when should not?

When should we discretize/bin independent variables/features and when should we not? My attempts to answer the question: In general, we should not bin, because binning loses information. Binning actually increases the degrees of freedom of the…
Haitao Du • 32,885 • 17 • 118 • 213
28 votes, 2 answers

Continuous generalization of the negative binomial distribution

Negative binomial (NB) distribution is defined on non-negative integers and has probability mass function $$f(k;r,p)={\binom {k+r-1}{k}}p^{k}(1-p)^{r}.$$ Does it make sense to consider a continuous distribution on non-negative reals defined by the…
23 votes, 2 answers

Uniform random variable as sum of two random variables

Taken from Grimmett and Stirzaker: Show that it cannot be the case that $U=X+Y$ where $U$ is uniformly distributed on $[0,1]$ and $X$ and $Y$ are independent and identically distributed. You should not assume that $X$ and $Y$ are continuous variables. A…
19 votes, 2 answers

Why is the Cauchy Distribution so useful?

Could anyone give me some practical examples of the Cauchy Distribution? What makes it so popular?
Daria • 375 • 2 • 11
18 votes, 2 answers

How to correctly assess the correlation between ordinal and a continuous variable?

I'd like to estimate the correlation between: An ordinal variable: subjects are asked to rate their preference for 6 types of fruit on a 1-5 scale (ranging from very disgusting to very tasty) On average subjects use only 3 points of the scale. A…
San • 181 • 1 • 1 • 3
18 votes, 4 answers

$P[X=x]=0$ when $X$ is continuous variable

I know that for a continuous variable $P[X=x]=0$. But I can't visualize that if $P[X=x]=0$, there is an infinite number of possible $x$'s. Also, why do their probabilities get infinitely small?
time • 1,167 • 5 • 15 • 31
17 votes, 1 answer

How to choose between ANOVA and ANCOVA in a designed experiment?

I am conducting an experiment which has the following: DV: Slice consumption (continuous or could be categorical) IV: Healthy message, unhealthy message, no message (control) (3 groups in which people are randomly assigned - categorical) This is a…
mobo • 213 • 1 • 3 • 8
16 votes, 2 answers

Correlation coefficient between a (non-dichotomous) nominal variable and a numeric (interval) or an ordinal variable

I've already read all the pages on this site trying to find the answer to my problem, but none seems to be the right one for me... First, let me explain the kind of data I'm working with... Let's say that I have an array vector with several names of…