Questions tagged [variable]

Very unclear, please avoid and find a more specific tag

97 questions
6
votes
2 answers

Estimate normal distribution from dnorm in R

The function dnorm(x) in R returns the value of the probability density function of a normal distribution (mean = 0 and SD = 1 by default) at the points x, as a vector of the same length as x. However, I want to do the opposite:…
BN-stats
  • 63
  • 3
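
For readers less familiar with R, the behavior the excerpt above describes (evaluating the standard normal density at each point of a vector) can be sketched in Python. This is only an illustration of the forward direction, not an answer to the question; the evaluation points `xs` are made up, and the standard library's `statistics.NormalDist` stands in for R's `dnorm`.

```python
from statistics import NormalDist

# Standard normal (mean 0, SD 1), matching dnorm's defaults in R.
standard_normal = NormalDist(mu=0.0, sigma=1.0)

# Hypothetical evaluation points, chosen only for illustration.
xs = [-1.0, 0.0, 1.0, 2.5]

# Evaluate the density at each point; the result has one value per input.
densities = [standard_normal.pdf(x) for x in xs]

for x, d in zip(xs, densities):
    print(f"x = {x:5.2f}  density = {d:.6f}")
```

The list `densities` has the same length as `xs`, mirroring the vectorized return value of `dnorm(x)`.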
5
votes
1 answer

Can we use fractional regression for a dependent variable that is made of continuous numerator and denominator?

I have a dependent variable that is a ratio, which takes values between 0 and 1. Some 30% of values are 1s. The dependent variable measures the distribution of funds and is calculated as amount of distributed money / total amount of proposed money.…
Ken Lee
  • 321
  • 7
5
votes
1 answer

How to deal with variables that are only relevant for some people?

I am reviewing an article. I can't be specific, but it involves validating a test for a health condition. Their goal is to come up with a score for risk of the condition. One variable is pregnancy. Does this mean that they should validate the test…
Peter Flom
  • 94,055
  • 35
  • 143
  • 276
5
votes
3 answers

What type of data are dates?

According to Yale: Categorical variables represent types of data which may be divided into groups (Lacey M, 1997). To me, dates do not fit this definition. They are ordinal, as one date is later than the date before it. It is also quantitative…
Sinker
  • 151
  • 1
  • 1
  • 3
4
votes
1 answer

SelectKBest score function with mixed categorical and continuous data

I am building a classification model where my label is categorical (0 or 1). I want to use scikit-learn’s SelectKBest to select my top 10 features, but I’m not sure which score function to use. I thought I’d use chi2, but not all my variables are…
Insu Q
  • 255
  • 2
  • 9
3
votes
1 answer

Is BMI category qualitative or quantitative?

I know for sure that BMI (Body Mass Index) is a quantitative variable, as it is continuous. But is the BMI category derived from BMI (Underweight, Normal Weight, Overweight) a qualitative or a quantitative variable? Thanks.
3
votes
2 answers

How do I designate a variable in a linear model to be a covariate in R?

So I want to fit this equation, for example: y = mu + Strain + Insect + Strain*Insect + BW_final. Of all these variables, Strain and Insect are controlled variables, but BW_final is an independent variable which isn't necessarily controlled. So I…
3
votes
1 answer

Likert scale 0-4 or 1-5

I am conducting a study with Likert scale questions and my thesis advisor advised me to rescale the answers from 1-5 (strongly agree-strongly disagree) to 0-4. She says this will help in the regression analyses. I am researching the effect of a…
Sabine
  • 35
  • 5
3
votes
2 answers

Simulating random variables from a discrete distribution II

I have the following discrete probability distribution where $p$, $q$ and $r$ are known constants: $P(X=0)=q$, $0
3
votes
0 answers

Interpreting units for random forest variable importance

I've trained a random forest for classification in R's caret package using the ranger method and impurity for measuring variable importance. I would like to figure out what the units are for the variable importance measure returned by the model.…
3
votes
3 answers

Are the variable types here considered correct?

If we want to determine the variable types, will it be as follows for the variables below? Age ---> quantitative, discrete (we can count); Fitness ---> if the values that we will enter here are 0 and 1 only, will the type of this variable be…
Simplicity
  • 535
  • 1
  • 7
  • 12
2
votes
2 answers

Numeric variable with outliers as categories

I'm working with a dataset that has a few variables I'm having difficulty preprocessing. One of them, MENTHLTH, is a numeric variable. The point of the variable is to measure the number of days a person has had a bad…
2
votes
2 answers

Combining results of multiple Lasso runs / Variable selection

I would appreciate your opinion on an analysis approach I have in mind. The idea is to do the variable selection with multiple runs of Lasso regression (using glmnet in R). Basically, the workflow would be: Run Lasso in the usual classification…
pelah
  • 23
  • 4
2
votes
1 answer

Combining categories by Weight of Evidence

When calculating Information Value and Weight of Evidence, it's possible to draw a chart of WoE for each variable to study its effect on the state of the target variable. Now, I know it's possible to group values of continuous numeric variables into…
2
votes
1 answer

Is there a way to cluster data using a dependent variable?

We are conducting research on neighborhood mail response behavior, i.e. what percentage of people in a neighborhood reply to a piece of mail. Based on regression analysis, we know which factors (% black, % poor, etc.) influence mail response rates.…