Questions tagged [collecting-data]
11 questions
59
votes
7 answers
Industry vs Kaggle challenges. Is collecting more observations and having access to more variables more important than fancy modelling?
I'd hope the title is self-explanatory. On Kaggle, most winners use stacking, sometimes with hundreds of base models, to squeeze out a few extra percent of MSE, accuracy... In general, in your experience, how important is fancy modelling such as stacking vs…

Tom
- 1,204
- 8
- 17
1
vote
0 answers
How to track term prevalence on the web over time?
What is the correct way to track the prevalence of a specific term for each year over several years? The goal is a bachelor's thesis in media sciences about the trend sport "Speedminton". In order to correlate the web presence of the sport on the web…

Konrad Höffner
- 111
- 4
1
vote
0 answers
How to code recurring measuring points in the raw data files
Let's assume I have three measuring points in time.
T1 is when the study starts
T2 is after one week
T3 is after two weeks, at the end
Without making it too complex, let's also assume there is only one numerical value at each point. How do you code that in the raw…

buhtz
- 61
- 9
1
vote
1 answer
Dealing with differing numbers of replicate measurements per sample?
Here's a question that I haven't come across in any statistics classes or books.
Imagine I have $n$ samples and I subject them to a series of different tests to characterise them. For fun's sake let's say it's beer. For each sample I do…

d3F
- 11
- 3
1
vote
1 answer
Type of study without groups
In a randomised controlled trial, true randomisation is applied to decide which subjects receive treatment. In a quasi-experimental trial, the randomisation is only approximate.
What is the name given to such a trial where subjects are both treated and…

Gilly
- 247
- 3
- 8
0
votes
0 answers
How can I do data analysis with insufficient Gini index data for research?
I am conducting research on income inequality in my home country, but the Gini data are insufficient: they are only available every 2 or 5 years. Can I still do a data analysis with the limited data available? What can I do to solve this issue, as…

michelle
- 1
0
votes
0 answers
How to choose which questions to display in a questionnaire to maximize the significance of the data collected?
I'm currently setting up a questionnaire for a target group of people.
The questionnaire will ask users to rate how relevant a topic is to a given sentence.
Since each user might have a different amount of familiarity with a given topic, I expect that…

Mandelbrotter
- 103
- 2
0
votes
1 answer
Recent mortality
Recently there have been a lot of data points regarding COVID-19-related deaths. However, I am looking to see how big the impact of undiagnosed patients is.
Therefore, I want to evaluate the deaths (corrected for reported virus deaths) and see how these…

Dennis Jaheruddin
- 436
- 3
- 20
0
votes
0 answers
Accounting for Inter-Sample Collection Time Variability
I'm trying to run a regression analysis on a dataset that features a pair of continuous variables that are collected at a certain time (in days). Whilst the data should be collected at a specific time, due to various constraints there can be a…

Dan Adams
- 1
- 1
0
votes
2 answers
Question on basic hypothesis test and data collection
I have 2 other questions on hypothesis testing.
If I have a hypothesis like this:
Spanish companies have more contract offers from Portuguese companies than German companies have from Portuguese companies.
Now I have 2 questions:
1- To test this…

vaxent
- 67
- 5
0
votes
1 answer
Data Acquisition in R
R is good at processing data. It is an analytic turbo-ginsu.
LabVIEW is great at getting data. It plugs into anything with electrodes.
MATLAB has Simulink. There are toolboxes for data acquisition.
What does "R" have? Does it connect to…

EngrStudent
- 8,232
- 2
- 29
- 82