Questions tagged [collecting-data]

11 questions
59
votes
7 answers

Industry vs Kaggle challenges. Is collecting more observations and having access to more variables more important than fancy modelling?

I hope the title is self-explanatory. On Kaggle, most winners use stacking, sometimes with hundreds of base models, to squeeze out a few extra percent of MSE, accuracy... In general, in your experience, how important is fancy modelling such as stacking vs…
Tom
1
vote
0 answers

How to track term prevalence on the web over time?

What is the correct way to track the prevalence of a specific term, year by year, over several years? This is for a bachelor's thesis in media sciences about the trend sport "Speedminton". In order to correlate the web presence of the sport on the web…
1
vote
0 answers

How to code recurring measuring points in the raw data files

Let's assume I have three measuring points in time: T1 at the start of the study, T2 after one week, and T3 after two weeks, at the end. To keep it simple, let's also assume there is only one numerical value at each point. How do you code that in the raw…
buhtz
1
vote
1 answer

Dealing with differing numbers of replicate measurements per sample?

Here's a question that I haven't come across in any statistics classes or books. Imagine I have $n$ samples and I subject them to a series of different tests to characterise them. For fun's sake, let's say it's beer. For each sample I do…
1
vote
1 answer

Type of study without groups

In a randomised controlled trial, true randomisation is applied to decide which subjects receive treatment. In a quasi-experimental trial, the randomisation is only approximate. What is the name given to a trial where subjects are both treated and…
Gilly
0
votes
0 answers

How can I do data analysis with insufficient Gini index data for research?

I am conducting research on income inequality in my home country, but the Gini data are insufficient: they are only available every 2 or 5 years. Can I still do a data analysis with the limited data available? What can I do to solve this issue, as…
0
votes
0 answers

How to choose which questions to display in a questionnaire to maximize the significance of the data collected?

I'm currently setting up a questionnaire for a target group of people. The questionnaire will ask users to rate how relevant a topic is to a given sentence. Since each user might have a different level of familiarity with a given topic, I expect that…
0
votes
1 answer

Recent mortality

Recently there have been a lot of data points regarding COVID-19-related deaths. However, I want to see how big the impact of undiagnosed patients is. Therefore I want to evaluate the deaths (corrected for reported virus deaths) and see how these…
0
votes
0 answers

Accounting for Inter-Sample Collection Time Variability

I'm trying to run a regression analysis on a dataset that features a pair of continuous variables that are collected at a certain time (in days). Whilst the data should be collected at a specific time, due to various constraints there can be a…
0
votes
2 answers

Question on basic hypothesis test and data collection

I have 2 other questions on hypothesis testing. Suppose I have a hypothesis like this: Spanish companies have more contract offers from Portuguese companies than German companies have from Portuguese companies. Now I have 2 questions: 1- To test this…
vaxent
0
votes
1 answer

Data Acquisition in R

R is good at processing data. It is an analytic turbo-ginsu. LabVIEW is great at getting data. It plugs into anything with electrodes. MATLAB has Simulink. There are toolboxes for data acquisition. What does "R" have? Does it connect to…
EngrStudent