Questions tagged [aggregation]

Aggregation refers to "lumping together" potentially inhomogeneous groups of data. The laws of total expectation and variance can be thought of as providing a way to calculate the mean and variance of an aggregated data set, if the variable being conditioned on ($Y$ in the Wikipedia articles) is the grouping variable being aggregated over.

When data are aggregated, the distribution of the pooled data is the marginal distribution: the group-specific (conditional) distributions averaged over the grouping variable.
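As a rough illustration of the laws of total expectation and variance mentioned above, the sketch below combines per-group sizes, means, and variances into the mean and variance of the pooled data. The function name `aggregate_mean_var` and the three small groups are made up for illustration, and the variances are assumed to be population variances (divisor $n$, not $n-1$), with group proportions standing in for the probabilities of the grouping variable.

```python
import math
from statistics import fmean, pvariance

def aggregate_mean_var(sizes, means, variances):
    """Combine per-group summaries into the pooled mean and variance.

    Assumes `variances` are population variances (divisor n) and uses the
    group proportions n_g / N as the weights P(Y = g).
    """
    total = sum(sizes)
    weights = [n / total for n in sizes]
    # Law of total expectation: E[X] = E[E[X | Y]]
    total_mean = sum(w * m for w, m in zip(weights, means))
    # Law of total variance: Var(X) = E[Var(X | Y)] + Var(E[X | Y])
    within = sum(w * v for w, v in zip(weights, variances))
    between = sum(w * (m - total_mean) ** 2 for w, m in zip(weights, means))
    return total_mean, within + between

# Hypothetical groups, invented purely for this example
groups = [[1.0, 2.0, 3.0], [10.0, 12.0], [5.0, 5.0, 6.0, 8.0]]
sizes = [len(g) for g in groups]
means = [fmean(g) for g in groups]
variances = [pvariance(g) for g in groups]

mean_agg, var_agg = aggregate_mean_var(sizes, means, variances)

# Compare against the mean and variance computed directly on the pooled data
pooled = [x for g in groups for x in g]
print(math.isclose(mean_agg, fmean(pooled)))
print(math.isclose(var_agg, pvariance(pooled)))
```

The between-group term is what a naive average of the group variances would miss: when group means differ, the pooled variance exceeds the weighted average of the within-group variances.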

241 questions

1 answer

Quantiles from the combination of normal distributions

I have information on the distributions of anthropometric dimensions (like shoulder span) for children of different ages. For each age and dimension, I have the mean and standard deviation. (I also have eight quantiles, but I don't think I'll be able to…

asked Aug 18 '11 at 18:30

Thomas Levine

4 answers

How to aggregate by minute data for a week into hourly means?

How would you get hourly means for multiple data columns, for a daily period, and show results for twelve "Hosts" in the same graph? That is, I'd like to graph what a 24-hour period looks like, for a week's worth of data. The eventual goal would be…

r time-series aggregation

asked Feb 15 '11 at 21:00

Scott Hoffman

6 answers

Fast ways in R to get the first row of a data frame grouped by an identifier

Sometimes I need to get only the first row of a data set grouped by an identifier, as when retrieving age and gender when there are multiple observations per individual. What's a fast (or the fastest) way to do this in R? I used aggregate() below…

r dataset aggregation plyr

asked Mar 04 '11 at 17:17

lockedoff

1 answer

How do you choose a unit of analysis (level of aggregation) in a time series?

If you can measure a time series of observations at any level of precision in time, and your goal of the study is to identify a relationship between X and Y, is there any empirical justification for choosing a specific level of aggregation over…

time-series aggregation disaggregation

asked Feb 16 '11 at 14:47

Andy W

2 answers

What statistics are preserved under aggregation?

If we have a long, high resolution time series, with lots of noise, it often makes sense to aggregate the data to a lower resolution (say, daily to monthly values) to get a better understanding of what's going on, effectively removing some of the…

time-series aggregation

asked Oct 02 '13 at 07:55

naught101

2 answers

Should I run separate regressions for every community, or can community simply be a controlling variable in an aggregated model?

I am running an OLS model with a continuous asset index variable as the DV. My data is aggregated from three similar communities in close geographic proximity to one another. Despite this, I thought it important to use community as a controlling…

regression categorical-data stata multiple-regression aggregation

asked Oct 17 '11 at 12:46

cadamt

6 answers

How to find summary statistics for all unique combinations of factors in a data.frame in R?

I want to calculate a summary of a variable in a data.frame for each unique combination of factors in the data.frame. Should I use plyr to do this? I am OK with using loops as opposed to apply(); so just finding out each unique combination would be…

r categorical-data aggregation plyr

asked Aug 16 '10 at 13:23

humble Student

1 answer

How to combine regression models?

Say I have three data sets of size $n$ each: $y_1$ = heights of people from the US only; $y_2$ = heights of men from the whole world; $y_3$ = heights of women from the whole world. And I build a linear model for each with factors $x_i$, $i = 1,...,…

regression multiple-regression ensemble-learning aggregation

asked Jan 14 '16 at 13:58

J4y

1 answer

Random Forest Probabilistic Prediction vs majority vote

Scikit-learn seems to use probabilistic prediction instead of majority vote for the model aggregation technique, without an explanation as to why (1.9.2.1. Random Forests). Is there a clear explanation for why? Further, is there a good paper or…

random-forest python scikit-learn aggregation bagging

asked Dec 08 '14 at 01:12

user1745038

3 answers

Intraclass correlation and aggregation

Imagine that: You have a sample of 1000 teams each with 10 members. You measured team functioning by asking each team member how well they think their team is functioning using a reliable multi-item numeric scale. You want to describe the extent to…

correlation intraclass-correlation aggregation interpretation effect-size

asked Aug 13 '10 at 12:44

Jeromy Anglim

0 answers

Accuracy of aggregate vs. disaggregate forecasting

I've found a few interesting articles online on this topic, but none which appear to be too cut and dried. My question is about coming up with an accurate predictive forecast based on forecasting individual component parts, then adding them up (or whatever…

forecasting aggregation

asked May 20 '14 at 17:29

user45867

0 answers

What techniques are there available for averaging misaligned multivariate time series?

I want to get an average time series for a set of multivariate (2-3 coordinates) time series. My aim is finding the usual pattern of several processes. I researched the literature a bit and I only reached this paper that showed a DTW based approach…

time-series distance-functions aggregation

asked Jan 26 '17 at 08:36

Jon Nagra

2 answers

Aggregation of Correlation Coefficients (Spearman)

In an analysis of survey data, I have to deal with multilevel/three-dimensional data. Now, I need to aggregate correlation coefficients found on the individual level (between individual rank-orders) and then compare these coefficients. The original…

t-test spearman-rho aggregation

asked Jul 05 '16 at 12:30

BurninLeo

2 answers

What is the terminology for data aggregated via summed totals versus data aggregated via means?

The two types of data differ in that if you decide to decrease the temporal (time) resolution of the first type of data, you take the mean of the lower resolutions. With the second, you take the sum over the lower resolutions. Here is a concrete…

modeling terminology measurement aggregation spatio-temporal

asked May 28 '15 at 15:37

josh

1 answer

How to make a combination (aggregation) of quantile forecast?

Framework. Fix $\alpha\in ]0,1[$. Imagine you have $n$ $\alpha$-quantile forecast methodologies that give you, at time $t$ for look ahead time $t+h$, an estimation of the quantile of wind power. Formally, for $i=1,\dots,n$, you know how to produce…

time-series forecasting quantiles aggregation forecast-combination

asked Feb 15 '11 at 16:09

robin girard
