Questions tagged [partitioning]

A partition is an assignment of every element of a set into one and only one subset, with no empty subsets. That is, no element of the original set (the superset) is unassigned, no element is assigned to more than one subset, and there is no subset without any assigned elements. A common instance of partitioning in statistics is the partitioning of sums of squares for F-tests.
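The definition above can be checked mechanically. The following is a minimal sketch in Python, not an official part of the tag wiki: the function names `is_partition` and `sums_of_squares` and the example data are hypothetical, chosen only for illustration. The second function shows the statistical case mentioned above, where grouping observations partitions the total sum of squares into a between-group and a within-group part.

```python
from collections import Counter


def is_partition(universe, blocks):
    """Return True if `blocks` is a partition of `universe` (hypothetical helper)."""
    universe = set(universe)
    blocks = [set(b) for b in blocks]
    # Condition 1: no subset may be empty.
    if any(len(b) == 0 for b in blocks):
        return False
    # Count how many subsets each element appears in.
    counts = Counter(x for b in blocks for x in b)
    # Condition 2: every element of the universe is assigned (and nothing else is).
    # Condition 3: no element is assigned to more than one subset.
    return set(counts) == universe and all(c == 1 for c in counts.values())


def sums_of_squares(groups):
    """Return (total, between-group, within-group) sums of squares.

    `groups` is a list of non-empty lists of numbers, i.e. a partition
    of the observations into groups.
    """
    values = [x for g in groups for x in g]
    grand_mean = sum(values) / len(values)
    ss_total = sum((x - grand_mean) ** 2 for x in values)
    ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((x - sum(g) / len(g)) ** 2 for g in groups for x in g)
    return ss_total, ss_between, ss_within


print(is_partition({1, 2, 3, 4}, [{1, 2}, {3}, {4}]))     # True
print(is_partition({1, 2, 3, 4}, [{1, 2}, {2, 3}, {4}]))  # False: 2 appears twice
print(sums_of_squares([[1, 2, 3], [4, 5, 6, 7]]))         # (28.0, 21.0, 7.0)
```

In the last call the total sum of squares (28) equals the between-group part (21) plus the within-group part (7), which is the decomposition an F-test compares.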

120 questions

8 answers

How to do community detection in a weighted social network/graph?

I'm wondering if someone could suggest good starting points for performing community detection/graph partitioning/clustering on a graph that has weighted, undirected edges. The graph in question has approximately 3 million…

asked Sep 21 '10 at 15:50

laramichaels

5 answers

How to split dataset for time-series prediction?

I have historic sales data from a bakery (daily, over 3 years). Now I want to build a model to predict future sales (using features like weekday, weather variables, etc.). How should I split the dataset for fitting and evaluating the models? Does…

cross-validation partitioning

asked Sep 30 '14 at 16:23

tobip

3 answers

Data partitioning for spatial data

I am constructing different configurations of a Random Forest in order to investigate the influence of well-design variables and location, on the first-year production volumes of shale oil wells, within a given area in the US. In the different model…

machine-learning random-forest spatial partitioning geostatistics

asked Apr 16 '20 at 17:47

veghokstvd

2 answers

Partitioning trees in R: party vs. rpart

It's been a while since I looked at partitioning trees. Last time I did this sort of thing, I liked party in R (created by Hothorn). The idea of conditional inference via sampling makes sense to me. But rpart also had appeal. In the current…

r cart rpart partitioning

asked Jan 31 '12 at 22:46

Peter Flom

1 answer

Difference in implementation of binary splits in decision trees

I am curious about the practical implementation of a binary split in a decision tree - as it relates to levels of a categorical predictor $X_j$. Specifically, I often will utilize some sort of sampling scheme (e.g. bagging, oversampling, etc.) when…

cart rpart partitioning

asked Nov 06 '11 at 18:20

B_Miner

3 answers

Does Newman's network modularity work for signed, weighted graphs?

The modularity of a graph is defined on its Wikipedia page. In a different post, somebody explained that modularity can easily be computed (and maximized) for weighted networks because the adjacency matrix $A_{ij}$ can as well contain valued ties.…

clustering data-visualization networks partitioning modularity

asked Jan 21 '14 at 13:41

Philip Leifeld

0 answers

What approaches use multiple eigenvectors in graph spectral clustering?

Background: In Newman's PNAS 2006 paper Modularity and community structure in networks, the first eigenvector splits the graph in two clusters, and then each cluster can be further divided by eigenvector of a modified Laplacian of the nodes within…

clustering graph-theory partitioning spectral-analysis modularity

asked Nov 25 '15 at 05:22

highBandWidth

2 answers

Is $R^2$ value valid for insignificant OLS regression model?

I am interested in stating that ___ % of the variance in Y is explained uniquely by $X_1$ and ___ % is explained uniquely by $X_2$. Is there some way to obtain this from a multiple regression model, or do I need to obtain adjusted $R^2$ values…

regression r-squared partitioning

asked May 11 '13 at 19:16

Patrick

1 answer

Estimate the population variance from a set of means

I have a set of measurements which is partitioned into M partitions. However, I only have the partition sizes $N_i$ and the means $\bar{x}_i$ from each partition. Because all measurements are assumed to be from the same distribution, I believe I can…

anova standard-deviation weighted-mean partitioning

asked Mar 20 '12 at 13:37

Hallgeir

2 answers

Newman's modularity clustering for graphs

I am interested in running Newman's modularity clustering algorithm on a large graph. If you can point me to a library (or R package, etc.) that implements it, I would be most grateful.

clustering networks partitioning igraph modularity

asked Aug 19 '10 at 16:09

laramichaels

1 answer

Nested ANOVA: Unequal sample sizes? Variance components?

I am completely out of my depth on this, and all the reading I try to do just confuses me. I'm hoping you can explain things to me in a way that makes sense. (As always seems to be the case, "It shouldn't be this hard!") I'm trying to help a student…

r anova nested-data partitioning

asked Apr 20 '14 at 18:31

Sam R

1 answer

Sampling uniformly from the set of partitions of a set?

In this blogpost, the writer states "It’s easy to sample uniformly from the set of partitions of a set: you pick a number of bins using an appropriate exponential distribution, then randomly i.i.d. toss each element of the set into one of those…

python uniform-distribution combinatorics partitioning

asked Nov 24 '20 at 17:26

iaskdumbstuff

1 answer

Interpreting output of igraph's fastgreedy.community clustering method

With the help of several people in this community I have been getting my feet wet in clustering some social network data using igraph's implementation of modularity-based clustering. I am having some trouble interpreting the output of this routine and…

clustering networks partitioning igraph modularity

asked Sep 22 '10 at 18:49

laramichaels

2 answers

R procedure for comparing multiple categorical variables (similar to anova() followed by t.test() for continuous)?

Big Picture: How can I implement partitioned Chi Square in R? I understand how to perform the overall Chi square, and then how to get individual parameters (observed counts, expected counts, residuals, etc.). However, I don't understand how to get…

r chi-squared-test partitioning

asked Feb 06 '14 at 19:06

sudo make install

0 answers

Variance partitioning - why be cautious?

I'm about to use variance partitioning to interpret the results of a given model and across models, and have come across various criticisms of it, most notably by Pedhazur (1982, 1997). Also, the criticisms are of both the approaches to VP -…

multiple-regression multicollinearity partitioning

asked Sep 15 '13 at 17:46

Ph8
