Questions tagged [generalization]
19 questions
5
votes
1 answer
Is there a multivariate joint Amoroso distribution?
The Amoroso distribution is a remarkable feat of abstraction as it exactly or asymptotically generalizes dozens of named probability distributions. Is there a published/pre-published treatment of multivariate Amoroso distributions? Either the…

DifferentialPleiometry
4
votes
0 answers
What are the tradeoffs of using the generalized $f$-mean?
The generalized $f$-mean is a generalization of multiple estimators, and it even generalizes the generalized mean.
For some invertible function $f$ and an $n$-dimensional vector $\vec{x}$, it is given as:
$$M_f(\vec{x}) \triangleq f^{-1} \left( \frac{1}{n}…
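The formula in this excerpt is cut off, so the following is only a minimal sketch that assumes the standard quasi-arithmetic definition, $f^{-1}$ applied to the arithmetic mean of the $f(x_i)$. The function name `f_mean` and its parameters are hypothetical and are not taken from the question.

```python
import math
from typing import Callable, Sequence


def f_mean(
    xs: Sequence[float],
    f: Callable[[float], float],
    f_inv: Callable[[float], float],
) -> float:
    """Quasi-arithmetic mean: f_inv of the arithmetic mean of f(x_i).

    Assumes f is invertible on the range of xs and that f_inv is its inverse.
    """
    if not xs:
        raise ValueError("xs must be non-empty")
    return f_inv(sum(f(x) for x in xs) / len(xs))


xs = [1.0, 2.0, 4.0]

# f(x) = x gives the arithmetic mean.
arithmetic = f_mean(xs, lambda x: x, lambda y: y)

# f(x) = log(x) gives the geometric mean (2 here, up to rounding).
geometric = f_mean(xs, math.log, math.exp)

# f(x) = 1/x gives the harmonic mean.
harmonic = f_mean(xs, lambda x: 1.0 / x, lambda y: 1.0 / y)
```

Passing `f` and `f_inv` as a pair keeps the sketch free of numerical inversion; the caller is responsible for supplying a matching inverse.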

DifferentialPleiometry
3
votes
1 answer
Mean of Generalization of the Dirichlet Distribution
I know that if $X_{1}, X_{2}, \ldots, X_{n}$ are independent $\mathrm{Gamma}(\alpha_{i},\theta)$-distributed variables (notice they all have the same scale parameter $\theta$) and
$Y_{i}=\frac{X_{i}}{\sum_{j=1}^{n}X_{j}}$
then…
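For the equal-scale case described in this excerpt, the normalized variables follow a Dirichlet distribution with mean $\alpha_i / \sum_j \alpha_j$. The sketch below checks that standard result by simulation. It does not address the generalization the question asks about, and the function names and parameter values are hypothetical choices for illustration.

```python
import random


def dirichlet_mean(alphas):
    """Exact mean of a Dirichlet(alphas) vector: alpha_i / sum(alphas)."""
    total = sum(alphas)
    return [a / total for a in alphas]


def simulate_normalized_gammas(alphas, theta, n_draws, seed=0):
    """Monte Carlo mean of Y_i = X_i / sum_j X_j with X_i ~ Gamma(alpha_i, theta).

    random.gammavariate takes the shape first and the scale second.
    """
    rng = random.Random(seed)
    sums = [0.0] * len(alphas)
    for _ in range(n_draws):
        xs = [rng.gammavariate(a, theta) for a in alphas]
        s = sum(xs)
        for i, x in enumerate(xs):
            sums[i] += x / s
    return [v / n_draws for v in sums]


alphas = [1.0, 2.0, 3.0]  # illustrative shape parameters
exact = dirichlet_mean(alphas)
approx = simulate_normalized_gammas(alphas, theta=2.5, n_draws=20000)
```

Because the common scale cancels in the ratio, changing `theta` should leave `approx` close to `exact`; only the shapes matter in this special case.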

bbecon
2
votes
1 answer
What does one imply by the term "overgeneralization" in machine learning?
I know what overfitting and underfitting mean in a machine learning context, and what generalisation means as well. But recently I was introduced to an uncommon term, "overgeneralization", in the context of fitting. What should this term relate to?…

Umang Agarwal
2
votes
0 answers
Cannot achieve generalization of machine learning model
I'm working on a balanced, binary classification problem in a time-series (financial) dataset. I am using K-fold cross validation that is adapted for time-series (so that I'm never using future data to predict past data).
I have tried many…
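The poster's exact splitting scheme is not shown. As a rough illustration of cross-validation that never trains on future data, here is a minimal expanding-window sketch. The function name `forward_chaining_splits` is hypothetical, and the approach is similar in spirit to scikit-learn's `TimeSeriesSplit` rather than a copy of it.

```python
def forward_chaining_splits(n_samples, n_splits):
    """Yield (train_idx, test_idx) pairs for time-ordered data.

    Every training index precedes every test index in the same split,
    and the training window grows by one fold at each step.
    """
    fold = n_samples // (n_splits + 1)
    if fold == 0:
        raise ValueError("not enough samples for the requested number of splits")
    for k in range(1, n_splits + 1):
        train_end = k * fold
        # The last test fold absorbs any leftover samples.
        test_end = n_samples if k == n_splits else train_end + fold
        yield list(range(train_end)), list(range(train_end, test_end))


# Usage: 10 time-ordered observations, 4 splits.
for train_idx, test_idx in forward_chaining_splits(10, 4):
    print("train:", train_idx, "test:", test_idx)
```

An expanding window is only one option; a fixed-length rolling window may suit data whose distribution drifts over time.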

Vladimir Belik
2
votes
1 answer
Statistics terminology: $n$-way and $m$-sample
In statistics, I see certain things described by "$n$-way" or "$m$-sample." For example, there is "$n$-way" ANOVA for any $n$ and "$m$-sample" t-tests for $m=1,2$. I want to get a handle on what these descriptors mean in general. It seems to me like…
user179309
1
vote
0 answers
Overfitting, generalization, data augmentation, regularization, how do they relate to each other? How to measure?
Recent work such as Deep Double Descent shows that overfitting is not really a problem with large models, even without any data augmentation or regularization (L2 weight norm, dropout or so).
Edit: Ok, maybe this is a wrong conclusion from this…

Albert
1
vote
0 answers
External loss functions for Spectral/Density-based clustering
In this article, Abou-Moustafa and Schuurmans proposed a method that makes it easy to decide which unsupervised learning algorithm generalizes 'better' to the entire dataset. In particular, this needs some external loss function l to measure…

drommedaris
1
vote
0 answers
Metafor Package: How to conduct Meta Regression with reliability generalization
How do I conduct a meta-regression in "metafor" after obtaining an I² heterogeneity of 94%? My study is a reliability generalization of Cronbach's alpha, with continuous and categorical moderator variables.
Thanks all.

Wahyu Syahputra
1
vote
2 answers
SVM Model: What's a healthy number of support vectors?
For an SVM model, what is a healthy number of support vectors? Or, more precisely, what's a good ratio of the number of support vectors to the total number of training samples: 10%, 20%, 30%, 50% ... 80%? Is there a general consensus on this?
By healthy I…

SkyWalker
1
vote
1 answer
Does up-sampling lead to lots of false positives in production?
Say we have a dataset with a binary outcome variable that takes the positive case (outcome = 1) roughly 20% of the time. Often, we would modify the training set by down-sampling the 0's such that the training set has something like a 50/50 split in…

AmeySMahajan
0
votes
0 answers
Clarification of line in proof of consistency theorem (Vapnik)
In Vapnik's Statistical Learning Theory (1998 edition) on pages 89-92, he proves a "key theorem of learning theory" that states the conditions for when:
"the following two statements are equivalent:
For the given distribution function F(z), the…
0
votes
0 answers
Consequences of paired healthy and diseased samples in machine learning
Consider a set-up in which we are using machine learning to classify between healthy and diseased samples. Obtaining the data requires some invasive procedure - therefore all the healthy samples come from the same patients as the diseased samples,…

N Blake
0
votes
0 answers
Rule of thumb for removing / keeping attribute based on occurrence frequencies among training observations?
I have a training dataset expressed with binary values, where 1 indicates an attribute is used in an observation, and 0 indicates it is not.
I was wondering if I should remove an attribute from the training vector if it is used by only a small number…

GabiX
0
votes
0 answers
Is it possible to know whether a linear SVM is overfitting from the features' weight and value distribution in training?
I have a text sentiment classification model trained using a linear SVM on 2500 training instances with around 14000 features (words). Every sample is represented as a binary vector, with 1 indicating the presence of a word and 0 indicating its absence…

GabiX