The process of fitting a statistical model to a particular set of data. This is mostly done on a computer, using numerical methods such as optimization, numerical integration, or simulation.
Questions tagged [fitting]
731 questions
29 votes, 6 answers
Fit a sinusoidal term to data
Although I read this post, I still have no idea how to apply this to my own data and hope that someone can help me out.
I have the following data:
y <- c(11.622967, 12.006081, 11.760928, 12.246830, 12.052126, 12.346154, 12.039262, 12.362163,…

Pascal
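One common way to set up a fit like this, sketched here under the assumption that the period $T$ is known in advance (the excerpt does not say so): a sinusoid with unknown amplitude and phase can be rewritten as a model that is linear in its coefficients, so ordinary least squares applies. The symbols $a$, $b$, $c$, $A$, $\varphi$ and $T$ are illustrative and do not come from the question.

$$
y_t = a + A\sin\!\left(\frac{2\pi t}{T} + \varphi\right) + \varepsilon_t
    = a + b\,\sin\frac{2\pi t}{T} + c\,\cos\frac{2\pi t}{T} + \varepsilon_t,
\qquad b = A\cos\varphi,\quad c = A\sin\varphi
$$

$$
A = \sqrt{b^2 + c^2}, \qquad \varphi = \operatorname{atan2}(c,\, b)
$$

If the period is also unknown, the model is no longer linear in its parameters and a nonlinear least-squares routine would be needed instead.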
25 votes, 2 answers
Fitting custom distributions by MLE
My question relates to fitting custom distributions in R but I feel it has enough of a probability element to remain on CV.
I have an interesting set of data which has the following characteristics:
Large mass at zero
Sizeable mass below a…

epp
22 votes, 1 answer
Detecting outliers in count data
I have what I naively thought to be a fairly straightforward problem that involves outlier detection for many different sets of count data. Specifically, I want to determine if one or more values in a series of count data is higher or lower than…

Joe Gomphus
20 votes, 1 answer
When an analytical Jacobian is available, is it better to approximate the Hessian by $J^TJ$, or by finite differences of the Jacobian?
Let's say I'm computing some model parameters by minimizing the sum of squared residuals, and I'm assuming my errors are Gaussian. My model produces analytical derivatives, so the optimizer does not need to use finite differences. Once the fit is…

Colin K
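For context on the two options in this question, the relationship between them can be written out as a short derivation. This is a sketch in notation chosen here, not taken from the thread: $r_i(\theta)$ are the residuals, $J$ is their Jacobian, and $S$ is the least-squares objective with a conventional factor of one half.

$$
S(\theta) = \tfrac12 \sum_i r_i(\theta)^2,
\qquad
\nabla S = J^T r,
\qquad
\nabla^2 S = J^T J + \sum_i r_i\, \nabla^2 r_i
$$

The $J^TJ$ (Gauss-Newton) approximation drops the second term, which is small when the residuals are small or the model is nearly linear in $\theta$. Finite-differencing the analytical gradient $J^T r$ approximates the full Hessian, including that term, at the cost of truncation and rounding error.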
18 votes, 4 answers
Fitting t-distribution in R: scaling parameter
How do I fit the parameters of a t-distribution, i.e. the parameters corresponding to the 'mean' and 'standard deviation' of a normal distribution? I assume they are called 'mean' and 'scaling/degrees of freedom' for a t-distribution?
The following…

user12719
18 votes, 1 answer
MLE vs least squares in fitting probability distributions
The impression that I got, based on several papers, books and articles that I've read, is that the recommended way of fitting a probability distribution to a set of data is by using maximum likelihood estimation (MLE). However, as a physicist, a…

Christian Alis
16 votes, 4 answers
What does interpolating the training set actually mean?
I just read this article: Understanding Deep Learning (Still) Requires Rethinking Generalization
In section 6.1 I stumbled upon the following sentence
Specifically, in the overparameterized regime where the model capacity
greatly exceeds the…

Samuel
16 votes, 1 answer
How to minimize residual sum of squares of an exponential fit?
I have the following data and would like to fit a negative exponential growth model to it:
Days <- c( 1,5,12,16,22,27,36,43)
Emissions <- c( 936.76, 1458.68, 1787.23, 1840.04, 1928.97, 1963.63, 1965.37, 1985.71)
plot(Days, Emissions)
fit <-…

Strohmi
16 votes, 5 answers
Why does linear regression use a cost function based on the vertical distance between the hypothesis and the input data point?
Let’s say we have the input (predictor) and output (response) data points A, B, C, D, E and we want to fit a line through the points. This is a simple problem to illustrate the question, but can be extended to higher dimensions as well.
Problem…

alpha_989
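To make the distinction in this question concrete, here is a minimal sketch comparing the slope that minimizes vertical distances (ordinary least squares) with the slope that minimizes perpendicular distances (total least squares, also called orthogonal regression). The function names and the data points are hypothetical, made up for illustration; nothing here comes from the thread's answers.

```python
import math

def centered_sums(xs, ys):
    """Return (Sxx, Sxy, Syy), the centered sums of squares and cross-products."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    syy = sum((y - my) ** 2 for y in ys)
    return sxx, sxy, syy

def ols_slope(xs, ys):
    """Slope minimizing the sum of squared vertical distances."""
    sxx, sxy, _ = centered_sums(xs, ys)
    return sxy / sxx

def tls_slope(xs, ys):
    """Slope minimizing the sum of squared perpendicular distances.

    Assumes Sxy != 0 (the points are not perfectly uncorrelated).
    """
    sxx, sxy, syy = centered_sums(xs, ys)
    d = syy - sxx
    return (d + math.sqrt(d * d + 4.0 * sxy * sxy)) / (2.0 * sxy)

# Hypothetical points, made up for illustration
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [1.2, 1.9, 3.2, 3.8, 5.1]
print("OLS slope:", ols_slope(xs, ys))
print("TLS slope:", tls_slope(xs, ys))
```

When the points lie exactly on a line the two slopes coincide; with noise, the perpendicular-distance slope is never smaller in magnitude than the vertical-distance one. The vertical criterion treats the predictor as error-free, which is the usual regression assumption.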
16 votes, 5 answers
Computing the mode of data sampled from a continuous distribution
What are the best methods for fitting the 'mode' of data sampled from a continuous distribution?
Since the mode is technically undefined (right?) for a continuous distribution, I'm really asking 'how do you find the most common value'?
If you…

keflavich
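One possible approach to the "most common value" of continuous data, sketched here as an illustration rather than as the thread's recommended answer, is to smooth the sample with a kernel density estimate and take the location of its peak. The function name `kde_mode`, the grid size, the bandwidth rule, and the sample values are all assumptions made for this example.

```python
import math
import statistics

def kde_mode(data, grid_size=512):
    """Estimate the mode as the grid point that maximizes a Gaussian KDE."""
    n = len(data)
    # Normal-reference rule-of-thumb bandwidth; an assumption, not a tuned choice
    h = 1.06 * statistics.stdev(data) * n ** (-0.2)
    lo, hi = min(data), max(data)
    step = (hi - lo) / (grid_size - 1)
    grid = [lo + i * step for i in range(grid_size)]

    def density(x):
        # Unnormalized density: constant factors do not change the argmax
        return sum(math.exp(-0.5 * ((x - d) / h) ** 2) for d in data)

    return max(grid, key=density)

# Hypothetical sample: a cluster near 5 plus two stray values
sample = [1.0, 4.8, 4.9, 5.0, 5.0, 5.1, 5.2, 9.0]
print("estimated mode:", kde_mode(sample))
```

The result depends on the bandwidth: a wider kernel gives a smoother, more stable estimate, while a narrower one follows local clumps in the data.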
15 votes, 3 answers
Can I use Kolmogorov-Smirnov test and estimate distribution parameters?
I've read that the Kolmogorov-Smirnov test should not be used to test the goodness of fit of a distribution whose parameters have been estimated from the sample.
Does it make sense to split my sample in two and use the first half for parameter estimation…

sortega
15 votes, 3 answers
How can I fit a spline to data that contains values and 1st/2nd derivatives?
I have a dataset that contains, let's say, some measurements for position, speed and acceleration. All come from the same "run". I could construct a linear system and fit a polynomial to all of those measurements.
But can I do the same with splines?…

dani
15 votes, 3 answers
How can I programmatically detect segments of a data series to fit with different curves?
Are there any documented algorithms to separate sections of a given dataset into different curves of best fit?
For example, most humans looking at this chart of data would readily divide it into 3 parts: a sinusoidal segment, a linear segment, and…

whybird
14 votes, 2 answers
ARIMA vs ARMA on the differenced series
In R (2.15.2) I fitted an ARIMA(3,1,3) to a time series and an ARMA(3,3) to the once-differenced time series. The fitted parameters differ, which I attributed to the fitting method in ARIMA.
Also, fitting an ARIMA(3,0,3) on the same data…

user1965813
13 votes, 2 answers
When fitting a curve, how do I calculate the 95% confidence interval for my fitted parameters?
I am fitting curves to my data to extract one parameter. However, I am unsure how certain that parameter is and how I would calculate and express its 95% confidence interval.
Say for a dataset containing data that exponentially decays, I…

Leo
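As a rough illustration of the kind of interval this question asks about, the sketch below fits an exponential decay $y = A e^{-kt}$ by linear least squares on the log scale and reports an approximate 95% interval for the rate $k$. This is a swapped-in simplification, not the asker's method: it assumes all $y > 0$ and multiplicative noise, and it uses a normal quantile as a large-sample approximation (a $t$ quantile with $n - 2$ degrees of freedom is more usual for small samples). The function name and the data are hypothetical.

```python
import math
from statistics import NormalDist

def decay_rate_ci(ts, ys, level=0.95):
    """Fit y = A * exp(-k t) on the log scale; return (k, lower, upper)."""
    n = len(ts)
    logs = [math.log(y) for y in ys]  # requires all y > 0
    mt = sum(ts) / n
    ml = sum(logs) / n
    stt = sum((t - mt) ** 2 for t in ts)
    # Ordinary least-squares slope and intercept of log(y) against t
    slope = sum((t - mt) * (l - ml) for t, l in zip(ts, logs)) / stt
    intercept = ml - slope * mt
    # Residual variance and standard error of the slope
    rss = sum((l - intercept - slope * t) ** 2 for t, l in zip(ts, logs))
    se = math.sqrt(rss / (n - 2) / stt)
    # Normal quantile: a large-sample approximation (assumption)
    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    k = -slope
    return k, k - z * se, k + z * se

# Hypothetical measurements, made up for illustration
ts = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [3.1, 1.7, 1.2, 0.6, 0.45]
k, lower, upper = decay_rate_ci(ts, ys)
print(f"k = {k:.3f}, approx. 95% CI = ({lower:.3f}, {upper:.3f})")
```

For a nonlinear least-squares fit on the original scale, the analogous interval comes from the estimated parameter covariance matrix that most fitting routines return.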