Questions tagged [fitting]

The process of fitting a statistical model to a particular set of data. Mostly done on a computer, using numerical methods such as optimization, numerical integration, or simulation.

731 questions
29
votes
6 answers

Fit a sinusoidal term to data

Although I read this post, I still have no idea how to apply this to my own data and hope that someone can help me out. I have the following data: y <- c(11.622967, 12.006081, 11.760928, 12.246830, 12.052126, 12.346154, 12.039262, 12.362163,…
Pascal
  • 413
  • 1
  • 5
  • 6
25
votes
2 answers

Fitting custom distributions by MLE

My question relates to fitting custom distributions in R but I feel it has enough of a probability element to remain on CV. I have an interesting set of data which has the following characteristics: a large mass at zero; a sizeable mass below a…
epp
  • 2,372
  • 2
  • 12
  • 31
22
votes
1 answer

Detecting outliers in count data

I have what I naively thought to be a fairly straightforward problem that involves outlier detection for many different sets of count data. Specifically, I want to determine if one or more values in a series of count data is higher or lower than…
Joe Gomphus
  • 221
  • 1
  • 2
  • 3
20
votes
1 answer

When an analytical Jacobian is available, is it better to approximate the Hessian by $J^TJ$, or by finite differences of the Jacobian?

Let's say I'm computing some model parameters by minimizing the sum of squared residuals, and I'm assuming my errors are Gaussian. My model produces analytical derivatives, so the optimizer does not need to use finite differences. Once the fit is…
Colin K
  • 477
  • 3
  • 9
18
votes
4 answers

Fitting t-distribution in R: scaling parameter

How do I fit the parameters of a t-distribution, i.e. the parameters corresponding to the 'mean' and 'standard deviation' of a normal distribution? I assume they are called 'mean' and 'scaling/degrees of freedom' for a t-distribution? The following…
user12719
  • 1,009
  • 1
  • 8
  • 10
18
votes
1 answer

MLE vs least squares in fitting probability distributions

The impression that I got, based on several papers, books and articles that I've read, is that the recommended way of fitting a probability distribution to a set of data is by using maximum likelihood estimation (MLE). However, as a physicist, a…
16
votes
4 answers

What does interpolating the training set actually mean?

I just read this article: Understanding Deep Learning (Still) Requires Rethinking Generalization. In section 6.1 I stumbled upon the following sentence: Specifically, in the overparameterized regime where the model capacity greatly exceeds the…
Samuel
  • 585
  • 4
  • 15
16
votes
1 answer

How to minimize residual sum of squares of an exponential fit?

I have the following data and would like to fit a negative exponential growth model to it: Days <- c( 1,5,12,16,22,27,36,43) Emissions <- c( 936.76, 1458.68, 1787.23, 1840.04, 1928.97, 1963.63, 1965.37, 1985.71) plot(Days, Emissions) fit <-…
Strohmi
  • 815
  • 1
  • 10
  • 13
16
votes
5 answers

Why does linear regression use a cost function based on the vertical distance between the hypothesis and the input data point?

Let’s say we have the input (predictor) and output (response) data points A, B, C, D, E and we want to fit a line through the points. This is a simple problem to illustrate the question, but can be extended to higher dimensions as well. Problem…
alpha_989
  • 283
  • 3
  • 10
16
votes
5 answers

Computing the mode of data sampled from a continuous distribution

What are the best methods for fitting the 'mode' of data sampled from a continuous distribution? Since the mode is technically undefined (right?) for a continuous distribution, I'm really asking 'how do you find the most common value'? If you…
keflavich
  • 285
  • 2
  • 8
15
votes
3 answers

Can I use Kolmogorov-Smirnov test and estimate distribution parameters?

I've read that the Kolmogorov-Smirnov test should not be used to test the goodness of fit of a distribution whose parameters have been estimated from the sample. Does it make sense to split my sample in two and use the first half for parameter estimation…
sortega
  • 251
  • 2
  • 5
15
votes
3 answers

How can I fit a spline to data that contains values and 1st/2nd derivatives?

I have a dataset that contains, let's say, some measurements for position, speed and acceleration. All come from the same "run". I could construct a linear system and fit a polynomial to all of those measurements. But can I do the same with splines?…
dani
  • 203
  • 1
  • 8
15
votes
3 answers

How can I programmatically detect segments of a data series to fit with different curves?

Are there any documented algorithms to separate sections of a given dataset into different curves of best fit? For example, most humans looking at this chart of data would readily divide it into 3 parts: a sinusoidal segment, a linear segment, and…
whybird
  • 203
  • 1
  • 10
14
votes
2 answers

ARIMA vs ARMA on the differenced series

In R (2.15.2) I fitted an ARIMA(3,1,3) to a time series and an ARMA(3,3) to the once-differenced time series. The fitted parameters differ, which I attributed to the fitting method in ARIMA. Also, fitting an ARIMA(3,0,3) on the same data…
user1965813
  • 253
  • 2
  • 7
13
votes
2 answers

When fitting a curve, how do I calculate the 95% confidence interval for my fitted parameters?

I am fitting curves to my data to extract one parameter. However, I am unsure what the certainty of that parameter is and how I would calculate / express its 95% confidence interval. Say for a dataset containing data that exponentially decays, I…
Leo
  • 465
  • 1
  • 5
  • 18