Questions tagged [computational-statistics]

Computational statistics refers to the interface of statistics and computing: the use of algorithms and software for statistical purposes.

648 questions
167 votes, 21 answers

Does Julia have any hope of sticking in the statistical community?

I recently read a post on R-Bloggers that linked to a blog post by John Myles White about a new language called Julia. Julia takes advantage of a just-in-time compiler that gives it wicked fast run times and puts it on the same order of…
Christopher Aden
79 votes, 9 answers

What algorithm should I use to detect anomalies on time-series?

Background: I work in a Network Operations Center, where we monitor computer systems and their performance. One of the key metrics to monitor is the number of visitors/customers currently connected to our servers. To make it visible we (Ops team) collect…
52 votes, 8 answers

Excel as a statistics workbench

It seems that lots of people (including me) like to do exploratory data analysis in Excel. Some limitations, such as the number of rows allowed in a spreadsheet, are a pain but in most cases don't make it impossible to use Excel to play around with…
Carlos Accioly
50 votes, 6 answers

What algorithm is used in linear regression?

I usually hear about "ordinary least squares". Is that the most widely used algorithm for linear regression? Are there reasons to use a different one?
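For the single-predictor case, the least-squares fit the question mentions has a closed form. The following is a minimal sketch in plain Python, not a description of how any particular statistics package solves the problem (R's `lm`, for example, relies on a QR decomposition rather than these formulas). The function name `ols_fit` and the sample data are illustrative assumptions.

```python
def ols_fit(xs, ys):
    """Return (intercept, slope) minimizing the sum of squared residuals."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x, and of cross-deviations of x and y.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("x values must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Illustrative data lying exactly on y = 1 + 2x.
intercept, slope = ols_fit([0, 1, 2, 3], [1, 3, 5, 7])
print(intercept, slope)
```

With several predictors the same criterion is usually solved with matrix methods, which is where the choice of algorithm (normal equations, QR, SVD, iterative methods) starts to matter.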
41 votes, 4 answers

How to sample from a normal distribution with known mean and variance using a conventional programming language?

I've never had a course in statistics, so I hope I'm asking in the right place here. Suppose I have only two quantities describing a normal distribution: the mean $\mu$ and variance $\sigma^2$. I want to use a computer to randomly sample from this…
Fixee
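One common way to do what this question describes, assuming only a uniform random number generator is available, is the Box-Muller transform. The sketch below is one possible approach rather than the accepted answer; the function name `sample_normal` and its parameters are illustrative assumptions. Note that it takes the variance and converts it to a standard deviation internally.

```python
import math
import random


def sample_normal(mu, sigma2, rng=random):
    """Draw one sample from N(mu, sigma2) using the Box-Muller transform."""
    if sigma2 < 0:
        raise ValueError("variance must be non-negative")
    u1 = 1.0 - rng.random()  # lies in (0, 1], which avoids log(0)
    u2 = rng.random()
    # z is a standard normal draw built from two independent uniforms.
    z = math.sqrt(-2.0 * math.log(u1)) * math.cos(2.0 * math.pi * u2)
    # Shift and scale: multiply by the standard deviation, not the variance.
    return mu + math.sqrt(sigma2) * z


rng = random.Random(0)  # seeded generator for reproducibility
draws = [sample_normal(5.0, 4.0, rng) for _ in range(5)]
print(draws)
```

In Python itself, the standard library's `random.gauss(mu, sigma)` already does this; it expects the standard deviation rather than the variance.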
32 votes, 13 answers

If R were reprogrammed from scratch today, what changes would be most useful to the statistics community?

Many people in the statistics community and other academic fields use R as their primary language for data analysis and statistical computing. It is a wonderful and versatile language that has become extremely popular across both academic and…
Ben
30 votes, 12 answers

Command-line tool to calculate basic statistics for stream of values

Is there any command-line tool that accepts a stream of numbers (in ASCII format) from standard input and gives the basic descriptive statistics for this stream, such as min, max, average, median, RMS, quantiles, etc.? The output is welcome to be…
mbaitoff
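As a rough illustration of what such a tool has to do, here is a minimal sketch using only Python's standard library. It is not an existing utility: the names `describe` and `main`, the output layout, and the file name `stats.py` mentioned below are all assumptions made for the example. It reads every value into memory, so it suits moderate input sizes rather than unbounded streams.

```python
import math
import statistics
import sys


def describe(values):
    """Return basic descriptive statistics for a list of at least two floats."""
    if len(values) < 2:
        raise ValueError("need at least two values")
    # Quartiles with the standard library's default ("exclusive") method.
    q1, _, q3 = statistics.quantiles(values, n=4)
    return {
        "n": len(values),
        "min": min(values),
        "max": max(values),
        "mean": statistics.fmean(values),
        "median": statistics.median(values),
        "rms": math.sqrt(statistics.fmean([v * v for v in values])),
        "q1": q1,
        "q3": q3,
    }


def main(stream=sys.stdin):
    """Parse whitespace-separated numbers from `stream` and print a summary."""
    values = [float(token) for line in stream for token in line.split()]
    for name, value in describe(values).items():
        print(f"{name}\t{value:g}")
```

Saved as a hypothetical `stats.py` with a final call to `main()`, it could be fed from a pipe, for example `seq 1 100 | python stats.py`.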
29 votes, 5 answers

What are examples of statistical experiments that allow the calculation of the golden ratio?

There are some very simple experiments that can be done by a kid at home, whose results allow one to statistically approximate famous numbers such as $\pi$ or $e$. An example where $\pi$ shows up is perhaps the most famous one of its kind. In Buffon's…
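For readers unfamiliar with the $\pi$ experiment the excerpt alludes to, the classical Buffon's needle result is that a needle of length $L$ dropped on lines spaced $d \ge L$ apart crosses a line with probability $2L/(\pi d)$. The sketch below simulates that under stated assumptions: the function name and parameters are illustrative, and the needle direction is drawn by rejection sampling in a quarter disc so that the simulation does not itself use the value of $\pi$.

```python
import math
import random


def estimate_pi_buffon(trials, rng, needle=1.0, spacing=1.0):
    """Estimate pi by simulating Buffon's needle (requires needle <= spacing)."""
    if needle > spacing:
        raise ValueError("needle length must not exceed line spacing")
    hits = 0
    for _ in range(trials):
        # Distance from the needle's centre to the nearest line.
        y = rng.random() * spacing / 2
        # Uniform random direction via a point in the unit quarter disc.
        while True:
            a, b = rng.random(), rng.random()
            r2 = a * a + b * b
            if 0 < r2 <= 1:
                break
        sin_theta = b / math.sqrt(r2)
        # The needle crosses a line when its half-projection reaches it.
        if y <= (needle / 2) * sin_theta:
            hits += 1
    if hits == 0:
        raise ValueError("no crossings observed; increase the number of trials")
    # P(cross) = 2 * needle / (pi * spacing), solved for pi.
    return 2 * needle * trials / (spacing * hits)


print(estimate_pi_buffon(200_000, random.Random(42)))
```

The estimate converges slowly, with an error that shrinks roughly like one over the square root of the number of drops.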
29 votes, 7 answers

Statistics concept to explain why you're less likely to flip the same number of heads as tails, as the number of flips increases?

I'm working on learning probability and statistics by reading a few books and writing some code, and while simulating coin flips I noticed something that struck me as slightly counter to one's naive intuition. If you flip a fair coin $n$ times,…
mindcrime
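The effect named in the question title can be checked directly: for an even number of flips $n$, the probability of exactly $n/2$ heads is $\binom{n}{n/2} / 2^n$, which decreases as $n$ grows. The sketch below computes that exact value and compares it with a simulation; the function names are illustrative assumptions and are not taken from the question.

```python
import math
import random


def prob_equal_heads_tails(n):
    """Exact probability of exactly n/2 heads in n fair flips (0 if n is odd)."""
    if n % 2:
        return 0.0
    return math.comb(n, n // 2) / 2 ** n


def simulate_equal(n, trials, rng):
    """Estimate the same probability by simulating `trials` runs of n flips."""
    hits = 0
    for _ in range(trials):
        heads = sum(rng.random() < 0.5 for _ in range(n))
        hits += (2 * heads == n)
    return hits / trials


rng = random.Random(1)
for n in (2, 10, 100):
    print(n, prob_equal_heads_tails(n), simulate_equal(n, 5000, rng))
```

The exact values are 0.5 for 2 flips and 252/1024 (about 0.246) for 10 flips; the proportion of heads still concentrates around one half, but hitting the midpoint exactly becomes rarer.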
27 votes, 2 answers

How could stochastic gradient descent save time compared to standard gradient descent?

Standard Gradient Descent would compute the gradient for the entire training dataset.

    for i in range(nb_epochs):
        params_grad = evaluate_gradient(loss_function, data, params)
        params = params - learning_rate * params_grad

For a pre-defined number of…
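To make the contrast concrete, here is a small runnable sketch comparing the full-batch loop from the excerpt with a per-example (stochastic) update. It assumes a toy one-parameter model $y = wx$ with squared-error loss and noiseless data; the names `gradient`, `batch_gd`, and `sgd` are illustrative and are not the question's `evaluate_gradient`.

```python
import random


def gradient(w, batch):
    """Gradient of the mean squared error of the model y = w * x over a batch."""
    return sum(2 * (w * x - y) * x for x, y in batch) / len(batch)


def batch_gd(data, lr, epochs, w=0.0):
    """Full-batch gradient descent: one parameter update per pass over the data."""
    for _ in range(epochs):
        w -= lr * gradient(w, data)
    return w


def sgd(data, lr, epochs, rng, w=0.0):
    """Stochastic gradient descent: one update per example, len(data) per pass."""
    data = list(data)
    for _ in range(epochs):
        rng.shuffle(data)
        for example in data:
            w -= lr * gradient(w, [example])
    return w


# Toy data generated from y = 3x, so the optimum is w = 3.
data = [(i / 10, 3 * i / 10) for i in range(1, 11)]
print(batch_gd(data, lr=0.1, epochs=20))
print(sgd(data, lr=0.1, epochs=20, rng=random.Random(0)))
```

Both loops touch each example once per epoch, but the stochastic version makes many more parameter updates for the same amount of gradient computation, so on this toy problem it gets much closer to the optimum in the same number of passes. With noisy data and a constant learning rate, SGD would instead fluctuate around the optimum.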
23 votes, 4 answers

C++ libraries for statistical computing

I've got a particular MCMC algorithm which I would like to port to C/C++. Much of the expensive computation is in C already via Cython, but I want to have the whole sampler written in a compiled language so that I can just write wrappers for…
JMS
20 votes, 3 answers

Julia: Taking stock of how it has been doing

I came across a 2012 question that had a very good discussion about Julia as an alternative to R / Python for various types of statistical work. Here is the original question from 2012 about Julia's promise. Unfortunately Julia was very new back…
curious_cat
19 votes, 4 answers

What kinds of statistical problems are likely to benefit from quantum computing?

We are at the advent of quantum computing, with quantum languages, anticipating hardware quantum computers, now available at high and low levels for simulated quantum computers. Quantum computing brings new elementary functions like entanglement and…
Alexis
19 votes, 3 answers

What is the difference between bagging and random forest if only one explanatory variable is used?

"The fundamental difference between bagging and random forest is that in random forests, only a subset of features is selected at random out of the total and the best split feature from the subset is used to split each node in a tree, unlike in…
19 votes, 2 answers

How do ABC and MCMC differ in their applications?

To my understanding, Approximate Bayesian Computation (ABC) and Markov Chain Monte Carlo (MCMC) have very similar aims. Below I describe my understanding of these methods and how I perceive the differences in their application to real life…
Remi.b