Questions tagged [compression]

Data compression is a process used to reduce the number of bits used to store a "message". Compression can be lossless or lossy. Lossy compression is an option for audio and visual data, whereas many other applications require lossless compression.

47 questions
26
votes
4 answers

Kullback-Leibler divergence WITHOUT information theory

After much trawling of Cross Validated, I still don't feel like I'm any closer to understanding KL divergence outside of the realm of information theory. It's rather odd as somebody with a Math background to find it much easier to understand the…
8
votes
1 answer

Comparison of entropy and distribution of bytes in compressed/encrypted data

I have a question that has occupied me for a while. The entropy test is often used to identify encrypted data. The entropy reaches its maximum when the bytes of the analyzed data are distributed uniformly. The entropy test identifies encrypted…
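As a rough illustration of the entropy test mentioned in this excerpt, the sketch below computes the Shannon entropy of a byte histogram in bits per byte. The function name `byte_entropy` and the two sample inputs are assumptions made for illustration; they do not come from the question.

```python
import math
from collections import Counter


def byte_entropy(data: bytes) -> float:
    """Shannon entropy of the byte-value histogram, in bits per byte (0 to 8)."""
    if not data:
        return 0.0
    n = len(data)
    counts = Counter(data)  # maps each byte value to its number of occurrences
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


# Hypothetical sample inputs, chosen only to show the two extremes.
uniform = bytes(range(256)) * 16   # every byte value occurs equally often
skewed = b"a" * 4000 + b"b" * 96   # almost all bytes share one value

print(byte_entropy(uniform))  # reaches the maximum of 8 bits per byte
print(byte_entropy(skewed))   # well below 1 bit per byte
```

A byte histogram ignores ordering, so a high value here only says the byte frequencies are close to uniform; well-compressed data can score nearly as high as encrypted data.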
7
votes
2 answers

Compression theory, practice, for time series with values in a space of distributions (say of a real random variable)

Example of the problem: Part of our research team works on providing operational wind power forecasts. Usually, since forecast users are interested in different time scales, a forecast is issued every 15 min (it has even happened that 5…
6
votes
2 answers

When and why do we use sparse coding?

Sparse coding is described as "given an input $X$, finding a latent representation $h$ such that $h$ is sparse and the input can be reconstructed as well as possible." (source: https://www.youtube.com/watch?v=7a0_iEruGoM) My question is why do we want…
Sofia693
  • 173
  • 8
5
votes
1 answer

Why can low rank expansions exploit the redundancy that exists between different feature channels and filters?

I read the Jaderberg et al. (2014) paper, Speeding up Convolutional Neural Networks with Low Rank Expansions. In the introduction, it is written in bold font: Our key insight is to exploit the redundancy that exists between feature channels and…
4
votes
1 answer

Weights of random sets of random 32-bit strings

I have random sets of $N$ random 32-bit strings, where all bits are i.i.d. with $\mathbb{P}(0) = \mathbb{P}(1) = 1/2$. Define weight( 32-bit x ) = number of 1 bits in x, i.e. Hamming distance to 0; minweight( set $X$ ) =…
denis
  • 3,187
  • 20
  • 34
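The `weight` function defined in the excerpt above (the number of 1 bits in a 32-bit string) can be sketched as follows. Only `weight` is shown, because the definition of `minweight` is cut off in the excerpt; the masking to 32 bits and the sample values are assumptions for illustration.

```python
def weight(x: int) -> int:
    """Number of 1 bits in a 32-bit value, i.e. its Hamming distance to 0."""
    # Mask to 32 bits so that larger Python integers are treated as 32-bit strings.
    return bin(x & 0xFFFFFFFF).count("1")


# Hypothetical sample values.
print(weight(0))           # no bits set
print(weight(0b1011))      # three bits set
print(weight(0xFFFFFFFF))  # all 32 bits set
```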
4
votes
4 answers

Ultimate compression algorithm

I was not sure where to put this question, so I put it here. Moderators, feel free to move it to another Stack Exchange site. Let's say I have 10 gigs of pictures (or for that matter any type of data, please don't answer the question specifically…
SamB
  • 143
  • 5
3
votes
0 answers

From a deep learning point of view, is there a lower limit on the number of hours of speech needed to train a neural net

From a deep learning practitioner's point of view, is there a lower limit on the number of hours of speech needed to train a neural net to translate speech to text? An estimate from CMU is 3000-5000 hours for 90% accuracy commercial quality speech…
3
votes
3 answers

How does SVD save space?

We start with an $m \times n$ matrix before SVD. After SVD, we have three matrices of sizes $m \times m$, $n \times n$ and $m \times n$. How do we save space then if now we have three matrices instead of one and more numbers to store? Why are we…
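The storage arithmetic behind this question can be sketched under one assumption that the excerpt does not state: that only the top $k$ singular triplets are kept (a rank-$k$ truncation). In that case one stores $k$ left vectors of length $m$, $k$ right vectors of length $n$ and $k$ singular values. The function names and the sizes $m = 1000$, $n = 500$, $k = 50$ are hypothetical.

```python
def full_storage(m: int, n: int) -> int:
    """Numbers needed to store a dense m x n matrix."""
    return m * n


def truncated_svd_storage(m: int, n: int, k: int) -> int:
    """Numbers needed for a rank-k truncation: k columns of U, k rows of V^T, k singular values."""
    return k * (m + n + 1)


def largest_saving_rank(m: int, n: int) -> int:
    """Largest k for which the rank-k truncation needs strictly fewer numbers than the full matrix."""
    return (m * n - 1) // (m + n + 1)


m, n, k = 1000, 500, 50  # hypothetical sizes
print(full_storage(m, n))              # dense storage
print(truncated_svd_storage(m, n, k))  # rank-k storage
print(largest_saving_rank(m, n))       # beyond this rank, truncation no longer saves space
```

Under this assumption the saving comes entirely from discarding small singular values; keeping all three full factors, as the question describes, needs more numbers than the original matrix.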
3
votes
1 answer

Compressed Sensing: Missing Fourier Coefficients?

This question concerns the problem of reconstructing a signal when only a subset of the Fourier coefficients is observed: $$\min_x \|x\|_1 \text{ subject to } y = Ax$$ where $x = (x_1,x_2,\dots,x_t)$ is a time-domain representation of our…
3
votes
2 answers

How to compress sets of integer series?

I have a set of integer series $S_1$, $S_2$, ... $S_n$. Each series has 3600 data points. Each data point is a positive integer. Each data point is stored as an unsigned int requiring 4 bytes. So, storing the entire series requires 4 * 3600 bytes.…
Nikhil
  • 73
  • 6
2
votes
0 answers

Analyzing 3D data: What can be done?

I am new to this kind of analysis, and I want to know what values I can look at in 3D data. The data itself is a 3D volume $(x,y,z)$ with a floating point value in every coordinate. It is a hyperspectral image, meaning: the $z$-space is the same…
reBourne
  • 33
  • 4
2
votes
1 answer

How to compute theoretical compression limit?

Assume we have a sensor field with dimension M*M. In order to apply any data compression technique, first I want to know what is the compression limit or minimum entropy of the entire sensor field. How could I compute the minimum entropy or…
user2384
  • 21
  • 1
2
votes
0 answers

Optimal compressibility and PCA

I have a population $\mathcal{X}$ of $N$ samples drawn from a multivariate Gaussian random variable $\mathbf{x} \in \mathbb{R}^d$. Let us define a transformation $f_{d\rightarrow r} (\mathbf{x}) = \mathbf{x'}$ which performs a dimensionality…
2
votes
0 answers

Data compression for graph plotting

I am using Google Charts to plot a large data set. The database contains one record for every two seconds; five minutes' worth of data yields 150 records (data points) and the result is acceptable. However, my client wants to be able to visualize a…