I am using Google Charts to plot a large data set. The database contains one record for every two seconds; five minutes' worth of data yields 150 records (data points) and the result is acceptable. However, my client wants to be able to visualize a whole day or even a month, which is a lot of data for an Ajax request, and the resulting graph doesn't look good. I tried averaging the values, but this attenuates the peaks. I need to compress the data while preserving the extrema.

The data represent the voltage in each phase of an electric grid. The first graph ("A") in the image displays 5 minutes' worth of data (150 data points), with a peak of around 600 volts at 15:32. The second ("B") is for 10 minutes, also 150 data points, with averaging. Notice the peak now appears to be a little over 400 V. The third graph is for 60 minutes with no averaging; the peaks are preserved, but there are now 1800 data points and the graph appears crowded. The last graph is an example of the desired result. Somehow it manages to display a month's worth of data without appearing jumbled up.
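To make the averaging problem concrete, here is a minimal sketch (not a tested solution) that compares per-bucket averaging with keeping each bucket's minimum and maximum. The function names, the bucket size and the sample readings are all made up for illustration; they do not come from the real database or from Google Charts.

```python
from statistics import mean


def downsample_mean(values, bucket_size):
    """Average each consecutive bucket (the approach that flattens peaks)."""
    return [mean(values[i:i + bucket_size])
            for i in range(0, len(values), bucket_size)]


def downsample_min_max(values, bucket_size):
    """Keep the minimum and maximum of each bucket as (index, value) pairs."""
    out = []
    for start in range(0, len(values), bucket_size):
        bucket = values[start:start + bucket_size]
        lo = min(range(len(bucket)), key=bucket.__getitem__)
        hi = max(range(len(bucket)), key=bucket.__getitem__)
        # A set collapses lo == hi (flat bucket); sorting keeps time order.
        for idx in sorted({lo, hi}):
            out.append((start + idx, bucket[idx]))
    return out


# Hypothetical readings: a flat 220 V line with a single 600 V spike.
volts = [220.0] * 20
volts[7] = 600.0

averaged = downsample_mean(volts, 10)
extremes = downsample_min_max(volts, 10)

print(max(averaged))                  # spike diluted by its bucket
print(max(v for _, v in extremes))    # spike kept at full height
```

With min/max buckets the output has at most two points per bucket, so the number of points sent to the chart is bounded by the bucket count while the extrema survive. The bucket size would have to be chosen from the time span requested and the chart width.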

Example data (280 points):

http://expti.com.br/getlog.asp

  • 4
    This is much too general. There is no way to suggest ideas about compressing the data and creating concise graphics without knowing what sort of data and issues are involved. Could you show examples, and better explain which features you would like to present in a graph? – Sextus Empiricus Sep 30 '19 at 13:50
  • 2
    An example of an answerable question of this type appears at https://stats.stackexchange.com/questions/35220. It reveals why this question must be made more specific, as requested by @Martijn. – whuber Sep 30 '19 at 14:20
  • 1
    Having worked as a submarine nuclear reactor operator in the U.S. Navy, I find your question to be clearly stated. My recommendation is to present the data in two parts: voltages within the normal range are not directly plotted, but rather displayed as a horizontal "within normal range" band. These voltages are not of concern by definition, as they are normal. Only voltages outside the normal range are plotted by value. This will convert the plot from a display of "all voltages" to "abnormal voltages". You might allow a choice of "normal range" with some preset default value. – James Phillips Sep 30 '19 at 15:27
  • 2
    So you have a slow-moving trend plus fast and sharp peaks. You could treat the two separately. Filter out all the peaks and then display the relatively continuous part of the data by subsampling (or some other data reduction or filter method). Then put them together as 'filtered trend + peaks'. Two questions: Are the peaks always a single point? This is not clear from your graphs, which seem to add more detail (some wavy curve) than 150 points provide. In addition, what features of the peaks are important: only the height, and not the width? – Sextus Empiricus Sep 30 '19 at 15:36
  • 1
    @MartijnWeterings I should have mentioned that the data in the plot were simulated. I should think real peaks are not always a single point, and their width does matter. And you're right about the waviness. It's a feature of Google Charts. When disabled, the graph presents sharp edges. I also should have mentioned I have no experience with statistical methods whatsoever. How does one go about subsampling? – developer1405 Sep 30 '19 at 16:23
  • 1
    @JamesPhillips thanks for your reply, but I think that wouldn't meet the client's specs – developer1405 Sep 30 '19 at 16:26
  • In that case, @MartijnWeterings' suggestion merits serious consideration. – James Phillips Sep 30 '19 at 16:58
  • Developer1405, an answer can be made, but it would be very helpful if you added an example data set or example time series. – Sextus Empiricus Sep 30 '19 at 17:06
  • @MartijnWeterings please see edited question & forgive poor formatting – developer1405 Sep 30 '19 at 17:31
  • Those are only 13 data points. Do you not have an example with more points? The idea is to remain as close to your problem as possible, and also to get an idea of what the data look like (what do the peaks look like, what does the slow trend look like). – Sextus Empiricus Sep 30 '19 at 17:55
  • 1
    @MartijnWeterings provided a link with 280 data points, pls see edited question – developer1405 Sep 30 '19 at 19:05
  • In comparison to your simulated data, your sample data shows a much less clear separation between peaks and the relatively more constant (less fluctuating) background. It is not clear to me what you wish to show (accentuate) in this data. – Sextus Empiricus Sep 30 '19 at 19:29
  • I had voted to reopen your question but now I am voting to close again. This does not mean I find your question bad, but it is very unclear and broad. We should clear up the question before people may start adding useless answers (which means useless work, but also may confuse the question further). – Sextus Empiricus Sep 30 '19 at 19:33
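
The 'filtered trend + peaks' idea and the subsampling question raised in the comments could be sketched roughly as follows. This is only one possible reading of that suggestion, under stated assumptions: a point counts as a "peak" when it differs from the median of its neighbours by more than a fixed threshold, and "subsampling" means keeping every `step`-th point. The function name, window size, threshold and sample data are all hypothetical.

```python
from statistics import median


def trend_plus_peaks(values, window=5, threshold=50.0, step=5):
    """Return (index, value) pairs: a subsampled trend plus every peak point."""
    half = window // 2
    peaks = []
    for i, v in enumerate(values):
        # Median of the local neighbourhood serves as a robust baseline.
        neighbourhood = values[max(0, i - half): i + half + 1]
        if abs(v - median(neighbourhood)) > threshold:
            peaks.append((i, v))

    peak_indices = {i for i, _ in peaks}
    # Subsampling: keep every `step`-th point that is not already a peak.
    trend = [(i, values[i]) for i in range(0, len(values), step)
             if i not in peak_indices]
    return sorted(trend + peaks)


# Hypothetical readings: a slowly rising voltage with one 600 V spike.
volts = [220.0 + 0.5 * i for i in range(30)]
volts[12] = 600.0

reduced = trend_plus_peaks(volts)
print(len(volts), "->", len(reduced))
```

Every point flagged as a peak is kept, so a peak spanning a few consecutive samples keeps its width as long as it is narrower than about half the window; wider peaks would need a larger window or a different baseline. On real data, where peaks and background are less clearly separated, a fixed threshold may not be adequate.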

0 Answers