I need to compare the outputs of two electricity consumption measurement systems that come from one source (one company) and assess whether they correspond. If they do not, that may indicate that one of the systems is faulty.

The outputs are hourly consumption data sets of 8784 entries each, covering a period of one year.

Besides correlation, what other methods can I use to achieve this task?

gung - Reinstate Monica
  • When you say "corresponding" do you mean are the two sets of measurements measuring the same thing, perhaps with some level of discrepancy, or do you mean measuring two different things that should be similar in their behavior? For example, two different monitors taking electricity consumption readings for the same building, or two different monitors in two different buildings that are reasonably close together? – Wayne Aug 16 '16 at 19:52
  • The first option: two different systems measure electricity consumption in one company/building. Is "correspondence" a proper name? – morphy_richards Aug 16 '16 at 19:57

2 Answers

Since these are time series, you have to be careful. If both series are increasing (or decreasing) over the year, a straight-up correlation will be misleading, because the shared trend alone produces a high coefficient. That's how various websites show clever correlations between things like coffee production in Peru and movies in which Kevin Bacon starred. Have you looked at this question in this forum?
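To illustrate the trend problem, here is a minimal sketch on synthetic data (not your meters). The two series are unrelated apart from both drifting upward. The helper names `pearson` and `diff` are my own, and differencing is just one common way to remove a trend before correlating.

```python
import random

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5

def diff(x):
    """First differences: hour-to-hour changes, which removes a linear trend."""
    return [b - a for a, b in zip(x, x[1:])]

rng = random.Random(0)   # fixed seed so the sketch is reproducible
n = 8784                 # one leap year of hourly readings

# Synthetic assumption: independent noise plus an upward drift in each series.
a = [0.01 * t + rng.gauss(0, 1) for t in range(n)]
b = [0.02 * t + rng.gauss(0, 1) for t in range(n)]

r_levels = pearson(a, b)             # dominated by the shared trend
r_diffs = pearson(diff(a), diff(b))  # trend removed

print(f"correlation of levels:      {r_levels:.3f}")
print(f"correlation of differences: {r_diffs:.3f}")
```

The level correlation comes out close to 1 even though the series have nothing to do with each other, while the correlation of the differences is close to 0. For two meters on the same building you would hope both numbers are high.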

Would it be useful to aggregate to larger time periods? Have you already looked at the average usage per hour, per day, and per month for each system over the year? Are they reasonably close, just eyeballing it? Then there are tests for whether two groups have the same mean or variance. Since both systems measure the same hours, it sounds like you have paired data, which you can compare by week and by month.
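A rough sketch of the aggregate-then-compare idea, again on made-up data: `meter_a` and `meter_b` are hypothetical readings of the same load with independent measurement noise. It sums hours into daily totals and computes a paired t statistic on the daily differences. Treat the statistic as a guide only, since daily differences from real meters may still be autocorrelated.

```python
import math
import random
import statistics

def daily_totals(hourly):
    """Sum consecutive blocks of 24 hourly readings into daily totals."""
    return [sum(hourly[i:i + 24]) for i in range(0, len(hourly), 24)]

rng = random.Random(1)
hours = 8784  # 366 days * 24 hours

# Synthetic assumption: one true load with a daily cycle, read by two meters.
true_load = [50 + 20 * math.sin(2 * math.pi * (h % 24) / 24) for h in range(hours)]
meter_a = [v + rng.gauss(0, 1) for v in true_load]
meter_b = [v + rng.gauss(0, 1) for v in true_load]

day_a = daily_totals(meter_a)
day_b = daily_totals(meter_b)

# Paired comparison: work with the per-day differences between the meters.
d = [x - y for x, y in zip(day_a, day_b)]
mean_d = statistics.mean(d)
sd_d = statistics.stdev(d)
t_stat = mean_d / (sd_d / math.sqrt(len(d)))

print(f"days: {len(d)}, mean difference: {mean_d:.3f}, paired t: {t_stat:.2f}")
```

A mean difference far from zero relative to its standard error would suggest a systematic offset between the two systems. The same function works for weekly or monthly blocks if you change the block length.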

There are also time-series-specific comparison methods, as pionpi_ says in their answer.

The key issue with time series is that they have autocorrelation: things tend to be the same from day to day and week to week, etc. If yesterday was hot, today will likely be hot. July temperatures are more like June temperatures than January temperatures. Etc. So each hour's electricity usage is not independent of the previous or next hour's usage.
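To see what autocorrelation looks like in numbers, here is a small sketch that computes the sample autocorrelation at a few lags. The series is synthetic (a daily cycle plus noise), and `autocorr` is my own helper, not a library function.

```python
import math
import random

def autocorr(x, lag):
    """Sample autocorrelation of x at the given positive lag."""
    n = len(x)
    m = sum(x) / n
    denom = sum((v - m) ** 2 for v in x)
    num = sum((x[t] - m) * (x[t + lag] - m) for t in range(n - lag))
    return num / denom

rng = random.Random(2)

# Synthetic assumption: hourly usage with a 24-hour cycle and some noise.
usage = [50 + 20 * math.sin(2 * math.pi * h / 24) + rng.gauss(0, 2)
         for h in range(8784)]

acf_1 = autocorr(usage, 1)    # this hour vs. the next hour
acf_12 = autocorr(usage, 12)  # this hour vs. half a day later
acf_24 = autocorr(usage, 24)  # this hour vs. the same hour tomorrow

print(f"lag 1: {acf_1:.2f}, lag 12: {acf_12:.2f}, lag 24: {acf_24:.2f}")
```

Values near 1 at lags 1 and 24 mean neighbouring hours, and the same hour on consecutive days, carry much the same information. That is why tests that assume independent observations tend to overstate how much evidence you have.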

Wayne

There are many measures of distance for time series data. See Fu 2011 and Liao 2005.
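As a starting point, here is a sketch of two of the simplest distances between equal-length series: Euclidean distance and mean absolute difference. The function names and toy readings are my own, and these are only basic examples rather than the full range of measures those reviews cover.

```python
import math

def euclidean_distance(x, y):
    """Square root of the summed squared pointwise differences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def mean_abs_difference(x, y):
    """Average absolute pointwise difference, in the units of the data."""
    return sum(abs(a - b) for a, b in zip(x, y)) / len(x)

# Toy hourly readings from two hypothetical meters.
meter_a = [1.0, 2.0, 3.0, 4.0]
meter_b = [1.0, 2.0, 4.0, 6.0]

dist = euclidean_distance(meter_a, meter_b)
mad = mean_abs_difference(meter_a, meter_b)

print(f"Euclidean distance: {dist:.3f}, mean absolute difference: {mad:.2f}")
```

Both are zero when the two meters agree exactly. The mean absolute difference is easier to interpret because it stays in the measurement units (e.g. kWh per hour).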

pionpi_