I need to calculate the annual rate of increase (both absolute and relative) for a data set. However, I do not have the raw data for each year; I only have averages over multi-year periods, as listed in the table below.
| # | Duration | Middle year | Average amount (kg/year) |
|---|---|---|---|
| 1 | 1980-1990 | 1985 | a1 |
| 2 | 1991-1995 | 1993 | a2 |
| 3 | 1996-2000 | 1998 | a3 |
| 4 | 2001-2005 | 2003 | a4 |
| 5 | 2006-2010 | 2008 | a5 |
| 6 | 2011-2015 | 2013 | a6 |
I can think of two methods to calculate the annual increase rate but have several questions.
Method 1:
The annual increase rate r is calculated as
$$r = (a_6-a_1)/n$$ where n is the number of years.
Q1: How should I determine the number of years n? Should I calculate it (1) using the full span of the data, from the first year of the first period to the last year of the last period, $$n = 2015-1980 + 1 = 36 \: years$$ or (2) using the middle year of each period, as
$$n = 2013- 1985+1 = 29 \: years$$ To me, (2) seems more reasonable, since each average represents the middle of its period rather than the beginning.
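To make the difference between the two choices of n concrete, here is a minimal sketch in Python. The values of `a1` and `a6` are made-up placeholders, since I only have symbols in the table above; only the two values of n come from the question.

```python
# Hypothetical placeholder values for a1 and a6 (kg/year); not real data.
a1 = 100.0
a6 = 170.0

# Option (1): full span of the data, 1980 through 2015 inclusive.
n_span = 2015 - 1980 + 1
# Option (2): middle year of the first period to middle year of the last.
n_mid = 2013 - 1985 + 1

# Absolute annual increase rate r = (a6 - a1) / n for each choice of n.
r_span = (a6 - a1) / n_span
r_mid = (a6 - a1) / n_mid

print(f"n = {n_span}: r = {r_span:.3f} kg/year per year")
print(f"n = {n_mid}: r = {r_mid:.3f} kg/year per year")
```

The numerator is the same in both cases, so the smaller n always gives the larger r.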
Q2: Should the relative increase rate RR be calculated (1) as $$RR = r/n$$ or (2) by solving the equation $$a_1(1+RR)^n = a_6$$ for RR? Here n is again the number of years. Option (1) takes the first data point as the reference, while (2) takes the previous value as the reference. I am not sure which is the commonly accepted method.
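For what it is worth, the equation in option (2) can be solved in closed form, so no numerical solver is needed. Dividing both sides by $a_1$, taking the n-th root, and subtracting 1 gives (assuming $a_1 > 0$ and $a_6 > 0$, and with n chosen as in Q1):

$$(1+RR)^n = \frac{a_6}{a_1} \quad\Longrightarrow\quad RR = \left(\frac{a_6}{a_1}\right)^{1/n} - 1$$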
Method 2:
A linear regression of the data (a1 to a6) against the year gives the equation y = kx + b. The slope k is then the annual increase rate, and the relative increase rate (RR) can be calculated as $$RR = k/n$$ where n is the number of years (same as in Method 1).
Q3: For the regression analysis, should I use the beginning of each period or the middle of each period as the x value?
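To illustrate what is at stake in Q3, here is a minimal sketch of the regression in Method 2 using both choices of x. The helper `fit_line` is a hypothetical name for a plain ordinary least-squares fit, and the six amounts are made-up placeholders, not my real data; only the years come from the table.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = k*x + b; returns (k, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    k = sxy / sxx
    b = mean_y - k * mean_x
    return k, b

# Hypothetical placeholder averages a1..a6 (kg/year); not real data.
amounts = [100.0, 115.0, 128.0, 140.0, 155.0, 170.0]

# x values taken from the table: first year and middle year of each period.
start_years = [1980, 1991, 1996, 2001, 2006, 2011]
middle_years = [1985, 1993, 1998, 2003, 2008, 2013]

k_start, b_start = fit_line(start_years, amounts)
k_mid, b_mid = fit_line(middle_years, amounts)

print(f"start years:  k = {k_start:.3f}, b = {b_start:.1f}")
print(f"middle years: k = {k_mid:.3f}, b = {b_mid:.1f}")
```

If all periods had the same length, the two sets of x values would differ only by a constant shift and the slope k would be identical. Here the first period is 11 years long and the others are 5, so the shift is not constant and the two slopes differ.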
Q4: Comparing Method 1 and Method 2, which is more reasonable? Since I have to show the plot with the regression equation, I would prefer Method 2, if it is acceptable, so that the numbers in the text and in the plot are consistent.