None of that matters. The drop over 10 minutes was 2.9 per minute.
Ok, let's go with your example sequence. The average over 10 minutes is 2.9 per minute, and the standard deviation is 6. That's a huge standard deviation. Going by the usual normal-distribution rule of thumb, it means that in about 68% of cases the 11th minute's usage will be somewhere between -3.1 and 8.9. Obviously it can't be less than zero, but this demonstrates that your example sequence gives a really loose prediction.
Let's test the r-squared value of your prediction. First, we have to map the usage. Under your sequence the total used energy by minute is 1, 2, 3, 4, 5, 25, 26, 27, 28, 29. Right? Under your 2.9/minute prediction, the next 10 minutes will be 2.9, 5.8, 8.7, 11.6, 14.5, 17.4, 20.3, 23.2, 26.1, 29. Same total usage at the end of the 10 minutes. The r-squared of this is 0.8. Not terrible, but not great either.
How about this sequence for 10 minutes: 20,1,1,1,1,1,1,1,1,20. Average is 4.8/minute. Standard deviation is 8! R-squared is 0.6 - pretty freakin' bad.
Compare your sequence, and my other sequence, to a tighter 10-minute sequence: 17,17,17,17,17,20,17,17,17,17. Average is 17.3/minute. Standard deviation is 0.9. R-squared is 0.999. Now that is something from which you can make a good prediction.
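For anyone who wants to check these figures, here is a minimal Python sketch. It rests on assumptions: the standard deviation is taken as the sample standard deviation, and r-squared is taken as the squared Pearson correlation between the actual cumulative usage and the straight-line "average per minute" prediction. The posts above don't say exactly how the numbers were computed, so treat this as one plausible reading; the function names are made up for illustration.

```python
from itertools import accumulate
from statistics import mean, stdev


def r_squared(xs, ys):
    """Squared Pearson correlation between two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy ** 2 / (sxx * syy)


def summarize(per_minute):
    """Return (average, sample std dev, r-squared) for per-minute usage."""
    avg = mean(per_minute)
    sd = stdev(per_minute)  # sample standard deviation (n - 1 denominator)
    # Actual running total of usage, minute by minute.
    cumulative = list(accumulate(per_minute))
    # Straight-line prediction: avg after 1 minute, 2 * avg after 2, etc.
    predicted = [avg * (i + 1) for i in range(len(per_minute))]
    return avg, sd, r_squared(predicted, cumulative)


one_spike = [1, 1, 1, 1, 1, 20, 1, 1, 1, 1]
two_spikes = [20, 1, 1, 1, 1, 1, 1, 1, 1, 20]
tight = [17, 17, 17, 17, 17, 20, 17, 17, 17, 17]

for name, seq in [("one spike", one_spike),
                  ("two spikes", two_spikes),
                  ("tight", tight)]:
    avg, sd, r2 = summarize(seq)
    print(f"{name}: avg={avg:.1f}/min, sd={sd:.2f}, r2={r2:.4f}")
```

Under those assumptions the averages and standard deviations come out in line with the figures quoted above, and the r-squared values land close to them (roughly 0.85, 0.59 and 0.9998).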
If there had been two 20 spikes within a given 10-minute window, the impact would be a value of 48/10 = 4.8 (two 20s plus eight 1s).
I.e. about twice, and exactly as needed/expected!
The purpose of the average is to average out the minute-by-minute measurements of 1/min and 20/min.
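To make the averaging-out point concrete, here is a small Python sketch of a 10-minute rolling average. The function name and the longer sample stream are hypothetical, made up for illustration only; the two 10-minute sequences are the ones discussed in this thread.

```python
def rolling_averages(readings, window=10):
    """Average of every full `window`-minute slice of per-minute readings."""
    return [
        sum(readings[i:i + window]) / window
        for i in range(len(readings) - window + 1)
    ]


# One 20 spike among 1s in a 10-minute window.
print(rolling_averages([1, 1, 1, 1, 1, 20, 1, 1, 1, 1]))

# Two 20 spikes in the same 10-minute window.
print(rolling_averages([20, 1, 1, 1, 1, 1, 1, 1, 1, 20]))

# Hypothetical 19-minute stream with a single spike in the middle:
# every 10-minute window contains the spike exactly once.
print(rolling_averages([1] * 9 + [20] + [1] * 9))
```

Each window that contains one spike averages the same amount no matter where in the window the spike falls, and a window with two spikes averages proportionally more, which is the behaviour described above.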