Fourier Curve Fitting

Fourier theory states that we can model any time series as a sum of sine curves. The main caveat is that the series must have no overall trend (a tendency to increase or decrease in value with time); if it does, we can remove the linear trend, solve for the Fourier series of what remains, and then add the trend back. The result will be:

Original Time Series = Trend + Mean + Series of sine curves

Each sine curve has three parameters: period, phase, and amplitude.

Because sine or cosine curves oscillate about zero, adding the mean allows the series to have an average value that is not zero.
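The decomposition into trend, mean, and sine curves can be sketched in Python. This is only an illustration under stated assumptions, not the code that produced the results on this page: it assumes NumPy, an evenly sampled series, and a least-squares linear trend, and the name fourier_fit and the phase convention (radians, sine form) are hypothetical.

```python
import numpy as np

def fourier_fit(y, n_terms):
    """Model y as trend + mean + n_terms sine curves (illustrative sketch)."""
    y = np.asarray(y, dtype=float)
    n = y.size
    if not 1 <= n_terms < n / 2:
        raise ValueError("n_terms must be at least 1 and less than half the series length")
    t = np.arange(n)

    # Linear trend, centered so that it has zero mean and the mean stays separate.
    slope = np.polyfit(t, y, 1)[0]
    trend = slope * (t - t.mean())
    mean = y.mean()
    resid = y - trend - mean

    # Term k completes k cycles over the series, so its period is n / k.
    spec = np.fft.rfft(resid)
    k = np.arange(1, n_terms + 1)
    periods = n / k
    amp = 2.0 * np.abs(spec[k]) / n
    phase = np.angle(spec[k]) + np.pi / 2  # shift from cosine form to sine form

    # Rebuild the series from its parts.
    angles = 2.0 * np.pi * k[:, None] * t[None, :] / n + phase[:, None]
    fit = trend + mean + (amp[:, None] * np.sin(angles)).sum(axis=0)
    return periods, amp, phase, fit

# Synthetic example: a trend, a nonzero mean, and one sine curve with 3 cycles.
t = np.arange(201)
y = 10 + 0.05 * t + 2 * np.sin(2 * np.pi * 3 * t / 201 + 0.3)
periods, amp, phase, fit = fourier_fit(y, 100)
print("largest difference between fit and data:", np.abs(fit - y).max())
```

With an odd number of points and all available terms, as in this synthetic example, the fit reproduces the input to within rounding error.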

The periods of the sine curves are not arbitrary. Several choices can be made, using either all integers or only powers of two. In the case of all integers, the zeroth term has infinite period and is the mean; the first term has a period equal to the length of the time series (one cycle over the entire time period); the second term has a period equal to 1/2 the length of the time series (two cycles over the entire time period); the third term has a period equal to 1/3 the length of the time series (three cycles over the entire time period); and the nth term has a period equal to 1/n of the length of the time series (n cycles over the entire time period). If we used only powers of two, we would have periods of only 1, 1/2, 1/4, 1/8, 1/16, 1/32, ... of the overall period.
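The rule for the periods is simple enough to tabulate. The short Python sketch below lists the period of each term for either choice; the function name harmonic_periods and the 602-point and 32-point lengths are assumptions chosen for illustration.

```python
def harmonic_periods(n_points, n_terms, powers_of_two=False):
    """Return (term number, period) pairs for the first n_terms terms."""
    if powers_of_two:
        numbers = [2 ** j for j in range(n_terms)]   # 1, 2, 4, 8, ...
    else:
        numbers = list(range(1, n_terms + 1))        # 1, 2, 3, 4, ...
    # Term k fits k full cycles into the series, so its period is n_points / k.
    return [(k, n_points / k) for k in numbers]

all_integers = harmonic_periods(602, 4)
powers = harmonic_periods(32, 4, powers_of_two=True)
for k, period in all_integers:
    print(k, round(period, 3))
```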

In the extreme case, we need the number of terms in the series to equal the number of points in the data series. In practical terms we can generally recreate any series, even a step function, with a much smaller number of terms. To look for periodicity in the series, we can order the terms by their amplitude, and pick out periods with extremely high amplitudes.
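To illustrate ordering the terms by amplitude, the Python sketch below ranks the terms of a step function and reports how much of the variance the strongest few explain. It assumes NumPy and a series with no trend; the name strongest_terms and the 256-point step are hypothetical, and the share of variance is computed as half the squared amplitude divided by the variance of the series.

```python
import numpy as np

def strongest_terms(y, n_keep):
    """Return (term, period, amplitude, percent of variance) for the n_keep
    largest-amplitude terms, strongest first."""
    y = np.asarray(y, dtype=float)
    n = y.size
    resid = y - y.mean()
    spec = np.fft.rfft(resid)
    k = np.arange(1, (n - 1) // 2 + 1)     # skip the mean (term 0)
    amp = 2.0 * np.abs(spec[k]) / n
    # A sine curve of amplitude A has variance A**2 / 2.
    pct = 100.0 * (amp ** 2 / 2.0) / resid.var()
    order = np.argsort(amp)[::-1][:n_keep]
    return [(int(k[i]), n / k[i], float(amp[i]), float(pct[i])) for i in order]

# A step function: 128 zeros followed by 128 ones.
step = np.concatenate([np.zeros(128), np.ones(128)])
terms = strongest_terms(step, 10)
for term, period, amplitude, pct in terms:
    print(term, round(period, 2), round(amplitude, 4), round(pct, 2))
print("percent of variance explained by 10 terms:", sum(p for _, _, _, p in terms))
```

For this step, only odd-numbered terms carry any amplitude, and a handful of them account for most of the variance.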

To find periodicity in the data, we need to sample over a number of periods. We can resolve very fine differences in period at short periods, but not at long periods. If the period is long compared to the overall time series, we may not be able to measure it accurately, since a whole number of periods will not fit into the data series.
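The difference in resolution between short and long periods follows from the spacing of the terms. As a small Python sketch, with an assumed series length of 2880 samples and a hypothetical helper named period_gap, the gap between the periods of neighboring terms is:

```python
def period_gap(n_points, k):
    """Difference between the periods of term k and term k + 1."""
    return n_points / k - n_points / (k + 1)

# Long periods: neighboring terms are very far apart.
long_gap = period_gap(2880, 1)
# Short periods: neighboring terms are closely spaced.
short_gap = period_gap(2880, 232)
print(long_gap, short_gap)
```

Any periodicity that falls inside one of the wide gaps at the long-period end has no term near it, which is why long periods are measured poorly.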

 

This graph shows a topographic profile across Mt. St. Helens in red. Superimposed in green is the fit with a series of 50 sine curves. The fit is remarkably good, except at the two ends of the profile where the Fourier curve has to repeat (note that it drops on the left side and rises on the right, which the real profile does not do). In light blue we have the sum of the two strongest terms in the series, and in dark blue is the second strongest term.

The table below shows the data for the strongest terms in the Fourier series that represents Mt. St. Helens. Note that a period, phase, and amplitude appear for each term. Each also has a % SS, the percentage of the sum of squares, to indicate how much of the variance it explains. Note that in this case a single sine curve explains almost 95% of the variance; Mt. St. Helens is almost a sine curve (the fortuitous length of the profile probably helps this).

 Mean  = 1668.45  Variance  =   198624.668

Component   Period   Phase    Amp    % SS

     1     602.000   278.3 612.845  94.54
     4     150.500   332.4  75.208   1.42
     5     120.400   134.0  71.785   1.30
     6     100.333   352.5  64.491   1.05
     3     200.667   141.8  62.420   0.98
     8      75.250   359.9  28.498   0.20
     2     301.000    91.1  21.209   0.11
     7      86.000   146.0  19.324   0.09
    11      54.727     0.9  12.635   0.04
    12      50.167   271.4  12.390   0.04
     9      66.889   312.5  10.960   0.03
    14      43.000   292.8   9.984   0.03
    10      60.200   298.2   9.586   0.02

Total Percent sum of Squares  =  100.0
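The % SS column can be checked against the amplitudes. A sine curve of amplitude A has variance A*A/2, and dividing that by the variance of the series reproduces the listed percentages to within rounding. The Python sketch below shows the arithmetic for the five strongest terms; this relation is inferred from the numbers in the table and is an assumption about how the column was computed.

```python
variance = 198624.668

# (component, amplitude, listed % SS) copied from the table above
rows = [
    (1, 612.845, 94.54),
    (4, 75.208, 1.42),
    (5, 71.785, 1.30),
    (6, 64.491, 1.05),
    (3, 62.420, 0.98),
]

def percent_ss(amplitude, series_variance):
    """Percent of the variance explained by one sine curve of this amplitude."""
    return 100.0 * (amplitude ** 2 / 2.0) / series_variance

computed = {comp: percent_ss(amp, variance) for comp, amp, _ in rows}
for comp, amp, listed in rows:
    print(comp, round(computed[comp], 3), listed)
```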

Note that component 1 has a period of 602, the number of points in the data set; component 2 has a period of 301 (602/2); component 3 has a period of 200.667 (602/3); component 4 has a period of 150.5 (602/4); and so on. Only these periods are tested, so a periodicity at an intermediate value such as 250 would be missed by this analysis.

Predicted Hourly readings, Annapolis Tides

50 terms, min resolved period 57.6 hours

This graph shows the Annapolis tide in red, with the fit in green. The minimum resolved period is too long to capture the true diurnal and semidiurnal tide periods, but the fit does capture some of the longer periods. Overall, as shown in the table below, the series explains only about 30% of the variance.

 

Component   Period   Phase    Amp   % SS  Cum % SS

     1    2880.000   171.2   0.260  18.122   18.12
     4     720.000   146.8   0.117   3.667   21.79
     2    1440.000   168.6   0.109   3.174   24.96
     8     360.000    14.0   0.083   1.860   26.82
     3     960.000   164.0   0.080   1.706   28.53
     9     320.000   168.1   0.053   0.751   29.28
     5     576.000    73.5   0.032   0.280   29.56
    50      57.600     3.0   0.004   0.003   30.48
Total Percent sum of Squares  =   30.4845

With only 50 terms and a minimum resolved period of 57.6 hours, the series does not do a good job of capturing the tidal variation. The only strong amplitude reflects the annual cycle of the Annapolis tide, where summer water levels are significantly higher than winter levels.
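The minimum resolved period is the length of the series divided by the number of terms. The Python sketch below works through the arithmetic for the 2880-hour tide series; the function names are hypothetical.

```python
import math

def min_resolved_period(n_points, n_terms):
    """Shortest period represented when n_terms terms are used."""
    return n_points / n_terms

def terms_needed(n_points, period):
    """Smallest number of terms whose shortest period reaches the given period."""
    return math.ceil(n_points / period)

print(min_resolved_period(2880, 50))    # 50 terms
print(min_resolved_period(2880, 250))   # 250 terms
print(terms_needed(2880, 24.0))         # terms needed to reach a 24-hour period
print(terms_needed(2880, 12.0))         # terms needed to reach a 12-hour period
```

Reaching the 12-hour and 24-hour tidal periods in a 2880-hour series therefore takes far more than 50 terms.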

Predicted Hourly readings, Annapolis Tides

250 terms, min resolved period 11.52 hours

This graph shows the Annapolis tide in red, with the fit in green. This fit captures 99.76% of the variance.

 

Component   Period   Phase    Amp   % SS  Cum % SS
   232      12.414   151.2   0.416  46.263   46.26
     1    2880.000   171.2   0.260  18.122   64.39
   112      25.714   295.5   0.126   4.243   68.63
     4     720.000   146.8   0.117   3.667   72.30
     2    1440.000   168.6   0.109   3.174   75.47
   111      25.946   123.5   0.095   2.424   77.89
   121      23.802    50.8   0.091   2.201   80.09
   120      24.000    36.2   0.090   2.180   82.27
     8     360.000    14.0   0.083   1.860   84.14
   240      12.000    56.4   0.083   1.836   85.97
     3     960.000   164.0   0.080   1.706   87.68
   119      24.202    68.0   0.072   1.375   89.05
   227      12.687   348.4   0.065   1.121   90.17
   231      12.468   324.9   0.054   0.777   90.95
   107      26.916   136.7   0.054   0.767   91.72
     9     320.000   168.1   0.053   0.751   92.47
   233      12.361   156.5   0.050   0.659   93.13
   113      25.487   287.3   0.044   0.528   93.66
   118      24.407    71.5   0.040   0.419   94.07
   122      23.607    47.8   0.037   0.362   94.44
   228      12.632   162.2   0.035   0.334   94.77
   116      24.828    82.1   0.033   0.294   95.07
     5     576.000    73.5   0.032   0.280   95.35
   114      25.263   279.0   0.031   0.255   95.60
   110      26.182   131.5   0.030   0.246   95.85
   117      24.615    77.0   0.030   0.245   96.09
   250      11.520   141.1   0.005   0.006   99.42

Total Percent sum of Squares  =   99.7602

With 250 terms in the series, note the preponderance of periods around 12 and 24 hours, and that the series almost perfectly recreates the predicted tide series.


There appears to be a bug that can let the percentage of the sum of squares explained exceed 100%.


Last revised  3/6/2005