Some of this will go into my presentation at the AGU meeting this December 12.
- Paul Pukite (@WHUT), November 10, 2016
With the meeting coming up in a few weeks, I am rationalizing my confidence in the QBO and ENSO models. Earlier I was much more confident in the QBO model, as the results were so clean. But now my confidence in the ENSO model has risen to the same level as for QBO.
The confidence test is the following. I take a pair of non-overlapping and non-contiguous intervals in the ENSO SOI time-series. Then I use the solver to extract the underlying components in each of the intervals and essentially compare the two. If the behavior of ENSO were either highly chaotic or red-noise Markovian, the fits would be markedly different, as each would follow a different trajectory. However, if the two are composed of the same periodic factors, then the odds are that ENSO is a deterministic cycle.
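As a minimal sketch of this kind of split-interval test, here is a version that uses ordinary linear least squares on a synthetic series. It is a stand-in only: the interval boundaries, the noise level, and the helper names (`design_matrix`, `fit_interval`, `component_correlations`) are assumptions for illustration, and it omits the Mathieu modulation and the Excel Solver used for the actual fits.

```python
import numpy as np

def design_matrix(t, periods):
    """Constant column plus a sin/cos pair for each period (in years)."""
    cols = [np.ones_like(t)]
    for p in periods:
        w = 2.0 * np.pi / p
        cols += [np.sin(w * t), np.cos(w * t)]
    return np.column_stack(cols)

def fit_interval(t, y, periods):
    """Least-squares coefficients for one training interval."""
    coef, *_ = np.linalg.lstsq(design_matrix(t, periods), y, rcond=None)
    return coef

def component_correlations(t, coef_a, coef_b, periods):
    """Correlate each fitted periodic component from two independent fits."""
    out = {}
    for i, p in enumerate(periods):
        w = 2.0 * np.pi / p
        s, c = np.sin(w * t), np.cos(w * t)
        comp_a = coef_a[1 + 2 * i] * s + coef_a[2 + 2 * i] * c
        comp_b = coef_b[1 + 2 * i] * s + coef_b[2 + 2 * i] * c
        out[p] = np.corrcoef(comp_a, comp_b)[0, 1]
    return out

# Synthetic stand-in for the data (NOT the SOI series).
rng = np.random.default_rng(0)
periods = [6.41, 14.6, 4.085]
t = np.arange(1910, 2013, 1 / 12)
signal = sum(np.sin(2 * np.pi * t / p + k) for k, p in enumerate(periods))
y = signal + 0.5 * rng.standard_normal(t.size)

# Hypothetical non-overlapping, non-contiguous intervals.
low = (t >= 1910) & (t < 1955)
high = (t >= 1965) & (t < 2013)
coef_low = fit_interval(t[low], y[low], periods)
coef_high = fit_interval(t[high], y[high], periods)
cc = component_correlations(t, coef_low, coef_high, periods)
```

If the same periodic factors drive both intervals, each entry of `cc` should be close to 1; for a red-noise series the two fits would have no reason to agree.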
As I have described before, I flip the phase in the years 1980-1996 to capture the known climate shift, and use the wave-equation transform of the data instead of solving the DiffEq directly.
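A sketch of the phase flip, assuming it amounts to inverting the sign of the series over that interval. The function name `flip_phase` and the choice of a half-open interval (start included, end excluded) are assumptions for illustration.

```python
import numpy as np

def flip_phase(t, y, start=1980.0, end=1996.0):
    """Return a copy of y with its sign inverted where start <= t < end."""
    t = np.asarray(t, dtype=float)
    out = np.array(y, dtype=float, copy=True)
    mask = (t >= start) & (t < end)
    out[mask] *= -1.0
    return out

# Toy annual series of ones, so the flipped span is easy to see.
t = np.arange(1975, 2000, 1.0)
y = np.ones_like(t)
flipped = flip_phase(t, y)
```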
These are the periodic component comparisons between the lower and higher intervals.
These all align very tightly, with the discrepancies indicating that it isn't some artifact of a flawed fitting process (i.e. the x=x problem).
The following shows the comparison between the annual harmonics (1/2, 1/3, etc. periods, with the annual period effectively filtered out in the original SOI time series) for the low and high regions, and also the Mathieu modulation comparison.
The years prior to 1910 were not used in the fit because the data appear much noisier than the data after that date. Incidentally, no filtering was used during the model fitting process.
Also note that only data to 2013 was used and that the extrapolated fit predicts that the 2016 SOI spike was almost as strong as the 1998 event and next to that the strongest in the last 100 years. This is a graph focused only on the last 50 years, so you can see the predictive extrapolation more closely.
These are all based on the known Earth wobble and lunar tidal periods and really confirm that ENSO is a nearly pure deterministic stationary process driven by known geophysical forcings. And like the tidal models that this ENSO model emulates, the longer the period to extract from and the more lunar periods that are included, the better the fit becomes.
It only deviates from stationary determinism in terms of the odd vs even parity of the biennial modulation. The biennial modulation flips from an even-year parity before 1980 to an odd-year parity between the years 1980-1996. I only lack confidence in how to predict these flips, much like I lack the confidence to predict volcanic events, which seem to impact the QBO sporadically.
Here is another clincher for long-term deterministic stationary properties of ENSO. Some time ago I was running machine-learning on the Unified ENSO Proxy (UEP) records. This goes back to the year 1650 and is an annual record. Interestingly, the primary component that the symbolic reasoning finds is the aliased anomalistic tide!
If I overlay the part of the UEP fit that overlaps the modern day SOI records, we get this.
Note that this is trivial to do as the symbolic reasoner provides a sinusoidal function.
The aliased frequency of 7.821 rads/year that the UEP fit maps to is precisely the aliased anomalistic factor (a period of 4.085 years, or 2π/4.085 ≈ 1.538 rads/year) shifted by 2π rads/year, which is the sampling frequency of the annual record.
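A quick check of that arithmetic, assuming 4.085 is the aliased anomalistic period in years and that the shift is one cycle per year (2π rads/year) because the UEP is an annual record. On integer-year samples the two frequencies are indistinguishable, which is what aliasing means here.

```python
import math

period_years = 4.085                 # aliased anomalistic period, in years
base = 2 * math.pi / period_years    # angular frequency in rads/year
shifted = base + 2 * math.pi         # add one annual sampling frequency

print(round(base, 3), round(shifted, 3))   # 1.538 7.821

# On annual samples (integer n) the two sinusoids coincide.
max_diff = max(abs(math.sin(base * n) - math.sin(shifted * n)) for n in range(50))
```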
The phase reversal idea came from consideration of this disturbance with respect to a biennial mode, with supplemental references here:
And this wavelet graph by Roundy shows how standing modes of the SST got inverted after 1980 but came back in sync around 2000:
Roundy, P.E. "On the interpretation of EOF analysis of ENSO, atmospheric Kelvin waves, and the MJO." Journal of Climate 28.3 (2015): 1148-1165.
I think the phase flip and biennial mode go hand-in-hand. I have noticed a recent spate of papers on the biennial mode of ENSO. Here is a very recent one by NASA Goddard scientists:
Achuthavarier, Deepthi, Siegfried D. Schubert, and Yury V. Vikhliaev. "North Pacific decadal variability: insights from a biennial ENSO environment." Climate Dynamics (2016): 1-19.
The above excerpt is the typical explanation via a wordy rationale of how a succeeding year is prevented by the previous year from cycling, thus creating a biennial period.
Which differs from the purely mathematical explanation of a Mathieu sloshing formulation showing an inherent period doubling (or frequency halving), such as described here by a group at CNRS in France:
Rajchenbach, Jean, and Didier Clamond. "Faraday waves: their dispersion relation, nature of bifurcation and wavenumber selection revisited." Journal of Fluid Mechanics 777 (2015): R2.
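For reference, the textbook form of a Mathieu-type equation is shown here. This is the generic parametric oscillator, not necessarily the exact formulation used in the fits, and the symbols ω₀, ε and Ω are introduced only for this illustration.

```latex
\ddot{x}(t) + \omega_0^{2}\,\bigl[\,1 + \epsilon \cos(\Omega t)\,\bigr]\,x(t) = 0,
\qquad \text{principal resonance near } \Omega \approx 2\,\omega_0 .
```

In the principal parametric resonance the response oscillates at Ω/2, half the modulation frequency. With an annual modulation (Ω = 2π rads/year) that subharmonic has a two-year period, which is the period doubling referred to above.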
What I have been doing is noting the empirical observations from the climate scientists and then tying that into the math that the physicists and engineers have been developing for other hydrodynamics applications. What's interesting is that these research efforts are concurrently advancing and the timing is perfect to tie the hydrodynamics concepts to the biennial ENSO concept.
There are two key technical approaches that I am using in doing these DiffEq fits.
One is to use the NINO34 data in place of the SOI data for calculating the 2nd-derivative only. The SOI data is very noisy, while NINO34 is smooth enough to avoid the significant amount of noise that two successive differentiations would otherwise introduce.
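To illustrate why the second derivative is the sensitive step, here is a sketch on synthetic data (not SOI or NINO34). The central-difference stencil, the 4-year test sinusoid, and the noise level are all assumptions for illustration.

```python
import numpy as np

def second_derivative(y, dt):
    """Central second difference; the two endpoints are left as NaN."""
    y = np.asarray(y, dtype=float)
    d2 = np.full_like(y, np.nan)
    d2[1:-1] = (y[2:] - 2.0 * y[1:-1] + y[:-2]) / dt**2
    return d2

dt = 1.0 / 12.0                       # monthly sampling, in years
t = np.arange(0, 20, dt)
w = 2 * np.pi / 4.0                   # 4-year test period
clean = np.sin(w * t)
rng = np.random.default_rng(1)
noisy = clean + 0.1 * rng.standard_normal(t.size)

exact = -w**2 * clean                 # analytic second derivative
err_clean = np.nanstd(second_derivative(clean, dt) - exact)
err_noisy = np.nanstd(second_derivative(noisy, dt) - exact)
```

Even with noise at 10% of the signal amplitude, `err_noisy` comes out far larger than the true second derivative itself, while `err_clean` stays small. That is the motivation for taking the second derivative from the smoother series.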
Second is to apply a hybrid goodness-of-fit metric. The hybrid nature of the metric is that it combines a correlation coefficient (which emphasizes shape of the time series) and absolute error minimization (which reduces the relative error).
The hybrid is essentially
So my process is to apply the hybrid metric first to the Solver. Once that converges to get the scale right, I run a correlation coefficient goal to emphasize the shape. And when that converges, I run the hybrid again to reduce the relative error, which is really an energy minimization -- in that a good correlation needs to be balanced by minimum energy.
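The exact form of the hybrid is not reproduced in the text, so the following is only one hypothetical way to combine the two ingredients: a correlation coefficient penalized by the mean absolute error. The function name `hybrid_metric` and the `weight` parameter are assumptions, not the metric actually used.

```python
import numpy as np

def hybrid_metric(model, data, weight=1.0):
    """Hypothetical score (larger is better): correlation minus weighted mean absolute error."""
    model = np.asarray(model, dtype=float)
    data = np.asarray(data, dtype=float)
    cc = np.corrcoef(model, data)[0, 1]
    mae = np.mean(np.abs(model - data))
    return cc - weight * mae

t = np.arange(0, 30, 1 / 12)
data = np.sin(2 * np.pi * t / 6.41)

score_exact = hybrid_metric(data, data)            # right shape, right scale
score_scaled = hybrid_metric(3.0 * data, data)     # right shape, wrong scale
cc_scaled = np.corrcoef(3.0 * data, data)[0, 1]    # correlation alone cannot tell
```

The point of the sketch is that a correlation coefficient alone is blind to scale (`cc_scaled` is essentially 1), whereas a combined score separates the correctly scaled fit from the mis-scaled one.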
With the Excel Solver, the process takes a few hours total and I let it run in the background while I do other stuff. The linear multiple regression solver only takes a second but it can't handle the nonlinear Mathieu modulation, so that is tweaked manually to get a good fit. But the Solver approach is great in that I can start with a completely blank slate and find a largely reproducible solution in just a few hours.
What the biennial Mathieu modulation does is exaggerate the peaks and valleys until the forcing RHS matches the DiffEq LHS. The Solver iterates on the two sides until it converges to achieve an optimal metric.
And always the caveat that there is a biennial phase inversion of ENSO between the years 1980-1996. Without that premise, the fit would not work, which is again the likely reason that the underlying model has escaped notice all these years. Try doing any kind of fit on recent data of the last 50 years without inverting the phase over that 16-year interval and all you will find is anti-correlations, and you will soon give up. But as Ronald Coase said in the early 1960's "if you torture the data enough, nature will confess".
Another model training fit comparison, where the data included goes back to 1895. The earlier data looked noisy but I wanted to see how robust the fitting process is.
Periodic factor comparisons, they all match apart from the average magnitude, which varies during the solver process but is in arbitrary units anyways.
With these experiments, I am trying to stick a pin in the ENSO model and go into the presentation with some confidence.
One result I noticed from the independent low and high model fits is that a slight temporal shift exists between the two profiles. This is best observed as a ~1.5 month difference in the Mathieu modulations (i.e. the LHS modulation of the DiffEq):
This translates to the same shift in the forcing factors (i.e. the RHS of the DiffEq). So if I translate each of the factor curves by either 1 or 2 months, this is the alignment I get:
Note that the RED curve is the high interval and the BLUE is the low interval, so the stronger the blue, the greater the degree to which the two agree, since the red will hide behind the blue. The correlation coefficients are shown as well.
So consider that this is the result of about two hours of the Excel Solver grinding away on an optimal solution for each interval. There was no overlap between the two intervals, yet we still get this stunning a result. The 1.5 month discrepancy may essentially be the uncertainty in the collection of the ENSO data. I don't think it's significant as this could be simply a local minimum that the Solver converged to due to inherent noise in the data.
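A sketch of how a shift of a month or two between two factor curves could be located, by scanning integer-month lags for the highest correlation coefficient. The curves here are synthetic, and the function name `best_lag` and the ±6 month search window are assumptions for illustration.

```python
import numpy as np

def best_lag(a, b, max_lag=6):
    """Return (lag, cc) maximizing the correlation of a[i] with b[i + lag]."""
    n = len(a)
    best = (0, -np.inf)
    for lag in range(-max_lag, max_lag + 1):
        if lag >= 0:
            x, y = a[:n - lag], b[lag:]
        else:
            x, y = a[-lag:], b[:n + lag]
        cc = np.corrcoef(x, y)[0, 1]
        if cc > best[1]:
            best = (lag, cc)
    return best

t = np.arange(0, 40, 1 / 12)                      # monthly samples, in years
low = np.sin(2 * np.pi * t / 6.41)
high = np.sin(2 * np.pi * (t - 2 / 12) / 6.41)    # same curve delayed by 2 months
lag, cc = best_lag(low, high)
```

With monthly data only whole-month lags are available, so a ~1.5 month offset would show up as a best lag of either 1 or 2 months.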
Because it takes a while to run these trials, I haven't tried too many other sets of input parameters. Yet if I take some arbitrary values to substitute for the ( 6.41, 14.6, 4.085, 18.6, 9.3 ) set, the resulting fit is horrible and there is absolutely no correlation in the Low vs High output factors. There may in fact be another set that works as well, but the periods would have to be physically significant to make any sense in terms of the periodic geophysical forcing mechanisms at work.
The remaining two factors that I can add are derived second-order tidal factors corresponding to 5.643 and 3.447 years, which are associated with the fortnightly long period tides stemming from nonlinear multiplicative interactions of the nodal, anomalistic, and tropical monthly periods. These are critical for achieving highly precise tidal predictions and so I would think they would be relevant here as well. But I have to be careful of overfitting at this stage. These would require twice as long an interval for convergence and so I would lose the low vs high training validation.
So how might the 5.643 and 3.447 year aliased tidal factors contribute to the ENSO model fit? I needed to increase the interval width at the expense of overlapping the Low and High intervals. The reason for this is that 5.643 is relatively close to 6.41 and so the discrimination region needs to be wider.
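A rough way to put a number on this, assuming the usual rule of thumb that two sinusoids are only cleanly separable once the record is long enough for them to drift apart by about one full cycle (a record length of roughly 1/Δf). The helper name `min_span_years` is hypothetical.

```python
def min_span_years(p1, p2):
    """Approximate record length (years) for two periods to differ by one full cycle."""
    return 1.0 / abs(1.0 / p1 - 1.0 / p2)

span = min_span_years(5.643, 6.41)
print(round(span, 1))   # 47.2
```

By that rule of thumb the 5.643 and 6.41 year factors need on the order of 47 years of data to be told apart, which is consistent with having to widen the training intervals.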
This factor does markedly improve the overall fit, as you can see in how well both the Low and High fits capture the 2016 El Niño peak.
Shown next are the correlations for the individual factors. The 5.643 period correlation Low vs High is at the bottom. The CC values are all ~0.8 or greater.
In contrast, as shown next, the 3.447 period correlation Low vs High is not strong and the contribution to the fit is relatively weak.
This does not imply the factor is not there, only that more work is needed to reveal it. The Mathieu modulation converges as well, but this is at least partly due to the overlapping interval fit: