# Last post on ENSO

The last of the ENSO charts.

This is how conventional tidal prediction is done:

Note how well it does in extrapolating a projection from a training interval.
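As a rough illustration of the approach, here is a minimal sketch of harmonic tidal fitting: a least-squares fit of sinusoids at known periods over a training interval, then extrapolated beyond it. The data are synthetic, and the function names, amplitudes, and noise level are assumptions made for the example rather than anything taken from the charts.

```python
import numpy as np

def design_matrix(t, periods):
    """Columns: a constant, then a cos/sin pair per period (same units as t)."""
    cols = [np.ones_like(t)]
    for p in periods:
        w = 2.0 * np.pi / p
        cols += [np.cos(w * t), np.sin(w * t)]
    return np.column_stack(cols)

def fit_and_extrapolate(t, y, periods, train_mask):
    """Least-squares fit on the training samples, prediction over all of t."""
    A = design_matrix(t, periods)
    coef, *_ = np.linalg.lstsq(A[train_mask], y[train_mask], rcond=None)
    return A @ coef

# Synthetic demo: two semi-diurnal constituents (periods in hours) plus noise.
rng = np.random.default_rng(0)
t = np.arange(0.0, 24.0 * 60, 1.0)            # 60 days, hourly samples
periods = [12.4206, 12.0]                     # M2 and S2 periods
truth = (1.0 * np.cos(2 * np.pi * t / 12.4206 - 0.3)
         + 0.4 * np.cos(2 * np.pi * t / 12.0 + 1.1))
y = truth + 0.1 * rng.standard_normal(t.size)

train = t < 24.0 * 30                         # train on the first 30 days
pred = fit_and_extrapolate(t, y, periods, train)

# Correlation coefficient on the held-out (extrapolated) interval only.
cc_test = np.corrcoef(pred[~train], y[~train])[0, 1]
print(f"CC on extrapolated interval: {cc_test:.3f}")
```

Because the periods are fixed in advance and only amplitudes and phases are fitted, there are few free parameters, which is one reason this kind of fit tends to extrapolate well.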

This is an ENSO model fit to SOI data using an analytical solution to Navier-Stokes. The same algorithm is used to solve for the optimal forcing as in the tidal analysis solution above, but applying the annual solar cycle and monthly/fortnightly lunar cycles instead of the diurnal and semi-diurnal cycles.

The time scale transitions from a daily modulation to a much longer modulation due to the long-period tidal factors being invoked.

Next is an expanded view, with the correlation coefficient of 0.73:

This is a fit trained on the 1880-1950 interval (CC=0.76) and cross-validated on the post-1950 data.

This is a fit trained on the post-1950 interval (CC=0.77) and cross-validated on the 1880-1950 data.

Like conventional tidal prediction, very little over-fitting is observed. Most of what is considered noise in the SOI data is actually the tidal forcing signal. Not much more to say, except for others to refine.

Thanks to Kevin and Keith for all their help, which will be remembered.

# Correlation Coefficient of ENSO Power Spectra

The model fit to ENSO takes place in the time domain. However, the correlation coefficient between model and data of the corresponding power spectra is higher than in the time series. Below in Figure 1 the CC is 0.92, while the CC in the time series is 0.82.

Fig. 1: Power spectra of ENSO data against model
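To make the comparison concrete, here is a small sketch that computes the correlation coefficient both in the time domain and between power spectra. It uses a synthetic two-tone "model" plus white noise as a stand-in for the data; the periods, amplitudes, and noise level are hypothetical and are not the ENSO fit itself.

```python
import numpy as np

def power_spectrum(x):
    """One-sided power spectrum of a mean-removed series."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    return np.abs(np.fft.rfft(x)) ** 2

def cc(a, b):
    """Pearson correlation coefficient between two equal-length arrays."""
    return float(np.corrcoef(a, b)[0, 1])

rng = np.random.default_rng(1)
t = np.arange(1632) / 12.0                    # monthly samples, in years
model = (np.sin(2 * np.pi * t / 3.8)
         + 0.7 * np.sin(2 * np.pi * t / 2.3 + 0.5))
data = model + 0.8 * rng.standard_normal(t.size)   # synthetic stand-in for data

cc_time = cc(model, data)
cc_spec = cc(power_spectrum(model), power_spectrum(data))
print(f"time-domain CC: {cc_time:.2f}, power-spectrum CC: {cc_spec:.2f}")
```

In this toy case the spectral CC comes out higher because the signal power is concentrated in a few narrow peaks while the noise power is spread thinly across all frequency bins.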

The model allows only 3 fundamental lunar frequencies along with the annual cycle, plus the harmonics caused by the non-linear orbital path and the seasonally impulsed modulation.

What this implies is that almost all the peaks in the power spectra shown above are caused by interactions of these 4 fundamental frequencies. Figure 2 shows a satellite view of peak splitting (also shown here).

Fig. 2: Frequency sideband plot identifying components created by modulation of a biennial cycle with the lunar cycles (originally described here).
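As a hedged illustration of why modulation produces paired sidebands, assume for simplicity a purely sinusoidal biennial modulation at $f_B = 0.5$ cycles/year acting on a single lunar-derived component at frequency $f_L$ (both symbols are introduced here for illustration only). The product-to-sum identity gives

$\cos(2\pi f_B t)\,\cos(2\pi f_L t) = \tfrac{1}{2}\left[\cos\big(2\pi (f_L - f_B) t\big) + \cos\big(2\pi (f_L + f_B) t\big)\right]$

so each such component would appear as a pair of peaks at $f_L \pm 0.5$ cycles/year. A seasonally impulsed modulation is not a pure sinusoid, so it would add further harmonics beyond this single pair.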

One of the reasons that the power spectrum gives a higher correlation coefficient — despite the fact that the spectrum wasn't used in the fit — is that the lunar tides are precisely determined and thus all the harmonics should align well in the frequency domain. And that's what is observed with the multiple-peak alignment.

Furthermore, according to Ref [1], this result is definitely not a characteristic of a noise-driven system, and the system also possesses a very low dimension of chaotic content. The same frequency content is observed largely independent of the prediction time profile, i.e. training interval.

## References

1. Bhattacharya, Joydeep, and Partha P. Kanjilal. "Revisiting the role of correlation coefficient to distinguish chaos from noise." The European Physical Journal B-Condensed Matter and Complex Systems 13.2 (2000): 399-403.

# NINO34 vs SOI

This is an experiment to compare training runs from 1880 to 1980 of the ENSO model against both the NINO34 time-series data and the SOI data. The solid red curves are the extrapolated cross-validation interval.

NINO34

SOI

Many interesting inferences can potentially be drawn from these comparisons. The SOI signal appears more noisy, but that could actually be signal. For example, the NINO34 extrapolation pulls out a split peak near 2013-2014, which does show up in the SOI data. And a discrepancy in the NINO34 data near 1934-1935, which predicts a minor peak, is essentially noise in the SOI data. The 1984-1986 flat valley region is much lower in NINO34 than in SOI, where it hovers around 0. The model splits the difference in that interval, doing a bit of both. And the 1991-1992 valley predicted in the model is not clear in the NINO34 data, but does show up in the SOI data.

Of course these are subjectively picked samples, yet there may be some better combination of SOI and NINO34 that one can conceive of to get a better handle on the true ENSO signal.


# GC41B-1022: Biennial-Aligned Lunisolar-Forcing of ENSO: Implications for Simplified Climate Models

In the last month, two great citizen scientists to whom I will be forever personally grateful have passed away. If you have followed climate science discussions on blogs and social media, you have probably seen their contributions.

Keith Pickering was an expert on computer science, astrophysics, energy, and history from my neck of the woods in Minnesota. He helped me so much in working out orbital calculations when I was first looking at lunar correlations. He provided source code that he developed, and it was a great help in getting up to speed. He was always there to tweet any progress made. Thanks, Keith.

Kevin O'Neill was a metrologist and an analysis whiz from Wisconsin. In the weeks before he passed, he told me that he had extra free time to help out with ENSO analysis. He wanted to use his remaining time to help out with the solver computations. I could not believe the effort he put into his spreadsheet, and it really motivated me to spend more time validating the model. He was up all the time working on it because he was unable to lie down. Kevin was also there to promote the research on other blogs, right to the end. Thanks, Kevin.

There really aren't too many people willing to spend time working analysis on a scientific forum, and these two exemplified what it takes to really contribute to the advancement of ideas. Like us, they were not climate science insiders and so will only get credit if we remember them.

# Derivation of an ENSO model using Laplace's Tidal Equations

Laplace developed his namesake tidal equations to mathematically explain the behavior of tides by applying straightforward Newtonian physics. In their expanded form, known as the primitive equations, Laplace's starting formulation is used as the basis of almost all detailed climate models. Since that's what they are designed to do, this post provides the details for solving Laplace's tidal equations in the context of the El Nino Southern Oscillation (ENSO) of the equatorial Pacific ocean. The derivation and results shown below essentially describe the framework of my presentation at this month's AGU meeting: Biennial-Aligned Lunisolar-Forcing of ENSO: Implications for Simplified Climate Models

The concise derivation for a model of ENSO depends on reducing Laplace's tidal equations along the equator. I could not find anyone taking a similar approach anywhere in the literature, even though it appears to be routinely obvious: (1) solve Laplace's tidal equations in a simplified context, then (2) apply the known tidal forcing and observe whether the result correlates with or matches the ENSO time series. In fact, it does, as I have shown before (and for QBO as well); but this is the first time that I have worked out the details in full for ENSO. Below is a two-part solution.

# Machine Learning and the Climate Sciences

I've been applying equal doses of machine learning (and knowledge-based artificial intelligence in general) and physics in my climate research since day one. Next month, on December 12, I will be presenting Knowledge-Based Environmental Context Modeling at the AGU meeting, which will cover these topics within the earth sciences realm:

Table 1: Technical approach to knowledge-based model building for the earth sciences

In my opinion, machine learning likely will eventually find all the patterns that appear in climate time-series but with various degrees of human assistance.

"Vipin Kumar, a computer scientist at the University of Minnesota in Minneapolis, has used machine learning to create algorithms for monitoring forest fires and assessing deforestation. When his team tasked a computer with learning to identify air-pressure patterns called teleconnections, such as the El Niño weather pattern, the algorithm found a previously unrecognized example over the Tasman Sea."

In terms of the ENSO pattern, I believe that machine learning through tools such as Eureqa could have found the underlying lunisolar forcing pattern, but would have struggled mightily to break through the complexity barrier. In this case, the complexity barrier is in (1) discovering a biennial modulation which splits all the spectral components and (2) discovering the modifications to the lunar cycles from a strictly sinusoidal pattern.

The way that Eureqa would have found this pattern would be through its symbolic regression algorithm (which falls under the first row in Table 1 shown above). It essentially would start its machine learning search by testing various combinations of sines and cosines and capturing the most highly correlated combinations for further expansion. As it expands the combinations, the algorithm would try to reduce complexity by applying trigonometric identities such as this:

${\displaystyle \sin(\alpha \pm \beta )=\sin \alpha \cos \beta \pm \cos \alpha \sin \beta }$

After a while, the algorithm will slow down under the weight of the combinatorial complexity of the search, and then the analyst would need to choose promising candidates from the complexity versus best-fit Pareto front. At this point one would need to apply knowledge of physical laws or mathematical heuristics which would lead to a potentially valid model.
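To show what choosing from a complexity versus best-fit Pareto front involves, here is a minimal sketch that filters a list of candidate expressions down to the non-dominated ones. The candidate tuples, complexity scores, and error values are invented for illustration and do not come from Eureqa.

```python
def pareto_front(candidates):
    """Keep candidates not dominated in (complexity, error); lower is better.

    Each candidate is a (complexity, error, label) tuple. After sorting by
    complexity, a candidate survives only if its error beats every simpler one.
    """
    front = []
    best_error = float("inf")
    for complexity, error, label in sorted(candidates):
        if error < best_error:
            front.append((complexity, error, label))
            best_error = error
    return front

# Hypothetical candidates: (complexity score, fit error, expression)
candidates = [
    (1, 0.90, "sin(w1*t)"),
    (3, 0.55, "sin(w1*t) + sin(w2*t)"),
    (4, 0.60, "sin(w1*t) * cos(w2*t)"),
    (7, 0.30, "sin(w1*t + A*sin(w2*t))"),
    (9, 0.31, "sin(w1*t + A*sin(w2*t)) + c"),
]

for complexity, error, label in pareto_front(candidates):
    print(complexity, error, label)
```

The analyst would then inspect only the surviving candidates, which is where knowledge of physical laws or mathematical heuristics comes in.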

So, in the case of the ENSO model, Eureqa could have discovered (1) the biennial modulation by reducing sets of trigonometric identities, and perhaps (2) the second-order modifications to the sinusoidal functions by applying a sin(A sin()) frequency modulation (which it is capable of), or (3) it could have been fed a differential equation structure to provide a hint to a solution. But a human got there first by applying prior knowledge of signal processing and of the details in the orbital lunar cycles.

Yet as the Scientific American article suggests, that will likely not be the case in the future as the algorithms continue to improve and update their knowledge base with laws of physics.

This more sophisticated kind of reasoning involves the refined use of the other elements of Table 1.  For example, a more elaborate algorithm could have lifted an entire abstraction level out of a symbolic grouping and thus reduced its complexity. Or it could try to determine whether a behavior was stochastic or deterministic.  The next generation of these tools will be linked to knowledge-bases filled with physics patterns that are organized for searching and reasoning tasks. These will relate the problem under study to potential solutions automatically.

# High Resolution ENSO Modeling

An intriguing discovery is that the higher-resolution aspects of the SOI time-series (as illustrated by the Australian BOM 30-day SOI moving average) may also have a tidal influence.  Note the fast noisy envelope that rides on top of the deep El Nino of 2015-2016 shown below:

For the standard monthly SOI as reported by NCAR and NOAA, this finer detail disappears. BOM provides the daily SOI value for roughly the past 3 years here.

Yet if we retain this in the 1880-present monthly ENSO model, by simultaneously isolating [1] the higher frequency fine structure from 2015-2017, the fine structure also emerges in the model. This is shown in the lower panel below.

This indicates that the differential equation currently being used can possibly be modified to include faster-responding derivative terms, which will simultaneously show the multi-year fluctuations as well as what was thought to be a weekly-to-monthly-scale noise envelope. In fact, I had been convinced that this term was due to localized weather, but a recent post suggested that this may indeed be a deterministic signal.

Lunisolar tidal effects likely do impact the ocean behavior at every known time-scale, from the well-characterized diurnal and semi-diurnal SLH tides to the long-term deep-ocean mixing proposed by Munk and Wunsch.  It's not surprising that tidal forces would have an impact on the intermediate time-scale ENSO dynamics, both at the conventional low resolution (used for El Nino predictions) and at the higher-resolution that emerges from SOI measurements (the 30-day moving average shown above).  Obviously, monthly and fortnightly oscillations observed in the SOI are commensurate with the standard lunar tides of periods 13-14 days and 27-28 days. And non-linear interactions may result in the 40-60 day oscillations observed in LOD.

from Earth Rotational Variations Excited by Geophysical Fluids, B.F. Chao, http://ivs.nict.go.jp/mirror/publications/gm2004/chao/

It's entirely possible that removing the 30-day moving average on the SOI measurements can reveal even more detail.

## Footnote

[1] Isolation is accomplished by subtracting a 24-day average about the moving average value, which suppresses the longer-term SOI variation.
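A minimal sketch of this kind of isolation follows, assuming daily samples and a centered 24-sample window; the function name and the synthetic test series are hypothetical, and the exact windowing used for the figure may differ.

```python
import numpy as np

def isolate_fine_structure(x, window=24):
    """Subtract a centered moving average of `window` samples from x.

    np.convolve with mode="same" zero-pads the ends, so roughly the first
    and last `window` samples are biased and should be discarded.
    """
    x = np.asarray(x, dtype=float)
    kernel = np.ones(window) / window
    smooth = np.convolve(x, kernel, mode="same")
    return x - smooth

# Synthetic check: a slow swing plus a fast fortnightly-scale ripple.
days = np.arange(1000.0)
slow = 2.0 * np.sin(2 * np.pi * days / 1000.0)
fast = 0.5 * np.sin(2 * np.pi * days / 13.66)

resid = isolate_fine_structure(slow + fast)
interior = slice(50, -50)                     # skip the biased edges
cc_fast = np.corrcoef(resid[interior], fast[interior])[0, 1]
print(f"CC between residual and fast component: {cc_fast:.3f}")
```

The subtraction acts as a simple high-pass filter: the slow variation is nearly unchanged by the averaging and so cancels, while components faster than the window length pass through.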

# The ENSO Forcing Potential - Cheaper, Faster, and Better

Following up on the last post on the ENSO forcing, this note elaborates on the math.  The tidal gravitational forcing function used follows an inverse power-law dependence, where a(t) is the anomalistic lunar distance and d(t) is the draconic or nodal perturbation to the distance.

$F(t) \propto \left(\frac{1}{(R_0 + a(t) + d(t))^2}\right)'$

Note the prime indicating that the forcing applied is the derivative of the conventional inverse squared Newtonian attraction. This generates an inverse cubic formulation corresponding to the consensus analysis describing a differential tidal force:

$F(t) \propto -\frac{a'(t)+d'(t)}{(R_0 + a(t) + d(t))^3}$

For a combination of monthly and fortnightly sinusoidal terms for a(t) and d(t) (suitably modified for nonlinear nodal and perigean corrections due to the synodic/tropical cycle), the search routine rapidly converges to an optimal ENSO fit. It does this more quickly than the harmonic analysis, which requires at least double the unknowns for the additional higher-order factors needed to capture the tidally forced response waveform. One of the keys is to collect the chain rule terms a'(t) and d'(t) in the numerator; without these, the necessary mixed terms which multiply the anomalistic and draconic signals do not emerge strongly.
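The following sketch evaluates the inverse-cubic forcing with the chain rule terms in the numerator and checks it against a numerical derivative of the inverse-square term. It assumes plain sinusoids for a(t) and d(t) at the anomalistic and draconic month periods, with a normalized distance and made-up amplitudes, and it omits the nonlinear nodal and perigean corrections mentioned above.

```python
import numpy as np

R0 = 1.0  # normalized mean Earth-Moon distance (assumed units)

def orbit_terms(t, amp_a=0.05, amp_d=0.02, per_a=27.5545, per_d=27.2122):
    """Return a(t)+d(t) and a'(t)+d'(t) for plain sinusoids (t in days).

    per_a and per_d are the anomalistic and draconic months; the
    amplitudes are hypothetical values chosen for illustration.
    """
    wa, wd = 2 * np.pi / per_a, 2 * np.pi / per_d
    r = amp_a * np.cos(wa * t) + amp_d * np.cos(wd * t)
    dr = -amp_a * wa * np.sin(wa * t) - amp_d * wd * np.sin(wd * t)
    return r, dr

def forcing(t, **kw):
    """F(t) proportional to -(a'(t) + d'(t)) / (R0 + a(t) + d(t))**3."""
    r, dr = orbit_terms(t, **kw)
    return -dr / (R0 + r) ** 3

t = np.arange(0.0, 365.0, 0.01)
r, _ = orbit_terms(t)

# d/dt [1/(R0+r)^2] = -2 r'/(R0+r)^3, so half of it should match forcing(t).
numeric = 0.5 * np.gradient(1.0 / (R0 + r) ** 2, t)
max_err = np.max(np.abs(forcing(t)[1:-1] - numeric[1:-1]))
print(f"max difference from numerical derivative: {max_err:.2e}")
```

The constant factor of 2 from the differentiation is absorbed by the proportionality, which is why the analytic expression is compared against half of the numerical derivative.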

As before, a strictly biennial modulation needs to be applied to this forcing to capture the measured ENSO dynamics — this is a period-doubling pattern observed in hydrodynamic systems with a strong fundamental (in this case annual) and is climatologically explained by a persistent year-to-year regenerative feedback in the SLP and SST anomalies.

Here is the model fit for training from 1880-1980, with the extrapolated test region post-1980 showing a good correlation.

The geophysics is now canonically formulated, providing (1) a simpler and more concise expression, leading to (2) a more efficient computational solution, (3) less possibility of over-fitting, and (4) ultimately generating a much better correlation. Alternatively, stated in modeling terms, the resultant information metric is improved by reducing the complexity and improving the correlation: the vaunted cheaper, faster, and better solution. Or, in other words: get the physics right, and all else follows.

# Is the SOI noisy or is it signal?

Applying the analytical solution to Laplace's tidal equations, we can isolate the parts of the Southern Oscillation Index signal that appear quite noisy (e.g. 1880-1885 and 1900-1905).

For this 3-month averaged SOI fit, it's a sin(sin(f(t))) function in the ENSO model that generates the folded signal, which appears as a rapidly fluctuating and noisy signal. Although my simplification of Laplace's equation was originally applied to QBO, it is applicable to other equatorial standing wave phenomena such as ENSO, of which the SOI is a measure. The SOI signal has always been considered noisy, especially in contrast to other ENSO measures such as NINO34, but perhaps this needs to be rethought, as the higher-frequency components may be real signal.
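To see how a nested sinusoid can look noisy, here is a small sketch that compares the number of zero crossings of a plain sinusoid with those of sin(A sin(...)). The single-tone inner function and the modulation depth A are hypothetical stand-ins, not the fitted f(t) of the model.

```python
import numpy as np

def zero_crossings(x):
    """Count sign changes between consecutive samples."""
    return int(np.sum(x[:-1] * x[1:] < 0))

t = np.linspace(0.05, 4.05, 4000)     # four periods of the inner sinusoid
inner = np.sin(2 * np.pi * t)         # stand-in for the inner forcing term
A = 10.0                              # hypothetical modulation depth
folded = np.sin(A * inner)            # outer sine folds the large excursions

n_inner = zero_crossings(inner)
n_folded = zero_crossings(folded)
print(f"zero crossings: inner={n_inner}, folded={n_folded}")
```

When the argument of the outer sine sweeps through several multiples of pi, the output folds back on itself many times per cycle of the inner term, so a smooth deterministic input yields a rapidly fluctuating output.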

These results will be presented at next month's AGU meeting:

# Approximating the ENSO Forcing Potential

In the last post, we tried to estimate the lunar tidal forcing potential from the fitted harmonics of the ENSO model. Two observations resulted from that exercise: (1) the possibility of over-fitting to the expanded Taylor series, and (2) the potential of fitting to the ENSO data directly from the inverse power law.

The Taylor's series of the forcing potential is a power-law polynomial corresponding to the lunar harmonic terms. The chief characteristic of the polynomial is the alternating sign for each successive power (see here), which has implications for convergence under certain regimes. What happens with the alternating sign is that each of the added harmonics will highly compensate the previous underlying harmonics, giving the impression that pulling one signal out will scramble the fit. This is conceptually no different than eliminating any one term from a sine or cosine Taylor's series, which are also all compensating with alternating sign.

The specific condition that we need to be concerned about with respect to series convergence is when r (perturbations to the lunar orbit) is a substantial fraction of R (distance from earth to moon):

$F(r) = \frac{1}{(R+r)^3}$

Because we need to keep those terms for high precision modeling, we also need to be wary of possible over-fitting of these terms — as the solver does not realize that the values for those terms have the constraint that they derive from the original Taylor's series. It's not really a problem for conventional tidal analysis, as the signals are so clean, but for the noisy ENSO time-series, this is an issue.
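For reference, the binomial expansion of the inverse cubic makes the alternating signs explicit (valid for $|r| < R$):

$\frac{1}{(R+r)^3} = \frac{1}{R^3}\left(1 - 3\frac{r}{R} + 6\frac{r^2}{R^2} - 10\frac{r^3}{R^3} + \cdots\right)$

The coefficient of the $k$-th term is $(-1)^k \frac{(k+1)(k+2)}{2}$, so its magnitude grows with $k$. When $r/R$ is not small, successive terms stay comparable in size while alternating in sign, which is the compensation between harmonics described above.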

Of course the solution to this predicament is not to do the Taylor series harmonic fitting at all, but leave it in the form of the inverse power law. That makes a lot of sense — and the only reason for not doing this until now is probably due to the inertia of conventional wisdom, in that it wasn't necessary for tidal analysis where harmonics work adequately.

So this alternate and more fundamental formulation is what we show here.