Is there such a thing as too simple a model? Take a look at this fit to ENSO
My original approach to reconstructing the SOI time series using basis Mathieu functions is useful but not the only way to do the modeling. I believe that I took that as far as I could go and so decided to bite the bullet and solve the SOIM (Southern Oscillation Index Model) differential equations numerically. What pops out is a gloriously simple climate model that fits the data remarkably well -- and BTW, one of those models deemed not to exist by the climate science deniers.
What follows is the description of the model and how it connects to the previously described Chandler Wobble along with a crucial link to the quasi-biennial oscillation (QBO). Some type of forcing is responsible for providing the necessary energy to get the ENSO sloshing behavior going, and these are as good of candidates as any.
Read on to see how it can't possible be done 🙂
The SOIM differential equation that we want to solve numerically was introduced here. If we arrange it so that the left hand side is the perturbed wave equation, and the right-hand side is a forcing function, F(t), this gives us a non-linear diffEq (shorthand for differential equation) that will require numerical techniques to solve for anything more sophisticated than a delta impulse forcing stimulus, F(t) = δ(t).
This equation also describes a known variation of sloshing behavior in a fluid-filled tank . I will at some later point reproduce the derivation, but for now read Frandsen's thorough treatise, and in particular pay attention to how the forcings, vertical and horizontal,are introduced. (Note that I really don't care that the math has only been previously applied to tanks built on a human scale. Just because we are dealing with a "tank" the size of the Pacific Ocean doesn't change the underlying math and physics)
In trying to solve this equation we are definitely entering the computational physics realm, where industrial strength computer science techniques are required for finding solutions . What I lean on right now is a combination of the NonLinearModelFit and ParametricNDSolve capability available in Mathematica. After that, I use the Eureqa tool exploratory model explanation tool to identify that my nonlinear equation is on the Pareto frontier of plausible solutions.
The basic premise is that we generate a forcing function that is a combination of a Chandler Wobble-related inertial mechanism with radial frequency W and a QBO-related forcing of frequency Q. We know that the Chandler Wobble will be around a 6 year period and that the QBO frequency will be around 2.3 to 2.4 years. The periodic perturbation in the wave equation, ω, is also assumed to be approximately 6 years based on the earlier analysis assuming an impulse response only. So we have this equation that we want to solve numerically
The initial conditions of x(0) and x'(0) are also included but are weaker terms than the long-term cyclic forcings. This has to do with the fact that climate is based on boundary conditions whereas weather is based on initial conditions as described by Andrew Lacis  . In this case, the boundary conditions are set by the ENSO forcing functions, which will synchronize over time, thereby over-riding the initial conditions of any point in time.
Characterization of the SOI using the SOIM differential equation
The initial temptation is to apply this equation to the entire SOI time series and see if we can discover any kind of correlation at all. That will work, but we want to reduce the state search space beforehand. So the first tactic is to model the QBO dataset. The QBO data starting from the year 1953 is available here. Figure 2 below shows the result of a best fit to the QBO data using a few Fourier series terms. We will use this as an initial seed to the solver and then later come back to verify any adjustments made.
The next step is to generate a trial equation and zero in on the Mathieu equation modulation by hand (painful but instructive) or by an automated search mechanism (painless but risky if it does not converge or gets stuck in a local minimum). As a first try, we opt for the manual search mechanism.
Using the QBO model fit above as a seed, we start from 1932 and create a time series model that runs up to 2005. The fit was to the SOI data maintained by NCAR here. Since zero crossings are important in characterizing oscillatory behavior and long-term oscillations are a nuisance, the Tahiti and Darwin sets have the multidecadal variations removed via a LOESS filter (this is critical for a cross check performed later)
The result is shown in Figure 3 along with the Mathematica code used to generate the model. The fit evaluation was done by visual inspection and maximizing the correlation coefficient (where absolute scaling is not a vital factor)
This converges remarkably quickly to an acceptable correlation coefficient, largely because the initial QBO seed function was effective in narrowing the parameter search space.
Using this seed, the data pre-1932 was then fit to the same SOIM differential equation using the post-1932 parameters as seeding.
Again the fit is remarkable considering the model's simplicity, with only slight changes needed in the SOIM parameters.
Next we can reconstruct the modeled QBO that was parameter tweaked against the original QBO data. This is shown below in Figure 5. Since only a few periods were used to approximate the slightly varying QBO time series, the fit would not be expected to be perfect. Yet it does align along long stretches where the data and model are in phase, and is only poor in the regions highlighted in yellow. Unsurprisingly, these yellow regions correspond to the intervals where the DiffEq solution in Figure 3 also gives a poor SOIM fit with respect to the SOI data. For example, the interval starting in the late 1960's shows a phase reversal artifact in the QBO model and this corresponds to a bad fit in the same interval starting in month #1050 = late 1960's for the SOI model. Also note the interval starting in 1990. We can expect these disagreements to exist, as the initial model was purposely made to be as simple as possible so that we could not be accused of over-fitting.
Next, we reconstruct the Mathieu equation modulation over the entire interval. In a previous post, using the Mathieu basis functions alone, we identified coherence with a 6+ year cycle -- this was postulated to align with the Chandler Wobble beat frequency. But as Figure 6 below shows, the numerically computed model appears as an 8+ year cycle, with a subtle period change prior to the year 1932. This may not contradict the previous findings, as the QBO period of ~2.4 years interacting with an 8+ year cycle could create a strong difference harmonic at ~3.3 years -- which is the Mathieu period that will double due to the nonlinearity to a physical effect at 6.6 years. Also, in a previous post I found a Mathieu basis function with a semidiurnal tidal period of 8.85 years that showed some promise in fitting the SOI data. So this could also be a mis-identification of the observed frequencies.
The Chandler wobble also showed significant effects in the years from 1920 to 1930, which is the interval at which the Mathieu equation modulation changes perceptibly and we lose coherence over longer time scales. This loss of phase coherence is shown in the figure below
A very simple validation to the meat of the analysis above exists. Two possible alternatives are to estimate (1) the ratio of the divergence (second derivative) over the signal, and (2) to machine learn on the divergence. The first alternative is brute force as shown in a prior post and is subject to noise; as whenever the signal goes through a zero crossing, the ratio gives rise to a singularity. This makes it difficult to get anything but a rough guide to the periodic perturbation. The second much more effective approach involves using an unbiased model discovery tool called Eureqa. What we do is take the SOI data, with as much of the long-term multi-scale oscillation removed (recall the earlier warning), and then apply the Eureqa search target as follows
D(sma(soi, 12), Time, 2) = f(sma(soi, 12), Time)
This is essentially asking Eureqa to map the second derivative of SOI (smoothed at the 12 month level by a sma filter) as a function of unknown parameters with respect to time and the SOI value. The search will evaluate trig functions such as sine and cosine and any other mathematical expressions that exhibit low complexity. The addition of the SOI term is a slight coercion or hint to Eureqa to guide it to find solutions of the Mathiue equation at the top of this post, i.e.
After Eureqa grinds away for several hours, it astoundingly discovers our Mathieu equation with very good error reduction. This is in spite of the ridiculously noisy second derivative computation. See Figure 8 below, and in particular look at the yellow highlighted solution located at a moderate complexity point along the Pareto frontier.
The two higher-frequency sine waves have nearly the same time period as we identified in the QBO analysis ! And the Mathieu modulation of approximately 8 years exists as well ! This in effect provides a largely unbiased substantiation that the Mathieu model postulated earlier indeed exists in the data. One of Eureqa's strong suits is that it's machine fit has no conscience and will punish anyone with the truth if they dare to ask it to validate results.
The mystery of what causes the observed periods of 433 days in the Chandler Wobble and the origin of the 8 year Mathieu modulation and whether that is connected to the Chandler wobble is a bit unsettling. I think that interactions exist in the 1 year seasonal cycle, the QBO cycle, the Mathieu equation characteristic frequency set by √a, the Mathieu equation modulation period set by ω, and the Chandler wobble period. I wanted to characterize these relationships more in this post, but will have to save it for later.
 Frandsen, Jannette B. “Sloshing Motions in Excited Tanks.” Journal of Computational Physics 196, no. 1 (2004): 53–87.
 Kawale, J., Liess, S., Kumar, V., Lall, U., & Ganguly, A. (2012, October). Mining time-lagged relationships in spatio-temporal climate data. In Intelligent Data Understanding (CIDU), 2012 Conference on (pp. 130-135). IEEE.
"Some people are under the illusion that ∆T = λ ∆F governs something. But it doesn’t. No need to be a skeptic to write the real and different governing equations – there are 5 of them. The first one is : div(V) = 0 where V is the velocity field."