# SOIM and the Paul Trap

This post follows up on the idea of modeling the historical Southern Oscillation Index (SOI) record, with details on how one can apply the SOI Model (SOIM) to make accurate predictions. Based on some early encouraging success, I asserted that a more comprehensive model fitting would be possible. That's what this follow-on post is about -- trying to verify that we can accomplish that "holy grail" of prediction, the prediction of future El Nino / Southern Oscillation (ENSO) conditions.

To foreshadow what's to come, Figure 1 shows the comprehensive SOIM fit, which incorporates a grouping of optimally phased Mathieu functions (introduced in the previous post).

Fig. 1 : Fit of the full SOI historical record (in green) to the SOI Model (in blue).

This is a very promising result based on the premise of the last post. The principal additions to the simple model are (1) a multi-harmonic basis set of Mathieu functions and (2) a more constraining physical interpretation of the math.
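For reference, Mathieu functions are solutions of Mathieu's equation. In its standard textbook form it reads as below; the symbols $a$, $q$, and $x$ are the conventional ones and are an assumption here, since the previous post may use a different parameterization.

$$
\frac{d^2 y}{dx^2} + \left(a - 2q\cos 2x\right)\, y = 0
$$

The same equation describes ion motion in a Paul trap, which is presumably the connection the title alludes to.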

What follows is the explanation and various verification checks, which include:

1. Sensitivity of the model to parameter selection
2. Comparison to fitting red noise (to show over-fitting is not an issue)
3. Hindcasts and forecasts based on restricted training intervals
4. Power spectrum of model and data
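As a rough illustration of the red noise check in item 2, here is a minimal sketch of how a surrogate series could be generated. It assumes an AR(1) process as the red noise model and monthly sampling over roughly 130 years; the function names, the value of `phi`, and the series length are hypothetical choices for illustration, not the settings used for the fits in this post.

```python
import random

def red_noise(n, phi=0.8, sigma=1.0, seed=0):
    """Generate an AR(1) series x[t] = phi * x[t-1] + eps[t].

    phi controls the 'redness' (persistence); eps is Gaussian white noise.
    All defaults are illustrative assumptions.
    """
    rng = random.Random(seed)
    x = [rng.gauss(0.0, sigma)]
    for _ in range(n - 1):
        x.append(phi * x[-1] + rng.gauss(0.0, sigma))
    return x

def lag1_autocorr(x):
    """Sample lag-1 autocorrelation, a quick check on the redness."""
    n = len(x)
    mean = sum(x) / n
    num = sum((x[i] - mean) * (x[i + 1] - mean) for i in range(n - 1))
    den = sum((v - mean) ** 2 for v in x)
    return num / den

# About 130 years of monthly values (assumed sampling).
surrogate = red_noise(1560, phi=0.8)
print(round(lag1_autocorr(surrogate), 2))
```

The idea is to run the same fitting procedure on such a surrogate as on the real SOI record. If the model fits the surrogate about as well as it fits the data, that would point to over-fitting rather than real structure.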

# The Southern Oscillation Index Model

(see later posts here)

A simple model of the Southern Oscillation Index (SOI) does not exist. I find it important to understand the origin of the SOI fluctuations, not only because it is an interesting scientific problem but for its potential predictive value -- in particular,  I could use a model to extrapolate the SOI factor needed by the CSALT model to make global surface temperature projections.  This has implications not only for long-term climate projections but for medium-term seasonal weather predictions, particularly in predicting the next El Nino.

The current thinking is that the index that characterizes the presence of El Nino and La Nina conditions (also known as ENSO)  is unpredictable enough to make any prediction beyond a  year or two pointless. That makes it a challenging problem, to say the least.

So although the SOI is defined as oscillatory (thus the name), these oscillations are not the typically sinusoidal, perfectly periodic waveforms that we are used to dealing with, but consist of uneven, sporadic pulses that remain virtually impossible to deconstruct. Yet, the problem may not be as intractable as we are led to believe. The key to understanding the SOI is to decode the characteristics of the waveform itself shown below in Figure 1.

Fig 1 : The SOI is defined as the atmospheric pressure difference between Darwin, Australia and Tahiti in the South Pacific. Given that we have measurements that span 130+ years, there may be a possibility that we can crack the code and decipher the fluctuating waveform. In the figure, the sloped dotted line gives us a clue to the nature of the fluctuations.
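To make the definition concrete, here is a minimal sketch of turning two pressure series into an index. It assumes the common Tahiti-minus-Darwin sign convention and a simple standardization over the whole series; published SOI products typically standardize per calendar month against a base period, which this sketch skips. The function name and the pressure values are made up for illustration only.

```python
import statistics

def soi_index(tahiti_hpa, darwin_hpa):
    """Standardized Tahiti-minus-Darwin pressure difference.

    Simplified: one mean and one standard deviation for the whole
    series (an assumption, not the operational definition).
    """
    diff = [t - d for t, d in zip(tahiti_hpa, darwin_hpa)]
    mean = statistics.fmean(diff)
    sd = statistics.pstdev(diff)
    return [(v - mean) / sd for v in diff]

# Made-up monthly mean sea-level pressures in hPa, for illustration only.
tahiti = [1012.1, 1011.4, 1013.0, 1012.6, 1010.9, 1011.8]
darwin = [1009.8, 1010.5, 1008.9, 1009.2, 1011.0, 1010.1]

index = soi_index(tahiti, darwin)
print([round(v, 2) for v in index])
```

Under this convention, sustained negative values correspond to El Nino conditions and sustained positive values to La Nina.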

The waveform is periodic all right, but this is the periodicity that lies in the strange mathematical world of crystal lattices and warped coordinate systems. Follow the math on the next page and the mystery is revealed.

# Relative strengths of the CSALT factors

For doing global surface temperature projections with the CSALT model, I find it critical to not over-fit if the training period is short. Over-fitting at short intervals can create oppositely compensating signs on factors, and these become sensitive to amplification when projected. The recommendation is then to rank the factors (or principal components) in order of their contributing strength to promoting a good fit via the correlation coefficient. See Fig. 1.

Fig 1: Ranking of CSALT factors to generate best fit with fewest degrees of freedom.
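The ranking idea can be sketched with a greedy, stagewise procedure: at each step, pick the factor most correlated with the current residual, regress the residual on it, and record the cumulative correlation coefficient of the fit. This is an illustrative stand-in under stated assumptions, not the CSALT fitting code. The factor names, periods, amplitudes, and noise level below are all hypothetical synthetic values.

```python
import math
import random

def corr(a, b):
    """Pearson correlation coefficient; returns 0.0 for a constant series."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(va * vb) if va > 0 and vb > 0 else 0.0

def rank_factors(factors, target):
    """Return [(name, cumulative correlation)] in order of selection."""
    n = len(target)
    mt = sum(target) / n
    residual = [t - mt for t in target]
    fit = [0.0] * n
    remaining = dict(factors)
    ranking = []
    while remaining:
        # Factor that best explains what is still unexplained.
        name = max(remaining, key=lambda k: abs(corr(remaining[k], residual)))
        f = remaining.pop(name)
        mf = sum(f) / n
        c = [v - mf for v in f]
        slope = sum(ci * ri for ci, ri in zip(c, residual)) / sum(ci * ci for ci in c)
        fit = [p + slope * ci for p, ci in zip(fit, c)]
        residual = [r - slope * ci for r, ci in zip(residual, c)]
        ranking.append((name, corr(fit, target)))
    return ranking

# Synthetic demo: three sinusoidal "factors" of decreasing strength plus noise.
rng = random.Random(1)
t = range(600)
factors = {
    "factor_a": [math.sin(2 * math.pi * i / 60) for i in t],
    "factor_b": [math.sin(2 * math.pi * i / 40) for i in t],
    "factor_c": [math.sin(2 * math.pi * i / 12) for i in t],
}
target = [
    1.0 * factors["factor_a"][i] + 0.5 * factors["factor_b"][i]
    + 0.1 * factors["factor_c"][i] + rng.gauss(0.0, 0.05)
    for i in t
]

ranking = rank_factors(factors, target)
for name, r in ranking:
    print(name, round(r, 3))
```

Stopping the list once the cumulative correlation flattens out is one way to keep the number of degrees of freedom small when the training interval is short.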

With the original handful of CSALT factors, we can reach good correlation rather quickly. But after this point, the solar, lunar, and orbital forcing factors become increasingly subtle, providing progressively less thermal forcing as we run down the list of periods suggested by previous researchers. From the clear asymptotic trend, we would likely require several times as many factors to reach correlation coefficient levels arbitrarily close to 1. Noise does not seem to be an issue, as the vast majority of the temperature fluctuations appear to come from real forcing terms. The noise residual in this case is at the 0.002 level, or 0.2% of the measured signal.