The Southern Oscillation embedded within the ENSO behavior is what is called a dipole, or in other vernacular, a standing wave. Whenever the atmospheric pressure at Tahiti is high, the pressure at Darwin is low, and vice versa. Of course, the standing wave is not perfect and is far from being a classic sine wave.
To characterize the quality of the dipole, we can use a measure such as a correlation coefficient applied to the two time series. Flipping the sign of Tahiti and applying a correlation coefficient to SOI, we get Figure 1 below:
Note that this correlation coefficient is "only" 0.55 when comparing the two time series, yet the two sets of data are clearly aligned. What this tells us is that other factors, such as noise in the measurements, can easily drop the correlation coefficient of well-aligned waveforms far below unity.
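To illustrate how noise alone can pull the coefficient down, here is a minimal sketch using synthetic data, not the actual Tahiti and Darwin records. It assumes a unit-amplitude sine wave seen with opposite sign at two sites, each with independent Gaussian noise; the function names and the noise level of 0.64 are hypothetical choices made so that the expected correlation lands near the value quoted above.

```python
import math
import random


def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def noisy_dipole(n, noise_sigma, seed=0):
    """Synthetic dipole: the same sine wave with opposite sign at two sites,
    plus independent Gaussian measurement noise at each site (an assumption)."""
    rng = random.Random(seed)
    site_a, site_b = [], []
    for i in range(n):
        signal = math.sin(2.0 * math.pi * i / 48.0)  # hypothetical 48-sample period
        site_a.append(signal + rng.gauss(0.0, noise_sigma))
        site_b.append(-signal + rng.gauss(0.0, noise_sigma))
    return site_a, site_b


# Flip the sign of one site before correlating, as done for Tahiti in the text.
site_a, site_b = noisy_dipole(4800, noise_sigma=0.64)
r_noisy = pearson([-a for a in site_a], site_b)

clean_a, clean_b = noisy_dipole(4800, noise_sigma=0.0)
r_clean = pearson([-a for a in clean_a], clean_b)

print("noisy:", round(r_noisy, 3), "clean:", round(r_clean, 3))
```

For this setup the expected coefficient is var(signal) / (var(signal) + var(noise)), so a perfectly anti-phased pair still scores far below 1 once the noise variance is comparable to the signal variance.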
This is what we have to keep in mind when evaluating correlations of data with models, as we can see in the following examples.
Figure 2 below shows one of the best correlations I discovered when applying my sloshing model to the SOI. I was able to get the correlation coefficient above 0.76 over the years from 1932 to 2005 by adding a modified forcing between the years 1990 and 1994, during a minimum in a beat interval (see Figure 5 here). The fit is also very sensitive to the filtering applied to the source data. Because of added noise, a more raw version of the data will result in the correlation coefficient dropping below 0.7, even though the overall fit looks just as good.
Figure 2b shows a view of the forcing.
SOI versus Nino 3.4
I took the Nino 3.4 anomaly from the NCDC NOAA Equatorial Pacific SST site and let Eureqa crunch on it with the Mathieu equation formulation shown below:
This gets transcribed as the Eureqa candidate solution:
D(x, t, 2) = f(x, t)
What Eureqa eventually finds are values for the parameters a, q cos(), and F.
Of course, these need to be fed back into the differential equation and the equation solved for initial conditions to see how it matches to the actual time series.
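As a rough sketch of that feedback step, the fitted terms can be plugged into a numerical integrator. The code below assumes the textbook forced Mathieu form x'' + (a - 2 q cos(omega t)) x = F(t), which may differ in detail from the formulation I gave Eureqa; the function name and all parameter values are hypothetical placeholders, not fitted results.

```python
import math


def solve_forced_mathieu(a, q, omega, forcing, x0, v0, t_end, n_steps):
    """Integrate x'' + (a - 2*q*cos(omega*t)) * x = forcing(t) with classic RK4.

    Returns (times, xs). The equation form is an assumed textbook variant.
    """
    dt = t_end / n_steps

    def accel(t, x):
        return forcing(t) - (a - 2.0 * q * math.cos(omega * t)) * x

    t, x, v = 0.0, x0, v0
    times, xs = [t], [x]
    for i in range(n_steps):
        # Four RK4 stages for the first-order system (x' = v, v' = accel).
        k1x, k1v = v, accel(t, x)
        k2x, k2v = v + 0.5 * dt * k1v, accel(t + 0.5 * dt, x + 0.5 * dt * k1x)
        k3x, k3v = v + 0.5 * dt * k2v, accel(t + 0.5 * dt, x + 0.5 * dt * k2x)
        k4x, k4v = v + dt * k3v, accel(t + dt, x + dt * k3x)
        x += dt * (k1x + 2.0 * k2x + 2.0 * k3x + k4x) / 6.0
        v += dt * (k1v + 2.0 * k2v + 2.0 * k3v + k4v) / 6.0
        t = (i + 1) * dt
        times.append(t)
        xs.append(x)
    return times, xs


# Hypothetical parameters and forcing, for illustration only.
times, xs = solve_forced_mathieu(
    a=2.0, q=0.3, omega=2.0 * math.pi,
    forcing=lambda t: 0.5 * math.sin(3.0 * t),
    x0=0.0, v0=0.0, t_end=20.0, n_steps=20000,
)
print(len(xs), xs[-1])
```

The resulting series can then be correlated against the measured one. The initial conditions x0 and v0 are extra free parameters that Eureqa does not supply, so they have to be chosen or fitted separately.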
A Eureqa-supplied 12-month smoother (sma) was applied to the data beforehand. The following screenshot highlights one result with a high correlation coefficient along the Pareto front of solutions.
In this case, one of the sine wave components has a frequency of 83.28373 radians/year (Eureqa generates more significant digits than are displayed in the screenshot, and this extra precision is recovered by a copy and paste). That looks like unnecessary precision, but watch what happens when we convert it to a period.
As it happens, the anomalistic lunar month is 27.55455 days! This is the average time the Moon takes to go from perigee to perigee, the point in the Moon's orbit when it is closest to Earth, and it is a key factor in establishing the long-term tidal periods.
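The conversion itself is just period = 2π / ω, rescaled from years to days. A small sketch, assuming a mean tropical year of 365.2422 days (my choice for the illustration; a different year length shifts the result slightly):

```python
import math

DAYS_PER_YEAR = 365.2422  # assumed mean tropical year length


def period_days(omega_rad_per_year, days_per_year=DAYS_PER_YEAR):
    """Convert an angular frequency in radians/year to a period in days."""
    return 2.0 * math.pi / omega_rad_per_year * days_per_year


eureqa_omega = 83.28373  # radians/year, as copied from Eureqa
print(period_days(eureqa_omega))
```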
So that is either an amazing coincidence of a random period, or this is telling us that the Nino 3.4 SST is sensitive to the anomalistic lunar month.
Yet this result is troubling as well, since the sampling period for Nino 3.4 is only a calendar month and Eureqa is picking out periods shorter than 30 days. This violates the Nyquist sampling criterion, which says that with roughly 30-day sampling a sine wave would need a period of at least 60 days to be detected. So somehow Eureqa is generating a "subsample" period that ordinarily would get aliased to a longer period. I have seen this before when applying Eureqa to the QBO data set.
This is good news but perplexing. It is good because it is all machine learning and completely hands-off. All I did was apply a 12-month smoother to the data, and Eureqa produced this result after a day of machine learning data crunching. I can't see how I could have influenced the results. But how it manages to violate the Nyquist criterion is puzzling. Is it because the sampling is on the same day of the month, yet each month is of a different length, and thus the data is showing subtle inflection points that Eureqa's differential evolution algorithm is picking up? That is amazing sensitivity if that is the case.
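For reference, the longer period that a sub-Nyquist sine wave would ordinarily fold to can be estimated with the standard aliasing formula. This is a sketch that assumes perfectly uniform sampling at the mean calendar month (365.2422 / 12 days), which real monthly data does not quite satisfy; the function name is hypothetical.

```python
def aliased_period(true_period, sample_interval):
    """Apparent period of a sinusoid observed with uniform sampling.

    Folds the true frequency to the nearest multiple of the sampling
    frequency. Both arguments must use the same time unit.
    """
    f_true = 1.0 / true_period
    f_sample = 1.0 / sample_interval
    f_alias = abs(f_true - round(f_true / f_sample) * f_sample)
    if f_alias == 0.0:
        return float("inf")  # the signal lands exactly on a sampling harmonic
    return 1.0 / f_alias


ANOMALISTIC_MONTH_DAYS = 27.55455
MEAN_CALENDAR_MONTH_DAYS = 365.2422 / 12.0  # assumed uniform sampling interval

alias_days = aliased_period(ANOMALISTIC_MONTH_DAYS, MEAN_CALENDAR_MONTH_DAYS)
print(alias_days, alias_days / 365.2422)
```

Under these assumptions the anomalistic month would show up as a cycle somewhat shorter than a year, so a fit that recovers the unaliased period directly is doing something that uniform-sampling theory does not predict.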
The Arctic Oscillation
Another dipole data set to analyze is the Arctic Oscillation. A Mathieu equation fit is shown below in Figure 4. Based on the correlation coefficient (0.45), the fit is not very good, but because of the tight and erratic nature of the oscillations, it may be of better quality than first imagined. It all depends on whether there is a better physical model that captures the behavior more concisely.
Kawale, Jaya, Stefan Liess, Arjun Kumar, Michael Steinbach, Auroop R. Ganguly, Nagiza F. Samatova, Fredrick H. M. Semazzi, Peter K. Snyder, and Vipin Kumar. "Data Guided Discovery of Dynamic Climate Dipoles." In CIDU, pp. 30-44. 2011.