Derivation of an ENSO model using Laplace's Tidal Equations

Laplace developed his namesake tidal equations to mathematically explain the behavior of tides by applying straightforward Newtonian physics. In their expanded form, known as the primitive equations, Laplace's starting formulation is used as the basis of almost all detailed climate models. Since describing tidally forced fluid behavior is what the equations were designed to do, this post provides the details for solving Laplace's tidal equations in the context of the El Nino Southern Oscillation (ENSO) of the equatorial Pacific ocean. The derivation and results shown below essentially describe the framework of my presentation at this month's AGU meeting: Biennial-Aligned Lunisolar-Forcing of ENSO: Implications for Simplified Climate Models

The concise derivation for a model of ENSO depends on reducing Laplace's tidal equations along the equator. I could not find anyone taking a similar approach anywhere in the literature, even though it appears to be the obvious route: (1) solve Laplace's tidal equations in a simplified context, then (2) apply the known tidal forcing and observe whether the result correlates with or matches the ENSO time series. In fact, it does, as I have shown before (and for QBO as well); but this is the first time that I have worked out the details in full for ENSO. Below is a two-part solution.

Continue reading

Machine Learning and the Climate Sciences

I've been applying equal doses of machine learning (and knowledge-based artificial intelligence in general) and physics in my climate research since day one. Next month on December 12, I will be presenting Knowledge-Based Environmental Context Modeling at the AGU meeting, which will cover these topics within the earth sciences realm:

Table 1: Technical approach to knowledge-based model building for the earth sciences

In my opinion, machine learning likely will eventually find all the patterns that appear in climate time-series but with various degrees of human assistance.

"Vipin Kumar, a computer scientist at the University of Minnesota in Minneapolis, has used machine learning to create algorithms for monitoring forest fires and assessing deforestation. When his team tasked a computer with learning to identify air-pressure patterns called teleconnections, such as the El Niño weather pattern, the algorithm found a previously unrecognized example over the Tasman Sea."

In terms of the ENSO pattern, I believe that machine learning through tools such as Eureqa could have found the underlying lunisolar forcing pattern, but would have struggled mightily to break through the complexity barrier. In this case, the complexity barrier is in (1) discovering a biennial modulation which splits all the spectral components and (2) discovering the modifications to the lunar cycles from a strictly sinusoidal pattern.

The way that Eureqa would have found this pattern would be through its symbolic regression algorithm (which falls under the first row in Table 1 shown above). It essentially would start its machine learning search by testing various combinations of sines and cosines and capturing the most highly correlated combinations for further expansion. As it expands the combinations, the algorithm would try to reduce complexity by applying trigonometric identities such as this one:

\sin(\alpha \pm \beta) = \sin\alpha \cos\beta \pm \cos\alpha \sin\beta

After a while, the algorithm will slow down under the weight of the combinatorial complexity of the search, and then the analyst would need to choose promising candidates from the complexity versus best-fit Pareto front. At this point one would need to apply knowledge of physical laws or mathematical heuristics which would lead to a potentially valid model.
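As a rough illustration of what choosing from the complexity versus best-fit Pareto front means, here is a minimal sketch. The candidate expressions, complexity counts, and error values are invented for the example and are not output from Eureqa; the function name `pareto_front` is likewise hypothetical.

```python
from typing import List, Tuple

# (expression, complexity, fit error); all values below are made up for illustration
Candidate = Tuple[str, int, float]

def pareto_front(candidates: List[Candidate]) -> List[Candidate]:
    """Keep candidates that no other candidate beats on both complexity and error."""
    front = []
    for name, cx, err in candidates:
        dominated = any(
            (c2 <= cx and e2 <= err) and (c2 < cx or e2 < err)
            for _, c2, e2 in candidates
        )
        if not dominated:
            front.append((name, cx, err))
    # Order from simplest to most complex for inspection by the analyst
    return sorted(front, key=lambda c: c[1])

candidates = [
    ("sin(w1*t)", 3, 0.90),
    ("sin(w1*t) + sin(w2*t)", 7, 0.70),
    ("sin(w1*t)*cos(w2*t)", 7, 0.75),
    ("sin(w1*t + a*sin(w2*t))", 9, 0.40),
    ("sin(w1*t) + sin(w2*t) + sin(w3*t)", 11, 0.45),
]

front = pareto_front(candidates)
for name, cx, err in front:
    print(f"complexity={cx:2d}  error={err:.2f}  {name}")
```

The front only narrows the field; deciding which surviving expression is physically plausible is still the step that needs outside knowledge.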

So, in the case of the ENSO model, Eureqa could have discovered the (1) biennial modulation by reducing sets of trigonometric identities, and perhaps by applying a sin(A sin()) frequency modulation (which it is capable of) to discover the (2) second-order modifications to the sinusoidal functions, or (3) it could have been fed a differential equation structure to provide a hint to a solution. But a human got there first by applying prior knowledge of signal processing and of the details in the orbital lunar cycles.

Yet as the Scientific American article suggests, that will likely not be the case in the future as the algorithms continue to improve and update their knowledge base with laws of physics.

This more sophisticated kind of reasoning involves the refined use of the other elements of Table 1.  For example, a more elaborate algorithm could have lifted an entire abstraction level out of a symbolic grouping and thus reduced its complexity. Or it could try to determine whether a behavior was stochastic or deterministic.  The next generation of these tools will be linked to knowledge-bases filled with physics patterns that are organized for searching and reasoning tasks. These will relate the problem under study to potential solutions automatically.



High Resolution ENSO Modeling

An intriguing discovery is that the higher-resolution aspects of the SOI time-series (as illustrated by the Australian BOM 30-day SOI moving average) may also have a tidal influence.  Note the fast noisy envelope that rides on top of the deep El Nino of 2015-2016 shown below:

For the standard monthly SOI as reported by NCAR and NOAA, this finer detail disappears. BOM provides the daily SOI value for roughly the past 3 years here.

Yet if we retain this in the 1880-present monthly ENSO model, by simultaneously isolating [1] the higher frequency fine structure from 2015-2017, the fine structure also emerges in the model. This is shown in the lower panel below.

This indicates that the differential equation being used currently can possibly be modified to include faster-responding derivative terms which will simultaneously show the multi-year fluctuations as well as what was thought to be a weekly-to-monthly-scale noise envelope. In fact, I had been convinced that this term was due to localized weather but a recent post suggested that this may indeed be a deterministic signal.

Lunisolar tidal effects likely do impact the ocean behavior at every known time-scale, from the well-characterized diurnal and semi-diurnal SLH tides to the long-term deep-ocean mixing proposed by Munk and Wunsch. It's not surprising that tidal forces would have an impact on the intermediate time-scale ENSO dynamics, both at the conventional low resolution (used for El Nino predictions) and at the higher resolution that emerges from SOI measurements (the 30-day moving average shown above). The monthly and fortnightly oscillations observed in the SOI are commensurate with the standard lunar tides of periods 13-14 days and 27-28 days, and non-linear interactions may result in the 40-60 day oscillations observed in LOD.

From Earth Rotational Variations Excited by Geophysical Fluids, B.F. Chao.

It's entirely possible that removing the 30-day moving average on the SOI measurements can reveal even more detail.


[1] Isolation is accomplished by subtracting a 24-day average about the moving average value, which suppresses the longer-term SOI variation.
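The isolation step in footnote [1] can be sketched as a simple high-pass filter: subtract a centered moving average from the series so that only the fast variation remains. This is a minimal sketch under assumptions, not the exact procedure used for the figures; the function names, the handling of the series ends, and the synthetic input are all illustrative.

```python
import math

def centered_mean(values, window):
    """Centered moving average; the window shrinks near the ends of the series."""
    half = window // 2
    out = []
    for i in range(len(values)):
        lo = max(0, i - half)
        hi = min(len(values), i + half + 1)
        out.append(sum(values[lo:hi]) / (hi - lo))
    return out

def isolate_fine_structure(soi, window=24):
    """Subtract the slow component so only the faster fluctuations remain.

    With window=24 the centered average spans 25 samples (12 on each side),
    which is an assumption about how the 24-day average is centered.
    """
    slow = centered_mean(soi, window)
    return [x - s for x, s in zip(soi, slow)]

# Synthetic stand-in for a daily SOI series: slow trend plus a fortnightly ripple
days = range(200)
soi = [0.05 * t + 2.0 * math.sin(2.0 * math.pi * t / 13.66) for t in days]
fine = isolate_fine_structure(soi)
```

For a purely linear trend the interior of the output is zero, which is the sense in which the longer-term variation is suppressed.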


The ENSO Forcing Potential - Cheaper, Faster, and Better

Following up on the last post on the ENSO forcing, this note elaborates on the math.  The tidal gravitational forcing function used follows an inverse power-law dependence, where a(t) is the anomalistic lunar distance and d(t) is the draconic or nodal perturbation to the distance.

F(t) \propto \left( \frac{1}{(R_0 + a(t) + d(t))^2} \right)'

Note the prime indicating that the forcing applied is the derivative of the conventional inverse squared Newtonian attraction. This generates an inverse cubic formulation corresponding to the consensus analysis describing a differential tidal force:

F(t) \propto -\frac{a'(t)+d'(t)}{(R_0 + a(t) + d(t))^3}

For a combination of monthly and fortnightly sinusoidal terms for a(t) and d(t) (suitably modified for nonlinear nodal and perigean corrections due to the synodic/tropical cycle), the search routine rapidly converges to an optimal ENSO fit. It does this more quickly than the harmonic analysis, which requires at least double the unknowns for the additional higher-order factors needed to capture the tidally forced response waveform. One of the keys is to collect the chain rule terms a'(t) and d'(t) in the numerator; without these, the necessary mixed terms which multiply the anomalistic and draconic signals do not emerge strongly.
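To make the chain-rule structure concrete, here is a minimal numerical sketch of the forcing. It assumes a single monthly sinusoid each for a(t) and d(t), a normalized R_0, and illustrative amplitudes; the fortnightly terms and the nonlinear corrections mentioned above are left out, and none of the constants are fitted values.

```python
import math

R0 = 1.0          # mean lunar distance, normalized (assumption)
T_ANOM = 27.5545  # anomalistic month in days
T_DRAC = 27.2122  # draconic month in days
A_AMP = 0.05      # illustrative amplitude of a(t), not a fitted value
D_AMP = 0.02      # illustrative amplitude of d(t), not a fitted value

def a(t):
    return A_AMP * math.cos(2.0 * math.pi * t / T_ANOM)

def d(t):
    return D_AMP * math.cos(2.0 * math.pi * t / T_DRAC)

def a_prime(t):
    return -A_AMP * (2.0 * math.pi / T_ANOM) * math.sin(2.0 * math.pi * t / T_ANOM)

def d_prime(t):
    return -D_AMP * (2.0 * math.pi / T_DRAC) * math.sin(2.0 * math.pi * t / T_DRAC)

def inverse_square(t):
    """Conventional inverse-square attraction, up to a constant factor."""
    return 1.0 / (R0 + a(t) + d(t)) ** 2

def forcing(t):
    """Inverse-cubic form with the chain-rule terms collected in the numerator."""
    return -(a_prime(t) + d_prime(t)) / (R0 + a(t) + d(t)) ** 3

# Check against a central finite difference of the inverse-square term.
# The factor of 2 from differentiating the square is absorbed by the proportionality.
t, h = 100.0, 1e-4
numeric = (inverse_square(t + h) - inverse_square(t - h)) / (2.0 * h)
print(numeric, 2.0 * forcing(t))
```

Because a(t) + d(t) sits in the denominator while a'(t) + d'(t) sits in the numerator, the product naturally contains the mixed anomalistic-draconic terms.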

As before, a strictly biennial modulation needs to be applied to this forcing to capture the measured ENSO dynamics — this is a period-doubling pattern observed in hydrodynamic systems with a strong fundamental (in this case annual) and is climatologically explained by a persistent year-to-year regenerative feedback in the SLP and SST anomalies.

Here is the model fit for training from 1880-1980, with the extrapolated test region post-1980 showing a good correlation.

The geophysics is now canonically formulated, providing (1) a simpler and more concise expression, leading to (2) a more efficient computational solution, (3) less possibility of over-fitting, and (4) ultimately generating a much better correlation. Alternatively, stated in modeling terms, the resultant information metric is improved by reducing the complexity and improving the correlation: the vaunted cheaper, faster, and better solution. Or, in other words: get the physics right, and all else follows.














Is the SOI noisy or is it signal?

Applying the analytical solution to Laplace's tidal equations, we can isolate the parts of the Southern Oscillation Index signal that appear quite noisy (e.g. 1880-1885, 1900-1905, etc.).

For this 3-month averaged SOI fit, it's a sin(sin(f(t))) function in the ENSO model that generates the folded signal, which appears as a rapidly fluctuating and noisy signal. Although my simplification of Laplace's equation was originally applied to QBO, it is applicable to other equatorial standing wave phenomena such as ENSO, of which the SOI is a measure. The SOI signal has always been considered noisy, especially in contrast to other ENSO measures such as NINO34, but perhaps this needs to be rethought, as the higher frequency components may be real signal.
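The way a nested sinusoid produces fast structure from a slow input can be seen in a small numerical sketch. It assumes the simplest case, sin(A sin(θ)) with a single inner frequency and an arbitrary amplitude A = 3; the actual f(t) in the ENSO model is more involved, so this only illustrates the folding mechanism.

```python
import math

def sine_harmonic_amplitude(signal, n):
    """Amplitude of the sin(n*theta) component of a signal sampled over one period."""
    N = len(signal)
    return (2.0 / N) * sum(
        s * math.sin(n * 2.0 * math.pi * i / N) for i, s in enumerate(signal)
    )

A = 3.0   # assumed modulation depth; larger values fold the waveform more strongly
N = 1024  # samples across one period of the inner sinusoid

theta = [2.0 * math.pi * i / N for i in range(N)]
folded = [math.sin(A * math.sin(th)) for th in theta]

# Harmonic content of the folded waveform relative to the inner frequency
amplitudes = {n: sine_harmonic_amplitude(folded, n) for n in range(1, 7)}
for n, amp in amplitudes.items():
    print(f"harmonic {n}: {amp:+.4f}")
```

Only odd harmonics appear, and for this value of A the third harmonic is comparable in size to the fundamental, so a slowly varying inner term can look like rapid fluctuation in the output.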

These results will be presented at next month's AGU meeting:

Approximating the ENSO Forcing Potential

From the last post, we tried to estimate the lunar tidal forcing potential from the fitted harmonics of the ENSO model. Two observations resulted from that exercise: (1) the possibility of over-fitting to the expanded Taylor series, and (2) the potential of fitting to the ENSO data directly from the inverse power law.

The Taylor's series of the forcing potential is a power-law polynomial corresponding to the lunar harmonic terms. The chief characteristic of the polynomial is the alternating sign for each successive power (see here), which has implications for convergence under certain regimes. What happens with the alternating sign is that each of the added harmonics will highly compensate the previous underlying harmonics, giving the impression that pulling one signal out will scramble the fit. This is conceptually no different than eliminating any one term from a sine or cosine Taylor's series, which are also all compensating with alternating sign.

The specific condition that we need to be concerned about with respect to series convergence is when r (the perturbation to the lunar orbit) is a substantial fraction of R (the distance from earth to moon):

F(r) = \frac{1}{(R+r)^3}
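
For reference, the standard binomial expansion of this expression in powers of r/R makes the alternating sign explicit. The series converges only for |r| < R, and the coefficients grow with each order, so more terms are needed as r/R increases:

\frac{1}{(R+r)^3} = \frac{1}{R^3} \left[ 1 - 3\frac{r}{R} + 6\left(\frac{r}{R}\right)^2 - 10\left(\frac{r}{R}\right)^3 + \cdots \right] = \frac{1}{R^3} \sum_{n=0}^{\infty} (-1)^n \frac{(n+1)(n+2)}{2} \left(\frac{r}{R}\right)^n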

Because we need to keep those terms for high precision modeling, we also need to be wary of possible over-fitting of these terms — as the solver does not realize that the values for those terms have the constraint that they derive from the original Taylor's series. It's not really a problem for conventional tidal analysis, as the signals are so clean, but for the noisy ENSO time-series, this is an issue.

Of course the solution to this predicament is not to do the Taylor series harmonic fitting at all, but leave it in the form of the inverse power law. That makes a lot of sense — and the only reason for not doing this until now is probably due to the inertia of conventional wisdom, in that it wasn't necessary for tidal analysis where harmonics work adequately.

So this alternate and more fundamental formulation is what we show here.

Continue reading

Reverse Engineering the Moon's Orbit from ENSO Behavior

With an ideal tidal analysis, one should be able to apply the gravitational forcing of the lunar orbit and use that as input to solve Laplace's tidal equations. This would generate tidal heights directly. But due to aleatory uncertainty with respect to other factors, it becomes much more practical to perform a harmonic analysis on the constituent tidal frequencies. This essentially allows an empirical fit to measured tidal heights over a training interval, which is then used to extrapolate the behavior over other intervals. This works very well for conventional tidal analysis.

For ENSO, we need to make the same decision: Do we attempt to work the detailed lunar forcing into the formulation or do we resort to an empirical bottom-up harmonic analysis? What we have been doing so far is a variation of a harmonic analysis that we verified here. This is an expansion of the lunar long-period tidal periods into their harmonic factors. So that works well. But could a geophysical model work too?

Continue reading

Improved Solver Target Error Metric

In addition to the algorithm used for solving optimization problems, an important criterion is the form of the metric used to minimize the error or maximize the similarity between data and model.

The commonly used forms such as error variance (i.e. mean squared error) have issues related to how well they can navigate the search space. Other forms such as correlation coefficient (CC) often work better, but at the expense of losing track of scale. This indicates that CC is better at matching the general characteristics of a specific shape than a pure error criterion. And if weighted, it can deal with noisy intervals.

In fact, the symbolic reasoner Eureqa features a proprietary metric referred to as a hybrid correlation coefficient. From my experience with the tool, the hybrid version does qualitatively work better.

So in my quest to find an alternative metric, I came up with something related to the Cosine Similarity (CS) measure. As defined, CS is not that different from Pearson's correlation coefficient, except that it does not subtract the mean. But with a slight modification it's an excellent "starter" metric for initial exploration.

The new metric is essentially a +/- excursion matching criterion (EMC), which is important for a behavior as cyclically erratic about the origin as ENSO.

The algorithm for the EMC can be described as a ratio of two factors. The numerator is the sum of the products of the model and data values. The denominator is the normalizing factor, which is the sum of the products of the absolute values of the model and data values.

EMC =  \frac{\sum x_i \cdot y_i}{\sum |x_i| \cdot |y_i|}

The resulting metric ranges from -1 to 1, with 1 being a perfect sign excursion matching, and -1 if all excursions had the same magnitude but were reversed in sign.
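The definition above translates directly into a few lines of code. The sketch below implements the EMC alongside plain cosine similarity for comparison; the function names and the sample values are illustrative assumptions, and returning 0 for an all-zero denominator is a choice made here, not something specified above.

```python
import math

def emc(model, data):
    """Excursion matching criterion: sum(x*y) / sum(|x|*|y|)."""
    num = sum(x * y for x, y in zip(model, data))
    den = sum(abs(x) * abs(y) for x, y in zip(model, data))
    return num / den if den else 0.0  # zero-denominator handling is an assumption

def cosine_similarity(model, data):
    """Standard cosine similarity, which normalizes each series by its own norm."""
    num = sum(x * y for x, y in zip(model, data))
    den = math.sqrt(sum(x * x for x in model)) * math.sqrt(sum(y * y for y in data))
    return num / den if den else 0.0

data = [1.0, -2.0, 3.0, -1.0]           # made-up excursions about zero
same_signs = [5.0, -0.1, 0.2, -7.0]     # matching signs, very different magnitudes
flipped = [-v for v in data]            # same magnitudes, reversed signs

print(emc(data, data), emc(same_signs, data), emc(flipped, data))
print(cosine_similarity(same_signs, data))
```

The second case shows the difference from cosine similarity: the EMC reports a perfect score whenever every sign matches, regardless of how poorly the magnitudes agree.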

This of course is not a perfect criterion as it will tend to force the minimal excursions to zero while maximizing the maximum excursions, instead of first normalizing them as the true CS does.

The evidence for how well it works is mainly based on observations of massive reductions in search time. For the ENSO model optimization search, the EMC reduces the time it takes to get in the ballpark by 100×, so what could take an hour reduces to about a minute of computational time. It is important not to let it overfit, so wait until the metric starts to slow in its improvement before stopping the search and switching to the CC metric for the final stages of optimization.

As it is so fast, I have been using it for minimally filtered ENSO time series, where I can start from minimally seeded sets of parameters. This gives more confidence that results are not correlated from one search optimization run to the next.

The EMC is therefore a great metric for randomizing searches. I can imagine using it in a scenario with different initialized seed values and then waiting a fixed time to return an interim solution, and then using the best of these in a more refined CC search.
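The scenario just described can be sketched as a two-stage search. This is a toy illustration only: the two-parameter sine model, the random hill-climbing optimizer, the synthetic target, and the use of a fixed step count in place of a fixed wall-clock time are all assumptions made for the example, not the solver used for the ENSO fits.

```python
import math
import random

def emc(xs, ys):
    num = sum(x * y for x, y in zip(xs, ys))
    den = sum(abs(x) * abs(y) for x, y in zip(xs, ys))
    return num / den if den else 0.0

def cc(xs, ys):
    """Pearson correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    den = math.sqrt(sxx * syy)
    return sxy / den if den else 0.0

def model(params, ts):
    w, phi = params  # toy model: one sinusoid with unknown frequency and phase
    return [math.sin(w * t + phi) for t in ts]

def hill_climb(metric, params, ts, data, rng, steps, scale):
    """Random-perturbation search that only accepts improvements in the metric."""
    best = metric(model(params, ts), data)
    for _ in range(steps):
        trial = [p + rng.gauss(0.0, scale) for p in params]
        score = metric(model(trial, ts), data)
        if score > best:
            params, best = trial, score
    return params, best

ts = [0.1 * i for i in range(200)]
data = model([1.3, 0.4], ts)  # synthetic target standing in for real data

# Stage 1: several differently seeded EMC searches, each with a fixed budget
interim = []
for seed in range(5):
    rng = random.Random(seed)
    start = [rng.uniform(0.5, 2.0), rng.uniform(-math.pi, math.pi)]
    interim.append(hill_climb(emc, start, ts, data, rng, steps=200, scale=0.1))
best_params, best_emc = max(interim, key=lambda r: r[1])

# Stage 2: refine the best interim solution with the CC metric
start_cc = cc(model(best_params, ts), data)
final_params, final_cc = hill_climb(
    cc, best_params, ts, data, random.Random(99), steps=300, scale=0.02
)
print(best_emc, start_cc, final_cc)
```

Seeding each first-stage run independently is what keeps the interim solutions from being correlated with one another before the refinement step.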

Why it works so well is something I am still trying to explain. It is a more efficient computation than CC, but that is not enough to explain 100×.




Interface-Inflection Geophysics

This paper that a couple of people alerted me to is likely one of the most radical research findings that has been published in the climate science field for quite a while:

Topological origin of equatorial waves
Delplace, Pierre, J. B. Marston, and Antoine Venaille. Science (2017): eaan8819.

An earlier version on arXiv was titled Topological Origin of Geophysical Waves, which is less targeted to the equator.

The scientific press releases are all interesting:

  1. Science Magazine: Waves that drive global weather patterns finally explained, thanks to inspiration from bagel-shaped quantum matter
  2. Science Daily: What Earth's climate system and topological insulators have in common
  3. Physics World: Do topological waves occur in the oceans?

What the science writers make of the research is clearly subjective and filtered through what they understand.

Continue reading