If we split the modern ENSO data into two training intervals — one from 1880 to 1950 and one from 1950 to 2016, we get roughly equal-length time series for model evaluation.

As **Figure 1** shows, a forcing stimulus due to monthly-range LOD variations calibrated to the interval between 2000 to 2003 (lower panel) is used to train the ENSO model in the interval from 1880 to 1950. The extrapolated model fit in **RED **does a good job in capturing the ENSO data in the period beyond 1950.

Next, we reverse the training and verification fit, using the period from 1950 to 2016 as the training interval and then back extrapolating. **Figure 2** shows this works about as well.

The possibility of over-fitting is always a concern. There are only two primary lunar long-periods, the Anomalistic lunar month (A) and the Draconic (D) lunar month. There is one yearly impulse with a fixed phase (IP). A table of the parameters for the lunisolar behavior is shown below. And for the differential equation model, there are an additional 5 parameters, as described in **Figure 4**.

Param |
Training1880-1950 |
Training1950-now |
Description of parameter |

IP | 0.1234 | 0.1228 | Yearly impulse phase (rad) |

pD | 0.1308 | 0.3610 | Draconic monthly phase (rad) |

D | 0.9370 | 0.5880 | Draconic month amplitude |

D2 | 2.530 | 2.201 | Draconic month squared amplitude |

D3 | -3.634 | -1.596 | Draconic month cubed amplitude |

D4 | -17.41 | -17.01 | Draconic month squared-squared proamplitude |

pA | -0.6370 | -1.261 | Anomalistic monthly phase (rad) |

A | -0.5918 | -0.5433 | Anomalistic month amplitude |

A3 | 4.258 | 3.579 | Anomalistic month cubed amplitude |

ampA | 0.9326 | 1.464 | Anomalistic frequency modulation amplitude (Fig 4) |

phaA | 0.2297 | -0.02378 | Anomalistic frequency modulation phase (Fig 4) |

DA | 0.1418 | 0.0602 | Draconic monthly times Anomalistic monthly |

DA2 | -2.685 | -2.615 | Draconic monthly times Anomalistic fortnightly |

LOD CC | 0.7407 | 0.7416 | LOD correlation coefficient for chosen lunar params |

deriv | 0.8816 | 0.8325 | Wave Differential Equation coeff (1 mo derivative) |

delay | -0.1467 | -0.1634 | Wave Delay Differential Equation coeff (1 yr delay) |

mA | 0.1712 | 0.2874 | Mathieu biennial modulation amplitude |

mP | 0.2782 | 0.3074 | Mathieu biennial modulation phase |

sina | 20.76 | 14.62 | Laplace tidal equation sin(A sin) scaling |

For each row, compare the values of the coefficients for the pre-1950 and post-1950 model fitting results. They do change, but not enough to imply a non-stationary underlying process.

So there appears to be plenty of room to over-fit the model to the data, yet the constraints imposed by the fixed lunar periods (with associated nonlinear products — squared, cubed, etc) and by the simultaneous fit to LOD prevents the curves from diverging too wildly in the extrapolated regions. That is the well-known behavior of poor cross-validation.

There is always the possibility that the model is missing another long-period tidal forcing that could incrementally improve the model. It may be more product terms such as the DA and DA2 used currently, so that perhaps D2A and D2A2 could be evaluated as well. From a Fourier analysis focusing on the residuals, we can get some other hints but this is where avoiding over-fitting is critical.

Another 1st half fit showing a striking extrapolation