Monday, October 2, 2023

Empirical Dynamic Modeling with rEDM: (1) Reconstructing the original dynamics


*This is an English version of my previous post (first translated by ChatGPT and checked manually).

Reconstruction of dynamics using time series data

1. Time series data

Time series data are data obtained by observing and measuring a variable in a system over time. For example, if you measure your weight (the variable of interest) every day (with your body as the system of interest), the resulting record is a time series.

When visualizing time series data, it is common to plot time on the horizontal axis ($x$-axis) and the variable on the vertical axis ($y$-axis). For example, the following graph shows the Nile dataset provided in R (a time series of changes in the flow of the Nile River), with time on the horizontal axis and the flow on the vertical axis.

[Figure: time series plot of the Nile dataset]

2. Delineating/Reconstructing the dynamics of a system from time series data

“Delineating/Reconstructing the dynamics of a system from time series data” is an important concept in Empirical Dynamic Modeling, a nonlinear time series analysis framework that will be explained in the next post. Here we explain this concept using a famous set of three-variable differential equations, the “Butterfly Attractor” (also known as the Lorenz Attractor), as an example (Lorenz 1963). The Butterfly Attractor refers to the dynamics that appear when the differential equations below have the parameters $p = 10$, $r = 28$, and $b = 8/3$.

$$
\begin{aligned}
\frac{dX}{dt} &= -pX + pY\\
\frac{dY}{dt} &= -XZ + rX - Y\\
\frac{dZ}{dt} &= XY - bZ
\end{aligned}
$$

The time series generated by these equations when $p = 10$, $r = 28$, and $b = 8/3$ are shown in the figure below.
[Figure: time series of $X$, $Y$, and $Z$]
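The following is a minimal sketch of how such time series could be generated. The post does not say which integrator, step size, or initial state was used, so the fourth-order Runge-Kutta scheme, `dt = 0.01`, and the starting point `(1, 1, 1)` are assumptions made only for illustration, as are the function names.

```python
def lorenz_rhs(state, p=10.0, r=28.0, b=8.0 / 3.0):
    """Right-hand side of the three differential equations above."""
    x, y, z = state
    return (-p * x + p * y, -x * z + r * x - y, x * y - b * z)


def rk4_step(f, state, dt):
    """Advance `state` by one fourth-order Runge-Kutta step of size dt."""
    k1 = f(state)
    k2 = f(tuple(s + 0.5 * dt * k for s, k in zip(state, k1)))
    k3 = f(tuple(s + 0.5 * dt * k for s, k in zip(state, k2)))
    k4 = f(tuple(s + dt * k for s, k in zip(state, k3)))
    return tuple(
        s + dt / 6.0 * (a1 + 2.0 * a2 + 2.0 * a3 + a4)
        for s, a1, a2, a3, a4 in zip(state, k1, k2, k3, k4)
    )


def simulate_lorenz(n_steps=5000, dt=0.01, start=(1.0, 1.0, 1.0)):
    """Return a list of (X, Y, Z) states sampled every dt time units.

    n_steps, dt, and start are illustrative assumptions, not values
    taken from the post.
    """
    states = [start]
    for _ in range(n_steps):
        states.append(rk4_step(lorenz_rhs, states[-1], dt))
    return states


trajectory = simulate_lorenz()

# Each variable's time series is one coordinate of the trajectory.
X = [s[0] for s in trajectory]
Y = [s[1] for s in trajectory]
Z = [s[2] for s in trajectory]

print(len(X), X[:3])
```

Plotting `X`, `Y`, and `Z` against the step index should give three time series panels of the same kind as the figure, although the exact curves depend on the assumed initial state and step size.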

These three time series together represent the characteristics of the dynamics. However, is this the only way to represent these dynamics? It is possible to represent the same data in a completely different way. For example, if we treat $X$, $Y$, and $Z$ as coordinates on the $x$-axis, $y$-axis, and $z$-axis of a three-dimensional space, the data look like the following (here $X$ and $Z$ are plotted on the axes and $Y$ is represented by color).

[Figure: the Butterfly Attractor, with $X$ and $Z$ on the axes and $Y$ shown by color]

This butterfly-shaped form (= Butterfly Attractor) contains the same information as the original three time series plots (except for time information). The information about which time and which location the points come from is not displayed here, but if we show the movement (temporal behaviour) of the points in this plot with an animation, we can represent the original information without leaving anything out.

Here we drew a single set of dynamics in a three-dimensional space from three time series, but we can also say that the three time series were generated from a single set of dynamics in a three-dimensional space. In other words, if we project the temporal movement of the points on the Butterfly Attractor onto the $x$-axis, we get the time series of $X$, and if we project it onto the $y$-axis and $z$-axis, we get the time series of $Y$ and $Z$, respectively. Time series data can be seen as something obtained by projecting the “true dynamics” (assuming we can actually observe them) onto some axis. For example, a person’s weight can be seen as data obtained by projecting the “true dynamics”, which involve complex interactions among various factors (such as food intake, physical activity, stress, etc.), onto one axis.

These explanations may be better understood by watching the following animation on YouTube:
Introduction to Empirical Dynamic Modeling

3. State space reconstruction using time-delay embedding

If we can accurately illustrate the dynamics of a system, it is extremely useful. For example, it may be possible to predict what is likely to happen in the near future by investigating the past behavior of the system, or to predict how much external force is needed to return a perturbed system to its original state.

However, if the system of interest is a natural system, it is not easy to accurately delineate its dynamics. This is because, in order to accurately delineate the dynamics, it is necessary to obtain time series data for all variables that are involved in the dynamics of the system, as in the example of the Butterfly Attractor. In the case of a natural system, we do not know how many variables are involved in the dynamics of the system. It could be two, ten, or even a hundred. Also, even if we knew how many variables need to be observed/measured, it could be technically impossible to measure all of them. Considering these factors, it seems almost impossible to measure all variables related to the dynamics of a system and accurately delineate the dynamics of the system of interest.

In such situations, there is a mathematical theorem that can help accurately delineate the state of the system (which we will call reconstruction). The Takens embedding theorem (Takens, 1981; https://en.wikipedia.org/wiki/Takens's_theorem) shows that the dynamics of a system can be reconstructed even from a single variable by taking a time-delay coordinate system.

Let’s take a look at what this means specifically.

First, let’s create a time-delay coordinate system based on $X(t)$. Here, let’s create $X(t-1)$ and $X(t-2)$. Then, instead of the original $\{X(t), Y(t), Z(t)\}$, let’s plot $\{X(t), X(t-1), X(t-2)\}$ in a three-dimensional space. As a result, the dynamics shown on the left of the figure appear. If we do the same for $Y(t)$, the dynamics shown on the right of the figure appear.

[Figure: time-delay reconstructions based on $X(t)$ (left) and $Y(t)$ (right)]
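The construction of time-delay coordinates can be sketched in a few lines. This is only an illustration: the function name `delay_embed` and its parameters `dim` and `lag` are hypothetical (they are not the rEDM API), the lag is counted in sampling steps, and a short toy series stands in for $X(t)$.

```python
def delay_embed(series, dim=3, lag=1):
    """Build time-delay coordinate vectors from a single time series.

    Each vector is (x[t], x[t - lag], ..., x[t - (dim - 1) * lag]).
    The first (dim - 1) * lag time points are dropped because their
    lagged values do not exist.
    """
    offset = (dim - 1) * lag
    return [
        tuple(series[t - k * lag] for k in range(dim))
        for t in range(offset, len(series))
    ]


# Toy series standing in for X(t); any equally spaced series works.
x = [0.0, 1.0, 4.0, 9.0, 16.0, 25.0]

# Three-dimensional embedding: {X(t), X(t-1), X(t-2)}.
embedded = delay_embed(x, dim=3, lag=1)
for point in embedded:
    print(point)
```

Here the first vector is `(4.0, 1.0, 0.0)`, that is, the value at the third time point followed by its two lagged values. Passing a simulated $X$ series instead of the toy list and plotting the resulting triplets in three dimensions would give a reconstruction of the kind shown on the left of the figure.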

Please compare the figure where $X$, $Y$, and $Z$ are plotted in a three-dimensional space with the plots of the time-delayed time series. You will see that the shape reconstructed from the time-delayed time series is very similar to the original shape.

In other words, even if you can monitor only a single variable of a system of interest, it is possible to reconstruct the dynamics of the system using time-delay embedding.

4. Why can we reconstruct the original dynamics from a single time series?

It may be surprising (especially for empirical researchers) that the dynamics of a system can be reconstructed by taking time delays and plotting them. Why is this possible? Understanding the proof of Takens’ theorem requires knowledge of differential geometry, which is challenging for empirical ecologists. Here, instead of working through the proof, let’s try to interpret the theorem using the procedure introduced by Munch et al. (2019).

First, let’s consider a measurable variable $y$ and an unmeasurable variable $z$, whose temporal changes are expressed by the following equations:

$$
\begin{aligned}
y_{t+1} &= F(y_t, z_t)\\
z_{t+1} &= G(y_t, z_t)
\end{aligned}
$$

That is, $y_{t+1}$ is generated by transforming $y_t$ and $z_t$ with a function $F$, and $z_{t+1}$ is generated by transforming $y_t$ and $z_t$ with a function $G$. In this situation, Takens’ embedding theorem states that the dynamics can be reconstructed using only $y$.

Now, since $z_{t+1} = G(y_t, z_t)$, we have $z_t = G(y_{t-1}, z_{t-1})$. Substituting this into the equation for $y_{t+1}$, we get:

$$
y_{t+1} = F(y_t, z_t) = F(y_t, G(y_{t-1}, z_{t-1}))
$$

At this point, if we’re lucky, we might be able to express $z_{t-1}$ in a different way using $y_{t+1} = F(y_t, z_t)$. That is, we can go back one step in time and write $y_t = F(y_{t-1}, z_{t-1})$, which is an equation involving $y_t$, $y_{t-1}$, and $z_{t-1}$, so we might be able to solve it for $z_{t-1}$. Let’s assume that we can express $z_{t-1}$ as $z_{t-1} = \Phi(y_t, y_{t-1})$. Then, $y_{t+1}$ can be expressed as follows:

$$
y_{t+1} = F(y_t, z_t) = F(y_t, G(y_{t-1}, \Phi(y_t, y_{t-1})))
$$

Now, $y_{t+1}$ is expressed in terms of $y_t$ and its time delay $y_{t-1}$ only, and $z$ disappears.

If we cannot solve for $z_{t-1}$ in the form $z_{t-1} = \Phi(y_t, y_{t-1})$, we go one more step back in time. Substituting $z_{t-1} = G(y_{t-2}, z_{t-2})$ gives $y_{t+1} = F(y_t, G(y_{t-1}, G(y_{t-2}, z_{t-2})))$, and we then try to express $z_{t-2}$ in terms of $y_t$, $y_{t-1}$, and $y_{t-2}$. By repeating this process, we can express what was originally represented by $y$ and $z$ in terms of time delays of $y$. For more details, please refer to pages 3-4 of Munch et al. (2019).
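The substitution argument can be checked numerically on a toy system. The functions below are hypothetical choices made only for this sketch (they are not taken from Munch et al. 2019): $F(y, z) = \sin y + 0.5z$ and $G(y, z) = \cos y + 0.5z$, for which $y_t = F(y_{t-1}, z_{t-1})$ can be solved exactly as $z_{t-1} = \Phi(y_t, y_{t-1}) = 2(y_t - \sin y_{t-1})$. The initial values are also arbitrary.

```python
import math


def F(y, z):
    """Update rule for the measurable variable y (illustrative choice)."""
    return math.sin(y) + 0.5 * z


def G(y, z):
    """Update rule for the unmeasurable variable z (illustrative choice)."""
    return math.cos(y) + 0.5 * z


def Phi(y_now, y_prev):
    """Solve y_t = F(y_{t-1}, z_{t-1}) for z_{t-1} using only y values."""
    return 2.0 * (y_now - math.sin(y_prev))


def predict_from_lags(y_now, y_prev):
    """y_{t+1} = F(y_t, G(y_{t-1}, Phi(y_t, y_{t-1}))), with no z involved."""
    return F(y_now, G(y_prev, Phi(y_now, y_prev)))


# Simulate the full two-variable system from arbitrary initial values.
y, z = [0.1], [0.2]
for _ in range(50):
    y_next = F(y[-1], z[-1])
    z_next = G(y[-1], z[-1])
    y.append(y_next)
    z.append(z_next)

# Compare the true y_{t+1} with the value computed from y_t and y_{t-1} alone.
max_error = max(
    abs(predict_from_lags(y[t], y[t - 1]) - y[t + 1])
    for t in range(1, len(y) - 1)
)
print(max_error)
```

The printed error should be at the level of floating-point rounding, which means that the next value of $y$ is fully determined by $y_t$ and $y_{t-1}$ in this toy system. This works because $F$ was chosen to be easy to invert with respect to $z$. For a general $F$, a closed-form $\Phi$ may not exist, which is the case where further lags are needed.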

In the next post, we will explain how to analyze time series data using the reconstructed dynamics, along with how to use the R package rEDM.

References

  • Lorenz, E. N. (1963) Deterministic Nonperiodic Flow, Journal of Atmospheric Sciences, Vol.20, pp.130-141
  • Takens, F. (1981) Detecting strange attractors in turbulence. In D. Rand & L.-S. Young (Eds.), Lecture Notes in Mathematics (Vol. 898, pp. 366–381).
  • Munch, S. B., Brias, A., Sugihara, G., Rogers, T. L. (2019) Frequently asked questions about nonlinear dynamics and empirical dynamic modelling. ICES Journal of Marine Science, doi:10.1093/icesjms/fsz209
