by J. S. Marron, R. Z. Li and C. A. Giuliani

Some methods for studying time-varying frequencies in time series are given here. The motivating data were provided by C. A. Giuliani, of Allied Health Sciences, UNC.

Here is one trace of "tap" data, a record of the
movement of a person, while tapping a stylus on a pad as rapidly as
possible.
Height is recorded as a function of time, with a resulting time series
as shown here:

Because the sampling rate is not high with respect
to the features of interest, the data have been "augmented" by an
upsampling
process, which consists of using part of the Fourier decomposition of
the
series to generate data points at 4 times the original sampling rate.
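
The exact upsampling procedure is not spelled out here. A minimal sketch, assuming the common approach of zero-padding the discrete Fourier transform (the function name `fourier_upsample` is ours, for illustration only), could look like this:

```python
import numpy as np

def fourier_upsample(x, factor=4):
    """Upsample a real series by zero-padding its Fourier transform.

    Assumption: plain zero-padding of the spectrum; the upsampling used
    for the tap data may have kept only part of the spectrum.
    """
    x = np.asarray(x, dtype=float)
    n = x.size
    m = n * factor
    spectrum = np.fft.rfft(x)
    padded = np.zeros(m // 2 + 1, dtype=complex)
    padded[:spectrum.size] = spectrum
    if n % 2 == 0:
        # The old Nyquist bin is an ordinary bin in the longer transform,
        # so halve it to keep the original sample values unchanged.
        padded[n // 2] *= 0.5
    # irfft divides by m, the original transform implies division by n.
    return np.fft.irfft(padded, n=m) * factor
```

With this construction every `factor`-th point of the output reproduces the original series, and the points in between are the trigonometric interpolant.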

This series has a very strong periodic component, but both the height and the frequency change in time. Questions addressed here are:

(i) How do we understand these changes?

(ii) Are the insights we gain "really there", meaning are the
observed phenomena statistically significantly different from the
background
noise?

Approach 1: Classical Fourier Analysis

A simple Fourier approach to frequency modulation is to apply a triangular weight function to the Fourier representation of the data, and then invert that transform. Then the "low frequency modulation" can be derived as the "envelope of the high frequency carrier".

While the approach is simple and appealing, a critical assumption is that the carrier frequency has constant amplitude. This is clearly not true for the signal above, so the signal needs to be first adjusted to give nearly constant amplitude. This is done as follows.

Start with the raw (not upsampled) data shown at
the top, and the periodogram (proportional to the "discrete power
spectrum")
shown at the bottom:

The strong periodicity in the data shows up as a marked peak in the
periodogram. Since the interesting periodicities occur near that
peak (and other components will affect the frequency modulation
process),
reduce the data to only the Fourier components between the blue
vertical
bars (these were chosen by eye).

The resulting band limited part of the data is shown at the top of this picture, and a check on what was lost in the band limiting process is provided by the residuals at the bottom:

The residuals are visually smaller (note the same axes), and do not appear to have an interesting periodic component (at least visually). The "R square" values show how the "power of the data" is allocated between "power in this periodic component" = 76% and "residual power" = 24%.
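
The band limiting step and the "R square" split can be sketched as follows. This is only an illustration under stated assumptions: the band is given as a pair of Fourier bin indices `lo` and `hi` (hypothetical parameters standing in for the blue bars chosen by eye), and the mean is removed before power is computed.

```python
import numpy as np

def band_limit(x, lo, hi):
    """Keep only Fourier bins lo..hi (inclusive) of the mean-removed series.

    Returns the band limited series, the residuals, and the share of the
    total power (sum of squares) carried by each.
    """
    x = np.asarray(x, dtype=float)
    centered = x - x.mean()            # assumption: mean removed first
    spectrum = np.fft.rfft(centered)
    k = np.arange(spectrum.size)
    keep = (k >= lo) & (k <= hi)
    band = np.fft.irfft(np.where(keep, spectrum, 0.0), n=x.size)
    resid = centered - band
    total = np.sum(centered ** 2)
    return band, resid, np.sum(band ** 2) / total, np.sum(resid ** 2) / total
```

Because the retained and discarded Fourier components are orthogonal, the two shares add up to 1, which is what makes a statement like "76% plus 24%" meaningful.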

The "envelope" of the Full Band Filtered data
shows
the changes in magnitude. As noted above, to show changes in
frequency,
a triangular weight function can be applied to the spectrum, but this
requires
first removing the changing amplitude. This is done by obtaining
the envelope of the Full Band Filtered series, shown at the top, and
then
dividing the series by the envelope, with the result shown at the
bottom:

The envelope was obtained by finding the 0 crossing points of the first
differences, and taking the max of the series values on either
side.
Then linear interpolation was done to "connect the dots". Some
instabilities
in this were removed by using constant functions near each end.
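
One reading of this envelope construction is sketched below. The function names are ours, and the handling of ties and of the ends is an assumption: `np.interp` simply holds the first and last peak values constant outside the range of the peaks.

```python
import numpy as np

def max_envelope(x):
    """Upper envelope: local maxima joined by linear interpolation."""
    x = np.asarray(x, dtype=float)
    d = np.diff(x)
    # First differences going from positive to non-positive mark a local max
    # at the sample between the two differences.
    peaks = np.where((d[:-1] > 0) & (d[1:] <= 0))[0] + 1
    if peaks.size == 0:
        raise ValueError("no local maxima found")
    t = np.arange(x.size)
    # np.interp is constant before the first and after the last peak.
    return np.interp(t, peaks, x[peaks])

def amplitude_adjust(x):
    """Divide the series by its envelope (assumes a positive envelope)."""
    x = np.asarray(x, dtype=float)
    return x / max_envelope(x)
```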

Next, the periodogram of the Amplitude
Adjusted,
Full Band Filtered series is multiplied by a triangular weight
function.
The corresponding signal thus has "different frequencies shown as
amplitudes",
as shown in the top. Another application of the max envelope
operation
results in a curve whose height represents the "dominant frequency at
that
time", shown at the bottom:
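
The principle behind this step can be illustrated with a small sketch. It assumes that the weight is a single linear ramp rising across the retained band, applied to the Fourier coefficients; the precise shape of the "triangular" weight used for the figures is not given here, and `ramp_weight_filter`, `lo` and `hi` are hypothetical names.

```python
import numpy as np

def ramp_weight_filter(x, lo, hi):
    """Weight Fourier bins lo..hi by a linear ramp from 0 to 1, then invert.

    A constant-amplitude component at bin k comes out with amplitude
    (k - lo) / (hi - lo), so amplitude encodes frequency.
    """
    x = np.asarray(x, dtype=float)
    spectrum = np.fft.rfft(x)
    k = np.arange(spectrum.size)
    weight = np.where((k >= lo) & (k <= hi), (k - lo) / (hi - lo), 0.0)
    return np.fft.irfft(spectrum * weight, n=x.size)
```

For example, with `lo=20` and `hi=40`, a unit sine at bin 30 is returned with amplitude 0.5 and one at bin 35 with amplitude 0.75. Taking the envelope of the filtered series then gives a curve whose height tracks the dominant frequency, provided the input amplitude is constant.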

This shows several interesting features that fit with ideas in human
movement. The large scale features are a fairly rapid increase in
frequency early on, to a fairly high frequency steady state, followed
by
a gradual decline. This fits with conventional movement ideas, as
the startup frequency is low, and is increased until a comfortable
rhythm
is settled into. Later, as fatigue sets in, the frequency falls
off.
A deeper question concerns the apparent smaller scale changes in frequency. An explanation for these exists: to avoid fatigue from the repetitive movement, one makes some rather minor changes in many components of the movement, including body position, which have smaller impacts.

But are these small scale changes "really
there"?
Or are they simply artifacts of the noise in the movement and
measurement
processes, which has perhaps been magnified by the convoluted approach
taken to deriving this frequency curve? Another way to view this is to ask whether we can somehow attach "statistical significance" to features seen in the Frequency Modulation curve. I don't know of results of this type, but if you do, please tell me: marron@email.unc.edu.
If this has not been studied, then perhaps we are motivating some
mathematical
statistical work in the field of Fourier analysis of time series.
But not wanting to take the time to do this ourselves, or to wait for
others
to do it, we instead developed the following non-Fourier approach.

Approach 2: (Non-Fourier) Quasi Periodic Analysis

The main idea here can be understood by looking at wagon wheels in old Western movies. They appear to move in strange ways, e.g. often seeming to go backwards. If you look carefully, you will see that the motion depends a lot on the speed of the wagon. As the wagon is speeding up, the wheel can go from an apparent slow forward motion, to apparently stopping, to apparently going backwards. Of course this is a result of the movie being a succession of snapshots. When the wagon wheel is going slightly slower than the movie sampling rate, the wheel seems to go slowly backwards. When the speed reaches the sampling rate, it appears motionless. As the speed exceeds the sampling rate, it seems to go forwards. The key idea here is that a succession of snapshots can provide a tool for understanding changing frequencies.

To apply this idea to a signal, such as the tap motion trace at the top of the page, suppose the trace is on a strip of paper which is moved past a shuttered window. The shutters are opened periodically, at the "carrier frequency" (this is just what a movie camera does). If the trace is a sine wave, whose frequency is the carrier frequency, then the resulting movie shows a single arch of the sine function, and it holds still. If the sine wave frequency is slightly lower than the carrier frequency, then the arch of the sine wave moves to the left. If it is slightly higher, then it moves off to the right. In the presence of frequency modulation, the arch shifts location according to the frequency at the time.
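
The shuttered window idea can be sketched in a few lines. This is an illustration only, not the code behind the movies: the series is assumed to be sampled at unit time steps, `period` is the carrier period in samples (it need not be an integer), and the function names are ours.

```python
import numpy as np

def strobe_frames(x, period, points_per_frame=50):
    """Cut a series into successive frames, each one carrier period long."""
    x = np.asarray(x, dtype=float)
    t = np.arange(x.size)
    n_frames = int((x.size - 1) // period)
    # Positions within a frame, measured from the frame start.
    s = np.linspace(0.0, period, points_per_frame, endpoint=False)
    times = period * np.arange(n_frames)[:, None] + s[None, :]
    # Linear interpolation handles non-integer carrier periods.
    frames = np.interp(times.ravel(), t, x).reshape(times.shape)
    return s, frames

def peak_track(s, frames):
    """Location of the maximum within each frame."""
    return s[np.argmax(frames, axis=1)]
```

When the signal period equals the carrier period, `peak_track` returns a constant. When the two differ slightly, the peak location drifts steadily from frame to frame. Whether that drift reads as "left" or "right" depends on how the frames are laid out on the screen.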

Here is a toy example to illustrate this
principle.
(Caution: this is only a one frame screen
shot
of the movie. Pushing the buttons on the image won't do anything.)
To see the movie, go here. (If your
computer doesn't immediately show this movie, some advice can be found
at: http://www.unc.edu/~marron/marron_movies.html)


Some experimentation with toy examples, and with real data, showed that the visual impression of frequency modulation could be enhanced in several ways, as shown here. First, it is useful to overlay not only the curve in the present frame, but also the two curves before and the two curves after. To keep track of which curve is which, the current frame gets a thick line type, and successively thinner line types are used for the frames on each side. This gives a "fade in, then fade out" effect when watching the movie, which is especially helpful in the presence of noise. Second, for easy viewing of interesting phenomena near the edge of the picture, we found it helpful to highlight the circular nature of this type of view (i.e. to "look beyond the boundary") by showing half cycle periodic continuations of the picture beyond the boundaries (which are shown as vertical dashed lines). Thus, the part of the picture to the left of 0 is just a replication of the part just to the left of the vertical dashed line at the boundary at 6.6, and similarly on the right. Third, since the motion of the peak (which shows the important frequency modulation) is not easy to remember, a light blue trace of the location of the maximum in each frame is drawn at the top of the image. This trace shows frequency as a function of time (except that it is rotated 90 degrees from the way in which functions are usually displayed).

The movie shows a fairly coarsely sampled sine wave, whose frequency seems to change in time. The sinusoidal shape of
the light blue curve suggests a sinusoidal phase shift, which is
equivalent
to a sinusoidal change in frequency. The trace was actually
generated
by evaluating a sine function, with period 6.5, at an unequally spaced
grid of time points. The unequally spaced grid was chosen
according
to the "time warping function" shown here:

The warping function (actually a piecewise quadratic) looks very nearly
linear, but to accentuate its nonlinear character, its difference with
a line is shown in the lower panel. This difference reflects the
frequency modulation in the generated data trace, and also the light
blue
curve in the movie.
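
A toy trace of this kind can be generated as sketched below. The warping function here is a stand-in: a line plus a small sinusoid, not the piecewise quadratic used for the movie, and `depth` is a hypothetical parameter controlling how far the warp departs from a line.

```python
import numpy as np

def warped_sine(n=400, period=6.5, depth=1.5):
    """Sine of fixed period evaluated on an unequally spaced time grid."""
    t = np.arange(n, dtype=float)
    # Nearly linear warp; warp - t is the phase shift in time units.
    warp = t + depth * np.sin(2 * np.pi * t / n)
    x = np.sin(2 * np.pi * warp / period)
    # Local frequency is the slope of the warp divided by the period.
    local_freq = np.gradient(warp, t) / period
    return t, warp, x, local_freq
```

The difference `warp - t` plays the role of the lower panel above, and `local_freq` shows why a shift in phase and a change in frequency are two views of the same thing: the frequency is the derivative of the phase.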

This method is now applied to the real data trace shown at the top of this page (the upsampled version, because its improved smoothness gives a better visual impression). Go here to see the resulting movie. (Caution: this is only a one frame screen shot of the movie. Pushing the buttons on the image won't do anything. If this link doesn't start a movie on your computer, see the note at the toy example movie above.)


This movie shows both the changes in amplitude and frequency that were observed from the Fourier analysis above. The blue curve shows changes in frequency in a simple and direct sense. The large scale frequency concepts are the same as found above: low frequency at startup, followed by increasing frequency moving into a settled rhythm, followed by lower frequency as fatigue sets in. The smaller scale changes in the frequency are once again visually apparent (and the motion in the movie seems suggestive of something like a change in body position happening), but again it is not so clear whether these are "really there" or not.

The advantage of this approach is that its simplicity allows the use of known methods in tackling the main problem, such as SiZer, which is discussed in detail at http://www.unc.edu/~marron/DataAnalyses/SiZer_Intro.html. A family of smooths of the light blue curve is shown in the top panel, and the middle panel is the corresponding SiZer map. The SiZer map suggests that the only statistically significant features are the overall decrease (i.e. increase in frequency) at the beginning, and the overall increase at the end. The "hesitancy" seen in the movie, around time 1100, shows up only as a purple non-significance. However, SiZer is relatively weak at finding this type of structure, since this "hesitancy", as well as others, for example around times 500 and 2000, don't show up as changes in the slope.
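
The "family of smooths" idea can be sketched as follows. This shows only the scale space view, with a Gaussian kernel and a simple local average assumed for brevity. It does not implement the significance calculations behind the SiZer map, and it is not the SiZer software.

```python
import numpy as np

def smooth_family(t, y, bandwidths):
    """Gaussian kernel smooths of (t, y), one row per bandwidth."""
    t = np.asarray(t, dtype=float)
    y = np.asarray(y, dtype=float)
    smooths = []
    for h in bandwidths:
        # Kernel weight of every data point for every evaluation point.
        w = np.exp(-0.5 * ((t[:, None] - t[None, :]) / h) ** 2)
        smooths.append(w @ y / w.sum(axis=1))
    return np.array(smooths)
```

Small bandwidths follow the wiggles of the peak location trace closely, while large bandwidths show only the overall trend. Looking across the whole family avoids having to commit to a single level of smoothing.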

The bottom panel shows a SiCon analysis of these data. This works like SiZer, except that curvature, not slope, is studied.
Scale
space locations where the smooth is significantly concave are shaded
cyan
(light blue), orange is used where the curve is significantly convex,
and
green is used where there is no significant curvature. This is
especially
useful in situations where there is a dominant slope, and it is desired
to find perturbations in that, as shown here.

The SiCon analysis does show that the "hesitancies", around times 600, 1200 and 1900 (recall that these are quite visible in the movies), are statistically significant, at the level alpha = 0.05. This provides the first statistical confirmation that frequencies change in this relatively "small scale" type of way. As noted above, this is consistent with changes, such as changes of body position, that are made to avoid fatigue.

An important weakness of this type of analysis is that it requires a fairly coherent signal. Signals with a large amount of noise, or whose frequencies do not change in a relatively smooth way, may not give sensible answers (although pre-smoothing may help).

Another critical aspect of such analysis is the need to find a "carrier frequency". This can be done by trial and error (which was done for the above movies). It can also be done using Fourier Analysis, e.g. one could start with the Fourier peak that appears between the blue bands in the above spectrum.

An alternative approach, which does not use
Fourier
Analysis (and thus is not tied to *sin* and *cos* waves) is
based
on searching through "seasonal effects" in the data. This is
called
Visualization of PERiodicities. The idea is to study, for a range
of lags, *l=1,...,k*, how the "seasonal component of the series
at
lag *l*", relates to the rest of the signal. This is done
using
a "signal processing", i.e. "analysis of variance" viewpoint, but
thinking
of the "proportion of the power of the signal that is explained at that
lag", i.e. the "sum of squares that is explained by the component at
lag
*l*".
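
A minimal sketch of this computation is given below. It assumes that the "seasonal component at lag *l*" is the set of means of the mean-removed series over each of the *l* phases, as in a one-way analysis of variance. The function names are ours, and the series should be longer than the largest lag considered.

```python
import numpy as np

def seasonal_component(x, lag):
    """Phase means at the given lag, repeated along the series."""
    x = np.asarray(x, dtype=float)
    centered = x - x.mean()
    phase = np.arange(x.size) % lag
    means = np.array([centered[phase == j].mean() for j in range(lag)])
    return means[phase]

def percent_explained(x, max_lag=50):
    """Percent of the mean-removed sum of squares explained at lags 1..max_lag."""
    x = np.asarray(x, dtype=float)
    total = np.sum((x - x.mean()) ** 2)
    return np.array([100.0 * np.sum(seasonal_component(x, lag) ** 2) / total
                     for lag in range(1, max_lag + 1)])
```

Entry `lag - 1` of the result holds the percentage for that lag. Subtracting `seasonal_component(x, lag)` from the mean-removed series gives the residuals at that lag.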

Here is an example of this type of analysis, using the raw version of the tap location trace above. The top left panel shows the raw data tap vertical location trace in yellow, and the sample mean is shown as a magenta horizontal line. As is usual in ANOVA considerations, the mean is removed before consideration of ratios of sums of squares.

The lower right panel shows the percent of the total (with the mean removed) sum of squares that is represented by the seasonal effects at lags *l = 1,...,50*. The first large one occurs at lag *l = 9*, and note that at all succeeding multiples of *9*, the peak is at least this large (since the power of this seasonal effect is also found for all later seasonal effects, at lags *l = 9j*). For this same reason, there is a smaller "side peak" that is apparent at lags *l = 10, 20, 30, 40,...* This suggests power in the signal at periods between 9 and 10. To choose the "dominant peak", it is not enough to just take the biggest one, because of this "additive effect". To find the one that is "relatively largest", we use standard *F* statistic theory, and take the peak whose *F* statistic is "most significant" in the usual sense. The result in this case is highlighted with the light blue vertical line, at lag *l = 9*.
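
The *F* statistics can be sketched as follows, assuming the standard one-way ANOVA form with *l - 1* and *n - l* degrees of freedom at lag *l*. The function name is ours. Turning each statistic into a significance level requires the upper tail of the corresponding *F* distribution, which this sketch leaves out.

```python
import numpy as np

def seasonal_f_statistics(x, max_lag=50):
    """One-way ANOVA F statistic for the seasonal effect at lags 2..max_lag.

    Returns a list of (lag, F, df_model, df_resid). Assumes the series is
    longer than max_lag and is not exactly periodic (nonzero residuals).
    """
    x = np.asarray(x, dtype=float)
    n = x.size
    centered = x - x.mean()
    total = np.sum(centered ** 2)
    results = []
    for lag in range(2, max_lag + 1):
        phase = np.arange(n) % lag
        counts = np.bincount(phase, minlength=lag)
        sums = np.bincount(phase, weights=centered, minlength=lag)
        ss_model = np.sum(sums ** 2 / counts)   # sum of squares of phase means
        ss_resid = total - ss_model
        f_stat = (ss_model / (lag - 1)) / (ss_resid / (n - lag))
        results.append((lag, f_stat, lag - 1, n - lag))
    return results
```

Note that the degrees of freedom change with the lag, so the largest *F* value and the "most significant" one need not coincide. This is how the comparison corrects for the "additive effect" at multiples of a strong lag.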

The upper right panel shows this seasonal component, and also shows that the percent of the power in the signal (after the mean is subtracted) is about 20% (the peak looks shorter in the lower right panel because of the imprecision of the graphics). The vertical scale is the same as that of the raw data, to give a visual impression of what "20% of the power" means.

Additional insight comes from looking at the residuals, after the lag *l = 9* seasonal component is subtracted.
These
are shown in the plot on the lower left. Again the same visual
scale
is used to allow simple viewing of this and the seasonal components as
a decomposition of the data trace. Note that substantial
"periodic
structure" seems to remain in the data, which is quite consistent with
a changing frequency (as shown above).

It is tempting to try to find additional
periodicity
in the residual trace on the lower left above, by the same
method.
Here is the result:

The top row is the same as above. The center left panel is the same as the lower left panel above, and is the starting point of this analysis.

Again, all lags *l = 1,...,50* are
considered
in the lower right panel. This time the lag *l = 10* shows
up
as the strongest (highlighted as the light blue vertical line).
The
power of this seasonal component is only about 6% of the total, which
fits
the fact that it looks much smaller.

The seasonal component at lag *l = 10* is
shown
in the center right panel (again using the same vertical scale as
elsewhere,
for visual comparison).

The residuals from subtraction of this additional
lag *l = 10* seasonal component are shown in the lower
left.
Because the seasonal component is small, these residuals look similar
to
the ones immediately above. This shows that the apparent
periodicity
is not a "pure periodicity", which again suggests the apparent
periodicity
has some shifts of frequency. A natural next step in such an
analysis
would be to try the movies, of the type used above, to see if the
changes
in frequency can be tracked over time.

Note: this type of analysis assumes that all "trend"
has
been removed from the time series. Otherwise, the trend will
seriously
affect the lagged components.

For more about this type of analysis, inquire by email at marron@email.unc.edu.