Expertise:
Beginner
What data am I working with?
We will work here with height measurements on 54 girls taken from the
Berkeley growth study. Each girl was measured at
31 ages, and these range from 1 to 18 years. The ages are not equally spaced:
measurements were taken every three months until age two, every year until age eight,
and finally every six months from eight to eighteen years. More measurements
were taken where growth was more rapid, such as infancy and the adolescent
growth spurt, and fewer were taken during the early childhood years, when
growth was steadier.
The data can be obtained from the FDA software site.
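As a sketch, the age grid described above can be built as follows. The variable names are ours, and we assume that the quarterly measurements include age two and the yearly ones include age eight, which is consistent with the stated total of 31 ages.

```python
# Quarterly from age 1 to 2, yearly from 3 to 8, half-yearly from 8.5 to 18.
quarterly = [1 + 0.25 * i for i in range(5)]      # 1, 1.25, ..., 2
yearly = [float(age) for age in range(3, 9)]      # 3, 4, ..., 8
half_yearly = [8.5 + 0.5 * i for i in range(20)]  # 8.5, 9, ..., 18
ages = quarterly + yearly + half_yearly

n_girls = 54
# The raw data can then be thought of as a table with one row per age
# and one column per girl.
table_shape = (len(ages), n_girls)
```

The spacing between successive ages is 0.25, 1 or 0.5 years, which is why methods that assume equally spaced observations do not apply directly.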
What's the first step in converting my raw data into functional form?
A functional data object expresses a function as a weighted sum, or linear
combination, of elementary functional building blocks called basis functions.
The conversion of the growth data to functional form requires two
steps: choosing and defining a set of basis functions, and computing
the best linear combination for each girl's set of
discrete height measurements. Nonparametric smoothing methods developed over
the past three decades, such as kernel and spline smoothing, have been applied
to growth data. These methods have been successful at detecting features missed
by parametric models, but they are not guaranteed to produce curves that always
increase.
We choose Bspline basis functions as our basis function system.
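The two steps can be sketched in plain Python as follows. This is a minimal illustration under our own assumptions, not the FDA software's implementation: the order (4, i.e. cubic), the interior knots, all function names, and the synthetic height record are made up for the example, and "best" is taken to mean ordinary least squares.

```python
import math


def bspline(t, knots, i, order):
    """Cox-de Boor recursion: value at t of the i-th B-spline of the given order."""
    if order == 1:
        if knots[i] >= knots[i + 1]:
            return 0.0  # empty interval between repeated knots
        # Half-open intervals, except that the final interval includes its right end.
        inside = knots[i] <= t < knots[i + 1]
        at_right_end = t == knots[i + 1] == knots[-1]
        return 1.0 if inside or at_right_end else 0.0
    left = right = 0.0
    d1 = knots[i + order - 1] - knots[i]
    d2 = knots[i + order] - knots[i + 1]
    if d1 > 0:
        left = (t - knots[i]) / d1 * bspline(t, knots, i, order - 1)
    if d2 > 0:
        right = (knots[i + order] - t) / d2 * bspline(t, knots, i + 1, order - 1)
    return left + right


def solve(a, b):
    """Solve the square system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def fit_coefficients(ages, heights, knots, order):
    """Least-squares weights for the basis expansion, via the normal equations."""
    nbasis = len(knots) - order
    phi = [[bspline(t, knots, k, order) for k in range(nbasis)] for t in ages]
    gram = [[sum(row[p] * row[q] for row in phi) for q in range(nbasis)]
            for p in range(nbasis)]
    rhs = [sum(row[p] * y for row, y in zip(phi, heights)) for p in range(nbasis)]
    return solve(gram, rhs), phi


# Step 1: define the basis (order and interior knots are assumptions).
order = 4
interior = [2.0, 5.0, 8.0, 13.0]
knots = [1.0] * order + interior + [18.0] * order

# Step 2: compute the best linear combination for one record.
ages = ([1 + 0.25 * i for i in range(5)]
        + [float(a) for a in range(3, 9)]
        + [8.5 + 0.5 * i for i in range(20)])
# Synthetic heights in cm, NOT the Berkeley measurements.
heights = [75.0 + 90.0 * (1.0 - math.exp(-0.15 * (t - 1.0))) for t in ages]
coefs, phi = fit_coefficients(ages, heights, knots, order)
fitted = [sum(c * v for c, v in zip(coefs, row)) for row in phi]
```

With real data, step 2 would be repeated for each of the 54 girls, giving one coefficient vector per girl for the same basis.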
Why Bspline basis functions?
We want a basis system that can fit the most general kind of data,
and growth data seem to be like this. The data for a single girl
range from heights of about 75 centimeters at one year old to about
165 centimeters at full height. (Most people are surprised that an
adult is only a bit over twice as tall as a one-year-old child.)
Bspline functions are extremely flexible building blocks for fitting
curves ("B" stands for "basis").
Moreover, we will want to look at the velocity and acceleration
of height later, and this will require controlling the smoothness
of our basis functions. We can do this easily with splines. You
may wish to go to the description of basis systems at this point.
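To illustrate how the order of a spline controls smoothness, the sketch below compares the slope on either side of a knot for a B-spline of order 2 (piecewise linear) and of order 4 (cubic). A spline of order m with simple knots has m - 2 continuous derivatives, so the order must be chosen high enough for velocity and acceleration to be smooth. The knots and function names here are hypothetical, and the slopes are estimated by one-sided finite differences rather than exact derivatives.

```python
def bspline(t, knots, i, order):
    """Cox-de Boor recursion: value at t of the i-th B-spline of the given order."""
    if order == 1:
        return 1.0 if knots[i] <= t < knots[i + 1] else 0.0
    left = right = 0.0
    d1 = knots[i + order - 1] - knots[i]
    d2 = knots[i + order] - knots[i + 1]
    if d1 > 0:
        left = (t - knots[i]) / d1 * bspline(t, knots, i, order - 1)
    if d2 > 0:
        right = (knots[i + order] - t) / d2 * bspline(t, knots, i + 1, order - 1)
    return left + right


def slope_jump(knots, order, at, h=1e-5):
    """Absolute difference between the right- and left-hand slopes of the
    first B-spline at the point `at`, estimated by one-sided differences."""
    def f(t):
        return bspline(t, knots, 0, order)
    right_slope = (f(at + h) - f(at)) / h
    left_slope = (f(at) - f(at - h)) / h
    return abs(right_slope - left_slope)


knots = [0.0, 1.0, 2.0, 3.0, 4.0]           # hypothetical equally spaced knots
jump_order2 = slope_jump(knots, 2, at=1.0)  # "hat" function: slope flips at the knot
jump_order4 = slope_jump(knots, 4, at=1.0)  # cubic: slope is continuous at the knot
```

Here `jump_order2` is close to 2, because the hat function changes slope from +1 to -1 at its peak, while `jump_order4` is close to zero.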
By contrast, Fourier series are often useful, but only for data
that are strongly periodic and thus show clear repeating cycles. The growth
data do not.
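The reason can be seen in a small sketch: every Fourier basis function repeats after one period, so any weighted sum of them does too, and a fitted curve would have to return at age 18 to the value it had at age 1. The function name, the unnormalized form of the basis, and the choice of period (the 17-year span of the ages) are assumptions made for this illustration.

```python
import math


def fourier_basis(t, period, n_pairs):
    """Constant term followed by n_pairs sine/cosine pairs, evaluated at t."""
    omega = 2.0 * math.pi / period
    values = [1.0]
    for k in range(1, n_pairs + 1):
        values.append(math.sin(k * omega * t))
        values.append(math.cos(k * omega * t))
    return values


period = 17.0  # span of the ages, 18 - 1 (assumed choice)
at_start = fourier_basis(1.0, period, n_pairs=3)
at_end = fourier_basis(1.0 + period, period, n_pairs=3)
# Largest difference between the basis values one full period apart.
max_gap = max(abs(a - b) for a, b in zip(at_start, at_end))
```

`max_gap` is zero up to rounding error, whereas the heights themselves differ by roughly 90 centimeters between ages 1 and 18.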
However, let us note here for later consideration that the fitting methods
that we use here do not recognize that height increases with age, and therefore
that our curves should in principle always be increasing.
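One simple way to see whether a fitted curve respects this, sketched here with a hypothetical helper and made-up fitted values, is to look for consecutive ages over which the fitted height goes down.

```python
def decreasing_steps(ages, fitted):
    """Return (age_from, age_to) pairs over which the fitted height decreases."""
    return [
        (t0, t1)
        for t0, t1, y0, y1 in zip(ages, ages[1:], fitted, fitted[1:])
        if y1 < y0
    ]


# Made-up fitted values near full height, for illustration only.
ages = [16.0, 16.5, 17.0, 17.5, 18.0]
fitted = [164.2, 164.8, 165.1, 165.0, 165.2]
violations = decreasing_steps(ages, fitted)
```

In this example `violations` contains the single pair `(17.0, 17.5)`. Small dips like this are most likely near full height, where true growth is nearly zero and measurement noise dominates.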
