Pseudo Splines

Trevor Hastie

Statistics and Data Analysis Research Group

AT&T Bell Laboratories

Murray Hill, New Jersey

January 20, 1994
© AT&T Bell Laboratories

Abstract

We describe a method for constructing a family of low rank, penalized, scatterplot smoothers. These pseudo splines have shrinking behavior similar to that of smoothing splines. They require two ingredients: a basis and a penalty sequence. The smooth is then computed by a generalized ridge regression.

The family can be used to approximate existing high rank smoothers in terms of their dominant eigenvectors. Our motivating example uses linear combinations of orthogonal polynomials to approximate smoothing splines, where the linear combination and the penalty sequence depend on the particular instance of the smoother being approximated.

As a leading application, we demonstrate the use of these pseudo splines in additive model computations. Additive models are typically fit by an iterative smoothing algorithm, and any features other than the fit itself are hard to compute. These include standard error curves, degrees of freedom, GCV, and influence diagnostics. By using a low rank pseudo-spline approximation for each of the smoothers involved, the entire additive fit can be approximated by a corresponding low rank approximation. This can be computed exactly and efficiently, and opens the door to a variety of computations that were not feasible before.

Keywords: Cubic smoothing splines; Ridge regression; Eigen-decomposition; Penalized least squares.
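
To make the two ingredients named in the abstract concrete, the following is a minimal sketch of a low rank penalized smoother computed as a generalized ridge regression. It is an illustration under stated assumptions, not the construction developed in the paper: the basis is a set of orthonormal polynomials obtained by Gram-Schmidt, the penalty sequence (zero for the constant and linear terms, then growing like j^4) is purely hypothetical, and the names `orthonormal_poly_basis` and `pseudo_spline_fit` are invented for this example. With an orthonormal basis q_j and penalties d_j, minimizing the penalized least squares criterion shrinks the j-th coefficient by the factor 1 / (1 + lambda * d_j), and the degrees of freedom (the trace of the smoother matrix) is the sum of these factors.

```python
import math


def orthonormal_poly_basis(x, k):
    """Return k orthonormal polynomial vectors (degrees 0..k-1) evaluated at x.

    Built by modified Gram-Schmidt on the monomials 1, x, x^2, ...
    Adequate for small k and x scaled to roughly [-1, 1].
    """
    basis = []
    for degree in range(k):
        v = [xi ** degree for xi in x]
        for q in basis:
            proj = sum(qi * vi for qi, vi in zip(q, v))
            v = [vi - proj * qi for qi, vi in zip(q, v)]
        norm = math.sqrt(sum(vi * vi for vi in v))
        basis.append([vi / norm for vi in v])
    return basis


def pseudo_spline_fit(x, y, penalties, lam):
    """Generalized ridge regression on an orthonormal polynomial basis.

    penalties: one nonnegative penalty d_j per basis vector (assumed here,
               not taken from the paper).
    lam:       smoothing parameter lambda >= 0.
    Returns (fitted values, degrees of freedom).
    """
    basis = orthonormal_poly_basis(x, len(penalties))
    # Shrinkage factor applied to each basis coefficient.
    shrink = [1.0 / (1.0 + lam * d) for d in penalties]
    coefs = [
        s * sum(qi * yi for qi, yi in zip(q, y))
        for q, s in zip(basis, shrink)
    ]
    fitted = [sum(c * q[i] for c, q in zip(coefs, basis)) for i in range(len(x))]
    # Trace of the smoother matrix Q diag(shrink) Q^T, since Q is orthonormal.
    df = sum(shrink)
    return fitted, df


if __name__ == "__main__":
    x = [i / 10 - 1 for i in range(21)]
    y = [math.sin(math.pi * xi) for xi in x]
    # Hypothetical penalty sequence: constant and linear terms unpenalized.
    penalties = [0.0, 0.0] + [float(j ** 4) for j in range(2, 6)]
    fitted, df = pseudo_spline_fit(x, y, penalties, lam=0.01)
    print("degrees of freedom:", round(df, 3))
```

In this sketch, setting lambda to zero gives the ordinary least squares projection onto the basis, while a very large lambda shrinks the fit toward the unpenalized components (here, a straight line), which mirrors the shrinking behavior described above.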

1 Introduction

Let x and y denote a set of n observations. A scatterplot smooth of y against x is a function of the data: $s(x_0) = S(x_0 \mid x, y)$, which at each $x_0$ summarizes the dependence of y on x, usually in a flexible but smooth way. A smoother is linear if $S(x_0 \mid x, y) = \sum_{i=1}^{n} s(i, x_0, x)\, y_i$