International Journal of Analysis and Applications
Volume 18, Number 5 (2020), 689-698
DOI: 10.28924/2291-8639-18-2020-689

SOME NOTES ON ERROR ANALYSIS FOR KERNEL BASED REGULARIZED INTERPOLATION

QING ZOU*

Applied Mathematics and Computational Sciences, The University of Iowa, IA 52242, USA
*Corresponding author: zou-qing@uiowa.edu

Abstract. Kernel based regularized interpolation is one of the most important methods for approximating functions. The theory behind it is the well-known Representer Theorem, which gives the form of the approximating function in a reproducing kernel Hilbert space. Because of its advantages, kernel based regularized interpolation is widely used in many mathematical and engineering applications, for example dimension reduction and dimension estimation. However, the performance of the approximation is not fully understood from a theoretical perspective; in other words, error analysis for kernel based regularized interpolation is lacking. In this paper, some error bounds in terms of the reproducing kernel Hilbert space norm and the Sobolev space norm are given to describe the behavior of the approximating function.

1. Introduction

Approximating functions in high-dimensional spaces is one of the central problems in both mathematics and engineering. Many real-world problems can be viewed as function approximation problems. For example, the classification problem in engineering can be viewed as approximating a function whose values give the classes that the inputs belong to [7], and in image processing, patch-based image denoising can be seen as approximating a function from noisy patches to clean pixels.

Received May 12th, 2020; accepted June 4th, 2020; published June 17th, 2020.
2010 Mathematics Subject Classification. 65D05, 41A05, 41A10, 65J05, 65G99.
Key words and phrases.
error bound; kernel based regularized interpolation; reproducing kernel Hilbert space; Sobolev space.
©2020 Authors retain the copyrights of their papers, and all open access articles are distributed under the terms of the Creative Commons Attribution License.

Mathematically speaking, the function approximation problem is to approximate an unknown continuous function f : X → R from the knowledge of some observations {(x_i, y_i)}_{i=1}^n ⊂ X × R, where X ⊂ R^d, d ≥ 1, is the input space and n ∈ N is the number of observations.

One of the classical ways of approximating functions to explain real-world phenomena is Kriging (a.k.a. Gaussian process regression) [4]. However, Kriging requires a large number of observations, which are expensive to obtain. When only a few samples are given, Kriging yields non-stationary behavior [10]. An alternative for approximating multivariate functions is the inverse distance weighting (IDW) method, originally proposed in [9]. The key assumption of IDW is that points close to each other are more alike than points far apart. IDW therefore builds the prediction at a point from the observed points close to it; in other words, the measured values close to the prediction location have more influence on the predicted value than those far away. Based on this, the approximating function can be represented as

    f̂(x) = ( Σ_{i=1}^n ||x − x_i||^{−α} y_i ) / ( Σ_{i=1}^n ||x − x_i||^{−α} ),

where α is a parameter that usually takes the value 2. IDW achieves good accuracy at points near the observations, but it does not work well at points far from the observations. This is one drawback of IDW. Another drawback is that the gradient of the IDW surface vanishes at the observations.
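The IDW formula above can be sketched in a few lines. The following is a minimal illustration (not from the paper), with α = 2 and an explicit special case for prediction points that coincide with an observation, where the weights would otherwise blow up:

```python
import numpy as np

def idw_interpolate(x, xs, ys, alpha=2.0, eps=1e-12):
    """Inverse distance weighting prediction at a single point x.

    xs: (n, d) array of observation locations, ys: (n,) observed values.
    alpha is the power parameter (commonly 2). If x coincides with an
    observation, return that observed value directly.
    """
    d = np.linalg.norm(xs - x, axis=1)
    hit = d < eps
    if hit.any():
        return ys[hit][0]
    w = d ** (-alpha)                    # closer points get larger weights
    return float(w @ ys / w.sum())

xs = np.array([[0.0], [1.0], [2.0]])
ys = np.array([0.0, 1.0, 4.0])           # samples of f(x) = x^2
print(idw_interpolate(np.array([1.0]), xs, ys))  # → 1.0 (exact at an observation)
```

Note that between observations the prediction is always a convex combination of the observed values, which is why IDW cannot extrapolate beyond the range of the data and why its gradient vanishes at the observation points.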
Considering the drawbacks of these methods, the kernel based regularized interpolation (see (2.2) for details) was proposed. In one dimension it is similar to spline interpolation, and it has several advantages. First of all, it works in any dimension and in many different settings [1, 2]. It also enjoys a solid theoretical foundation — the representer theorem — which we state below. Furthermore, compared with other methods, one can obtain the same approximation accuracy by solving a better-conditioned linear system. Kernel based regularized interpolation also works when the observations are corrupted by noise (see Section 2 for the discussion), which is one of its biggest advantages over other interpolation methods. Last but not least, we have considerable flexibility in the choice of kernel. Kernel based regularized interpolation is also widely used in other engineering problems, for example clustering, dimension reduction [1] and dimension estimation [11].

In view of these advantages, there is a good deal of work on applying kernel based regularized interpolation in different settings, as mentioned above. However, work on its error analysis is very limited. It is worthwhile to reflect on the kind of error bounds needed to characterize the behavior of functions learned from observations. In this paper, we provide some error analysis for the kernel based regularized interpolation from the function space point of view; in other words, we focus on deriving Hilbert space type and Sobolev space type error estimates.

2. Preliminary

Let us recall some basic facts about kernel methods that we need for the analysis. We consider a positive definite kernel K : X × X → R, i.e., for all n ∈ N, x_1, ..., x_n ∈ X and a_1, ..., a_n ∈ R,

    Σ_{i=1}^n Σ_{j=1}^n a_i a_j K(x_i, x_j) ≥ 0.

Associated with the kernel K there exists a unique native Hilbert space H_K of functions from X to R. The elements of H_K are of the form

    f(·) = Σ_{i∈I} α_i K(·, x_i),

where I is a countable set. The inner product on H_K is given by

    ⟨f, g⟩_{H_K} := Σ_{i∈I} Σ_{j∈J} α_i β_j K(x_i, x_j),

where g = Σ_{j∈J} β_j K(·, x_j) ∈ H_K, and the norm is ||f||_{H_K} = ⟨f, f⟩_{H_K}^{1/2}. The Hilbert space H_K is called a reproducing kernel Hilbert space (RKHS) because K acts as a reproducing kernel on H_K, i.e.,

    f(x) = ⟨f(·), K(·, x)⟩_{H_K}   ∀f ∈ H_K.

We require the kernel to be positive definite because linear systems are involved when we approximate functions: positive definiteness guarantees that these linear systems are well behaved and that the approximation problem can be solved in H_K from the given observations {(x_i, y_i)}_{i=1}^n. Indeed, one can solve the function approximation problem by considering the regularization problem with regularization parameter λ > 0:

    min_{f∈H_K} Σ_{i=1}^n (f(x_i) − y_i)² + λ ||f||²_{H_K}.   (2.1)

The solution of this regularization problem is characterized by the Representer Theorem [8], which states that the solution of (2.1) has the form

    f(x) = Σ_{i=1}^n α_i K(x, x_i).   (2.2)

This expression is called the kernel based regularized interpolation. In the rest of this paper, we write I_X f for the kernel based regularized interpolation, i.e.,

    I_X f = Σ_{i=1}^n α_i K(·, x_i).

To carry out the error analysis, we need the true function. We use f to denote the true function, which is known only at the sampling locations, i.e., we only have the information y_i = f(x_i). The goal of this work is to bound the error between the true function f and the kernel based regularized interpolation I_X f.
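For the squared loss in (2.1), substituting the representer form (2.2) reduces the minimization to a finite linear system: (K + λI)α = y, where K_{ij} = K(x_i, x_j). The following is a minimal sketch of this reduction; the Gaussian kernel and the test data are illustrative choices, not prescribed by the paper:

```python
import numpy as np

def gaussian_kernel(x, y, sigma=0.15):
    # Illustrative choice; any (strictly) positive definite kernel works.
    return np.exp(-((x - y) ** 2) / (2 * sigma ** 2))

def fit(xs, ys, lam):
    """Solve (K + lam*I) alpha = y, the finite system that the squared-loss
    problem (2.1) reduces to under the representer form (2.2)."""
    n = len(xs)
    K = np.array([[gaussian_kernel(xs[i], xs[j]) for j in range(n)]
                  for i in range(n)])
    return np.linalg.solve(K + lam * np.eye(n), ys)

def interpolant(x, xs, alpha):
    """Evaluate I_X f(x) = sum_i alpha_i K(x, x_i)."""
    return sum(a * gaussian_kernel(x, xi) for a, xi in zip(alpha, xs))

xs = np.linspace(0.0, 1.0, 8)
ys = np.sin(2 * np.pi * xs)          # samples of the "true" function
alpha = fit(xs, ys, lam=1e-8)
# With a tiny lambda the fit is nearly interpolatory at the nodes:
print(abs(interpolant(xs[0], xs, alpha) - ys[0]))
```

Increasing λ moves the residual at the nodes away from zero while shrinking ||f||_{H_K}, which is exactly the accuracy/stability trade-off discussed below.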
It is worth mentioning that the kernel based regularized interpolation is well defined for positive definite kernels, but it is not a pure interpolation: we may not have I_X f(x_i) = y_i = f(x_i). In the case of a strictly positive definite kernel, the regularization problem (2.1) becomes an interpolation problem that matches f exactly at the observations, and the resulting kernel interpolation is a pure interpolation. However, there is no need to restrict to strictly positive definite kernels. Using a positive definite kernel to obtain the kernel based regularized interpolation is more general: the tunable parameter λ in the regularization problem provides a trade-off between pointwise accuracy and stability. Another important reason is that in real-world applications the observations may be corrupted by noise, in which case a pure interpolation no longer makes sense.

3. Hilbert space type error analysis

In this section we provide a special case of the error analysis for the kernel based regularized interpolation. We say "special case" because in this section we assume that the true function f lives in the RKHS; in general, we cannot guarantee that the true function is within the RKHS. The general case is treated in the next section.

Before proceeding to the error analysis, we first recall an important concept from numerical analysis: best approximation [5]. Given f ∈ H_K, define

    ρ := inf_{p∈S} ||f − p||_{H_K},

where S ⊂ H_K. The number ρ is called the minimax error for the approximation of f by functions in S. In fact [5], there is a unique p̂ ∈ S such that

    ||f − p̂||_{H_K} = ρ = inf_{p∈S} ||f − p||_{H_K}.

The function p̂ ∈ S is called the best approximation of f with respect to the H_K-norm.

Now let us look at the inner product of the RKHS. By the Cauchy–Schwarz inequality [3], we have

    |⟨h, K(·, x)⟩_{H_K}| ≤ ||h||_{H_K} · ||K(·, x)||_{H_K},   ∀h ∈ H_K.   (3.1)

Inequality (3.1) shows that the ratio below is at most ||h||_{H_K}; we assume further that there exists a constant 0 < β ≤ 1 such that

    sup_{x∈X} |⟨h, K(·, x)⟩_{H_K}| / ||K(·, x)||_{H_K} ≥ β · ||h||_{H_K}.   (3.2)

We then have the first error estimate.

Theorem 3.1. Let S = span{K(·, x_i)}_{i=1}^n ⊂ H_K. Assume that the true function f ∈ H_K, and let I_X f be the kernel based regularized interpolation. Then we have the error bound

    ||f − I_X f||_{H_K} ≤ (1 + C/β) · ρ,

where C > 1 and 0 < β ≤ 1 are constants and ρ is the minimax error for the approximation of f by functions in S in the H_K-norm.

Proof. First consider the trivial case f ∈ S ⊂ H_K. In this case the kernel based regularized interpolation is exact, meaning I_X f = f, so both ||f − I_X f||_{H_K} and ρ are zero and the conclusion holds trivially.

Next, assume f ∈ H_K but f ∉ S. Then for any g ∈ S,

    ||f − I_X f||_{H_K} ≤ ||f − g||_{H_K} + ||I_X f − g||_{H_K}.   (3.3)

Replacing h by I_X f − g in (3.2), we get

    β ||I_X f − g||_{H_K} ≤ sup_{x∈X} |⟨I_X f − g, K(·, x)⟩_{H_K}| / ||K(·, x)||_{H_K}.

Since f ≠ g, for the right-hand side (RHS) of this inequality there exists a constant C > 1 such that

    RHS = sup_{x∈X} |⟨f − g + I_X f − f, K(·, x)⟩_{H_K}| / ||K(·, x)||_{H_K}
        ≤ sup_{x∈X} [ |⟨f − g, K(·, x)⟩_{H_K}| / ||K(·, x)||_{H_K} + |⟨I_X f − f, K(·, x)⟩_{H_K}| / ||K(·, x)||_{H_K} ]
        ≤ sup_{x∈X} |⟨f − g, K(·, x)⟩_{H_K}| / ||K(·, x)||_{H_K} + sup_{x∈X} |⟨I_X f − f, K(·, x)⟩_{H_K}| / ||K(·, x)||_{H_K}
        ≤ C · sup_{x∈X} |⟨f − g, K(·, x)⟩_{H_K}| / ||K(·, x)||_{H_K}
        ≤ C · ||f − g||_{H_K},

where the last inequality is the Cauchy–Schwarz inequality. Therefore

    ||I_X f − g||_{H_K} ≤ (C/β) ||f − g||_{H_K}.

Then by (3.3),

    ||f − I_X f||_{H_K} ≤ (1 + C/β) ||f − g||_{H_K}   ∀g ∈ S,

which implies

    ||f − I_X f||_{H_K} ≤ (1 + C/β) inf_{g∈S} ||f − g||_{H_K} = (1 + C/β) · ρ.

This completes the proof. □

In fact, this error bound can be further tightened.
The improvement comes from the observation that if the true function f is in H_K, the kernel based regularized interpolation is actually a projection. Let S = span{K(·, x_i)}_{i=1}^n ⊂ H_K as in Theorem 3.1. If f ∈ H_K, the kernel based regularized interpolation can be viewed as a projection ϕ : H_K → S. The most important property of a projection is idempotence: if a function is already in S, the projection leaves it unchanged, i.e.,

    ϕ(g) = g   ∀g ∈ S,

which implies

    ϕ²(f) = ϕ(ϕ(f)) = ϕ(f)   ∀f ∈ H_K.

Define the operator norm of ϕ by

    ||ϕ|| := sup_{0≠g∈H_K} ||ϕ(g)||_{H_K} / ||g||_{H_K}.

Then we have the identity ||ϕ|| = ||I − ϕ||, where I is the identity operator. This follows from the result below, given in [6].

Lemma 3.1. Let H be a Hilbert space and U ⊂ H. Let P : H → U be an (oblique) projection (P² = P) with 0 ≠ P ≠ I. Then ||P|| = ||I − P||.

Using the observation that the kernel based regularized interpolation is a projection when f ∈ H_K, we have the following tightened error bound.

Theorem 3.2. Suppose S = span{K(·, x_i)}_{i=1}^n ⊂ H_K and the kernel based regularized interpolation is given by I_X f = ϕ(f), where ϕ is a projection from H_K onto S. Then

    ||f − I_X f||_{H_K} ≤ ρ/β,

where 0 < β ≤ 1 is a constant and ρ is the minimax error for the approximation of f by functions in S in the H_K-norm.

Proof. Since ϕ is a projection, ||ϕ|| = ||I − ϕ||. From this we derive that for all g ∈ S,

    ||f − I_X f||_{H_K} = ||f − g − I_X f + g||_{H_K}
                        = ||f − g − ϕ(f − g)||_{H_K}
                        = ||(I − ϕ)(f − g)||_{H_K}
                        ≤ ||I − ϕ|| · ||f − g||_{H_K}
                        = ||ϕ|| · ||f − g||_{H_K},

where the second equality uses ϕ(g) = g for g ∈ S. To obtain the error bound, it remains to show that ||ϕ|| ≤ 1/β. First, if f ∈ H_K then ||I_X f||_{H_K} ≤ ||f||_{H_K}: indeed, f itself is feasible for (2.1) with objective value λ||f||²_{H_K}, so the minimizer satisfies λ||I_X f||²_{H_K} ≤ λ||f||²_{H_K}. Since 0 < β ≤ 1, we have 1/β ≥ 1, and so for f ∈ H_K,

    ||I_X f||_{H_K} ≤ (1/β) ||f||_{H_K}.

Therefore

    ||ϕ|| = sup_{0≠f∈H_K} ||ϕ(f)||_{H_K} / ||f||_{H_K} = sup_{0≠f∈H_K} ||I_X f||_{H_K} / ||f||_{H_K} ≤ 1/β.

This gives the desired bound. □

4. Sobolev space type error analysis

In the previous section we gave a special case of the error analysis for kernel based regularized interpolation, where the true function f was assumed to live in the reproducing kernel Hilbert space H_K. In general we do not have this guarantee; the true function usually lives in a larger function space. In this section we consider the case where the true function f is in a Sobolev space, i.e., a space consisting of all f ∈ L^p(X) with certain additional properties. The formal definition is as follows.

Definition 4.1. Let k be a nonnegative integer and p ∈ [1, ∞]. The Sobolev space W^{k,p}(X) is the set of all functions f ∈ L^p(X) such that for each multi-index α with |α| ≤ k, the αth weak derivative ∂^α f exists and ∂^α f ∈ L^p(X). The norm on the Sobolev space is defined as

    ||f||_{W^{k,p}(X)} = [ Σ_{|α|≤k} ||∂^α f||^p_{L^p(X)} ]^{1/p},   1 ≤ p < ∞,
    ||f||_{W^{k,∞}(X)} = max_{|α|≤k} ||∂^α f||_{L^∞(X)},   p = ∞.

When p = 2, we usually write W^{k,2}(X) := H^k(X). For simplicity, we abbreviate ||f||_{W^{k,p}(X)} as ||f||_{k,p,X} when no confusion arises, and we drop the p when p = 2.

Besides the standard norm on the Sobolev space, we also need the seminorm. The standard seminorm on W^{k,p}(X) is

    |f|_{W^{k,p}(X)} = [ Σ_{|α|=k} ||∂^α f||^p_{L^p(X)} ]^{1/p},   1 ≤ p < ∞,
    |f|_{W^{k,∞}(X)} = max_{|α|=k} ||∂^α f||_{L^∞(X)},   p = ∞.

Similarly, we write |f|_{W^{k,p}(X)} as |f|_{k,p,X} when no confusion arises and drop the p when p = 2.

We now define continuous embedding between two Banach spaces. Let W, V be Banach spaces with V ⊂ W. We say that V is continuously embedded in W, written V ↪→ W, if there exists a constant c > 0 such that ||v||_W ≤ c ||v||_V for all v ∈ V.
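The quantities appearing in the Sobolev-type bound can be sketched numerically. The following hypothetical 1-D illustration (not from the paper) builds a kernel regularized interpolant of f(x) = |x − 1/2|, which lies in H¹ but is not smooth, and approximates the error norms ||f − I_X f||_{L²} and |f − I_X f|_{H¹} on a fine grid by Riemann sums and finite differences; the Gaussian kernel, node count, and λ are illustrative choices:

```python
import numpy as np

def l2_norm(v, h):
    # Riemann-sum approximation of the L^2 norm on a uniform grid.
    return float(np.sqrt(np.sum(v ** 2) * h))

def h1_seminorm(v, h):
    # |g|_{H^1} ≈ (sum |g'|^2 h)^{1/2} with forward differences.
    return float(np.sqrt(np.sum((np.diff(v) / h) ** 2) * h))

def kmat(a, b, s=0.1):
    # Gaussian kernel matrix (illustrative positive definite kernel).
    return np.exp(-((a[:, None] - b[None, :]) ** 2) / (2 * s ** 2))

xs = np.linspace(0.0, 1.0, 15)
f = lambda t: np.abs(t - 0.5)            # in H^1 but not smooth at 0.5
alpha = np.linalg.solve(kmat(xs, xs) + 1e-6 * np.eye(xs.size), f(xs))

grid = np.linspace(0.0, 1.0, 801)
h = grid[1] - grid[0]
err = f(grid) - kmat(grid, xs) @ alpha   # f - I_X f on the fine grid
print(l2_norm(err, h), h1_seminorm(err, h))
```

As the theory suggests, the L² error is small while the H¹-seminorm of the error is dominated by the neighborhood of the kink, where the smooth interpolant cannot match the derivative jump.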
With these concepts from the theory of Sobolev spaces, we can state the error bound for the kernel based regularized interpolation. We assume that the true function f is in H^{k+1}(X).

Theorem 4.1. Let k and m be nonnegative integers with k + 1 ≥ m. Suppose H_K ⊃ S = span{K(·, x_i)}_{i=1}^n. Then there exists a constant c > 0 such that

    |f − I_X f|_{m,X} ≤ c inf_{g∈S} ||f + g||_{k+1,X}.

When the positive definite kernel is chosen to be the polynomial kernel of degree d and k + 1 > d, we have

    |f − I_X f|_{m,X} ≤ c |f|_{k+1,X}.

Proof. Since k ≥ 0, we have H^{k+1}(X) ↪→ C(X), where C(X) is the set of all continuous functions on X. So f ∈ H^{k+1}(X) is continuous and I_X f is well defined. Moreover, there exists a constant c_1 > 0 such that

    ||I_X f||_{m,X} ≤ Σ_{i=1}^n |α_i| · ||K(·, x_i)||_{m,X} ≤ c_1 ||f||_{C(X)}.

Since H^{k+1}(X) ↪→ C(X), there exists a constant c_2 > 0 such that ||f||_{C(X)} ≤ c_2 ||f||_{k+1,X}. Therefore there exists c_3 > 0 such that

    ||I_X f||_{m,X} ≤ c_3 ||f||_{k+1,X}.   (4.1)

Now, for all f ∈ H^{k+1}(X) and g ∈ S, noting that I_X g = g for g ∈ S, we obtain from (4.1) that there exists a constant c > 0 with

    |f − I_X f|_{m,X} ≤ ||f − I_X f||_{m,X}
                     = ||f − I_X f + g − I_X g||_{m,X}
                     = ||(f + g) − I_X(f + g)||_{m,X}
                     ≤ ||f + g||_{m,X} + ||I_X(f + g)||_{m,X}
                     ≤ c ||f + g||_{k+1,X}.

Hence

    |f − I_X f|_{m,X} ≤ c inf_{g∈S} ||f + g||_{k+1,X}.

We now consider the case where the positive definite kernel K is the polynomial kernel. First, the norm equivalence theorem [12] gives

    ||f||_{k+1,X} ≤ c_4 [ |f|_{k+1,X} + Σ_{|α|≤k} | ∫_X ∂^α f(x) dx | ]   (4.2)

for all f ∈ H^{k+1}(X). Since the kernel is a polynomial kernel and k + 1 > d, we have ∂^α g = 0 for all g ∈ S and |α| = k + 1. Replacing f by f + g in (4.2) gives, for all f ∈ H^{k+1}(X) and g ∈ S,

    ||f + g||_{k+1,X} ≤ c_4 [ |f + g|_{k+1,X} + Σ_{|α|≤k} | ∫_X ∂^α(f + g) dx | ]
                     = c_4 [ |f|_{k+1,X} + Σ_{|α|≤k} | ∫_X ∂^α(f + g) dx | ].
For the term ∫_X ∂^α(f + g) dx, note that for any f ∈ H^{k+1}(X) one can always find a g ∈ S such that

    ∫_X ∂^α(f + g) dx = 0   for all |α| ≤ k.   (4.3)

Indeed, letting |α| = k, k − 1, k − 2, ..., equation (4.3) yields a system of equations, which we can solve for the coefficients of the polynomial g ∈ S. This system is always solvable because we assumed k + 1 > d. The g constructed in this way annihilates the integral ∫_X ∂^α(f + g) dx for |α| ≤ k. Therefore

    inf_{g∈S} ||f + g||_{k+1,X} ≤ c_4 |f|_{k+1,X}

for some constant c_4 > 0. Then, by the conclusion obtained in the first part, there exists c > 0 such that

    |f − I_X f|_{m,X} ≤ c |f|_{k+1,X}.

This completes the proof. □

From this error bound we see that if we consider the polynomial kernel and the true function is in the RKHS H_K, then

    |f − I_X f|_{m,X} ≤ c ||f||_{H_K}

for some constant c > 0. This is because the RKHS H_K is continuously embedded in H^{k+1}(X), so |f|_{k+1,X} ≤ c ||f||_{H_K}.

5. Conclusion

In this paper, some error bounds for the kernel based regularized interpolation were provided. We derived these error bounds in terms of the reproducing kernel Hilbert space norm and the Sobolev space norm. We also gave an error bound for a commonly used kernel, the polynomial kernel. These error bounds provide a theoretical understanding of the approximation performance.

Conflicts of Interest: The author(s) declare that there are no conflicts of interest regarding the publication of this paper.

References

[1] M. Belkin and P. Niyogi, Semi-supervised learning on Riemannian manifolds, Mach. Learn. 56 (2004), 209-239.
[2] M. Belkin, P. Niyogi, and V. Sindhwani, Manifold regularization: A geometric framework for learning from labeled and unlabeled examples, J. Mach. Learn. Res. 7 (2006), 2399-2434.
[3] R. Bhatia and C. Davis, A Cauchy-Schwarz inequality for operators with applications, Linear Algebra Appl. 223 (1995), 119-129.
[4] N. Cressie, The origins of kriging, Math. Geol. 22 (1990), 239-252.
[5] F. R. Deutsch, Best approximation in inner product spaces, Springer Science & Business Media, 2012.
[6] T. Kato, Estimation of iterated matrices, with application to the von Neumann condition, Numer. Math. 2 (1960), 22-29.
[7] H.-J. Rong, G.-B. Huang, N. Sundararajan, and P. Saratchandran, Online sequential fuzzy extreme learning machine for function approximation and classification problems, IEEE Trans. Syst. Man Cybern. Part B (Cybernetics) 39 (2009), 1067-1072.
[8] B. Schölkopf, R. Herbrich, and A. J. Smola, A generalized representer theorem, in: International Conference on Computational Learning Theory, Springer, 2001, 416-426.
[9] D. Shepard, A two-dimensional interpolation function for irregularly-spaced data, in: Proceedings of the 1968 23rd ACM National Conference, 1968, 517-524.
[10] M. Thakur, B. Samanta, and D. Chakravarty, A non-stationary geostatistical approach to multigaussian kriging for local reserve estimation, Stoch. Environ. Res. Risk Assess. 32 (2018), 2381-2404.
[11] J. J. Thiagarajan, P.-T. Bremer, and K. N. Ramamurthy, Multiple kernel interpolation for inverting non-linear dimensionality reduction and dimension estimation, in: 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), IEEE, 2014, 6751-6755.
[12] A. Wannebo, Equivalent norms for the Sobolev space W_0^{m,p}(Ω), Ark. Mat. 32 (1994), 245-254.