Hilbert projection theorem

From HandWiki
Short description: On closed convex subsets in Hilbert space

In mathematics, the Hilbert projection theorem is a famous result of convex analysis that says that for every vector x in a Hilbert space H and every nonempty closed convex CH, there exists a unique vector mC for which cx is minimized over the vectors cC; that is, such that mxcx for every cC.

Finite dimensional case

Some intuition for the theorem can be obtained by considering the first order condition of the optimization problem.

Consider a finite dimensional real Hilbert space H with a subspace C and a point x. If mC is a minimizer or minimum point of the function N:C defined by N(c):=cx (which is the same as the minimum point of ccx2), then derivative must be zero at m.

In matrix derivative notation[1] xc2=cx,cx=2cx,c Since c is a vector in C that represents an arbitrary tangent direction, it follows that mx must be orthogonal to every vector in C.

Statement

Hilbert projection theorem — For every vector x in a Hilbert space H and every nonempty closed convex CH, there exists a unique vector mC for which xm is equal to δ:=infcCxc.

If the closed subset C is also a vector subspace of H then this minimizer m is the unique element in C such that xm is orthogonal to C.

Detailed elementary proof

Proof by reduction to a special case

It suffices to prove the theorem in the case of x=0 because the general case follows from the statement below by replacing C with Cx.

Hilbert projection theorem (case x=0)[2] — For every nonempty closed convex subset CH of a Hilbert space H, there exists a unique vector mC such that infcCc=m.

Furthermore, letting d:=infcCc, if (cn)n=1 is any sequence in C such that limncn=d in [note 1] then limncn=m in H.

Consequences

Proposition — If C is a closed vector subspace of a Hilbert space H then[note 3] H=CC.

Proof[3]

Proof that CC={0}:

If cCC then 0=c,c=c2, which implies c=0.


Proof that C is a closed vector subspace of H:

Let P:=cC𝔽 where 𝔽 is the underlying scalar field of H and define L:HPh(h,c)cC which is continuous and linear because this is true of each of its coordinates hh,c. The set C=L1(0)=L1({0}) is closed in H because {0} is closed in P and L:HP is continuous. The kernel of any linear map is a vector subspace of its domain, which is why C=kerL is a vector subspace of H.


Proof that C+C=H:

Let xH. The Hilbert projection theorem guarantees the existence of a unique mC such that xmxc for all cC (or equivalently, for all xcxC). Let p:=xm so that x=m+pC+p and it remains to show that pC. The inequality above can be rewritten as: pz for all zxC. Because mC and C is a vector space, m+C=C and C=C, which implies that xC=x+C=p+m+C=p+C. The previous inequality thus becomes pz for all zp+C. or equivalently, pp+c for all cC. But this last statement is true if and only if p,c=0 every cC. Thus pC.

Properties

Expression as a global minimum

The statement and conclusion of the Hilbert projection theorem can be expressed in terms of global minimums of the followings functions. Their notation will also be used to simplify certain statements.

Given a non-empty subset CH and some xH, define a function dC,x:C[0,) by cxc. A global minimum point of dC,x, if one exists, is any point m in domaindC,x=C such that dC,x(m)dC,x(c) for all cC, in which case dC,x(m)=mx is equal to the global minimum value of the function dC,x, which is: infcCdC,x(c)=infcCxc.

Effects of translations and scalings

When this global minimum point m exists and is unique then denote it by min(C,x); explicitly, the defining properties of min(C,x) (if it exists) are: min(C,x)C and xmin(C,x)xc for all cC. The Hilbert projection theorem guarantees that this unique minimum point exists whenever C is a non-empty closed and convex subset of a Hilbert space. However, such a minimum point can also exist in non-convex or non-closed subsets as well; for instance, just as long is C is non-empty, if xC then min(C,x)=x.

If CH is a non-empty subset, s is any scalar, and x,x0H are any vectors then min(sC+x0,sx+x0)=smin(C,x)+x0 which implies: min(sC,sx)=smin(C,x)min(C,x)=min(C,x) min(C+x0,x+x0)=min(C,x)+x0min(Cx0,xx0)=min(C,x)x0 min(C,x)=min(C+x,0)xmin(C,0)+x=min(C+x,x)min(Cx,0)=min(C,x)x

Examples

The following counter-example demonstrates a continuous linear isomorphism A:HH for which min(A(C),A(x))A(min(C,x)). Endow H:=2 with the dot product, let x0:=(0,1), and for every real s, let Ls:={(x,sx):x} be the line of slope s through the origin, where it is readily verified that min(Ls,x0)=s1+s2(1,s). Pick a real number r0 and define A:22 by A(x,y):=(rx,y) (so this map scales the xcoordinate by r while leaving the ycoordinate unchanged). Then A:22 is an invertible continuous linear operator that satisfies A(Ls)=Ls/r and A(x0)=x0, so that min(A(Ls),A(x0))=sr2+s2(1,s) and A(min(Ls,x0))=s1+s2(r,s). Consequently, if C:=Ls with s0 and if (r,s)(±1,1) then min(A(C),A(x0))A(min(C,x0)).

See also

Notes

  1. Because the norm :H is continuous, if limnxn converges in H then necessarily limnxn converges in . But in general, the converse is not guaranteed. However, under this theorem's hypotheses, knowing that limncn=d in is sufficient to conclude that limncn converges in H.
  2. Explicitly, this means that given any ϵ>0 there exists some integer N>0 such that "the quantity" is ϵ whenever m,nN. Here, "the quantity" refers to the inequality's right hand side 2cm2+2cn24d2 and later in the proof, "the quantity" will also refer to cmcn2 and then cmcn. By definition of "Cauchy sequence," (cn)n=1 is Cauchy in H if and only if "the quantity" cmcn satisfies this aforementioned condition.
  3. Technically, H=KK means that the addition map K×KH defined by (k,p)k+p is a surjective linear isomorphism and homeomorphism. See the article on complemented subspaces for more details.

References

Bibliography