Riesz representation theorem

From HandWiki
Short description: Theorem about the dual of a Hilbert space


The Riesz representation theorem, sometimes called the Riesz–Fréchet representation theorem after Frigyes Riesz and Maurice René Fréchet, establishes an important connection between a Hilbert space and its continuous dual space. If the underlying field is the real numbers, the two are isometrically isomorphic; if the underlying field is the complex numbers, the two are isometrically anti-isomorphic. The (anti-) isomorphism is a particular natural isomorphism.

Preliminaries and notation

Let H be a Hilbert space over a field 𝔽, where 𝔽 is either the real numbers ℝ or the complex numbers ℂ. If 𝔽 = ℂ (resp. if 𝔽 = ℝ) then H is called a complex Hilbert space (resp. a real Hilbert space). Every real Hilbert space can be extended to be a dense subset of a unique (up to bijective isometry) complex Hilbert space, called its complexification, which is why Hilbert spaces are often automatically assumed to be complex. Real and complex Hilbert spaces have in common many, but by no means all, properties and results/theorems.

This article is intended for both mathematicians and physicists and will describe the theorem for both. In both mathematics and physics, if a Hilbert space is assumed to be real (that is, if 𝔽 = ℝ) then this will usually be made clear. Often in mathematics, and especially in physics, unless indicated otherwise, "Hilbert space" is automatically assumed to mean "complex Hilbert space." Depending on the author, in mathematics, "Hilbert space" usually means either (1) a complex Hilbert space, or (2) a real or complex Hilbert space.

Linear and antilinear maps

By definition, an antilinear map (also called a conjugate-linear map) f : H → Y is a map between vector spaces that is additive: f(x + y) = f(x) + f(y) for all x, y ∈ H, and antilinear (also called conjugate-linear or conjugate-homogeneous): f(cx) = c̄ f(x) for all x ∈ H and all scalars c ∈ 𝔽, where c̄ is the conjugate of the complex number c = a + bi, given by c̄ = a − bi.

In contrast, a map f : H → Y is linear if it is additive and homogeneous: f(cx) = c f(x) for all x ∈ H and all scalars c ∈ 𝔽.

Every constant 0 map is always both linear and antilinear. If 𝔽 = ℝ then the definitions of linear maps and antilinear maps are completely identical. A linear map from a Hilbert space into a Banach space (or more generally, from any Banach space into any topological vector space) is continuous if and only if it is bounded; the same is true of antilinear maps. The inverse of any antilinear (resp. linear) bijection is again an antilinear (resp. linear) bijection. The composition of two antilinear maps is a linear map.
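As a minimal illustration of these definitions (a Python sketch with illustrative names, not from the text), complex conjugation is the prototypical antilinear map, and composing it with itself yields a linear map:

```python
def conj_map(x: complex) -> complex:
    """f(x) = conjugate(x) is additive and conjugate-homogeneous."""
    return x.conjugate()

def compose(x: complex) -> complex:
    """Composition of two antilinear maps, which is linear."""
    return conj_map(conj_map(x))

c, x, y = 2 + 3j, 1 - 1j, 4 + 2j

# additivity
assert conj_map(x + y) == conj_map(x) + conj_map(y)
# antilinearity: f(cx) = conjugate(c) * f(x), not c * f(x)
assert conj_map(c * x) == c.conjugate() * conj_map(x)
# the composition of two antilinear maps is homogeneous, hence linear
assert compose(c * x) == c * compose(x)
```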

Continuous dual and anti-dual spaces

A functional on H is a function H → 𝔽 whose codomain is the underlying scalar field 𝔽. Denote by H* (resp. by H̄*) the set of all continuous linear (resp. continuous antilinear) functionals on H, which is called the (continuous) dual space (resp. the (continuous) anti-dual space) of H.[1] If 𝔽 = ℝ then linear functionals on H are the same as antilinear functionals and consequently, the same is true for such continuous maps: that is, H* = H̄*.

One-to-one correspondence between linear and antilinear functionals

Given any functional f : H → 𝔽, the conjugate of f is the functional f̄ : H → 𝔽 that sends h ∈ H to the complex conjugate of f(h).

This assignment is most useful when 𝔽 = ℂ because if 𝔽 = ℝ then f = f̄ and the assignment f ↦ f̄ reduces down to the identity map.

The assignment f ↦ f̄ defines an antilinear bijective correspondence from the set of

all functionals (resp. all linear functionals, all continuous linear functionals H*) on H,

onto the set of

all functionals (resp. all antilinear functionals, all continuous antilinear functionals H̄*) on H.
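A small sketch of this correspondence (illustrative Python, with an arbitrarily chosen functional on ℂ² that is not part of the text): if f is linear then its conjugate f̄ is antilinear, and conjugating twice recovers f:

```python
def f(h):
    """An arbitrary linear functional on C^2 (illustrative coefficients)."""
    return (2 + 1j) * h[0] + (1 - 3j) * h[1]

def f_bar(h):
    """Its conjugate: f_bar(h) := conjugate(f(h))."""
    return f(h).conjugate()

c = 1 + 2j
h = [3 - 1j, 2 + 5j]
ch = [c * z for z in h]

assert f(ch) == c * f(h)                        # f is homogeneous (linear)
assert f_bar(ch) == c.conjugate() * f_bar(h)    # f_bar is conjugate-homogeneous
assert f_bar(ch).conjugate() == f(ch)           # conjugating twice gives f back
```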

Mathematics vs. physics notations and definitions of inner product

The Hilbert space H has an associated inner product H × H → 𝔽 valued in H's underlying scalar field 𝔽 that is linear in one coordinate and antilinear in the other (as described in detail below). If H is a complex Hilbert space (meaning, if 𝔽 = ℂ), which is very often the case, then which coordinate is antilinear and which is linear becomes a very important technicality. However, if 𝔽 = ℝ then the inner product is a symmetric map that is simultaneously linear in each coordinate (that is, bilinear) and antilinear in each coordinate (over ℝ the two notions coincide). Consequently, the question of which coordinate is linear and which is antilinear is irrelevant for real Hilbert spaces.

Notation for the inner product

In mathematics, the inner product on a Hilbert space H is often denoted by ⟨·, ·⟩ or ⟨·, ·⟩_H while in physics, the bra–ket notation ⟨· | ·⟩ or ⟨· | ·⟩_H is typically used instead. In this article, these two notations will be related by the equality ⟨x, y⟩ := ⟨y | x⟩ for all x, y ∈ H.

Competing definitions of the inner product

The maps ⟨·, ·⟩ and ⟨· | ·⟩ are assumed to have the following two properties:

  1. The map ⟨·, ·⟩ is linear in its first coordinate; equivalently, the map ⟨· | ·⟩ is linear in its second coordinate. Explicitly, this means that for every fixed y ∈ H, the map that is denoted by ⟨y | ·⟩ = ⟨·, y⟩ : H → 𝔽 and defined by h ↦ ⟨y | h⟩ = ⟨h, y⟩ for all h ∈ H is a linear functional on H.
    • In fact, this linear functional is continuous, so ⟨y | ·⟩ = ⟨·, y⟩ ∈ H*.
  2. The map ⟨·, ·⟩ is antilinear in its second coordinate; equivalently, the map ⟨· | ·⟩ is antilinear in its first coordinate. Explicitly, this means that for every fixed y ∈ H, the map that is denoted by ⟨· | y⟩ = ⟨y, ·⟩ : H → 𝔽 and defined by h ↦ ⟨h | y⟩ = ⟨y, h⟩ for all h ∈ H is an antilinear functional on H.
    • In fact, this antilinear functional is continuous, so ⟨· | y⟩ = ⟨y, ·⟩ ∈ H̄*.

In mathematics, the prevailing convention (that is, the definition of an inner product) is that the inner product is linear in the first coordinate and antilinear in the other. In physics, the convention/definition is unfortunately the opposite, meaning that the inner product is linear in the second coordinate and antilinear in the other. This article will not choose one definition over the other. Instead, the assumptions made above ensure that the mathematics notation ⟨·, ·⟩ satisfies the mathematical convention (that is, linear in the first coordinate and antilinear in the other), while the physics bra–ket notation ⟨· | ·⟩ satisfies the physics convention (that is, linear in the second coordinate and antilinear in the other). Consequently, the above two assumptions make the notation used in each field consistent with that field's convention for which coordinate is linear and which is antilinear.
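The interplay of the two conventions can be checked numerically. In this hedged Python sketch (function names are illustrative), ip implements the mathematics convention and bra_ket the physics one, related by ⟨y | x⟩ = ⟨x, y⟩ as assumed above:

```python
def ip(x, y):
    """Mathematics convention <x, y>: linear in x, antilinear in y."""
    return sum(a * b.conjugate() for a, b in zip(x, y))

def bra_ket(y, x):
    """Physics convention <y | x> := <x, y>: linear in the ket x."""
    return ip(x, y)

c = 2 - 1j
x, y = [1 + 1j, 2j], [3, 1 - 2j]
cx = [c * z for z in x]

assert ip(cx, y) == c * ip(x, y)               # linear in the first slot
assert ip(y, cx) == c.conjugate() * ip(y, x)   # antilinear in the second slot
assert bra_ket(y, cx) == c * bra_ket(y, x)     # bra-ket is linear in the ket
```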

Canonical norm and inner product on the dual space and anti-dual space

If x = y then ⟨x | x⟩ = ⟨x, x⟩ is a non-negative real number, and the map ‖x‖ := √⟨x, x⟩ = √⟨x | x⟩

defines a canonical norm on H that makes H into a normed space.[1] As with all normed spaces, the (continuous) dual space H* carries a canonical norm, called the dual norm, that is defined by[1] ‖f‖_{H*} := sup_{‖x‖ ≤ 1, x ∈ H} |f(x)| for every f ∈ H*.

The canonical norm on the (continuous) anti-dual space H̄*, denoted by ‖f‖_{H̄*}, is defined by using this same equation:[1] ‖f‖_{H̄*} := sup_{‖x‖ ≤ 1, x ∈ H} |f(x)| for every f ∈ H̄*.

This canonical norm on H* satisfies the parallelogram law, which means that the polarization identity can be used to define a canonical inner product on H*, which this article will denote by ⟨f, g⟩_{H*} := ⟨g | f⟩_{H*}; this inner product turns H* into a Hilbert space. There are now two ways of defining a norm on H*: the norm induced by this inner product (that is, the norm defined by f ↦ √⟨f, f⟩_{H*}) and the usual dual norm (defined as the supremum over the closed unit ball). These norms are the same; explicitly, this means that the following holds for every f ∈ H*: sup_{‖x‖ ≤ 1, x ∈ H} |f(x)| = ‖f‖_{H*} = √⟨f, f⟩_{H*} = √⟨f | f⟩_{H*}.
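The polarization identity invoked above can be verified numerically in ℂ². This Python sketch (illustrative, using the standard inner product on ℂ²) recovers ⟨x, y⟩ from the norm alone via ⟨x, y⟩ = ¼ Σₖ iᵏ ‖x + iᵏ y‖²:

```python
def ip(x, y):
    """<x, y>: linear in x, antilinear in y (mathematics convention)."""
    return sum(a * b.conjugate() for a, b in zip(x, y))

def norm_sq(x):
    """||x||^2 = <x, x>, computed exactly without a square root."""
    return ip(x, x).real

def polar(x, y):
    """Polarization identity: <x, y> = (1/4) sum over u in {1, i, -1, -i}
    of u * ||x + u y||^2."""
    units = [1, 1j, -1, -1j]
    return sum(u * norm_sq([a + u * b for a, b in zip(x, y)]) for u in units) / 4

x, y = [1 + 2j, 3j], [2 - 1j, 1 + 1j]
assert polar(x, y) == ip(x, y)   # the norm alone determines the inner product
```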

As will be described later, the Riesz representation theorem can be used to give an equivalent definition of the canonical norm and the canonical inner product on H*.

The same equations that were used above can also be used to define a norm and inner product on H's anti-dual space H̄*.[1]

Canonical isometry between the dual and antidual

The complex conjugate f̄ of a functional f, which was defined above, satisfies ‖f‖_{H*} = ‖f̄‖_{H̄*} for every f ∈ H* and ‖ḡ‖_{H*} = ‖g‖_{H̄*} for every g ∈ H̄*. This says exactly that the canonical antilinear bijection defined by Cong : H* → H̄*, f ↦ f̄, as well as its inverse Cong⁻¹ : H̄* → H*, are antilinear isometries and consequently also homeomorphisms. The inner products on the dual space H* and the anti-dual space H̄*, denoted respectively by ⟨·, ·⟩_{H*} and ⟨·, ·⟩_{H̄*}, are related by ⟨f̄ | ḡ⟩_{H̄*} = \overline{⟨f | g⟩_{H*}} = ⟨g | f⟩_{H*} for all f, g ∈ H* and ⟨f̄ | ḡ⟩_{H*} = \overline{⟨f | g⟩_{H̄*}} = ⟨g | f⟩_{H̄*} for all f, g ∈ H̄*.

If 𝔽 = ℝ then H* = H̄* and this canonical map Cong : H* → H̄* reduces down to the identity map.

Riesz representation theorem

Two vectors x and y are orthogonal if ⟨x, y⟩ = 0, which happens if and only if ‖y‖ ≤ ‖y + sx‖ for all scalars s.[2] The orthogonal complement of a subset X ⊆ H is X^⊥ := {y ∈ H : ⟨y, x⟩ = 0 for all x ∈ X}, which is always a closed vector subspace of H. The Hilbert projection theorem guarantees that for any nonempty closed convex subset C of a Hilbert space there exists a unique vector m ∈ C such that ‖m‖ = inf_{c ∈ C} ‖c‖; that is, m ∈ C is the (unique) global minimum point of the function C → [0, ∞) defined by c ↦ ‖c‖.

Statement

Riesz representation theorem — Let H be a Hilbert space whose inner product ⟨x, y⟩ is linear in its first argument and antilinear in its second argument, and let ⟨y | x⟩ := ⟨x, y⟩ be the corresponding physics notation. For every continuous linear functional φ ∈ H*, there exists a unique vector f_φ ∈ H, called the Riesz representation of φ, such that[3] φ(x) = ⟨x, f_φ⟩ = ⟨f_φ | x⟩ for all x ∈ H.

Importantly for complex Hilbert spaces, fφ is always located in the antilinear coordinate of the inner product.[note 1]

Furthermore, the length of the representation vector is equal to the norm of the functional: ‖f_φ‖_H = ‖φ‖_{H*}, and f_φ is the unique vector f_φ ∈ (ker φ)^⊥ with φ(f_φ) = ‖φ‖². It is also the unique element of minimum norm in C := φ⁻¹(‖φ‖²); that is to say, f_φ is the unique element of C satisfying ‖f_φ‖ = inf_{c ∈ C} ‖c‖. Moreover, any non-zero q ∈ (ker φ)^⊥ can be written as q = (‖q‖² / \overline{φ(q)}) f_φ.
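A concrete check of the statement in H = ℂ² with the standard inner product (an illustrative Python sketch; the vectors are arbitrary choices, not part of the theorem): the functional φ := ⟨·, f⟩ is represented by f, the value φ(f) = ‖f‖² is real and non-negative, and Cauchy–Schwarz is why ‖φ‖ = ‖f‖:

```python
def ip(x, y):
    """<x, y>: linear in x, antilinear in y."""
    return sum(a * b.conjugate() for a, b in zip(x, y))

f = [2 + 1j, 1 - 3j]              # an arbitrary representing vector
phi = lambda x: ip(x, f)          # the functional it represents

# phi(f) = <f, f> = ||f||^2 is real and non-negative
assert phi(f) == ip(f, f)
assert phi(f).imag == 0 and phi(f).real >= 0

# Cauchy-Schwarz: |phi(x)|^2 <= ||f||^2 ||x||^2, with equality at x = f,
# which is why ||phi|| = ||f|| (the supremum is attained at f / ||f||)
x = [1 + 1j, 4 - 2j]
assert abs(phi(x)) ** 2 <= ip(f, f).real * ip(x, x).real
```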

Corollary — The canonical map from H into its dual H*[1] is the injective antilinear operator isometry[note 2][1] Φ : H → H*, y ↦ ⟨·, y⟩ = ⟨y | ·⟩. The Riesz representation theorem states that this map is surjective (and thus bijective) when H is complete and that its inverse is the bijective isometric antilinear isomorphism Φ⁻¹ : H* → H, φ ↦ f_φ. Consequently, every continuous linear functional on the Hilbert space H can be written uniquely in the form ⟨y | ·⟩,[1] where ‖⟨y | ·⟩‖_{H*} = ‖y‖_H for every y ∈ H. The assignment y ↦ ⟨y, ·⟩ = ⟨· | y⟩ can also be viewed as a bijective linear isometry H → H̄* into the anti-dual space of H,[1] which is the complex conjugate vector space of the continuous dual space H*.

The inner products on H and H* are related by ⟨Φh, Φk⟩_{H*} = \overline{⟨h, k⟩_H} = ⟨k, h⟩_H for all h, k ∈ H and similarly, ⟨Φ⁻¹φ, Φ⁻¹ψ⟩_H = \overline{⟨φ, ψ⟩_{H*}} = ⟨ψ, φ⟩_{H*} for all φ, ψ ∈ H*.

The set C := φ⁻¹(‖φ‖²) satisfies C = f_φ + ker φ and C − f_φ = ker φ, so when f_φ ≠ 0 then C can be interpreted as being the affine hyperplane[note 3] that is parallel to the vector subspace ker φ and contains f_φ.

For y ∈ H, the physics notation for the functional Φ(y) ∈ H* is the bra ⟨y|, where explicitly this means that ⟨y| := Φ(y), which complements the ket notation |y⟩ defined by |y⟩ := y. In the mathematical treatment of quantum mechanics, the theorem can be seen as a justification for the popular bra–ket notation: every bra ⟨ψ| has a corresponding ket |ψ⟩, and the latter is unique.

Historically, the theorem is often attributed simultaneously to Riesz and Fréchet in 1907 (see references).

Proof[4]

Let 𝔽 denote the underlying scalar field of H.

Proof of norm formula:

Fix y ∈ H. Define Λ : H → 𝔽 by Λ(z) := ⟨y | z⟩, which is a linear functional on H since z is in the linear argument. By the Cauchy–Schwarz inequality, |Λ(z)| = |⟨y | z⟩| ≤ ‖y‖ ‖z‖, which shows that Λ is bounded (equivalently, continuous) and that ‖Λ‖ ≤ ‖y‖. It remains to show that ‖y‖ ≤ ‖Λ‖. By using y in place of z, it follows that ‖y‖² = ⟨y | y⟩ = Λ(y) = |Λ(y)| ≤ ‖Λ‖ ‖y‖ (the equality Λ(y) = |Λ(y)| holds because Λ(y) = ‖y‖² ≥ 0 is real and non-negative). Thus ‖Λ‖ = ‖y‖.

The proof above did not use the fact that H is complete, which shows that the formula for the norm ‖⟨y | ·⟩‖_{H*} = ‖y‖_H holds more generally for all inner product spaces.


Proof that a Riesz representation of φ is unique:

Suppose f, g ∈ H are such that φ(z) = ⟨f | z⟩ and φ(z) = ⟨g | z⟩ for all z ∈ H. Then ⟨f − g | z⟩ = ⟨f | z⟩ − ⟨g | z⟩ = φ(z) − φ(z) = 0 for all z ∈ H, which shows that Λ := ⟨f − g | ·⟩ is the constant 0 linear functional. Consequently 0 = ‖⟨f − g | ·⟩‖ = ‖f − g‖, which implies that f − g = 0.


Proof that a vector fφ representing φ exists:

Let K := ker φ := {m ∈ H : φ(m) = 0}. If K = H (or equivalently, if φ = 0) then taking f_φ := 0 completes the proof, so assume that K ≠ H and φ ≠ 0. The continuity of φ implies that K is a closed subspace of H (because K = φ⁻¹({0}) and {0} is a closed subset of 𝔽). Let K^⊥ := {v ∈ H : ⟨v | k⟩ = 0 for all k ∈ K} denote the orthogonal complement of K in H. Because K is closed and H is a Hilbert space,[note 4] H can be written as the direct sum H = K ⊕ K^⊥[note 5] (a proof of this is given in the article on the Hilbert projection theorem). Because K ≠ H, there exists some non-zero p ∈ K^⊥. For any h ∈ H, φ[(φh) p − (φp) h] = φ[(φh) p] − φ[(φp) h] = (φh)(φp) − (φp)(φh) = 0, which shows that (φh) p − (φp) h ∈ ker φ = K, where now p ∈ K^⊥ implies 0 = ⟨p | (φh) p − (φp) h⟩ = ⟨p | (φh) p⟩ − ⟨p | (φp) h⟩ = (φh) ⟨p | p⟩ − (φp) ⟨p | h⟩. Solving for φh shows that φh = (φp) ⟨p | h⟩ / ‖p‖² = ⟨(\overline{φp} / ‖p‖²) p | h⟩ for every h ∈ H, which proves that the vector f_φ := (\overline{φp} / ‖p‖²) p satisfies φh = ⟨f_φ | h⟩ for every h ∈ H.

Applying the norm formula that was proved above with y := f_φ shows that ‖φ‖_{H*} = ‖⟨f_φ | ·⟩‖_{H*} = ‖f_φ‖_H. Also, the vector u := p / ‖p‖ has norm ‖u‖ = 1 and satisfies f_φ = \overline{φ(u)} u.
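The construction in the proof can be traced numerically (an illustrative Python sketch in ℂ²; the vectors chosen are arbitrary): starting from any non-zero p ∈ (ker φ)^⊥, the formula f_φ = \overline{φ(p)} p / ‖p‖² recovers the representing vector:

```python
def ip(x, y):
    """<x, y>: linear in x, antilinear in y."""
    return sum(a * b.conjugate() for a, b in zip(x, y))

g = [1 - 2j, 3 + 1j]                       # phi(x) := <x, g>, so f_phi should be g
phi = lambda x: ip(x, g)

p = [(2 + 1j) * z for z in g]              # any non-zero p in (ker phi)^perp
coef = phi(p).conjugate() / ip(p, p).real  # conj(phi(p)) / ||p||^2
f = [coef * z for z in p]                  # the constructed representing vector

assert all(abs(a - b) < 1e-12 for a, b in zip(f, g))
w = [5j, 1 + 1j]
assert abs(phi(w) - ip(w, f)) < 1e-12      # f really represents phi
```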


It can now be deduced that K^⊥ is 1-dimensional when φ ≠ 0. Let q ∈ K^⊥ be any non-zero vector. Replacing p with q in the proof above shows that the vector g := (\overline{φ(q)} / ‖q‖²) q satisfies φ(h) = ⟨g | h⟩ for every h ∈ H. The uniqueness of the (non-zero) vector f_φ representing φ implies that f_φ = g, which in turn implies that φ(q) ≠ 0 and q = (‖q‖² / \overline{φ(q)}) f_φ. Thus every vector in K^⊥ is a scalar multiple of f_φ.

The formulas for the inner products follow from the polarization identity.

Observations

If φ ∈ H* then φ(f_φ) = ⟨f_φ, f_φ⟩ = ‖f_φ‖² = ‖φ‖². So in particular, φ(f_φ) ≥ 0 is always real and furthermore, φ(f_φ) = 0 if and only if f_φ = 0 if and only if φ = 0.

Linear functionals as affine hyperplanes

A non-trivial continuous linear functional φ is often interpreted geometrically by identifying it with the affine hyperplane A := φ⁻¹(1) (the kernel ker φ = φ⁻¹(0) is also often visualized alongside A := φ⁻¹(1), although knowing A is enough to reconstruct ker φ, because if A = ∅ then ker φ = H and otherwise ker φ = A − A). In particular, the norm of φ should somehow be interpretable as the "norm of the hyperplane A". When φ ≠ 0, the Riesz representation theorem provides such an interpretation of ‖φ‖ in terms of the affine hyperplane[note 3] A := φ⁻¹(1) as follows: using the notation from the theorem's statement, from ‖φ‖² ≠ 0 it follows that C := φ⁻¹(‖φ‖²) = ‖φ‖² φ⁻¹(1) = ‖φ‖² A, and so ‖φ‖ = ‖f_φ‖ = inf_{c ∈ C} ‖c‖ implies ‖φ‖ = inf_{a ∈ A} ‖φ‖² ‖a‖ and thus ‖φ‖ = 1 / inf_{a ∈ A} ‖a‖. This can also be seen by applying the Hilbert projection theorem to A and concluding that the global minimum point of the map A → [0, ∞) defined by a ↦ ‖a‖ is f_φ / ‖φ‖² ∈ A.

The formulas 1 / inf_{a ∈ A} ‖a‖ = sup_{a ∈ A} 1/‖a‖ provide the promised interpretation of the linear functional's norm ‖φ‖ entirely in terms of its associated affine hyperplane A = φ⁻¹(1) (because with this formula, knowing only the set A is enough to describe the norm of its associated linear functional). Defining 1/∞ := 0, the infimum formula ‖φ‖ = 1 / inf_{a ∈ φ⁻¹(1)} ‖a‖ will also hold when φ = 0. When the supremum is taken in [−∞, ∞] (as is typically assumed), then the supremum of the empty set is sup ∅ = −∞, but if the supremum is taken in the non-negative reals [0, ∞) (which is the image/range of the norm when dim H > 0) then this supremum is instead sup ∅ = 0, in which case the supremum formula ‖φ‖ = sup_{a ∈ φ⁻¹(1)} 1/‖a‖ will also hold when φ = 0 (although the atypical equality sup ∅ = 0 is usually unexpected and so risks causing confusion).

Constructions of the representing vector

Using the notation from the theorem above, several ways of constructing f_φ from φ ∈ H* are now described. If φ = 0 then f_φ := 0; in other words, f_0 = 0.

This special case of φ=0 is henceforth assumed to be known, which is why some of the constructions given below start by assuming φ0.

Orthogonal complement of kernel

If φ ≠ 0 then for any non-zero u ∈ (ker φ)^⊥, f_φ := \overline{φ(u)} u / ‖u‖².

If u ∈ (ker φ)^⊥ is a unit vector (meaning ‖u‖ = 1) then f_φ := \overline{φ(u)} u (this is true even if φ = 0 because in this case f_φ = \overline{φ(u)} u = 0 · u = 0). If u is a unit vector satisfying the above condition then the same is true of −u, which is also a unit vector in (ker φ)^⊥. However, \overline{φ(−u)} (−u) = \overline{φ(u)} u = f_φ, so both these vectors result in the same f_φ.

Orthogonal projection onto kernel

If x ∈ H is such that φ(x) ≠ 0 and if x_K is the orthogonal projection of x onto ker φ then[proof 1] f_φ = (‖φ‖² / φ(x)) (x − x_K).
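This projection formula can be checked numerically in ℂ² (an illustrative Python sketch with arbitrary choices of φ and x):

```python
def ip(x, y):
    """<x, y>: linear in x, antilinear in y."""
    return sum(a * b.conjugate() for a, b in zip(x, y))

g = [2 + 1j, 1 - 1j]                 # phi(x) := <x, g>; its Riesz vector is g
phi = lambda x: ip(x, g)
norm_phi_sq = ip(g, g).real          # ||phi||^2 = ||g||^2

x = [3, 4j]                          # any x with phi(x) != 0
# orthogonal projection of x onto ker(phi): remove the component along g
xK = [a - (ip(x, g) / ip(g, g)) * b for a, b in zip(x, g)]
assert abs(phi(xK)) < 1e-12          # xK lies in the kernel

f = [(norm_phi_sq / phi(x)) * (a - b) for a, b in zip(x, xK)]
assert all(abs(a - b) < 1e-9 for a, b in zip(f, g))   # f equals g
```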

Orthonormal basis

Given an orthonormal basis {e_i}_{i ∈ I} of H and a continuous linear functional φ ∈ H*, the vector f_φ ∈ H can be constructed uniquely by f_φ = Σ_{i ∈ I} \overline{φ(e_i)} e_i, where all but at most countably many of the scalars φ(e_i) will be equal to 0 and where the value of f_φ does not actually depend on the choice of orthonormal basis (that is, using any other orthonormal basis for H will result in the same vector). If y ∈ H is written as y = Σ_{i ∈ I} a_i e_i then φ(y) = Σ_{i ∈ I} φ(e_i) a_i = ⟨f_φ | y⟩ and ‖f_φ‖² = φ(f_φ) = Σ_{i ∈ I} φ(e_i) \overline{φ(e_i)} = Σ_{i ∈ I} |φ(e_i)|² = ‖φ‖².

If the orthonormal basis {e_i}_{i ∈ I} = {e_i}_{i=1}^∞ is a sequence then this becomes f_φ = \overline{φ(e_1)} e_1 + \overline{φ(e_2)} e_2 + ⋯ and if y ∈ H is written as y = Σ_{i ∈ I} a_i e_i = a_1 e_1 + a_2 e_2 + ⋯ then φ(y) = φ(e_1) a_1 + φ(e_2) a_2 + ⋯ = ⟨f_φ | y⟩.
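The orthonormal-basis construction is easy to verify in ℂ² with the standard basis (an illustrative Python sketch; the functional's coefficients are arbitrary):

```python
def ip(x, y):
    """<x, y>: linear in x, antilinear in y."""
    return sum(a * b.conjugate() for a, b in zip(x, y))

e = [[1, 0], [0, 1]]                                 # standard orthonormal basis
phi = lambda x: (1 + 2j) * x[0] + (3 - 1j) * x[1]    # a linear functional

# f_phi = sum_i conj(phi(e_i)) e_i
coeffs = [phi(v).conjugate() for v in e]
f = [sum(c * v[k] for c, v in zip(coeffs, e)) for k in range(2)]

y = [2 - 1j, 1 + 4j]
assert phi(y) == ip(y, f)                            # phi(y) = <y, f_phi>
# ||f_phi||^2 = sum_i |phi(e_i)|^2
assert ip(f, f).real == sum((phi(v) * phi(v).conjugate()).real for v in e)
```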

Example in finite dimensions using matrix transformations

Consider the special case of H = ℂⁿ (where n > 0 is an integer) with the standard inner product ⟨z | w⟩ := \overline{z}^T w for all w, z ∈ H, where w and z are represented as column matrices w := [w_1 ⋯ w_n]^T and z := [z_1 ⋯ z_n]^T with respect to the standard orthonormal basis e_1, …, e_n on H (here, e_i is 1 at its ith coordinate and 0 everywhere else; as usual, H* will now be associated with the dual basis) and where \overline{z}^T := [\overline{z_1}, …, \overline{z_n}] denotes the conjugate transpose of z. Let φ ∈ H* be any linear functional and let φ_1, …, φ_n be the unique scalars such that φ(w_1, …, w_n) = φ_1 w_1 + ⋯ + φ_n w_n for all w := (w_1, …, w_n) ∈ H, where it can be shown that φ_i = φ(e_i) for all i = 1, …, n. Then the Riesz representation of φ is the vector f_φ := \overline{φ_1} e_1 + ⋯ + \overline{φ_n} e_n = (\overline{φ_1}, …, \overline{φ_n}) ∈ H. To see why, identify every vector w = (w_1, …, w_n) in H with the column matrix [w_1 ⋯ w_n]^T, so that f_φ is identified with [\overline{φ_1} ⋯ \overline{φ_n}]^T = [\overline{φ(e_1)} ⋯ \overline{φ(e_n)}]^T. As usual, also identify the linear functional φ with its transformation matrix, which is the row matrix [φ_1, …, φ_n], so that f_φ = \overline{φ}^T and the function φ is the assignment w ↦ φ w, where the right hand side is matrix multiplication. Then for all w = (w_1, …, w_n) ∈ H, φ(w) = φ_1 w_1 + ⋯ + φ_n w_n = [φ_1, …, φ_n] [w_1 ⋯ w_n]^T = \overline{f_φ}^T w = ⟨f_φ | w⟩, which shows that f_φ satisfies the defining condition of the Riesz representation of φ. The bijective antilinear isometry Φ : H → H* defined in the corollary to the Riesz representation theorem is the assignment that sends z = (z_1, …, z_n) ∈ H to the linear functional Φ(z) ∈ H* on H defined by w = (w_1, …, w_n) ↦ ⟨z | w⟩ = \overline{z_1} w_1 + ⋯ + \overline{z_n} w_n, where under the identification of vectors in H with column matrices and vectors in H* with row matrices, Φ is just the assignment z = [z_1 ⋯ z_n]^T ↦ \overline{z}^T = [\overline{z_1}, …, \overline{z_n}]. As described in the corollary, Φ's inverse Φ⁻¹ : H* → H is the antilinear isometry φ ↦ f_φ, which was just shown above to be φ ↦ (\overline{φ(e_1)}, …, \overline{φ(e_n)}); in terms of matrices, Φ⁻¹ is the assignment φ = [φ_1, …, φ_n] ↦ \overline{φ}^T = [\overline{φ_1} ⋯ \overline{φ_n}]^T.
Thus in terms of matrices, each of Φ : H → H* and Φ⁻¹ : H* → H is just the operation of conjugate transposition v ↦ \overline{v}^T (although between different spaces of matrices: if H is identified with the space of all column (respectively, row) matrices then H* is identified with the space of all row (respectively, column) matrices).

This example used the standard inner product, which is the map ⟨z | w⟩ := \overline{z}^T w, but if a different inner product is used, such as ⟨z | w⟩_M := \overline{z}^T M w where M is any Hermitian positive-definite matrix, or if a different orthonormal basis is used, then the transformation matrices, and thus also the above formulas, will be different.
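As a hedged numeric sketch of this closing remark: with ⟨z | w⟩_M := \overline{z}^T M w for a Hermitian positive-definite M, the representing vector works out to f_φ = M⁻¹ \overline{φ_row}^T, a formula derived only for this illustration (it is not stated in the text above); the Python check uses an arbitrary 2 × 2 example:

```python
# Illustrative 2x2 example (assumed data, not from the text)
M = [[2, 1j], [-1j, 3]]                 # Hermitian positive-definite
phi_row = [1 + 1j, 2 - 1j]              # the row matrix of the functional
phi = lambda w: sum(p * wi for p, wi in zip(phi_row, w))

def ip_M(z, w):
    """<z | w>_M := conj(z)^T M w, linear in w."""
    return sum(z[i].conjugate() * M[i][j] * w[j]
               for i in range(2) for j in range(2))

# f = M^{-1} conj(phi_row^T), via an explicit 2x2 inverse
det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
Minv = [[M[1][1] / det, -M[0][1] / det], [-M[1][0] / det, M[0][0] / det]]
b = [p.conjugate() for p in phi_row]
f = [Minv[0][0] * b[0] + Minv[0][1] * b[1],
     Minv[1][0] * b[0] + Minv[1][1] * b[1]]

w = [3 - 2j, 1 + 1j]
assert abs(phi(w) - ip_M(f, w)) < 1e-12   # f represents phi for <.|.>_M
```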

Relationship with the associated real Hilbert space

Assume that H is a complex Hilbert space with inner product ⟨·, ·⟩. When the Hilbert space H is reinterpreted as a real Hilbert space then it will be denoted by H_ℝ, where the (real) inner product on H_ℝ is the real part of H's inner product; that is: ⟨x, y⟩_ℝ := re ⟨x, y⟩.

The norm on H_ℝ induced by ⟨·, ·⟩_ℝ is equal to the original norm on H, and the continuous dual space of H_ℝ is the set of all real-valued bounded ℝ-linear functionals on H_ℝ (see the article about the polarization identity for additional details about this relationship). Let ψ_ℝ := re ψ and ψ_i := im ψ denote the real and imaginary parts of a linear functional ψ, so that ψ = re ψ + i im ψ = ψ_ℝ + i ψ_i. The formula expressing a linear functional in terms of its real part is ψ(h) = ψ_ℝ(h) − i ψ_ℝ(ih) for h ∈ H, where ψ_i(h) = −ψ_ℝ(ih) for all h ∈ H. It follows that ker ψ_ℝ = ψ⁻¹(iℝ), and that ψ = 0 if and only if ψ_ℝ = 0. It can also be shown that ‖ψ‖ = ‖ψ_ℝ‖ = ‖ψ_i‖, where ‖ψ_ℝ‖ := sup_{‖h‖ ≤ 1} |ψ_ℝ(h)| and ‖ψ_i‖ := sup_{‖h‖ ≤ 1} |ψ_i(h)| are the usual operator norms. In particular, a linear functional ψ is bounded if and only if its real part ψ_ℝ is bounded.
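The formula ψ(h) = ψ_ℝ(h) − i ψ_ℝ(ih) can be checked numerically (an illustrative Python sketch with an arbitrary functional on ℂ²):

```python
def ip(x, y):
    """<x, y>: linear in x, antilinear in y."""
    return sum(a * b.conjugate() for a, b in zip(x, y))

g = [1 + 2j, 3 - 1j]
psi = lambda h: ip(h, g)           # a C-linear functional
psi_R = lambda h: psi(h).real      # its real part, an R-linear functional

h = [2 - 3j, 1j]
ih = [1j * z for z in h]
# psi(h) = psi_R(h) - i * psi_R(ih)
assert psi(h) == complex(psi_R(h), -psi_R(ih))
```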

Representing a functional and its real part

The Riesz representation of a continuous linear functional φ on a complex Hilbert space is equal to the Riesz representation of its real part re φ on the associated real Hilbert space.

Explicitly, let φ ∈ H* and, as above, let f_φ ∈ H be the Riesz representation of φ obtained in (H, ⟨·, ·⟩), so it is the unique vector that satisfies φ(x) = ⟨x, f_φ⟩ for all x ∈ H. The real part of φ is a continuous real linear functional on H_ℝ and so the Riesz representation theorem may be applied to φ_ℝ := re φ and the associated real Hilbert space (H_ℝ, ⟨·, ·⟩_ℝ) to produce its Riesz representation, which will be denoted by f_{φ_ℝ}. That is, f_{φ_ℝ} is the unique vector in H_ℝ that satisfies φ_ℝ(x) = ⟨x, f_{φ_ℝ}⟩_ℝ for all x ∈ H. The conclusion is f_{φ_ℝ} = f_φ. This follows from the main theorem because ker φ_ℝ = φ⁻¹(iℝ) and, if x ∈ H, then ⟨x, f_φ⟩_ℝ = re ⟨x, f_φ⟩ = re φ(x) = φ_ℝ(x); consequently, if m ∈ ker φ_ℝ then ⟨m, f_φ⟩_ℝ = 0, which shows that f_φ ∈ (ker φ_ℝ)^⊥. Moreover, φ(f_φ) = ‖φ‖² being a real number implies that φ_ℝ(f_φ) = re φ(f_φ) = ‖φ‖². In other words, in the theorem and constructions above, if H is replaced with its real Hilbert space counterpart H_ℝ and if φ is replaced with re φ then f_φ = f_{re φ}. This means that the vector f_φ obtained by using (H_ℝ, ⟨·, ·⟩_ℝ) and the real linear functional re φ is equal to the vector obtained by using the original complex Hilbert space (H, ⟨·, ·⟩) and the original complex linear functional φ (with identical norm values as well).

Furthermore, if φ ≠ 0 then f_φ is perpendicular to ker φ_ℝ with respect to ⟨·, ·⟩_ℝ, where the kernel of φ is a proper subspace of the kernel of its real part φ_ℝ. Assume now that φ ≠ 0. Then f_φ ∉ ker φ_ℝ because φ_ℝ(f_φ) = φ(f_φ) = ‖φ‖² ≠ 0, and ker φ is a proper subset of ker φ_ℝ. The vector subspace ker φ has real codimension 1 in ker φ_ℝ, while ker φ_ℝ has real codimension 1 in H_ℝ, and ⟨f_φ, ker φ_ℝ⟩_ℝ = 0. That is, f_φ is perpendicular to ker φ_ℝ with respect to ⟨·, ·⟩_ℝ.

Canonical injections into the dual and anti-dual

Induced linear map into anti-dual

The map defined by placing y into the linear coordinate of the inner product and letting the variable h ∈ H vary over the antilinear coordinate results in an antilinear functional: ⟨· | y⟩ = ⟨y, ·⟩ : H → 𝔽 defined by h ↦ ⟨h | y⟩ = ⟨y, h⟩.

This map is an element of H̄*, which is the continuous anti-dual space of H. The canonical map from H into its anti-dual H̄*[1] is the linear operator In : H → H̄* defined by y ↦ ⟨· | y⟩ = ⟨y, ·⟩, which is also an injective isometry.[1] The fundamental theorem of Hilbert spaces, which is related to the Riesz representation theorem, states that this map is surjective (and thus bijective). Consequently, every antilinear functional on H can be written (uniquely) in this form.[1]

If Cong : H* → H̄* is the canonical antilinear bijective isometry f ↦ f̄ that was defined above, and Φ : H → H* is the canonical antilinear injection y ↦ ⟨y | ·⟩ from the corollary, then the following equality holds: Cong ∘ Φ = In.

Extending the bra–ket notation to bras and kets

Main page: Bra–ket notation

Let (H, ⟨·, ·⟩_H) be a Hilbert space and, as before, let ⟨y | x⟩_H := ⟨x, y⟩_H. Let Φ : H → H*, g ↦ ⟨g | ·⟩_H = ⟨·, g⟩_H, which is a bijective antilinear isometry that satisfies (Φh)(g) = ⟨h | g⟩_H = ⟨g, h⟩_H for all g, h ∈ H.

Bras

Given a vector h ∈ H, let ⟨h| denote the continuous linear functional Φh; that is, ⟨h| := Φh, so that this functional ⟨h| is defined by g ↦ ⟨h | g⟩_H. This map was denoted by ⟨h | ·⟩ earlier in this article.

The assignment h ↦ ⟨h| is just the isometric antilinear isomorphism Φ : H → H*, which is why ⟨cg + h| = c̄ ⟨g| + ⟨h| holds for all g, h ∈ H and all scalars c. The result of plugging some given g ∈ H into the functional ⟨h| is the scalar ⟨h | g⟩_H = ⟨g, h⟩_H, which may be denoted by ⟨h | g⟩.[note 6]

Bra of a linear functional

Given a continuous linear functional ψ ∈ H*, let |ψ⟩ denote the vector Φ⁻¹ψ ∈ H; that is, |ψ⟩ := Φ⁻¹ψ.

The assignment ψ ↦ |ψ⟩ is just the isometric antilinear isomorphism Φ⁻¹ : H* → H, which is why |cψ + ϕ⟩ = c̄ |ψ⟩ + |ϕ⟩ holds for all ϕ, ψ ∈ H* and all scalars c.

The defining condition of the vector |ψ⟩ ∈ H is the technically correct but unsightly equality ⟨ |ψ⟩ | g⟩_H = ψ(g) for all g ∈ H, which is why the notation ⟨ψ | g⟩ is used in place of ⟨ |ψ⟩ | g⟩_H = ⟨g, |ψ⟩⟩_H. With this notation, the defining condition becomes ⟨ψ | g⟩ = ψ(g) for all g ∈ H.

Kets

For any given vector g ∈ H, the notation |g⟩ is used to denote g; that is, |g⟩ := g.

The assignment g ↦ |g⟩ is just the identity map Id_H : H → H, which is why |cg + h⟩ = c |g⟩ + |h⟩ holds for all g, h ∈ H and all scalars c.

The notations ⟨h | g⟩ and ⟨ψ | g⟩ are used in place of ⟨h | g⟩_H = ⟨g, h⟩_H and ⟨ |ψ⟩ | g⟩_H = ⟨g, |ψ⟩⟩_H, respectively. As expected, ⟨ψ | g⟩ = ψ(g) and ⟨h | g⟩ really is just the scalar ⟨h | g⟩_H = ⟨g, h⟩_H.

Adjoints and transposes

Let A : H → Z be a continuous linear operator between Hilbert spaces (H, ⟨·, ·⟩_H) and (Z, ⟨·, ·⟩_Z). As before, let ⟨y | x⟩_H := ⟨x, y⟩_H and ⟨y | x⟩_Z := ⟨x, y⟩_Z.

Denote by Φ_H : H → H*, g ↦ ⟨g | ·⟩_H and Φ_Z : Z → Z*, y ↦ ⟨y | ·⟩_Z the usual bijective antilinear isometries that satisfy: (Φ_H g)(h) = ⟨g | h⟩_H for all g, h ∈ H and (Φ_Z y)(z) = ⟨y | z⟩_Z for all y, z ∈ Z.

Definition of the adjoint

Main pages: Hermitian adjoint and Conjugate transpose

For every z ∈ Z, the scalar-valued map ⟨z | A(·)⟩_Z[note 7] on H defined by h ↦ ⟨z | Ah⟩_Z = ⟨Ah, z⟩_Z

is a continuous linear functional on H, and so by the Riesz representation theorem there exists a unique vector in H, denoted by A* z, such that ⟨z | A(·)⟩_Z = ⟨A* z | ·⟩_H, or equivalently, such that ⟨z | Ah⟩_Z = ⟨A* z | h⟩_H for all h ∈ H.

The assignment z ↦ A* z thus induces a function A* : Z → H called the adjoint of A : H → Z whose defining condition is ⟨z | Ah⟩_Z = ⟨A* z | h⟩_H for all h ∈ H and all z ∈ Z. The adjoint A* : Z → H is necessarily a continuous (equivalently, a bounded) linear operator.

If H is finite dimensional with the standard inner product and if M is the transformation matrix of A with respect to the standard orthonormal basis, then M's conjugate transpose \overline{M}^T is the transformation matrix of the adjoint A*.
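The defining condition of the adjoint, together with its conjugate-transpose matrix description, can be checked in ℂ² (an illustrative Python sketch with the standard inner product and an arbitrary matrix A):

```python
def ip(x, y):
    """<x, y>: linear in x, antilinear in y."""
    return sum(a * b.conjugate() for a, b in zip(x, y))

def matvec(M, v):
    return [sum(M[i][j] * v[j] for j in range(2)) for i in range(2)]

A = [[1 + 1j, 2], [3j, 1 - 2j]]
# the adjoint's matrix is the conjugate transpose of A's matrix
A_star = [[A[j][i].conjugate() for j in range(2)] for i in range(2)]

h, z = [2 - 1j, 1 + 1j], [1j, 3]
# defining condition of the adjoint: <Ah, z> = <h, A* z>
assert ip(matvec(A, h), z) == ip(h, matvec(A_star, z))
```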

Adjoints are transposes

Main page: Transpose of a linear map

It is also possible to define the transpose or algebraic adjoint of A : H → Z, which is the map ᵗA : Z* → H* defined by sending a continuous linear functional ψ ∈ Z* to ᵗA(ψ) := ψ ∘ A, where the composition ψ ∘ A is always a continuous linear functional on H and it satisfies ‖A‖ = ‖ᵗA‖ (this is true more generally, when H and Z are merely normed spaces).[5] So, for example, if z ∈ Z then ᵗA sends the continuous linear functional ⟨z | ·⟩_Z ∈ Z* (defined on Z by g ↦ ⟨z | g⟩_Z) to the continuous linear functional ⟨z | A(·)⟩_Z ∈ H* (defined on H by h ↦ ⟨z | A(h)⟩_Z);[note 7] using bra–ket notation, this can be written as ᵗA ⟨z| = ⟨z| A, where the juxtaposition of ⟨z| with A on the right hand side denotes function composition: H → Z → 𝔽.

The adjoint A* : Z → H is actually just the transpose ᵗA : Z* → H*[2] when the Riesz representation theorem is used to identify Z with Z* and H with H*.

Explicitly, the relationship between the adjoint and transpose is:

ᵗA ∘ Φ_Z = Φ_H ∘ A*     (Adjoint-transpose)

which can be rewritten as: A* = Φ_H⁻¹ ∘ ᵗA ∘ Φ_Z and ᵗA = Φ_H ∘ A* ∘ Φ_Z⁻¹.

Alternatively, the value of the left and right hand sides of (Adjoint-transpose) at any given z ∈ Z can be rewritten in terms of the inner products as: (ᵗA ∘ Φ_Z) z = ⟨z | A(·)⟩_Z and (Φ_H ∘ A*) z = ⟨A* z | ·⟩_H, so that ᵗA ∘ Φ_Z = Φ_H ∘ A* holds if and only if ⟨z | A(·)⟩_Z = ⟨A* z | ·⟩_H holds; but the equality on the right holds by definition of A* z. The defining condition of A* z can also be written ⟨z| A = ⟨A* z| if bra–ket notation is used.

Descriptions of self-adjoint, normal, and unitary operators

Assume Z = H and let Φ := Φ_H = Φ_Z. Let A : H → H be a continuous (that is, bounded) linear operator.

Whether or not A : H → H is self-adjoint, normal, or unitary depends entirely on whether or not A satisfies certain defining conditions related to its adjoint, which was shown by (Adjoint-transpose) to essentially be just the transpose ᵗA : H* → H*. Because the transpose of A is a map between continuous linear functionals, these defining conditions can consequently be re-expressed entirely in terms of linear functionals, as the remainder of this subsection will now describe in detail. The linear functionals that are involved are the simplest possible continuous linear functionals on H that can be defined entirely in terms of A, the inner product on H, and some given vector h ∈ H. Specifically, these are ⟨Ah| and ⟨h| A(·),[note 7] where ⟨Ah| = Φ(Ah) = (Φ ∘ A) h and ⟨h| A(·) = (ᵗA ∘ Φ) h.

Self-adjoint operators


A continuous linear operator A : H → H is called self-adjoint if it is equal to its own adjoint; that is, if A = A*. Using (Adjoint-transpose), this happens if and only if Φ ∘ A = ᵗA ∘ Φ, where this equality can be rewritten in the following two equivalent forms: A = Φ⁻¹ ∘ ᵗA ∘ Φ or ᵗA = Φ ∘ A ∘ Φ⁻¹.

Unraveling notation and definitions produces the following characterization of self-adjoint operators in terms of the aforementioned continuous linear functionals: A is self-adjoint if and only if, for all z ∈ H, the linear functional ⟨z| A(·)[note 7] is equal to the linear functional ⟨Az|; that is, if and only if

⟨z| A(·) = ⟨Az| for all z ∈ H     (Self-adjointness functionals)

where if bra–ket notation is used, this is ⟨z| A = ⟨Az| for all z ∈ H.

Normal operators


A continuous linear operator A : H → H is called normal if A A* = A* A, which happens if and only if for all z, h ∈ H, ⟨A A* z | h⟩ = ⟨A* A z | h⟩.

Using (Adjoint-transpose) and unraveling notation and definitions produces[proof 2] the following characterization of normal operators in terms of inner products of continuous linear functionals: A is a normal operator if and only if

⟨ ⟨Ah|, ⟨Az| ⟩_{H*} = ⟨ ⟨h| A(·), ⟨z| A(·) ⟩_{H*} for all z, h ∈ H     (Normality functionals)

where the left hand side is also equal to ⟨Ah | Az⟩_H = ⟨Az, Ah⟩_H. The left hand side of this characterization involves only linear functionals of the form ⟨Ah| while the right hand side involves only linear functionals of the form ⟨h| A(·) (defined as above[note 7]). So in plain English, characterization (Normality functionals) says that an operator is normal when the inner product of any two linear functionals of the first form is equal to the inner product of the corresponding two of the second form (using the same vectors z, h ∈ H for both forms). In other words, if it happens to be the case (and when A is injective or self-adjoint, it is) that the assignment of linear functionals ⟨Ah| ↦ ⟨h| A(·) is well-defined (or alternatively, if ⟨h| A(·) ↦ ⟨Ah| is well-defined), where h ranges over H, then A is a normal operator if and only if this assignment preserves the inner product on H*.

The fact that every self-adjoint bounded linear operator is normal follows readily by direct substitution of A* = A into either side of A* A = A A*. This same fact also follows immediately from direct substitution of the equalities (Self-adjointness functionals) into either side of (Normality functionals).

Alternatively, for a complex Hilbert space, the continuous linear operator A is a normal operator if and only if ‖Az‖ = ‖A* z‖ for every z ∈ H,[2] which happens if and only if ‖Az‖_H = ‖⟨z| A(·)‖_{H*} for every z ∈ H.
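This norm characterization of normality is easy to test numerically (an illustrative Python sketch in ℂ²): a diagonal matrix is normal, while a nilpotent shift is not:

```python
def matvec(M, v):
    return [sum(M[i][j] * v[j] for j in range(2)) for i in range(2)]

def adj(M):
    """Conjugate transpose, the matrix of the adjoint."""
    return [[M[j][i].conjugate() for j in range(2)] for i in range(2)]

def norm_sq(v):
    return sum((a * a.conjugate()).real for a in v)

z = [2 + 1j, 1 - 3j]

N = [[1j, 0], [0, 2]]    # diagonal, hence normal (but not self-adjoint)
assert norm_sq(matvec(N, z)) == norm_sq(matvec(adj(N), z))

S = [[0, 1], [0, 0]]     # a nilpotent shift, not normal
assert norm_sq(matvec(S, z)) != norm_sq(matvec(adj(S), z))
```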

Unitary operators


An invertible bounded linear operator A : H → H is said to be unitary if its inverse is its adjoint: A⁻¹ = A*. By using (Adjoint-transpose), this is seen to be equivalent to Φ ∘ A⁻¹ = ᵗA ∘ Φ. Unraveling notation and definitions, it follows that A is unitary if and only if ⟨A⁻¹ z| = ⟨z| A(·) for all z ∈ H.

The fact that a bounded invertible linear operator A : H → H is unitary if and only if A* A = Id_H (or equivalently, ᵗA ∘ Φ ∘ A = Φ) produces another (well-known) characterization: an invertible bounded linear map A is unitary if and only if ⟨Az| A(·) = ⟨z| for all z ∈ H.

Because A : H → H is invertible (and so in particular a bijection), this is also true of the transpose ᵗA : H* → H*. This fact also allows the vector z ∈ H in the above characterizations to be replaced with Az or A⁻¹ z, thereby producing many more equalities. Similarly, the argument placeholder (·) can be replaced with A(·) or A⁻¹(·).

See also

Citations

  1. 1.00 1.01 1.02 1.03 1.04 1.05 1.06 1.07 1.08 1.09 1.10 1.11 Trèves 2006, pp. 112–123.
  2. 2.0 2.1 2.2 Rudin 1991, pp. 306–312.
  3. Roman 2008, p. 351, Theorem 13.32.
  4. Rudin 1991, pp. 307–309.
  5. Rudin 1991, pp. 92–115.

Notes

  1. If 𝔽 = ℝ then the inner product will be symmetric, so it does not matter which coordinate of the inner product the element y is placed into, because the same map will result. But if 𝔽 = ℂ then, except for the constant 0 map, antilinear functionals on H are completely distinct from linear functionals on H, which makes the coordinate that y is placed into very important. For a non-zero y ∈ H to induce a linear functional (rather than an antilinear functional), y must be placed into the antilinear coordinate of the inner product. If it is incorrectly placed into the linear coordinate instead of the antilinear coordinate, then the resulting map will be the antilinear map h ↦ ⟨y, h⟩ = ⟨h | y⟩, which is not a linear functional on H and so it will not be an element of the continuous dual space H*.
  2. This means that for all vectors y ∈ H: (1) Φ : H → H* is injective. (2) The norms of y and Φ(y) are the same: ‖Φ(y)‖ = ‖y‖. (3) Φ is an additive map, meaning that Φ(x + y) = Φ(x) + Φ(y) for all x, y ∈ H. (4) Φ is conjugate homogeneous: Φ(s y) = s̄ Φ(y) for all scalars s. (5) Φ is real homogeneous: Φ(r y) = r Φ(y) for all real numbers r.
  3. This footnote explains how to define - using only H's operations - addition and scalar multiplication of affine hyperplanes so that these operations correspond to addition and scalar multiplication of linear functionals. Let H be any vector space and let H# denote its algebraic dual space. Let 𝒜 := {φ⁻¹(1) : φ ∈ H#} and let ⋅̂ and +̂ denote the (unique) vector space operations on 𝒜 that make the bijection I : H# → 𝒜 defined by φ ↦ φ⁻¹(1) into a vector space isomorphism. Note that φ⁻¹(1) = ∅ if and only if φ = 0, so ∅ is the additive identity of (𝒜, +̂, ⋅̂) (because this is true of I⁻¹(∅) = 0 in H# and I is a vector space isomorphism). For every A ∈ 𝒜, let ker A = H if A = ∅ and let ker A = A − A otherwise; if A = I(φ) = φ⁻¹(1) then ker A = ker φ, so this definition is consistent with the usual definition of the kernel of a linear functional. Say that A, B ∈ 𝒜 are parallel if ker A = ker B, where if A and B are not empty then this happens if and only if the linear functionals I⁻¹(A) and I⁻¹(B) are non-zero scalar multiples of each other. The vector space operations on the vector space of affine hyperplanes 𝒜 are now described in a way that involves only the vector space operations on H; this results in an interpretation of the vector space operations on the algebraic dual space H# that is entirely in terms of affine hyperplanes. Fix hyperplanes A, B ∈ 𝒜. If s is a scalar then s ⋅̂ A = {h ∈ H : s h ∈ A}. Describing the operation A +̂ B in terms of only the sets A = φ⁻¹(1) and B = ψ⁻¹(1) is more complicated because by definition, A +̂ B = I(φ) +̂ I(ψ) := I(φ + ψ) = (φ + ψ)⁻¹(1). If A = ∅ (respectively, if B = ∅) then A +̂ B is equal to B (resp. is equal to A), so assume that A ≠ ∅ and B ≠ ∅. The hyperplanes A and B are parallel if and only if there exists some scalar r (necessarily non-zero) such that A = r ⋅̂ B, in which case A +̂ B = {h ∈ H : (1 + r) h ∈ B}; this can optionally be subdivided into two cases: if r = −1 (which happens if and only if the linear functionals I⁻¹(A) and I⁻¹(B) are negatives of each other) then A +̂ B = ∅, while if r ≠ −1 then A +̂ B = (1/(1 + r)) B = (r/(1 + r)) A. Finally, assume now that ker A ≠ ker B.
Then A +̂ B is the unique affine hyperplane containing both A ∩ ker B and B ∩ ker A as subsets; explicitly, ker(A +̂ B) = span((A ∩ ker B) − (B ∩ ker A)) and A +̂ B = (A ∩ ker B) + ker(A +̂ B) = (B ∩ ker A) + ker(A +̂ B). To see why this formula for A +̂ B should hold, consider H := ℝ³, A := φ⁻¹(1), and B := ψ⁻¹(1), where φ(x, y, z) := x and ψ(x, y, z) := x + y (or alternatively, ψ(x, y, z) := y). Then by definition, A +̂ B := (φ + ψ)⁻¹(1) and ker(A +̂ B) := (φ + ψ)⁻¹(0). Now A ∩ ker B = φ⁻¹(1) ∩ ψ⁻¹(0) ⊆ (φ + ψ)⁻¹(1) is an affine subspace of codimension 2 in H (it is equal to a translation of the z-axis {(0, 0)} × ℝ). The same is true of B ∩ ker A. Plotting an x-y-plane cross section (that is, setting z equal to a constant) of the sets ker A, ker B, A, and B (each of which will be plotted as a line), the set (φ + ψ)⁻¹(1) will then be plotted as the (unique) line passing through A ∩ ker B and B ∩ ker A (which will be plotted as two distinct points), while (φ + ψ)⁻¹(0) = ker(A +̂ B) will be plotted as the line through the origin that is parallel to A +̂ B = (φ + ψ)⁻¹(1). The above formulas for ker(A +̂ B) := (φ + ψ)⁻¹(0) and A +̂ B := (φ + ψ)⁻¹(1) follow naturally from the plot and they also hold in general.
  4. Showing that there is a non-zero vector v in K relies on the continuity of ϕ and the Cauchy completeness of H. This is the only place in the proof in which these properties are used.
  5. Technically, H = K ⊕ K⊥ means that the addition map K × K⊥ → H defined by (k, p) ↦ k + p is a surjective linear isomorphism and homeomorphism. See the article on complemented subspaces for more details.
  6. The usual notation for plugging an element g into a linear map F is F(g) and sometimes F g. Replacing F with ⟨h ∣ ·⟩ := Φh produces ⟨h ∣ ·⟩(g) or ⟨h ∣ ·⟩ g, which is unsightly (despite being consistent with the usual notation used with functions). Consequently, g is instead written after the bar, so that the notation ⟨h ∣ g⟩ is used to denote this value (Φh)(g).
  7. The notation ⟨z ∣ A(·)⟩_Z denotes the continuous linear functional defined by g ↦ ⟨z ∣ A g⟩_Z.
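The ℝ³ example in note 3 above lends itself to a direct check. The following sketch (illustrative only) uses φ(x, y, z) = x and ψ(x, y, z) = x + y as in the note and verifies that sample points of A ∩ ker B and B ∩ ker A lie on the hyperplane A +̂ B = (φ + ψ)⁻¹(1).

```python
# Sketch of the R^3 example from note 3: with phi(x,y,z) = x and
# psi(x,y,z) = x + y, the sets A ∩ ker(psi) and B ∩ ker(phi) are both
# contained in the affine hyperplane (phi + psi)^{-1}(1) = A +^ B.

phi = lambda p: p[0]
psi = lambda p: p[0] + p[1]

# A ∩ ker B: phi = 1 and psi = 0, i.e. points (1, -1, t) - a translated z-axis.
# B ∩ ker A: phi = 0 and psi = 1, i.e. points (0, 1, t).
for t in (-1.0, 0.0, 2.5):
    a = (1.0, -1.0, t)   # lies in A = phi^{-1}(1) and in ker(psi)
    b = (0.0, 1.0, t)    # lies in B = psi^{-1}(1) and in ker(phi)
    assert phi(a) == 1 and psi(a) == 0
    assert psi(b) == 1 and phi(b) == 0
    # Both families of points lie on A +^ B := (phi + psi)^{-1}(1).
    assert phi(a) + psi(a) == 1
    assert phi(b) + psi(b) == 1
```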

Proofs

  1. This is because x_K = x − (⟨x, f_φ⟩ / ‖f_φ‖²) f_φ. Now use ‖f_φ‖² = ‖φ‖² and ⟨x, f_φ⟩ = φ(x), and solve for f_φ.
  2. ⟨A* A z ∣ h⟩ = ⟨A z ∣ A h⟩_H = ⟨Φ A h, Φ A z⟩_{H*}, where Φ A h := ⟨A h ∣ ·⟩ and Φ A z := ⟨A z ∣ ·⟩. By definition of the adjoint, ⟨A* h ∣ A* z⟩ = ⟨h ∣ A A* z⟩, so taking the complex conjugate of both sides proves that ⟨A A* z ∣ h⟩ = ⟨A* z ∣ A* h⟩. From A* = Φ⁻¹ ∘ ᵗA ∘ Φ, it follows that ⟨A A* z ∣ h⟩_H = ⟨A* z ∣ A* h⟩_H = ⟨Φ⁻¹ ᵗA Φ z ∣ Φ⁻¹ ᵗA Φ h⟩_H = ⟨ᵗA Φ h, ᵗA Φ z⟩_{H*}, where (ᵗA Φ) h = ⟨h ∣ A(·)⟩ and (ᵗA Φ) z = ⟨z ∣ A(·)⟩.

Bibliography