Hilbert Space
This study note is based on a few sources
- Applied Analysis by Hunter and Nachtergaele (hereafter abbreviated as H.N.).
- Frames and Riesz Bases in Hilbert Space
Spaces Studied in H.N.
In their book, H.N. study, in increasing specificity: topological spaces $\rightarrow$ metric spaces $\rightarrow$ normed linear (vector) spaces $\rightarrow$ Banach spaces $\rightarrow$ Hilbert spaces.
The Hilbert space is of particular importance to my field of research (Electrical Engineering, or perhaps engineering in general). A Hilbert space adds an inner product to a Banach space and thereby endows itself with some very nice geometric properties, which are central to fields like Digital Signal Processing.
In this note-to-self, I document some useful definitions and properties of Hilbert spaces, with particular emphasis on the representation of signals in a Hilbert space.
Hilbert Space $\mathcal{H}$
A Hilbert space is a complete inner product space.
Note that an inner product space is, by definition, a linear space equipped with an inner product. Therefore, the crucial properties of a Hilbert space are:
completeness, linearity, and an inner product (which also uniquely defines a norm).
In the following subsections, we will dissect this definition into its components.
Component 1 - Completeness
A metric space $(X,d)$ (not defined here) is complete if every Cauchy sequence in $X$ converges to a limit in $X$.
To fully understand this definition we require the following concepts:
A sequence of real numbers $(x_{n})$ converges to $x\in \mathbb{R}$ if for every $\epsilon > 0$, there is $N(\epsilon)\in \mathbb{N}$ such that $|x_{n}-x|<\epsilon, \forall n\geq N$. The point $x$ is the limit of $(x_{n})$.
An interpretation of this concept is that, if we draw a ball $B_{\epsilon}(x)$ centered at $x$ of radius $\epsilon$, then the sequence $(x_{n})$ will not only enter the ball, it will stay inside it for all $n\geq N$.
A sequence $(x_{n})$ is a Cauchy sequence if for every $\epsilon >0$, there is an $N(\epsilon)\in \mathbb{N}$ such that $|x_{m}-x_{n}|<\epsilon, \forall m,n \geq N$
Naturally, we would expect a Cauchy sequence to also be convergent, since the terms getting arbitrarily close together suggests the sequence should settle toward something stationary (the limit). Critically, however, whether such a limit exists in the same space as the sequence is the defining characteristic of a complete metric space.
Every convergent sequence is a Cauchy sequence. If the converse is also true, then the space the sequences live in is _complete_.
Every Cauchy sequence of real numbers is bounded.
The reason complete spaces are analytically useful is that we can take a Cauchy sequence $(x_{n})$ in the space and be guaranteed that its limit, w.r.t. the space's own metric, is attainable within the same space. Conversely, we can pick a suitable complete space, construct a sequence of approximations to the solution of our problem, and be guaranteed that the exact solution is reached by these approximations within the space. This comes in very handy because proving that a sequence is Cauchy is often a MUCH easier task than showing directly that it converges in the same space.
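As a concrete sketch of this distinction (the choice of iteration is mine, not from H.N.), the Babylonian iteration for $\sqrt{2}$ produces a sequence that lives entirely in $\mathbb{Q}$ and is Cauchy there, yet its limit $\sqrt{2}$ lies outside $\mathbb{Q}$; this is exactly why $\mathbb{Q}$ is not complete while $\mathbb{R}$ is:

```python
from fractions import Fraction

# Babylonian iteration x_{n+1} = (x_n + 2/x_n) / 2: every term is rational,
# and the sequence is Cauchy, but its limit sqrt(2) is irrational.
# This is the classic witness that Q is incomplete while R is complete.
x = Fraction(1)
terms = [x]
for _ in range(6):
    x = (x + 2 / x) / 2
    terms.append(x)

# Cauchy-ness in practice: successive terms get arbitrarily close together...
gaps = [abs(float(terms[n + 1] - terms[n])) for n in range(len(terms) - 1)]
assert all(gaps[i + 1] < gaps[i] for i in range(len(gaps) - 1))

# ...and the terms approach sqrt(2), which no Fraction can equal exactly.
assert abs(float(terms[-1]) ** 2 - 2) < 1e-12
assert terms[-1] ** 2 != 2   # exact rational arithmetic: the limit is never attained in Q
```

Proving the Cauchy property here needs only the recursion itself; identifying the limit requires already knowing about $\mathbb{R}$, which mirrors the point above.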
Component 2 - Linear
A linear (vector) space $X$ over the scalar field $\mathbb{C}$ satisfies the following axioms:
- Additive Commutativity $$ \forall x,y,z \in X, x+y = y+x $$
- Additive Associativity $$ \forall x,y,z \in X, x+(y+z) = (x+y)+z$$
- Additive Identity $0$ $$\exists\, 0 \in X \text{ s.t. } x+0 = x, \forall x\in X $$
- Additive Inverse $-x$ $$\text{for each } x \in X, \exists!\, (-x) \text{ s.t. } x+(-x)=0 $$
- Scalar Multiplicative Identity $1$ $$\forall x \in X, 1x = x $$
- Scalar Multiplication Distributivity w.r.t. Field Addition $$\forall x \in X, \lambda, \mu \in \mathbb{C}, (\lambda+\mu)x = \lambda x + \mu x $$
- Scalar Multiplication Distributivity w.r.t. Vector Addition $$\forall x,y \in X, \lambda \in \mathbb{C}, \lambda(x+y) = \lambda x + \lambda y $$
- Scalar Multiplication Compatibility with Field Multiplication $$\forall x \in X, \lambda, \mu \in \mathbb{C}, \lambda(\mu x) = (\lambda \mu)x$$
NOTE: The two distributivity conditions can be combined into the single condition \[\forall x,y \in X, \lambda, \mu \in \mathbb{C}, (\lambda+\mu)(x+y) = \lambda x + \lambda y + \mu x + \mu y \]
Component 3 - Inner Product
An inner product $(\cdot,\cdot): X \times X \rightarrow \mathbb{C}$ satisfies the following axioms:
- Linear in the second argument: $$(x,\lambda y+\mu z) = \lambda(x,y)+\mu(x,z)$$
- Hermitian (Conjugate) symmetric: $$(y,x) = \overline{(x,y)}$$
- Nonnegative: $$(x,x) \geq 0$$
- Positive Definite: $$(x,x)=0 \leftrightarrow x=0$$
Note 1: $(\cdot,\cdot)$ is antilinear in the first term. i.e. \[(\lambda x+\mu y,z) = \bar{\lambda}(x,z)+\bar{\mu}(y,z)\]
Note 2: The notation $\langle \cdot,\cdot \rangle$ is equivalent to $(\cdot,\cdot)$ but is reserved for Hilbert spaces in the H.N.’s discussion. This is to differentiate the inner-product in pre-Hilbert Spaces (Inner product space) and the inner-product in Hilbert spaces (complete inner product space).
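These axioms are easy to sanity-check numerically. The sketch below (my own, using `numpy.vdot`, which conjugates its first argument and therefore matches the antilinear-in-first convention above) verifies each property on random vectors in $\mathbb{C}^5$:

```python
import numpy as np

rng = np.random.default_rng(0)
# Random complex vectors in C^5 standing in for elements of the space.
x, y, z = (rng.normal(size=5) + 1j * rng.normal(size=5) for _ in range(3))
lam, mu = 2 - 1j, 0.5 + 3j

ip = np.vdot  # np.vdot conjugates its FIRST argument: (x, y) = sum(conj(x_i) * y_i)

# Linear in the second argument
assert np.isclose(ip(x, lam * y + mu * z), lam * ip(x, y) + mu * ip(x, z))
# Antilinear in the first argument (Note 1)
assert np.isclose(ip(lam * x + mu * y, z),
                  np.conj(lam) * ip(x, z) + np.conj(mu) * ip(y, z))
# Hermitian symmetry
assert np.isclose(ip(y, x), np.conj(ip(x, y)))
# Nonnegativity: (x, x) is real and >= 0
assert ip(x, x).real >= 0 and np.isclose(ip(x, x).imag, 0)
# Positive definiteness holds trivially for the zero vector
assert np.isclose(ip(np.zeros(5), np.zeros(5)), 0)
```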
Orthogonality & Projection Theorem
The introduction of the inner product gives Hilbert space a geometric interpretation. We are now able to discuss the concept of orthogonality: two elements $x, y \in \mathcal{H}$ are orthogonal, written $x \perp y$, if $\langle x, y\rangle = 0$.
To me, one of the most useful concepts enabled by $(\cdot,\cdot)$ is projection. It is important because it is central to many theorems in Signal Processing. In particular, consider the pre-filtering operation in Shannon's sampling procedure, where a signal $u(t)$ of bandwidth, say, $100\,Hz$ is first convolved with an ideal low-pass filter $h(t)$ of bandwidth, say, $50\,Hz$ before being passed through the sampler to obtain the discrete-time signal $\tilde{u}[n]$. The result of the filtering operation, $\tilde{u}(t)$, has a bandwidth of only $50\,Hz$, and the higher-frequency information in the $[50,100]\,Hz$ range is lost even if enough samples are collected and $\tilde{u}(t)$ is faithfully recovered from the sampler output $\tilde{u}[n]$.
If we consider the above phenomenon geometrically, we note that the set of $50\,Hz$-bandlimited signals forms a closed subspace of the larger space (say, $L^2$ signals of bandwidth $100\,Hz$). By low-pass filtering the signal $u(t)$, we are projecting a signal in the original space $\mathcal{H}$ onto this subspace. As such, all the information in the null space of the projection operator is irrecoverable. Nevertheless, we know that the projected signal $\tilde{u}(t)$ is the best approximation of $u(t)$ under the constraint of $50\,Hz$ bandwidth.
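A minimal numerical sketch of this picture (entirely illustrative: a length-256 vector with an ideal DFT low-pass stands in for the $100\,Hz$/$50\,Hz$ story above). Filtering is idempotent, the discarded residual is orthogonal to the bandlimited subspace, and the filtered signal is the closest bandlimited approximation:

```python
import numpy as np

rng = np.random.default_rng(1)
N = 256
u = rng.normal(size=N)  # a generic real "signal" (stand-in, not actual 100 Hz audio)

def lowpass(v, keep=32):
    """Ideal low-pass: keep only DFT bins with |frequency| <= keep/N."""
    V = np.fft.fft(v)
    mask = np.abs(np.fft.fftfreq(len(v))) <= keep / len(v)
    return np.fft.ifft(np.where(mask, V, 0)).real

u_lp = lowpass(u)
# Low-pass filtering is a projection: applying it twice changes nothing
assert np.allclose(lowpass(u_lp), u_lp)
# The discarded high-frequency residual is orthogonal to the bandlimited part
assert abs(np.dot(u - u_lp, u_lp)) < 1e-8
# u_lp is the closest bandlimited signal: any other member of the subspace is farther
v = lowpass(rng.normal(size=N))
assert np.linalg.norm(u - v) >= np.linalg.norm(u - u_lp)
```

The orthogonality of the residual is exactly the Projection Theorem's second statement, specialized to bandlimited subspaces.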
Formally, we re-state the last comment above as the Projection Theorem. Let $\mathcal{M}$ be a closed linear subspace of a Hilbert space $\mathcal{H}$:
- For each $x\in \mathcal{H}$, there is a unique closest point $y\in \mathcal{M}$ such that $$\|x-y\| = \min_{z\in \mathcal{M}} \|x-z\|$$
- The point $y\in \mathcal{M}$ closest to $x\in \mathcal{H}$ is the unique element of $\mathcal{M}$ with the property that $(x-y)\perp \mathcal{M}$
Note: The uniqueness of the closest point relies on the convexity of the subspace $\mathcal{M}$. Indeed, since a linear subspace is closed under addition and scalar multiplication, for $x,y\in \mathcal{M}$ and $0\leq t \leq 1$ we necessarily have \(tx + (1-t)y \in \mathcal{M}\).
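Both statements of the theorem can be verified numerically (a hypothetical example of mine, with $\mathcal{M}$ the column span of a random $6\times 3$ matrix in $\mathbb{R}^6$); the closest point is found by enforcing the orthogonality condition, i.e. the normal equations:

```python
import numpy as np

rng = np.random.default_rng(2)
# M = column span of a random 6x3 matrix: a closed subspace of R^6.
A = rng.normal(size=(6, 3))
x = rng.normal(size=6)

# Orthogonal projection onto M via the normal equations A^T (x - A c) = 0.
c = np.linalg.solve(A.T @ A, A.T @ x)
y = A @ c                       # the projection theorem's unique closest point

# (x - y) is orthogonal to M (checking the spanning columns suffices)
assert np.allclose(A.T @ (x - y), 0)

# y minimizes the distance: any other point z in M is at least as far from x
z = A @ rng.normal(size=3)
assert np.linalg.norm(x - z) >= np.linalg.norm(x - y)
```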
Orthonormal Basis, Riesz Basis, Frame
As alluded to above, a function $x\in \mathcal{H}$ can be represented by a sum of its projections onto a complete orthonormal set $U = \{u_{\alpha}: \alpha \in I\}$ (an orthonormal basis).
Here, completeness is achieved by satisfying any one of the following five equivalent conditions:
- $\langle u_{\alpha},x\rangle = 0$ for all $\alpha \in I$ implies $x = 0$
- $x = \sum_{\alpha \in I} \langle u_{\alpha},x\rangle u_{\alpha}$ for all $x \in \mathcal{H}$
- $\|x\|^2 = \sum_{\alpha \in I} |\langle u_{\alpha},x\rangle|^2$ for all $x \in \mathcal{H}$
- $[U] = \mathcal{H}$ where $[U]$ is defined as $\{\sum_{u\in U} c_u \cdot u\}$ with $ c_u \in \mathbb{C}$ and $\sum_{u \in U} c_u \cdot u$ converges unconditionally
- $U$ is a maximal orthonormal set; that is, no nonzero element of $\mathcal{H}$ is orthogonal to every element of $U$
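A quick numerical check of conditions 2 and 3 (my own sketch, using the columns of a unitary matrix as an orthonormal basis of $\mathbb{C}^8$):

```python
import numpy as np

rng = np.random.default_rng(3)
n = 8
# Columns of a unitary Q form an orthonormal basis {u_k} of C^n.
Q, _ = np.linalg.qr(rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n)))
x = rng.normal(size=n) + 1j * rng.normal(size=n)

coeffs = Q.conj().T @ x          # <u_k, x> for each basis vector u_k
# Condition 2 (expansion): x = sum_k <u_k, x> u_k
assert np.allclose(Q @ coeffs, x)
# Condition 3 (Parseval): ||x||^2 = sum_k |<u_k, x>|^2
assert np.isclose(np.linalg.norm(x) ** 2, np.sum(np.abs(coeffs) ** 2))
# Condition 1 follows here from invertibility of Q: all coefficients zero forces x = 0.
```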
To generalize the notion of orthonormal basis, we can remove the orthonormality constraint on the basis vectors, which gives us the Riesz basis: a sequence $\{u_k\}$ with $[\{u_k\}] = \mathcal{H}$ is a Riesz basis if there exist constants $0 < A \leq B$ such that $$A\|c\|_{\ell^2}^2 \leq \Big\|\sum_{k} c_k u_k\Big\|^2 \leq B\|c\|_{\ell^2}^2$$ for every coefficient sequence $c = \{c_k\} \in \ell^2$.
The Riesz basis vectors, though not orthogonal, are still linearly independent. This can be seen by observing the lower bound in the inequality: when $\sum_{k} c_k u_k = 0$, we have $\|c\|_{\ell^2}^2 = 0$, implying that $c=0$. Therefore, we can still write every element $x \in \mathcal{H}$ as a linear combination of projections onto the basis vectors.
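Numerically, the Riesz bounds $A, B$ can be taken as the extreme eigenvalues of the Gram matrix $G_{jk} = \langle u_j, u_k\rangle$, since $\|\sum_k c_k u_k\|^2 = c^{\ast} G c$. The sketch below (an assumed finite-dimensional example of mine) checks the inequality and that a positive lower bound forces linear independence:

```python
import numpy as np

rng = np.random.default_rng(4)
n = 5
U = rng.normal(size=(n, n))          # columns u_k: independent but NOT orthogonal
assert np.linalg.matrix_rank(U) == n

G = U.T @ U                          # Gram matrix G_jk = <u_j, u_k>
eigs = np.linalg.eigvalsh(G)         # ascending order
A, B = eigs[0], eigs[-1]             # Riesz bounds for this finite system
assert A > 0                         # positive lower bound <=> linear independence

# Check A ||c||^2 <= ||sum_k c_k u_k||^2 <= B ||c||^2 on a random coefficient vector
c = rng.normal(size=n)
norm2 = np.linalg.norm(U @ c) ** 2
assert A * np.dot(c, c) - 1e-9 <= norm2 <= B * np.dot(c, c) + 1e-9
```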
To relax the definition further, we can remove even the linear independence constraint. Since the vectors $\{u_k\}$ are then no longer linearly independent, they no longer form a basis. Instead, they are called a Frame of the Hilbert space $\mathcal{H}$: a sequence $\{u_k\}$ is a frame if there exist constants $0 < A \leq B$ such that $$A\|x\|^2 \leq \sum_k |\langle u_k, x\rangle|^2 \leq B\|x\|^2, \quad \text{for all } x \in \mathcal{H}$$
Note that since we did not stipulate even linear independence of the frame vectors $\{u_k\}$, a frame is often used to form an over-complete representation of the signal, meaning the projections of the signal $x(t)$ onto the frame, $\{\langle u_k, x \rangle\}$, can contain redundant measurements of the original signal. This is, however, desirable in cases where the input may be noisy: with an over-complete representation, the reconstruction can be robust against noise concentrated along certain subsets of the frame vectors.
We observe the following relationships between the three sets of vectors:
| | Frame Inequality | Linear Independence | Orthonormal |
|---|---|---|---|
| Frame | Y | N | N |
| Riesz Basis | Y | Y | N |
| Orthonormal Basis | Y | Y | Y |
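As a small worked example of a frame that is neither linearly independent nor orthonormal, consider the classic "Mercedes-Benz" frame: three unit vectors in $\mathbb{R}^2$ at $120^{\circ}$ apart. It is a tight frame ($A = B = 3/2$), so reconstruction needs only the scaled vectors $\frac{2}{3}u_k$, and the redundancy shrinks coefficient noise on synthesis:

```python
import numpy as np

# Mercedes-Benz frame: 3 unit vectors at 120 degrees, a tight frame for R^2
# with frame bounds A = B = 3/2 (redundancy: 3 vectors for 2 dimensions).
angles = np.pi / 2 + np.array([0, 2 * np.pi / 3, 4 * np.pi / 3])
U = np.stack([np.cos(angles), np.sin(angles)])    # 2x3, columns u_k

S = U @ U.T                                       # frame operator as a 2x2 matrix
assert np.allclose(S, 1.5 * np.eye(2))            # tight: S = (3/2) I

x = np.array([0.7, -1.2])
coeffs = U.T @ x                                  # analysis: <u_k, x>
assert np.allclose((2 / 3) * U @ coeffs, x)       # synthesis with the scaled frame

# Redundancy helps with noise: perturbing the 3 coefficients and re-synthesizing
# projects the noise down to R^2, strictly shrinking it for this frame.
e = np.array([0.1, -0.05, 0.08])                  # noise on the redundant coefficients
assert np.linalg.norm((2 / 3) * U @ e) < np.linalg.norm(e)
```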
Analysis and Synthesis
The concept of a frame is often invoked in conjunction with two operators:
- analysis operator \(T: \mathcal{H} \rightarrow \ell^2,\ x \mapsto \{c_k\},\ c_k =\langle u_k, x\rangle\)
- synthesis operator \(T^{\ast}: \ell^2 \rightarrow \mathcal{H},\ \{c_k\} \mapsto x,\ x = \sum_k c_k u_k\).
As the notation already suggests, the two operators are adjoints of each other. (Note that with the inner product linear in its second argument, the analysis coefficients must be $\langle u_k, x\rangle$, matching the orthonormal expansion above, so that $T$ is linear.)
Define the Frame Operator $S: \mathcal{H} \rightarrow \mathcal{H}$ as \(Sx = \sum_k \langle u_k, x \rangle u_k = T^{\ast} T x,\) which combined with the frame inequality gives \(A\|x\|^2 \leq \langle x, Sx \rangle \leq B\|x\|^2, \quad \text{for each } x \in \mathcal{H}.\) We note that the frame operator $S$ is self-adjoint, positive definite, and has positive upper/lower bounds.
It can be shown that $S$ is a bounded linear bijection, and its inverse $S^{-1}$ defines a dual frame \(\tilde{u}_k = S^{-1} u_k.\) Combining the frame and the dual frame, we get a faithful representation of a signal $x$ as follows: \(x= \sum_{k} \langle u_k, x\rangle \tilde{u}_k =\sum_{k} \langle \tilde{u}_k, x\rangle u_k\)
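For a non-tight frame, the dual frame must actually be computed through $S^{-1}$. The sketch below (an assumed example of mine: five random vectors as a frame for $\mathbb{R}^3$) verifies both reconstruction formulas:

```python
import numpy as np

rng = np.random.default_rng(5)
# 5 generic vectors in R^3: a redundant, non-tight frame (columns of U)
U = rng.normal(size=(3, 5))
S = U @ U.T                                # frame operator: S x = sum_k <u_k, x> u_k
assert np.all(np.linalg.eigvalsh(S) > 0)   # S invertible => {u_k} spans R^3

U_dual = np.linalg.solve(S, U)             # dual frame: u~_k = S^{-1} u_k (columns)

x = rng.normal(size=3)
# Both reconstruction formulas from the text
assert np.allclose(U_dual @ (U.T @ x), x)  # x = sum_k <u_k, x> u~_k
assert np.allclose(U @ (U_dual.T @ x), x)  # x = sum_k <u~_k, x> u_k
```

The second identity works because $S$ (and hence $S^{-1}$) is self-adjoint, so $\langle u_k, S^{-1}x\rangle = \langle S^{-1}u_k, x\rangle$.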
Dual Space, Riesz Representation Theorem & Adjoint Operator
The Riesz Representation Theorem is one of the fundamental facts about Hilbert spaces. Before presenting it, we need the concepts of a bounded linear functional and the dual space. A linear functional $\varphi: \mathcal{H} \rightarrow \mathbb{C}$ is bounded if there is a constant $M$ such that $|\varphi(x)| \leq M\|x\|$ for all $x \in \mathcal{H}$; the space of all bounded linear functionals on $\mathcal{H}$ is the dual space $\mathcal{H}^{\ast}$.
With the above definitions, we present the following important theorem: for every $\varphi \in \mathcal{H}^{\ast}$, there is a unique $y \in \mathcal{H}$ such that $\varphi(x) = \langle y, x\rangle$ for all $x \in \mathcal{H}$.
The Riesz Representation Theorem is useful in many applications and also has an important consequence regarding the existence and uniqueness of the adjoint of a bounded operator. Specifically, a bounded operator $A: \mathcal{H} \rightarrow \mathcal{H}$ on a Hilbert space has a unique adjoint operator $A^{\ast}:\mathcal{H} \rightarrow \mathcal{H}$ such that \[\langle x, Ay\rangle = \langle A^{\ast}x, y\rangle, \quad \text{for all } x,y\in \mathcal{H}.\]
Naturally, this gives us self-adjoint operators $A$, with $A^{\ast} = A$, which correspond to symmetric matrices when $\mathcal{H} = \mathbb{R}^n$ and Hermitian matrices when $\mathcal{H} = \mathbb{C}^n$.
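On $\mathcal{H} = \mathbb{C}^n$ the adjoint is simply the conjugate transpose; a quick check (my own example with random matrices and vectors):

```python
import numpy as np

rng = np.random.default_rng(6)
n = 4
A = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
x = rng.normal(size=n) + 1j * rng.normal(size=n)
y = rng.normal(size=n) + 1j * rng.normal(size=n)

A_star = A.conj().T                 # on C^n the adjoint is the conjugate transpose
# <x, A y> = <A* x, y>   (np.vdot conjugates its first argument)
assert np.isclose(np.vdot(x, A @ y), np.vdot(A_star @ x, y))

# A self-adjoint example: H = A + A* satisfies H* = H (a Hermitian matrix)
H = A + A_star
assert np.allclose(H, H.conj().T)
```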
Other Important Theorems
Two notes regarding unitary operators:
- Two Hilbert spaces $\mathcal{H}_1, \mathcal{H}_2$ are isomorphic if there is a unitary linear map between them.
- A linear map $U: \mathcal{H} \rightarrow \mathcal{H}$ is unitary if and only if $U^{\ast}U = UU^{\ast} = I$
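Both statements can be illustrated with the normalized DFT matrix, a standard unitary map on $\mathbb{C}^n$ (example mine): it satisfies $U^{\ast}U = UU^{\ast} = I$ and, consequently, preserves inner products and norms.

```python
import numpy as np

n = 8
# The normalized DFT matrix is unitary: F* F = F F* = I
F = np.fft.fft(np.eye(n), axis=0) / np.sqrt(n)
assert np.allclose(F.conj().T @ F, np.eye(n))
assert np.allclose(F @ F.conj().T, np.eye(n))

rng = np.random.default_rng(7)
x = rng.normal(size=n) + 1j * rng.normal(size=n)
y = rng.normal(size=n) + 1j * rng.normal(size=n)
# Unitary maps preserve inner products (hence norms): <Fx, Fy> = <x, y>
assert np.isclose(np.vdot(F @ x, F @ y), np.vdot(x, y))
```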