Chapter 13: Lie’s Third Theorem and Differential Equations in a Less-Than-Full-Rank Lie Algebra

So far our discussion has been almost wholly about the Lie group and not much about the Lie algebra, aside from the idea that the latter encodes the group product fully at a local level: two Lie groups with the same Lie algebra are locally – but not needfully globally – isomorphic. We know the two basic properties of a real Lie algebra:

  1. It is a finite dimensional vector space over the reals;
  2. It is closed under the binary Lie bracket operation, which is any binary operation that is bilinear, skew-symmetric and fulfils the Jacobi identity.

Every such mathematical system is the Lie algebra of some Lie group. This is Lie’s third theorem.

Theorem 13.1: (Lie’s Third Theorem)

Every abstract finite dimensional ($N$-dimensional) Lie algebra $\h$ over $\R$ (i.e. one whose underlying vector space is a vector space over the reals) is the Lie algebra of some finite dimensional connected Lie group $\H$, i.e. $\h=\operatorname{Lie}(\H)$.

Proof: By Ado's theorem ([Tao (2011)], [De Graaf (2014)]), $\h$ has a faithful linear representation, i.e. is isomorphic as a Lie algebra to at least one real Lie algebra of $M\times M$ square matrices ($M$ not related to $N$, aside from that $M$ must be big enough that the dimension of the Lie algebra is $N$). Once we have such an algebra $\h$ of matrices, we consider the set $\Nid$ of all exponentials $e^X$ of members $X\in\h$ which are small enough that the matrix Mercator-Newton series for the logarithm converges for $e^X$; that is, $\left\|e^X-\id\right\|<1$; and we then consider the smallest matrix group $\H=\bigcup\limits_{k=1}^\infty\,\Nid^k$ generated by $\Nid$. We then apply the Lie Correspondence, thinking of the algebra $\h\subset\g$ as a Lie subalgebra of the Lie algebra $\g=\mathfrak{gl}(M,\,\R)$ of the general matrix group $\G=GL(M,\,\R)$. That is, we now apply Theorem 10.5, Lemma 10.6 and Theorem 12.4 with $\G, \,\H,\,\g,\,\h$ as stated here to show that this smallest matrix group $\H=\bigcup\limits_{k=1}^\infty\,\Nid^k$ is a connected Lie group when we take the fundamental neighbourhood $\Nid$ to be as just described and the labeller map to be the matrix logarithm and, furthermore, that the Lie algebra of $\H$ is precisely $\h$ and not some bigger Lie algebra. $\quad\square$
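The nucleus $\Nid$ and its labeller map can be sketched numerically. Below is a minimal numpy sketch (the series implementations, the tolerances and the sample $\mathfrak{so}(3)$ element are my own choices, not from the text): we exponentiate a small algebra member $X$, check that $e^X$ lands inside the ball $\left\|e^X-\id\right\|<1$ where the Mercator-Newton series converges, and recover $X$ with that series.

```python
import numpy as np

def expm(X, terms=30):
    """Matrix exponential via its everywhere-convergent Taylor series."""
    out, term = np.eye(X.shape[0]), np.eye(X.shape[0])
    for k in range(1, terms):
        term = term @ X / k
        out = out + term
    return out

def logm_mercator(A, terms=60):
    """Matrix logarithm via the Mercator-Newton series
    log(I + B) = B - B^2/2 + B^3/3 - ..., valid only when ||A - I|| < 1."""
    B = A - np.eye(A.shape[0])
    out, P = np.zeros_like(B), np.eye(A.shape[0])
    for k in range(1, terms):
        P = P @ B
        out = out + ((-1) ** (k + 1)) * P / k
    return out

# A small element of so(3); its exponential lies in the nucleus N_id
X = 0.3 * np.array([[0., -1., 0.], [1., 0., 0.], [0., 0., 0.]])
A = expm(X)
assert np.linalg.norm(A - np.eye(3), 2) < 1        # inside the nucleus
assert np.allclose(logm_mercator(A), X, atol=1e-8) # labeller recovers X
```

For a larger $X$ the exponential drifts outside the unit ball about $\id$ and the series fails, which is why the construction only takes exponentials of sufficiently small algebra members and then builds $\H$ as products of nucleus members.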

Let’s look at a beautiful corollary, which is good background for a second proof of Lie’s third theorem which doesn’t call on Ado’s theorem.

Theorem 13.2: (Corollary of Lie’s Third Theorem)

Every abstract finite dimensional ($N$-dimensional) Lie algebra $\g$ over $\R$ corresponds to precisely one simply connected Lie group $\tilde{\G}$. In other words, the functor $\operatorname{Lie}: SimConLie\to ReaLieAlg$ between the category $SimConLie$ of simply connected finite dimensional Lie groups and the category $ReaLieAlg$ of finite dimensional Lie algebras over the field of reals is a fully faithful functor that is, in particular, a bijection on isomorphism classes.

Proof: Lie’s Third Theorem shows there is at least one connected Lie group $\G$ with Lie algebra $\g$; the procedure of Definition 14.20 shows how to construct the simply connected universal cover $\tilde{\G}$ of $\G$ and Corollary 14.17 shows that $\tilde{\G}$ is unique. $\quad\square$

So, with this concept of the unique, simply connected Lie group corresponding to any Lie algebra over the reals, we can talk about a second proof of Lie’s third theorem. One can give a direct existence proof of Lie’s Third Theorem, but be warned it is not at all helpful for visualising what such a beast might look like. The idea is the following: once we have a Lie algebra over $\R$, we have a unique notion of a group product in-the-small through the Campbell-Baker-Hausdorff Theorem. For the group in-the-large, we can define the Lie group members to be equivalence classes of formal products of the form $\prod\limits_{k=1}^M \exp(X_k)$.

Lie’s Third Theorem; Second Proof: Define $\g_\epsilon=\{X|\,X\in\g;\;\left\|X\right\|<\epsilon\}$ and then $\G_0=\{\prod\limits_{k=1}^M \exp(X_k)|\;M\in\mathbb{N};\;X_k\in\g_\epsilon\}$ where $\epsilon>0$ is small enough that the Campbell-Baker-Hausdorff series $\varphi_{CBH}:\g_\epsilon\times\g_\epsilon\to\g$ converges for all members of its domain. So $\G_0$ is the set of all formal finite products of the stated form. Then for each $X_k$, define a $C^0$ path $\sigma_k:\left[0,\,1\right]\to\g_\epsilon$ such that $\sigma_k(0)=\Or$ and $\sigma_k(1) = X_k$. We now define Lie group members to be equivalence classes $\G = \G_0/\sim$, where two formal products $\gamma_1 = \prod\limits_{k=1}^{M_1} \exp\left(X_{k,\,1}\right)$ and $\gamma_2 = \prod\limits_{k=1}^{M_2} \exp\left(X_{k,\,2}\right)$ are equivalent by $\sim: \G_0\times \G_0\to\{\text{true},\,\text{false}\}$ if and only if a finite sequence of (i) deformations of the endpoints $X_j\mapsto X_j^\prime$, $X_{j+1}\mapsto X_{j+1}^\prime$ where $\exp(X_j)\,\exp(X_{j+1})=\exp(X_j^\prime)\,\exp(X_{j+1}^\prime)$ as reckoned by the Campbell-Baker-Hausdorff formula and (ii) refinements of the path segmentation as described in the proof of Lemma 14.14 can map the sequence $\{X_{k,\,1}\}_{k=1}^{M_1}$ into $\{X_{k,\,2}\}_{k=1}^{M_2}$. This definition (i) irons out any inconsistency that might arise from an element’s having horribly many potential representations as different formal products and (ii) ensures the set of equivalence classes, as a set of homotopy classes, is simply connected by construction. It is now a simple matter to check that $\G=\G_0/\sim$ fulfils axioms 1 through 5. $\quad\square$
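The group product in-the-small invoked above can be checked numerically. A minimal sketch, with my own series implementations of $\exp$ and $\log$ (not from the text), verifying that for small $X,\,Y$ the local product law $\log(\exp X\,\exp Y)$ agrees with the Campbell-Baker-Hausdorff series truncated at second order, $X+Y+\tfrac{1}{2}[X,\,Y]$, up to a third-order error:

```python
import numpy as np

def expm(X, terms=30):
    out, term = np.eye(X.shape[0]), np.eye(X.shape[0])
    for k in range(1, terms):
        term = term @ X / k
        out = out + term
    return out

def logm(A, terms=80):
    """Mercator-Newton series; valid here since exp(X) exp(Y) is near identity."""
    B = A - np.eye(A.shape[0])
    out, P = np.zeros_like(B), np.eye(A.shape[0])
    for k in range(1, terms):
        P = P @ B
        out = out + ((-1) ** (k + 1)) * P / k
    return out

def bracket(X, Y):
    return X @ Y - Y @ X

# Two small so(3) elements, well inside the CBH convergence region g_epsilon
X = 0.05 * np.array([[0., -1., 0.], [1., 0., 0.], [0., 0., 0.]])
Y = 0.05 * np.array([[0., 0., 0.], [0., 0., -1.], [0., 1., 0.]])

Z  = logm(expm(X) @ expm(Y))       # the true local product law
Z2 = X + Y + 0.5 * bracket(X, Y)   # CBH truncated at second order
# The truncation error is third order in the size of X and Y
assert np.linalg.norm(Z - Z2) < np.linalg.norm(X, 2) ** 3
```

Halving the sizes of $X$ and $Y$ shrinks the residual by roughly a factor of eight, as befits the third-order CBH terms $\tfrac{1}{12}[X,[X,Y]] - \tfrac{1}{12}[Y,[X,Y]]$.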

In the light of Theorem 13.2, we see that the above construction realises the unique, simply connected Lie group with a given Lie algebra over the reals as its Lie algebra. However, without a matrix realisation, good luck to you in any quest trying to work out the group’s centre so as to characterise all Lie groups with a given Lie algebra! So, whilst interesting, the second proof is not particularly useful for other Lie theoretic work! I liken it to the Cayley theorem that every finite group is isomorphic to a subgroup of some group of permutations: something that sounds very powerful, but really not workable for use in other proofs or inferences in group theory.

Example 13.3: (The Cross Product Lie Algebra)

The vector space $\R^3$ kitted with the vector cross product is an abstract three dimensional Lie algebra over $\R$, for the cross product is (i) bilinear, in particular the cross product is distributive on both the left and right, (ii) skew-symmetric, since $X\times Y = -Y\times X$ and (iii) fulfils the Jacobi identity, as $A\times(B\times C) + B\times(C\times A) + C\times(A\times B) = 0$. Therefore, there is precisely one simply connected Lie group with this Lie algebra as its Lie algebra. It is indeed the group $SU(2)$, and when members of $\mathfrak{su}(2)\cong\mathfrak{so}(3)$ are represented as $3\times 1$ column vectors whose components are the superposition weights of the basis vector matrices $\hat{s}_x,\,\hat{s}_y,\,\hat{s}_z$, then the Lie bracket between them can be calculated as the cross product between the $3\times 1$ column vectors. $\mathfrak{su}(2)\cong\mathfrak{so}(3)$ is sometimes called the “cross product” Lie algebra. Ado’s theorem plays out very simply with the cross product algebra: since the algebra is centreless, the adjoint representation is faithful and its image is isomorphic to the algebra itself. So we simply convert the cross product with a given vector $X$, to wit the linear function $\ad(X):\R^3\to\R^3;\;\ad(X)(Y) = X\times Y$, to its matrix in the adjoint representation. Thus the basis matrices for the image algebra under the adjoint representation are the matrices $\hat{S}_j$ of Example 1.3, with linear combinations thereof exponentiating to rotation matrices in $SO(3)$.
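The adjoint representation of the cross product algebra is easy to exhibit concretely. A minimal numpy sketch (the helper name `hat` and the series `expm` are my own, not the text's notation): the matrix of $\ad(x)$ is the usual skew-symmetric "hat" matrix, the matrix commutator of two such matrices is the hat matrix of the cross product, and exponentials of them are rotation matrices in $SO(3)$.

```python
import numpy as np

def hat(x):
    """The matrix of ad(x) for the cross product algebra: hat(x) @ y = x × y."""
    return np.array([[0., -x[2], x[1]],
                     [x[2], 0., -x[0]],
                     [-x[1], x[0], 0.]])

def expm(X, terms=40):
    out, term = np.eye(3), np.eye(3)
    for k in range(1, terms):
        term = term @ X / k
        out = out + term
    return out

x, y = np.array([1., 2., 3.]), np.array([0.5, -1., 2.])
# The adjoint representation is a Lie algebra homomorphism: the matrix
# commutator of hat(x) and hat(y) equals hat of the cross product x × y
assert np.allclose(hat(x) @ hat(y) - hat(y) @ hat(x), hat(np.cross(x, y)))
# and ad(x) applied to y really is the cross product
assert np.allclose(hat(x) @ y, np.cross(x, y))
# Exponentials of these matrices are rotations: R^T R = I and det R = 1
R = expm(hat(np.array([0.1, 0.2, 0.3])))
assert np.allclose(R.T @ R, np.eye(3)) and np.isclose(np.linalg.det(R), 1.0)
```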

The Lie Correspondence is the first principle we shall use in exploring more deeply the Lie algebra’s structure, in telling Lie groups apart by differences between their Lie algebras. Here we shall look at a class of “problems” broadly described as Lie Theoretic Control Theory. This is precisely the kind of problem that first kindled my own interest in Lie theory and I shall begin with an informal description.

We think about a control system that can impart some kind of transformation on a system. We think of the set of all transformations as a group: the composition of two transformations is of course another transformation and we generally assume that if we can impart a transformation, then we can impart its inverse on the system. We think of a transformation $U(t)$ evolving continuously with time $t$: for small times, the transformations look roughly like the identity (no transformation), and evolve continuously away from the identity. We steer the transformation with a controller. If we think of the transformations $U\in\G$ as belonging to a Lie group $\G$, then we can idealise the controller and its steering action as follows. At any time, the transformation’s velocity must belong to the tangent space of the Lie group at $U(t)$, so that a general transformation evolution is described by $\d_t\,U(t) = U(t)\,X(t)$, where $X(t)\in\g$ and $\g=\operatorname{Lie}(\G)$ is the Lie algebra of $\G$. Here as always this equation is a literal matrix equation if $\G$ is a matrix Lie group, but otherwise, if $\G$ is a general rather than matrix Lie group, we take this equation in its “figurative” meaning, as the shorthand for $\left.\d_s \left(U(t)^{-1}\,U(t+s)\right)\right|_{s=0} = X(t)$. We can write this out in full as:

\begin{equation}\label{Controller}\left.\d_s \left(U(t)^{-1}\,U(t+s)\right)\right|_{s=0} = \sum_{j=1}^M x_j(t)\,\hat{X}_j;\quad U(0) = \id\end{equation}

where $\hat{X}_j$ are basis vectors for $\g$ and the $x_j(t)$, which are $C^0$ functions of time, describe the controller’s instantaneous output in steering the transformation. As discussed in Theorem 5.5, a $C^0$ $X(t)$ in the Lie group setting guarantees a unique $U(t)$, for the Lie group multiplication laws enforce uniqueness of the solutions that must exist by dint of the Peano Existence theorem. We can think of the $x_j$ as our “levers” or controller settings. If the $\hat{X}_j$ span the whole Lie algebra $\g$, then our discussion (Theorem 6.2) of geodesic co-ordinates makes it clear that, so long as each $x_j(t)$ can assume any value in a nonzero open interval $(-\epsilon,\,+\epsilon)$, no matter how small $\epsilon>0$ may be, then, given a long enough time interval, we can steer $U(t)$ to any value in $\exp(\g)$ and therefore to any value in the identity-connected component of $\G$. So hereafter we shall assume $\G$ is a connected Lie group, as no solution of $\eqref{Controller}$ can reach any transformation in a non-identity coset of the identity-connected component (otherwise such a transformation would be path connected to the identity by our solution, gainsaying its membership of that coset!).
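The evolution equation $\eqref{Controller}$ can be integrated numerically while staying exactly on the group, by right-multiplying by exponentials of the frozen tangent at each step (a "Lie-Euler" scheme). A minimal sketch for $\G=SO(3)$, with my own helper names `hat`, `expm` and `steer` and a constant controller output chosen for easy checking (for a constant tangent the scheme is exact, since all the factors commute):

```python
import numpy as np

def hat(x):
    """so(3) basis superposition: hat(x) @ y = x × y."""
    return np.array([[0., -x[2], x[1]],
                     [x[2], 0., -x[0]],
                     [-x[1], x[0], 0.]])

def expm(X, terms=40):
    out, term = np.eye(3), np.eye(3)
    for k in range(1, terms):
        term = term @ X / k
        out = out + term
    return out

def steer(x_of_t, T=1.0, n=500):
    """Lie-Euler integrator for d_t U = U X(t), U(0) = I: each step
    right-multiplies by the exponential of the frozen tangent X(t_k),
    so U never leaves the group."""
    U, dt = np.eye(3), T / n
    for k in range(n):
        U = U @ expm(dt * hat(x_of_t(k * dt)))
    return U

# Constant controller output x(t) = (0, 0, 0.7): the solution is exp(0.7 S_z)
U = steer(lambda t: np.array([0., 0., 0.7]))
assert np.allclose(U, expm(hat(np.array([0., 0., 0.7]))))
assert np.allclose(U.T @ U, np.eye(3))  # the integrator stays on SO(3)
```

For a time-varying $x(t)$ the scheme is only first-order accurate, but, unlike a naive Euler step $U \mapsto U + \d t\,U\,X$, it never drifts off the group manifold.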

However, what of the case where the $\hat{X}_j$ in $\eqref{Controller}$ do not span the whole Lie algebra? What if they span a vector subspace that is not a Lie algebra at all? This is quite possible. Not every tangent may be practical, nor even theoretically possible, given the constraints of the physics of a particular controller. I shall first prove a theorem that essentially answers what happens when we can’t steer with the full tangent space, and then talk about some interesting physics / control system problems that illustrate the theorem’s meaning. Note that in this theorem, unwontedly, we talk about $C^\omega$ (analytic, i.e. defined by convergent Taylor series) paths rather than $C^1$ paths as we do most of the time.

Theorem 13.4: (Steering a Lie Group Path Without the Full-Rank Tangent Space)

Let $\G$ be a connected Lie group, $\g$ its Lie algebra. Kit the appropriate sized nucleus $\Nid$ with geodesic co-ordinates so that the labeller map $\lambda = \log:\Nid\to\g$ and let $\sigma_j:[-1,\,1]\to\G;\;\sigma_j(0)=\id;\,\left.\d_\tau \sigma_j(\tau)\right|_{\tau=0}=\hat{X}_j;\;j=1,\,2,\,\cdots,\,M$ be $M$ $C^\omega$ paths (in the geodesic co-ordinates) through $\G$ with $\sigma_j(-\tau) = \sigma_j(\tau)^{-1}$. The $\hat{X}_j$ may or may not span the whole of $\g$; the typical situation is where they do not.

Let furthermore $\h$ be the smallest Lie algebra containing the $\hat{X}_j$; $\h=\g$ if the $\hat{X}_j$ span $\g$, otherwise $\h$ is the intersection of all Lie algebras containing the $\hat{X}_j$ and is the set of all entities that can be gotten from the $\hat{X}_j$ by finite sequences of linear (scaling and addition) operations and Lie bracket operations.

Then for every $X\in\h$ there is a finite concatenation of the basic paths defined by:

\begin{equation}\label{LieTheoreticControllerTheorem_1}\sigma:[-1,\,1]\to\G;\;\sigma(\tau) = \prod\limits_{k=1}^R\,\sigma_k(\alpha_k\,\tau)\end{equation}

such that $\left.\d_\tau \sigma(\tau)\right|_{\tau=0}=X$.

From this result, and from Theorem 3.9, it follows that every member of $\exp(\h)$ and thus every element of the Lie subgroup $\H = \bigcup\limits_{k=1}^\infty\exp(\h)^k$ corresponding to $\h\subset\g$ under the Lie Correspondence can be realised as a finite product of the form $\prod\limits_{k=1}^Q \sigma_{j(k)}(\tau_k)$, i.e. as a finite product of terms of the form $\sigma_j(\tau_k)$.

Proof:

Given two $C^\omega$ paths $\sigma_X:[-1,\,1]\to\G$, $\sigma_Y:[-1,\,1]\to\G$ through $\G$ with $\sigma_X(0)=\sigma_Y(0) = \id$, $\sigma_X(-\tau) = (\sigma_X(\tau))^{-1}$, $\sigma_Y(-\tau) = (\sigma_Y(\tau))^{-1}$ and with tangents $X = \left.\d_\tau\sigma_X(\tau)\right|_{\tau=0}$, $Y= \left.\d_\tau\sigma_Y(\tau)\right|_{\tau=0}$ to the identity, we first show how to realise a path with tangent $[X,\,Y]$ to the identity as such a finite concatenation. We assume, of course, that $[X,\,Y]$ is nonzero and linearly independent of $X$ and $Y$.

We have $\sigma_X(\tau) = \exp\left(\tau\,X + \sum\limits_{j=1}^N p_j(\tau)\,\hat{X}_j\right)$ and $\sigma_Y(\tau) = \exp\left(\tau\,Y + \sum\limits_{j=1}^N q_j(\tau)\,\hat{X}_j\right)$ where the $p_j(\tau)$ and $q_j(\tau)$ are analytic functions of $\tau$ comprising only second and higher powers of $\tau$ and we must allow for the possibility of all the $\hat{X}_j$ that span $\g$ being present as terms multiplied by second and higher powers of $\tau$. So now we consider the tangent at the identity to the family of $C^\omega$ paths, parameterised by the parameter $s$ as follows:

\begin{equation}\label{LieTheoreticControllerTheorem_2}\begin{array}{lcl}\sigma_s:[-1,\,1]\to\G;\,\sigma_s(\tau) &=& \sigma_X(s)\,\sigma_Y(\tau)\,\sigma_X(-s)\\ &=& \exp\left(\exp\left(\ad(X)\,s + \sum\limits_{j=1}^N p_j(s)\,\ad(\hat{X}_j)\right) \left(\tau Y + \sum\limits_{j=1}^N q_j(\tau)\,\hat{X}_j\right)\right)\end{array}\end{equation}

so that the tangent to a family member at the identity as a function of the parameter $s$ is:

\begin{equation}\label{LieTheoreticControllerTheorem_3}\begin{array}{lcl}T(s) &=& \exp\left(\ad(X)\,s + \sum\limits_{j=1}^N p_j(s)\,\ad(\hat{X}_j)\right)\, Y \\ &=& (1+\tilde{p}_0(s))\,Y + (s+\tilde{p}_1(s))\,\ad(X)\,Y + \sum\limits_{j=2}^{N-1} \tilde{p}_j(s)\,\hat{X}_j\end{array}\end{equation}

where we have absorbed all the higher powers $s^2 \ad(X)^2\,Y/2!,\,s^3 \ad(X)^3\,Y/3!,\,\cdots$ into the sum $\sum\limits_{j=2}^{N-1} \tilde{p}_j(s)\,\hat{X}_j$ with modified analytic co-efficient functions $\tilde{p}_j(s)$. We have also allowed for the possibility of the vector $\ad(X)\,Y$ showing up in the high order powers of $s$ by adding $\tilde{p}_1(s)$ to the multiplier $s$ of the term $\ad(X)\,Y$. We do this so as to remove any $\ad(X)\,Y$ component present in the high order powers. We do likewise with any $Y$ component present in the high order powers with the $\tilde{p}_0(s)$ term. Having removed these components from the high order powers, the sum $\sum\limits_{j=2}^{N-1} \tilde{p}_j(s)\,\hat{X}_j$ includes only vectors which are mutually linearly independent and also linearly independent from either $Y$ or $[X,\,Y]$. The co-efficient functions $\tilde{p}_j(s)$ are all different, because they include different powers of $s$; therefore, they too are linearly independent on any interval. Therefore, we can choose $M$ discrete $s_j$ which beget $M$ linearly independent tangent vectors $T(s_j)$, where $M\leq N$ is the total number of linearly independent vectors in the set $\{Y,\,[X,\,Y],\,\hat{X}_1,\,\hat{X}_2,\,\cdots,\,\ad(X)^2\,Y,\,\ad(X)^3\,Y,\,\cdots\}$. Otherwise put, we can choose precisely as many $s_j$ as there are linearly independent vectors in this set, to get the same number of linearly independent $T(s_j)$ in $\eqref{LieTheoreticControllerTheorem_3}$. Therefore, through Gaussian elimination, we can linearly combine the tangent vectors $T(s_j)$ so that their linear combination equals the term $[X,\,Y]$. That is, we can find superposition weights $\alpha_j$ such that:

\begin{equation}\label{LieTheoreticControllerTheorem_4}[X,\,Y] = \sum\limits_{k=1}^M\alpha_k\,T(s_k)\end{equation}

So, from Lemma 3.2, the path $\sigma(\tau) = \prod\limits_{k=1}^M\sigma_X(s_k)\,\sigma_Y(\alpha_k\,\tau)\,\sigma_X(-s_k)$ has the tangent $[X,\,Y]$ to the identity. Note that in most cases, not all the vectors $\hat{X}_j$ spanning the Lie algebra $\g$ are present in the sum in $\eqref{LieTheoreticControllerTheorem_3}$, so that the sum in $\eqref{LieTheoreticControllerTheorem_4}$ contains fewer ($M$) than $N$ terms. It may even be that none of these terms are there; the point is that $Y$ and $[X,\,Y]$ are guaranteed to be in $\eqref{LieTheoreticControllerTheorem_3}$ with the nonzero weights shown there, so that the procedure above can be summarised as: $T(s)$ includes $Y,\,[X,\,Y]$ and some other linearly independent vectors in the Lie algebra, with linearly independent co-efficient functions of $s$. Therefore, there exists a linear combination of some of the $T(s_k)$ that sums to any of these linearly independent vectors; thus, in particular, we can retrieve $Y$ or $[X,\,Y]$ in this way.

So if we have a set of $C^\omega$ paths with tangents $X_j$, we can build paths with a tangent to the identity equal to any linear superposition of the $X_j$ by the method of Lemma 3.2, and we can build a path with a tangent to the identity equal to the Lie bracket of any pair of these paths with the procedure described above. Therefore a finite sequence of operations comprising the above procedure together with linear superposition operations by the method of Lemma 3.2 can realise a $C^\omega$ path with any tangent inside some neighbourhood $\mathcal{H}$ of $\Or$ in the smallest Lie algebra $\h$ containing the $X_j$.

So, using these operations, we can build paths whose tangents span $\h$. Therefore, arguing as in Theorem 3.9, we can realise any member of $\exp(\mathcal{H})$ as a finite product of these spanning paths. From there we can see that a finite product of such paths can realise any member of the connected Lie group $\H = \bigcup\limits_{k=1}^\infty\exp(\mathcal{H})^k = \bigcup\limits_{k=1}^\infty \exp(\h)^k$. $\quad\square$
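The heart of the proof, steering in the bracket direction using only the given tangents, can be seen numerically in the group commutator. A minimal sketch for $SO(3)$ (helper names `hat` and `expm` are mine; the text's construction uses conjugates $\sigma_X(s)\,\sigma_Y(\tau)\,\sigma_X(-s)$, while this sketch uses the closely related group commutator): with only "levers" for rotation about $x$ and $y$, the product $e^{sX}\,e^{sY}\,e^{-sX}\,e^{-sY}$ moves in the $[X,\,Y]$ direction, rotation about $z$, at second order in $s$.

```python
import numpy as np

def hat(x):
    return np.array([[0., -x[2], x[1]],
                     [x[2], 0., -x[0]],
                     [-x[1], x[0], 0.]])

def expm(X, terms=40):
    out, term = np.eye(3), np.eye(3)
    for k in range(1, terms):
        term = term @ X / k
        out = out + term
    return out

# Two control directions only: rotations about the x and y axes
X, Y = hat(np.array([1., 0., 0.])), hat(np.array([0., 1., 0.]))
s = 1e-3
# The group commutator of the two one-parameter subgroups...
C = expm(s * X) @ expm(s * Y) @ expm(-s * X) @ expm(-s * Y)
# ...approximates exp(s^2 [X, Y]) with error O(s^3): motion about z,
# a direction neither lever provides directly
Z = X @ Y - Y @ X
assert np.allclose(Z, hat(np.array([0., 0., 1.])))   # [X, Y] is rotation about z
assert np.allclose(C, expm(s * s * Z), atol=1e-7)
```

This is why the reachable set is governed by the smallest Lie algebra $\h$ containing the control directions, not merely by their linear span.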

Note that, although the paths $\sigma_X,\,\sigma_Y,\cdots$ in the concatenations like $\sigma_X(s)\,\sigma_Y(\tau)\,\sigma_X(-s)$ need to be $C^\omega$, we end up in general with a concatenation like $\prod\limits_{k=1}^M\sigma_X(s_k)\,\sigma_Y(\alpha_k\,\tau)\,\sigma_X(-s_k)$ that is still a $C^\omega$ function of $\tau$; but once we’ve chosen $\tau$ to realise a given Lie group member $\gamma$, the concatenation is only piecewise $C^\omega$ if we think of it as a concatenation of paths whereby we reach $\gamma$ from the identity by going from $\id$ to $\sigma_1(s_1)$ by following the path $\sigma_1(\tau\,s_1)$ from $\tau=0$ to $\tau=1$, then by following the path $\sigma_1(s_1)\,\sigma_2(\tau\,s_2)$ from $\tau=0$ to $\tau=1$ and so forth. The most common concatenation that we will come across will be of the form $\prod\limits_{k=0}^M\,e^{X_k\,s_k}$, i.e. we follow piecewise geodesic paths, with discontinuous changes in direction at the end of each piecewise geodesic section.
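The smallest Lie algebra $\h$ containing the given tangents, the object governing Theorem 13.4, can also be computed directly by the closure described in the theorem statement: repeatedly adjoin brackets until no new directions appear. A minimal numpy sketch (the function `lie_closure`, its tolerances and iteration cap are my own devising, not an algorithm from the text):

```python
import numpy as np

def lie_closure(gens, tol=1e-10, max_iter=10):
    """The smallest Lie algebra containing the generator matrices: repeatedly
    adjoin commutators, keeping a linearly independent spanning set; linear
    independence is reckoned by the rank of the flattened matrices."""
    def rank(mats):
        return np.linalg.matrix_rank(np.array([m.ravel() for m in mats]), tol=tol)
    basis = list(gens)
    for _ in range(max_iter):
        new = [A @ B - B @ A for A in basis for B in basis]
        if rank(basis + new) == rank(basis):
            return basis            # closed under the bracket: this spans h
        for C in new:               # adjoin only independent brackets
            if rank(basis + [C]) > rank(basis):
                basis.append(C)
    return basis

# Two so(3) generators: their bracket supplies the third direction, so
# the smallest Lie algebra containing them is all of so(3), dimension 3
Lx = np.array([[0., 0., 0.], [0., 0., -1.], [0., 1., 0.]])
Ly = np.array([[0., 0., 1.], [0., 0., 0.], [-1., 0., 0.]])
h = lie_closure([Lx, Ly])
assert len(h) == 3
```

With these two generators the span of the tangents is two dimensional, yet $\h=\mathfrak{so}(3)$ is three dimensional: exactly the less-than-full-rank steering situation of the theorem, where the whole of $SO(3)$ is nonetheless reachable.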

We shall see how this theorem can be applied when I write about Lie Theoretic Systems Theory.

References:

  1. Terence Tao, “Ado’s Theorem”; a proof of Ado’s theorem sketched out on Terence Tao’s personal blog, 10th May 2011
  2. Willem A. De Graaf, “Constructing Faithful Matrix Representations of Lie Algebras”; downloaded from ResearchGate, 16th June 2014