Chapter 6: The Exponential Map 2

With the background of the last chapter, we now use the exponential function to define what is maybe the most important co-ordinate system of all for a connected Lie group: the geodesic co-ordinate system.

Geodesic Co-ordinates

Friedrich Schur’s Deft Trick

We have seen that the exponential map maps a neighbourhood of $\Or\in\g$ in the Lie algebra into the group $\G$ and that $\exp(\tau\,X)$ is a $C^1$ path through $\G$ that passes through the identity at $\tau=0$ for any $X\in\g$. Therefore, $\exp(\tau\,(X+\alpha Y))$ for $X,\,Y\in\g$ and $\alpha\in\R$ defines both a one-parameter group and a $C^1$ path through $\Nid$ that passes through the identity. So we know that $\exp(\tau\,(X+\alpha Y))$ is $C^1$ in $\tau$; what we do not know is whether it is $C^1$ in $\alpha$, i.e. whether the path defined by $\sigma:[-\tau_0,\,\tau_0]\to\Nid;\,\sigma(\alpha) = \exp(\tau\,(X+\alpha \,Y))$, which passes through $\exp(\tau\,X)$ at $\alpha=0$, is $C^1$ there. If $\exp(\tau\,X)$ and $\exp(\tau\,Y)$ do not commute, $\exp(\tau\,(X+\alpha \,Y))$ is in general different from $\exp(\tau\,X)\,\exp(\tau\,\alpha \,Y)$ and so $\sigma$ is not simply the path defined by the standard Cauchy initial value problem $\left.\d_u(\sigma(\alpha)^{-1} \,\sigma(\alpha+u))\right|_{u=0}=\tau\,Y;\,\sigma(0)=\exp(\tau\,X)$.

This problem was overcome with a deft trick by Friedrich Schur in 1891. A modern retelling is given in Rossmann, §1.2, Theorem 5, although bear in mind that Rossmann’s treatment there is restricted to matrix groups, so its reasoning, although like the following, flows a little differently from ours, which is worked in a general Lie group setting.

Theorem 6.1 (Differential of $\exp$):

The path:

\begin{equation}
\label{DifferentialOfExpTheorem_1}
\sigma:[-\tau_0,\,\tau_0]\to\Nid;\,\sigma(\tau) = \exp(X+\tau\,Y)
\end{equation}

with $\tau_0$ small enough that $\sigma([-\tau_0,\,\tau_0])$ lies inside $\Nid$ is $C^1$ at $\tau=0$ and:

\begin{equation}
\label{DifferentialOfExpTheorem_2}
\left.\d_\tau\left(\exp(-X)\,\exp(X+\tau\,Y)\right)\right|_{\tau=0} = \sum\limits_{k=0}^\infty \frac{(-1)^k\,\ad(X)^k}{(k+1)!} \,Y
\end{equation}

and so, when the $\hat{X}_j$ span $\V$, $\exp\left(\sum\limits_{j=1}^N \tau_j \hat{X}_j\right)$ is a $C^1$ function of the $\tau_j$ for all combinations of the $\tau_j$ fulfilling $|\tau_j|<\tau_{max}$ for some suitable $\tau_{max}>0$ such that $\exp\left(\sum\limits_{j=1}^N \tau_j \hat{X}_j\right)\in\Nid$.

More generally, if $X(\tau)\in\g$ is a $C^1$ path through the Lie algebra $\g$ then:

\begin{equation}
\label{DifferentialOfExpTheorem_3}
\left.\d_\tau\left(\exp(-X(s))\,\exp(X(\tau))\right)\right|_{\tau=s} = \sum\limits_{k=0}^\infty \frac{(-1)^k\,\ad(X)^k}{(k+1)!} \,\d_\tau X(\tau)
\end{equation}

or, equivalently:

\begin{equation}
\label{DifferentialOfExpTheorem_4}
\d_\tau\lambda\left(\exp(X(\tau))\right) = \mathbf{M}(\id,\,\exp(X(\tau)))\,\left(\sum\limits_{k=0}^\infty \frac{(-1)^k\,\ad(X)^k}{(k+1)!} \,\d_\tau X(\tau)\right)
\end{equation}

Proof:

Let:

\begin{equation}
\label{DifferentialOfExpTheorem_5}
\mathscr{Y}(s,\,\tau) = \exp(-s\,X)\,\exp(s\,(X+\tau\,Y))
\end{equation}

Then witness that (i) $\mathscr{Y}(s,\,0) = \mathscr{Y}(0,\,\tau)=\id\,\forall\,s,\,\tau\in\R$, that (ii) both $\exp(-s\,X)$ and $\exp(s\,(X+\tau\,Y))$ as $s$ varies define $C^1$ paths through the identity, passing through there when $s=0$ and with tangents $-X$ and $X+\tau\,Y$ there, respectively, so that $\exp(-s\,X)\, \exp(s\,(X+\tau\,Y))$ is a $C^1$ path in $\Nid$ for $s\in[0,\,1]$ by the Group Product Continuity Axiom 3 and the Nontrivial Continuity Axiom 4 and (iii) furthermore, by Lemma 5.13:

\begin{equation}
\label{DifferentialOfExpTheorem_6}
e^{-(u+v)\,X}\,e^{s\,(X+\tau\,Y)}=e^{-u\,X}\,e^{s\,(X+\tau\,Y)}\,\exp\left(-v\,\Ad\left(e^{-s\,(X+\tau\,Y)}\right)\,X\right)
\end{equation}

so that then:

\begin{equation}
\label{DifferentialOfExpTheorem_7}
\begin{array}{lcl}
\partial_s\lambda(\mathscr{Y}(s,\,\tau)) &=& \left.\partial_u\lambda\left(e^{-(s+u)\,X}\,e^{s\,(X+\tau\,Y)}\right)\right|_{u=0} + \left.\partial_u\lambda\left(e^{-s\,X}\,e^{(s+u)\,(X+\tau\,Y)}\right)\right|_{u=0}\\
&=&\left.\partial_u\lambda\left(e^{-s\,X}\,e^{s\,(X+\tau\,Y)}\,e^{-u\,\Ad\left(e^{-s\,(X+\tau\,Y)}\right)\,X}\right)\right|_{u=0} + \left.\partial_u\lambda\left(e^{-s\,X}\,e^{s\,(X+\tau\,Y)}\,e^{u\,(X+\tau\,Y)}\right)\right|_{u=0}\\
&=&\mathbf{M}\left(\id,\,e^{-s\,X}\,e^{s\,(X+\tau\,Y)}\right)\left(X+\tau\,Y-\Ad\left(e^{-s\,(X+\tau\,Y)}\right)\,X\right)\\
&=&\mathbf{M}\left(\id,\,e^{-s\,X}\,e^{s\,(X+\tau\,Y)}\right)\left(\Ad\left(e^{-s\,(X+\tau\,Y)}\right)\,(X+\tau\,Y)-\Ad\left(e^{-s\,(X+\tau\,Y)}\right)\,X\right)\\
&=&\mathbf{M}\left(\id,\,e^{-s\,X}\,e^{s\,(X+\tau\,Y)}\right)\,\Ad\left(e^{-s\,(X+\tau\,Y)}\right)\,\tau\,Y
\end{array}
\end{equation}

where we have used Lemma 5.8 to reach the fourth from the third line in Equation $\eqref{DifferentialOfExpTheorem_7}$ and, as always, $\mathbf{M}$ is the tangent map induced by left translation. Take heed that we have:

\begin{equation}
\label{DifferentialOfExpTheorem_8}
\mathbf{M}\left(\id,\,e^{-s\,X}\,e^{s\,(X+\tau\,Y)}\right)=\id_N + \mathscr{D}(s,\,\tau) \,\ni\, \mathscr{D}(s,\,\tau)\to0\,\text{as}\,\tau\to0
\end{equation}

where $\mathscr{D}(s,\,\tau)$ is a locally analytic function of both $\tau$ and $s$ by Theorem 5.14. Then, given $\lambda(\mathscr{Y}(0,\,\tau))=\Or\,\forall\,\tau\in\R$ we can invert the differentiation with respect to $s$ that we brought to bear in Equation $\eqref{DifferentialOfExpTheorem_7}$ thus:

\begin{equation}
\label{DifferentialOfExpTheorem_9}
\begin{array}{lcl}
\lambda(\mathscr{Y}(1,\,\tau))&=&\int_0^1\,\mathbf{M}\left(\id,\,e^{-s\,X}\,e^{s\,(X+\tau\,Y)}\right)\,\Ad\left(e^{-s\,(X+\tau\,Y)}\right)\,\tau\,Y\,\d s\\
&=&\int_0^1\,\left(\id_N+\mathscr{D}(s,\,\tau)\right)\,\Ad\left(e^{-s\,(X+\tau\,Y)}\right)\,\tau\,Y\,\d s\\
&=&\int_0^1\,\,\Ad\left(e^{-s\,(X+\tau\,Y)}\right)\,Y\,\d s \,\tau + \int_0^1\,\mathscr{D}(s,\,\tau)\,\Ad\left(e^{-s\,(X+\tau\,Y)}\right)\,Y\,\d s\,\tau
\end{array}
\end{equation}

whence:

\begin{equation}
\label{DifferentialOfExpTheorem_10}
\tau^{-1}\left(\lambda(\mathscr{Y}(1,\,\tau))-\lambda(\mathscr{Y}(1,\,0))\right)=\int_0^1\,\,\Ad\left(e^{-s\,(X+\tau\,Y)}\right)\,Y\,\d s + \int_0^1\,\mathscr{D}(s,\,\tau)\,\Ad\left(e^{-s\,(X+\tau\,Y)}\right)\,Y\,\d s
\end{equation}

so that, thanks to Equation $\eqref{DifferentialOfExpTheorem_8}$, the limit of Equation $\eqref{DifferentialOfExpTheorem_10}$ as $\tau\to0$ patently exists and:

\begin{equation}
\label{DifferentialOfExpTheorem_11}
\begin{array}{lcl}
\left.\partial_\tau \lambda\left(e^{-X}\,e^{X+\tau\,Y}\right)\right|_{\tau=0}&=&\lim\limits_{\tau\to0}\tau^{-1}\left(\lambda(\mathscr{Y}(1,\,\tau))-\lambda(\mathscr{Y}(1,\,0))\right)\\
&=&\lim\limits_{\tau\to0} \int_0^1\,\,\Ad\left(e^{-s\,(X+\tau\,Y)}\right)\,Y\,\d s + \\
&&\qquad\qquad\lim\limits_{\tau\to0} \int_0^1\,\mathscr{D}(s,\,\tau)\,\Ad\left(e^{-s\,(X+\tau\,Y)}\right)\,Y\,\d s\\
&=&\int_0^1\,\,\Ad\left(e^{-s\,X}\right)\,\d s\,Y\\
&=&\int_0^1\,\,e^{-s\,\ad(X)}\,\d s\,Y\\
&=&\sum\limits_{k=0}^\infty \frac{(-1)^k\,\ad(X)^k}{(k+1)!} \,Y
\end{array}
\end{equation}

recalling that $\Ad\left(e^{-s\,X}\right)$ is a matrix exponential, and so can be integrated like a scalar exponential can (since the exponent at all values of $s$ commutes with the exponent at all other values of $s$). Furthermore it is an analytic function of $s$ with a universally convergent power series in $s\,\ad(X)$. Likewise, $\mathscr{D}(s,\,\tau)$ is an analytic function of $\tau$ and $s$. Therefore the order of the integrations and limits in Equation $\eqref{DifferentialOfExpTheorem_11}$ can be switched. Thus we have shown that $\exp(X+\tau\,Y)$ is differentiable at $\tau=0$ with the derivative as stated in the theorem.

To prove that $\exp\left(\sum\limits_{j=1}^N \tau_j \hat{X}_j\right)$ is a $C^1$ function of the $\tau_j$ for all combinations of the $\tau_j$ fulfilling $|\tau_j|<\tau_{max}$, we simply apply the above finding (that $\exp(X+\tau\,Y)$ is differentiable at $\tau=0$) to each of the $\tau_j$ in turn by setting $X = \sum\limits_{k=1}^N \tau_k \hat{X}_k$ and $Y = \hat{X}_j$; the derivative $\partial_{\tau_j}\,\exp\left(\sum\limits_{k=1}^N \tau_k \hat{X}_k\right)$ is then simply $\left.\partial_\tau \exp(X+\tau\,Y)\right|_{\tau=0}$.

Lastly, the more general formula also readily follows; we repeat the calculation of $\left.\partial_\tau \exp(X+\tau\,Y)\right|_{\tau=0}$ instead for $\left.\partial_u \exp(X(\tau)+u\,\dot{X}(\tau)+o(u))\right|_{u=0}$, where $X(\tau+u)=X(\tau)+u\,\dot{X}(\tau)+o(u)$ because the path $X$ is $C^1$, to find the more general formula stated in the theorem. $\quad\square$
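
To make the series in Equation $\eqref{DifferentialOfExpTheorem_2}$ more concrete, its first few terms can be written out with the Lie bracket. This is only an illustrative expansion of the theorem's formula, assuming the usual convention $\ad(X)\,Y=[X,\,Y]$:

\begin{equation}
\left.\d_\tau\left(\exp(-X)\,\exp(X+\tau\,Y)\right)\right|_{\tau=0} = Y - \frac{1}{2}\,[X,\,Y] + \frac{1}{6}\,[X,\,[X,\,Y]] - \frac{1}{24}\,[X,\,[X,\,[X,\,Y]]] + \cdots
\end{equation}

In particular, when $X$ and $Y$ commute, every bracket vanishes and the derivative is simply $Y$, as one expects from $\exp(X+\tau\,Y)=\exp(X)\,\exp(\tau\,Y)$ in that case.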

Note on Notation:

The universally convergent series $\sum\limits_{k=0}^\infty \frac{(-1)^k\,\ad(X)^k}{(k+1)!}$ is often written mnemonically:

\begin{equation}
\label{AdMnemonic}
\frac{1-e^{-\ad(X)}}{\ad(X)} \stackrel{def}{=}\sum\limits_{k=0}^\infty \frac{(-1)^k\,\ad(X)^k}{(k+1)!}
\end{equation}

One should take heed that this is simply a mnemonic notation, because the matrix $\ad(X)$ is always singular: witness that $\Ad(e^{s\,X})\,X=\exp(s\,\ad(X))\,X =X\,\forall\,s\in\R$, by Lemma 5.8, which means that $X+s\,\ad(X) \,X+ s^2\,\ad(X)^2\,X/2!+\cdots = X\,\forall\,s\in\R$, so that $\ad(X)\,X=\Or$. Therefore the left hand side of the equation above has no literal meaning, other than through the definition above.
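
As a small worked check of what the mnemonic does mean, suppose (purely for illustration) that some $Y\in\g$ happens to satisfy $\ad(X)\,Y=c\,Y$ for a real number $c$. Then $\ad(X)^k\,Y=c^k\,Y$ and the series can be summed in closed form:

\begin{equation}
\sum\limits_{k=0}^\infty \frac{(-1)^k\,\ad(X)^k}{(k+1)!}\,Y = \sum\limits_{k=0}^\infty \frac{(-c)^k}{(k+1)!}\,Y = \begin{cases}\dfrac{1-e^{-c}}{c}\,Y, & c\neq0\\ Y, & c=0\end{cases}
\end{equation}

So the series acts on such a $Y$ like the scalar function $(1-e^{-c})/c$, with the value at $c=0$ filled in by continuity. The case $Y=X$, $c=0$ is exactly the singular direction noted above: the series returns $X$ even though $\ad(X)$ cannot be inverted.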

The Definition of Geodesic Co-ordinates

We have seen that $\exp:\g\to\G$ maps any Lie algebra member $X(\tau_1,\,\tau_2,\,\cdots,\,\tau_N)=\sum\limits_{j=1}^N \tau_j\,\hat{X}_j$ into the Lie group (here the $\hat{X}_j$ are a basis spanning the Lie algebra) and furthermore maps the neighbourhood:

\begin{equation}
\label{MappedNeighbourhood}
\mathcal{J}(\tau_{max})=\left(-\tau_{max},\,\tau_{max}\right)^N=\{X(\tau_1,\,\tau_2,\,\cdots,\,\tau_N):\,|\tau_j|<\tau_{max}\}
\end{equation}

into any nucleus $\K\subseteq\Nid$ when we choose $\tau_{max}=\tau_{\K}>0$ to be appropriately small for that nucleus. Moreover, we have seen in Theorem 6.1 that $\mu(\tau_1,\,\tau_2,\,\cdots,\,\tau_N) = \exp(X(\tau_1,\,\tau_2,\,\cdots,\,\tau_N))$ is differentiable in all of its $N$ arguments at every point in the neighbourhood $\mathcal{J}(\tau_{max})\subset\g$ of $\Or$ in the Lie algebra. The differential of $\mu$ at $\Or\in\g$ is the $N\times N$ identity matrix $\id_N$. So now we can put this function $\mu$ into the proof of Theorem 3.9 and work that proof through in exactly the same way for our new $\mu$. Therefore, $\exp$ is a differentiable homeomorphism between some neighbourhood $\U$ of $\Or\in\g$ in the Lie algebra $\g$ and some neighbourhood $\V$ of $\id$ in $\Nid\subset\G$. We have now defined a new system of co-ordinates for some nucleus of our connected Lie group, which nucleus can perfectly well serve as the $\Nid$ in the defining axioms for $\G$. Moreover, $\exp$ has a unique inverse map $\log$, which can perfectly well play the role of the labeller function $\lambda$ ($\lambda$ being the Greek equivalent for the beginning letter both of “Labeller” and “Logarithm”).

Theorem 6.2 (Geodesic Co-ordinates):

In any connected Lie group $\G$ there is a nucleus $\mathfrak{K}\subset\G$ and a neighbourhood $\mathcal{J}$ of $\Or\in\g$ in the Lie algebra $\g$ such that $\exp:\mathcal{J}\to\K$ is one-to-one and onto and is differentiable at all points in $\mathcal{J}$. Therefore there is a unique inverse map $\log:\K\to\mathcal{J}$ which is one-to-one and onto and differentiable at all points in $\K$. The co-ordinates $(\tau_1,\,\cdots,\,\tau_N)\in\R^N$ mapped by:

\begin{equation}
\label{GeodesicCoordinatesTheorem_1}
\mu:\R^N\to\G;\,\mu(\tau_1,\,\cdots,\,\tau_N) = \exp\left(\sum\limits_{j=1}^N \tau_j\,\hat{X}_j\right)
\end{equation}

are the Geodesic Co-ordinates, Canonical Co-ordinates of the First Kind or Exponential Co-ordinates, which co-ordinates are retrieved by the unique inverse map $\log$ from a $\gamma\in\K$ as:

\begin{equation}
\label{GeodesicCoordinatesTheorem_2}
\tau_j = \left(\log(\gamma)\right)_j
\end{equation}

where the operation $(\,)_j:\g\to\R$ means “find the superposition weight of $\hat{X}_j$ in the unique expansion of $X\in\g$ as a sum of basis vectors $\{\hat{X}_k\}_{k=1}^N$” or “find the component of the vector $X\in\g$ in the direction $\hat{X}_j$, whether by defining an inner product on the Lie algebra $\g$ or otherwise”.

In geodesic co-ordinates, the tangent map induced by left translation at the point $\gamma\in\K\subseteq\Nid\subset\G$ is:

\begin{equation}
\label{GeodesicCoordinatesTheorem_3}
\mathbf{M}(\id,\,\gamma) = \left(\sum\limits_{k=0}^\infty \frac{(-1)^k\,\ad(\log\gamma)^k}{(k+1)!} \right)^{-1}
\end{equation}

whilst the tangent map induced by right translation is:

\begin{equation}
\label{GeodesicCoordinatesTheorem_4}
\tilde{\mathbf{M}}(\id,\,\gamma) = \left(\sum\limits_{k=0}^\infty \frac{\ad(\log\gamma)^k}{(k+1)!} \right)^{-1}
\end{equation}

Proof:

The existence of the everywhere differentiable inverse $\log:\K\to\mathcal{J}$ of $\exp:\mathcal{J}\to\K$ has been proven in the text above, using the inverse function theorem together with Theorem 6.1. It is only left to prove the two formulas $\eqref{GeodesicCoordinatesTheorem_3}$ and $\eqref{GeodesicCoordinatesTheorem_4}$ for the tangent maps. But by Theorem 6.1, Equation $\eqref{DifferentialOfExpTheorem_2}$, we see that the rate of change is the left-translated Lie algebra member:

\begin{equation}
\label{GeodesicCoordinatesTheorem_5}
\left(\sum\limits_{k=0}^\infty \frac{(-1)^k\,\ad(\log\gamma)^k}{(k+1)!}\right)\,\hat{X}_j = \left(\sum\limits_{k=0}^\infty \frac{(-1)^k\,\ad(\log\gamma)^k}{(k+1)!}\right)_j
\end{equation}

when the $j^{th}$ exponential co-ordinate changes at a rate of one “length” unit per “time” unit. So when this is so, the left hand side of the equation $\d_\tau\log\gamma = \mathbf{M}(\id,\,\gamma) \,\d_\tau X(\tau)$ is the $j^{th}$ column of the $N\times N$ identity matrix and, on the right hand side, $\d_\tau X(\tau)$ is the $j^{th}$ column of $\sum\limits_{k=0}^\infty \frac{(-1)^k\,\ad(\log\gamma)^k}{(k+1)!}$. On doing this calculation for each basis vector $\hat{X}_k,\,k=1,\,\cdots,\,N$, we get $N$ separate column vectors, exactly as in the method used to calculate $\mathbf{M}(\id,\,\gamma)$ for exponential canonical co-ordinates of the second kind. This method was explained in the text before Theorem 5.14. On assembling all these column vectors, we get:

\begin{equation}
\label{GeodesicCoordinatesTheorem_6}
\begin{array}{clcl}&\id_N &=& \mathbf{M}(\id,\,\gamma)\, \left(\sum\limits_{k=0}^\infty \frac{(-1)^k\,\ad(\log\gamma)^k}{(k+1)!}\right)\\
\Leftrightarrow&\mathbf{M}(\id,\,\gamma) &=& \left(\sum\limits_{k=0}^\infty \frac{(-1)^k\,\ad(\log\gamma)^k}{(k+1)!} \right)^{-1}
\end{array}
\end{equation}

The calculation for the tangent map induced by right translation is wholly analogous.

$\square$
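
For orientation, the first terms of the two inverse series in Equations $\eqref{GeodesicCoordinatesTheorem_3}$ and $\eqref{GeodesicCoordinatesTheorem_4}$ can be written out. The following is a sketch which assumes that $\gamma$ is close enough to $\id$ that $\left\|\ad(\log\gamma)\right\|<2\pi$ in some operator norm, so that the standard power series of $x/(1-e^{-x})$ and $x/(e^{x}-1)$ converge when $x$ is replaced by $\ad(\log\gamma)$:

\begin{equation}
\begin{array}{lcl}
\mathbf{M}(\id,\,\gamma) &=& \id_N + \frac{1}{2}\,\ad(\log\gamma) + \frac{1}{12}\,\ad(\log\gamma)^2 - \frac{1}{720}\,\ad(\log\gamma)^4+\cdots\\
\tilde{\mathbf{M}}(\id,\,\gamma) &=& \id_N - \frac{1}{2}\,\ad(\log\gamma) + \frac{1}{12}\,\ad(\log\gamma)^2 - \frac{1}{720}\,\ad(\log\gamma)^4+\cdots
\end{array}
\end{equation}

To this order the two maps differ only in the sign of the linear term, and both reduce to $\id_N$ at $\gamma=\id$, in keeping with the differential of $\exp$ at $\Or$ being the identity.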

 


Figure 6.1: The Wayfarer’s Ken from the Bridge on the High Seas of the Lie Group Following the Geodesic Path $\exp(\tau\,X)$

Figure 6.1 tries to justify what for me is the evocative name “geodesic co-ordinates”. A geodesic’s most general definition in a manifold is a path whose tangent is unchanged by parallel transport along that path. In a Lie group, an obvious definition of parallel transport between two tangent spaces is that gotten by left (or right, it doesn’t matter as long as it is consistent) translation. Take heed that this definition begets a notion of parallel transport that is independent of path and only depends on the endpoint, namely the relevant point in the group. Therefore the Lie group, although not generally Euclidean, is “flat” in the sense that the Riemann curvature tensor with the left/right translation connexion must vanish. Naturally, however, the torsion tensor is in general nonzero. So intuitively, we set our ship’s heading to $X$ and let it run for a short while, then we check what the parallel transported version of $X$ is and make sure our heading sticks to it. In other words, we are setting our velocity $\d_\tau\sigma$ so that it is always equal to $\sigma\,X = \mathbf{M}(\id,\,\sigma)\,X$, the unique version of $X$ parallel transported from $\g$ to the tangent space at $\sigma$. If we begin at $\id$, therefore, we must be following the path $\exp(\tau\,X)$.

As with Exponential Canonical Co-ordinates of the Second Kind, we can show that if we label a small enough nucleus $\mathfrak{K}\subset\G$ with geodesic co-ordinates and we choose $\log(\K)$ to be open in $\g$, then:

Theorem 6.3 (Group Operations are $C^1$ in Geodesic Co-ordinates):

  1. Given a connected Lie group $\G$ whose Lie group structure is defined by the set $\Nid$, the labeller map $\lambda$ and the open subset $\V$ of $\R^N$ such that the five defining axioms hold; and
  2. Given an open nucleus $\K\subset\G$ small enough to be uniquely labelled by geodesic co-ordinates;

we can replace $\Nid$ by $\mathfrak{K}$, use geodesic co-ordinates, replace $\lambda$ by $\log$ and replace $\V$ by $\log(\K)\subset\g$ and the five defining axioms will still hold for the new co-ordinates.

Proof:

The proof is wholly analogous to the proof of Theorem 5.15:

  1. We consider the product:
    $$\sigma(\tau) = \exp\left(\sum\limits_{j=1}^N \omega_j(\tau)\hat{X}_j\right)= \exp\left(\sum\limits_{j=1}^N \alpha_j(\tau)\hat{X}_j\right)\,\exp\left(-\sum\limits_{j=1}^N \beta_j(\tau)\hat{X}_j\right)$$
  2. Then use Lemma 5.13 as well as Theorem 6.1, $\eqref{DifferentialOfExpTheorem_2}$ to write $\d_s (\sigma(\tau)^{-1}\,\sigma(\tau+s))$ as a Lie algebra member built from basis vectors with superposition weights that are analytic functions of the $\alpha_j,\,\beta_j$;
  3. Then use Theorem 6.2, $\eqref{GeodesicCoordinatesTheorem_3}$ to write $\d_\tau \sigma(\tau)$ as an analytic function of the $\alpha_j,\,\beta_j,\,\omega_j$.

and argue analogously with the Picard-Lindelöf theorem as in the proof of Theorem 5.15.

This ends the proof, but it is well to understand that a much more elegant proof showing that indeed the group operations are $C^\omega$, i.e. analytic, will be given when the Baker Campbell Hausdorff theorem has been proven. $\quad\square$

Geodesic co-ordinates afford a very clear proof of the Nuclear Generation Theorem 3.14.

Theorem 6.4 ($\G$ Generated by Any Neighbourhood of $\id$, Version 2):

Let $\g$ be the Lie algebra of a connected Lie group $\G$ and let $\mathcal{N}\subset\g$ be any neighbourhood of $\Or$ in $\g$. Then $\G = \bigcup\limits_{k=1}^\infty \exp(\g)^k = \bigcup\limits_{k=1}^\infty \exp(\mathcal{N})^k$, independent of exactly which neighbourhood $\mathcal{N}$ is used. Here $\mathcal{A}^k$ stands for the set of all $k$-fold products of members of a set $\mathcal{A}\subseteq\G$.

Proof:

Given $\mathcal{N}\subset\g$ is a neighbourhood of $\Or$, there is an open ball $\O(\Or,\,r) = \{X\in\g| \,\left\|X\right\|<r\}\subseteq\mathcal{N}$ of some radius $r>0$ centred on $\Or$. Then, for any $Y\in\g\,\exists\, n\in\mathbb{N}\ni \frac{Y}{n}\in\O(\Or,\,r)$ (choose $n$ to be the smallest integer greater than $\left\|Y\right\|/r$) so that $\gamma_Y=\exp(Y/n)\in\exp(\O(\Or,\,r))\subseteq\exp(\mathcal{N})$, hence $\gamma_Y^n = e^Y$ is an $n$-fold product of members of $\exp(\mathcal{N})$, hence $\bigcup\limits_{k=1}^\infty \exp(\g)^k\subseteq\bigcup\limits_{k=1}^\infty \exp(\mathcal{N})^k$. Since $\mathcal{N}\subset\g$, certainly $\bigcup\limits_{k=1}^\infty \exp(\mathcal{N})^k\subseteq \bigcup\limits_{k=1}^\infty \exp(\g)^k$ so that we have, as stated, $\bigcup\limits_{k=1}^\infty \exp(\g)^k=\bigcup\limits_{k=1}^\infty \exp(\mathcal{N})^k$. $\quad\square$

Furthermore, we have the following trivially proven lemma:

Lemma 6.5 (Symmetric Nucleus in Geodesic Co-ordinates)

In a Lie group $\G$ with Lie algebra $\g$, every open $\exp(\mathcal{H})$ or closed $\exp(\mathcal{\bar{H}})$ ball is a symmetric nucleus, where $\mathcal{H}=\{X\in\g| \,\left\|X\right\|<r\}$ and $\mathcal{\bar{H}}=\{X\in\g| \,\left\|X\right\|\leq r\}$.

$\square$

Are There Any Non $C^1$ Or Non $C^0$ One Parameter Subgroups Of A Connected Lie Group?

We have yet to soundly justify the notation $e^{\tau\,X}$ for $X\in\g$. We shall now explore the exponential function in a way pretty like what one does for the definition of the exponential and logarithm for $\R$. We shall along the way find some more “explicit” formulas for the exponential and logarithm function. We shall show that if a $C^0$ one parameter subgroup has a nonzero subinterval in either (i) a suitably small nucleus $\mathfrak{K}\subset\G$ or (ii) a “translated” copy $\gamma\,\K = \{\gamma\,\zeta:\,\zeta\in\K\}$ of this nucleus, then that one parameter group must link to the identity $\id$ through $\K$ and be $C^1$, indeed analytic ($C^\omega$), inside $\K$. We shall see that the only other kind of one parameter group possible is one that both (i) has no nonzero interval $\exp([-\epsilon,\,\epsilon])$ contained in $\K$ and (ii) has uncountably many, totally disconnected points in every translated copy $\gamma\,\K$ of our basic nucleus. Such an $\exp:\R\to\G$ is discontinuous at all points of its domain $\R$.

It is also good to understand that the following proofs can be made much simpler, indeed trivial in some cases, by use of geodesic co-ordinates. However, it is also instructive to understand that much of the following material can be made independent of these co-ordinates, whose definition assumes that the exponential function has been defined. For example, the square root and square functions become trivial to calculate in the Lie algebra (i.e. in geodesic co-ordinates): $\log\mathrm{sqrt} \gamma = \frac{1}{2}\log\gamma$. However, we still need to keep ourselves to a small enough nucleus so that our co-ordinates are one-to-one and onto.

The Square Root Function

My following ideas for defining the exponential map come from the idea of iterated square roots used by [von Neumann, 1929]; von Neumann was keeping himself to the discussion of closed matrix groups, but it turns out that his method is more general than this. Moreover, the idea of calculating tables of general logarithms using repeated square roots to find very small powers $\gamma^\delta,\,\delta\ll1$, which can be thought of as bearing the germ of von Neumann’s idea, goes back to the method that Henry Briggs followed in 1620.

We first define a square root function and show that there is a neighbourhood of the identity in $\G$ wherein every element has a unique square root and that that square root also lies within the same neighbourhood. This is the key to a workable definition of a square root iterated an arbitrary number of times.

Theorem 6.6 (Neighbourhood Wherein Square Roots are Uniquely Defined):

In a connected Lie Group $\G$ there is a nucleus $\K\subseteq \Nid$ wherein all elements $\gamma\in\K$ have a unique square root $\zeta = \mathrm{sqrt}(\gamma) \in \K$, i.e. a unique element in $\K$ fulfilling the equation $\zeta\,\zeta = \gamma$.

Proof:

We choose $C^1$ paths $\sigma_{\hat{X}_j}:[-\tau_0,\,\tau_0]\to\G$ with tangents $\hat{X}_j$ at $\id$ spanning $\g$ and define the functions:

\begin{equation}
\label{SquareRootTheorem_1}
\begin{array}{l}
\mu_1 :[-\tau_0,\,\tau_0]^N\to \V;\;\mu_1(x_1 ,\,x_2,\,\cdots ,\,x_N)=\lambda\left(\sigma_1( x_1)\,\sigma_2( x_2)\,\cdots \,\sigma_N( x_N)\right)\\\\
\mu_2 :[-\tau_0,\,\tau_0]^N\to \V;\;\mu_2(x_1 ,\,x_2,\,\cdots ,\,x_N)=\\\\
\qquad\qquad\qquad\lambda\left(\sigma_1( x_1)\,\sigma_2( x_2)\,\cdots \,\sigma_N( x_N)\bullet\sigma_1( x_1)\,\sigma_2( x_2)\,\cdots \,\sigma_N( x_N)\right)
\end{array}
\end{equation}

i.e. $\mu_1$ labels some nucleus $\tilde{\K}$ with second-kind canonical co-ordinates $(x_1 ,\,x_2,\,\cdots ,\,x_N)$.

Reasoning as in Theorem 3.9, we can use the inverse function theorem to see that there are open neighbourhoods $\U_1,\,\U_2\subset [-\tau_0,\,\tau_0]^N$ such that:

  1. Both $\mu_1$ and $\mu_2$ map the open neighbourhoods $\U_1,\,\U_2$, respectively, of $\Or\in[-\tau_0,\,\tau_0]^N$ onto open neighbourhoods $\W_1,\,\W_2\subset \V$ of $\Or\in\V$; and
  2. When restricted to these open neighbourhoods, $\mu_1$ and $\mu_2$ have continuously differentiable inverses $\mu_1^{-1},\,\mu_2^{-1}$ mapping open neighbourhoods $\W_1,\,\W_2\subset\V$, respectively, of $\Or\in\V$ onto $\U_1,\,\U_2$; and
  3. If we choose Cartesian co-ordinates for $\V$ with basis vectors $\hat{X}_j$, then the differentials of $\mu_1,\,\mu_2$ at $\Or\in[-\tau_0,\,\tau_0]^N$ are, respectively, the $N\times N$ identity matrix $\id_N$ and twice the identity matrix $2\,\id_N$. Likewise, the differentials at $\Or\in\V$ of $\mu_1^{-1},\,\mu_2^{-1}$ are $\id_N$ and $\frac{1}{2}\id_N$, respectively.

To complete the above reasoning for $\mu_2$, we calculate its derivatives as in the proof of Lemma 3.2 to show that $\partial_{x_j} \mu_2 = 2 \,\hat{X}_j$ at $\Or$. Thus $\mu_2$ also labels some nucleus with unique co-ordinates.

Now we partially define the square function $\mathrm{sqr}:\Nid\to \Nid;\,\mathrm{sqr}(\zeta) = \zeta\,\zeta$ by its effect on the co-ordinates in $\V$ as follows. Let $\W = \W_1\cap\W_2$, another open neighbourhood of $\Or\in\V$, then we define $\mathrm{sqr}_\W: \W\to\W_2;\, \mathrm{sqr}_\W = \mu_2 \circ \mu_1^{-1}$. $\mathrm{sqr}_\W$ is a $C^1$ homeomorphism between $\W$ and its image (because (i) $\W\subseteq\W_1,\,\W_2$, (ii) $\mu_1^{-1}$ is a $C^1$ homeomorphism between $\W$ and $\mu_1^{-1}(\W)$ and, moreover, (iii) $\mu_1^{-1}(\W)\subseteq \U_2$) and the differential of $\mathrm{sqr}_\W$ at $\Or$ is $2\,\id_N$.

So, by the inverse function theorem again, there is some open neighbourhood $\mathcal{Y}\subseteq\W$ wherein $\mathrm{sqr}_\W$ has a $C^1$, homeomorphic inverse $\mathrm{sqrt}_\mathcal{Y}:\mathrm{sqr}_\W(\mathcal{Y})\to\mathcal{Y}$ with differential $\frac{1}{2} \,\id_N$ at $\Or$. By the continuity of derivatives of $\mathrm{sqr}_\W$ and $\mathrm{sqrt}_\mathcal{Y}$, for any $\epsilon>0$ we can choose an open ball neighbourhood:

\begin{equation}
\label{SquareRootTheorem_2}
\mathcal{T}_{R(\epsilon)}=\{X\in\V;\;\left\|X\right\|<R(\epsilon)\}\subseteq\mathrm{sqr}_\W(\mathcal{Y})
\end{equation}

wherein $\left\|\mathrm{sqrt}_\mathcal{Y}(X)\right\| \leq \left(\frac{1}{2}+\epsilon\right)\left\|X\right\|$. That is:

\begin{equation}
\label{SquareRootTheorem_3}
\mathrm{sqrt}_\mathcal{Y}(\mathcal{T}_{R(\epsilon)})\subseteq \mathcal{T}_{R(\epsilon)}
\end{equation}

as long as $\epsilon\leq\frac{1}{2}$. Then the inverse image:

\begin{equation}
\label{SquareRootTheorem_4}
\K=\lambda^{-1}(\mathcal{T}_{R(\epsilon)})\subseteq\Nid
\end{equation}

that is, the subset of $\G$ mapped to $\mathcal{T}_{R(\epsilon)}$ by the labeller $\lambda$, is a nucleus with all the properties stated in the theorem. Thus we can define $\mathrm{sqrt}:\K\to\K$ for any $\gamma\in\K$ as being the unique entity $\zeta\in\K$ such that $\zeta\,\zeta=\gamma$. $\qquad\square$

Note: This theorem does not prove outright uniqueness of the square root. It simply proves that, in a small enough nucleus, all elements have a square root and that this square root is the only one in that nucleus. There could also be other square roots elsewhere.
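
A familiar illustration, offered here only as an example and not used elsewhere in the argument, is the circle group of unit complex numbers under multiplication. Every element has exactly two square roots there, but only one of them lies near the identity:

\begin{equation}
\gamma = e^{i\,\theta},\;|\theta|<\pi \quad\Rightarrow\quad \zeta_{\pm} = \pm\,e^{i\,\theta/2},\quad \zeta_{\pm}\,\zeta_{\pm}=\gamma
\end{equation}

If the nucleus is taken to be $\{e^{i\,\phi}:\,|\phi|<R\}$ with $R\leq\frac{\pi}{2}$, then $\zeta_+$ lies in the nucleus whenever $\gamma$ does, whereas $\zeta_-=e^{i\,(\theta/2+\pi)}$ always lies outside it.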

An Aside: No Small Subgroups Behaviour of Lie Groups

The above theorem shows that, for every $\epsilon>0$, one can find a small enough nucleus:

\begin{equation}
\label{EscapeKernelEquation}
\K=\{\gamma\in\Nid:\,\left\|\lambda(\gamma)\right\|<R_\epsilon\}\,\ni\,\left\|\lambda(\sigma^2)\right\| > (2-\epsilon)\,\left\|\lambda(\sigma)\right\|,\,\forall\,\sigma\in\K
\end{equation}

(i.e. one can choose $R_\epsilon$ small enough to guarantee this). Therefore, in a Lie group, there is no nontrivial subgroup contained inside an arbitrarily small nucleus. For behold any element $\sigma\neq\id\in\K$. The smallest subgroup containing $\sigma$ is $\{\id,\,\sigma,\,\sigma^2,\,\cdots\}$, for which the “magnitudes” are:

\begin{equation}
\label{EscapeSequenceEquation}
\left\|\lambda(\id)\right\|=0,\,\left\|\lambda(\sigma)\right\|=\ell,\,\left\|\lambda(\sigma^2)\right\|>(2-\epsilon)\,\ell,\,\cdots,\,\left\|\lambda(\sigma^{2^N})\right\|>(2-\epsilon)^N\,\ell,\,\cdots
\end{equation}

as long as $\sigma^{2^N}\in\K$. But this means that at a finite power $\left\|\lambda(\sigma^{2^M})\right\|>(2-\epsilon)^M\,\ell>R_\epsilon$ and this finite power lies outside $\K$. So we have proven:

Theorem 6.7 (No Small Subgroups):

A connected Lie group $\G$ has no small subgroups; i.e. it is not true that every nucleus, no matter how small, contains a nontrivial subgroup of $\G$. $\quad\square$
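
To see how quickly the powers must leave the nucleus, the inequality $(2-\epsilon)^M\,\ell>R_\epsilon$ used above can be solved for $M$. As a sketch, assuming $0<\epsilon<1$ and $0<\ell<R_\epsilon$, and writing $\ln$ for the ordinary natural logarithm of a real number:

\begin{equation}
(2-\epsilon)^M\,\ell > R_\epsilon \quad\Leftrightarrow\quad M > \frac{\ln\left(R_\epsilon/\ell\right)}{\ln(2-\epsilon)}
\end{equation}

For example, with the purely illustrative values $\epsilon=\frac{1}{2}$ and $\ell = R_\epsilon/100$, this gives $M>\ln(100)/\ln(1.5)\approx 11.4$, so that at least one of the powers $\sigma^{2^M}$ with $M\leq12$ already lies outside $\K$.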

The no small subgroups property was shown to be crucial in the solution of Hilbert’s fifth problem by Montgomery, Zippin, Gleason and Yamabe. This solution essentially states that our basic five axioms will still work if (i) the $C^1$ assumption is replaced by $C^0$ in the Group Product Continuity Axiom 3 and the Nontrivial Continuity Axiom 4 in our basic five axioms and (ii) the group is assumed to have no small subgroups. It is often said that Lie groups only need to be assumed $C^0$ in their group operations and then they admit co-ordinates wherein the group operations are $C^1$ and thence $C^\omega$. Without meaning to take away from the achievement of the amazing solution to Hilbert’s fifth problem, I believe it is slightly misleading to say that $C^0$ assumptions are all that are needed. The no small subgroups assumption is not trivial and so far is essential.

Defining a $C^0$ One-Parameter Group

Now, kitted with this special nucleus $\K$ wherein square roots are unique, we can define an exponential function of the form $\gamma^b$, where $\gamma\in\K$ is any member of the nucleus and $b$ is a rational number in the interval $[0,\,1]$ with a finite binary expansion. This is simple: using the theorem above, the infinite sequence of iterates $\gamma,\,\mathrm{sqrt}(\gamma),\,\mathrm{sqrt}\circ\mathrm{sqrt}(\gamma), \, \mathrm{sqrt}\circ\mathrm{sqrt}\circ\mathrm{sqrt}(\gamma),\,\cdots$ is uniquely defined for any $\gamma\in\K$. Then, naturally, we define $\gamma_n = \gamma^{\left(2^{-n}\right)}$ as the $n^{th}$ iterate of the square root function on the beginning point $\gamma$, because this is the unique $\gamma_n\in\K$ such that $\gamma_n^{\left(2^{n}\right)} = \gamma$, and then we can define the set:

\begin{equation}
\label{Q2PowersEquation}
\large{\mathcal{Q}(\gamma)=\{\gamma^{m\,2^{-n}}:\,n,\,m\in\mathbb{N};\,m\leq 2^n;\,\gamma^{m\,2^{-n}}=\gamma_n^m\text{ if }m\neq0;\,\gamma^0=\id\}}
\end{equation}
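
As a concrete instance of this construction, take the illustrative exponent $\frac{5}{8}$, whose binary expansion is $0.101$. Writing $\gamma_n$ for the $n^{th}$ iterated square root as above:

\begin{equation}
\gamma^{5/8} = \gamma_3^5 = \gamma_1\,\gamma_3 = \mathrm{sqrt}(\gamma)\;\,\mathrm{sqrt}\circ\mathrm{sqrt}\circ\mathrm{sqrt}(\gamma)
\end{equation}

since $\gamma_1=\gamma_3^4$. All the factors are powers of the single element $\gamma_3$, so they commute with one another and exponents add as usual, without any assumption that $\G$ is Abelian.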

Let $\mathbb{Q}_2$ stand for all rationals with finite binary expansions; $\mathbb{Q}_2$ is clearly a dense subset of $\R$. Therefore the set of all Lie group members $\{\gamma^b:\,\gamma\in\K;\,b\in\mathbb{Q}_2\cap [0,\,1]\}$, which is the set we’ve just described, defines at most one unique $C^0$ (continuous) path linking $\id$ and $\gamma$. Clearly, we have just defined a function $f$ that fulfills $f(\tau)\,f(\varsigma) = f(\tau+\varsigma),\,\forall \tau,\,\varsigma\in\mathbb{Q}_2$. The question is how do we extend it to $\tau\in\R$? A theorem will make this easy.

Theorem 6.8 (Continuity of $\exp: \mathbb{Q}_2\to\G$):

The function $\lambda\circ\exp: \mathbb{Q}_2\to\V;\,\lambda\circ\exp(m\,2^{-n}) = \lambda\left(\gamma^{m\,2^{-n}}\right),\,\forall m,\,n\in\mathbb{N}$ is continuous when $\mathbb{Q}_2$ is given the relative topology inherited from $\R$, as long as $\gamma$ belongs to a small enough nucleus $\K\subseteq \G$.

Proof:

We need to prove that, for any $\epsilon>0$, we can find a $\delta>0$ such that $\left\|\lambda(\gamma^{q+p}) - \lambda(\gamma^q)\right\|<\epsilon$ whenever $p\in\mathbb{Q}_2$ and $|p|<\delta$. Because $\gamma^{q+p} =\gamma^q \gamma^p$, and given we can choose $\K$ small enough to appeal to (Lemma 3.17) the $C^1$ behaviour of the group product at $\gamma^q\in\K$, we have:

\begin{equation}
\label{RationalExponentialContinuityTheorem_1}
\left\|\lambda(\gamma^{q+p}) - \lambda(\gamma^q)\right\|<\left\|\mathbf{M}(\id,\,\gamma^q)\right\|\,(1+\epsilon_1) \left\|\lambda(\gamma^p)\right\|
\end{equation}

where $\mathbf{M}(\id,\,\gamma^q)$ is the tangent map induced by the left translation by $\gamma^q$ and we have used the fact that $\lambda(\sigma_X(\tau)) = \tau\,(\id_N + \mathscr{D}(\sigma_X(\tau)))\, X$ for any $C^1$ path $\sigma_X$ through the identity with tangent $X$ there. Differentiability is tantamount to the existence of $\mathscr{D}(\sigma_X(\tau))$ such that $\left\|\mathscr{D}(\sigma_X(\tau))\right\|\to0$ as $\tau\to0$. Hence we can write $\left\|\lambda(\gamma^{q+p})-\lambda(\gamma^q)\right\| < (1+\epsilon_0)\,(1+\epsilon_1) \left\|\lambda(\gamma^p)\right\|$ where we have chosen $\K$ small enough that $\left\|\mathscr{D}(\sigma_X(\tau))\right\|<\epsilon_1$ and $\left\|\mathbf{M}(\id,\,\gamma^q)\right\|<1+\epsilon_0$. Therefore, our theorem will be proven if we can prove that for any $\epsilon > 0$ we can find a $\delta$ such that $\left\|\lambda(\gamma^p)\right\|<\epsilon$ whenever $|p|<\delta$. Otherwise put: $\gamma^{q+p}$ is continuous at $\gamma^q$ with respect to the relative topology in $\mathbb{Q}_2$ inherited from $\R$ if and only if it is continuous at $\gamma^0=\id$. Continuity of our exponential function is proven at all points if and only if it can be proven at one point and, when this is proven, our exponential function is then uniformly continuous in $\K$.

So now we need to study $\left\|\lambda(\gamma^p)\right\|$ where $p\to0$ and $p\in\mathbb{Q}_2$. Using our square root function and knowing that it has differential $\frac{1}{2}$ at $\id$, we can choose $\K$ small enough that:

\begin{equation}
\label{RationalExponentialContinuityTheorem_2}
\left\|\lambda(\zeta\,\mathrm{sqrt}(\sigma))\right\|<\left\|\lambda(\zeta)\right\|+(1+\epsilon_3)\,\frac{1}{2}\,\left\|\lambda(\sigma)\right\|
\end{equation}

for any $\epsilon_3>0$, whenever $\zeta,\,\sigma\in\K$. Note that the co-efficient $1+\epsilon_3$ includes both:

  1. The tangent map $\mathbf{M}(\id,\,\zeta)$ induced by left translation, which fulfills $\left\|\mathbf{M}(\id,\,\zeta)\right\| \leq1+\epsilon_1$ when $\zeta\in\K$ for some $\epsilon_1>0$ such that $\epsilon_1\to0$ as the width of $\lambda(\K)$ approaches nought; and
  2. The definition of the differential of $\mathrm{sqrt}$ as being such that $\lambda(\zeta\,\mathrm{sqrt}(\sigma)) =\lambda(\zeta)+ \mathbf{M}(\id,\,\zeta)(\frac{1}{2} + \mathscr{D}_{\frac{1}{2}}(\sigma)) \lambda(\sigma)$ where $\left\|\mathscr{D}_{\frac{1}{2}}(\sigma)\right\|<\epsilon_2$ when $\sigma\in\K$ for some $\epsilon_2>0$ such that $\epsilon_2\to0$ as the width of $\lambda(\K)$ approaches nought.

Now write $p$ as a binary expansion $p = a_1\, 2^{-1} + a_2\, 2^{-2} + \cdots + a_m \, 2^{-m}$, where $a_j \in \{0,\,1\}$. Then:

\begin{equation}
\label{RationalExponentialContinuityTheorem_3}
\large{\gamma^p = \gamma^{\frac{a_1}{2}}\,\gamma^{\frac{a_2}{2^2}}\,\gamma^{\frac{a_3}{2^3}}\,\cdots\,\gamma^{\frac{a_m}{2^m}}}
\end{equation}

By iteratively applying $\eqref{RationalExponentialContinuityTheorem_2}$, once for each binary digit in the binary expansion of $p$ we get:

\begin{equation}
\label{RationalExponentialContinuityTheorem_4}
\left\|\lambda\left(\gamma^p\right)\right\| \leq \left\|\lambda(\gamma)\right\|\,\left(a_1\,\frac{1+\epsilon_3}{2} +a_2\,\left(\frac{1+\epsilon_3}{2}\right)^2+\cdots+a_m\,\left(\frac{1+\epsilon_3}{2}\right)^m\right)
\end{equation}

So now, if $p<2^{-n}$, the first $n$ binary digits of $p$ vanish and:

\begin{equation}
\label{RationalExponentialContinuityTheorem_5}
\begin{array}{lcl}
\left\|\lambda\left(\gamma^p\right)\right\| &\leq& \left\|\lambda(\gamma)\right\|\,\left(a_{n+1}\,\left(\frac{1+\epsilon_3}{2}\right)^{n+1} +\cdots+a_{n+r}\,\left(\frac{1+\epsilon_3}{2}\right)^{n+r}\right)\\
&\leq&\left\|\lambda(\gamma)\right\|\,\left(\left(\frac{1+\epsilon_3}{2}\right)^{n+1} +\left(\frac{1+\epsilon_3}{2}\right)^{n+2}+\cdots\right)\\
&=&\left\|\lambda(\gamma)\right\|\,\frac{1+\epsilon_3}{1-\epsilon_3}\,\left(\frac{1+\epsilon_3}{2}\right)^n
\end{array}
\end{equation}

so that, if

\begin{equation}
\label{RationalExponentialContinuityTheorem_6}
\epsilon = \left\|\lambda(\gamma)\right\|\,\frac{1+\epsilon_3}{1-\epsilon_3}\,\left(\frac{1+\epsilon_3}{2}\right)^n
\end{equation}

then, taking $p = 2^{-n}$,

\begin{equation}
\label{RationalExponentialContinuityTheorem_7}
\log p=\frac{\log\left(\frac{1}{2}\right)}{\log\left(\frac{1+\epsilon_3}{2}\right)}\,\log\epsilon - \frac{\log\left(\frac{1}{2}\right)}{\log\left(\frac{1+\epsilon_3}{2}\right)}\,\log\left(\left\|\lambda(\gamma)\right\|\,\frac{1+\epsilon_3}{1-\epsilon_3}\right)
\end{equation}

i.e.

\begin{equation}
\label{RationalExponentialContinuityTheorem_8}
\large{p \propto \epsilon^{\frac{\log\left(\frac{1}{2}\right)}{\log\left(\frac{1+\epsilon_3}{2}\right)}}}
\end{equation}

and so $p$ is defined for all $\epsilon>0$ and is a monotonically increasing function of $\epsilon$ as long as $\epsilon_3<1$. Naturally we choose $\K$ to be a small enough nucleus to make this so. Therefore, we can ensure $\left\|\lambda\left(\gamma^p\right)\right\|<\epsilon$ by choosing $p$ less than the value given by $\eqref{RationalExponentialContinuityTheorem_8}$. Thus $p\mapsto\gamma^p$ is a continuous function of $p$ for $p\in\mathbb{Q}_2\cap[0,\,1]$, $\gamma\in\K$. $\quad\square$
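
As a check on the last line of $\eqref{RationalExponentialContinuityTheorem_5}$, write $r=\frac{1+\epsilon_3}{2}$ (a shorthand used only in this remark). The geometric series converges exactly when $r<1$, i.e. when $\epsilon_3<1$, and then:

\begin{equation}
\sum\limits_{k=n+1}^\infty r^k = \frac{r^{n+1}}{1-r} = \frac{r}{1-r}\,r^n = \frac{1+\epsilon_3}{1-\epsilon_3}\,\left(\frac{1+\epsilon_3}{2}\right)^n
\end{equation}

For instance, if $\epsilon_3 = 0.1$ (a value assumed purely for illustration), the exponent in $\eqref{RationalExponentialContinuityTheorem_8}$ is $\log\left(\frac{1}{2}\right)/\log(0.55)\approx1.16$, so the allowed $p$ shrinks slightly faster than linearly with $\epsilon$.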

So now we can make a unique definition of $\gamma^\tau$ when $\tau$ is real but not a rational with a finite binary expansion: it is the unique value in $\G$ that makes $\tau\mapsto\gamma^\tau$ continuous, given the defined values of $\gamma^p$ for $p\in\mathbb{Q}_2$. So, for any real $\tau$, we represent $\tau$ as the limit of a Cauchy sequence of $\mathbb{Q}_2$ numbers $\{\tau_n\}_{n=1}^\infty,\,\tau_n\in\mathbb{Q}_2$. Given the uniform continuity just proven, we see that $\left\{\lambda\left(\gamma^{\tau_n}\right)\right\}_{n=1}^\infty$ is a Cauchy sequence in $\V$ whenever $\{\tau_n\}_{n=1}^\infty$ is a Cauchy sequence. Since $\R^N$ is complete, $\left\{\lambda\left(\gamma^{\tau_n}\right)\right\}_{n=1}^\infty$ converges to the limit point $\lambda(\gamma_\infty)\in\V$. So we define $\gamma^\tau = \gamma_\infty$.

Now, let $\{\tau_n\}_{n=1}^\infty,\,\{\varsigma_n\}_{n=1}^\infty$ be two Cauchy sequences of finite-binary-expansion numbers in $\mathbb{Q}_2\cap[0,\,1]$ converging to $\tau,\,\varsigma\in[0,\,1]$, respectively, such that $\{\tau_n+\varsigma_n\}_{n=1}^\infty\subset\mathbb{Q}_2\cap[0,\,1]$ also. Then $\{\tau_n+\varsigma_n\}_{n=1}^\infty$ is also a Cauchy sequence, converging to $\tau+\varsigma$. By the continuity above, $\left\{\lambda\left(\gamma^{\tau_n+\varsigma_n}\right)\right\}_{n=1}^\infty$ is also a Cauchy sequence in $\V$ and it converges to $\lambda\left(\gamma^{\tau+\varsigma}\right)$, by the definition of $\gamma$ to the power of a real number in $[0,\,1]$ we have just made. But also $\left\{\lambda\left(\gamma^{\tau_n+\varsigma_n}\right)\right\}_{n=1}^\infty=\left\{\lambda\left(\gamma^{\tau_n}\,\gamma^{\varsigma_n}\right)\right\}_{n=1}^\infty$ so $\gamma^{\tau+\varsigma} = \lim\limits_{n\to\infty} \left(\gamma^{\tau_n}\,\gamma^{\varsigma_n}\right) = \gamma^\tau\,\gamma^\varsigma$, again by the definition we have just made and the continuity of the group product. Therefore $\tau\mapsto\gamma^\tau$ is a continuous function for all $\tau\in[0,\,1]$ and, moreover, $\gamma^\tau\,\gamma^\varsigma=\gamma^{\tau+\varsigma},\,\forall \tau,\,\varsigma,\,\tau+\varsigma\in[0,\,1]$. So, then, if we inductively define $\gamma^{\pm n + \tau} = \gamma^{\pm n} \gamma^\tau\in\G$ for any $n\in\mathbb{N}$ and $\tau\in[0,\,1]$, we have defined $\gamma^\tau$ for all $\tau\in\R$ and, moreover, the exponential of a sum is the product of the exponentials, for any real numbers.

Furthermore, if $\{\tau_n\}_{n=1}^\infty$ is a Cauchy sequence in $\mathbb{Q}_2$ converging to $\tau\in\R$ and $\varsigma\in \mathbb{Q}_2$, then it is readily shown that $\left(\gamma^{\tau_n}\right)^\varsigma=\left(\gamma^{\varsigma}\right)^{\tau_n}=\gamma^{\varsigma\,\tau_n}$. Arguing as above, from the continuity of $\exp_\gamma$ in $\R$ ($\gamma^\tau$ is the unique continuous function of $\tau\in\R$ that is equal to $\gamma^t$ for $t\in\mathbb{Q}_2$), $\left\{\lambda\left(\gamma^{\varsigma\,\tau_n}\right)\right\}_{n=1}^\infty$ is a Cauchy sequence in $\R^N$ converging to:

\begin{equation}
\label{IndexMultiplicationEquation}
\lambda\left(\gamma^{\varsigma\,\tau}\right)=\lambda\left(\left(\gamma^\varsigma\right)^\tau\right)=\lambda\left(\left(\gamma^\tau\right)^\varsigma\right),\,\forall\tau\in\R,\,\varsigma\in\mathbb{Q}_2
\end{equation}

Lastly, we apply the result above and the reasoning used to derive it to show that the result above generalises to all $\tau,\,\varsigma\in\R$.
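
A sketch of that last step, assuming $\gamma^\tau\in\K$ so that real powers of $\gamma^\tau$ are defined as above: let $\{\varsigma_n\}_{n=1}^\infty\subset\mathbb{Q}_2$ be a Cauchy sequence converging to $\varsigma\in\R$. Then:

\begin{equation}
\left(\gamma^\tau\right)^\varsigma = \lim\limits_{n\to\infty}\left(\gamma^\tau\right)^{\varsigma_n} = \lim\limits_{n\to\infty}\gamma^{\tau\,\varsigma_n} = \gamma^{\tau\,\varsigma}
\end{equation}

The first equality is the definition of a real power, the second is $\eqref{IndexMultiplicationEquation}$ applied with $\varsigma_n\in\mathbb{Q}_2$, and the third is the continuity of $t\mapsto\gamma^t$.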

We gather these thoughts into the next important theorem.

Theorem 6.9 (Continuous One Parameter Group):

There exists a nucleus $\K\subseteq\Nid$ such that, for any $\gamma\in\K$, we can define a unique one parameter group $\{\exp_\gamma(\tau):\tau\in\R\}$ by defining the unique $C^0$ path through $\G$:

\begin{equation}
\label{ContinuousOneParameterGroupTheorem_1}
\exp_\gamma:\R\to\G;\,\exp_\gamma(\tau) = \gamma^\tau
\end{equation}

where:

\begin{equation}
\label{ContinuousOneParameterGroupTheorem_2}
\gamma^\tau = \left(\overbrace{\mathrm{sqrt}\circ\mathrm{sqrt}\circ\cdots\circ\mathrm{sqrt}}^{n\text{ times}}(\gamma)\right)^m
\end{equation}

if $\tau = m\,2^{-n}$ for $n\in\mathbb{N},\,m\in\mathbb{Z}$, i.e. $\tau\in\mathbb{Q}_2$, and $\gamma^\tau = \lim\limits_{n\to\infty} \gamma^{\tau_n}$ if $\{\tau_n\}_{n=1}^\infty\subset\mathbb{Q}_2$ is a sequence converging to $\tau\in\R$.

Members of this one parameter group fulfil the wonted index laws that hold for exponentials in $\R$ and $\mathbb{C}$, to wit:

\begin{equation}
\label{ContinuousOneParameterGroupTheorem_3}
\begin{array}{lclcl}
\gamma^{\tau+\varsigma} &=& \gamma^\tau\,\gamma^\varsigma &=&\gamma^\varsigma\,\gamma^\tau\\
\left(\gamma^\tau\right)^\varsigma &=& \left(\gamma^\varsigma\right)^\tau&=&\gamma^{\tau\,\varsigma}
\end{array}\end{equation}

$\square$
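
As an illustration only, assume the familiar rotation group $SO(2)$ and let $R(\theta)$ stand for the rotation through angle $\theta$ (notation introduced just for this remark). Take $\gamma = R(\theta)$ with $|\theta|$ small. The unique square root near the identity is $\mathrm{sqrt}(R(\theta)) = R(\theta/2)$, so $\eqref{ContinuousOneParameterGroupTheorem_2}$ gives, for example:

\begin{equation}
\gamma^{\frac{3}{4}} = \left(\mathrm{sqrt}\circ\mathrm{sqrt}\,(R(\theta))\right)^3 = R\left(\frac{\theta}{4}\right)^3 = R\left(\frac{3\,\theta}{4}\right)
\end{equation}

Passing to the limit over $\mathbb{Q}_2$ then yields $\gamma^\tau = R(\tau\,\theta)$ for real $\tau$, and the index laws of $\eqref{ContinuousOneParameterGroupTheorem_3}$ reduce to $R(a)\,R(b) = R(a+b)$ in this example.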

So now we have defined a family of one parameter groups in $\G$ for Lie group members $\gamma\in\K$ which are $C^0$ paths through $\G$ and which are also flows: i.e. with the general exponential of a sum being the product of exponentials property spoken of at the beginning of this post. But we have not yet proven that the $C^0$ paths are differentiable or $C^1$.

However, let us now take our fundamental nucleus $\K\subset \Nid$ to be small enough that:

  1. Theorem 6.9 holds for all points $\gamma\in\K$ so that a $C^0$ one parameter group $\{\sigma_\gamma(\tau):\,\tau\in\R\}$ passing through $\gamma = \sigma_\gamma(1)$ can be defined wherein $\sigma_\gamma(x) = \gamma^x,\,\forall\,x\in\mathbb{Q}_2$; and
  2. Theorem 6.2 holds so that $\log: \K\to\g$ is uniquely defined so that $\exp(\log\gamma) = \gamma$.

Then we can find $X\in\g\,\ni\,\exp(X) = \gamma$. Then $\exp(X/2)$ is a square root of $\gamma$, and since the square root is unique within $\K$, we must have $\gamma^{\frac{1}{2}} = \exp(X/2)$. Likewise, on repeated application of this deduction, $\gamma^{2^{-n}} = \exp(2^{-n}\,X)$. Whence by the flow equations holding for both $\exp(X\tau)$ and $\gamma^\tau$, we can deduce $\exp(X\tau) = \gamma^\tau = \exp(X)^\tau,\,\forall\,\tau\in\mathbb{Q}_2$. Lastly, since both the continuous one parameter group $\{\gamma^\tau:\,\tau\in\R\}$ defined by Theorem 6.9 and the set $\{\exp(\tau\,X):\,\tau\in\R\}$ are $C^0$ paths through the points $\gamma^\tau,\,\forall\,\tau\in\mathbb{Q}_2$ and, as discussed in Theorem 6.9, there is but one $C^0$ path through all these points, we must have that $\{\exp(\tau\,X):\,\tau\in\R\}$ is precisely the continuous one parameter group $\{\gamma^\tau:\,\tau\in\R\}$ defined by Theorem 6.9. So therefore, the $C^0$ one parameter group constructed in Theorem 6.9 must be $C^1$ after all and $\exp(\tau\,X) = \exp(X)^\tau,\,\forall\tau\in\R$.

This justifies our notation of $e^{\tau\,X}$.

Now suppose we have a one parameter group defined by the homomorphism $\rho:\R\to\G$ which passes through some translated copy $\gamma\,\Nid$ of the “namespace set” such that $\gamma^{-1}\,\rho(\tau)$ defines a $C^0$ path in $\Nid$ when it passes through the latter set. So let $\rho((s-\tau_0,\,s+\tau_0))$ be the $C^0$ path segment in question. In other words, there is some $\gamma^\prime\in\G$ such that $\gamma^\prime\,\rho(\tau)$ passes through the identity: say $\gamma^\prime\,\rho(s)=\id$, i.e. $\gamma^\prime = \rho(s)^{-1}$ and so $\gamma^\prime = \rho(-s)$. Therefore, $\rho((-\tau_0,\,\tau_0))$ passes through $\id$ at $\tau=0$. Moreover, any $C^0$ path can be constructed as a limit of a set $\{\sigma_k:[-\tau_0,\,\tau_0]\to\Nid\}_{k=0}^\infty$ of $C^1$ paths, which become a set $\{\gamma^\prime\,\sigma_k:[-\tau_0,\,\tau_0]\to\Nid\}_{k=0}^\infty$ of $C^1$ paths passing through the identity at $\tau = 0$, and they too can be shown to converge to a $C^0$ path by Lipschitz continuity of the group product. Therefore, every segment of a one parameter group for which $\gamma^{-1}\,\rho(\tau)$ defines a $C^0$ path in $\Nid$ has corresponding to it a segment passing through $\Nid$ and linking to the identity such that $\rho(\tau)$ is $C^0$ along this translated segment. So, if we choose a nucleus $\K\subset \Nid$ small enough that both Theorem 6.9 and Theorem 6.2 hold therein, the argument above holds and the segment of the one parameter group passing through $\K$ must indeed be $C^1$.

Therefore we have the theorem:

Theorem 6.10 ($\gamma^\tau$ is $C^1$):

The function $\tau\mapsto\gamma^\tau$ for $\tau\in\R$ defined in Theorem 6.9 is indeed $C^1$ wherever $\gamma^\tau\in\K$. All continuous ($C^0$) one parameter groups are of the form $\{e^{\tau\,X} = \gamma^\tau:\,\tau\in\R\}$ where $X=\log\gamma$ and $\gamma\in\K$ and are thus in fact $C^1$. $\quad\square$

We have therefore shown that for every one parameter subgroup $\mathfrak{P}$, $\mathfrak{P}\cap\gamma \K$ for any $\gamma\in\G$ must either be $C^1$ or $\mathfrak{P}\cap\gamma \K$ must be an everywhere discontinuous collection of points. Furthermore, the behaviour is the same for any $\gamma\in\G$, i.e. we cannot have $\mathfrak{P}\cap\gamma \K$ with $C^1$ behaviour for some $\gamma\in\G$ and discontinuous behaviour for other $\gamma\in\G$. Lastly, it will turn out that, in the discontinuous case, $\mathfrak{P}\cap\gamma \K$ must be an uncountable set, for we shall show that every connected Lie group $\G = \bigcup\limits_{n=0}^\infty \K^n$ is indeed covered by at most a countable number of translated copies of $\K$, i.e. $\G = \bigcup\limits_{n=0}^\infty \gamma_n \K$, a construction that I shall call the “Rossmann Snakeskin Construction”.

When Axiom 6 Does Not Hold

Our definition of the exponential function and all results so far have assumed the Full Tangent Space Dimension Axiom 6 has been fulfilled and, as a result, there is some open nucleus $\K\subset \Nid$ such that $\lambda(\K)\subseteq\V$ is an open neighbourhood of $\Or$ in $\R^N$. In such a case any solution to a differential equation of the form $\d_\tau\,\sigma = \sigma\,X(\tau);\,\sigma(0)=\id$, i.e. any $C^1$ path through $\V$ linking to $\id$ must do so through the nucleus $\K$ and therefore there is at least some nonzero length interval $\mathcal{I}=(-\tau_{max},\,\tau_{max})$ with $\tau_{max}>0$ such that $\sigma(\mathcal{I})$ lies within the group.

Suppose now that a putative connected Lie group $\G$ fulfills the first five connected Lie group axioms, but that its Lie algebra $\g$ is not of full dimension (i.e. the Full Tangent Space Dimension Axiom 6 is not fulfilled). Instead, the set of all possible tangents to the identity in $\Nid$ is of some dimension $M<N$, i.e. $\dim(\g)<\dim(\V)$. Then of course we cannot argue as above, because Theorem 3.9 no longer holds (the Jacobian does not have full rank so that the inverse function theorem no longer applies) and so there is no open subset of $\V$ which comprises only group elements. We must now check what happens instead when $M<N$.

Theorem 6.11 (Definition of Exponential Function In Less than Full Dimension Tangent Space):

Suppose now that a putative connected Lie group $\G$ fulfills the first five connected Lie group axioms, but that its Lie algebra $\g$ is not of full dimension (i.e. the Full Tangent Space Dimension Axiom 6 is not fulfilled). Instead, the set of all possible tangents to the identity in $\Nid$ is of some dimension $M<N$, i.e. $\dim(\g)<\dim(\V)$. Then the Cauchy initial value problem $\left.\d_s \sigma(\tau)^{-1}\,\sigma(\tau+s)\right|_{s=0} = X;\,\sigma(0)=\id$, where $X\in\g$ and $\g$ is the $M<N$ dimensional tangent space, has a unique solution $\sigma(\tau)$ within the group $\G$. Thus the exponential function is still soundly defined in the less than full dimension tangent space case.

Proof:

We build a function $\mu$ analogous to the $\mu$ in Theorem 3.9 when $M<N$, i.e. we choose $C^1$ paths $\sigma_j:(-\tau_{max},\,\tau_{max})\to\Nid$ whose tangents $\{\hat{X}_j\}_{j=1}^M$ form a $\g$-basis and then define:

\begin{equation}
\label{ExponentialDefinitionInLessThanFullDimensionTangentSpaceTheorem_1}
\mu: (-\tau_{max},\,\tau_{max})^M\to\Nid;\,\mu(\tau_1,\,\cdots,\,\tau_M) = \prod\limits_{j=1}^M\,\sigma_j(\tau_j)
\end{equation}

Suppose now the $\tau_j$ vary with “time” $\tau$ so that $\tau_j = \tau_j(\tau)$. The variation of $\mu(\tau_1(\tau),\,\cdots,\,\tau_M(\tau))$ with $\tau$ can be shown to beget a variation in the tangent vector such that:

\begin{equation}
\label{ExponentialDefinitionInLessThanFullDimensionTangentSpaceTheorem_2}
\left. \d_s (\mu(\tau_1(\tau),\,\cdots,\,\tau_M(\tau)))^{-1}\,\mu(\tau_1(\tau+s),\,\cdots,\,\tau_M(\tau+s))\right|_{s=0}= \sum_{j=1}^M x_j(\tau)\,\hat{X}_j;\,\mu(\tau_1(0),\,\cdots,\,\tau_M(0))=\id
\end{equation}

where the superposition weights $x_j$ of $\g$-basis members $\hat{X}_j$ in the instantaneous tangent are related to the $\tau_j$ by:

\begin{equation}
\label{ExponentialDefinitionInLessThanFullDimensionTangentSpaceTheorem_3}
\d_\tau \left(\begin{array}{c}\tau_1\\\tau_2\\\vdots\\\tau_M\end{array}\right) = \left(\left(P(1,\,\tau)\right)_1,\,\left(P(2,\,\tau)\right)_2,\,\left(P(3,\,\tau)\right)_3,\,\cdots,\,\left(P(M,\,\tau)\right)_M\right)^{-1}\,\left(\begin{array}{c}x_1(\tau)\\ x_2(\tau)\\\vdots\\x_M(\tau)\end{array}\right)
\end{equation}

where $\left(P(j,\,\tau)\right)_j$ is the $j^{th}$ column of the square matrix $P(j,\,\tau)$ and

\begin{equation}
\label{ExponentialDefinitionInLessThanFullDimensionTangentSpaceTheorem_4}
P(j,\,\tau) = \prod\limits_{k=0}^{M-j}\Ad\left(\sigma_{M-k}(\tau_{M-k}(\tau))^{-1}\right)
\end{equation}

Take heed that, by definition, all possible tangents to the identity of $C^1$ paths through the group are spanned by the $M<N$ vectors $\{\hat{X}_j\}_{j=1}^M$. So $\Ad(\gamma)\,X$ for any $\gamma$ within the group and any tangent to the identity $X\in\mathrm{span}\left(\{\hat{X}_j\}_{j=1}^M\right)$ must also lie within $\mathrm{span}\left(\{\hat{X}_j\}_{j=1}^M\right)$ because $\Ad(\gamma)\,X$ is the tangent to the identity of $\gamma\,\sigma(\tau)\,\gamma^{-1}$, where $\sigma(\tau)$ is a $C^1$ path through the group. Therefore, $\Ad(\gamma):\mathrm{span}\left(\{\hat{X}_j\}_{j=1}^M\right)\to\mathrm{span}\left(\{\hat{X}_j\}_{j=1}^M\right)$ for any $\gamma$ within the group and so $\Ad(\gamma)$ is wholly represented by a nonsingular $M\times M$ matrix.

The calculation in $\eqref{ExponentialDefinitionInLessThanFullDimensionTangentSpaceTheorem_3}$ and $\eqref{ExponentialDefinitionInLessThanFullDimensionTangentSpaceTheorem_4}$ is wholly analogous to the Wei Norman trick for the product of exponentials which calculates $\eqref{AnalyticPathDifferentialEquationTheorem_1}$ in Theorem 5.14. The only difference is that now $\Ad(\sigma_j(\tau_j))\,\hat{X}_j \neq \hat{X}_j$ in general (whereas beforehand we had $\Ad(e^{\tau_j\,\hat{X}_j})\,\hat{X}_j = \hat{X}_j$). So now, by $\eqref{ExponentialDefinitionInLessThanFullDimensionTangentSpaceTheorem_3}$ we can choose $C^1$ variations $\tau_j(\tau)$ in the $\tau_j$ such that the $x_j$ are constant for all $\tau$ in some interval $(-\tau_{max},\,\tau_{max})$ with $\tau_{max}>0$. This is because the inverted matrix in $\eqref{ExponentialDefinitionInLessThanFullDimensionTangentSpaceTheorem_3}$ is equal to the identity at $\tau=0$, its determinant is at least a $C^0$ function of $\tau$ and so the inverted matrix is nonsingular for $\tau$ belonging to some nonzero length interval on either side of $0$. Therefore, notwithstanding the lack of full rank in the tangent space $\g$, we can solve the Cauchy initial value problem stated in the theorem if $X\in\g$ over some nonzero length interval by a $C^1$ path of the form $\sigma(\tau)=\mu(\tau_1(\tau),\,\cdots,\,\tau_M(\tau))$, i.e. by a $C^1$ path within the group. By our Uniqueness Theorem (Theorem 5.6), this solution is unique within the group; it thus exists and is uniquely defined.

$\square$

Take Heed: Since our proof of our Uniqueness Theorem rests on the Peano Existence Theorem, solutions always exist and there may be more than one of them realised as $C^1$ paths in the whole of $\V$ (Recall that the tangent space is not of full rank). But there is at most one of them that lies in the group, by our Uniqueness Theorem; since we have explicitly constructed an entity in the group with the required properties, it is the only one we can find in the group, thus leading to the soundness of definition.

Take Heed 2: The above reasoning works just as well for a time varying $X(\tau)$ where $X$ is a $C^0$ function of time, i.e. to show existence and uniqueness of a $C^1$ path solving the Cauchy initial value problem within the group (although, as above, there may be more than one path solving the problem outside the group). Naturally, for a time varying $X$, the path is no longer described by the exponential function.

Now, since:

  1. We have shown (Theorem 5.15) that two paths which are $C^1$ in Exponential Canonical Co-ordinates of the Second Kind combine, under the group operations, nontrivially to another $C^1$ path in the same co-ordinates;
  2. We have just shown that the exponential is still well defined when $M=\dim(\g)<N=\dim(\V)$,

we have a way to remove the need for the Full Tangent Space Dimension Axiom 6. That is:

Theorem 6.12 (Building of a Full Dimension Lie Group):

Suppose that a putative connected Lie group $\G$ fulfills the first five connected Lie group axioms, but that its Lie algebra $\g$ is not of full dimension (i.e. the Full Tangent Space Dimension Axiom 6 is not fulfilled). Instead, the set of all possible tangents to the identity in $\Nid$ is of some dimension $M<N$, i.e. $\dim(\g)<\dim(\V)$. Then one can define a new nucleus $\K\subseteq\Nid^\prime$, a new $\V^\prime\subseteq\R^M$ of dimension $M$ and a new labeller $\lambda^\prime:\G\to\V^\prime\subseteq\R^M$ such that the putative connected Lie group $\G$ becomes a “true” connected Lie group fulfilling all six axioms when $\K\subseteq\Nid$, $\V$ and $\lambda$ are replaced by $\K\subseteq\Nid^\prime$, $\V^\prime$ and $\lambda^\prime$.

Proof:

Every $C^1$ path $\sigma_\gamma(\tau)$ through $\gamma\in\Nid$ can be mapped, by the Group Product Continuity Axiom 3 and the Nontrivial Continuity Axiom 4, to a $C^1$ path through the identity by $\sigma_\gamma(\tau)\mapsto \gamma^{-1}\,\sigma_\gamma(\tau)$ and likewise every $C^1$ path through the identity can be mapped by the same axioms to a $C^1$ path through $\gamma$; the former assertion shows, through Theorem 3.19, that the dimension $M^\prime$ of the tangent space $\g_\gamma$ to $\gamma\in\Nid$ must be at least as great as that $M$ of the tangent space $\g$ to the identity, i.e. $M\leq M^\prime$; the latter assertion, again through Theorem 3.19, shows that the dimension $M$ of the tangent space $\g$ to the identity must be at least as great as that $M^\prime$ of the tangent space $\g_\gamma$ to $\gamma$, i.e. $M\geq M^\prime$. Hence $M=M^\prime$, so that the dimensions of the tangent spaces to all group elements in $\Nid$ must be the same.

Now, by the Connectedness Axiom 2, there is a $C^1$ path linking the identity to every element $\gamma$ in $\Nid$ and, since the dimension of the tangent space does not change, this $C^1$ path can be described by the Cauchy Initial Value Problem:

\begin{equation}
\label{FullDimensionLieGroupTheorem_1}
\left.\d_s \sigma(\tau)^{-1}\,\sigma(\tau + s)\right|_{s=0} = X(\tau);\,\Leftrightarrow\,\d_\tau \lambda(\sigma) = \mathbf{M}(\id,\,\sigma)\,X(\tau)\text{ and } \sigma(0)=\id
\end{equation}

where, because the dimension of the tangent space does not change with $\sigma$, $X:(-\tau_{max},\,\tau_{max})\to\g$ for some $\tau_{max}>0$ is a vector that varies within the Lie algebra $\g$ and is at least a $C^0$ function of $\tau$ (this means that in the equation on the right-hand side of the $\Leftrightarrow$, $X$ is an $M\times1$ column vector function of “time” $\tau$ and $\mathbf{M}$ is a full rank (i.e. rank $M$), $N\times M$ matrix function of $\tau$). Given $X(\tau)\in\g$, we can apply the same reasoning as in Theorem 6.11, the definition of the exponential function for a less than full dimension tangent space, to show that a $C^1$ path of the form of $\eqref{ExponentialDefinitionInLessThanFullDimensionTangentSpaceTheorem_2}$ in Theorem 6.11:

\begin{equation}
\label{FullDimensionLieGroupTheorem_2}
\sigma: (-\tau_{max},\,\tau_{max})\to\Nid\subseteq\G;\,\sigma(\tau)=\prod\limits_{j=1}^M \sigma_j(\tau_j(\tau))
\end{equation}

fulfills the Cauchy initial value problem of $\eqref{FullDimensionLieGroupTheorem_1}$ above as long as $\gamma$ lies in a small enough nucleus $\K\subseteq\Nid$ that the matrix inversion of Theorem 6.11, $\eqref{ExponentialDefinitionInLessThanFullDimensionTangentSpaceTheorem_3}$ exists. Certainly, this path $\sigma$ lies inside the group $\G$ and so, by our Uniqueness Theorem, it is the only path within the group solving the Cauchy initial value problem. Here, as in Theorem 6.11, $\sigma_j$ are $C^1$ paths passing through the identity with tangent $\hat{X}_j$ there, respectively, such that $\{\hat{X}_j\}_{j=1}^M\subset\g$ is a $\g$-basis and the $\tau_j:(-\tau_{max},\,\tau_{max})\to\R$ are $C^0$ functions of time defined by solving $\eqref{ExponentialDefinitionInLessThanFullDimensionTangentSpaceTheorem_3}$ in Theorem 6.11.

Now we can just as well replace the $\sigma_j$ with the exponential paths $e^{\tau\,\hat{X}_j}$ and we shall still be able to represent any $\gamma$ inside a small enough nucleus within the group $\K\subseteq\Nid$ as a product of the form $\prod\limits_{j=1}^M e^{\tau_j\,\hat{X}_j}$. Therefore, having seen that (Theorem 5.15) group operations are $C^1$ on two paths of the form $\prod\limits_{j=1}^M \sigma_j(\tau_j(\tau))$, it follows that if we replace $\Nid$ by the nucleus $\K\subseteq\Nid$, redefine $\lambda\left(\prod\limits_{j=1}^M e^{\tau_j\,\hat{X}_j}\right) = (\tau_1,\,\cdots,\,\tau_M)$ and redefine $\V = (-\tau_{max},\,\tau_{max})^M$, then the group $\G$ with these definitions fulfills all the fundamental Lie group axioms, including the Full Tangent Space Dimension Axiom 6, this time with $\dim(\g) = \dim(\V) = M$.

$\square$

The above theorem actually gives us a procedure to define the full Lie group from a mathematical system fulfilling only the first five axioms. It is simply the following:

Theorem 6.13 (Procedure for Building a Connected Lie Group With Full Tangent Space Dimension):

Suppose we have a putative connected Lie group $\G$ fulfilling the first five connected Lie group axioms but not the sixth as in Theorem 6.12. Then the following procedure defines a full connected Lie group structure for $\G$:

  1. Choose a basis $\{\hat{X}_j\}_{j=1}^M$ for the space of all possible tangents to the identity;
  2. Define:
    \begin{equation}
    \label{FullDimensionTangentSpaceGroupBuildingProcedureTheorem_1}\Nid^\prime=\left\{\prod\limits_{j=1}^M e^{\tau_j\,\hat{X}_j}:\,\tau_j\in (-\tau_{max},\,\tau_{max})\right\}
    \end{equation}
  3. Define $\V^\prime = (-\tau_{max},\,\tau_{max})^M$ and
    \begin{equation}
    \label{FullDimensionTangentSpaceGroupBuildingProcedureTheorem_2}
    \lambda^\prime:\Nid^\prime\to\V^\prime;\,\lambda^\prime\left(\prod\limits_{j=1}^M e^{\tau_j\,\hat{X}_j}\right) = (\tau_1,\,\cdots,\,\tau_M)
    \end{equation}
    for any $\tau_{max}>0$ small enough that the co-ordinates are unique, i.e. small enough that:
    \begin{equation}
    \label{FullDimensionTangentSpaceGroupBuildingProcedureTheorem_3}
    \det\left(\left(e^{-\tau_M\,\ad(\hat{X}_M)}\,\cdots\,e^{-\tau_2\,\ad(\hat{X}_2)}\right)_1,\,\left(e^{-\tau_M\,\ad(\hat{X}_M)}\,\cdots\,e^{-\tau_3\,\ad(\hat{X}_3)}\right)_2,\,\cdots,\,\left(e^{-\tau_M\,\ad(\hat{X}_M)}\right)_{M-1},\,\begin{array}{c}0\\0\\\vdots\\0\\1\end{array}\right) > 0\,\forall\,\tau_j\in(-\tau_{max},\,\tau_{max})
    \end{equation}

Then $\G = \bigcup\limits_{k=1}^\infty \left(\Nid^\prime\right)^k$ together with the namespace set in $\eqref{FullDimensionTangentSpaceGroupBuildingProcedureTheorem_1}$, the new $\V^\prime$ and the labeller map in $\eqref{FullDimensionTangentSpaceGroupBuildingProcedureTheorem_2}$ is a connected Lie group of dimension $M$, fulfilling all six connected Lie group axioms. $\quad\square$

Since we have seen that group operations on geodesic co-ordinates are also at least $C^1$, there is also a direct analogue of the above procedure in geodesic co-ordinates for defining the full dimension Lie group as follows:

Theorem 6.14 (Procedure for Building a Connected Lie Group With Full Tangent Space Dimension):

Suppose we have a putative connected Lie group $\G$ fulfilling the first five connected Lie group axioms but not the sixth as in Theorem 6.12. Then the following procedure defines a full connected Lie group structure for $\G$:

  1. Choose a basis $\{\hat{X}_j\}_{j=1}^M$ for the space of all possible tangents to the identity;
  2. Define:
    \begin{equation}
    \label{GeodesicCoordinatesFullDimensionTangentSpaceGroupBuildingProcedureTheorem_1}
    \Nid^\prime=\left\{\exp\left(\sum\limits_{j=1}^M x_j\,\hat{X}_j\right):\,x_j\in (-x_{max},\,x_{max})\right\}
    \end{equation}
  3. Define $\V^\prime = (-x_{max},\,x_{max})^M$ and
    \begin{equation}
    \label{GeodesicCoordinatesFullDimensionTangentSpaceGroupBuildingProcedureTheorem_2}
    \lambda^\prime:\Nid^\prime\to\V^\prime;\,\lambda^\prime\left(\exp\left(\sum\limits_{j=1}^M x_j\,\hat{X}_j\right)\right) = (x_1,\,\cdots,\,x_M)
    \end{equation}
    for any $x_{max}>0$ small enough that the co-ordinates are unique, i.e. small enough that:
    \begin{equation}
    \label{GeodesicCoordinatesFullDimensionTangentSpaceGroupBuildingProcedureTheorem_3}
    \det\left(\sum\limits_{k=0}^\infty \frac{(-1)^k\,\ad\left(\sum\limits_{j=1}^M x_j\,\hat{X}_j\right)^k}{(k+1)!} \right) > 0\,\forall\,x_j\in(-x_{max},\,x_{max})
    \end{equation}

Then $\G = \bigcup\limits_{k=1}^\infty \left(\Nid^\prime\right)^k$ together with the namespace set in $\eqref{GeodesicCoordinatesFullDimensionTangentSpaceGroupBuildingProcedureTheorem_1}$, the new $\V^\prime$ and the labeller map in $\eqref{GeodesicCoordinatesFullDimensionTangentSpaceGroupBuildingProcedureTheorem_2}$ is a connected Lie group of dimension $M$, fulfilling all six connected Lie group axioms. $\quad\square$
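
As an aid to checking $\eqref{GeodesicCoordinatesFullDimensionTangentSpaceGroupBuildingProcedureTheorem_3}$, note that the series inside the determinant is the power series of $(1-e^{-z})/z$ evaluated at $z=\ad(X)$, where $X=\sum\limits_{j=1}^M x_j\,\hat{X}_j$. Writing $\mu_1,\,\cdots,\,\mu_M$ for the complex eigenvalues of $\ad(X)$ acting on $\g$ (symbols introduced for this remark only), the determinant is therefore:

\begin{equation}
\det\left(\sum\limits_{k=0}^\infty \frac{(-1)^k\,\ad(X)^k}{(k+1)!}\right) = \prod\limits_{i=1}^M \frac{1-e^{-\mu_i}}{\mu_i}
\end{equation}

where a factor is read as $1$ when $\mu_i=0$. The product vanishes only when some $\mu_i$ is a nonzero integer multiple of $2\,\pi\,i$. At $X=\Or$ every $\mu_i$ is nought and the determinant is $1$, so by continuity the positivity condition holds for all small enough $x_{max}$.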

These procedures can be thought of as a grounding for the definition of a linear group in Chapters 1 and 2 of [Rossmann] as any subgroup of the general linear matrix group $GL(N,\R)$ with a nontrivial space of tangents to the identity.

Further Material:

This ends the discussion of the exponential function. However, I shall now show explicit constructions of the exponential and logarithm functions in the spirit of Henry Briggs and John Napier. Unfortunately these constructions do not always work. It is readily shown, through the construction of exponential canonical co-ordinates of the second kind and of geodesic co-ordinates, that the square and square root functions can both be made $C^\infty$ in the co-ordinates. One simply goes back to the differential equations $\d_\tau \sigma^2 = \sigma^2\,X(\tau)$ and $\d_\tau \sigma = \sigma\,X(\tau)$ and, using the analytic expressions for the $\mathbf{M}(1,\,\sigma)$ found for the two co-ordinate systems just named, proves that the implicit transformation $\sigma\mapsto\sigma^2$ and its inverse have derivatives of all orders. However, this behaviour is not true for all co-ordinate systems, so the following proofs only work when stronger assumptions are made about the group product. They cannot therefore be used to motivate fundamental definitions of the exponential and logarithm functions grounded only on our axioms. The fact that we can find $C^\infty$ co-ordinates means that there are co-ordinates that will fulfill the assumptions made below.

What the possible nonconvergence in the non-Hölder continuous derivative case means is that the following method for calculating a logarithm will not work in all co-ordinate systems. This would be an important fact, say, in a numerical setting.

Firstly, for Briggs’s method of constructing the logarithm, we shall have to assume stronger continuity conditions in our basic Group Product Continuity Axiom 3 and Nontrivial Continuity Axiom 4 with $C^1$ replaced by the stronger, Hölder continuity condition $C^{1,\alpha},\,\alpha > 0$. When we do this, the paths $\exp_\gamma(\tau)$ are $C^1$.

Definition 6.15 (Hölder Continuity):

A function $f:\U\subseteq\R^n\to\R^m$ is called Hölder continuous over $\U$ iff:

\begin{equation}
\label{HoelderCriterion}
\exists\,\alpha,\,K>0\,\ni\,\left\|f(X_1)-f(X_2)\right\|\leq K\,\left\|X_1-X_2\right\|^\alpha,\,\forall\,X_1,\,X_2\in\U
\end{equation}

Lipschitz continuity is the special case $\alpha = 1$. The square root function $x\mapsto\sqrt{x}$ defined on the nonnegative reals is neither Lipschitz continuous nor $C^1$ at $x=0$, but it is Hölder continuous with exponent $\alpha=\frac{1}{2}$.
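
The claim about the square root can be checked directly; the following short verification is added for illustration and uses only the criterion $\eqref{HoelderCriterion}$. For $x\geq y\geq 0$ we have $\sqrt{x}-\sqrt{y}\leq\sqrt{x}+\sqrt{y}$, so that:

\begin{equation}
\left(\sqrt{x}-\sqrt{y}\right)^2\leq\left(\sqrt{x}-\sqrt{y}\right)\left(\sqrt{x}+\sqrt{y}\right) = x-y\quad\Rightarrow\quad\left|\sqrt{x}-\sqrt{y}\right|\leq\left|x-y\right|^{\frac{1}{2}}
\end{equation}

which is the Hölder criterion with $K=1$ and $\alpha=\frac{1}{2}$. On the other hand, $\left|\sqrt{x}-\sqrt{0}\right|/\left|x-0\right| = x^{-\frac{1}{2}}\to\infty$ as $x\to0^+$, so no constant $K$ can serve with $\alpha=1$.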

Figure 6.2 shows the general idea of how we are going to try to calculate $\left.\d_\tau \gamma^\tau\right|_{\tau=0}$. By the fundamental flow property we have:

\begin{equation}
\label{ExponentialDerivativeEquation}\left.\d_\tau \gamma^{\varsigma+\tau}\right|_{\tau=0}=\lim\limits_{\delta\to0} \delta^{-1}\,\left(\lambda(\gamma^\varsigma\,\gamma^\delta)-\lambda(\gamma^\varsigma)\right) = \mathbf{M}(\id,\,\gamma^\varsigma) \lim\limits_{\delta\to0} \delta^{-1}\,\lambda(\gamma^\delta)
\end{equation}

if indeed the limit exists.
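
For orientation, here is the scalar case worked through. The group $(\R^+,\,\times)$ and the co-ordinate $\lambda(\gamma)=\gamma-1$ are assumptions made purely for this illustration. Here $\lambda(\gamma^\varsigma\,\gamma^\delta)-\lambda(\gamma^\varsigma) = \gamma^\varsigma\left(\gamma^\delta-1\right) = \gamma^\varsigma\,\lambda(\gamma^\delta)$, so $\mathbf{M}(\id,\,\gamma^\varsigma)$ acts as multiplication by $\gamma^\varsigma$ (in this example exactly, not merely to first order) and:

\begin{equation}
\left.\d_\tau \gamma^{\varsigma+\tau}\right|_{\tau=0} = \gamma^\varsigma\,\lim\limits_{\delta\to0}\delta^{-1}\left(\gamma^\delta-1\right) = \gamma^\varsigma\,\ln\gamma
\end{equation}

which is the familiar derivative of $\tau\mapsto\gamma^{\varsigma+\tau}$ for a positive real number $\gamma$.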

Figure 6.2: Estimating the Tangent of $\exp_\gamma(\tau)$ at the Identity

Therefore, we only need to prove $\lim\limits_{\delta\to0} \delta^{-1}\,\lambda(\gamma^\delta)$ exists and we shall have proven $\left.\d_\tau \gamma^{\varsigma+\tau}\right|_{\tau=0}$ exists for any value of $\varsigma$ such that $\gamma^\varsigma\in\Nid$. Moreover, $\mathbf{M}(\id,\,\gamma^\varsigma)$ is a $C^0$ function of the co-ordinates $\lambda(\gamma^\varsigma)$ and $\lambda(\gamma^\varsigma)$ in turn is a $C^0$ function of $\varsigma$, as we have shown above in Theorem 6.9, so, the composition of two $C^0$ functions being $C^0$, the derivative $\left.\d_\tau \gamma^{\varsigma+\tau}\right|_{\tau=0} = \mathbf{M}(\id,\,\gamma^\varsigma)\,\left.\d_\tau \gamma^\tau\right|_{\tau=0}$ is a $C^0$ function of $\varsigma$, as long as we can prove $\left.\d_\tau \gamma^\tau\right|_{\tau=0}$ exists. We gather these thoughts into the theorem:

Theorem 6.16 ($\exp_\gamma$ is $C^1$):

In a connected Lie group with the basic Group Product Continuity Axiom 3 and Nontrivial Continuity Axiom 4 strengthened so that the preservation of $C^1$ paths by the group product is replaced by the stronger preservation of Hölder continuity $C^{1,\alpha},\,\alpha > 0$ of paths, there exists a nucleus $\K\subseteq\Nid$ such that, for any $\gamma\in\K$, the path:

\begin{equation}
\label{ExpIsC1Theorem_1}\exp_\gamma:\R\to\G;\,\exp_\gamma(\tau) = \gamma^\tau
\end{equation}

is a $C^1$ path through $\Nid$.

Proof:

By the discussion above it is only left to prove that $\left.\d_\tau \gamma^\tau\right|_{\tau=0}$ exists. We estimate the tangent as:

\begin{equation}
\label{ExpIsC1Theorem_2}
D_n = 2^n \lambda\left(\overbrace{\mathrm{sqrt}\circ\mathrm{sqrt}\circ\cdots\circ\mathrm{sqrt}}^{\text{n times}}(\gamma)\right)
\end{equation}

We calculate $D_n$ by the following recurrence:

\begin{equation}
\label{ExpIsC1Theorem_3}
\begin{array}{rlcl}
&\lambda(\gamma^{2^{-n}}) &=& \lambda\left(\gamma^{2^{-(n+1)}}\,\gamma^{2^{-(n+1)}}\right)\\
&&=& \left(2\, \id_N + \mathscr{D}_2(\gamma^{2^{-(n+1)}})\right) \lambda(\gamma^{2^{-(n+1)}})\\
\Leftrightarrow&\lambda(\gamma^{2^{-(n+1)}})&=&\left(2\, \id_N + \mathscr{D}_2(\gamma^{2^{-(n+1)}})\right)^{-1}\lambda(\gamma^{2^{-n}})\\ \\\\
\Rightarrow&D_n&=& 2^n \lambda(\gamma^{2^{-n}})\\
&&=& \prod\limits_{j=1}^n \left(\id_N + \frac{1}{2}\mathscr{D}_2(\gamma^{2^{-(j+1)}})\right)^{-1}\,\lambda(\gamma)
\end{array}
\end{equation}

Take heed that we have written the above in terms of the derivative of $\mathrm{sqr}$, not in terms of the derivative of $\mathrm{sqrt}$. This method allows us to exploit the stronger Hölder continuity of the first derivative of the group product that shows up in the square $\sigma\mapsto\sigma\,\sigma$: by reasoning as in Lemma 3.17 with $C^1$ replaced by the Hölder continuous derivative $C^{1,\alpha}$ with exponent $\alpha > 0$, we see that $\left\|\mathscr{D}_2(\sigma)\right\|\leq K \left\|\lambda(\sigma)\right\|^\alpha$. Thus:

\begin{equation}
\label{ExpIsC1Theorem_4}
\begin{array}{lcl}
\left\|\mathscr{D}_2(\gamma^{2^{-(n+1)}})\right\|&\leq &K \left\|\lambda(\gamma^{2^{-(n+1)}})\right\|^\alpha\leq\mathscr{R}(n+1,\,\alpha);\\
\mathscr{R}(j,\,\alpha)&\stackrel{def}{=}&K \left\|\lambda(\gamma)\right\|^\alpha\,\left(\frac{1}{2}+\epsilon_2\right)^{j\,\alpha}
\end{array}
\end{equation}

where we assume that $\gamma\in\K$ belongs to a nucleus small enough that $\left\|\lambda(\mathrm{sqrt}(\gamma))\right\|\leq \left(\frac{1}{2} + \epsilon_2\right)\,\left\|\lambda(\gamma)\right\|$ and that $\epsilon_2<\frac{1}{2}$.
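
These quantities can be written out in full in a scalar illustration. The group $(\R^+,\,\times)$ and the co-ordinate $\lambda(\sigma)=\sigma-1$ are assumptions made for this example only. Squaring gives:

\begin{equation}
\lambda(\sigma^2) = \left(1+\lambda(\sigma)\right)^2-1 = \left(2+\lambda(\sigma)\right)\,\lambda(\sigma)
\end{equation}

so that $\mathscr{D}_2(\sigma)=\lambda(\sigma)$ and the bound $\left\|\mathscr{D}_2(\sigma)\right\|\leq K \left\|\lambda(\sigma)\right\|^\alpha$ holds with $K=1$ and $\alpha=1$. Writing $u=\lambda(\mathrm{sqrt}(\gamma))$, the same identity reads $u = \lambda(\gamma)/(2+u)$, whence $\left|u\right|\leq\left(\frac{1}{2}+\epsilon_2\right)\left|\lambda(\gamma)\right|$ as soon as $u\geq-4\,\epsilon_2/(1+2\,\epsilon_2)$, i.e. as soon as $\gamma$ is near enough to the identity.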

So now we must prove that the infinite product in $\eqref{ExpIsC1Theorem_3}$ converges. To do this, let us study $\left\|D_n-D_m\right\|$ and thus prove that $\{D_n\}_{n=1}^\infty\subset\V$ is a Cauchy sequence in $\R^N$ and therefore converges. In the following we assume, without loss of generalness, that $m<n$, and also that $m$ is big enough that $\mathscr{R}(m,\,\alpha) < 1$ so that $1/(1-\frac{1}{2}\mathscr{R}(m,\,\alpha)) \leq 1+ \mathscr{R}(m,\,\alpha)$:

\begin{equation}
\label{ExpIsC1Theorem_5}
\begin{array}{lcl}
\left\|D_n-D_m\right\| &=&\left\|\prod\limits_{j=1}^m \left(\id_N + \frac{1}{2}\mathscr{D}_2(\gamma^{2^{-(j+1)}})\right)^{-1}\left(\id_N-\prod\limits_{j=m}^n \left(\id_N + \frac{1}{2}\mathscr{D}_2(\gamma^{2^{-(j+1)}})\right)^{-1}\right)\lambda(\gamma)\right\|\\
&\leq&\left\|\prod\limits_{j=1}^m \left(\id_N + \frac{1}{2}\mathscr{D}_2(\gamma^{2^{-(j+1)}})\right)^{-1}\right\|\;\left\|\left(\id_N-\prod\limits_{j=m}^n \left(\id_N + \frac{1}{2}\mathscr{D}_2(\gamma^{2^{-(j+1)}})\right)^{-1}\right)\right\|\,\left\|\lambda(\gamma)\right\|\\
&\leq&\left|\frac{1}{\prod\limits_{j=1}^m \left(1 - \left\|\frac{1}{2}\mathscr{D}_2(\gamma^{2^{-(j+1)}})\right\|\right)}\right|\;\left|1 - \frac{1}{\prod\limits_{j=m}^n \left(1 - \left\|\frac{1}{2}\mathscr{D}_2(\gamma^{2^{-(j+1)}})\right\|\right)}\right|\,\left\|\lambda(\gamma)\right\|\\
&\leq&\left|\prod\limits_{j=1}^m \left(1 + \mathscr{R}(j+1,\,\alpha)\right)\right|\;\left|\prod\limits_{j=m}^n \left(1 + \mathscr{R}(j+1,\,\alpha)\right)-1\right|\,\left\|\lambda(\gamma)\right\|
\end{array}
\end{equation}

The first factor $\prod\limits_{j=1}^m \left(1 + \mathscr{R}(j+1,\,\alpha)\right)< \prod\limits_{j=1}^\infty \left(1 + \mathscr{R}(j+1,\,\alpha)\right)$ is bounded by an infinite product of terms of the form $1 + \mathscr{R}(j+1,\,\alpha)$. Since the $\mathscr{R}(j+1,\,\alpha)$ are all positive, this infinite product converges if and only if $\sum\limits_{j=1}^\infty \mathscr{R}(j+1,\,\alpha)$ converges. This latter sum, by $\eqref{ExpIsC1Theorem_4}$, is a convergent geometric series, so let us write:

\begin{equation}
\label{ExpIsC1Theorem_6}
\prod\limits_{j=1}^\infty \left(1 + \mathscr{R}(j+1,\,\alpha)\right)=\mathscr{C}
\end{equation}

a bounded, positive number. Likewise:

\begin{equation}
\label{ExpIsC1Theorem_7}
\prod\limits_{j=m}^n \left(1 + \mathscr{R}(j+1,\,\alpha)\right)<\exp\left(\sum\limits_{j=m}^n\mathscr{R}(j+1,\,\alpha)\right)<\exp\left(\sum\limits_{j=m}^\infty\mathscr{R}(j+1,\,\alpha)\right)
\end{equation}

so that:

\begin{equation}
\label{ExpIsC1Theorem_8}
\begin{array}{lcl}
\left\|D_n-D_m\right\|&<&\mathscr{C}\, \left(\exp\left(\sum\limits_{j=m}^\infty\mathscr{R}(j+1,\,\alpha)\right)-1\right)\,\left\|\lambda(\gamma)\right\|
\\&=&\mathscr{C}\, \left(\exp\left(K\,\left\|\lambda(\gamma)\right\|^\alpha\,\frac{\left(\frac{1}{2}+\epsilon_2\right)^{\alpha\,(m+1)}}{1-\left(\frac{1}{2}+\epsilon_2\right)^\alpha}\right)-1\right)\,\left\|\lambda(\gamma)\right\|
\end{array}
\end{equation}

The argument of $\exp(\cdot)$ dwindles exponentially to nought as $m\to\infty$ as long as $\alpha>0$, thus for every $\epsilon>0$ we can find $M$ such that $\left\|D_n-D_m\right\|<\epsilon$ for $M < m < n$. Therefore the infinite product in $\eqref{ExpIsC1Theorem_3}$ converges by completeness of $\R^N$ to a well-defined derivative for any $\gamma\in\K$. $\quad\square$

The above theorem exhibits a sequence of functions $D_n:\K\to\g$ that take as input any member $\gamma\in\K$ and output an estimate $D_n(\gamma) = 2^n\,\lambda\left(\gamma^{2^{-n}}\right)$ of the tangent to the $C^1$ path $\exp_\gamma:\R\to\G;\,\exp_\gamma(\tau) = \gamma^\tau$, whose existence and $C^1$ nature we have shown. These functions converge to that tangent, as we have also just shown. We call the limiting function the logarithm:

\begin{equation}\label{LogDefinitionEquation}\log:\K\to\g;\,\log\gamma = \lim\limits_{n\to\infty} 2^n\,\lambda\left(\gamma^{\frac{1}{2^n}}\right)\end{equation}

because:

  1. $\log\left(\gamma^\tau\right) = \tau \log\gamma\,\forall \,\gamma\in\K,\,\tau\in[0,\,1]$; and
  2. $\log(\gamma_1\,\gamma_2) = \log\gamma_1+\log\gamma_2,\,\forall\,\gamma_1,\,\gamma_2,\,\gamma_1\,\gamma_2\in \{\gamma^\tau:\,\tau\in[0,\,1]\}\subset\K$

These results follow from our result that $\gamma^\tau$ defines a $C^1$ path through the identity. If $X =\log \gamma$ then this means that $\left.\d_\tau \lambda\left(\gamma^\tau\right)\right|_{\tau=0}=X\in\g$. Then, from the index laws in Theorem 6.9, we get $\left(\gamma^\varsigma\right)^\tau = \gamma^{\varsigma\,\tau}$, so that $\left.\d_\tau \lambda\left(\left(\gamma^\varsigma\right)^\tau\right)\right|_{\tau=0}=\left.\d_\chi \lambda\left(\gamma^\chi \right)\right|_{\chi=0}\,\cdot\, \d_\tau \chi= \varsigma\,X$ where $\chi = \varsigma\,\tau$. For the second relationship, we put $\gamma_1 = \gamma^{\tau_1},\,\gamma_2 = \gamma^{\tau_2}$ for $\tau_1,\,\tau_2,\,\tau_1+\tau_2\in[0,\,1]$. Then:

\begin{equation}\label{LogPropertiesEquation_1}\log(\gamma_1\,\gamma_2)\stackrel{def}{=}\left.\d_\tau \lambda\left(\left(\gamma_1\,\gamma_2\right)^\tau\right)\right|_{\tau=0} = \left.\d_\tau \lambda\left(\gamma^{\left(\tau_1+\tau_2\right)\,\tau}\right)\right|_{\tau=0} = (\tau_1+\tau_2)\,X\end{equation}

and, since, by the first result, $\log \gamma_j = \log \gamma^{\tau_j} = \tau_j\,\log\gamma,\, j=1,\,2$ the second result follows. These results in particular mean that $\log$ maps the set $\{\gamma^\tau:\,\tau\in[-1,\,1]\}$ to the partial ray $\{\tau\,X:\,\tau\in[-1,\,1]\}\subset\g$ through the Lie algebra $\g$. By Lemma 3.12, we can assume that $\K$ is a symmetric nucleus at the outset. Hence we can extend our definition of $\tau\mapsto\gamma^\tau$ from $\tau\in[0,\,1]$ to $\tau\in[-1,\,1]$, when we (obviously) define $\gamma^{-\tau}=\left(\gamma^\tau\right)^{-1}$. These simple discussions lead to:

Theorem 6.17 (Logarithm and Its Properties):

There exists a nucleus $\K\subseteq\Nid$ such that the function:

\begin{equation}
\label{LogarithmPropertiesTheorem_1}
\begin{array}{lcl}
\log:\K\to\g;\,\log\gamma &=& \lim\limits_{n\to\infty} \left(2^n\,\lambda\left(\gamma^{\left(\frac{1}{2}\right)^n}\right) \right)\\
&=& \lim\limits_{n\to\infty} \left(2^n\,\lambda\left(\overbrace{\mathrm{sqrt}\circ\mathrm{sqrt}\circ\cdots\circ\mathrm{sqrt}}^{\text{n times}}(\gamma)\right)\right)
\end{array}
\end{equation}

is well defined and:

  1. $\log \gamma$ equals the tangent $X\in\g$ to the $C^1$ path $\gamma^\tau$ for $\tau\in[-1,\,1]$ at the identity $\id=\gamma^0$;
  2. The logarithm fulfills the “slide rule isomorphism” law $\log\left(\eta\,\zeta\right) = \log\eta+\log\zeta$ for any $\eta,\,\zeta\in \{\gamma^\tau:\,\tau\in[-1,\,1]\}\subset\K$ such that $\eta\,\zeta\in \{\gamma^\tau:\,\tau\in[-1,\,1]\}$ as well; and
  3. The logarithm fulfills $\log\left(\zeta^\varsigma\right)=\varsigma\log\zeta$ for any $\zeta\in \{\gamma^\tau:\,\tau\in[-1,\,1]\}\subset\K$ such that $\zeta^\varsigma\in \{\gamma^\tau:\,\tau\in[-1,\,1]\}$ as well.

$\square$

So now we have just seen a generalisation of the Briggs procedure for calculating logarithms of real numbers to the calculation of the corresponding quantity in a connected Lie group.
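
The procedure is easy to try out numerically. The following is a minimal sketch under stated assumptions, not a general implementation: the group is taken to be the nonzero complex numbers under multiplication (which contains $U(1)$), the co-ordinate is $\lambda(\gamma)=\gamma-1$, and $\mathrm{sqrt}$ is the principal complex square root, which is the root near the identity when $\gamma$ is. The function name `briggs_log` is hypothetical.

```python
import cmath


def briggs_log(gamma: complex, n: int = 20) -> complex:
    """Estimate D_n = 2^n * lambda(gamma^(2^-n)), with lambda(g) = g - 1."""
    root = complex(gamma)
    for _ in range(n):
        # Principal square root: stays near 1 when the input is near 1.
        root = cmath.sqrt(root)
    # Scale the co-ordinate of the n-fold square root back up by 2^n.
    return (2 ** n) * (root - 1)


if __name__ == "__main__":
    gamma = cmath.exp(0.5j)  # a point of U(1) near the identity
    for n in (5, 10, 20):
        # The estimates should approach the tangent 0.5j as n grows.
        print(n, briggs_log(gamma, n))
```

In floating point, $n$ cannot be taken arbitrarily large: the subtraction `root - 1` loses digits as the root nears the identity, and multiplying by $2^n$ magnifies that rounding error, so a moderate $n$ gives the best estimate.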

An “inverse” construction can be made for the Naperian definition of the exponential function, but now we must assume $C^{1,1}$ instead of $C^1$ in our basic axioms. Namely, we take any $C^{1,1}$ path $\sigma_X$ with tangent $X$ at the identity and we form the following iterates: $\sigma_X(1),\,\sigma_X\left(\frac{1}{2}\right)^2,\,\sigma_X\left(\frac{1}{4}\right)^4\,\cdots$. So we iterate $\sigma\mapsto(\mathrm{sqrt}(\sigma))^2$ and this procedure is the equivalent of calculating the sequence $\left\{\left(1+\frac{x}{2^n}\right)^{2^n}\right\}_{n=1}^\infty$ to calculate the exponential function of the real number $x$. As stated, convergence can only be ensured by assuming a $C^{1,1}$ (Lipschitz continuous first derivatives) group product, which we shall show, through the Campbell Baker Hausdorff (CBH) theorem, can be achieved with geodesic co-ordinates. Since none of the following is used in the derivation of the CBH theorem, there is no begging of the question here.
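
The iterates just described can likewise be sketched numerically. The following minimal example rests on assumptions made only for illustration: the group is the nonzero complex numbers under multiplication, and the path is the straight line $\sigma_X(\tau) = 1+\tau\,X$, which has tangent $X$ at the identity. The function name `napier_exp` is hypothetical.

```python
import cmath


def napier_exp(x: complex, n: int = 20) -> complex:
    """Estimate exp(x) as sigma_x(2^-n)^(2^n), with sigma_x(t) = 1 + t*x."""
    # Start at sigma_x(2^-n), a point close to the identity.
    value = 1 + complex(x) / (2 ** n)
    for _ in range(n):
        # Each squaring doubles the exponent; n squarings give the power 2^n.
        value = value * value
    return value


if __name__ == "__main__":
    x = 0.5j
    for n in (5, 10, 20):
        # The estimates should approach cmath.exp(x) as n grows.
        print(n, napier_exp(x, n), cmath.exp(x))
```

Squaring $n$ times rather than multiplying $2^n$ times mirrors the iteration $\sigma_X(1),\,\sigma_X\left(\frac{1}{2}\right)^2,\,\sigma_X\left(\frac{1}{4}\right)^4,\,\cdots$ and keeps the cost linear in $n$.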

Although not general, the Briggsian and Naperian approaches above work perfectly for the class of Lie groups studied by von Neumann [von Neumann, 1929]. Here we restrict ourselves to matrix Lie groups which are closed subgroups of $GL(M,\,\R)$, where $M$ is big enough for the matrix Lie group in question to be contained in $GL(M,\,\R)$. In $GL(M,\,\R)$, we have seen in Example 1.9 and Example 1.8 (note that $GL(M,\,\mathbb{C})\subset GL(2\,M,\,\R)$) that $\lambda=\log$ as defined by the convergent logarithm Taylor series and $\Nid = \exp(\U)$ where $\U$ is a neighbourhood of $\Or$ in the topology induced by the Frobenius norm and $\U$ is small enough to make the logarithm Taylor series converge. It is easy to show that in $\Nid$ square roots are unique, so that the sequence of iterated square roots $\left\{\gamma^{2^{-k}}\right\}$ can be studied as above. Then:

Theorem 6.18: (Exponentials in Closed Subgroups of $GL(M,\,\R)$)

Let $X$ be the tangent at the identity of any $C^1$ path through a closed subgroup $\G\subset GL(M,\,\R)$. Then $e^{\tau\,X}\in\G,\,\forall\,\tau\in\R$.

Proof: If $X$ is a tangent to $\sigma:\R\to\G$ at the identity, then $s\,X$ is a tangent to the path $\sigma_s:\R\to\G;\,\sigma_s(\tau) = \sigma(s\,\tau)$, simply by matrix differentiation. Choose $s$ small enough that $\sigma_s(1)\in\Nid$, where the matrix logarithm series converges at all points in $\Nid$. Then $\sigma_s(1/k)\in\G,\,\forall\,k\in\mathbb{N}$ so $(\sigma_s(1/k))^k\in\G,\,\forall\,k\in\mathbb{N}$. But $\log \sigma_s(1/k)$ is defined by the convergent logarithm series for all $k\in\mathbb{N}$, and, since $\sigma_s$ is $C^1$, we have $\sigma_s(1/k) = \id + s\,X/k + \mathscr{D}(k)/k$ where $\mathscr{D}(k)\to0$ as $k\to\infty$, whence, by the logarithm series, $k\,\log\,\sigma_s(1/k)\to s\,X$ as $k\to\infty$. But $\log$ and $\exp$ are continuous, therefore $\exp(k\,\log\sigma_s(1/k))$, whose terms are exponentials of a Cauchy sequence, is Cauchy with respect to e.g. the Frobenius norm. Moreover $\exp(k\,\log\sigma_s(1/k)) = (\sigma_s(1/k))^k\in\G,\,\forall\,k\in\mathbb{N}$, so, by assumption of the closedness of $\G$ in $GL(M,\,\R)$, $\lim\limits_{k\to\infty}\exp(k\,\log\sigma_s(1/k))=e^{s X}\in\G$ for any $s\in\R$ small enough that $\sigma_s(1)\in\Nid$. But, for any $\tau\in\R$, there is an $n\in\mathbb{N}$ such that $s=\tau/n$ is small enough in this sense, so $e^{\tau\,X} = \left(e^{\tau\,X/n}\right)^n\in\G$ because $e^{\tau\,X/n}\in\G$ and $\G$ is a group.$\quad\square$
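
As a small check of the limiting argument, consider the following example, which is chosen here for illustration and is not part of von Neumann's proof. Take the closed subgroup $U(1)\subset GL(1,\,\mathbb{C})\subset GL(2,\,\R)$ and the path $\sigma(\tau) = e^{i\,(\tau+\tau^2)}$, whose tangent at the identity is $X=i$. For $s$ small enough, the logarithm series gives $\log\sigma_s(1/k) = i\left(s/k+s^2/k^2\right)$, so that:

\begin{equation}
k\,\log\sigma_s(1/k) = i\,s\left(1+\frac{s}{k}\right)\to s\,X;\qquad\left(\sigma_s(1/k)\right)^k = e^{i\,s\,\left(1+\frac{s}{k}\right)}\to e^{s\,X}\quad\text{as}\;k\to\infty
\end{equation}

Every term $\left(\sigma_s(1/k)\right)^k$ lies on the unit circle, and, the circle being closed, so does the limit $e^{s\,X}=e^{i\,s}$, even though the path $\sigma$ itself is not a one-parameter group.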

$\G$ is needfully closed in $GL(M,\,\R)$ in von Neumann’s approach, so his approach may seem overly restricted. However, his approach does work for all matrix groups because we have the following:

Theorem 6.19: ([Goto, 1950] Closed Subgroups of $GL(\mathbb{K})$)

Any linear Lie group $\G$ is Lie-isomorphic to a closed subgroup of $GL (\mathbb{K})$, where $\mathbb{K}=\R,\,\mathbb{C}$ and the dimension of $GL (\mathbb{K})$ is big enough that $\G\subset GL(\mathbb{K})$.

Proof: See [Goto, 1950].$\quad\square$

Thus, even though a particular matrix Lie group may not be closed in $GL(M,\,\R)$, we can always find a matrix Lie group that is both closed in $GL(M,\,\R)$ and Lie-isomorphic to the particular matrix Lie group in question. Thus we can justify the argument in Theorem 6.18 by choosing the isomorphic, closed copy of a matrix Lie group that is not closed in $GL(M,\,\R)$.

A famous theorem of Élie Cartan, proven in 1930, shows that closed Lie subgroups of a general Lie group have an important property: they are a topological embedding in their containing Lie supergroup. That is, their group topology, which we shall discuss later, is the relative topology in the Lie supergroup’s group topology. This convenient relationship between group topologies does not hold for Lie subgroups which are not closed in the Lie supergroup. These issues will be discussed in a later post.

References:

  1. Giuseppe Peano, Démonstration de l’intégrabilité des équations différentielles ordinaires, Mathematische Annalen, 37 (1890) 182–228.
  2. Friedrich Schur, Zur Theorie der endlichen Transformationsgruppen, Abhandlungen aus dem Mathematischen Seminar der Universität Hamburg, 4, 15-32, 1891
  3. Wulf Rossmann: “Lie Groups: An Introduction through Linear Groups (Oxford Graduate Texts in Mathematics)”, §1.2, Theorem 5
  4. John von Neumann, Über die analytischen Eigenschaften von Gruppen linearer Transformationen und ihrer Darstellungen. Mathematische Zeitschrift, 30:3–42, 1929.
  5. Morikuni Goto, “Faithful representations of Lie groups. II”, Nagoya Math. J. 1, pp. 91–107, 1950.