Chapter 2: Some Examples of Connected Lie Groups

We shall now study some examples of connected Lie groups as defined by our five axioms. The following examples will serve as concrete groups which illustrate the theoretical behaviours and properties that the following chapters will talk about and prove. Many of these examples illustrate techniques we shall use for our later proofs about connected matrix Lie groups, i.e. groups whose elements are matrices fulfilling the axioms above; in particular, they illustrate how our general proofs work for matrix groups. All of these techniques will be generalised to the general, non-matrix case in later chapters. Matrix groups will ultimately be shown to comprise “almost all” connected Lie groups.

One Dimensional and other Abelian Connected Lie Group Examples

Example 1.1: (One-Dimensional Connected Lie Group Examples)

The group $(\R,\,+)$ is a connected Lie group when we take $\Nid=\V=\R$ and $\lambda$ to be the identity function.

The group $(\R^+\setminus\{0\},\,*)$ of strictly positive reals together with multiplication is a connected Lie group when we take $\Nid=\R^+\setminus\{0\}$, $\V=\R$ and $\lambda = \log$.

$(\R,\,+)$ and $(\R^+\setminus\{0\},\,*)$ are of course isomorphic, with mutually inverse isomorphisms $\log:\R^+\setminus\{0\}\to\R$ and $\exp:\R\to\R^+\setminus\{0\}$.
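This isomorphism is easy to spot check numerically; the following minimal Python sketch (the sample points are arbitrary choices, not anything from the text) confirms that $\log$ carries multiplication to addition and that $\exp$ inverts it:

```python
import math

# Spot check that log: (R^+ \ {0}, *) -> (R, +) is a homomorphism and that
# exp inverts it.  The sample points a, b are arbitrary.
a, b = 2.75, 0.4
assert abs(math.log(a * b) - (math.log(a) + math.log(b))) < 1e-12
assert abs(math.exp(math.log(a)) - a) < 1e-12
assert abs(math.log(math.exp(-1.3)) - (-1.3)) < 1e-12
```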

The “circle group” $U(1)$, which is the set $\mathcal{C}=\{e^{i\,\phi}|\,\phi\in\R\}$ together with the operation of complex multiplication is a connected Lie group when we take $\Nid=\mathcal{C}$, $\V$ to be the real interval $(-\pi,\,\pi)$ and $\lambda = i^{-1}\,\log$.

The “circle group” $SO(2)$, which is the set:

\begin{equation}\label{SO2Example_1}\mathcal{S}=\left\{\left.\left(\begin{array}{cc}\cos\theta&-\sin\theta\\\sin\theta&\cos\theta\end{array}\right)\right|\,\theta\in\R\right\}\end{equation}

together with the operation of matrix multiplication is a connected Lie group when we take $\Nid=\mathcal{S}$, $\V$ to be the real interval $(-\pi,\,\pi)$ and

\begin{equation}\label{SO2Example_2}\lambda\left(\left(\begin{array}{cc}\cos\theta&-\sin\theta\\\sin\theta&\cos\theta\end{array}\right)\right) = \theta\;\text{ for }\theta\in(-\pi,\,\pi)\end{equation}
The two circle groups are of course isomorphic when $e^{i\,\theta}$ is identified with $\left(\begin{array}{cc}\cos\theta&-\sin\theta\\\sin\theta&\cos\theta\end{array}\right)$.

All of these examples are one dimensional connected Lie groups, $N=1$.

Example 1.2: (Abelian Lie Group Examples)

$(\R^N,\,+)$, the group of real vectors together with vector addition is a connected Lie group when we take $\Nid=\V=\R^N$ and $\lambda$ to be the identity function.

The group $U(1)^N=\mathbb{T}^N$, i.e. the Cartesian product of $N$ circles $\mathcal{C}^N=\{(e^{i\,\phi_1},\,e^{i\,\phi_2},\,\cdots,\,e^{i\,\phi_N})|\,\phi_j\in\R\}$ together with termwise complex multiplication is a connected Lie group when we take $\Nid=\mathcal{C}^N$, $\V$ to be the Cartesian product $(-\pi,\,\pi)^N$ of $N$ open intervals and $\lambda$ to be defined by $\lambda((e^{i\,\phi_1},\,e^{i\,\phi_2},\,\cdots,\,e^{i\,\phi_N})) = (\phi_1,\,\phi_2,\,\cdots,\,\phi_N)$. This group, as a compact topological space, is a compact connected Lie group and is also called the torus $\mathbb{T}^N$.

The “mixed” group comprising the Cartesian product of $N$ circles and real lines, e.g. $\mathcal{M}=\{(e^{i\,\phi_1},\,x_2,\,x_3,\,\cdots,\,e^{i\,\phi_N})|\,\phi_j,\,x_j\in\R\}$ together with termwise addition / complex multiplication as appropriate, i.e. for $\mathcal{M}$ the group operation would be

\begin{equation}\label{AbelianLieGroupExamples_1}\begin{array}{rl}&(e^{i\,\phi_1},\,x_2,\,x_3,\,\cdots,\,e^{i\,\phi_N})\bullet(e^{i\,\phi_1^\prime},\,x_2^\prime,\,x_3^\prime,\,\cdots,\,e^{i\,\phi_N^\prime})\\ = &(e^{i\,(\phi_1+\phi_1^\prime)},\,x_2+x_2^\prime,\,x_3+x_3^\prime,\,\cdots,\,e^{i\,(\phi_N+\phi_N^\prime)})\end{array}\end{equation}

is a connected Lie group when we take $\Nid=\mathcal{M}$, $\V = (-\pi,\,\pi)\times \R\times\R\times\cdots\times(-\pi,\,\pi)$ (or whatever the appropriate mixture of $(-\pi,\,\pi)$ and $\R$ is for other, like examples) and we define $\lambda((e^{i\,\phi_1},\,x_2,\,x_3,\,\cdots,\,e^{i\,\phi_N})) = (\phi_1,\,x_2,\,x_3,\,\cdots,\,\phi_N)$.

All connected Abelian Lie groups are of one of these kinds.

The Rotation, Orthogonal and Unitary Groups

Example 1.3: (Three Dimensional Rotation Group $SO(3)$)

The composition of two rotations about axes through the origin in $\R^3$ is a rotation. Every rotation has an inverse (to wit: the rotation about the same axis through an angle of the same magnitude but opposite sense), so the set of rotations in $\R^3$ about axes through the origin is a group. The rotation of $x\,X+y\,Y$, where $X,\,Y$ are any two vectors in $\R^3$ and $x,\,y\in\R$, is $x$ times the rotated $X$ vector summed with $y$ times the rotated $Y$, so that rotations are linear. Since we consider only rotations about axes through the origin, such transformations are homogeneous. So our rotation can be represented by a matrix $U$ so that:

\begin{equation}\label{SO3Example_1}\mathcal{U}:\R^3\to \R^3:\;X\mapsto U\,X\end{equation}

After any rotation, lengths of position vectors stay the same, as do angles between position vectors. This means the inner product $\left<X,\,Y\right> = X^T\,Y = Y^T X$ between any pair of position vectors $X$ and $Y$ is unchanged. Therefore:

\begin{equation}\label{SO3Example_2}\left<U\,X,\,U\,Y\right> = (U\,X)^T\,(U\,Y) = X^T\,U^T\,U\,Y = \left<X,\,Y\right> = X^T\,Y\end{equation}

or equivalently:

\begin{equation}\label{SO3Example_3}X^T\,\left(U^T\,U - \id\right)\,Y = 0;\;\forall X,\,Y\in\mathbb{R}^3\end{equation}

Since we can put any pair of basis vectors $X$, $Y$ into the above equation, we must have:

\begin{equation}\label{SO3Example_4}U^T\,U = \id\,\Leftrightarrow\,U^{-1}=U^T\end{equation}

as an identity. Therefore the matrix must be orthogonal. Here naturally $\id$ is the $3\times3$ identity matrix. Real, orthogonal matrices are normal, i.e. they commute with their Hermitian conjugate (in this case their transpose, because the transpose is the inverse) and so can always be diagonalised (have a strictly diagonal Jordan normal form) and have mutually orthogonal eigenvectors. Therefore a matrix logarithm can always be found for an orthogonal matrix, even when the matrix logarithm series (Newton-Mercator series) diverges: since $U = T\,\Lambda\,T^\dagger$ for unitary $T$ and diagonal $\Lambda$, $H=\log U = T\,\log\Lambda\,T^\dagger=T\,\log\Lambda\,T^{-1}$ exponentiates to $U$ by the universally convergent exponential Taylor series, so that $U = \exp(H)$. We can pull the rotation angle out of the “logarithm” $H$ by noting that, for a fixed rotation axis, the composition of two rotations through angles $\theta_1$ and $\theta_2$ is the same as a single rotation about the same axis through angle $\theta_1+\theta_2$. Moreover, the rotation matrix, as a matrix function of the angle $\theta$, is continuous, since the norm of the difference between vectors rotated through different angles can be made arbitrarily small ($\epsilon$) by bounding the difference between the angles ($\delta$): for every vector difference $\epsilon$ there is an angle difference $\delta$ guaranteeing $\epsilon$ will be heeded. The only matrix function fulfilling all these conditions is:

\begin{equation}\label{SO3Example_5}\exp(\theta\,H)=\id + \theta \, H + \frac{\theta^2}{2!}\,H^2 + \frac{\theta^3}{3!}\,H^3+\cdots\end{equation}

for some constant $3\times3$ real matrix $H$ that characterises in some way the axis of rotation. The inverse rotation is a rotation about the same axis through the same angle but in opposite sense, therefore, by $\eqref{SO3Example_4}$ we must then have $\exp(\theta\,H)^{-1}=\exp(\theta\,H)^T=\exp(-\theta\,H)$ whence it is readily shown that:

\begin{equation}\label{SO3Example_6}H= -H^T\end{equation}

Now rotations preserve the handedness of vector systems, so that:

\begin{equation}\label{SO3Example_7}\det(\exp(\theta\,H)) = 1\,\Leftrightarrow\,\mathrm{tr}(H)=0\end{equation}

The most general real $H$ fulfilling both $\eqref{SO3Example_6}$ and $\eqref{SO3Example_7}$ is of the form:

\begin{equation}\label{SO3Example_8}H = \frac{K}{\sqrt{\gamma_x^2+\gamma_y^2+\gamma_z^2}}\left(\begin{array}{ccc}0& -\gamma_z& \gamma_y\\\gamma_z&0&-\gamma_x\\-\gamma_y&\gamma_x&0\end{array}\right)\end{equation}

for $\gamma_x,\,\gamma_y,\,\gamma_z\in\R$ and where $K$ is a scaling constant yet to be determined. When $K=1$, $H$ is readily shown to fulfil its characteristic equation $H^3= -H$, which can be used to simplify the universally convergent Taylor series $\eqref{SO3Example_5}$; to “calibrate” our $K$ to give the right rotation angle, we use the relationship $\left<X,\,\exp(\theta\,H)\,X\right>=\cos\theta$ for a unit vector $X$ normal to the rotation axis (the inner product between such a vector and its image rotated through angle $\theta$ is $\cos\theta$) and when all this is done it is readily shown that $K=1$ and the rotation matrix for an angle $\theta$ is given by the following Rodrigues formula:

\begin{equation}\label{SO3Example_9}\exp(\theta\,H)=\id+\sin\theta\,H +(1-\cos\theta)\,H^2\end{equation}

where now:

\begin{equation}\label{SO3Example_10}H = \gamma_x\,\hat{S}_x + \gamma_y\,\hat{S}_y+\gamma_z\,\hat{S}_z\end{equation}

with:

\begin{equation}\label{SO3Example_11}\hat{S}_x = \left(\begin{array}{ccc}0 & 0 & 0 \\0 & 0 & -1 \\0 & 1 & 0 \end{array}\right);\quad \hat{S}_y = \left(\begin{array}{ccc}0 & 0 & 1 \\0 & 0 & 0 \\-1 & 0 & 0\end{array}\right);\quad \hat{S}_z = \left(\begin{array}{ccc}0 & -1 & 0 \\1 & 0 & 0 \\ 0 & 0 & 0 \end{array}\right)\end{equation}

$H$ is readily shown to have $(\gamma_x,\,\gamma_y,\,\gamma_z)$ as an eigenvector with an eigenvalue of nought; this is therefore an eigenvector of the rotation matrix $\exp(\theta\,H)$ with an eigenvalue of 1. Therefore we see that $(\gamma_x,\,\gamma_y,\,\gamma_z)$ are the direction cosines of the rotation axis. Lastly, one can prove that the composition of two rotations must be a rotation in either of two ways. Euler’s Rotation Theorem essentially argues that any rigid motion of a sphere with its centre fixed must have all the properties we have derived above, so that the composition of two such rigid motions must be of the same form. Alternatively, one can directly prove that the matrix product of two matrices of the form defined by $\eqref{SO3Example_9}$, $\eqref{SO3Example_10}$ and $\eqref{SO3Example_11}$ must also be of the same form: this is the approach used by [Engø]. Thus we have fully characterised $SO(3)$ as $(\mathcal{S},\,\bullet)$ where:

\begin{equation}\label{SO3Example_12}\mathcal{S}=\left\{\exp(\theta\,H)=\id+\sin\theta\,H +(1-\cos\theta)\,H^2\left|\,\begin{array}{ll}H = \frac{\gamma_x\,\hat{S}_x + \gamma_y\,\hat{S}_y+\gamma_z\,\hat{S}_z}{\sqrt{\gamma_x^2+\gamma_y^2+\gamma_z^2}}\\\gamma_j,\,\theta\in\R\end{array}\right.\right\}\end{equation}

and the group operation $\bullet$ is matrix multiplication. Given this background, $SO(3)$ can be readily shown to be a connected Lie group if (i) we put:

\begin{equation}\label{SO3Example_13}\Nid = \left\{\left.\exp(\theta\,(\gamma_x\,\hat{S}_x+\gamma_y\,\hat{S}_y+\gamma_z\,\hat{S}_z))\right|\,|\theta|<\epsilon,\,\gamma_x^2+\gamma_y^2+\gamma_z^2=1\right\}\end{equation}

with $\epsilon>0$ but small enough to make the matrix exponential $\exp:\Nid\to SO(3)$ one-to-one, (ii) make $\V$ the open cube $(-\epsilon,\,\epsilon)^3\subset\R^3$ and (iii):

\begin{equation}\label{SO3Example_14}\lambda\left(\exp(\theta\,(\gamma_x\,\hat{S}_x+\gamma_y\,\hat{S}_y+\gamma_z\,\hat{S}_z))\right) = (\theta\,\gamma_x,\,\theta\,\gamma_y,\,\theta\,\gamma_z)\end{equation}

Note that any $\epsilon>0$ will work, because we have the relationship $\exp(X/n)^n = \exp(X)$, so that any matrix of the form $\exp(H)$ can be written as a finite product of $n$ matrices of the form $\exp(H/n)$, for any integer $n>0$, no matter how big.
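Both the Rodrigues formula and this last remark lend themselves to a quick numerical spot check. The following Python sketch (the axis and angle are arbitrary sample values, not anything from the text) compares $\eqref{SO3Example_9}$ with the truncated exponential Taylor series and verifies that $\exp(\theta\,H)$ equals $\exp\left(\frac{\theta}{n}\,H\right)$ composed $n$ times:

```python
import math

# Numerical spot check (arbitrary axis and angle) of two facts from this
# example: the Rodrigues formula agrees with the exponential Taylor series,
# and exp(theta H) equals exp((theta/n) H) composed n times.

def mat_mul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def rodrigues(theta, gx, gy, gz):
    # id + sin(theta) H + (1 - cos(theta)) H^2 for unit axis (gx, gy, gz)
    H = [[0, -gz, gy], [gz, 0, -gx], [-gy, gx, 0]]
    H2 = mat_mul(H, H)
    return [[(i == j) + math.sin(theta) * H[i][j]
             + (1 - math.cos(theta)) * H2[i][j] for j in range(3)]
            for i in range(3)]

def expm(A, terms=40):
    # Truncated (universally convergent) exponential Taylor series.
    result = [[float(i == j) for j in range(3)] for i in range(3)]
    term = [[float(i == j) for j in range(3)] for i in range(3)]
    for n in range(1, terms):
        term = [[sum(term[i][k] * A[k][j] for k in range(3)) / n
                 for j in range(3)] for i in range(3)]
        result = [[result[i][j] + term[i][j] for j in range(3)]
                  for i in range(3)]
    return result

gx, gy, gz = 2 / 7, 3 / 7, 6 / 7          # unit axis: 4 + 9 + 36 = 49
theta, n = 0.9, 8
H = [[0, -gz, gy], [gz, 0, -gx], [-gy, gx, 0]]
series = expm([[theta * h for h in row] for row in H])
whole = rodrigues(theta, gx, gy, gz)
step = rodrigues(theta / n, gx, gy, gz)
product = [[float(i == j) for j in range(3)] for i in range(3)]
for _ in range(n):
    product = mat_mul(product, step)
assert all(abs(whole[i][j] - series[i][j]) < 1e-10
           for i in range(3) for j in range(3))
assert all(abs(whole[i][j] - product[i][j]) < 1e-10
           for i in range(3) for j in range(3))
```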

Example 1.4: ($2\times2$ Unitary Group $SU(2)$)

Unitary matrices, i.e. those for which $U^{-1} = U^\dagger$, clearly form a group: $U^{-1} = U^\dagger$ is unitary and if $U,\,V$ are unitary, then $(U\,V)^\dagger=V^\dagger\,U^\dagger = V^{-1}\,U^{-1} = (U\,V)^{-1}$ so that $U\,V$ is also unitary. The identity matrix is unitary and matrix multiplication is associative.

Unimodular matrices, i.e. those which have unit determinant, likewise form a group (because $\det(U\,V) = \det(U)\,\det(V)$). So the set of unitary, unimodular $2\times2$ matrices together with matrix multiplication is clearly a group, the $2\times2$ special unitary group $SU(2)$.

As with orthogonal matrices, unitary matrices are normal operators (commute with their Hermitian conjugate), their diagonalisation by a unitary matrix is guaranteed and so for $U\in SU(2)$ we can find a matrix logarithm $H$ such that $U = \exp(H)$. Since $U^\dagger = U^{-1}$, this implies that $H^\dagger = -H$ and since $\det(U)=1$ we have $\mathrm{tr}(H) = 0$ (since $\det(\exp(H)) = \exp(\mathrm{tr}(H))$ is an identity for square matrices). So we see that:

\begin{equation}\label{SU2Example_1}SU(2) = \left\{e^H|\;H^\dagger = -H;\;\mathrm{tr}(H) = 0;\;H\text{ is }2\times2\right\}\end{equation}

i.e. an equivalent characterisation of $SU(2)$ is as exponentials of all traceless, skew-Hermitian $2\times2$ complex matrices. The most general such matrix $H$ is of the form:

\begin{equation}\label{SU2Example_2}H = \frac{\theta}{2}\,\frac{\gamma_x\,\hat{s}_x + \gamma_y\,\hat{s}_y+\gamma_z\,\hat{s}_z}{\sqrt{\gamma_x^2+\gamma_y^2+\gamma_z^2}};\quad\gamma_x,\,\gamma_y,\,\gamma_z,\,\theta\in\R\end{equation}

where:

\begin{equation}\label{SU2Example_3}\hat{s}_x = \left(\begin{array}{cc}0&i\\i&0\end{array}\right);\quad \hat{s}_y = \left(\begin{array}{cc}0&-1\\1&0\end{array}\right);\quad \hat{s}_z = \left(\begin{array}{cc}i&0\\0&-i\end{array}\right)\end{equation}

and we also have the identity $H^2 = -\theta^2 / 4 \,\id$, which relationship lets us write the exponential Taylor series for the group member $e^H$ as the $SU(2)$ Rodrigues Formula:

\begin{equation}\label{SU2Example_4}e^H=\cos\left(\frac{\theta}{2}\right)\,\id+\sin\left(\frac{\theta}{2}\right)\,\frac{1}{\sqrt{\gamma_x^2+\gamma_y^2+\gamma_z^2}}\,\left(\begin{array}{cc}i\,\gamma_z&-\gamma_y+i\,\gamma_x\\\gamma_y+i\,\gamma_x&-i\,\gamma_z\end{array}\right)\in SU(2)\end{equation}

The reason for the angle’s being written as $\theta/2$ rather than $\theta$ will become clearer shortly. So now we have fully characterised $SU(2)$ as $(\mathcal{S},\,\bullet)$ where:

\begin{equation}\label{SU2Example_5}\mathcal{S}=\left\{\left.e^H=\cos\left(\frac{\theta}{2}\right)\,\id+\sin\left(\frac{\theta}{2}\right)\,\frac{\gamma_x\,\hat{s}_x + \gamma_y\,\hat{s}_y+\gamma_z\,\hat{s}_z}{\sqrt{\gamma_x^2+\gamma_y^2+\gamma_z^2}}\right|\,\gamma_j,\,\theta\in\R\right\}\end{equation}

and the group operation $\bullet$ is matrix multiplication. Given this background, $SU(2)$ can be readily shown to be a connected Lie group if (i) we put:

\begin{equation}\label{SU2Example_6}\Nid = \left\{\left.\exp\left(\frac{\theta}{2}\,\left(\gamma_x\,\hat{s}_x + \gamma_y\,\hat{s}_y+\gamma_z\,\hat{s}_z\right)\right)\right|\,|\theta|<2\,\epsilon,\,\gamma_x^2+\gamma_y^2+\gamma_z^2=1\right\}\end{equation}

with $\epsilon>0$ but small enough to make the matrix exponential $\exp:\Nid\to SU(2)$ one-to-one, (ii) make $\V$ the open cube $(-\epsilon,\,\epsilon)^3\subset\R^3$ and (iii):

\begin{equation}\label{SU2Example_7}\lambda\left(\exp\left(\frac{\theta}{2}\,\left(\gamma_x\,\hat{s}_x + \gamma_y\,\hat{s}_y+\gamma_z\,\hat{s}_z\right)\right)\right) = \left(\frac{\theta}{2}\,\gamma_x,\,\frac{\theta}{2}\,\gamma_y,\,\frac{\theta}{2}\,\gamma_z\right)\end{equation}

As in the foregoing example, any $\epsilon>0$ will work, because we have the relationship $\exp(X/n)^n = \exp(X)$, so that any matrix of the form $\exp(H)$ can be written as a finite product of $n$ matrices of the form $\exp(H/n)$, for any integer $n>0$, no matter how big.
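As a numerical spot check of the $SU(2)$ Rodrigues formula, the sketch below (arbitrary sample axis and angle) represents the $\hat{s}_j$ as explicit $2\times2$ complex matrices consistent with $\eqref{SU2Example_4}$, confirms that the unit combination of them squares to $-\id$, and compares $\cos(\theta/2)\,\id+\sin(\theta/2)\,n$ with the truncated exponential Taylor series:

```python
import math

# Spot check of the SU(2) Rodrigues formula: exp(H), computed by a truncated
# Taylor series, should equal cos(theta/2) id + sin(theta/2) n, where n is
# the unit combination of the skew-Hermitian basis matrices.  Sample axis
# and angle are arbitrary.

s_x = [[0, 1j], [1j, 0]]
s_y = [[0, -1], [1, 0]]
s_z = [[1j, 0], [0, -1j]]

def mul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def expm(A, terms=40):
    result = [[1, 0], [0, 1]]
    term = [[1, 0], [0, 1]]
    for n in range(1, terms):
        term = [[sum(term[i][k] * A[k][j] for k in range(2)) / n
                 for j in range(2)] for i in range(2)]
        result = [[result[i][j] + term[i][j] for j in range(2)]
                  for i in range(2)]
    return result

gx, gy, gz = 1 / 3, 2 / 3, 2 / 3          # unit axis: 1 + 4 + 4 = 9
theta = 1.2
n_mat = [[gx * s_x[i][j] + gy * s_y[i][j] + gz * s_z[i][j]
          for j in range(2)] for i in range(2)]
assert all(abs(mul(n_mat, n_mat)[i][j] + (i == j)) < 1e-12
           for i in range(2) for j in range(2))      # n^2 = -id
H = [[(theta / 2) * n_mat[i][j] for j in range(2)] for i in range(2)]
rodrigues = [[math.cos(theta / 2) * (i == j)
              + math.sin(theta / 2) * n_mat[i][j]
              for j in range(2)] for i in range(2)]
series = expm(H)
assert all(abs(rodrigues[i][j] - series[i][j]) < 1e-10
           for i in range(2) for j in range(2))
```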

There is another characterisation of $SU(2)$ that may be wonted to the reader: it is the group of unit quaternions. Witness that the $\hat{s}_j$ of $\eqref{SU2Example_3}$ fulfill:

\begin{equation}\label{SU2Example_8}\hat{s}_x^2 = \hat{s}_y^2=\hat{s}_z^2 = \hat{s}_x\,\hat{s}_y\,\hat{s}_z=-\id\end{equation}

which, in turn, can be shown to imply:

\begin{equation}\label{SU2Example_9}\hat{s}_i\,\hat{s}_j = -\delta_{i\,j}\,\id + \sum_{k\in\{x,\,y,\,z\}}\epsilon_{i\,j\,k}\,\hat{s}_k;\quad i,\,j\in\{x,\,y,\,z\}\end{equation}

where $\epsilon_{i\,j\,k}$ is the permutation symbol and so the $\hat{s}_j$ are isomorphic to the imaginary quaternion units. The group member in $\eqref{SU2Example_4}$ is thus recognised to be a unit magnitude quaternion and the Rodrigues formula for $SU(2)$ is thus the quaternion De Moivre’s theorem $e^{\hat{a}\,\theta} = \cos\theta\,\id + \sin\theta\, \hat{a}$, where $\hat{a}$ is any unit magnitude purely imaginary quaternion, i.e. one of the form $\gamma_x\,\hat{i}+\gamma_y\,\hat{j}+\gamma_z\,\hat{k}$ where $\gamma_x^2+\gamma_y^2+\gamma_z^2=1$. Therefore:

\begin{equation}\label{SU2Example_10}SU(2)=\left\{w + x\,\hat{i}+y\,\hat{j}+z\,\hat{k}\,\left|\;\begin{array}{l}w,\,x,\,y,\,z\in\R;\;w^2+x^2+y^2+z^2=1;\\\hat{i}^2 = \hat{j}^2=\hat{k}^2 =\hat{i}\,\hat{j}\,\hat{k}= -1\end{array}\right.\right\}\end{equation}
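The characterisation $\eqref{SU2Example_10}$ can be spot-checked with the matrix realisation $\hat{i} = \hat{s}_x,\,\hat{j} = \hat{s}_y,\,\hat{k} = \hat{s}_z$: a random unit quaternion, written out as a $2\times2$ complex matrix, should have unit determinant and a conjugate transpose equal to its inverse. A minimal Python sketch (random but seeded sample):

```python
import math, random

# Spot check that a random unit quaternion w + x i + y j + z k, realised with
# i = s_x, j = s_y, k = s_z as 2x2 complex matrices, lands in SU(2): unit
# determinant and U U^dagger = id.

random.seed(1)
w, x, y, z = (random.gauss(0, 1) for _ in range(4))
r = math.sqrt(w * w + x * x + y * y + z * z)
w, x, y, z = w / r, x / r, y / r, z / r        # normalise to a unit quaternion
# w id + x s_x + y s_y + z s_z written out entry by entry:
U = [[w + 1j * z, -y + 1j * x], [y + 1j * x, w - 1j * z]]
det = U[0][0] * U[1][1] - U[0][1] * U[1][0]
UUd = [[sum(U[i][k] * U[j][k].conjugate() for k in range(2))
        for j in range(2)] for i in range(2)]
assert abs(det - 1) < 1e-12
assert all(abs(UUd[i][j] - (i == j)) < 1e-12
           for i in range(2) for j in range(2))
```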

Quaternions represent rotations through a spinor map; if a vector in 3-space is represented by a purely imaginary quaternion of the form $x\,\hat{i}+y\,\hat{j}+z\,\hat{k}$, then its image under a rotation of angle $\theta$ about an axis with direction cosines $\gamma_x,\,\gamma_y,\,\gamma_z$ is given by:

\begin{equation}\label{SU2Example_11}x\,\hat{i}+y\,\hat{j}+z\,\hat{k} \mapsto U\,(x\,\hat{i}+y\,\hat{j}+z\,\hat{k})\,U^\dagger;\quad U=\exp\left(\frac{\theta}{2}(\gamma_x\,\hat{i}+\gamma_y\,\hat{j}+\gamma_z\,\hat{k})\right) \end{equation}

This spinor map is an example of the group $SU(2)$ acting on its own Lie algebra through the adjoint representation, a statement whose explanation needs to be deferred to the explanations of the Lie algebra and the adjoint representation. This is the ultimate reason for the scale factor $2$ in the angle, but for now it can be intuitively understood by joining [Penrose] in his wonderful explanation of the triangle law for composition of rotations in §11.4 “How to Compose Rotations”; quaternions themselves are discussed more fully in the early sections of chapter 11 in [Penrose]. I shall add a little more detail as follows.
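The half angle in the spinor map can also be seen numerically. The following Python sketch (a deliberately simple sample: a quarter turn about the $z$ axis) realises the quaternion units as $2\times2$ complex matrices and checks that conjugation by $U=\exp\left(\frac{\theta}{2}\,\hat{k}\right)$ sends the “vector” quaternion $\hat{i}$ to $\hat{j}$, i.e. rotates the $x$ unit vector to the $y$ unit vector, a rotation through the full angle $\theta=\pi/2$:

```python
import math

# Spot check of the spinor map X -> U X U^dagger: conjugating the "vector"
# quaternion x i + y j + z k by U = exp((theta/2) k_hat) should rotate
# (x, y, z) through the full angle theta about the z axis.

i_hat = [[0, 1j], [1j, 0]]      # quaternion units as 2x2 complex matrices
j_hat = [[0, -1], [1, 0]]
k_hat = [[1j, 0], [0, -1j]]

def mul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def dagger(A):
    return [[A[j][i].conjugate() for j in range(2)] for i in range(2)]

def from_vec(x, y, z):
    return [[x * i_hat[r][c] + y * j_hat[r][c] + z * k_hat[r][c]
             for c in range(2)] for r in range(2)]

theta = math.pi / 2
half = theta / 2
axis = from_vec(0.0, 0.0, 1.0)                  # rotate about the z axis
U = [[math.cos(half) * (r == c) + math.sin(half) * axis[r][c]
      for c in range(2)] for r in range(2)]
V = mul(mul(U, from_vec(1.0, 0.0, 0.0)), dagger(U))   # image of the x vector
expected = from_vec(0.0, 1.0, 0.0)              # quarter turn sends x to y
assert all(abs(V[r][c] - expected[r][c]) < 1e-12
           for r in range(2) for c in range(2))
```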

We begin by considering a particular form of the group multiplication law in $SU(2)$ derived by the same “trick” explained in [Engø] as follows. Given that all $SU(2)$ members are of the polar (de Moivre) form in $\eqref{SU2Example_5}$, we can write $\exp(X)\,\exp(Y)=\exp(Z)$ where $X,\,Y,\,Z$ are all linear combinations of $\{\hat{s}_x,\,\hat{s}_y,\,\hat{s}_z\}$, and the exponential of every such linear combination belongs to $SU(2)$. So we can think of $SU(2)$ as the exponential of the three-dimensional vector space spanned by $\{\hat{s}_x,\,\hat{s}_y,\,\hat{s}_z\}$. This vector space is also precisely the vector space of all $2\times 2$ traceless skew-Hermitian matrices, i.e. matrices $H$ such that $H^\dagger = {H^*}^T = -H$ and $\mathrm{tr}(H)=0$. Now from $\eqref{SU2Example_5}$ we have $\exp(Z) = \cos(\theta_z)\,\id + \sin(\theta_z)\,Z/\|Z\|$, where $|\theta_z| = \|Z\|$. The “trick” used in [Engø] is to take heed that the expression $\sin(\theta_z)\,Z/\|Z\|$ is precisely the skew-Hermitian part of $\exp(Z) =\exp(X)\,\exp(Y)$ (since $Z$ is skew-Hermitian), leaving $\cos(\theta_z)\,\id$ as precisely the Hermitian part, and so we have a very simple way to calculate the matrix logarithm:

\begin{equation}\label{SU2Example_12}\log U = \frac{\arcsin\|W\|}{\|W\|}\,W;\quad W = \frac{1}{2}\left(U - U^\dagger\right)\end{equation}

So we expand the product $\exp(X)\,\exp(Y)=(\cos(\theta_x)\,\id + \sin(\theta_x)\,X/\|X\|) (\cos(\theta_y)\,\id + \sin(\theta_y)\,Y/\|Y\|)$ and take the skew-Hermitian part, taking heed that the skew-Hermitian part of $X\,Y$ when $X,\,Y$ are both skew-Hermitian is $\frac{1}{2}\left(X\,Y-(X\,Y)^\dagger\right)=\frac{1}{2}\left(X\,Y-Y\,X\right)$:

\begin{equation}\label{SU2Example_13}\begin{array}{ll}&\frac{1}{2}\left(\exp(X)\,\exp(Y)-(\exp(X)\,\exp(Y))^\dagger\right)\\=&\cos \theta_y\,\sin\theta_x\,\frac{X}{\|X\|}+\cos \theta_x\,\sin\theta_y\,\frac{Y}{\|Y\|} + \frac{1}{2\,\|X\|\,\|Y\|}\sin\theta_y\,\sin\theta_x\,(X\,Y-Y\,X)\end{array}\end{equation}

whilst the Hermitian part is:

\begin{equation}\label{SU2Example_14}\begin{array}{ll}&\frac{1}{2}\left(\exp(X)\,\exp(Y)+(\exp(X)\,\exp(Y))^\dagger\right)\\=&\cos \theta_x\,\cos \theta_y\,\id + \frac{1}{2\,\|X\|\,\|Y\|}\sin\theta_y\,\sin\theta_x\,(X\,Y+Y\,X)\end{array}\end{equation}
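Both decompositions are algebraic identities and can be spot-checked numerically; a minimal Python sketch (arbitrary sample angles and axes), with the $\hat{s}_j$ written out as explicit $2\times2$ complex matrices:

```python
import math

# Spot check of the Hermitian / skew-Hermitian split of exp(X) exp(Y), with
# X = theta_x Xhat and Y = theta_y Yhat for unit combinations Xhat, Yhat of
# the skew-Hermitian basis matrices.  Sample angles and axes are arbitrary.

s_x = [[0, 1j], [1j, 0]]
s_y = [[0, -1], [1, 0]]
s_z = [[1j, 0], [0, -1j]]

def mul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def add(A, B, ca, cb):
    return [[ca * A[i][j] + cb * B[i][j] for j in range(2)] for i in range(2)]

def dagger(A):
    return [[A[j][i].conjugate() for j in range(2)] for i in range(2)]

def unit(gx, gy, gz):
    r = math.sqrt(gx * gx + gy * gy + gz * gz)
    return [[(gx * s_x[i][j] + gy * s_y[i][j] + gz * s_z[i][j]) / r
             for j in range(2)] for i in range(2)]

def polar(theta, n):   # cos(theta) id + sin(theta) n, the de Moivre form
    return [[math.cos(theta) * (i == j) + math.sin(theta) * n[i][j]
             for j in range(2)] for i in range(2)]

tx, ty = 0.7, 1.1
Xh, Yh = unit(1, 2, 2), unit(3, 0, -4)
P = mul(polar(tx, Xh), polar(ty, Yh))
skew = add(P, dagger(P), 0.5, -0.5)
herm = add(P, dagger(P), 0.5, 0.5)
comm = add(mul(Xh, Yh), mul(Yh, Xh), 1, -1)     # Xhat Yhat - Yhat Xhat
anti = add(mul(Xh, Yh), mul(Yh, Xh), 1, 1)      # Xhat Yhat + Yhat Xhat
skew_rhs = add(add(Xh, Yh, math.cos(ty) * math.sin(tx),
                   math.cos(tx) * math.sin(ty)),
               comm, 1, 0.5 * math.sin(tx) * math.sin(ty))
herm_rhs = add([[1, 0], [0, 1]], anti, math.cos(tx) * math.cos(ty),
               0.5 * math.sin(tx) * math.sin(ty))
assert all(abs(skew[i][j] - skew_rhs[i][j]) < 1e-12
           for i in range(2) for j in range(2))
assert all(abs(herm[i][j] - herm_rhs[i][j]) < 1e-12
           for i in range(2) for j in range(2))
```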

Now, if we think of $X$ and $Y$ as vectors in the linear space spanned by $\{\hat{s}_x,\,\hat{s}_y,\,\hat{s}_z\}$, so that we represent them as $3\times1$ column vectors $X=(x_j)_{j=1}^3,\,Y=(y_j)_{j=1}^3$ of superposition weights $x_j,\,y_j$ multiplying the $\hat{s}_j$, it is readily shown that the matrix operation $(X\,Y-Y\,X)/2$ becomes the cross product $X\times Y$ of the column vectors and the matrix operation $(X\,Y+Y\,X)/2$ becomes minus the scalar product $X\cdot Y$ (times the “scalar” identity matrix $\id$). In this way of thinking, we are led to something like Hamilton’s original graphical visualisation of the quaternions: as directed arcs of great circles on the unit sphere. Figure 1.1 shows three unit vectors $\mathbf{a},\,\mathbf{b},\,\mathbf{c}$ defining vertices of the spherical triangle $uvw$. The directed arcs $\mathbf{u}$, $\mathbf{v}$ and $\mathbf{w}$ are the unit quaternions defined by the pairs $\mathbf{a},\,\mathbf{b}$, $\mathbf{b},\,\mathbf{c}$ and $\mathbf{c},\,\mathbf{a}$ of vectors, respectively. Actually, the arcs themselves contain one more bit of information than the pairs of vectors delimiting them: they define which of the two possible sections of the great circle marked out by the vectors is the arc in question.


Figure 1.1: Unit Quaternions as Directed Great Circle Arcs and Unit Quaternion Multiplication

The pair $(\mathbf{a}\cdot\mathbf{b},\,\mathbf{a}\times\mathbf{b})$ uniquely defines the arc $\mathbf{u}$ in Figure 1.1. Indeed, given any two three dimensional vectors $\mathbf{a},\, \mathbf{b}$, one vector in the pair together with $\mathbf{a}\cdot\mathbf{b}$ and $\mathbf{a}\times\mathbf{b}$ uniquely determines the other vector; for example:

\begin{equation}\label{SU2Example_15}\mathbf{b} = \frac{1}{\mathbf{a}\cdot\mathbf{a}}\left((\mathbf{a}\cdot\mathbf{b})\,\mathbf{a}-\mathbf{a}\times (\mathbf{a}\times \mathbf{b})\right)= \frac{1}{\mathbf{a}\cdot\mathbf{a}}\left((\mathbf{a}\cdot\mathbf{b})\,\mathbf{a}+(\mathbf{a}\times \mathbf{b})\times\mathbf{a}\right)\end{equation}

i.e. one can invert the product defined by $\mathbf{a},\mathbf{b}\mapsto (\mathbf{a}\cdot\mathbf{b},\,\mathbf{a}\times\mathbf{b})$, although of course neither the scalar nor vector product on their own is invertible. The pairing $(\mathbf{a}\cdot\mathbf{b},\,\mathbf{a}\times\mathbf{b})$ is a standard way (see e.g. [Doran&Lasenby], Chapter 1) to think of a quaternion; indeed one could say that the scalar and vector products of three dimensional vectors wonted to “modern” vector calculus are simply disembodied parts of the original quaternion product. The “scalar” part of the pairing $\mathbf{a}\cdot\mathbf{b}$ corresponds to the term $\cos\theta_u\,\id = \mathbf{a}\cdot\mathbf{b}\,\id$ in the corresponding $SU(2)$ element $e^U = \cos\theta_u \,\id + \sin\theta_u\, U/\|U\|$, whilst the three components of $\mathbf{a}\times\mathbf{b}$ correspond to the superposition weights of the $2\times 2$ matrices $\hat{i} = \hat{s}_x, \,\hat{j} = \hat{s}_y,\,\hat{k} = \hat{s}_z$ that sum to give the skew-Hermitian part $\sin\theta_u\, U/\|U\|$. So now we describe the arc $\mathbf{v}$ in Figure 1.1 as the pair $(\mathbf{b}\cdot\mathbf{c},\,\mathbf{b}\times\mathbf{c})$; this corresponds to the $SU(2)$ element $e^V = \cos\theta_v\,\id + \sin\theta_v\,V/\|V\|$: the “scalar” matrix $\id$ has superposition weight $\cos\theta_v=\mathbf{b}\cdot\mathbf{c}$ whereas the skew-Hermitian “vector” matrix $\sin\theta_v\,V/\|V\|$ is equal to $(\mathbf{b}\times\mathbf{c})_x\,\hat{s}_x+(\mathbf{b}\times\mathbf{c})_y\,\hat{s}_y+(\mathbf{b}\times\mathbf{c})_z\,\hat{s}_z$.
So now we work out what the product $(\mathbf{a}\cdot\mathbf{b},\,\mathbf{a}\times\mathbf{b})\,(\mathbf{b}\cdot\mathbf{c},\,\mathbf{b}\times\mathbf{c})$ must be if it is to correspond to the $SU(2)$ element $e^V\,e^U$ by calling on $\eqref{SU2Example_13}$ and $\eqref{SU2Example_14}$ to work out the Hermitian and skew-Hermitian parts of the product: the skew-Hermitian part, from $\eqref{SU2Example_13}$, corresponds to $(\mathbf{b}\cdot \mathbf{c})\,\mathbf{a}\times \mathbf{b}+(\mathbf{a}\cdot \mathbf{b})\,\mathbf{b}\times \mathbf{c} + (\mathbf{a}\times \mathbf{b})\times(\mathbf{b}\times \mathbf{c})$, which, after simplification through standard vector identities, reduces to $\mathbf{c}\times\mathbf{a}$; we must take heed that $\mathbf{a},\,\mathbf{b},\,\mathbf{c}$ are all unit vectors. The Hermitian part of the product, from $\eqref{SU2Example_14}$, corresponds to $(\mathbf{a}\cdot \mathbf{b})\,(\mathbf{b}\cdot \mathbf{c}) - (\mathbf{a}\times \mathbf{b})\cdot (\mathbf{b}\times \mathbf{c})$, which simplifies to $\mathbf{c}\cdot\mathbf{a}$. Written out in full, the product formation is:

\begin{equation}\label{SU2Example_16}\begin{array}{lcl}\mathbf{u}\,\mathbf{v} &=& \left(\mathbf{a}\cdot \mathbf{b},\,\mathbf{a}\times\mathbf{b}\right)\,\left(\mathbf{b}\cdot \mathbf{c},\,\mathbf{b}\times\mathbf{c}\right)\\&=& \left((\mathbf{a}\cdot \mathbf{b})\,(\mathbf{b}\cdot \mathbf{c}) - (\mathbf{a}\times \mathbf{b})\cdot (\mathbf{b}\times \mathbf{c}),\,(\mathbf{b}\cdot \mathbf{c})\,\mathbf{a}\times \mathbf{b}+(\mathbf{a}\cdot \mathbf{b})\,\mathbf{b}\times \mathbf{c} + (\mathbf{a}\times \mathbf{b})\times(\mathbf{b}\times \mathbf{c})\right)\\&=&\left(\mathbf{c}\cdot \mathbf{a},\,\mathbf{c}\times\mathbf{a}\right)\\&=&\mathbf{w}\end{array}\end{equation}

but of course this is nothing more than the quaternion pairing $(\mathbf{c}\cdot\mathbf{a},\, \mathbf{c}\times\mathbf{a})$ that describes the arc $\mathbf{w}$ in Figure 1.1. So the unit quaternion ($SU(2)$) product $\mathbf{u}\,\mathbf{v}$ is indeed defined by the pair of vectors $\mathbf{c},\,\mathbf{a}$ and by the arc $\mathbf{w}$ on the unit sphere which closes the spherical triangle formed by $\mathbf{a},\,\mathbf{b},\,\mathbf{c}$ and unit quaternion multiplication can be interpreted as a spherical arc addition rule. Unit quaternions, i.e. members of $SU(2)$ were called versors by their creator, William Rowan Hamilton ([Hamilton], Chapter 1, §§8, 9, pp133-157). More generally, when the quaternions are not of unit length, the product between the pairs can be shown to be defined by:

\begin{equation}\label{SU2Example_17}\mathbf{u}\,\mathbf{v} = (u_s,\,\mathbf{u}_v)\,(v_s,\,\mathbf{v}_v)=(u_s\,v_s-\mathbf{u}_v\cdot\mathbf{v}_v,\;\mathbf{u}_v\times\mathbf{v}_v+u_s\,\mathbf{v}_v+v_s\,\mathbf{u}_v)\end{equation}
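This pair product can be checked against the matrix realisation of the quaternion units; the Python sketch below (arbitrary, not necessarily unit, sample quaternions) multiplies two quaternions once as scalar/vector pairs and once as $2\times2$ complex matrices and confirms that the results agree:

```python
# Spot check (arbitrary sample values) that the scalar/vector pair product
# reproduces quaternion multiplication as realised by the 2x2 matrices
# i = s_x, j = s_y, k = s_z.

s_x = [[0, 1j], [1j, 0]]
s_y = [[0, -1], [1, 0]]
s_z = [[1j, 0], [0, -1j]]

def to_matrix(s, v):
    # s id + v_x s_x + v_y s_y + v_z s_z
    return [[s * (i == j) + v[0] * s_x[i][j] + v[1] * s_y[i][j]
             + v[2] * s_z[i][j] for j in range(2)] for i in range(2)]

def mul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def pair_mul(u, v):
    # (us vs - uv . vv,  uv x vv + us vv + vs uv)
    (us, uv), (vs, vv) = u, v
    return (us * vs - dot(uv, vv),
            tuple(c + us * b + vs * a
                  for a, b, c in zip(uv, vv, cross(uv, vv))))

u = (0.5, (0.1, -0.3, 0.2))
v = (-0.2, (0.4, 0.0, 0.7))
ps, pv = pair_mul(u, v)
lhs = mul(to_matrix(*u), to_matrix(*v))
rhs = to_matrix(ps, pv)
assert all(abs(lhs[i][j] - rhs[i][j]) < 1e-12
           for i in range(2) for j in range(2))
```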

Now we relate the spherical triangle in Figure 1.1 to ordinary three dimensional rotations, which we can do by taking heed that a rotation of angle $2\,\theta$ about an axis can be decomposed as the product of two rotations, each through a half turn, about axes through the end points of any arc section of length $\theta$ of a great circle in the plane defined by the axis on the unit sphere. Thus a rotation through angle of twice the arclength of $\mathbf{u}$ about the axis defined by $\mathbf{a}\times\mathbf{b}$ in Figure 1.1 can be realised by a rotation through $\pi$ radians about the axis $\mathbf{a}$ followed by a second rotation through $\pi$ radians about the axis $\mathbf{b}$.


Figure 1.2: Composing Rotations in $SO(3)$ by Unit Quaternion Multiplication

The triangles $R_1,\,R_2,\,R_3$ in Figure 1.2 are all congruent to the unit quaternion multiplication triangle with vertices $\mathbf{a},\,\mathbf{b},\,\mathbf{c}$. $R_1$, $R_2$ and $R_3$ are the images of triangle $\mathbf{a},\,\mathbf{b},\,\mathbf{c}$ after the latter has been rotated $\pi$ radians about the axes defined by the unit vectors $\mathbf{a}$, $\mathbf{b}$ and $\mathbf{c}$, respectively. [Penrose] describes them as the central triangle $\mathbf{a},\,\mathbf{b},\,\mathbf{c}$ having been “reflected” in vertices $\mathbf{a}$, $\mathbf{b}$ and $\mathbf{c}$, respectively. So, the composition of a rotation through $\pi$ about axis $\mathbf{a}$ followed by a $\pi$ radian rotation about $\mathbf{b}$ maps triangle $R_1$ onto $R_2$. The product of two rotations is again a rotation, as discussed in Example 1.3, and any rotation of the unit sphere is uniquely defined by a spherical triangle and its image under the rotation. Therefore, from Figure 1.2, $R_2$ is the triangle $R_1$ after the latter has been rotated about the axis normal to the great circle containing the arc $\mathbf{u}$, but the magnitude of the rotation, given $R_2\cong R_1$, is by inspection twice the arclength of the arc $\mathbf{u}$. The unit quaternion $\mathbf{u}$ thus represents an $SO(3)$ rotation about its axis but through an angle that is twice the arclength of the quaternion itself. This is the reason why an element of $SU(2)$ is often written as in $\eqref{SU2Example_4}$; the angle $\theta$ there is the angle of the $SO(3)$ rotation represented by the quaternion.

Likewise $R_3$ and $R_1$ are the images of $R_2$ and $R_3$ under rotations about axes defined by the quaternions $\mathbf{v}$ and $\mathbf{w}$, respectively. And, of course, the inverse of the rotation mapping $R_3$ onto $R_1$ is the composition of the two rotations mapping first $R_1$ to $R_2$ then $R_2$ to $R_3$. So this rotation, represented by the quaternion $\mathbf{w}$, can be found graphically as the great circle arc closing the spherical triangle $\mathbf{a},\,\mathbf{b},\,\mathbf{c}$ when we pay proper heed to the scaling factor 2 between the unit quaternion’s arclength and the rotation’s angle. To calculate the rotation imparted by a unit quaternion $U\in SU(2)$ on the spatial vector $X$ we use the spinor map $X\mapsto U\,X\,U^\dagger$ defined in $\eqref{SU2Example_11}$.

Below is a Mathematica simulation showing some of the principles discussed above. Drag the triangle vertices using the 2D sliders. Use the “time” slider to see the congruences between the central triangle formed by vertices $\mathbf{a}$, $\mathbf{b}$, $\mathbf{c}$ and the three triangles $R_1$, $R_2$ and $R_3$. Use the “show congruence” checkbox to show or hide the moving triangles, “show labels” to show or hide labels, “show vertex vectors” to show the position vectors of vertices $\mathbf{a}$, $\mathbf{b}$, $\mathbf{c}$ and “show triangle only” to show the quaternion arc composition triangle alone.

Unimodular Matrix Groups; General Matrix Groups

Example 1.5: (Group of Unimodular Complex Matrices $SL(2,\,\mathbb{C})$)

The group of unimodular (unit determinant), complex element, $2\times 2$ matrices $SL(2,\,\mathbb{C})$ is clearly a group with matrix multiplication as the group operation. Unlike the group members in Example 1.3 and Example 1.4, a unimodular $2\times2$ matrix cannot always be diagonalised; for example, the best we can do for a matrix of the form $\left(\begin{array}{cc}1&z\\0&1\end{array}\right)$ with $z\neq0$, which has a double eigenvalue of 1, is to note that it is its own Jordan normal form.

However, we can still use the square-matrix Mercator-Newton series for the matrix logarithm to write any $SL(2,\,\mathbb{C})$ member “near enough” to the identity (i.e. with $\left\|\gamma-\id\right\| < 1$) as the exponential of a matrix. Therefore, for any $\gamma\in SL(2,\,\mathbb{C})$ with $\left\|\gamma-\id\right\| < 1$, we have $\gamma = e^X$ where:

\begin{equation}\label{SL2CExample_1}X=\log(\gamma) = \gamma-\id - \frac{(\gamma-\id)^2}{2}+\frac{(\gamma-\id)^3}{3}-\cdots\end{equation}
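The series and its relationship to unimodularity can be spot-checked numerically. The Python sketch below (with a hypothetical sample $\gamma$ constructed to have unit determinant and $\left\|\gamma-\id\right\|<1$) confirms that exponentiating the Mercator-Newton series logarithm recovers $\gamma$, and that the resulting logarithm is traceless:

```python
# Spot check (hypothetical sample gamma with det = 1 and ||gamma - id|| < 1)
# that exponentiating the Mercator-Newton series log recovers gamma, and
# that the logarithm of a unimodular matrix is traceless.

def mul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def expm(A, terms=40):
    result = [[1, 0], [0, 1]]
    term = [[1, 0], [0, 1]]
    for n in range(1, terms):
        term = [[sum(term[i][k] * A[k][j] for k in range(2)) / n
                 for j in range(2)] for i in range(2)]
        result = [[result[i][j] + term[i][j] for j in range(2)]
                  for i in range(2)]
    return result

def logm(G, terms=60):
    # log(gamma) = (gamma - id) - (gamma - id)^2/2 + (gamma - id)^3/3 - ...
    D = [[G[i][j] - (i == j) for j in range(2)] for i in range(2)]
    result = [[0, 0], [0, 0]]
    power = [[1, 0], [0, 1]]
    for n in range(1, terms):
        power = mul(power, D)
        result = [[result[i][j] + (-1) ** (n + 1) * power[i][j] / n
                   for j in range(2)] for i in range(2)]
    return result

gamma = [[1.05, 0.1j], [0.2, (1 + 0.02j) / 1.05]]   # det(gamma) = 1 exactly
X = logm(gamma)
assert abs(X[0][0] + X[1][1]) < 1e-10               # tr(X) = 0
roundtrip = expm(X)
assert all(abs(roundtrip[i][j] - gamma[i][j]) < 1e-10
           for i in range(2) for j in range(2))
```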

We recall the standard square matrix identity $\det(e^X) = \exp(\mathrm{tr}(X))$. If you don’t know why this is true, take heed that, for any square matrices $\gamma,\,\zeta$ we have $\det(\gamma\,\zeta) = \det(\gamma)\,\det(\zeta)$, so that $\det(e^{(\tau+\varsigma)\,X}) = \det(e^{\tau\,X})\,\det(e^{\varsigma\,X})$, whence $\d_\tau \det(e^{\tau\,X}) = \det(e^{\tau\,X}) \,\left.\d_\varsigma\det(e^{\varsigma\,X})\right|_{\varsigma=0}$; so, by the Picard–Lindelöf theorem (i.e. uniqueness of the solution of a first order differential equation when an initial condition is specified; see a later post for more on this theorem), we have $\det(e^{\tau\,X}) = \exp\left(\tau\,\left.\d_\varsigma\det(e^{\varsigma\,X})\right|_{\varsigma=0}\right)$, so now all we need to do is work out $\left.\d_\varsigma\det(e^{\varsigma\,X})\right|_{\varsigma=0}$. This last can be worked out with Jacobi’s Formula (see the Wikipedia page with this name for a proof) to be $\mathrm{tr}(X)$, whence our original identity. So the unimodularity condition on $\gamma=e^X\in SL(2,\mathbb{C})$ is equivalent to the tracelessness condition $\mathrm{tr}(X)=0$ on the logarithm of $\gamma$. The set of $2\times2$ traceless complex matrices is a vector space spanned by:

\begin{equation}\label{SL2CExample_3}\begin{array}{lclclclclcl}\hat{s}_x&=&\left(\begin{array}{cc}0&i\\i&0\end{array}\right)& &\hat{s}_y&=&\left(\begin{array}{cc}0&-1\\1&0\end{array}\right)& &\hat{s}_z=\left(\begin{array}{cc}i&0\\0&-i\end{array}\right)\\\hat{t}_x&=&\left(\begin{array}{cc}0&1\\1&0\end{array}\right)& &\hat{t}_y&=&\left(\begin{array}{cc}0&-i\\i&0\end{array}\right)& &\hat{t}_z=\left(\begin{array}{cc}1&0\\0&-1\end{array}\right)\end{array}\end{equation}
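The identity $\det(e^X) = \exp(\mathrm{tr}(X))$ recalled above also lends itself to a quick numerical spot-check (the helper name `expm` is my own; numpy assumed):

```python
import numpy as np

def expm(X, terms=60):
    """Matrix exponential via its universally convergent Taylor series."""
    E = np.eye(X.shape[0], dtype=complex)
    P = np.eye(X.shape[0], dtype=complex)
    for k in range(1, terms + 1):
        P = P @ X / k
        E = E + P
    return E

rng = np.random.default_rng(0)
X = 0.3 * (rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3)))
# det(e^X) = exp(tr(X)) for any square matrix X
assert np.isclose(np.linalg.det(expm(X)), np.exp(np.trace(X)))
# in particular, a traceless X exponentiates to a unimodular matrix
X = X - np.trace(X) / 3 * np.eye(3)
assert np.isclose(np.linalg.det(expm(X)), 1.0)
```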

If we now put:

\begin{equation}\label{SL2CExample_4}\begin{array}{rcl}\Nid &=& \left\{e^X|\,\mathrm{tr}(X) = 0;\,X\in\mathbb{M}(2,\,\mathbb{C});\,\|X\|<\log 2\right\}\\\lambda:\Nid\to\R^6;\, \lambda(e^X) &=& \lambda(\exp(x\,\hat{s}_x+y\,\hat{s}_y+z\,\hat{s}_z+u\,\hat{t}_x+v\,\hat{t}_y+w\,\hat{t}_z))\\&=&\left(\begin{array}{c}x\\y\\z\\u\\v\\w\end{array}\right);\,x,\,y,\,z,\,u,\,v,\,w\in\R\\\mathcal{V} \subset\R^6;\,\mathcal{V}&=&\log\Nid\end{array}\end{equation}

where $\mathbb{M}(M,\,\mathbb{C})$ is the linear space of $M\times M$ complex matrices, then the group of all finite products of exponentials of $2\times2$ traceless complex matrices is readily shown (do this) to fulfill all of axioms 1 through 5 above with Convention 1.2, and so is a connected Lie group with dimension $N=6$. However, we shall have to take it as a (for the time being unproven) fact that this smallest connected Lie group containing all of $\Nid$ is indeed the whole of $SL(2,\,\mathbb{C})$, because, unlike for $SO(3)$ and $SU(2)$ of Example 1.3 and Example 1.4, not every matrix in $SL(2,\,\mathbb{C})$ is the exponential of a $2\times2$ traceless complex matrix. The Cayley-Hamilton theorem for $2\times2$ complex matrices can be stated as $U^2-\mathrm{tr}(U)\,U + \det(U)\,\id=0$, so, reasoning as in $\eqref{SU2Example_4}$ for example, this yields the Rodrigues formula for the traceless matrix $X$:

\begin{equation}\label{SL2CExample_5}e^X = \cos\sqrt{\det\,X}\,\id + \frac{\sin\sqrt{\det\,X}}{\sqrt{\det\,X}}\,X;\;\frac{\sin 0}{0}\stackrel{def}{=}\mathrm{sinc}\, 0 = 1\end{equation}

so that if we try to solve:

\begin{equation}\label{SL2CExample_6}e^X = \cos\sqrt{\det\,X}\,\id + \frac{\sin\sqrt{\det\,X}}{\sqrt{\det\,X}}\,X =\gamma_0=\left(\begin{array}{cc}-1 & \alpha\\0&-1\end{array}\right);\;X=\left(\begin{array}{cc}a&b\\c&-a\end{array}\right);\,a,\,b,\,c\in\mathbb{C}\end{equation}

with $\alpha\neq0$, then, writing $\theta = \sqrt{\det X} = \sqrt{-a^2-b\,c}$ (possibly complex), we find:

\begin{equation}\label{SL2CExample_7}\begin{array}{lcl}-1&=&\cos\theta\pm a\,\frac{\sin\theta}{\theta}\\\alpha &=& b\,\frac{\sin\theta}{\theta}\\0 &=& c\,\frac{\sin\theta}{\theta}\end{array}\end{equation}

which has no solution over $\mathbb{C}$: the equation $0\neq\alpha =b\,\sin\theta/\theta$ shows that $\sin\theta/\theta \neq 0$, so that the pair of equations $-1=\cos\theta \pm a\,\sin\theta/\theta$ implies $a=0$, and $0=c\,\sin\theta/\theta$ implies $c=0$. So $\det\,X=-a^2-b\,c=0\,\Rightarrow\,\theta=0$ and therefore $e^X$ is of the form:

\begin{equation}\label{SL2CExample_8}e^X = \id+\left(\begin{array}{cc}0&b\\0&0\end{array}\right)=\left(\begin{array}{cc}1&b\\0&1\end{array}\right)\end{equation}

hence the matrix $\gamma_0$ in $\eqref{SL2CExample_6}$, clearly belonging to $SL(2,\,\mathbb{C})$, is not the exponential of any traceless, complex $2\times 2$ matrix.
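The Rodrigues formula $\eqref{SL2CExample_5}$ driving this argument can be spot-checked numerically; in the sketch below (helper names my own, numpy assumed) it is compared against the Taylor series exponential for a traceless matrix:

```python
import numpy as np

def expm(X, terms=60):
    """Matrix exponential via its universally convergent Taylor series."""
    E = np.eye(2, dtype=complex)
    P = np.eye(2, dtype=complex)
    for k in range(1, terms + 1):
        P = P @ X / k
        E = E + P
    return E

def rodrigues(X):
    """e^X for a traceless 2x2 X, using theta = sqrt(det X) (possibly complex)."""
    theta = np.sqrt(complex(np.linalg.det(X)))
    sinc = 1.0 if theta == 0 else np.sin(theta) / theta
    return np.cos(theta) * np.eye(2) + sinc * X

X = np.array([[0.3j, 1.0 - 0.2j], [0.4, -0.3j]])   # traceless, det possibly complex
assert abs(np.trace(X)) == 0.0
assert np.allclose(rodrigues(X), expm(X))
```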

$SL(2,\,\mathbb{C})$ is our first example of a complex connected Lie group: that is, if we replace the field $\R$ with the field $\mathbb{C}$ of complex numbers in our axioms 1 through 5 above, then $SL(2,\,\mathbb{C})$, together with the definitions:

\begin{equation}\label{SL2CExample_9}\begin{array}{l}\Nid = \left\{e^X|\,\mathrm{tr}(X) = 0;\,X\in\mathbb{M}(2,\,\mathbb{C});\,\|X\|<\log 2\right\}\\\lambda:\Nid\to\mathbb{C}^3;\, \lambda(e^X) = \lambda(\exp(x\,\hat{s}_x+y\,\hat{s}_y+z\,\hat{s}_z))=\left(\begin{array}{c}x\\y\\z\end{array}\right);\;x,\,y,\,z\in\mathbb{C}\\\mathcal{V} \subset \mathbb{C}^3;\;\mathcal{V}=\log\Nid\end{array}\end{equation}

then $SL(2,\,\mathbb{C})$ fulfills the modified definitions. From $\eqref{SL2CExample_9}$ we see that $SL(2,\,\mathbb{C})$ is like a “complexified” $SU(2)$, insofar that it is generated by matrices of the form $\exp(x\,\hat{s}_x+y\,\hat{s}_y+z\,\hat{s}_z)$ where the superposition weights $x,\,y,\,z$ are now allowed to be complex instead of real, as they must be for $SU(2)$. The fact that some elements of $SL(2,\,\mathbb{C})$ are not exponentials of this form, whereas all members of $SU(2)$ are (when $x,\,y,\,z$ are constrained to be real), is another key difference, a difference that arises (we state this without proof for now) from the former’s topological non-compactness, whereas the latter is topologically compact.

Clearly, every connected complex Lie group fulfilling the complexified versions of axioms 1 through 5 is also a connected real Lie group fulfilling the axioms as they are stated above, if we think of $\mathcal{V}\subset \mathbb{C}^N$ as instead $\mathcal{V}\subset \R^{2\,N}$ by assigning a real parameter to each real and imaginary part of the complex parameters; but the converse is clearly not true: not every connected real Lie group can be thought of as a complex Lie group. Therefore, any result holding for connected real Lie groups also holds for connected complex ones. The passing from a connected complex Lie group to the equivalent connected real Lie group, with the dimension $N$ doubling at the same time to $2\,N$, is called waiving by [FreudenthalDeVries]. The inverse process, when it can be done, is called twinning. An earlier name used by Freudenthal for the same process as waiving is “bedoubling” [Freudenthal, 1941] (Verdoppelung, which can also mean “duplication”; one speaks of “DNA Verdoppelung”, for example, so I’m guessing this may have been why Freudenthal changed the name in his later works).

$SL(2,\,\mathbb{C})$ has an important interpretation through its relationship with the Möbius group $\mathbb{M}(2)$, which is the group of Möbius transformations of the form:

\begin{equation}\label{SL2CExample_10}\mathscr{g}:\mathbb{C}\to\mathbb{C};\;\mathscr{g}(z)=\frac{a\,z+b}{c\,z+d};\;a,\,b,\,c,\,d\in\mathbb{C};\;a\,d-b\,c\neq 0\end{equation}

together with the group multiplication operation of function composition. Any such transformation is unaffected by a scaling $(a,\,b,\,c,\,d)\mapsto(\ell\,a,\,\ell\,b,\,\ell\,c,\,\ell\,d)$ for any $\ell\in\mathbb{C}\setminus\{0\}$. So we can choose a scaling by deeming:

\begin{equation}\label{SL2CExample_11}a\,d-b\,c = 1\end{equation}

In this form, the co-efficients $a_h,\,b_h,\,c_h,\,d_h$ defining the composition $h= f\circ g$ of the two Möbius transformations $f$ and $g$ defined by co-efficients $a_f,\,b_f,\,c_f,\,d_f$ and $a_g,\,b_g,\,c_g,\,d_g$ are given by the $SL(2,\,\mathbb{C})$ group product:

\begin{equation}\label{SL2CExample_12}\left(\begin{array}{cc}a_h&b_h\\c_h&d_h\end{array}\right) = \left(\begin{array}{cc}a_f&b_f\\c_f&d_f\end{array}\right)\,\left(\begin{array}{cc}a_g&b_g\\c_g&d_g\end{array}\right)\end{equation}

so that the mapping:

\begin{equation}\label{SL2CExample_13}\phi: SL(2,\,\mathbb{C})\to\mathbb{M}(2);\;\phi\left[\left(\begin{array}{cc}a&b\\c&d\end{array}\right)\right] : \mathbb{C}\to\mathbb{C};\, \phi\left[\left(\begin{array}{cc}a&b\\c&d\end{array}\right)\right](z) = \frac{a\,z+b}{c\,z+d}\end{equation}

fulfills $\phi(\gamma\,\zeta) = \phi(\gamma)\,\phi(\zeta);\,\forall\,\gamma,\,\zeta\in SL(2,\mathbb{C})$ and so is a homomorphism. Now there are indeed two matrices in $SL(2,\mathbb{C})$ that correspond to a given Möbius transformation: the two transformations $f_+(z) = \frac{a\,z+b}{c\,z+d}$ and $f_-(z) = \frac{-a\,z-b}{-c\,z-d}$, corresponding to two distinct matrices, to wit an $SL(2,\,\mathbb{C})$ matrix and the same matrix multiplied by $-1$, are readily seen to be the same transformation. Therefore the kernel of the homomorphism is $\mathbb{Z}_2=\{\id,\,-\id\}\subset SL(2,\,\mathbb{C})$ and so $\mathbb{M}(2)\cong SL(2,\,\mathbb{C})/\mathbb{Z}_2$. The notation $PSL(2,\,\mathbb{C})\cong SL(2,\,\mathbb{C})/\mathbb{Z}_2 \cong \mathbb{M}(2)$ is also current and stands for the “projective special linear” group.
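Both the homomorphism property and the $\mathbb{Z}_2$ kernel can be checked numerically; in this sketch (function names my own, numpy assumed) two matrices are scaled to unit determinant and applied as Möbius transformations:

```python
import numpy as np

def mobius(m):
    """Return the Mobius transformation corresponding to a 2x2 matrix m."""
    (a, b), (c, d) = m
    return lambda z: (a * z + b) / (c * z + d)

def normalise(m):
    """Scale m so that det = 1, i.e. land in SL(2,C)."""
    return m / np.sqrt(np.linalg.det(m))

g = normalise(np.array([[1 + 0.5j, 0.3], [0.2j, 1.0]]))
h = normalise(np.array([[0.8, -0.4j], [0.1, 1.2 + 0.1j]]))
assert np.isclose(np.linalg.det(g), 1.0) and np.isclose(np.linalg.det(h), 1.0)

z = 0.3 + 0.7j
# composition of transformations corresponds to the matrix product
assert np.isclose(mobius(g @ h)(z), mobius(g)(mobius(h)(z)))
# gamma and -gamma give the same transformation (kernel Z_2)
assert np.isclose(mobius(-g)(z), mobius(g)(z))
```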

Example 1.6: (Group of Unimodular Real Matrices $SL(2,\,\R)$)

The group of unimodular (unit determinant), real element, $2\times 2$ matrices $SL(2,\,\R)$ is clearly a group with matrix multiplication as the group operation. Naturally, it is a subgroup, although not a normal subgroup, of $SL(2,\,\mathbb{C})$. Reasoning exactly as in Example 1.5, we show that the group of all finite products of exponentials $e^X$ of real traceless matrices, i.e. of matrices of the following form:

\begin{equation}\label{SL2RExample_1}X=\left(\begin{array}{cc}a&b\\c&-a\end{array}\right);\;a,\,b,\,c\in\R\end{equation}

is indeed a connected Lie group and is contained within $SL(2,\,\R)$. We need for now to take without proof the assertion that this connected Lie group is indeed the whole of $SL(2,\R)$.

The example given by $\eqref{SL2CExample_6}$, $\eqref{SL2CExample_7}$ and $\eqref{SL2CExample_8}$ not only describes a matrix in $SL(2,\,\mathbb{C})$ but one that is also in $SL(2,\,\R)$. So $SL(2,\,\R)$ has members which are not the exponentials of traceless matrices, even though all the members near the identity are. Like $SL(2,\,\mathbb{C})$, $SL(2,\,\R)$ differs from $SU(2)$ and $SO(3)$ insofar that the latter are exponentials of linear spaces whereas the former are not; this fact, again, has to do with the compactness of the latter contrasted with the non-compactness of the former.

$SL(2,\,\R)$ also has an important geometric interpretation through Poincaré’s theorem, which deals with the hyperbolic half-plane. We summarise the following without proof. The hyperbolic half-plane is the metric space $(\mathbb{H}^2,\,\Delta)$ where:

\begin{equation}\label{SL2RExample_2}\mathbb{H}^2 = \{z\in\mathbb{C}: \mathrm{Im}(z)>0\}\end{equation}

is kitted with the metric or distance function defined by:

\begin{equation}\label{SL2RExample_3}\Delta: \mathbb{H}^2\times \mathbb{H}^2\to\R;\,\Delta(z_1,\,z_2) = 2\,\mathrm{artanh}\left|\frac{z_1-z_2}{z_1-z_2^*}\right|\end{equation}

which in turn is defined by the line element:

\begin{equation}\label{SL2RExample_4}ds^2 = \frac{dx^2 + dy^2}{y^2}\end{equation}

so that the length of a curve $\sigma:[0,\,1]\to\mathbb{H}^2;\,\sigma(t) = x(t)+i\,y(t)$ is:

\begin{equation}\label{SL2RExample_5}L_\sigma = \int_0^1 \frac{\d\,s}{\d\,t}\,\d\,t = \int_0^1 \frac{\sqrt{\left(\frac{\d\,x(t)}{\d\,t}\right)^2 + \left(\frac{\d\,y(t)}{\d\,t}\right)^2}}{|y(t)|}\,\d\,t\end{equation}

and the distance $\Delta(z_1,\,z_2)$ is the length of the shortest curve, i.e. the geodesic, joining $z_1=\sigma(0)$ and $z_2 = \sigma(1)$. It can be shown that the set of geodesics in this metric space is the set of all circular (in the conventional, Euclidean sense) arcs whose centres lie on the real axis, together with the vertical half-lines, which can be thought of as such circles of infinite radius.
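The isometry property asserted by Theorem 1.7 below can be previewed numerically with the distance formula $\eqref{SL2RExample_3}$; the sketch below (helper names my own, numpy assumed) applies a unit determinant real matrix as a Möbius transformation of $\mathbb{H}^2$:

```python
import numpy as np

def hyp_dist(z1, z2):
    """Hyperbolic half-plane distance: Delta = 2 artanh |(z1 - z2)/(z1 - z2*)|."""
    return 2 * np.arctanh(abs((z1 - z2) / (z1 - np.conj(z2))))

def psl2r_action(m, z):
    """The Mobius transformation of H^2 given by a real matrix m with det = 1."""
    (a, b), (c, d) = m
    return (a * z + b) / (c * z + d)

m = np.array([[2.0, 1.0], [1.0, 1.0]])   # real, det = 1, so in SL(2,R)
assert np.isclose(np.linalg.det(m), 1.0)

z1, z2 = 0.5 + 1.0j, -0.3 + 2.0j
w1, w2 = psl2r_action(m, z1), psl2r_action(m, z2)
assert w1.imag > 0 and w2.imag > 0                       # H^2 maps into H^2
assert np.isclose(hyp_dist(w1, w2), hyp_dist(z1, z2))    # distance is conserved
```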

In the following, we think of the Möbius transformations corresponding to the elements of $SL(2,\,\R)$. As in Example 1.5, there are two elements of $SL(2,\,\R)$ which map to the same Möbius transformation by the homomorphism of $\eqref{SL2CExample_13}$; we call the image of $SL(2,\,\R)$ under this homomorphism $PSL(2,\,\R) \cong SL(2,\,\R)/\mathbb{Z}_2$. We now state without proof:

Theorem 1.7:(Poincaré, 1882)

The group $PSL(2,\,\mathbb{R})$ of bilinear transformations of the form:

\begin{equation}\label{PoincaresTheorem_1}\mathscr{P}:\mathbb{H}^2\to\mathbb{H}^2;\;\mathscr{P}(z) = \frac{\alpha\,z+\beta}{\gamma\,z+\delta};\;\alpha,\,\beta,\,\gamma,\,\delta \in \mathbb{R};\;\alpha\delta-\beta\gamma=1\end{equation}

is precisely the group of orientation preserving isometries of $\mathbb{H}^2$; that is, every transformation of this kind is an isometry, and all orientation preserving isometries are of this kind.

The orientation-reversing isometries are precisely those functions of the form $z\mapsto \mathscr{P}(z^*)$, where $\mathscr{P}\in PSL(2,\,\R)$.

Proof: See §4.4 of [Stillwell], which is a modern rendering of the original [Poincaré] reference. $\qquad\square$


Example 1.8: (General Linear Group of Matrices)

The group of nonsingular, complex element, $M\times M$ matrices $GL(M,\,\mathbb{C})$ is clearly a group with matrix multiplication as the group operation. We now consider matrices “near” the identity; we give $GL(M,\,\mathbb{C})$ a topology by declaring it to be a metric space with the distance function $d: GL(M,\,\mathbb{C})\times GL(M,\,\mathbb{C})\to\R;\,d(\gamma,\,\zeta) = \left\|\gamma-\zeta\right\|$, where $\left\|\cdot\right\|$ is the Frobenius norm $\left\|\xi\right\| = \sqrt{\mathrm{tr}(\xi^\dagger\,\xi)}$. This is simply the Pythagorean sum of the magnitudes of all the elements in a matrix: $\left\|\xi\right\|=\sqrt{\sum\limits_{k,\,j} |\xi_{j,\,k}|^2}$.

We now consider the identity connected component $GL^+(M,\,\mathbb{C}) \subset GL(M,\,\mathbb{C})$ of this group.

In a small enough neighbourhood of the identity, i.e. $\mathcal{B}(\id,\,\epsilon) = \{\gamma\in GL(M,\,\mathbb{C}) |\,\left\|\gamma-\id\right\| < \epsilon\}$ with $\epsilon < 1$, the matrix logarithm Taylor series (the Mercator-Newton series) converges uniformly. It maps this neighbourhood bijectively onto a neighbourhood $\V$ of $\Or$ in the linear space $\mathcal{M}(M,\,\mathbb{C})$ of general $M\times M$ complex matrices.

We define $GL^+(M,\,\mathbb{C})$ to be the smallest group that contains $\mathcal{B}(\id,\,\epsilon) = \exp\left(\V\right)$, i.e. $GL^+(M,\,\mathbb{C})$ is the group of all finite products of members of $\mathcal{B}(\id,\,\epsilon) = \exp\left(\V\right)$ and their inverses. The exponential of any neighbourhood of $\Or$ (i.e. of any set containing some open ball $\mathcal{B}(\Or,\,\epsilon^\prime)$) will generate the same group $GL^+(M,\,\mathbb{C})$, for, given any $M\times M$ complex matrix $X\in\mathcal{M}(M,\,\mathbb{C})$, we can find an integer $n$ such that $X/n\in\mathcal{B}(\Or,\,\epsilon^\prime)$, whence $\exp(X) = \exp(X/n)^n$, where here $\exp$ is the universally convergent matrix exponential defined by the exponential Taylor series. So then $\exp(X)$ is an $n$-fold product of elements $\exp(X/n)$ with $X/n\in\mathcal{B}(\Or,\,\epsilon^\prime)$ for some positive integer $n$. Likewise for the inverse $\exp(-X)$. Thus any $\exp(\mathcal{B}(\Or,\,\epsilon^\prime))$ generates the smallest group containing the set $\{\exp(X)|\,X\in \mathcal{M}(M,\,\mathbb{C})\}$, independently of $\epsilon^\prime$.
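The pivot of this argument is the identity $\exp(X)=\exp(X/n)^n$, which is a quick numerical check (helper name my own, numpy assumed):

```python
import numpy as np

def expm(X, terms=60):
    """Matrix exponential via its universally convergent Taylor series."""
    E = np.eye(X.shape[0], dtype=complex)
    P = np.eye(X.shape[0], dtype=complex)
    for k in range(1, terms + 1):
        P = P @ X / k
        E = E + P
    return E

rng = np.random.default_rng(2)
X = rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3))
n = 16
# exp(X) equals the n-fold product of exp(X/n), each factor near the identity
assert np.allclose(expm(X), np.linalg.matrix_power(expm(X / n), n))
# and exp(-X) is the inverse of exp(X)
assert np.allclose(expm(X) @ expm(-X), np.eye(3))
```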

It should now be clear that $GL^+(M,\,\mathbb{C})$ is a connected Lie group if we put $\Nid = \exp(\mathcal{B}(\Or,\,\epsilon^\prime)) \subset \mathcal{B}(\id,\,\epsilon)$ with $\epsilon<1$, $\lambda = \log$ and $\mathcal{V}$ to be $\mathcal{B}(\Or,\,\epsilon^\prime)$. The dimension $N$ of the group is $2\,M^2$, since $\mathcal{B}(\Or,\,\epsilon^\prime)$ comprises general (albeit small norm) $M\times M$ complex matrices.

For the time being, we shall have to take it as an unproven fact that $GL(M,\,\mathbb{C})$ is a connected Lie group, so that our group $GL^+(M,\,\mathbb{C})$ defined above is indeed the whole of $GL(M,\,\mathbb{C})$.

If the $M\times M$ matrices $\hat{Z}_1,\,\hat{Z}_2,\,\cdots,\,\hat{Z}_{2\,M^2}$ are a basis for the $2\,M^2$ dimensional $\R$-vector space $\mathcal{M}(M,\,\mathbb{C})$ of $M\times M$ complex matrices, then, by the bijectivity of $\exp,\,\log$ between $\mathcal{B}(\id,\,\epsilon)$ and $\mathcal{B}(\Or,\,\epsilon^\prime)$ we explored above, we can always find unique co-ordinates $(z_j)_{j=1}^{2\,M^2}$ that define any $\gamma\in\Nid$ as $\gamma = \exp\left(\sum\limits_{k=1}^{2\, M^2}\,z_k\, \hat{Z}_k\right)$, and every member of $GL^+(M,\,\mathbb{C})$ is a finite product of matrices labelled thus (although the finite product is not unique: many products can multiply to the same member of $GL^+(M,\,\mathbb{C})$). The co-ordinates $(z_j)_{j=1}^{2\,M^2}$ uniquely and fully labelling $\Nid$ in this way (i.e. as superposition weights of an exponentiated vector $Z\in\mathcal{M}(M,\,\mathbb{C})$) are called geodesic co-ordinates, exponential co-ordinates, geodetic co-ordinates or canonical co-ordinates of the first kind. I shall use the word geodesic almost exclusively hereafter.

It should be heeded that $GL(M, \,\mathbb{C})$ is isomorphic to a subgroup of the identity connected component of $GL(2\,M, \,\R)$; we can construct this isomorphism through Freudenthal’s waiving procedure, as described in Example 1.5 [FreudenthalDeVries]. The matrix $A+i\,B\in GL(M,\,\mathbb{C})$, where $A$ and $B$ are real, maps to $\id\otimes A+\left(\begin{array}{cc}0&-1\\1&0\end{array}\right)\otimes B = \left(\begin{array}{c|c}A&-B\\\hline B&A\end{array}\right)\in GL(2\,M, \,\R)$.
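A numerical sketch of the waiving map (the function name `waive` is my own, numpy assumed) confirms both the homomorphism property and that the image has positive determinant, consistent with landing in the identity connected component:

```python
import numpy as np

def waive(m):
    """Freudenthal 'waiving': map A + iB in GL(M,C) to [[A,-B],[B,A]] in GL(2M,R)."""
    A, B = m.real, m.imag
    return np.block([[A, -B], [B, A]])

rng = np.random.default_rng(3)
g = rng.standard_normal((2, 2)) + 1j * rng.standard_normal((2, 2))
h = rng.standard_normal((2, 2)) + 1j * rng.standard_normal((2, 2))

# the map is a homomorphism: products map to products
assert np.allclose(waive(g @ h), waive(g) @ waive(h))
# det(waive(g)) = |det(g)|^2 >= 0, so the image has positive determinant
assert np.isclose(np.linalg.det(waive(g)), abs(np.linalg.det(g)) ** 2)
```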

Example 1.9: (General Linear Group of Real Matrices)

If we replace the set $\mathbb{C}$ with the set $\R$ in Example 1.8, the example runs almost exactly analogously. We get the general linear group of $M\times M$ real matrices, and we find that there is a connected Lie group $GL^+(M,\,\R)$ which comprises all finite products of matrices of the form $e^X$, where $X$ belongs to the set $\mathcal{M}(M,\,\R)$ of $M\times M$ real matrices. If we take $\Nid = \exp(\mathcal{B}(\Or,\,\epsilon^\prime)) \subset \mathcal{B}(\id,\,\epsilon)$ with $\epsilon<1$, $\lambda = \log$ and $\mathcal{V}$ to be $\mathcal{B}(\Or,\,\epsilon^\prime)$, then the set of all finite products of members of $\Nid$ is a connected Lie group with dimension $N = M^2$. We call this group $GL^+(M,\,\R)$, the identity component of the full group $GL(M,\,\R)$ of nonsingular real matrices. Again, the topology is induced by the Frobenius norm, the Pythagorean sum of all the elements in a matrix (this time, of course, we don’t have to take complex magnitudes): $\left\|\xi\right\|=\sqrt{\mathrm{tr}(\xi^T\,\xi)}=\sqrt{\sum\limits_{k,\,j} \xi_{j,\,k}^2}$.

However, this time we do not get the whole of the group $GL(M,\,\R)$ in our connected Lie group $GL^+(M,\,\R)$, i.e. $GL^+(M,\,\R)\subset GL(M,\,\R)$ is a proper subset of $GL(M,\,\R)$. The reason for this is our formula $\det\exp X = \exp(\mathrm{tr}\,X)$; the determinants of all members of $\Nid$ are thus positive real numbers. So only nonsingular matrices with positive determinants can be finite products of members of $\Nid$.

We must take without proof for now the statement that $GL^+(M,\,\R)$ is in fact the whole of the group of nonsingular, positive determinant matrices, that $GL^+(M,\,\R)$ is a normal subgroup of $GL(M,\,\R)$ and that $GL(M,\,\R)/GL^+(M,\,\R)\cong\mathbb{Z}_2$. The two cosets are precisely $GL^+(M,\,\R)$, i.e. the group of positive determinant nonsingular matrices and the set of nonsingular matrices with negative determinant. For odd $M$, the two cosets are precisely $\pm GL^+(M,\,\R)$.

Example 1.10: (General Classical Groups and The Proper Lorentz Group $SO^+(1,\,3)$)

The group $U(M)$ (the $M\times M$ analogue of $U(2)$) conserves the $\ell^2$ length of vectors in $\mathbb{C}^M$; that is, given a $\gamma\in U(M)$, then

\begin{equation}\label{SO13Example_1}\left\|\gamma\,v\right\| = \left\|v\right\|\,\Leftrightarrow\,v^\dagger\,\gamma^\dagger\,\gamma\,v = v^\dagger\, v;\;\forall\, v\in\mathbb{C}^M\,\Leftrightarrow\,v^\dagger\,(\gamma^\dagger\,\gamma-\id)\,v=0;\;\forall\, v\in\mathbb{C}^M\end{equation}

and this implies a seemingly stronger conservation law when we write $v+u$ instead of $v$ for any two arbitrary $u,\,v\in\mathbb{C}^M$, because:

\begin{equation}\label{SO13Example_2}\begin{array}{rl}&(u+v)^\dagger\,(\gamma^\dagger\,\gamma-\id)\,(u+v)=0;\;\forall\, v,\,u\in\mathbb{C}^M\\\Leftrightarrow&v^\dagger\,(\gamma^\dagger\,\gamma-\id)\,v + u^\dagger\,(\gamma^\dagger\,\gamma-\id)\,u + u^\dagger\,(\gamma^\dagger\,\gamma-\id)\,v + v^\dagger\,(\gamma^\dagger\,\gamma-\id)\,u=0;\;\forall\,u,\,v\in\mathbb{C}^M\end{array}\end{equation}

so that, on substituting $\eqref{SO13Example_1}$ into $\eqref{SO13Example_2}$ (and then replacing $u$ by $i\,u$ in what remains, which shows the two surviving cross terms vanish separately) we get:

\begin{equation}\label{SO13Example_3}u^\dagger\,(\gamma^\dagger\,\gamma-\id)\,v=0;\;\forall\, u,\,v\in\mathbb{C}^M\,\Leftrightarrow\,\gamma^\dagger\,\gamma=\id\end{equation}

which, together with an assumption of unimodularity, is our current definition of $SU(M)$. So we see that the simple assumption of the conserved $\ell^2$ length of vectors is equivalent to the definition $\{\gamma\in\mathcal{M}(M,\,\mathbb{C})|\;\gamma^\dagger\,\gamma=\id\}$. Furthermore, the first equation $u^\dagger\,(\gamma^\dagger\,\gamma-\id)\,v=0$ in $\eqref{SO13Example_3}$ shows in turn that the definition $\left\|\gamma\,v\right\| = \left\|v\right\|;\,\forall\,v\in\mathbb{C}^M$ is equivalent to an assumption that the transformation $v\mapsto\gamma\,v;\,u\mapsto\,\gamma\,u$ preserves the inner product $\left<u,\,v\right> = u^\dagger\,v$, because then $\left<\gamma\,u,\,\gamma\,v\right> = u^\dagger\,\gamma^\dagger\,\gamma\,v = u^\dagger\,v = \left<u,\,v\right>$.

Many of the so called “classical Lie groups” are defined analogously, as groups of $M\times M$ matrices that act on $M\times 1$ column vectors in such a way that they conserve different quadratic forms $Q$, arising from bilinear forms defined by $Q(u,\,v) = u^\dagger \, \eta\,v$. By reasoning precisely analogous to the logical flow of $\eqref{SO13Example_1}$, $\eqref{SO13Example_2}$ and $\eqref{SO13Example_3}$, the conservation of the “diagonal form” $Q(v,\,v) = v^\dagger \,\eta\,v$ is logically equivalent to the conservation of the seemingly much more general “inner product form” $Q(u,\,v) = u^\dagger \, \eta\,v$. The group $U(M)$ is simply the special case when dealing with complex matrices and the quadratic form is the “simplest case” defined by $\eta = \id$. The orthogonal group $O(M)$ is the special case when dealing with real matrices and vectors (so that $v^\dagger = v^T$) and the quadratic form is again the “simplest case” defined by $\eta = \id$. The groups $SU(M)$ and $SO(M)$ are simply the subgroups we get when we add the further constraint that the group members should be unimodular.
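As a concrete numerical illustration of the $\eta=\id$ case (a sketch only, with my own helper names, numpy assumed): a member of $U(3)$, manufactured here as the exponential of a skew-Hermitian matrix (one standard way of building a unitary matrix), conserves the inner product $\left<u,\,v\right>=u^\dagger\,v$:

```python
import numpy as np

def expm(X, terms=60):
    """Matrix exponential via its universally convergent Taylor series."""
    E = np.eye(X.shape[0], dtype=complex)
    P = np.eye(X.shape[0], dtype=complex)
    for k in range(1, terms + 1):
        P = P @ X / k
        E = E + P
    return E

rng = np.random.default_rng(5)
A = rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3))
S = (A - A.conj().T) / 2            # skew-Hermitian, so expm(S) is unitary
g = expm(S)
assert np.allclose(g.conj().T @ g, np.eye(3))   # gamma^dagger gamma = id

u = rng.standard_normal(3) + 1j * rng.standard_normal(3)
v = rng.standard_normal(3) + 1j * rng.standard_normal(3)
# conservation of the inner product <u, v> = u^dagger v (np.vdot conjugates
# its first argument)
assert np.isclose(np.vdot(g @ u, g @ v), np.vdot(u, v))
```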

The description of the classical groups through the quadratic forms they conserve is central to, and extremely well expounded on by, Chapter 3 of [Rossmann].

In the real case, the simplest $4\times4$ example is the $4\times4$ orthogonal group $O(4)$, whose members conserve the quadratic form $v^T\,v = t^2+x^2+y^2+z^2$, where $t,\,x,\,y,\,z$ are the four components of the $4\times 1$ column vector $v$. This form is the wonted, Euclidean $\ell^2$ length of a vector and induces the standard $\ell^2$ metric function $d(u,\,v)=\sqrt{(u-v)^T\,(u-v)}$ measuring the distance between two points with position vectors $u,\,v$ in the Euclidean space.

An added sophistication is to consider the group of real matrices that conserve the squared Minkowski “norm” $\left\|v\right\|_M^2 = t^2-x^2-y^2-z^2$. The sequence of signs $+,\,-,\,-,\,-$ applied to the terms in the otherwise Euclidean norm is called the signature of the “norm”. I write the quotes here because this “norm” is not a norm in the wonted, mathematical sense: it can vanish for nonzero vectors, called null vectors, which define the light cone $t^2-x^2-y^2-z^2 = 0$. Furthermore, it is not even a seminorm, as it is not subadditive (does not fulfill the triangle inequality).

In the field of physics called Special Relativity, the Minkowski “distance” $\left\|v-u\right\|_M=\sqrt{(t_u-t_v)^2-(x_u-x_v)^2-(y_u-y_v)^2-(z_u-z_v)^2}$ between two points, or events, is called the proper time that separates them, and is measured to be the same by all inertial observers. In the Minkowski space of Special Relativity, there are three spatial co-ordinates $x,\,y,\,z$ and $t$ is the time, but here measured in distance units: the time elapsed between two events with the same spatial co-ordinates is measured by the distance light travels over that time interval. Alternatively, one can use the astronomer’s approach and measure all four co-ordinates in time units: $x,\,y,\,z$ now have, for example, units of seconds (analogous to the astronomer’s light year, light hour and so on). It is readily checked, arguing analogously to the reasoning leading to $\eqref{SO13Example_3}$, that the real matrices that conserve the Minkowski norm are precisely those that fulfill:

\begin{equation}\label{SO13Example_4}\gamma\in \mathcal{M}(4,\,\R);\,\gamma\,\eta\,\gamma^T = \eta;\;\eta=\mathrm{diag}[1,\,-1,\,-1,\,-1]\end{equation}

This is the general Lorentz group $O(1,\,3)$. The notation $O(p,\,q)$ stands for the group of matrices that act on the vector space $\R^{p+q}$ and conserve the quadratic form:

\begin{equation}\label{SO13Example_5}Q(v) = v_1^2+v_2^2 + \cdots + v_p^2 - (v_{p+1}^2 + v_{p+2}^2 + \cdots + v_{p+q}^2) \end{equation}

an equation which in turn, by the reasoning leading to $\eqref{SO13Example_2}$, implies the matrices also conserve the generalised “inner product” (not an inner product in the mathematical sense unless $p=0$ or $q=0$):

\begin{equation}\label{SO13Example_6}Q(u,\,v) = u_1\,v_1+u_2\, v_2 + \cdots + u_p\,v_p - (u_{p+1}\,v_{p+1} + u_{p+2}\,v_{p+2}+ \cdots + u_{p+q}\,v_{p+q})\end{equation}

and these matrices are precisely characterised by:

\begin{equation}\label{SO13Example_7}\gamma\in \mathcal{M}(p+q,\,\R);\,\gamma\,\eta\,\gamma^T = \eta;\;\eta=\mathrm{diag}[\overbrace{1,\,1,\,1,\cdots,\,1}^\text{p elements},\,\underbrace{-1,\,-1,\,-1,\cdots,\,-1}_\text{q elements}]\end{equation}

$O(p,\,q)$ is readily shown to be a group: products of matrices conserving the generalised norm and inner product also conserve the same, as do the inverses of these matrices. To study this group more deeply, let us think of it at first as sitting within the metric space $\mathcal{M}(p+q,\,\R)$ of $(p+q)\times(p+q)$ real matrices. We first ask whether we can find a $C^1$ (with respect to the topology of $\mathcal{M}(p+q,\,\R)$) path $\sigma:[-1,1]\to O(p,\,q);\,\sigma(0)=\id$ lying wholly within $O(p,\,q)$, and we calculate the tangent to such a path at some value $\tau=\tau_0$. Since $O(p,\,q)$ is a group, $\tilde{\sigma}(\varsigma)=\sigma^{-1}(\tau_0)\,\sigma(\tau_0+\varsigma)$ is also a path through the group, so that:

\begin{equation}\label{SO13Example_8}\left.\d_\tau\sigma(\tau)\right|_{\tau=\tau_0} = \sigma(\tau_0)\,\left.\d_\varsigma\,\sigma^{-1}(\tau_0)\,\sigma(\tau_0+\varsigma)\right|_{\varsigma=0}\end{equation}

that is, because $\sigma^{-1}(\tau_0)\,\sigma(\tau_0+\varsigma)$ passes through $\id$ at $\varsigma=0$, the tangent is of the form $\d_\tau \sigma(\tau) = \sigma(\tau)\,X$, where $X$ is a $(p+q)\times(p+q)$ matrix and is the tangent at the identity to some $C^1$ path in $\mathcal{M}(p+q,\,\R)$ lying within $O(p,\,q)$. This suggests that we should study the tangents to $C^1$ paths through the identity, as tangents to paths at any point are simply these matrices $X$ “translated” by $X\mapsto \sigma(\tau)\,X$; differentiating the defining condition of $\eqref{SO13Example_7}$ at the identity gives $X\,\eta + \eta\,X^T = 0$, so these tangents are precisely those matrices that belong to the set:

\begin{equation}\label{SO13Example_9}\mathfrak{so}(p,\,q) = \left\{X\in \mathcal{M}(p+q,\,\R)\left|\,\begin{array}{lcl}X&=&-\eta\,X^T\,\eta^{-1} = -\eta\,X^T\,\eta;\\\eta&=&\mathrm{diag}[\overbrace{1,\,1,\,1,\cdots,\,1}^\text{p elements},\,\underbrace{-1,\,-1,\,-1,\cdots,\,-1}_\text{q elements}]\end{array}\right.\right\}\end{equation}

This is a vector space; for $O(1,\,3)$, for example, $\mathfrak{so}(1,\,3)$ is a six dimensional real vector space of matrices, with one possible basis (the rotation generators $\hat{J}_j$ and boost generators $\hat{K}_j$):

\begin{equation}\label{SO13Example_10}\begin{array}{lclclcl}\hat{J}_x&=&\left(\begin{array}{cccc}0&0&0&0\\0&0&0&0\\0&0&0&-1\\0&0&1&0\end{array}\right)& &\hat{J}_y&=&\left(\begin{array}{cccc}0&0&0&0\\0&0&0&1\\0&0&0&0\\0&-1&0&0\end{array}\right)\\\hat{J}_z&=&\left(\begin{array}{cccc}0&0&0&0\\0&0&-1&0\\0&1&0&0\\0&0&0&0\end{array}\right)& &\hat{K}_x&=&\left(\begin{array}{cccc}0&1&0&0\\1&0&0&0\\0&0&0&0\\0&0&0&0\end{array}\right)\\\hat{K}_y&=&\left(\begin{array}{cccc}0&0&1&0\\0&0&0&0\\1&0&0&0\\0&0&0&0\end{array}\right)& &\hat{K}_z&=&\left(\begin{array}{cccc}0&0&0&1\\0&0&0&0\\0&0&0&0\\1&0&0&0\end{array}\right)\end{array}\end{equation}

Witness that the $\hat{J}_j$ are skew-Hermitian (skew-symmetric for real matrices) whereas the $\hat{K}_j$ are symmetric.

So now, suppose we consider the path $\gamma(\tau) = e^{\tau\,X}$ for $X\in\mathfrak{so}(p,\,q)$, where naturally $e^{\tau\,X}$ is the matrix exponential defined by the universally convergent matrix Taylor series. At $\tau=0$, this matrix, being the identity, fulfills $\gamma\,\eta\,\gamma^T = \eta$. At a general point $\tau$ on the path:

\begin{equation}\label{SO13Example_11}\d_\tau\left(\gamma\,\eta\,\gamma^T\right) = (\d_\tau\,\gamma)\,\eta\,\gamma^T + \gamma\,\eta\,(\d_\tau\,\gamma)^T = e^{\tau\,X}\,\left(X\,\eta+\eta\,X^T\right)\,e^{\tau\,X^T}=0\end{equation}

the last step following from $\eqref{SO13Example_9}$. Therefore, $\gamma\,\eta\,\gamma^T$ is constant on the path and equal to its value at $\tau=0$, to wit $\eta$. Therefore $\gamma(\tau)\,\eta\,\gamma(\tau)^T=\eta;\;\forall\,\tau\in\R$ and the path $\gamma(\tau)$ indeed lies within the group $O(p,\,q)$ (indeed within $SO(p,\,q)$: $X$ is traceless, so $e^{\tau\,X}$ is unimodular).
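This closure argument can be spot-checked numerically for $\mathfrak{so}(1,\,3)$ (helper names my own, numpy assumed), using the tangent condition $X\,\eta+\eta\,X^T=0$:

```python
import numpy as np

eta = np.diag([1.0, -1.0, -1.0, -1.0])   # the O(1,3) metric matrix

def project_so13(A):
    """Project a matrix onto so(1,3), i.e. the solutions of X = -eta X^T eta."""
    return (A - eta @ A.T @ eta) / 2

def expm(X, terms=60):
    """Matrix exponential via its universally convergent Taylor series."""
    E, P = np.eye(4), np.eye(4)
    for k in range(1, terms + 1):
        P = P @ X / k
        E = E + P
    return E

rng = np.random.default_rng(4)
X = project_so13(rng.standard_normal((4, 4)))
assert np.allclose(X, -eta @ X.T @ eta)          # X is in so(1,3)
assert np.allclose(X @ eta + eta @ X.T, 0.0)     # equivalent tangent condition

g = expm(X)
assert np.allclose(g @ eta @ g.T, eta)           # e^X conserves the form
assert np.isclose(np.linalg.det(g), 1.0)         # and is unimodular (tr X = 0)
```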

Conversely, consider any element $\gamma\in O(p,\,q)$ near the identity, i.e. with $\|\gamma - \id\|<1$, so that the Mercator-Newton logarithm series converges. We can therefore write any element of this kind as $\gamma = e^Z\in O(p,\,q)$ where $Z = \log\gamma \in \mathcal{M}(p+q,\,\R)$. Now consider the $C^1$ path $\sigma:[-1,\,1]\to \mathcal{M}(p+q,\,\R);\,\sigma(\tau) = e^{\tau\,Z},\,Z=\log\gamma$ through $\mathcal{M}(p+q,\,\R)$. We do not know that this path is inside $O(p,\,q)$, aside from at the point $\gamma = \sigma(1)$. We know from Example 1.8 that every $(p+q)\times (p+q)$ real matrix near the identity (i.e. with $\|\gamma - \id\|<1$) can be represented uniquely in the form:

\begin{equation}\label{SO13Example_12}\gamma = e^Z=\mu_1\left((z_k)_{k=1}^{(p+q)^2}\right)= \exp\left(\sum\limits_{k=1}^{(p+q)^2}\,z_k\, \hat{Z}_k\right)\end{equation}

where $Z\in\mathcal{M}(p+q,\,\R)$, the $\hat{Z}_k$ form a basis for the real vector space $\mathcal{M}(p+q,\,\R)$ of $(p+q)\times(p+q)$ real matrices and the $(p+q)^2$ numbers $z_k$ are the unique geodesic co-ordinates for $\gamma$. Now let us consider $\mathfrak{so}(p,\,q)$ as a vector subspace of $\mathcal{M}(p+q,\,\R)$, so that we have $\mathcal{M}(p+q,\,\R) = \mathfrak{so}(p,\,q) \oplus \mathfrak{so}(p,\,q)^\perp$. Writing $d = \dim\,\mathfrak{so}(p,\,q)$ and $d^\perp = (p+q)^2 - d$, now choose bases $\{\hat{X}_j\}_{j=1}^{d}$ and $\{\hat{Y}_j\}_{j=1}^{d^\perp}$ for $\mathfrak{so}(p,\,q)$ and $\mathfrak{so}(p,\,q)^\perp$ respectively and consider a member of $GL(p+q,\,\R)$ of the form

\begin{equation}\label{SO13Example_13}\mu_2\left((x_k)_{k=1}^{d},\,(y_k)_{k=1}^{d^\perp}\right) = \exp\left(\sum\limits_{k=1}^{d}\,x_k\, \hat{X}_k\right)\,\exp\left(\sum\limits_{k=1}^{d^\perp}\,y_k\, \hat{Y}_k\right)\end{equation}

within a neighbourhood of the identity small enough that geodesic co-ordinates label this neighbourhood uniquely. So we have the alternative co-ordinates $z_k$ and $x_k,\,y_k$ for the same element. The Jacobian of the transformation between the $z_k$ and the $x_k,\,y_k$, in either direction between the co-ordinates defined by $\eqref{SO13Example_12}$ and $\eqref{SO13Example_13}$, is invertible at the identity (where $z_k=x_k=y_k=0$); indeed the Jacobian there is the $(p+q)^2\times (p+q)^2$ identity matrix. Therefore, by the inverse function theorem, there is some open ball $\mathcal{B}(\Or,\,\epsilon)\subset \R^{(p+q)^2}$ wherein the co-ordinate transformation $\mu_1^{-1}\circ\mu_2$ is bijective, so that every member of $GL(p+q,\,\R)$ within the neighbourhood $\exp(\mathcal{B}(\Or,\,\epsilon))\subset GL(p+q,\,\R)$ is uniquely representable in the co-ordinates of $\eqref{SO13Example_13}$, i.e. is uniquely representable as $\gamma = e^Y\,e^X$ where $X\in \mathfrak{so}(p,\,q)$ and $Y\in \mathfrak{so}(p,\,q)^\perp$ are uniquely defined by and uniquely define $\gamma$. So suppose $\gamma\in O(p,\,q)$, so that $\gamma\,\eta\,\gamma^T = e^Y\,e^X\,\eta\,e^{X^T}\,e^{Y^T} = \eta$. But, since $X\in \mathfrak{so}(p,\,q)$, from $\eqref{SO13Example_9}$ we have $e^X\,\eta\,e^{X^T} = \eta$, so that $e^{\tau\,Y}\,\eta\,e^{\tau\,Y^T} = \eta$ must hold at $\tau=1$, even though $Y\in\mathfrak{so}(p,\,q)^\perp$.

Now we think about the solutions of the equation $e^{\tau\,Y}\,\eta\,e^{\tau\,Y^T} = \eta$; we know that there is our assumed solution at $\tau=1$. For the moment, assume there is no $\tau_{min}(Y)>0$ such that $e^{\tau\,Y}\,\eta\,e^{\tau\,Y^T} \neq \eta$ for $0<|\tau|< \tau_{min}(Y)$; that is, assume that no interval around $\tau=0$ is free of nonzero solutions to $e^{\tau\,Y}\,\eta\,e^{\tau\,Y^T} = \eta$. If so, then for any $\epsilon_1>0$ we can find a $\delta_1$ with $0<\delta_1<\epsilon_1$ such that $e^{\delta_1\,Y}\,\eta\,e^{\delta_1\,Y^T} = \eta$. But then $e^{n\,\delta_1\,Y}\,\eta\,e^{n\,\delta_1\,Y^T} = \eta;\,\forall\,n\in\mathbb{N}$, so that $e^{\tau_\omega\,Y}\,\eta\,e^{\tau_\omega\,Y^T} = \eta;\;\forall\,\tau_\omega\in\Omega\subset[-1,\,1]$, where $\Omega$ is a dense subset of the interval $[-1,\,1]$. But the matrix function $e^{\tau\,Y}\,\eta\,e^{\tau\,Y^T}$ is a continuous (indeed $C^\omega$) function of $\tau$, therefore we must have $e^{\tau\,Y}\,\eta\,e^{\tau\,Y^T}=\eta;\;\forall\,\tau\in[-1,\,1]$. Differentiating this identity at $\tau=0$ gives $Y\,\eta+\eta\,Y^T=0$, so from $\eqref{SO13Example_9}$ we see that $Y\in \mathfrak{so}(p,\,q)$, whence $Y\in\mathfrak{so}(p,\,q)\cap \mathfrak{so}(p,\,q)^\perp = \{\Or\}$, i.e. $Y=\Or$. Therefore, if $Y\neq\Or$, there must be a $\tau_{min}(Y)>0$ such that $e^{\tau\,Y}\,\eta\,e^{\tau\,Y^T} \neq \eta$ for $0<|\tau|< \tau_{min}(Y)$.

Therefore, because the unit sphere of the finite dimensional space $\mathfrak{so}(p,\,q)^\perp$ is compact, there is a positive, nonzero $T = \inf\limits_{Y\in \mathfrak{so}(p,\,q)^\perp;\,\|Y\|=1} \tau_{min}(Y)$; and since $\tau_{min}(s\,Y) = \tau_{min}(Y)/s$ for $s>0$, the same bound serves for all nonzero $Y$ with $\|Y\|\leq1$. In other words, there is an open ball neighbourhood $\mathcal{B}(\Or,\,\epsilon_2)\subset \R^{(p+q)^2}$ such that the only solutions to $e^Z\,\eta\,e^{Z^T}=\eta$ with $Z\in \mathcal{B}(\Or,\,\epsilon_2)$ are those where $Z\in \mathfrak{so}(p,\,q)$. Otherwise put: near enough to the identity, the only members of $O(p,\,q)$ are of the form $e^X$ where $X\in \mathfrak{so}(p,\,q)$. Therefore, if we put $\Nid = \exp(\mathcal{B}(\Or,\,\epsilon_2)\cap \mathfrak{so}(p,\,q))$, then $\bigcup\limits_{k=1}^\infty \Nid^k$ is a connected Lie group with the matrix logarithm $\log:\Nid\to \mathcal{B}(\Or,\,\epsilon_2)\cap \mathfrak{so}(p,\,q)$ as the labeller function. This connected Lie group is called $SO^+(p,\,q)$. For $p=1,\,q=3$ we have the group $SO^+(1,\,3)$, which is called the identity component of the full Lorentz group or the proper, orthochronous Lorentz group.
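The tangent condition and its exponentiation can be checked numerically. The following is a small sketch (assuming the NumPy and SciPy libraries): any $X=\eta\,S$ with $S$ skew-symmetric satisfies $X\,\eta+\eta\,X^T=\Or$, i.e. lies in $\mathfrak{so}(1,\,3)$, and its exponential conserves $\eta$.

```python
import numpy as np
from scipy.linalg import expm

# Metric for SO^+(1, 3): eta = diag(1, -1, -1, -1)
eta = np.diag([1.0, -1.0, -1.0, -1.0])

# Any X of the form eta @ S with S skew-symmetric satisfies
# X eta + eta X^T = 0, i.e. X is a member of so(1, 3).
rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4))
S = A - A.T                 # skew-symmetric
X = eta @ S                 # member of so(1, 3); note it is traceless

# The tangent condition from the text
assert np.allclose(X @ eta + eta @ X.T, 0)

# Its exponential conserves the quadratic form: e^X eta e^{X^T} = eta
gamma = expm(X)
print(np.allclose(gamma @ eta @ gamma.T, eta))  # True
```

The identity $e^X\,\eta\,e^{X^T}=\eta$ follows because $\eta\,e^{X^T}=e^{-X}\,\eta$ whenever $X\,\eta=-\eta\,X^T$.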

However, this connected Lie group is not the whole of $O(p,\,q)$. For example, take heed that the matrices defined by $\eqref{SO13Example_9}$ have all noughts along the leading diagonal; in particular they are traceless, so their exponentials have unit determinant, by the formula $\det e^X = e^{\mathrm{tr}\,X}$ that we met in Example 1.5. The whole of the connected Lie group $SO^+(p,\,q)$ thus comprises finite products of unimodular matrices, therefore $SO^+(p,\,q)$ is made up of only unimodular matrices. Whereas, for example, $\eta$ itself, fulfilling $\eta\,\eta\,\eta^T = \eta$, conserves the quadratic form of $\eqref{SO13Example_6}$ but is not in general unimodular; in particular it has determinant $-1$ in $O(1,\,3)$. Furthermore, $SO^+(p,\,q)$ has the following property. Given that $SO^+(1,\,3)$ conserves $t^2-x^2-y^2-z^2$, suppose we choose a $4\times1$ column vector $v$ with $t>0$ and $t^2-x^2-y^2-z^2>0$, so that we can write $t^2=x^2+y^2+z^2+\alpha^2$ for some constant $\alpha>0$; since the quadratic form is conserved, the components of $\gamma\,v$ fulfil the same relationship with the same $\alpha$ for any $\gamma\in SO^+(1,\,3)$. Such a $\gamma$ is a finite product $\prod\limits_{k=1}^R\,e^{X_k}$ for $X_k\in \mathfrak{so}(1,\,3)$. So now consider the vector $\prod\limits_{k=1}^R\,e^{\tau\,X_k}\,v$; at $\tau=0$ it is simply the vector $v$, and as $\tau$ varies its first component $t$ is a continuous (indeed $C^\omega$) function of $\tau$. At $\tau=0$ we have $t=+\sqrt{x^2+y^2+z^2+\alpha^2}$, and at any $\tau$ the value of $t$ cannot lie in the interval $-\alpha < t < +\alpha$. In other words, $t$ cannot vary continuously and become negative. Therefore, the first ($t$) component of $\prod\limits_{k=1}^R\,e^{\tau\,X_k}\,v$ must remain positive and indeed its lower bound is $+\alpha$. So members of $SO^+(1,\,3)$ cannot change the sign of $t$ when applied to a vector with $t^2-x^2-y^2-z^2>0$ (such a vector is called a timelike vector in relativity).
Such members are said to be orthochronous since in special relativity they cannot switch the sign of the time component of a four-vector.

However, in the general group $O(1,\,3)$, the member $\mathrm{diag}[-1,\,1,\,1,\,1] \in O(1,\,3)$ most certainly does change the sign of the time component in this way. The member $\mathrm{diag}[-1,\,-1,\,-1,\,-1] \in O(1,\,3)$ has unit determinant but is not orthochronous either, so the orthochronous property certainly does not follow from unimodularity. We state without proof that $SO^+(1,\,3)$, the identity connected component of $O(1,\,3)$, is precisely the subgroup of $O(1,\,3)$ whose matrices are (i) unimodular and (ii) orthochronous. Indeed $SO^+(1,\,3)$ is a normal subgroup of $O(1,\,3)$ with quotient $O(1,\,3)/SO^+(1,\,3)\cong \mathbb{V}_4$, where $\mathbb{V}_4$ is the Klein four-group and the cosets of $SO^+(1,\,3)$ are defined by the following matrix realisation of $\mathbb{V}_4$:

\begin{equation}\label{SO13Example_14}\begin{array}{l}O(1,\,3)/SO^+(1,\,3) = \{\id,\,P,\,T,\,P\,T=-\id\}\cong\mathbb{V}_4\\P=\mathrm{diag}[1,\,-1,\,-1,\,-1];\,T=\mathrm{diag}[-1,\,1,\,1,\,1]\end{array}\end{equation}
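A quick computation (a sketch assuming NumPy) confirms that the four coset representatives in $\eqref{SO13Example_14}$ all conserve $\eta$, all square to the identity, and close under multiplication, i.e. they realise $\mathbb{V}_4$.

```python
import numpy as np

I4 = np.eye(4)
P = np.diag([1.0, -1.0, -1.0, -1.0])  # parity
T = np.diag([-1.0, 1.0, 1.0, 1.0])    # time reversal
eta = P                               # for O(1, 3) the metric coincides with P

reps = [I4, P, T, P @ T]              # coset representatives; P T = -I

# Each representative is a member of O(1, 3) ...
assert all(np.allclose(g @ eta @ g.T, eta) for g in reps)
# ... and each squares to the identity, as in the Klein four-group.
assert all(np.allclose(g @ g, I4) for g in reps)

# The set is closed under multiplication: every product of two
# representatives is again one of the four (all are diagonal +/-1).
products = {tuple((g @ h).diagonal().astype(int)) for g in reps for h in reps}
print(len(products))  # 4
```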

The ideas in Example 1.10, which define families of groups conserving generalised “norms” and “inner products” defined by quadratic forms $Q(u,\,v) = u^\dagger\,\eta\,v$, can readily be generalised in several ways, as follows:

  1. The conservation condition leads to the matrix equation $\gamma^\dagger\,\eta\,\gamma = \eta$, whence a condition $X^\dagger \,\eta = -\eta\,X$ on the tangents $X$ to $C^1$ paths through the matrix group at the identity.
  2. The latter condition $X^\dagger \,\eta = -\eta\,X$ is a linear condition, and those tangents fulfilling it form a real linear space $\g = \{X|\,X=\sum\limits_{k=1}^N\,x_k\,\hat{X}_k\}$ whence, arguing exactly as in Example 1.10, one proves that the group of matrices fulfilling $\gamma^\dagger\,\eta\,\gamma = \eta$ contains a subgroup which is a connected Lie group $\G$ with $\Nid = \exp\left(\{X|\,X=\sum\limits_{k=1}^N\,x_k\,\hat{X}_k;\,\left\|X\right\|<\epsilon\}\right)$ for some $\epsilon > 0$, $\mathcal{V} = \{X|\,X=\sum\limits_{k=1}^N\,x_k\,\hat{X}_k;\,\left\|X\right\|<\epsilon\}$ and $\lambda = \log$.
  3. As we have seen, this connected Lie group $\G$ is not always the whole group of matrices fulfilling $\gamma^\dagger\,\eta\,\gamma = \eta$ but a normal subgroup.
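For a concrete instance of this generalised scheme, here is a sketch (assuming NumPy and SciPy) for the Hermitian metric $\eta=\operatorname{diag}[1,\,1,\,-1]$ defining the group $U(2,\,1)$: any $X=\eta\,S$ with $S$ skew-Hermitian fulfils the tangent condition $X^\dagger\,\eta=-\eta\,X$, and its exponential fulfils $\gamma^\dagger\,\eta\,\gamma=\eta$.

```python
import numpy as np
from scipy.linalg import expm

# Hermitian metric of signature (2, 1), defining the group U(2, 1)
eta = np.diag([1.0, 1.0, -1.0])

rng = np.random.default_rng(2)
A = rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3))
S = A - A.conj().T            # skew-Hermitian: S^dagger = -S
X = eta @ S                   # satisfies X^dagger eta = -eta X

# The linear tangent condition from item 1
assert np.allclose(X.conj().T @ eta, -eta @ X)

# Its exponential conserves the form: gamma^dagger eta gamma = eta
gamma = expm(X)
print(np.allclose(gamma.conj().T @ eta @ gamma, eta))  # True
```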

The idea of a signature in the case of real matrix groups is also general for groups conserving quadratic forms. We can always write a quadratic form as $Q(u,\,v) = \sum\limits_{j,\,k}\eta_{j,\,k}\, u_j^*\,v_k$ and we pair up “off-diagonal” terms as $\eta_{\ell,\,\nu}\, u_\ell^*\,v_\nu + \eta_{\nu,\,\ell}\, u_\nu^*\,v_\ell$. We can then see that in the real case, where there is no complex conjugate involved, the value $Q(u,\,u)$ of the form on any single vector is unchanged if we replace the pair of elements $\eta_{\ell,\,\nu}$ and $\eta_{\nu,\,\ell}$ by their average, i.e. we set them both to $(\eta_{\ell,\,\nu}+\eta_{\nu,\,\ell})/2$; and, by polarisation, conservation of $Q(u,\,u)$ for all $u$ is equivalent to conservation of the symmetrised form. Therefore, we can always choose the real matrix $\eta$ to be a symmetric matrix, i.e. $\eta=\eta^T$. Recall that such a matrix can always be diagonalised by a real orthogonal matrix and that its eigenvalues are all real. Therefore we can write:
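The symmetrisation step is easy to verify numerically (a sketch assuming NumPy): the value of the form on a single real vector is untouched when $\eta$ is replaced by $(\eta+\eta^T)/2$.

```python
import numpy as np

rng = np.random.default_rng(3)
M = 5
eta = rng.standard_normal((M, M))     # a general, non-symmetric real eta
eta_sym = (eta + eta.T) / 2           # the symmetrised replacement

u = rng.standard_normal(M)
# u_l u_nu multiplies eta_{l,nu} + eta_{nu,l} either way, so the
# value of the quadratic form on u is unchanged by symmetrisation.
print(np.isclose(u @ eta @ u, u @ eta_sym @ u))  # True
```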

\begin{equation}\label{SymmetricQuadraticForm}\begin{array}{lcl}Q(u,\,v) &=& u^T\,\eta\,v=u^T\,T\,\operatorname{diag}[\sqrt{|\lambda_1|},\,\sqrt{|\lambda_2|},\,\cdots,\,\sqrt{|\lambda_M|}]\; \times\\&&\quad\quad\operatorname{diag}[\operatorname{sgn} \lambda_1,\,\operatorname{sgn} \lambda_2,\,\cdots,\,\operatorname{sgn} \lambda_M]\,\operatorname{diag}[\sqrt{|\lambda_1|},\,\sqrt{|\lambda_2|},\,\cdots,\,\sqrt{|\lambda_M|}]\,T^T\,v\end{array}\end{equation}

where $T$ is the real orthogonal matrix diagonalising $\eta$ as $\eta = T\,\Lambda\,T^{-1}=T\,\Lambda\,T^T$ and $\Lambda = \operatorname{diag}[\lambda_1,\,\lambda_2,\,\cdots,\,\lambda_M]$ is the diagonal matrix of eigenvalues $\lambda_j$. Firstly, if all the eigenvalues $\lambda_j$ are nonzero, the transformation $\R^M\to\R^M;\;v\mapsto \operatorname{diag}[\sqrt{|\lambda_1|},\,\sqrt{|\lambda_2|},\,\cdots\,\sqrt{|\lambda_M|}]\,T^T\,v$ is bijective and thus simply invertibly changes the basis of the real vector space $\R^M$. Therefore, an $M\times M$ real matrix conserves the quadratic form $Q(u,\,v)$ for all $u,\,v\in\R^M$ if and only if it conserves the quadratic form $\tilde{Q}(u,\,v) = u^T\,\operatorname{diag}[\operatorname{sgn} \lambda_1,\,\operatorname{sgn} \lambda_2,\,\cdots,\,\operatorname{sgn} \lambda_M]\,v$ (for every $u,\,v\in\R^M$ there are unique $\tilde{u} = \operatorname{diag}[\sqrt{|\lambda_1|},\,\sqrt{|\lambda_2|},\,\cdots\,\sqrt{|\lambda_M|}]\,T^T\,u\in\R^M$ and $\tilde{v} = \operatorname{diag}[\sqrt{|\lambda_1|},\,\sqrt{|\lambda_2|},\,\cdots\,\sqrt{|\lambda_M|}]\,T^T\,v\in\R^M$ and contrariwise), therefore the group of real matrices defined as conserving the quadratic form $Q(u,\,v)$ is precisely a group of the form $O(p,\,q)$ for some $p,\,q\geq0$ with $p+q=M$.
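The reduction to a pure-signature form can be checked numerically (a sketch assuming NumPy); for a generic symmetric $\eta$ the eigenvalues are all nonzero, and the change of basis above turns $Q$ into $\tilde{Q}$.

```python
import numpy as np

rng = np.random.default_rng(4)
M = 4
B = rng.standard_normal((M, M))
eta = B + B.T                       # a symmetric real metric; generically nonsingular

# Diagonalise: eta = T Lambda T^T with T real orthogonal
lam, T = np.linalg.eigh(eta)
assert np.allclose(T @ np.diag(lam) @ T.T, eta)

# Change of basis u -> diag(sqrt|lam|) T^T u reduces Q to a pure signature form
R = np.diag(np.sqrt(np.abs(lam))) @ T.T
signs = np.diag(np.sign(lam))

u = rng.standard_normal(M)
v = rng.standard_normal(M)
print(np.isclose(u @ eta @ v, (R @ u) @ signs @ (R @ v)))  # True
```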

If, on the other hand, some of the eigenvalues $\lambda_j$, say $m$ of them, are nought, then we simply use the appropriate co-ordinate transformation and decompose $\R^M$ as $\R^M= \ker \eta \oplus (\ker\eta)^\perp$, i.e. decompose $\R^M$ as a direct sum of the subspace $\ker \eta$ spanned by the null eigenvectors of $\eta$ and its orthogonal complement. We also invertibly transform the basis by $\tilde{u} = \operatorname{diag}[1,\,1,\,\cdots,\,1,\,\sqrt{|\lambda_1|},\,\sqrt{|\lambda_2|},\,\cdots\,\sqrt{|\lambda_{M-m}|}]\,T^T\,u\in\R^M$, where we replace the $m$ zero eigenvalues by $1$. It is then not hard to show (do this) that the invertible transformations conserving the quadratic form are precisely those of the form:

\begin{equation}\label{SingularCase}\gamma=\left(\begin{array}{c|c}\gamma_m & \beta\\ \hline 0 & \gamma_{M-m}\end{array}\right)\end{equation}

where $\gamma_m\in GL(m,\,\R)$, $\beta$ is an arbitrary real $m\times(M-m)$ matrix and $\gamma_{M-m} \in O(p,\,q);\;p+q = M-m$; the block $\beta$ is unconstrained because components lying in $\ker\eta$ do not contribute to $Q$. So we reduce the situation to a combination of the nonsingular case, through $\gamma_{M-m} \in O(p,\,q)$, together with a general, invertible $m\times m$ matrix $\gamma_m$. So the singular case also leads back to the nonsingular case.
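The block structure is easy to test numerically (a sketch assuming NumPy). Note that an arbitrary upper-right coupling block does not disturb conservation of the degenerate form, since the metric annihilates the kernel directions.

```python
import numpy as np

# Degenerate metric: m = 2 null directions, then signature (1, 1)
eta = np.diag([0.0, 0.0, 1.0, -1.0])

rng = np.random.default_rng(5)
gamma_m = rng.standard_normal((2, 2))   # a (generically invertible) 2x2 block
beta = rng.standard_normal((2, 2))      # arbitrary coupling into the kernel

# A simple O(1, 1) member: a hyperbolic rotation conserving diag(1, -1)
c, s = np.cosh(0.7), np.sinh(0.7)
gamma_pq = np.array([[c, s], [s, c]])
assert np.allclose(gamma_pq.T @ np.diag([1.0, -1.0]) @ gamma_pq,
                   np.diag([1.0, -1.0]))

# The block-triangular gamma conserves the degenerate quadratic form
gamma = np.block([[gamma_m, beta], [np.zeros((2, 2)), gamma_pq]])
print(np.allclose(gamma.T @ eta @ gamma, eta))  # True
```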

In the complex matrix case we also have the concept of signature if the quadratic form has conjugate symmetry, i.e. $Q(u,\,v) = (Q(v,\,u))^*$. In this case, $(Q(u,\,v))^* = (u^\dagger\, \eta\,v)^\dagger = v^\dagger\,\eta^\dagger\,u$, so that $(Q(u,\,v))^* =Q(v,\,u);\,\forall\,u,\,v\in\mathbb{C}^M$ if and only if $\eta=\eta^\dagger$; then $\eta$ is a Hermitian matrix, which can always be diagonalised by a unitary matrix $T$, i.e. $T^{-1} = T^\dagger$ and $\eta = T\,\Lambda\,T^{-1} = T\,\Lambda\,T^\dagger$, where $\Lambda$ is the diagonal matrix of eigenvalues, which are still all real.
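A numerical sketch of the Hermitian case (assuming NumPy): for $\eta=\eta^\dagger$ the form has conjugate symmetry, and $\eta$ is diagonalised by a unitary $T$ with real eigenvalues.

```python
import numpy as np

rng = np.random.default_rng(6)
M = 3
A = rng.standard_normal((M, M)) + 1j * rng.standard_normal((M, M))
eta = A + A.conj().T                 # a Hermitian metric: eta = eta^dagger

u = rng.standard_normal(M) + 1j * rng.standard_normal(M)
v = rng.standard_normal(M) + 1j * rng.standard_normal(M)

# Conjugate symmetry Q(u, v) = Q(v, u)^* holds for Hermitian eta
Q = lambda a, b: a.conj() @ eta @ b
assert np.isclose(Q(u, v), np.conj(Q(v, u)))

# eta is diagonalised by a unitary T, and the eigenvalues are real
lam, T = np.linalg.eigh(eta)
assert np.isrealobj(lam)                          # eigenvalues all real
assert np.allclose(T.conj().T @ T, np.eye(M))     # T^{-1} = T^dagger
print(np.allclose(T @ np.diag(lam) @ T.conj().T, eta))  # True
```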

However, in the case of a general quadratic form with a general complex $M\times M$ matrix $\eta$, no further simplification can be made.

Let us take stock of the wealth of connected Lie group examples which we shall constantly refer to in our discussion of general connected Lie groups:

  1. One dimensional connected Lie groups: the real numbers with addition or positive reals with multiplication are noncompact examples, whilst the circle group $U(1)\cong SO(2)$ is compact;
  2. Abelian connected Lie groups, which are either compact torus groups $\mathbb{T}^N$, noncompact real vector spaces $\R^N$ (a category which includes the complex vector spaces, through the Freudenthal-de Vries concept of waiving) or, lastly, direct products of a compact torus and a noncompact $\R^N$;
  3. The Special Orthogonal Group $SO(3)$ and its $M$-dimensional generalisation $SO(M)$. The general orthogonal groups $O(M)$ are not connected Lie groups; they instead include $SO(M)$ as a normal subgroup;
  4. The Special Unitary Group $SU(2)$ and its $M$-dimensional generalisation $SU(M)$. The general unitary groups $U(M)$, i.e. those which conserve the squared norm $\left\|u\right\|^2 = u^\dagger\,u$ and thus the inner product $\left<u,\,v\right> = u^\dagger\,v$ for all $u,\,v\in\mathbb{C}^M$ but whose members are not necessarily unimodular, are also connected Lie groups. This last fact follows from $\gamma^\dagger\,\gamma = \gamma\,\gamma^\dagger=\id$ for $\gamma\in U(M)$: $\gamma$ is then a normal matrix, thus always diagonalisable by a unitary matrix, and thus can always be written as $\gamma = e^X$ where $X$ is skew-Hermitian;
  5. The identity component $GL^+(M,\,\R)$ of the general linear real group, the general linear complex group $GL(M,\,\mathbb{C})$ and their unimodular versions $SL(M,\,\R)$ and $SL(M,\,\mathbb{C})$ are all connected Lie groups;
  6. The projective special linear real group $PSL(2,\,\R)$, the group of all orientation-preserving isometries of the Lobachevskian hyperbolic plane, i.e. isomorphic to the group of all Möbius transformations of the form $f: \mathbb{C}\to\mathbb{C};\,f(z) = \frac{a\,z+b}{c\,z+d};\,a\,d-b\,c = 1;\,a,\,b,\,c,\,d\in\R$, is a connected Lie group;
  7. The projective special linear complex group $PSL(2,\,\mathbb{C})$ is isomorphic to the group of all bijective Möbius transformations and is a connected Lie group;
  8. The group of proper, orthochronous Lorentz transformations $SO^+(1,\,3)$, being the identity connected component of the general Lorentz group $O(1,\,3)$ is a connected Lie group, as is its generalisation $SO^+(p,\,q)$ of generalised norm preserving real matrices.
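As a final numerical illustration of item 4 in this list (a sketch assuming NumPy and SciPy): the exponential of a skew-Hermitian matrix is unitary, and the formula $\det e^X = e^{\operatorname{tr} X}$ links unimodularity to tracelessness of the tangent.

```python
import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(7)
M = 3
A = rng.standard_normal((M, M)) + 1j * rng.standard_normal((M, M))
X = A - A.conj().T                  # skew-Hermitian: X^dagger = -X

gamma = expm(X)
# e^X is unitary ...
assert np.allclose(gamma.conj().T @ gamma, np.eye(M))
# ... and det e^X = e^{tr X}, so gamma is unimodular iff X is traceless
print(np.isclose(np.linalg.det(gamma), np.exp(np.trace(X))))  # True
```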


  1. Wulf Rossmann, “Lie Groups: An Introduction through Linear Groups (Oxford Graduate Texts in Mathematics)”
  2. K. Engø, “On the BCH-formula in so(3)”, BIT Numerical Mathematics, 41, number 3, 2001, pp. 629-632
  3. Roger Penrose, “The Road to Reality: A Complete Guide to the Laws of the Universe”, Random House, London, 2004
  4. Chris Doran and Anthony Lasenby, “Geometric Algebra for Physicists”, Cambridge University Press, Cambridge, 2003
  5. William Rowan Hamilton, William Edwin Hamilton (editor), “Elements of Quaternions”, Longmans, Green & Co, London, 1866
  6. Hans Freudenthal and Hendrik de Vries, “Linear Lie Groups”, Academic Press, New York, 1969, Definition 1.8
  7. Hans Freudenthal, “Die Topologie der Lieschen Gruppen als algebraisches Phänomen. I”, Annals of Mathematics, Second Series, 42 #5, 1941, pp. 1051-1074
  8. John Stillwell, “Geometry of Surfaces”, Springer Verlag, New York, 1992, §4.4
  9. Henri Poincaré, “Théorie des groupes Fuchsiens”, Acta Mathematica, 1, 1882, pp. 1-62