Chapter 4: The Adjoint Representation

This short post introduces the “big A” Adjoint Representation of a connected Lie group $\G$, which is a homomorphism $\Ad:\G\to GL(\g)$ from $\G$ into the group $GL(\g) = \mathrm{Aut}(\g)$ of invertible homogeneous linear transformations of the Lie algebra $\g$. To define this homomorphism, we study the action of a general $\gamma \in\G$ on a $C^1$ path through the identity. We take a path $\sigma_Y(\tau)$ through the identity with tangent $Y$ there and study the tangent of the new path defined by $\gamma\,\sigma_Y(\tau)\,\gamma^{-1}$, which is also a $C^1$ path through the identity. In general the new path has a different tangent from $Y$. We introduce the Adjoint representation through a theorem, a figure and two important examples.

Theorem 4.1 (Adjoint Shuffle of Paths):

Given a $C^1$ path through the identity $\sigma_Y:[-a,\,a]\to\Nid$ where $\sigma_Y(0)=\id$ and $Y\in\g$ is its tangent there, then the path:

\begin{equation}
\label{AdjointShuffleTheorem_1}
\phi_\gamma: [-b,\,b]\to\Nid;\,\phi_\gamma(\tau) = \gamma\,\sigma_Y(\tau)\,\gamma^{-1}
\end{equation}

where $\gamma\in\G$ is any member of the connected Lie group, is a $C^1$ path through the identity for some $b>0$, and its tangent there is defined by a linear, bijective (i.e. invertible) map $\Ad(\gamma):\g\to\g$ applied to the tangent $Y$ of $\sigma_Y$ at the identity. Thus $\Ad(\gamma):\g\to\g$ is an automorphism of $\g$. The mapping $\Ad: \G\to\mathrm{Aut}(\g)\subseteq GL(N,\,\mathbb{R})$ defined by $\Ad(\gamma) = \Ad_\gamma$ is a homomorphism from the Lie group $\G$ onto the group of inner automorphisms $\mathrm{Inn}(\g)\subseteq\mathrm{Aut}(\g)$ of $\g$.

Proof:

By the Homogeneity Axiom 5, we can write $\gamma =\prod\limits_{j=1}^M \gamma_j$ for some finite $M$, where $\gamma_j \in \Nid,\,j=1,\,\cdots,\,M$. Now let $\sigma:[-\tau_0,\,\tau_0]\to \Nid$ be a $C^1$ path through $\id$ where $\tau_0>0$ and $\sigma(0) = \id$. Then $\sigma_M(\tau) = \gamma_M \,\sigma(\tau)\,\gamma_M^{-1}$, defined for $|\tau|<\tau_M $ for some $\tau_M > 0$, is a $C^1$ path through $\id$ (by the Group Product Continuity Axiom 3 and the Nontrivial Continuity Axiom 4), as inductively are $\sigma_{j-1}(\tau) = \gamma_{j-1}\,\sigma_j (\tau)\,\gamma_{j-1}^{-1},\,j=M,\,M-1,\,\cdots,\, 2$, for $|\tau|<\tau_{j-1} $ for some $\tau_{j-1} > 0$. Note, in particular, that $\sigma_1(\tau) = \gamma_1\,\sigma_2(\tau)\,\gamma_1^{-1}=\gamma \,\sigma(\tau)\,\gamma^{-1}$, so that, taking $\sigma=\sigma_Y$:

\begin{equation}
\label{AdjointShuffleTheorem_2}
\phi_\gamma: [-b,\,b]\to\Nid;\,\phi_\gamma(\tau) = \gamma \,\sigma_Y(\tau)\,\gamma^{-1}
\end{equation}

is indeed a $C^1$ path through the identity for some $b>0$, thus it defines a mapping:

\begin{equation}
\label{AdjointShuffleTheorem_3}
\Ad(\gamma):\g\to\g; \Ad(\gamma)\,Y = \left.\mathrm{d}_\tau \lambda\left(\gamma\,\sigma_Y(\tau)\,\gamma^{-1}\right)\right|_{\tau=0}
\end{equation}

of the Lie algebra $\g$ into itself. This map is readily shown to be linear by writing $\gamma\, \sigma_X(\tau)\,\sigma_Y(\tau)\, \gamma^{-1} = \gamma\, \sigma_X(\tau)\,\gamma^{-1}\,\gamma\,\sigma_Y(\tau)\, \gamma^{-1}$ and then applying the method of Lemma 3.2 to the two transformed entities $\gamma\, \sigma_X(\tau)\,\gamma^{-1}$ and $\gamma\, \sigma_Y(\tau)\,\gamma^{-1}$.

Therefore the mapping is wholly defined by its matrix $\Ad(\gamma)$ acting on the Lie algebra. It is now readily shown that $\Ad(\gamma_1)\,\Ad(\gamma_2) = \Ad(\gamma_1\,\gamma_2)$, since the group product is associative (consider the mapping $Y \mapsto \left.\mathrm{d}_\tau \lambda\left(\gamma_1\gamma_2\,\sigma_Y(\tau)\,\gamma_2^{-1}\,\gamma_1^{-1}\right)\right|_{\tau=0} $). Therefore, in particular $\Ad(\gamma) \,\Ad(\gamma^{-1}) = \Ad(\id) = \id_N$, where $\id_N$ is the $N\times N$ identity matrix, so that $\Ad(\gamma)$ is invertible and $\Ad(\gamma)^{-1} = \Ad(\gamma^{-1})$. We have therefore shown that $\Ad(\gamma)$ is nonsingular for all $\gamma\in\G$ and so $\Ad$ defines a homomorphism between $\G$ and a group $\Ad(\G)=\mathrm{Inn}(\g)\subseteq \mathrm{Aut}(\g)\subseteq GL(N,\,\mathbb{R})$ of invertible $N\times N$ matrices: the inner automorphisms of $\g$.$\quad\square$

The matrix, or linear transformation, $\Ad(\gamma)$ is the differential of the Lie group automorphism $\zeta\mapsto\gamma\,\zeta\,\gamma^{-1}$ at $\zeta= \id$. If $\gamma\in\G$ commutes with every member $\zeta\in\G$, i.e. $\gamma\in\mathcal{Z}(\G)$, the centre of $\G$, then clearly the transformation of the $C^1$ path $\sigma_Y$ through the identity wrought by $\sigma_Y(\tau)\mapsto\gamma\,\sigma_Y(\tau)\,\gamma^{-1} = \sigma_Y(\tau)$ is trivial and therefore $\Ad(\gamma)=\id_N$. Hence, every central Lie group member also belongs to the kernel of the homomorphism $\Ad$, i.e. $\mathcal{Z}(\G)\subseteq\ker(\Ad)$. We shall see later that the homomorphism’s kernel is precisely the group’s centre.
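The homomorphism property and the inclusion $\mathcal{Z}(\G)\subseteq\ker(\Ad)$ can be checked numerically in a small case. The sketch below is an illustration only, resting on assumptions that this chapter does not state at this point: it takes $\G=SU(2)$, computes the adjoint action as the matrix conjugation $Y\mapsto\gamma\,Y\,\gamma^{-1}$, and assumes the Lie algebra basis $\hat{s}_j=-i\,\sigma_j$ (with $\sigma_j$ the Pauli matrices). The helper names `su2`, `coords` and `Ad` are hypothetical.

```python
# Pure-Python check, for SU(2), that Ad(g1 g2) = Ad(g1) Ad(g2) and Ad(-I) = I.
# Assumption: the Lie algebra basis is s_j = -i * (Pauli matrix j).

def mul(A, B):
    """Matrix product of nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def dagger(A):
    """Conjugate transpose; equals the inverse for a unitary matrix."""
    return [[A[j][i].conjugate() for j in range(len(A))]
            for i in range(len(A[0]))]

def su2(alpha, beta):
    """SU(2) member [[a, -conj(b)], [b, conj(a)]] with |a|^2 + |b|^2 = 1."""
    norm = (abs(alpha) ** 2 + abs(beta) ** 2) ** 0.5
    a, b = alpha / norm, beta / norm
    return [[a, -b.conjugate()], [b, a.conjugate()]]

SX = [[0, -1j], [-1j, 0]]
SY = [[0, -1], [1, 0]]
SZ = [[-1j, 0], [0, 1j]]
BASIS = [SX, SY, SZ]

def coords(Y):
    """Weights (x, y, z) of Y = x*SX + y*SY + z*SZ."""
    return [-Y[1][0].imag, Y[1][0].real, (1j * Y[0][0]).real]

def Ad(g):
    """3x3 matrix of Y -> g Y g^{-1}; column k is the image of BASIS[k]."""
    cols = [coords(mul(mul(g, S), dagger(g))) for S in BASIS]
    return [[cols[k][i] for k in range(3)] for i in range(3)]

def close(A, B, tol=1e-12):
    """Entrywise comparison of two matrices within a tolerance."""
    return all(abs(a - b) < tol for ra, rb in zip(A, B) for a, b in zip(ra, rb))

IDENTITY_3 = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
g1 = su2(1 + 2j, 3 - 1j)              # two arbitrary members of SU(2)
g2 = su2(0.5j, 2 + 1j)
minus_identity = [[-1, 0], [0, -1]]   # a central member of SU(2)

homomorphism_ok = close(Ad(mul(g1, g2)), mul(Ad(g1), Ad(g2)))
centre_in_kernel = close(Ad(minus_identity), IDENTITY_3)
print(homomorphism_ok, centre_in_kernel)
```

Both flags should come out true up to rounding. The check says nothing about whether the kernel is larger than the centre; that is the later result mentioned above.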

It is instructive to take heed that we can give a different proof of the nonsingularity of $\Ad(\gamma)$ that is wholly analogous to the proof of Theorem 3.19: we note that $\Ad(\id) = \id_N$ and that the determinant of $\Ad(\sigma(\tau))$ is a continuous function of $\tau$ for a $C^1$ path $\sigma$, therefore, we show analogously with the proof of Theorem 3.19 that $\Ad(\gamma)$ is nonsingular for all $\gamma\in\Nid$. Then, since any group member is a product of members of $\Nid$, it is readily shown that $\Ad(\gamma)$ is nonsingular for all $\gamma\in\G$.

One can also use the proof that $\Ad(\gamma)$ is nonsingular to give another proof of Theorem 3.19. Wholly analogously with the tangent map $\mathbf{M}(\gamma_1,\,\gamma_2)$ induced by left translation from $\gamma_1\in\Nid$ to $\gamma_2\in\Nid$, we can define the tangent map $\mathbf{\tilde{M}}(\gamma_1,\,\gamma_2)$ induced by right translation; e.g. the $C^1$ path through the identity $\sigma_X(\tau)$ is mapped to the $C^1$ path through $\gamma$ defined by $\sigma_X(\tau)\,\gamma$, and $\mathbf{\tilde{M}}(\id,\,\gamma)$ is the linear map induced on the tangent:

\begin{equation}
\label{TranslatedDerivativeEquation}
X=\left.\mathrm{d}_\tau \lambda(\sigma_X(\tau))\right|_{\tau=0} \mapsto \left.\mathrm{d}_\tau \lambda(\sigma_X(\tau)\,\gamma)\right|_{\tau=0}=\mathbf{\tilde{M}}(\id,\,\gamma) X
\end{equation}

for $\gamma\in\Nid$. One can then readily show from the definitions of the tangents that if the $C^1$ path $\sigma_X$ through the identity is first left-translated to the $C^1$ path through $\gamma$ by $\sigma_X(\tau)\mapsto\gamma\,\sigma_X(\tau)$ and then right-translated to the $C^1$ path $\gamma\,\sigma_X(\tau)\,\gamma^{-1}$ through the identity, then the map induced on the tangent $X$ is $X\mapsto \mathbf{\tilde{M}}(\gamma,\id)\,\mathbf{M}(\id,\,\gamma)\,X$ for any $\gamma\in\Nid$. But, as we talked about above, this is the same map as $X\mapsto \Ad(\gamma) X$, and this holds for any vector $X\in\g$. Therefore we have $\Ad(\gamma) = \mathbf{\tilde{M}}(\gamma,\id)\,\mathbf{M}(\id,\,\gamma),\,\forall\,\gamma\in\Nid$ and so neither of the $N\times N$ matrices $\mathbf{\tilde{M}}(\gamma,\id),\,\mathbf{M}(\id,\,\gamma)$ can be singular, because $\Ad(\gamma)$ is nonsingular.
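As a concrete illustration (an assumption made here for the sake of example, not part of the general argument), suppose $\G$ is a matrix Lie group and tangents are represented by matrices, i.e. paths are differentiated entry by entry. Then left translation $\zeta\mapsto\gamma\,\zeta$ and right translation $\zeta\mapsto\zeta\,\gamma^{-1}$ are linear in $\zeta$, so the two tangent maps are simply matrix multiplications, and their composition gives the conjugation formula:

\begin{equation}
\mathbf{M}(\id,\,\gamma)\,X = \gamma\,X;\quad \mathbf{\tilde{M}}(\gamma,\,\id)\,Z = Z\,\gamma^{-1};\quad \Ad(\gamma)\,X = \mathbf{\tilde{M}}(\gamma,\,\id)\,\mathbf{M}(\id,\,\gamma)\,X = \gamma\,X\,\gamma^{-1}
\end{equation}

In this setting the nonsingularity of both factors is plain, since each is undone by multiplying with $\gamma^{-1}$ or $\gamma$ on the appropriate side.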

The reason for my nickname “the Adjoint Shuffle” for this theorem will become clearer later. A sketch below (Figure 4.1) tries to give some intuition. One can think of the tangent at $\gamma$ defined by the $C^1$ path $\gamma\,\sigma^\prime(\tau)$ through $\gamma$ as living in the Lie algebra $\g$: the $C^1$ path through the identity $\sigma^\prime(\tau)$ is left translated to a path $\gamma\,\sigma^\prime(\tau)$ through $\gamma$ and then right translated back to a $C^1$ path $\sigma(\tau) = \gamma\,\sigma^\prime(\tau)\,\gamma^{-1}$ through the identity, i.e. $\sigma^\prime(\tau) = \gamma^{-1}\,\sigma(\tau)\,\gamma$, so that the tangent $X^\prime$ to $\sigma^\prime$ and the tangent $X$ to $\sigma$ are related by $X^\prime=\Ad(\gamma^{-1})\,X= \Ad(\gamma)^{-1} \,X;\,X = \Ad(\gamma) \,X^\prime$.

Figure 4.1: The “Adjoint Shuffle”

We end with a definition:

Definition 4.2 (Adjoint Representation of a connected Lie group):

Let $\G$ be a connected Lie group with Lie algebra $\g$. Then for each $\gamma\in\G$ we define an inner automorphism $\Gamma_\gamma:\G\to\G;\;\Gamma_\gamma(\zeta) = \gamma\,\zeta\,\gamma^{-1}$. We define $\Ad(\gamma)$ to be the differential of this automorphism at the identity element. That is, if $\sigma_X:\R\to\G$ is a $C^1$ path in $\G$ passing through the identity with tangent $X$ there, then the path $\sigma_{\gamma,\,X}:\R\to\G;\;\sigma_{\gamma,\,X}(\tau)=\gamma\,\sigma_X(\tau)\,\gamma^{-1}$ is a $C^1$ path through the identity and its tangent there is $\Ad(\gamma)\,X$. The map $\Ad:\G\to\operatorname{Inn}(\g)$, where $\operatorname{Inn}(\g)$ is the group of all inner automorphisms of $\g$, is called the adjoint representation. It is a homomorphism from the Lie group $\G$ onto $\operatorname{Inn}(\g)$.

The connected Lie group $\G$ can be said to “act on its own Lie algebra” through the Adjoint Representation $\Ad:\G\to\mathrm{Inn}(\g)$.

The wording “act on the Lie algebra” means that the group members are mapped to invertible square matrices that transform the Lie algebra. We shall come back to study the adjoint representation in detail later after we have studied exponential co-ordinates. There are many theorems to prove which can be readily tackled with exponential co-ordinates and which have a lovely intuitive feel to them, such as: the group $\mathrm{Inn}(\g)$ is itself a connected Lie group and a matrix Lie group, and its Lie algebra is the matrix Lie algebra $\ad(\g)\subseteq\mathrm{Der}(\g)$ of inner derivations of $\g$, which is the whole algebra $\mathrm{Der}(\g)$ of derivations on $\g$ when $\g$ is semisimple. The Lie group homomorphism $\Ad:\G\to\mathrm{Inn}(\g)$ induces and is induced by the Lie algebra homomorphism $\ad:\g\to\mathrm{Der}(\g)$.

Example 4.3: (Adjoint Representation of $SU(2)$)

We hearken back to $SU(2)$, the group in Example 1.4 of unimodular (determinant = 1), unitary $2\times2$ complex matrices. In readiness for later chapters, we make use of the existence of the universally convergent matrix Taylor series for $\exp$ and the Mercator-Newton series for its inverse, the matrix logarithm $\log$, convergent for any square matrix $U$ for which $\left\|U - \id\right\|<1$. The notions of $\exp$ and $\log$ will be broadened to general Lie groups in the next chapter.

So any member of $SU(2)$ can be written as $\gamma(\tau, X)=e^{\tau\,X}$, where $\tau\in\R$, $X\in\g$ and $\g=\mathfrak{su}(2)$ is the three dimensional vector space spanned by the matrices $\hat{s}_x,\,\hat{s}_y,\,\hat{s}_z$ of Example 1.4. So we now think of the image of $\gamma(\tau, X)$ under $\Ad$.

Representing the members of the Lie algebra $\mathfrak{su}(2)$ as $3\times1$ column vectors of superposition weights of the $\g$-basis, an element of $\Ad(SU(2))$ is a $3\times3$, nonsingular square matrix. Moreover, $\zeta: \R\times \g\to\Ad(SU(2));\,\zeta(\tau, \,X)=\Ad(e^{\tau\,X})$ is (i) a continuous $3\times3$ matrix function of $\tau$ and, by the homomorphism relationship spoken of in Theorem 4.1, we can see that (ii) $\zeta(\tau + \varsigma, \,X)=\zeta(\tau, X)\,\zeta(\varsigma,\,X)$, since the homomorphism $\Ad$ carries over to $\zeta$ the same functional relationship that $e^{\tau\,X}$ fulfills ($e^{(\tau+\varsigma)\,X} =e^{\tau\,X}\,e^{\varsigma\,X}$). The only matrix function of $\tau$ which is both continuous and fulfills this functional relationship is $\zeta(\tau,\,X) = \exp(\tau\, \ad(X))$, where the exponential function is wholly characterised by its tangent $\ad(X)$ at $\tau=0$, and this characterising tangent must be a function of $X$ alone. The function $\ad$ (take heed: the small, lower-case, “a” in $\ad$) will be discussed later: for now, we know that it is a function of the tangent $X\in\mathfrak{su}(2)$ alone and wholly defines the exponential $\zeta(\tau,\,X)$.
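A sketch of why the functional relationship pins down the exponential form, under the added assumption that $\zeta$ is differentiable in $\tau$ (a continuous solution is in fact automatically differentiable, a standard result not proved here). The symbol $A(X)$ is a placeholder introduced only for this sketch. Differentiating $\zeta(\tau+\varsigma,\,X)=\zeta(\tau,\,X)\,\zeta(\varsigma,\,X)$ with respect to $\varsigma$ at $\varsigma=0$, and using $\zeta(0,\,X)=\Ad(\id)=\id_3$, gives:

\begin{equation}
\mathrm{d}_\tau \zeta(\tau,\,X) = \zeta(\tau,\,X)\,A(X);\quad A(X) \stackrel{def}{=} \left.\mathrm{d}_\varsigma \zeta(\varsigma,\,X)\right|_{\varsigma=0};\quad \zeta(0,\,X)=\id_3
\end{equation}

This linear initial value problem has the unique solution $\zeta(\tau,\,X)=\exp(\tau\,A(X))$, and $A(X)$ is the tangent that the text names $\ad(X)$.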

In a matrix Lie group, the action of $\Ad(e^{\tau\,X})$ on the Lie algebra can be computed as the literal matrix product $Y\mapsto e^{\tau\,X}\,Y\,e^{-\tau\,X}$ for any $Y\in\g$ (when we think of $Y$ as the $2\times2$, traceless, skew-Hermitian matrix rather than as the $3\times1$ column vector of superposition weights of the $\g$-basis). This is the linear operator $Y\mapsto\exp(\tau\, \ad(X))\,Y$ when we think of $Y$ as the $3\times1$ column vector. Now, thinking again of $Y$ as the $2\times2$ matrix, we can compute the action $Y\mapsto\ad(X)\,Y$ of $\ad(X)$ on a $Y\in\mathfrak{su}(2)$ as $\left.\mathrm{d}_\tau \left(e^{\tau\,X}\,Y\,e^{-\tau\,X}\right)\right|_{\tau=0} = X\,Y-Y\,X$. So, thinking of $\Ad(e^{\tau\,X})=\exp(\tau\, \ad(X))$ as the $3\times3$ matrix, we compute the $3\times3$ matrix for $\ad(X)$ by computing the matrix for the linear operation $\mathfrak{su}(2)\to\mathfrak{su}(2);\,Y\mapsto X\,Y-Y\,X$ on the Lie algebra. We meet here the matrix Lie bracket:

\begin{equation}\label{SU2AdjointRepresentationExample_1}[X,\,Y]\stackrel{def}{=}X\,Y-Y\,X\end{equation}

for the first time. This is the structure, further to $\g$’s being a real linear space, that every Lie algebra has and that was spoken of straight after Lemma 3.2; to wit: the Lie algebra must be closed under the bilinear Lie bracket operation, which is the matrix commutator bracket in the case of a matrix Lie group and which can be generalised to the general Lie group setting. It is not hard to show that:

\begin{equation}\label{SU2AdjointRepresentationExample_2}\begin{array}{lcl}\left[\hat{s}_x,\,\hat{s}_y\right] &=& 2\,\hat{s}_z \\\left[\hat{s}_z,\,\hat{s}_x\right] &=& 2\,\hat{s}_y\\\left[\hat{s}_y,\,\hat{s}_z\right] &=& 2\,\hat{s}_x \end{array}\end{equation}

together with the relationships $[\hat{s}_x,\,\hat{s}_x]=[\hat{s}_y,\,\hat{s}_y]=[\hat{s}_z,\,\hat{s}_z]=0$ and the equivalent antisymmetry $[\hat{s}_j,\,\hat{s}_k]=-[\hat{s}_k,\,\hat{s}_j];\,j,\,k\in\{x,\,y,\,z\}$. So therefore:

\begin{equation}\label{SU2AdjointRepresentationExample_3}\begin{array}{lcl}\ad(\hat{s}_x) &=& \left(\begin{array}{ccc}0&0&0\\0&0&-2\\0&2&0\end{array}\right)\\\ad(\hat{s}_y) &=& \left(\begin{array}{ccc}0&0&2\\0&0&0\\-2&0&0\end{array}\right)\\\ad(\hat{s}_z) &=& \left(\begin{array}{ccc}0&-2&0\\2&0&0\\0&0&0\end{array}\right)\end{array}\end{equation}

are the linear operators that wholly define the transformation $Y\mapsto [X,\,Y]$ for any $X,\,Y\in\mathfrak{su}(2)$. Now, on comparing $\eqref{SU2AdjointRepresentationExample_3}$ with Example 1.3, we see that:

\begin{equation}\label{SU2AdjointRepresentationExample_4}\ad(\hat{s}_x) = 2\,\hat{S}_x;\;\ad(\hat{s}_y) = 2\,\hat{S}_y;\;\ad(\hat{s}_z) = 2\,\hat{S}_z\end{equation}

and so, if we absorb the factor of 2 into the superposition weights, we can see that $\Ad(SU(2))$ is the group:

\begin{equation}\label{SU2AdjointRepresentationExample_5}\begin{array}{lcl}\Ad(SU(2)) &=& \bigcup\limits_{k=1}^\infty\left\{\left.\exp\left(x\,\hat{S}_x+y\,\hat{S}_y+z\,\hat{S}_z\right)\right|\;x,\,y,\,z\in\R\right\}^k \\ &=& \left\{\left.\exp\left(x\,\hat{S}_x+y\,\hat{S}_y+z\,\hat{S}_z\right)\right|\;x,\,y,\,z\in\R\right\} \\&=& SO(3)\end{array}\end{equation}

that is, $\Ad(SU(2)) = SO(3)$, the $3\times3$ rotation group (group of unimodular, orthogonal $3\times3$ matrices).
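The steps of Example 4.3 can be checked numerically. The sketch below is an illustration under a stated assumption: Example 1.4 is not reproduced here, so the basis is taken to be $\hat{s}_j=-i\,\sigma_j$ (with $\sigma_j$ the Pauli matrices), chosen because it reproduces the bracket relations quoted above. It rebuilds the three $\ad(\hat{s}_j)$ matrices from commutators, then compares $\Ad(e^{X})$, computed by conjugation, with $\exp(\ad(X))$ and tests that the result is orthogonal. All function names are hypothetical helpers, and `expm` is a plain truncated Taylor series rather than a library routine.

```python
# Pure-Python check of Example 4.3, assuming s_j = -i * (Pauli matrix j).

def mul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def lincomb(weights, mats):
    """Weighted sum of equally sized matrices."""
    rows, cols = len(mats[0]), len(mats[0][0])
    return [[sum(w * M[i][j] for w, M in zip(weights, mats))
             for j in range(cols)] for i in range(rows)]

def transpose(A):
    return [list(col) for col in zip(*A)]

def dagger(A):
    """Conjugate transpose; the inverse of a unitary matrix."""
    return [[z.conjugate() for z in row] for row in transpose(A)]

def expm(A, terms=40):
    """Truncated Taylor series for exp(A); adequate for the small norms here."""
    n = len(A)
    total = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in total]
    for k in range(1, terms):
        term = [[x / k for x in row] for row in mul(term, A)]
        total = lincomb([1, 1], [total, term])
    return total

SX = [[0, -1j], [-1j, 0]]
SY = [[0, -1], [1, 0]]
SZ = [[-1j, 0], [0, 1j]]
BASIS = [SX, SY, SZ]

def coords(Y):
    """Weights (x, y, z) of Y = x*SX + y*SY + z*SZ."""
    return [-Y[1][0].imag, Y[1][0].real, (1j * Y[0][0]).real]

def as_matrix(op):
    """3x3 matrix of a linear map on su(2); column k is the image of BASIS[k]."""
    cols = [coords(op(S)) for S in BASIS]
    return [[cols[k][i] for k in range(3)] for i in range(3)]

def ad(X):
    return as_matrix(lambda Y: lincomb([1, -1], [mul(X, Y), mul(Y, X)]))

def Ad(g):
    return as_matrix(lambda Y: mul(mul(g, Y), dagger(g)))

def close(A, B, tol=1e-10):
    return all(abs(a - b) < tol for ra, rb in zip(A, B) for a, b in zip(ra, rb))

def det3(M):
    return (M[0][0] * (M[1][1] * M[2][2] - M[1][2] * M[2][1])
            - M[0][1] * (M[1][0] * M[2][2] - M[1][2] * M[2][0])
            + M[0][2] * (M[1][0] * M[2][1] - M[1][1] * M[2][0]))

ad_matches_text = (
    close(ad(SX), [[0, 0, 0], [0, 0, -2], [0, 2, 0]])
    and close(ad(SY), [[0, 0, 2], [0, 0, 0], [-2, 0, 0]])
    and close(ad(SZ), [[0, -2, 0], [2, 0, 0], [0, 0, 0]])
)

X = lincomb([0.3, -0.7, 0.4], BASIS)   # an arbitrary member of su(2)
R = Ad(expm(X))                        # adjoint image, a real 3x3 matrix
exp_of_ad_ok = close(R, expm(ad(X)))   # Ad(e^X) against exp(ad(X))
orthogonal_ok = close(mul(R, transpose(R)), [[1, 0, 0], [0, 1, 0], [0, 0, 1]])
print(ad_matches_text, exp_of_ad_ok, orthogonal_ok, det3(R))
```

The three flags should be true and the determinant close to 1, consistent with $\Ad(e^{X})$ being a rotation matrix. A single sample point is of course a sanity check, not a proof.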

Example 4.4: (Adjoint Representation of $SO(3)$)

We now hearken back to $SO(3)$, the $3\times3$ rotation group in Example 1.3, i.e. the group of unimodular (determinant = 1), orthogonal $3\times3$ matrices, and reason exactly as we did in Example 4.3 above to find the linear operator acting on the Lie algebra $\mathfrak{so}(3)$ which is the image $\Ad(e^{\tau\,X})$ under the Adjoint representation of the general $SO(3)$ member $e^{\tau\,X}$, where $X$ belongs to the Lie algebra $\mathfrak{so}(3)$ of $3\times3$, skew-symmetric real matrices, i.e. linear superpositions of the $\mathfrak{so}(3)$-basis $\{\hat{S}_x,\,\hat{S}_y,\,\hat{S}_z\}$, where the $\hat{S}_j$ are as defined in Example 1.3. Exactly as in Example 4.3 above, to calculate the matrix of $\Ad(e^{\tau\,X})$, we must calculate the matrices for the Lie bracket operators $\mathfrak{so}(3)\to\mathfrak{so}(3);\, Y\mapsto \ad(X)\,Y = X\,Y-Y\,X$ for $X\in\mathfrak{so}(3)$. We find:

\begin{equation}\label{SO3AdjointRepresentationExample_1}\ad(\hat{S}_x) = \hat{S}_x;\;\ad(\hat{S}_y) = \hat{S}_y;\;\ad(\hat{S}_z) = \hat{S}_z\end{equation}

so that we get precisely the same result as in Example 4.3 above, to wit, that $\Ad(SO(3))$ is the group

\begin{equation}\label{SO3AdjointRepresentationExample_3}\begin{array}{lcl}\Ad(SO(3)) &=& \bigcup\limits_{k=1}^\infty\left\{\left.\exp\left(x\,\hat{S}_x+y\,\hat{S}_y+z\,\hat{S}_z\right)\right|\;x,\,y,\,z\in\R\right\}^k \\ &=& \left\{\left.\exp\left(x\,\hat{S}_x+y\,\hat{S}_y+z\,\hat{S}_z\right)\right|\;x,\,y,\,z\in\R\right\} \\&=& SO(3)\end{array}\end{equation}

that is, $SO(3)$ is mapped to itself under the adjoint representation.
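Example 4.4 can be checked in the same spirit. The sketch below assumes that the $\hat{S}_j$ of Example 1.3 are the standard skew-symmetric rotation generators, which is what the relation $\ad(\hat{s}_j) = 2\,\hat{S}_j$ of Example 4.3 implies; the helper names are hypothetical and `expm` is again a plain truncated Taylor series. It tests that $\ad(\hat{S}_j)=\hat{S}_j$ and that, for a sample rotation $R$, the matrix of $Y\mapsto R\,Y\,R^{-1}$ in the $\hat{S}$-basis is $R$ itself.

```python
# Pure-Python check of Example 4.4. Assumption: S_X, S_Y, S_Z are the standard
# skew-symmetric generators, read off from ad(s_j) = 2 * S_j in Example 4.3.

def mul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def transpose(A):
    return [list(col) for col in zip(*A)]

def expm(A, terms=40):
    """Truncated Taylor series for exp(A); adequate for the small norms here."""
    total = [[1.0 if i == j else 0.0 for j in range(3)] for i in range(3)]
    term = [row[:] for row in total]
    for k in range(1, terms):
        term = [[x / k for x in row] for row in mul(term, A)]
        total = [[t + u for t, u in zip(rt, ru)] for rt, ru in zip(total, term)]
    return total

S_X = [[0, 0, 0], [0, 0, -1], [0, 1, 0]]
S_Y = [[0, 0, 1], [0, 0, 0], [-1, 0, 0]]
S_Z = [[0, -1, 0], [1, 0, 0], [0, 0, 0]]
BASIS = [S_X, S_Y, S_Z]

def coords(Y):
    """Weights (x, y, z) of Y = x*S_X + y*S_Y + z*S_Z."""
    return [Y[2][1], Y[0][2], Y[1][0]]

def as_matrix(op):
    """3x3 matrix of a linear map on so(3); column k is the image of BASIS[k]."""
    cols = [coords(op(S)) for S in BASIS]
    return [[cols[k][i] for k in range(3)] for i in range(3)]

def ad(X):
    def bracket(Y):
        XY, YX = mul(X, Y), mul(Y, X)
        return [[XY[i][j] - YX[i][j] for j in range(3)] for i in range(3)]
    return as_matrix(bracket)

def Ad(R):
    # For R in SO(3) the inverse is the transpose.
    return as_matrix(lambda Y: mul(mul(R, Y), transpose(R)))

def close(A, B, tol=1e-10):
    return all(abs(a - b) < tol for ra, rb in zip(A, B) for a, b in zip(ra, rb))

ad_reproduces_basis = all(ad(S) == S for S in BASIS)

# An arbitrary member of so(3) and the rotation it generates.
X = [[0.5 * S_X[i][j] + 0.2 * S_Y[i][j] - 0.9 * S_Z[i][j] for j in range(3)]
     for i in range(3)]
R = expm(X)
Ad_reproduces_R = close(Ad(R), R)
print(ad_reproduces_basis, Ad_reproduces_R)
```

Both flags should be true, which is the numerical face of the statement that $SO(3)$ is mapped to itself.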