[Weinberg QFT] 15.1 Gauge Invariance

This article is one of the posts in the Textbook Commentary Project.


We assume that the Lagrangian of our theory is invariant under a set of infinitesimal transformations on the matter fields $$\begin{align}\delta\phi_l(x)=i\epsilon^\alpha (x)(t_\alpha)_l^m\phi_m(x)\end{align}$$ with some set of independent constant matrices $t_\alpha$, and with real infinitesimal parameters $\epsilon^\alpha(x)$ which (as for gauge transformations in electrodynamics) are allowed to depend on position in spacetime.

We assume that these symmetry transformations are the infinitesimal part of a Lie group; as shown in Section 2.2, this requires that the $t_\alpha$ obey commutation relations $$\begin{align} \left[ t_\alpha,t_\beta \right] = i C^\gamma_{\alpha\beta} t_\gamma\end{align}$$ where $C^\gamma_{\alpha\beta}$ are a set of real constants, known as the structure constants of the group. The antisymmetry of the commutator immediately tells us that the structure constants are similarly antisymmetric: $$\begin{align} C^\gamma_{\alpha\beta} = -C^\gamma_{\beta\alpha}\end{align}$$

Also, from the Jacobi identity $$\begin{align} 0=\left[ \left[ t_\alpha,t_\beta\right] ,t_\gamma \right] + \left[ \left[ t_\gamma,t_\alpha \right], t_\beta \right] + \left[ \left[ t_\beta, t_\gamma \right], t_\alpha \right] \end{align}$$ we see that the $C$s satisfy the further constraint $$\begin{align} 0=C^\delta_{\alpha\beta} C^\epsilon_{\delta\gamma} + C^\delta_{\gamma\alpha}C^\epsilon_{\delta\beta}+C^\delta_{\beta\gamma}C^\epsilon_{\delta\alpha}.\end{align}$$ 
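
As a quick sanity check of Eqs. (3) and (5), the sketch below tests both conditions numerically for one concrete choice, $C^\gamma_{\alpha\beta}=\epsilon_{\gamma\alpha\beta}$ (the $SU(2)$ case). The storage convention `C[g][a][b]` for $C^\gamma_{\alpha\beta}$ and all function names are assumptions made for this illustration; indices run over 0, 1, 2 rather than 1, 2, 3.

```python
from itertools import product

N = 3  # number of generators; SU(2) is chosen only as an example

def levi_civita(a, b, c):
    """Totally antisymmetric symbol for indices drawn from 0, 1, 2."""
    return (a - b) * (b - c) * (c - a) // 2

# Assumed convention: C[g][a][b] stores C^g_{ab}.
C = [[[levi_civita(g, a, b) for b in range(N)] for a in range(N)] for g in range(N)]

def antisymmetry_violation(C):
    """Largest |C^g_{ab} + C^g_{ba}|; zero when Eq. (3) holds."""
    return max(abs(C[g][a][b] + C[g][b][a]) for g, a, b in product(range(N), repeat=3))

def jacobi_violation(C):
    """Largest absolute value of the right-hand side of Eq. (5)."""
    worst = 0
    for a, b, g, e in product(range(N), repeat=4):
        total = sum(
            C[d][a][b] * C[e][d][g] + C[d][g][a] * C[e][d][b] + C[d][b][g] * C[e][d][a]
            for d in range(N)
        )
        worst = max(worst, abs(total))
    return worst

print(antisymmetry_violation(C), jacobi_violation(C))
```

Both numbers should vanish for any genuine set of structure constants; a nonzero value would signal that the chosen $C^\gamma_{\alpha\beta}$ do not define a Lie algebra.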

Any set of constants $C^\gamma_{\alpha\beta}$ that satisfy Eqs. (3) and (5) define at least one set of matrices $t^A_\alpha$: $$\begin{align} (t^A_\alpha)^\beta_\gamma \equiv -iC^\beta_{\gamma\alpha},\end{align}$$ that satisfy the commutation relation (2) with structure constants $C^\gamma_{\alpha\beta}$: $$\begin{align} \left[ t^A_\alpha,t^A_\beta\right]= iC^\gamma_{\alpha\beta} t^A_\gamma\end{align}$$

This is known as the 'adjoint' (or 'regular') representation of the Lie algebra with structure constants $C^\gamma_{\alpha\beta}$.
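
The construction in Eq. (6) is mechanical enough to try out directly. The following is a minimal sketch, assuming the $SU(2)$ structure constants $\epsilon_{\gamma\alpha\beta}$ as input and a hypothetical storage layout `t[a][b][g]` for $(t^A_\alpha)^\beta_\gamma$ (row $\beta$, column $\gamma$); it builds the adjoint matrices and measures how well they satisfy Eq. (7).

```python
from itertools import product

def levi_civita(a, b, c):
    """Totally antisymmetric symbol for indices drawn from 0, 1, 2."""
    return (a - b) * (b - c) * (c - a) // 2

# Assumed convention: C[g][a][b] stores C^g_{ab}; SU(2) is only an example.
C = [[[levi_civita(g, a, b) for b in range(3)] for a in range(3)] for g in range(3)]

def adjoint_generators(C):
    """Eq. (6): (t^A_a)^b_g = -i C^b_{ga}, stored as t[a][b][g]."""
    n = len(C)
    return [[[-1j * C[b][g][a] for g in range(n)] for b in range(n)] for a in range(n)]

def matmul(X, Y):
    """Plain matrix product of two square nested-list matrices."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def commutation_violation(t, C):
    """Largest entry of [t_a, t_b] - i C^g_{ab} t_g over all a, b."""
    n = len(C)
    worst = 0.0
    for a, b in product(range(n), repeat=2):
        ab, ba = matmul(t[a], t[b]), matmul(t[b], t[a])
        for i, j in product(range(n), repeat=2):
            rhs = 1j * sum(C[g][a][b] * t[g][i][j] for g in range(n))
            worst = max(worst, abs(ab[i][j] - ba[i][j] - rhs))
    return worst

t_adj = adjoint_generators(C)
print(commutation_violation(t_adj, C))
```

The same two functions should work for any other structure constants that obey Eqs. (3) and (5), since the proof of Eq. (7) uses nothing else.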


Example: $SU(2)$ Yang-Mills Theory

In the original Yang-Mills theory, based on $SU(2)$, the matter fields were the doublet consisting of proton and neutron fields $\phi_p$ and $\phi_n$: $$\phi = \begin{pmatrix} \phi_p \\ \phi_n \end{pmatrix}$$ and the $t_\alpha$ with $\alpha=1,2,3$ were the isospin matrices $$t_1=\frac{1}{2}\begin{pmatrix}0 & 1 \\ 1 & 0\end{pmatrix},\ t_2=\frac{1}{2}\begin{pmatrix}0 & -i \\ i & 0 \end{pmatrix},\ t_3 = \frac{1}{2}\begin{pmatrix}1 & 0 \\ 0 & -1 \end{pmatrix}.$$ These satisfy the commutation relations (2) with $$C^\gamma_{\alpha\beta} = \epsilon_{\gamma \alpha\beta}$$ where as usual $\epsilon_{\gamma\alpha\beta}$ is $+1$ or $-1$ if $\gamma,\alpha,\beta$ is an even or an odd permutation of $1,2,3$, respectively, and vanishes otherwise. We recognize this as the same as the Lie algebra (2.4.18) of the three-dimensional rotation group; the matrices $t_\alpha$ here furnish what we recognize as the spin $1/2$ representation of this Lie algebra. The matrices (6) of the adjoint representation are here (in a basis with rows and columns labelled $1,2,3$): $$t^A_1 = \begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & -i \\ 0 & i & 0 \end{bmatrix},\ t^A_2 = \begin{bmatrix} 0 & 0 & i \\ 0 & 0 & 0 \\ -i & 0 & 0 \end{bmatrix},\ t^A_3 = \begin{bmatrix} 0 & -i & 0 \\ i & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}$$ This is the spin 1 representation of the Lie algebra of the rotation group.
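
The identification of the two representations as spin $1/2$ and spin $1$ can be checked through the quadratic Casimir $\sum_\alpha t_\alpha t_\alpha$, which should equal $j(j+1)$ times the unit matrix. The sketch below is an illustration only: it uses the standard isospin matrices (with $t_3=\tfrac12\,\mathrm{diag}(1,-1)$), builds the adjoint matrices from Eq. (6), and the helper names `commutation_violation` and `casimir_value` are hypothetical.

```python
from itertools import product

def levi_civita(a, b, c):
    """Totally antisymmetric symbol for indices drawn from 0, 1, 2."""
    return (a - b) * (b - c) * (c - a) // 2

# Isospin matrices t_a = sigma_a / 2 (Pauli matrices over two).
T_DOUBLET = [
    [[0, 0.5], [0.5, 0]],
    [[0, -0.5j], [0.5j, 0]],
    [[0.5, 0], [0, -0.5]],
]
# Adjoint matrices from Eq. (6) with C^b_{ga} = eps_{bga}.
T_ADJOINT = [[[-1j * levi_civita(b, g, a) for g in range(3)] for b in range(3)] for a in range(3)]

def matmul(X, Y):
    """Plain matrix product of two square nested-list matrices."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def commutation_violation(t):
    """Largest entry of [t_a, t_b] - i eps_{gab} t_g over all a, b."""
    dim = len(t[0])
    worst = 0.0
    for a, b in product(range(3), repeat=2):
        ab, ba = matmul(t[a], t[b]), matmul(t[b], t[a])
        for i, j in product(range(dim), repeat=2):
            rhs = 1j * sum(levi_civita(g, a, b) * t[g][i][j] for g in range(3))
            worst = max(worst, abs(ab[i][j] - ba[i][j] - rhs))
    return worst

def casimir_value(t):
    """Return c if sum_a t_a t_a = c * identity (within rounding), else None."""
    dim = len(t[0])
    squares = [matmul(g, g) for g in t]
    total = [[sum(sq[i][j] for sq in squares) for j in range(dim)] for i in range(dim)]
    c = total[0][0]
    ok = all(abs(total[i][j] - (c if i == j else 0)) < 1e-12
             for i, j in product(range(dim), repeat=2))
    return c.real if ok else None

print(commutation_violation(T_DOUBLET), casimir_value(T_DOUBLET), casimir_value(T_ADJOINT))
```

A Casimir value of $3/4$ corresponds to $j=1/2$ and a value of $2$ to $j=1$, which is what the text asserts for the doublet and the adjoint matrices respectively.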

Now consider what is needed to make the Lagrangian invariant under the transformations (1). If there were no derivatives acting on the fields, the task would be easy: any function of the matter fields that was invariant under the transformation (1) with $\epsilon^\alpha$ constant would also be invariant with $\epsilon^\alpha$ arbitrary real functions of the spacetime coordinates. This is not the case if the Lagrangian involves derivatives of the fields (as it must), because with position-dependent functions $\epsilon^\alpha(x)$, the derivatives of the matter fields do not transform like the fields themselves. Differentiating Eq. (1) gives $$\begin{align}\delta\left( \partial_\mu \phi_l(x)\right) = i \epsilon^\alpha (x)(t_\alpha)_l ^m \left( \partial_\mu\phi_m(x)\right) + i\left( \partial_\mu \epsilon^\alpha(x)\right)(t_\alpha)_l^m\phi_m(x).\end{align}$$

To make the Lagrangian invariant, we need a field $A^\alpha_\mu$, whose transformation rule involves a term $\partial_\mu\epsilon^\alpha$, which can be used to cancel the second term in Eq. (8). Since this field carries an $\alpha$-index, we would expect it also to undergo a matrix transformation like Eq. (1), but with $t_\alpha$ replaced with the adjoint representation matrices (6). Let us therefore tentatively take the transformation rule of these new 'gauge' fields as $$\delta A^\beta_\mu = \partial_\mu \epsilon^\beta + i\epsilon^\alpha (t^A_\alpha)^\beta_\gamma A^\gamma_\mu$$ or, using Eq. (6), $$\begin{align} \delta A^\beta_\mu = \partial_\mu \epsilon^\beta + C^\beta_{\gamma\alpha}\epsilon^\alpha A^\gamma_\mu. \end{align}$$

This allows us to construct a 'covariant derivative': $$\begin{align} \left( D_\mu \phi(x)\right)_l = \partial_\mu \phi_l(x)-iA^\beta_\mu(x)(t_\beta)_l^m\phi_m(x).\end{align}$$

As planned, the term $\partial_\mu \epsilon^\beta$ in the transformation of $A^\beta_\mu$ in the second term of Eq. (10) cancels the term proportional to $\partial_\mu \epsilon^\beta$ in the transformation of the first term, leaving us with $$\delta(D_\mu \phi)_l = i\epsilon^\alpha(t_\alpha)_l^m\partial_\mu\phi_m - i C^\beta_{\gamma\alpha}\epsilon^\alpha A^\gamma_\mu (t_\beta)_l^m \phi_m+ \epsilon^\alpha A^\gamma_\mu(t_\gamma)_l^m(t_\alpha)_m^n\phi_n$$ or, using Eq. (2) in the form $-iC^\beta_{\gamma\alpha}t_\beta = -[t_\gamma,t_\alpha]$ to combine the last two terms into $\epsilon^\alpha A^\gamma_\mu (t_\alpha t_\gamma)_l^n\phi_n$, $$\begin{align}\delta(D_\mu \phi)_l = i\epsilon^\alpha(t_\alpha)_l^m(D_\mu\phi)_m,\end{align}$$ so that $D_\mu \phi$ transforms just like $\phi$ itself.


We also need to worry about derivatives of the gauge field. In order to eliminate the term $\partial_\nu \partial_\mu \epsilon^\beta$ in the transformation of $\partial_\nu A^\beta_\mu$, we antisymmetrize with respect to $\mu$ and $\nu$, just as in electrodynamics. However, we still have terms in the transformation of $\partial_\nu A^\beta_\mu - \partial_\mu A^\beta_\nu$ proportional to first derivatives of $\epsilon(x)$, arising from the second term in Eq. (9). The easiest way to construct a 'covariant curl' $F^\gamma_{\nu\mu}$, in whose transformation rule all such derivatives of $\epsilon(x)$ cancel, is to consider the commutator of two covariant derivatives acting on a matter field $\phi$: $$\begin{align}\left( [D_\nu,D_\mu]\phi\right)_l = -i(t_\gamma)_l^mF^\gamma_{\nu\mu}\phi_m,\end{align}$$ where $$\begin{align} F^\gamma_{\nu\mu}\equiv \partial_\nu A^\gamma_\mu - \partial_\mu A^\gamma_\nu + C^\gamma_{\alpha\beta}A^\alpha_\nu A^\beta_\mu.\end{align}$$
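
The non-Abelian term in Eq. (13) can be isolated by looking at spacetime-independent gauge fields, for which the derivative terms drop out and $[D_\nu,D_\mu]$ reduces to the matrix $-[t_\alpha A^\alpha_\nu, t_\beta A^\beta_\mu]$. The sketch below checks Eq. (12) in that restricted setting for the $SU(2)$ doublet; the constant-field simplification, the sample numbers, and the function names are all assumptions of this illustration.

```python
def levi_civita(a, b, c):
    """Totally antisymmetric symbol for indices drawn from 0, 1, 2."""
    return (a - b) * (b - c) * (c - a) // 2

# Isospin matrices t_a = sigma_a / 2 (SU(2) doublet, used only as an example).
T = [
    [[0, 0.5], [0.5, 0]],
    [[0, -0.5j], [0.5j, 0]],
    [[0.5, 0], [0, -0.5]],
]

def matmul(X, Y):
    """Plain matrix product of two square nested-list matrices."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def contract(v):
    """Return the 2x2 matrix v^a t_a."""
    return [[sum(v[a] * T[a][i][j] for a in range(3)) for j in range(2)] for i in range(2)]

def field_strength(A_nu, A_mu):
    """Eq. (13) for constant fields: F^g = C^g_{ab} A^a_nu A^b_mu."""
    return [
        sum(levi_civita(g, a, b) * A_nu[a] * A_mu[b] for a in range(3) for b in range(3))
        for g in range(3)
    ]

def commutator_of_derivatives(A_nu, A_mu):
    """[D_nu, D_mu] as a matrix; for constant fields it equals -[A_nu, A_mu]."""
    X, Y = contract(A_nu), contract(A_mu)
    XY, YX = matmul(X, Y), matmul(Y, X)
    return [[YX[i][j] - XY[i][j] for j in range(2)] for i in range(2)]

def mismatch(A_nu, A_mu):
    """Largest entry of [D_nu, D_mu] - (-i t_g F^g); zero if Eq. (12) holds."""
    lhs = commutator_of_derivatives(A_nu, A_mu)
    tF = contract(field_strength(A_nu, A_mu))
    return max(abs(lhs[i][j] + 1j * tF[i][j]) for i in range(2) for j in range(2))

A_nu = [0.3, -1.2, 0.5]   # arbitrary sample values for A^a_nu
A_mu = [0.9, 0.4, -0.7]   # arbitrary sample values for A^a_mu
print(mismatch(A_nu, A_mu))
```

For $SU(2)$ the constant-field $F^\gamma_{\nu\mu}$ is just the cross product of the two isovectors, so it is nonzero even though no derivatives are present; this is the feature that has no counterpart in electrodynamics.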

Eq. (12) makes it obvious that $F^\gamma_{\nu\mu}$ must transform just like a matter field that happens to belong to the adjoint representation: $$\begin{align}\delta F^\beta_{\nu\mu} = i \epsilon^\alpha (t^A_\alpha)^\beta_\gamma F^\gamma_{\nu\mu} = \epsilon^\alpha C^\beta_{\gamma\alpha}F^\gamma_{\nu\mu}.\end{align}$$

The reader may check by direct calculation (using the relation (5)) that the quantity $F^\alpha_{\nu\mu}$ defined in (13) actually has the simple transformation rule (14).
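
Part of that direct calculation can be mimicked numerically. The sketch below is restricted to constant $\epsilon^\alpha$ and constant $A^\alpha_\mu$ (so every derivative term is absent) and to $SU(2)$; under those assumptions it compares the first-order variation of the quadratic term in Eq. (13), induced by Eq. (9), with the right-hand side of Eq. (14). The function names and sample values are hypothetical.

```python
def levi_civita(a, b, c):
    """Totally antisymmetric symbol for indices drawn from 0, 1, 2."""
    return (a - b) * (b - c) * (c - a) // 2

def adjoint_variation(V, eps):
    """Homogeneous part of Eq. (9): delta V^b = C^b_{ga} eps^a V^g."""
    return [
        sum(levi_civita(b, g, a) * eps[a] * V[g] for a in range(3) for g in range(3))
        for b in range(3)
    ]

def quadratic_term(X, Y):
    """C^g_{ab} X^a Y^b, the non-derivative part of Eq. (13)."""
    return [
        sum(levi_civita(g, a, b) * X[a] * Y[b] for a in range(3) for b in range(3))
        for g in range(3)
    ]

def delta_F_direct(A_nu, A_mu, eps):
    """Vary each gauge field in turn; exact to first order since F is bilinear here."""
    first = quadratic_term(adjoint_variation(A_nu, eps), A_mu)
    second = quadratic_term(A_nu, adjoint_variation(A_mu, eps))
    return [first[g] + second[g] for g in range(3)]

def delta_F_expected(A_nu, A_mu, eps):
    """Right-hand side of Eq. (14): eps^a C^b_{ga} F^g."""
    return adjoint_variation(quadratic_term(A_nu, A_mu), eps)

A_nu = [0.3, -1.2, 0.5]    # arbitrary sample values
A_mu = [0.9, 0.4, -0.7]
eps = [0.01, -0.02, 0.03]
direct = delta_F_direct(A_nu, A_mu, eps)
expected = delta_F_expected(A_nu, A_mu, eps)
print(max(abs(p - q) for p, q in zip(direct, expected)))
```

Agreement here rests entirely on the Jacobi relation (5); the cancellation of the $\partial\epsilon$ terms for position-dependent parameters is not exercised by this constant-field check.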


For some purposes, it is useful to know that these infinitesimal gauge transformations can be upgraded to finite transformations. A group element can be parametrized by a set of real functions $\Lambda^\alpha(x)$ so that it acts on a general matter field $\phi_l(x)$ through the matrix transformation $$\begin{align} \phi_l(x)\rightarrow \phi_{l\Lambda}(x)=\left[ \exp \left(it_\alpha \Lambda^\alpha(x)\right) \right]_{lm} \phi_m(x).\end{align}$$

We want the covariant derivative to transform in the same way: $$\begin{align}(\partial_\mu - it_\alpha A^\alpha_{\mu\Lambda})\phi_\Lambda = \exp (it_\alpha\Lambda^\alpha)(\partial_\mu - it_\alpha A^\alpha_\mu)\phi,\end{align}$$ so we must impose on $A^\alpha_\mu$ the transformation rule $A^\alpha_\mu\rightarrow A^\alpha_{\mu\Lambda}$, with $$\partial_\mu \exp(it_\beta \Lambda^\beta)-it_\beta \exp (it_\alpha \Lambda^\alpha)A^\beta_{\mu\Lambda}=-i\exp (it_\alpha \Lambda^\alpha)t_\beta A^\beta_\mu$$ or in other words $$\begin{align} t_\alpha A^\alpha_{\mu\Lambda} = \exp(it_\beta \Lambda^\beta)t_\alpha A^\alpha_\mu \exp (-it_\beta\Lambda^\beta)-i\left[ \partial_\mu \exp (it_\beta \Lambda^\beta)\right] \exp(-it_\beta\Lambda^\beta).\end{align}$$

Eqs. (15) and (17) reduce to the previous transformation rules (1) and (9) in the limit where $\Lambda^\alpha(x)$ is an infinitesimal $\epsilon^\alpha(x)$.
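
This limit can be illustrated numerically. The sketch below evaluates the right-hand side of Eq. (17) for the $SU(2)$ doublet along a single coordinate $x$, with a small, arbitrarily chosen $\Lambda^\alpha(x)$, and compares the result with the infinitesimal rule (9). Everything specific here is an assumption of the example: the profile `Lam`, the Taylor-series matrix exponential, the central finite difference for $\partial_\mu\exp(it_\beta\Lambda^\beta)$, and the use of $\mathrm{tr}(t_\alpha t_\beta)=\tfrac12\delta_{\alpha\beta}$ to read off components.

```python
import math

def levi_civita(a, b, c):
    """Totally antisymmetric symbol for indices drawn from 0, 1, 2."""
    return (a - b) * (b - c) * (c - a) // 2

# Isospin matrices t_a = sigma_a / 2 (SU(2) doublet, used only as an example).
T = [
    [[0, 0.5], [0.5, 0]],
    [[0, -0.5j], [0.5j, 0]],
    [[0.5, 0], [0, -0.5]],
]

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def lin(v, scale=1):
    """Return scale * v^a t_a as a 2x2 matrix."""
    return [[scale * sum(v[a] * T[a][i][j] for a in range(3)) for j in range(2)] for i in range(2)]

def expm(X, terms=30):
    """Matrix exponential by truncated Taylor series (adequate for small 2x2 input)."""
    result = [[1 if i == j else 0 for j in range(2)] for i in range(2)]
    term = result
    for k in range(1, terms):
        term = [[v / k for v in row] for row in matmul(term, X)]
        result = [[result[i][j] + term[i][j] for j in range(2)] for i in range(2)]
    return result

def Lam(x, size):
    """Hypothetical gauge parameters Lambda^a(x), scaled by `size`."""
    return [size * math.sin(x), size * math.cos(2 * x), size * x]

def dLam(x, size):
    """Analytic x-derivative of Lam."""
    return [size * math.cos(x), -2 * size * math.sin(2 * x), size]

def finite_rule(A, x, size, h=1e-3):
    """Components A^a_Lambda from Eq. (17), with dU/dx by central difference."""
    U = expm(lin(Lam(x, size), 1j))
    U_inv = expm(lin(Lam(x, size), -1j))
    Up, Um = expm(lin(Lam(x + h, size), 1j)), expm(lin(Lam(x - h, size), 1j))
    dU = [[(Up[i][j] - Um[i][j]) / (2 * h) for j in range(2)] for i in range(2)]
    hom = matmul(matmul(U, lin(A)), U_inv)
    inh = matmul(dU, U_inv)
    M = [[hom[i][j] - 1j * inh[i][j] for j in range(2)] for i in range(2)]
    # tr(t_a t_b) = delta_ab / 2, so A^a_Lambda = 2 tr(t_a M).
    return [2 * sum(matmul(T[a], M)[i][i] for i in range(2)).real for a in range(3)]

def infinitesimal_rule(A, x, size):
    """A^b + delta A^b from Eq. (9), with eps^a replaced by Lam^a."""
    e, de = Lam(x, size), dLam(x, size)
    return [
        A[b] + de[b]
        + sum(levi_civita(b, g, a) * e[a] * A[g] for a in range(3) for g in range(3))
        for b in range(3)
    ]

A = [0.7, -0.2, 0.4]  # arbitrary sample values for one spacetime component
exact = finite_rule(A, 0.3, 1e-4)
approx = infinitesimal_rule(A, 0.3, 1e-4)
print(max(abs(p - q) for p, q in zip(exact, approx)))
```

The residual should be of second order in the size of $\Lambda^\alpha$, so shrinking `size` by a factor of ten should reduce it by roughly a hundred until finite-difference and rounding errors take over.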


From Eq. (17), we can see that by a suitable choice of $\Lambda^\beta(x)$, it is always possible to make $A^\alpha_{\mu\Lambda}(x)$ vanish at any one point, say $x=z$. (Simply take $\Lambda^\alpha(z)$ to vanish, and $\partial \Lambda^\alpha (x)/\partial x^\mu=-A^\alpha_\mu(x)$ at $x=z$.) Also, it is always possible to choose $\Lambda^\beta(x)$ so that any one spacetime component of $A^\alpha_{\mu\Lambda}(x)$ vanishes for all $\alpha$ everywhere in at least a finite domain around any given point. For instance, to make $A^\alpha_{3\Lambda}(x)$ vanish, we must solve the set of ordinary first-order differential equations for the parameters $\Lambda^\beta(x)$: $$\begin{align} \partial_3 \exp(it_\beta \Lambda^\beta) = -i\exp (it_\beta\Lambda^\beta)t_\alpha A^\alpha_3,\end{align}$$ which always have a solution in at least a finite domain around any ordinary point.


However, in general it is not possible to choose $\Lambda^\alpha(x)$ to make all four components $A^\alpha_{\mu\Lambda}(x)$ vanish in a finite region. For this purpose, we would have to be able to satisfy the partial differential equations $$\begin{align}\partial_\mu \exp (it_\beta \Lambda^\beta)=-i\exp (it_\beta \Lambda^\beta)t_\alpha A^\alpha_\mu, \end{align}$$ which cannot be solved unless certain integrability conditions are satisfied. In particular, if $A^\alpha_{\mu\Lambda}(x)$ vanishes everywhere then so does $F^\alpha_{\mu\nu\Lambda}(x)$, but since the field strength transforms homogeneously, $F^\alpha_{\mu\nu\Lambda}(x)$ can vanish only if $F^\alpha_{\mu\nu}(x)$ does. A gauge field $A^\alpha_\mu(x)$ is called a 'pure gauge' field if there exists a gauge transformation which makes it vanish everywhere. It is not difficult to show that the condition that $F^\alpha_{\mu\nu}$ should vanish everywhere is not only necessary but also sufficient for $A^\alpha_\mu(x)$ to be expressible in any simply connected region as a pure gauge field.


There is a deep analogy between the construction here of objects that transform simply under gauge transformations and the construction in general relativity of objects that transform covariantly under general coordinate transformations. Just as we use the gauge field to construct covariant derivatives $D_\mu\phi_l$ of matter fields with the same gauge transformation properties as the matter fields themselves, so we use the affine connection $\Gamma^\mu_{\nu\lambda}(x)$ to construct covariant derivatives of tensors $T^{\rho\sigma\cdots}_{\kappa\lambda\cdots}$: $$T^{\rho\cdots}_{\kappa\cdots;\nu}\equiv \partial_\nu T^{\rho\cdots}_{\kappa\cdots}+\Gamma^\rho_{\nu\lambda}T^{\lambda\cdots}_{\kappa\cdots}+\cdots-\Gamma^\mu_{\nu\kappa}T^{\rho\cdots}_{\mu\cdots}-\cdots,$$ which are themselves tensors. Also, from the derivatives of the gauge field we constructed a field strength $F^\alpha_{\mu\nu}$ with the gauge transformation property of a matter field belonging to the adjoint representation of the gauge group; correspondingly, from the derivatives of the affine connection we may construct a quantity $$R^\lambda_{\mu\nu\kappa} = \frac{\partial \Gamma^\lambda_{\mu\nu}}{\partial x^\kappa}-\frac{\partial \Gamma^\lambda_{\mu\kappa}}{\partial x^\nu} + \Gamma^\eta_{\mu\nu}\Gamma^\lambda_{\kappa\eta}-\Gamma^\eta_{\mu\kappa}\Gamma^\lambda_{\nu\eta},$$ which transforms as a tensor, the Riemann-Christoffel curvature tensor. The commutator of two gauge-covariant derivatives $D_\mu$ and $D_\nu$ may be expressed in terms of the field-strength tensor $F^\alpha_{\mu\nu}$; similarly, the commutator of two covariant derivatives with respect to $x^\nu$ and $x^\kappa$ may be expressed in terms of the curvature: $$T^{\lambda\cdots}_{\mu\cdots;\nu;\kappa}-T^{\lambda\cdots}_{\mu\cdots;\kappa;\nu}=R^\lambda_{\sigma\nu\kappa}T^{\sigma\cdots}_{\mu\cdots}+\cdots-R^\sigma_{\mu\nu\kappa} T^{\lambda\cdots}_{\sigma\cdots}-\cdots.$$

The necessary and sufficient condition for the existence of a gauge in which the gauge field vanishes in a finite simply connected region is the vanishing of the field-strength tensor, and the necessary and sufficient condition for the existence of a coordinate system in which the affine connection vanishes in a finite simply connected region is the vanishing of the Riemann-Christoffel curvature tensor. The analogy breaks down in one important respect: in general relativity the affine connection is itself constructed from first derivatives of the metric tensor, while in gauge theories the gauge fields are not expressed in terms of any more fundamental fields.