A Simplified Introduction to General Relativity
Before launching
into general relativity, I thought I'd review a few basic derivations in
differential geometry. The simplest possible case of interest would probably be:
$s^2 = x^2 + y^2$
or, over small distances,
$ds^2 = dx^2 + dy^2$.
As a review
exercise, we could use the calculus of variations to derive the equation for
the geodesic curve: i.e., the shortest distance between two points through the
Euclidean space defined by this Pythagorean theorem distance formula. (It
better come out to be a straight line!) The distance to be minimized is:
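$$s = \int ds = \int \sqrt{dx^2 + dy^2}$$
and the standard variational integral is:
$$\delta \int f\,d\zeta = \delta \int \sqrt{x'^2 + y'^2}\,d\zeta = 0,$$
where the primes denote derivatives with respect to a parameter $\zeta$ that labels points along the path. Then the standard, grind-the-crank Euler-Lagrange equations are:
$$\frac{\partial f}{\partial x} - \frac{d}{d\zeta}\frac{\partial f}{\partial x'} = 0, \qquad \frac{\partial f}{\partial y} - \frac{d}{d\zeta}\frac{\partial f}{\partial y'} = 0.$$
But inasmuch as f isn't an explicit function of either x or y,
$$\frac{\partial f}{\partial x} = 0 \quad \text{and} \quad \frac{\partial f}{\partial y} = 0,$$
which means that
$$\frac{d}{d\zeta}\frac{\partial f}{\partial x'} = 0; \qquad \frac{d}{d\zeta}\frac{\partial f}{\partial y'} = 0;$$
so that
$$\frac{\partial f}{\partial x'} = \text{constant}; \qquad \frac{\partial f}{\partial y'} = \text{constant}.$$
Evaluating $\partial f/\partial x'$ and $\partial f/\partial y'$, and remembering that $ds = f\,d\zeta$:
$$\frac{\partial f}{\partial x'} = \frac{x'}{f} = \frac{dx}{ds} = \text{constant}$$
and
$$\frac{\partial f}{\partial y'} = \frac{y'}{f} = \frac{dy}{ds} = \text{constant}.$$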
What these
equations are saying in their cumbersome way is that the line which minimizes
the distance between any two points on a plane makes a constant angle with the
horizontal and the vertical...that is, that it's a straight line.
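As a quick numerical sanity check on that conclusion, here is a minimal sketch in Python. The endpoints (0, 0) and (3, 4), the sine-shaped detour, and the names `path_length` and `sample_path` are illustrative assumptions of mine, not part of the derivation. The sketch just adds up $\sqrt{dx^2 + dy^2}$ along a straight path and along a bent one between the same two points.

```python
import math

def path_length(points):
    """Add up sqrt(dx^2 + dy^2) over the short segments of a sampled path."""
    return sum(math.dist(p, q) for p, q in zip(points, points[1:]))

def sample_path(bump, n=1000):
    """Sample a path from (0, 0) to (3, 4).

    `bump` is the size of a sine-shaped detour applied perpendicular to the
    straight line; bump = 0 gives the straight line itself.
    """
    points = []
    for i in range(n + 1):
        u = i / n
        offset = bump * math.sin(math.pi * u)
        # (-0.8, 0.6) is the unit vector perpendicular to the direction (3, 4)/5.
        points.append((3 * u - 0.8 * offset, 4 * u + 0.6 * offset))
    return points

straight = path_length(sample_path(0.0))
bent = path_length(sample_path(0.5))
print("straight:", straight)
print("bent:    ", bent)
```

The straight path comes out at 5 (the 3-4-5 triangle), and any nonzero `bump` gives something longer, which is all the Euler-Lagrange equations were telling us.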
Well, okay, that
checked out...like using a cannon to shoot a squirrel, but it worked. Now for
something more ambitious, like the shortest distance between two points in
space-time. The equation for the shortest distance between two points, the
geodesic curve, becomes:
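$$s = \int ds = \int f\,d\zeta, \qquad f = \sqrt{(ct')^2 - x'^2 - y'^2},$$
where the primes denote derivatives with respect to a parameter $\zeta$ along the path. Then the Euler-Lagrange equations become:
$$\frac{\partial f}{\partial x} - \frac{d}{d\zeta}\frac{\partial f}{\partial x'} = 0$$
$$\frac{\partial f}{\partial y} - \frac{d}{d\zeta}\frac{\partial f}{\partial y'} = 0$$
$$\frac{\partial f}{\partial (ct)} - \frac{d}{d\zeta}\frac{\partial f}{\partial (ct')} = 0$$
and as before, since f isn't an explicit function of x, y, or ct,
$$\frac{\partial f}{\partial x} = \frac{\partial f}{\partial y} = \frac{\partial f}{\partial (ct)} = 0$$
and
$$\frac{\partial f}{\partial x'} = -\frac{x'}{f} = -\frac{dx}{ds} = \text{constant};$$
$$\frac{\partial f}{\partial y'} = -\frac{y'}{f} = -\frac{dy}{ds} = \text{constant};$$
$$\frac{\partial f}{\partial (ct')} = \frac{ct'}{f} = \frac{d(ct)}{ds} = \text{constant}.$$
But
$$\frac{ds}{d(ct)} = \sqrt{1 - \frac{dx^2 + dy^2}{c^2\,dt^2}} = \sqrt{1 - \frac{v^2}{c^2}},$$
so that
$$\frac{d(ct)}{ds} = \frac{1}{\sqrt{1 - v^2/c^2}} = \text{constant}.$$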
And what's this
briar patch of equations trying to say? Well, the last equation says that along
the paths of shortest distance in space-time, the total speed, v, (which is the
slope of the spatial distance $\sqrt{x^2 + y^2}$ with respect to distance, ct,
along the time axis), is constant. This means:
(1) that the paths
of shortest distance have constant slopes...that is, are straight lines;
(2) that, in the
language of physics, the speeds of particles (or planets), which move along
paths of shortest distance (geodesics) in ordinary Euclidean space-time, are
constant. In other words, the last equation says that energy is a constant of
the motion. It says that particles move at constant speeds in the absence of
force fields (which we're about to perceive to be space-time vortices, regions
in which space-time becomes warped and non-Euclidean) not because of some
principle of conservation of energy but because particles follow geodesics in
Euclidean space-time and the shortest distance between two points in Euclidean
space-time is a straight line of constant slope, which is to say: of constant
speed. Impressive, huh?
The first two
equations say that the x and y components of the velocity will
remain constant along straight lines in space-time, and are the basis for
conservation of momentum. In effect, we've derived the principle of
conservation of energy (or at least of kinetic energy), and Newton's first law
of motion, not as postulates but as results.
Now for the same
thing in polar coordinates. Our distance to be minimized is:
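$$s = \int ds = \int f\,d\zeta, \qquad f = \sqrt{(ct')^2 - r'^2 - r^2\theta'^2},$$
where the primes denote derivatives with respect to a parameter $\zeta$ along the path, and the Euler-Lagrange equations are:
$$\frac{\partial f}{\partial r} - \frac{d}{d\zeta}\frac{\partial f}{\partial r'} = 0$$
$$\frac{\partial f}{\partial \theta} - \frac{d}{d\zeta}\frac{\partial f}{\partial \theta'} = 0$$
$$\frac{\partial f}{\partial (ct)} - \frac{d}{d\zeta}\frac{\partial f}{\partial (ct')} = 0$$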
and we go through
the usual hocus-pocus to arrive at the following quantities which are constant
along the paths of least distance in undistorted space-time:
![]()

![]()
The last equation
gives the same result as before, but to me, the interesting equation is the
second one. Multiply it by $m_0$ and it
becomes the relativistic angular momentum. It states that angular momentum will
be conserved. I had never realized before that conservation of angular momentum
was a consequence of the coordinate system we select and has nothing to do with
central force fields, but I can see now that that's the case. It makes sense
once you think about it. In the absence of any forces, the total speed of a particle
moving along a straight line isn't going to change. At the same time, if we put
a point of origin in there and start to measure $dr/dt$'s and $d\theta/dt$'s, at
infinity, the radial velocity will be the total speed, since there won't be any
tangential component. At the same time, the radial velocity has to go to zero
at the point of closest approach, so the tangential velocity has to take up the
slack until, at the point of closest approach, it's equal to the total speed.
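The bookkeeping in that last paragraph can be checked with a few lines of Python. This is only a sketch under assumptions of my own: a particle moving along the straight line y = b at constant speed v, with b = 2 and v = 3 chosen arbitrarily, and with ordinary time t used in place of s (at constant speed the two differ only by a constant factor). The function name `polar_rates` is hypothetical.

```python
import math

def polar_rates(t, b=2.0, v=3.0):
    """Polar velocity components for a particle on the straight line y = b,
    moving in the +x direction at constant speed v (closest approach at t = 0).

    Returns (r, dr/dt, r * dtheta/dt).
    """
    x, y = v * t, b
    r = math.hypot(x, y)
    radial = x * v / r          # dr/dt
    tangential = -y * v / r     # r * dtheta/dt
    return r, radial, tangential

for t in (-1000.0, -1.0, 0.0, 1.0, 1000.0):
    r, radial, tangential = polar_rates(t)
    # r * tangential is r^2 * dtheta/dt, the angular momentum per unit mass.
    print(f"t={t:8.1f}  dr/dt={radial:+.4f}  r*dtheta/dt={tangential:+.4f}  "
          f"r^2*dtheta/dt={r * tangential:+.4f}")
```

Far from the origin the radial velocity is essentially the whole speed, at closest approach it is zero and the tangential velocity has taken up the slack, and $r^2\,d\theta/dt$ stays at the same value (minus b times v) the whole way, with no force anywhere in sight.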
I found this treatment of general relativity in a library book in 1988.
Unfortunately, for some unknown reason, I didn't include the book's title or
the name of its author in this write-up. Years later, when I went back to the
library to find it, I was unable to locate it. (It may have been out on loan.)
I wrote this derivation in 1988 in an effort to ensure that I understood the
derivation I had found.
The value of this cut-to-the-chase approach to general
relativity is that it arrives at the Schwarzschild solution to two-body central
force motion without requiring the machinery of tensor calculus. The author
moves directly to the solution.
The author begins by observing that the most general form for a measure of
distance ds would probably be something like:
$ds = f(x_1, x_2, \ldots, x_n, dx_1, dx_2, \ldots, dx_n)$,
where n is the number of dimensions with which we're working.
But in the real world (as opposed to pure mathematics), the actual length of a displacement
can't change if we change dimensional units. If we change our units of
measurement from meters to centimeters, then when we multiply dx, dy, dz
by 100 to change them to centimeters, ds will also have to increase by a
factor of 100 to keep the actual distance ds unaltered. Saying this
quantitatively, if
$dx_1' = \lambda\,dx_1$, $dx_2' = \lambda\,dx_2$, ..., $dx_n' = \lambda\,dx_n$,
and $ds' = \lambda\,ds$,
then $\lambda\,ds = f(x_1, x_2, \ldots, x_n, \lambda\,dx_1, \lambda\,dx_2, \ldots, \lambda\,dx_n)$.
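But $\lambda$ can be factored out if and only if ds has the form of the m'th root of a
sum of products of the n dx's ($dx_1, dx_2, \ldots, dx_n$), taken m at a time and
multiplied by coefficients $g_{ij\ldots k}(x_1, x_2, \ldots, x_n)$ that are functions of the
variables $x_1, x_2, \ldots, x_n$; that is, only if $ds^m$ is a homogeneous function of
degree m in the dx's, so that ds itself picks up a single factor of $\lambda$. To
say it with an equation,
(2) $ds^m = \sum g_{ij\ldots k}(x_1, x_2, \ldots, x_n)\,dx_i\,dx_j \cdots dx_k$,
where each coefficient carries m subscripts and each term contains m factors of dx.
For example, if m = 4 and n = 4, there would be $4^4$ or 256 terms in this
summation.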
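Here is a small Python sketch of both points, the scaling and the term count. It assumes the familiar m = 2 case with a made-up 2 X 2 set of coefficients; the numbers in `g` and `dx` and the names `ds_quadratic` and `term_count` are purely illustrative.

```python
import math
from itertools import product

def ds_quadratic(g, dx):
    """ds for the m = 2 case: the square root of the sum of g[i][j]*dx[i]*dx[j]."""
    n = len(dx)
    total = sum(g[i][j] * dx[i] * dx[j] for i, j in product(range(n), repeat=2))
    return math.sqrt(total)

def term_count(n, m):
    """Number of terms when n differentials are multiplied together m at a time."""
    return n ** m

g = [[1.0, 0.2],
     [0.2, 2.0]]          # arbitrary illustrative coefficients
dx = [0.3, 0.4]
lam = 100.0               # for example, meters to centimeters

unscaled = ds_quadratic(g, dx)
scaled = ds_quadratic(g, [lam * d for d in dx])
print(scaled / unscaled)                   # ratio of the two lengths
print(term_count(4, 4), term_count(4, 2))  # terms for m = 4 and for m = 2
```

Scaling every dx by 100 scales ds by 100, as required, and the term counts for four dimensions are 256 for m = 4 and 16 for m = 2.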
Now at this point, you might well ask, "How come we didn't
put $\lambda$ in front of the x's in $f(x_1, x_2, \ldots, x_n, \lambda\,dx_1, \lambda\,dx_2, \ldots, \lambda\,dx_n)$?" I
don't know. However, I can tell you that the g coefficients in Equation (2) are
dimensionless. The reason they're dimensionless is that they contain scale
factors that cancel out the units of distance that go with the x's.
For example, the $g_{rr}$
term that gives the spatial compression in the neighborhood of a star is given
by $g_{rr} = 1/(1 - r_s/r)$, where $r_s$ is the
Schwarzschild radius of the star. The ratio $r_s/r$ is a pure number, so $g_{rr}$
has no units. The Schwarzschild radius is the radius the
star would have if it were to become a black hole. To say it another
way, the Schwarzschild radius is the radius at which a star with a given mass
would have an escape speed equal to the speed of light.
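To put a number on that, here is a minimal Python sketch. It assumes the standard formula $r_s = 2GM/c^2$ and the Newtonian escape-speed formula $\sqrt{2GM/r}$ (which happens to give the same radius), with rounded textbook values for G, c, and the mass of the Sun. The function names are my own.

```python
import math

G = 6.674e-11      # gravitational constant, m^3 kg^-1 s^-2 (rounded)
C = 2.998e8        # speed of light, m/s (rounded)
M_SUN = 1.989e30   # mass of the Sun, kg (rounded)

def schwarzschild_radius(mass_kg):
    """Radius r_s = 2GM/c^2, in meters."""
    return 2.0 * G * mass_kg / C**2

def newtonian_escape_speed(mass_kg, radius_m):
    """Newtonian escape speed sqrt(2GM/r), in m/s."""
    return math.sqrt(2.0 * G * mass_kg / radius_m)

r_s = schwarzschild_radius(M_SUN)
print(f"Schwarzschild radius of the Sun: {r_s / 1000:.2f} km")
print(f"escape speed at that radius / c: {newtonian_escape_speed(M_SUN, r_s) / C:.6f}")
```

For the Sun this comes out a little under 3 km, which is why the distortions are so hard to notice anywhere outside the neighborhood of a very compact star.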
You might also ask, "Aren't measures of distance
based upon the Pythagorean Theorem? Shouldn't it be,
$ds^2 = \sum g_{ij}\,dx_i\,dx_j$?
Furthermore, we're dealing with four dimensions: x, y, z,
and ct. So shouldn't we have 16 terms?"
To the best of my knowledge, that's correct. I
kept things in this more general form because that's the way they were
presented in the book, but in fact, we do come right down to,
(3)
$$
\begin{aligned}
ds^2 ={}& g_{xx}\,dx^2 + g_{xy}\,dx\,dy + g_{xz}\,dx\,dz + g_{x(ct)}\,dx\,d(ct) \\
&+ g_{yx}\,dy\,dx + g_{yy}\,dy^2 + g_{yz}\,dy\,dz + g_{y(ct)}\,dy\,d(ct) \\
&+ g_{zx}\,dz\,dx + g_{zy}\,dz\,dy + g_{zz}\,dz^2 + g_{z(ct)}\,dz\,d(ct) \\
&+ g_{(ct)x}\,d(ct)\,dx + g_{(ct)y}\,d(ct)\,dy + g_{(ct)z}\,d(ct)\,dz + g_{(ct)(ct)}\,d(ct)^2,
\end{aligned}
$$
where each of the 16 g's is a function of (x, y, z, ct).
As you can see, the terms in Equation (3) are organized like a 4 X 4 matrix.
The expression for $ds^2$ isn't a matrix, since it's the sum of 16 terms (only 10
of which will turn out to be independent). However, the 16 coefficients can be
collected into a 4 X 4 matrix whose elements are the above g's, and to get
$ds^2$ for a small displacement at a given point in our curved space-time,
we need only multiply that matrix on both sides by the displacement
(dx, dy, dz, d(ct)).
This 4 X 4 matrix, which converts coordinate displacements at a location in a
gravitational field into distances, is called the metric tensor. Why
is it called the metric tensor instead of the metric matrix? I don't know. A 4 X 4
matrix is one way of writing out the components of a tensor of rank 2.
Next, we note that $dx_i\,dx_j = dx_j\,dx_i$, i.e., they're always
commutative. But that means that we can take $g_{ij} = g_{ji}$. (I noticed in the book by the Liebers that this is
true for all metrics: the $dx_i\,dx_j$'s are always commutative and all the tensor elements that have the
same set of subscripts i, j, ... have the same value, independent of the
order in which the subscripts appear. The proof for this is beyond
the scope of this paper. A proof is given on page 240 of the book, "The
Einstein Theory of Relativity", by Lillian R. Lieber and Hugh Gray
Lieber, Rinehart Publishing, 1936. It involves showing that the Christoffel
symbols {ετ, α} and {τε, α} are equal, and that a further relation holds
because, as mentioned above, {ετ, α} = {τε, α}.)
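The symmetry of the g's is what cuts the 16 coefficients down to the 10 independent ones mentioned earlier. A short Python sketch of the count follows; the function name `independent_coefficients` is hypothetical, and the only idea used is that two coefficients count as the same when their subscripts differ only in order.

```python
from itertools import combinations_with_replacement
from math import comb

def independent_coefficients(n, m):
    """Count the sets of m subscripts drawn from n coordinates, ignoring order.

    This is the number of coefficients left once every g with the same
    subscripts in a different order is required to have the same value.
    """
    return sum(1 for _ in combinations_with_replacement(range(n), m))

print(independent_coefficients(4, 2))   # the usual ds^2 in four dimensions
print(independent_coefficients(4, 4))   # the m = 4 example with 256 terms
print(comb(4 + 2 - 1, 2))               # closed form: (n + m - 1) choose m
```

For four dimensions and m = 2 the count is 10 (4 diagonal terms plus 6 distinct cross terms), and for the m = 4 example the 256 terms collapse to 35.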
Now consider
the case of a spherically symmetric body.... a star, for instance. Switching to
spherical coordinates, we get:
$ds^2 = \sum g_{ij}(r, \theta, \varphi, ct)\,dx_i\,dx_j$
But now, because there's spherical symmetry, the g's can't vary with $\theta$ or $\varphi$ but only with r and t. Then
if we say that the gravitational field is time-independent...that is, if its
mass and therefore its gravitational field isn't changing as an explicit
function of time...the g's can’t vary with time, either, so they can depend
only upon r. Thus,
$ds^2 = \sum g_{ij}(r)\,dx_i\,dx_j = g_{11}(r)\,dr^2 + g_{22}(r)\,r^2\,d\theta^2 + g_{33}(r)\,r^2\sin^2\theta\,d\varphi^2 - g_{44}(r)\,d(ct)^2 + g_{12}(r)\,r\,dr\,d\theta + g_{13}(r)\,r\sin\theta\,dr\,d\varphi + \ldots$
(The g's are
functions of r alone only when we use the usual spherical coordinate
differentials, dr, $r\,d\theta$, $r\sin\theta\,d\varphi$, and d(ct). Otherwise, the g's
that go with terms having a $\varphi$ in them have to
contain a $\sin\theta$ factor.)
At this point, all the books on general relativity I've read make the
assumption that space-time is isotropic - that things look the same in all
directions. If we make this assumption, then ds must remain unaltered if we
replace $d\varphi$ by $-d\varphi$ or $d\theta$ by $-d\theta$. But that can be true only if, for example, $g_{r\theta}\,dr\,d\theta = -g_{r\theta}\,dr\,d\theta$...that is, if all the
coefficients of the cross products are zero. The result is that all the
off-diagonal terms in the "metric matrix" are zero. The Liebers discuss
this on page 235 of their book (cited above). They say,
"Well, obviously, a term like $dr\,d\theta$ (or $d\theta\,d\varphi$ or $dr\,d\varphi$) would be different
for $\theta$ (or $\varphi$ or r) positive or negative, and, consequently, the expression for ds
would be different if we turn in opposite directions---which would contradict
the experimental evidence that the universe is isotropic. And of course, the
use of the same expression for ds from any point reflects the idea of
homogeneity. And so we see that it is reasonable to have in (61b) only terms
involving $d\theta^2$, $d\varphi^2$, $dr^2$, in which it makes no difference whether we substitute
$+d\theta$ or $-d\theta$, etc.
"Similarly, since in getting a measure for ds, we are
considering a static condition, and not one that is changing over time, we must
therefore not include terms that will have different values for +dt and -dt; in
other words we must not include product terms like dr dt, etc."
Finally, we end up with the expression
(4) $ds^2 = g_{11}(r)\,dr^2 + g_{22}(r)\,r^2\,d\theta^2 + g_{33}(r)\,r^2\sin^2\theta\,d\varphi^2 - g_{44}(r)\,d(ct)^2$,
all more or less in
one fell swoop, and without the use of tensors.
Now all we have to
do is to determine $g_{11}(r)$,
$g_{22}(r)$, $g_{33}(r)$, and $g_{44}(r)$. $g_{22}(r)$ and $g_{33}(r)$ can be set equal to each
other and can be absorbed into r, because $g_{11}(r)$ can be chosen so that it incorporates the
effects of the other two. The justification for taking this step is supposed to
be listed on page 239 of R. C. Tolman's book "Relativity, Thermodynamics
and Cosmology". But we pay a price for this maneuver, and that is that r
is no longer the distance from the origin to a spherical shell of radius r. I
presume that's because r shrinks as we get close to the star, so that
the r we have to use is some integral of dr', integrated
from ∞ to r. Incidentally, what does it really mean when we say that r
shrinks when we get close to a star? The answer is that the shrunken r we're
talking about is the r that we see when we're looking on from outside the
neighborhood of the star. To a person living on the star, things seem perfectly
normal. All the covariant "laws" of physics are adjusted so that the
star-dweller can't perceive any difference in his environment except that
things outside the neighborhood of the star look farther away than they really
are...like looking through the wrong end of a telescope. Also, his clocks
appear to us to run slower than our clocks and our clocks appear to him to be
running faster than his clocks...i.e., we both agree that his clocks are
running slower than ours. Note that this is different from the situation in
special relativity where A's clock appears to be running slower than B's, and
B's clock appears to be running slower than A's and either point of view is
equally valid. That's because, in special relativity, we're dealing with
"hyperbolic rotations", and the apparent distortions of the measuring
rods and the clocks actually stem from the fact that we're projecting their
space and time coordinates onto ours and projecting our space and time
coordinates onto theirs, and looking at, so to speak, perspective effects. But
with general relativity, it's like the "twins paradox": there really
is an asymmetry in the situation.
Does the term
"general relativity" refer to the fact that we observers sitting
outside the distorted space-time around the star may ourselves be (and probably
are) in a distorted space-time? If so...if I'm interpreting this situation
correctly (and I may not be)...we have no way of knowing it by looking at the
local neighborhood around us or by running local physical experiments any more
than the star-dweller can tell that he's in a distorted space-time without
being able to look outside it. In that sense, there's no way of ever knowing
whether we're in the preferred coordinate system. Also, we can't tell
from the laws of physics that we're in an accelerated frame because the laws of
physics work the same for us as they do for someone in flat space-time. (The
reason I'm not sure about what I'm saying is that the Riemann curvature tensor
provides a mechanism for determining whether a space-time is curved or flat, so
I may be speaking out of school. I'll have to check on that.) So that's what's
"relative" about "general relativity".
Anyway, the "r"
that's used in the equation for the metric is defined as the radius of
curvature of the spherical shell at "radius" r.
Determining $g_{11}(r)$ and $g_{44}(r)$ takes some arithmetic
but it's not too bad. I'm going to jump ahead and say that the answer...the
Schwarzschild metric...turns out to be fairly simple. It is:
$$ds^2 = \left(1 - \frac{r_S}{r}\right)c^2\,dt^2 - \frac{dr^2}{1 - r_S/r} - r^2\,d\theta^2 - r^2\sin^2\theta\,d\varphi^2,$$
where $r_S$ is the Schwarzschild radius,
typically of the order of a few kilometers for a star, and is the radius at
which a star undergoes gravitational collapse and becomes a black hole.
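The earlier remark that the star-dweller's clocks run slow can be given a number with a short Python sketch. It assumes the standard Schwarzschild time coefficient $g_{44}(r) = 1 - r_S/r$, so that a clock at rest at radius r ticks at $\sqrt{1 - r_S/r}$ times the rate of a clock far away. The constants are rounded textbook values for the Sun, and the function name `clock_rate` is my own.

```python
import math

G = 6.674e-11      # gravitational constant, m^3 kg^-1 s^-2 (rounded)
C = 2.998e8        # speed of light, m/s (rounded)
M_SUN = 1.989e30   # mass of the Sun, kg (rounded)
R_SUN = 6.957e8    # radius of the Sun, m (rounded)

def clock_rate(radius_m, mass_kg):
    """Tick rate of a clock at rest at radius r, relative to a distant clock.

    Uses sqrt(g44) with g44 = 1 - r_S / r, valid only outside r_S.
    """
    r_s = 2.0 * G * mass_kg / C**2
    if radius_m <= r_s:
        raise ValueError("a clock can sit at rest only outside the Schwarzschild radius")
    return math.sqrt(1.0 - r_s / radius_m)

rate = clock_rate(R_SUN, M_SUN)
print(f"clock rate at the Sun's surface: {rate:.9f}")
print(f"fractional slowdown: {1.0 - rate:.3e}")
```

At the surface of the Sun the slowdown is about two parts in a million, or roughly a minute per year, and it grows without limit as r approaches $r_S$.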
Now for the
derivation.
We can derive the
equations for the geodesic curves---the curves which minimize the integrals of
the distances ds in the warped space in the neighborhood of the star---with the
calculus of variations. If we label each point along the path by a
parameter $\zeta$, then
$dr/d\zeta = r'$, $d\theta/d\zeta = \theta'$, $d\varphi/d\zeta = \varphi'$, $d(ct)/d\zeta = ct'$.
Normally, you'd use
t or s as the parameter to measure positions along the path, but in
this case, ds is the quantity being minimized and t is one of the dependent
variables, so some other symbol has to be used. But not to worry. This
symbol $\zeta$ drops out of the picture almost immediately.
Next, we'll take
the customary step in dealing with central-force motion of saying that we can
orient our coordinate system any way we please, so we might as well orient it
so that $\varphi = 0$ all along the path.
That way, the $d\varphi^2$ term drops out. All the motion will occur in one
plane in general relativity, as it does in classical mechanics, because with
radial symmetry nature has no reason to prefer drifting out of the plane of
orbital motion in one direction rather than the other. Then our distance to be
minimized becomes:
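(6) $\delta\int ds = \delta\int \left[g_{44}(r)\,(ct')^2 - g_{11}(r)\,r'^2 - r^2\theta'^2\right]^{1/2} d\zeta = \delta\int f\,d\zeta = 0$.
The primes here represent derivatives (slopes) with respect to $\zeta$, where $\zeta$ measures position along the
path. ($\zeta$ plays the role usually played by x or t, since in this situation, x and
t are dependent variables.)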
Equation (6) is exactly the same as the equation
we
got for finding the shortest distance between two points in empty 4-space when
using radial coordinates, except for the presence of the $g_{11}(r)$ and the $g_{44}(r)$ terms, which are all
that remains that we can adjust in the metric to allow for gravitational
distortions. Everything else has been ruled out on the basis of symmetry
conditions, the assumption that space-time isn't changing with time, and so
forth. If we set the g's equal to one, which we have to do in Euclidean space
far from a star, then we'll get exactly the same results we did earlier. So
far, we haven't put anything into the mathematics about gravitation...that is,
about the fact that mass distorts space-time. (It's worthwhile noting the
similarities and differences between the Lagrangian for inverse square law
motion and the ds above. The first term in Equation (6) is going to wind up
acting like a gravitational force term.) The last two terms are like the
velocity terms in the relativistic Lagrangian. The Euler-Lagrange equations are
the same as they were before:
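$$\frac{\partial f}{\partial r} - \frac{d}{d\zeta}\frac{\partial f}{\partial r'} = 0$$
$$\frac{\partial f}{\partial \theta} - \frac{d}{d\zeta}\frac{\partial f}{\partial \theta'} = 0$$
$$\frac{\partial f}{\partial (ct)} - \frac{d}{d\zeta}\frac{\partial f}{\partial (ct')} = 0$$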
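Since the metric
function in Equation (6) has no $\theta$ or ct terms, we know immediately that
$\partial f/\partial\theta$ and $\partial f/\partial(ct)$ = 0, and therefore,
$\frac{d}{d\zeta}\frac{\partial f}{\partial\theta'}$ and $\frac{d}{d\zeta}\frac{\partial f}{\partial(ct')}$ = 0, and
$\partial f/\partial\theta'$ and $\partial f/\partial(ct')$ =
constants $k_1$ and $k_4$, respectively.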
However, this time,
because of the presence of $g_{11}(r)$ and $g_{44}(r)$, which are functions of r, we won't assume that
$\partial f/\partial r = 0$,
since it isn't.
We'll skip trying to work with the first equation for r, since the g's
are unknown functions of r. (Actually, we know what they are but we
can't prove it yet.) Turning to the second equation, since $\partial f/\partial\theta = 0$,
$\partial f/\partial\theta' = -r^2\theta'/f = -r^2\,d\theta/ds$ = constant,
so that the angular momentum, $J_\theta = r^2\,d\theta/ds$, is constant ("J" is a symbol
that's frequently used to designate angular momentum),
the same as it was
for special relativity. Hey! We're wheeling and dealing in general relativity
as though it were classical mechanics!
Having had such
smashing success with our first attempt, let's try the last equation. It's also
the same as before except that this time, we're carrying along a g44(r) term.
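$$\frac{\partial f}{\partial (ct')} = \frac{g_{44}(r)\,ct'}{f} = g_{44}(r)\,\frac{d(ct)}{ds} = \text{constant}$$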
This looks formidable but it's not. g44(r) is only
![]()
and $d(ct)/ds = 1/(ds/d(ct))$
is the reciprocal of the speed of the particle along its trajectory, s. At
large values of r, where the conventional $ds^2 = c^2\,dt^2 - dr^2 - r^2\,d\theta^2$ Minkowski metric is
applicable:
![]()
and in close to a
star, the exact metric is given by
![]()
The term has been carried along in order to derive the full equations of
motion. However, since the g's are only associated with the dr and dt terms, it
will be simpler to evaluate the g's using a simple radial free-fall problem
with the angular momentum set to zero. Now the expression for the metric can be
written:


Looking
just at the time term,
and 




If we define
to
be
,
then the above equation can be written,
![]()


![]()
![]()
![]()
![]()
![]()
![]()
![]()
![]()



$\delta\int ds = \delta\int \left[g_{44}(r)\,(ct')^2 - g_{11}(r)\,r'^2 - r^2\theta'^2\right]^{1/2} d\zeta = \delta\int f\,d\zeta = 0$