A Simplified Introduction to General Relativity
Before launching into general relativity, I thought I'd review a few basic derivations in differential geometry. The simplest possible case of interest would probably be:
s^{2} = x^{2} + y^{2}
or, over small distances,
ds^{2} = dx^{2} + dy^{2}.
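As a review exercise, we could use the calculus of variations to derive the equation for the geodesic curve: i.e., the shortest path between two points through the Euclidean space defined by this Pythagorean-theorem distance formula. (It had better come out to be a straight line!) Labeling each point along the path with a parameter ξ, and writing x' = dx/dξ and y' = dy/dξ, the distance to be minimized is:
s = \int ds = \int \sqrt{x'^{2} + y'^{2}} \, d\xi
and the standard variational integral is:
\delta \int f(x, y, x', y') \, d\xi = 0, where f = \sqrt{x'^{2} + y'^{2}}.
Then the standard, grind-the-crank Euler-Lagrange equations are:
\frac{d}{d\xi} \frac{\partial f}{\partial x'} - \frac{\partial f}{\partial x} = 0 and \frac{d}{d\xi} \frac{\partial f}{\partial y'} - \frac{\partial f}{\partial y} = 0.
But inasmuch as f isn't an explicit function of either x or y,
\frac{\partial f}{\partial x} = 0
and
\frac{\partial f}{\partial y} = 0,
which means that
\frac{d}{d\xi} \frac{\partial f}{\partial x'} = 0;
\frac{d}{d\xi} \frac{\partial f}{\partial y'} = 0;
so that
\frac{\partial f}{\partial x'} = constant;
\frac{\partial f}{\partial y'} = constant.
Evaluating \partial f/\partial x' and \partial f/\partial y', and remembering that ds = f dξ:
\frac{x'}{f} = \frac{dx}{ds} = constant
and
\frac{y'}{f} = \frac{dy}{ds} = constant.
What these equations are saying in their cumbersome way is that the line which minimizes the distance between any two points on a plane makes a constant angle with the horizontal and the vertical (dx/ds and dy/ds are the cosines of those angles)...that is, that it's a straight line.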
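The straight-line result can also be checked numerically. The sketch below is only an illustration, not part of the original derivation: it adds a sine-shaped bump (an arbitrary choice, as are the endpoints (0, 0) and (3, 4)) to the straight path and sums the segment lengths, a discrete stand-in for the integral of ds.

```python
import math

def path_length(points):
    """Sum of straight segment lengths: a discrete version of the integral of ds."""
    return sum(math.dist(p, q) for p, q in zip(points, points[1:]))

def make_path(amplitude, n=2000):
    """Path from (0, 0) to (3, 4): the straight line plus a sine bump.

    The bump vanishes at both endpoints, so every path joins the same two points.
    """
    pts = []
    for k in range(n + 1):
        u = k / n                      # parameter running from 0 to 1
        x = 3.0 * u
        y = 4.0 * u + amplitude * math.sin(math.pi * u)
        pts.append((x, y))
    return pts

straight = path_length(make_path(0.0))
bumped = [path_length(make_path(a)) for a in (-0.5, -0.1, 0.1, 0.5)]

print(straight)   # very close to 5, the Pythagorean distance
print(bumped)     # each of these is longer than the straight path
```

Any other family of perturbations that vanishes at the endpoints should show the same thing: the straight path is the shortest member of the family.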
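Well, okay, that checked out...like using a cannon to shoot a squirrel, but it worked. Now for something more ambitious, like the shortest distance between two points in spacetime. With primes denoting derivatives with respect to a parameter ξ along the path, the equation for the shortest distance between two points (the geodesic curve) becomes:
\delta \int ds = \delta \int \sqrt{(ct')^{2} - x'^{2} - y'^{2}} \, d\xi = \delta \int f \, d\xi = 0
Then the Euler-Lagrange equations become:
\frac{d}{d\xi} \frac{\partial f}{\partial x'} - \frac{\partial f}{\partial x} = 0, \quad \frac{d}{d\xi} \frac{\partial f}{\partial y'} - \frac{\partial f}{\partial y} = 0, \quad \frac{d}{d\xi} \frac{\partial f}{\partial (ct')} - \frac{\partial f}{\partial (ct)} = 0
and as before, since f isn't an explicit function of x, y, or ct,
\frac{\partial f}{\partial x} = \frac{\partial f}{\partial y} = \frac{\partial f}{\partial (ct)} = 0
and
\frac{\partial f}{\partial x'} = -\frac{x'}{f} = -\frac{dx}{ds} = constant;
\frac{\partial f}{\partial y'} = -\frac{y'}{f} = -\frac{dy}{ds} = constant;
\frac{\partial f}{\partial (ct')} = \frac{ct'}{f} = \frac{d(ct)}{ds} = constant.
But
\frac{ds}{d(ct)} = \sqrt{1 - v^{2}/c^{2}}, so that \frac{d(ct)}{ds} = \frac{1}{\sqrt{1 - v^{2}/c^{2}}} = constant.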
And what's this briar patch of equations trying to say? Well, the last equation says that along the paths of shortest distance in spacetime, the total speed, v, (which, divided by c, is the slope of the spatial distance \sqrt{x^{2} + y^{2}} with respect to distance, ct, along the time axis), is constant. This means:
(1) that the paths of shortest distance have constant slopes...that is, are straight lines;
(2) that, in the language of physics, the speeds of particles (or planets), which move along paths of shortest distance (geodesics) in ordinary Euclidean spacetime, are constant. In other words, the last equation says that energy is a constant of the motion. It says that particles move at constant speeds in the absence of force fields (which we're about to perceive to be spacetime vortices: regions in which spacetime becomes warped and non-Euclidean) not because of some principle of conservation of energy but because particles follow geodesics in Euclidean spacetime and the shortest distance between two points in Euclidean spacetime is a straight line of constant slope, which is to say: of constant speed. Impressive, huh?
The first two equations say that the x and y components of the velocity will remain constant along straight lines in spacetime, and are the basis for conservation of momentum. In effect, we've derived the principle of conservation of energy (or at least of kinetic energy), and Newton's first law of motion, not as postulates but as results.
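Here is a small numerical sketch of the same idea, assuming the sign convention ds^{2} = d(ct)^{2} - dx^{2} - dy^{2} and two arbitrarily chosen events. One caveat worth knowing: with this sign convention, the straight worldline between two events gives the largest interval rather than the smallest. It is still the stationary path that the Euler-Lagrange equations pick out, and along it d(ct)/ds is the constant 1/\sqrt{1 - v^{2}/c^{2}}.

```python
import math

def interval(events):
    """Sum of sqrt(d(ct)^2 - dx^2 - dy^2) over the segments of a worldline.

    Each event is a tuple (ct, x, y); every segment is assumed to be timelike.
    """
    total = 0.0
    for (ct0, x0, y0), (ct1, x1, y1) in zip(events, events[1:]):
        total += math.sqrt((ct1 - ct0) ** 2 - (x1 - x0) ** 2 - (y1 - y0) ** 2)
    return total

def worldline(amplitude, n=2000):
    """Worldline from (ct, x, y) = (0, 0, 0) to (10, 3, 4) with a sideways wobble in x."""
    events = []
    for k in range(n + 1):
        u = k / n
        wobble = amplitude * math.sin(math.pi * u)
        events.append((10.0 * u, 3.0 * u + wobble, 4.0 * u))
    return events

straight = interval(worldline(0.0))
wobbled = [interval(worldline(a)) for a in (-0.5, -0.1, 0.1, 0.5)]

v_over_c = math.hypot(3.0, 4.0) / 10.0     # constant speed on the straight path
gamma = 10.0 / straight                    # total d(ct) divided by total ds

print(straight, math.sqrt(75.0))           # agree up to rounding
print(wobbled)                             # each is smaller than the straight value
print(gamma, 1.0 / math.sqrt(1.0 - v_over_c ** 2))
```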
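Now for the same thing in polar coordinates. Our distance to be minimized is:
s = \int ds = \int \sqrt{(ct')^{2} - r'^{2} - r^{2}\theta'^{2}} \, d\xi = \int f \, d\xi
and the Euler-Lagrange equations are:
\frac{d}{d\xi} \frac{\partial f}{\partial r'} - \frac{\partial f}{\partial r} = 0, \quad \frac{d}{d\xi} \frac{\partial f}{\partial \theta'} - \frac{\partial f}{\partial \theta} = 0, \quad \frac{d}{d\xi} \frac{\partial f}{\partial (ct')} - \frac{\partial f}{\partial (ct)} = 0
and we go through the usual hocus-pocus to arrive at the following results, the last two of which are quantities that are constant along the paths of least distance in undistorted spacetime:
\frac{d}{ds} \frac{dr}{ds} = r \left( \frac{d\theta}{ds} \right)^{2}
r^{2} \frac{d\theta}{ds} = constant
\frac{d(ct)}{ds} = constant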
The last equation gives the same result as before, but to me, the interesting equation is the second one. Multiply it by m_{0} and it becomes the relativistic angular momentum. It states that angular momentum will be conserved. I had never realized before that conservation of angular momentum was a consequence of the coordinate system we select and has nothing to do with central force fields, but I can see now that that's the case. It makes sense once you think about it. In the absence of any forces, the total speed of a particle moving along a straight line isn't going to change. If we put a point of origin in there and start to measure dr/dt's and dθ/dt's, then at infinity, the radial velocity will be the total speed, since there won't be any tangential component. The radial velocity has to go to zero at the point of closest approach, so the tangential velocity has to take up the slack until, at the point of closest approach, it's equal to the total speed.
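This bookkeeping is easy to try out. The sketch below is a non-relativistic illustration with made-up numbers: a particle moves at constant speed v along the line x = b, and we track its radial velocity, its tangential velocity, and r^{2} dθ/dt as seen from the origin. (Since the speed is constant, ds is proportional to dt, so r^{2} dθ/dt and r^{2} dθ/ds differ only by a constant factor.)

```python
import math

v = 2.0      # constant speed along the line (arbitrary units, an assumed value)
b = 3.0      # distance of closest approach to the origin (an assumed value)

def polar_state(t):
    """Return (dr/dt, r*dtheta/dt, r^2*dtheta/dt) for the straight path x = b, y = v*t."""
    x, y = b, v * t
    vx, vy = 0.0, v
    r = math.hypot(x, y)
    radial = (x * vx + y * vy) / r          # dr/dt
    tangential = (x * vy - y * vx) / r      # r * dtheta/dt
    return radial, tangential, r * tangential

times = (-1000.0, -1.0, 0.0, 1.0, 1000.0)
rows = [polar_state(t) for t in times]
for t, (radial, tangential, r2_theta_dot) in zip(times, rows):
    print(t, round(radial, 4), round(tangential, 4), round(r2_theta_dot, 4))
```

The last column stays at b*v for every t, the radial velocity is zero at closest approach (t = 0) where the tangential velocity equals v, and far away the roles are reversed.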
I found this treatment of general relativity in a library book in 1988.
Unfortunately, for some unknown reason, I didn't include the book's title or
the name of its author in this writeup. Years later, when I went back to the
library to find it, I was unable to locate it. (It may have been out on loan.)
I wrote this derivation in 1988 in an effort to ensure that I understood the
derivation I had found.
The value of this cut-to-the-chase approach to general relativity is that it arrives at the Schwarzschild solution to two-body central-force motion without requiring the machinery of tensor calculus. The author moves directly to the solution.
The author begins by observing that the most general form for a measure of distance ds would probably be something like:
ds = f(x_{1}, x_{2}, ..., x_{n}, dx_{1}, dx_{2}, ..., dx_{n}),
where n is the number of dimensions with which we're working.
But in the real world (as opposed to pure mathematics), the actual length of a displacement can't change if we change dimensional units. If we change our units of measurement from meters to centimeters, then when we multiply dx, dy, dz by 100 to change them to centimeters, ds will also have to increase by a factor of 100 to keep the actual distance ds unaltered. Saying this quantitatively, if
dx_{1}' = \lambda dx_{1},
dx_{2}' = \lambda dx_{2},
......,
dx_{n}' = \lambda dx_{n},
and ds' = \lambda ds,
then \lambda ds = f(x_{1}, x_{2}, ..., x_{n}, \lambda dx_{1}, \lambda dx_{2}, ..., \lambda dx_{n}).
But λ can be factored out if ds has the form of the m'th root of a sum of products of the n dx's (dx_{1}, dx_{2}, ..., dx_{n}) taken m at a time, each product multiplied by a coefficient g_{ij...}(x_{1}, x_{2}, ..., x_{n}) that is a function of the variables x_{1}, x_{2}, ..., x_{n}...that is, if ds^{m} is a homogeneous function of degree m in the dx's. To say it with an equation,
(2) ds^{m} = \sum_{i,j,\ldots} g_{ij\ldots}(x_{1}, x_{2}, \ldots, x_{n}) \, dx_{i} \, dx_{j} \cdots
where each term carries m indices and m factors of dx. For example, if m = 4 and n = 4, there would be 4^{4} or 256 terms in this summation for ds^{4}.
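To make the scaling argument concrete, here is a minimal sketch of a distance measure of this form with m = 4 and n = 4. The coefficients are hypothetical (1 when all four indices coincide, 0 otherwise), chosen only so the sum is easy to check by hand; the point is that multiplying every dx by λ multiplies ds by λ, and that the sum runs over n^{m} index combinations.

```python
from itertools import product

def ds(g, dx, m):
    """m'th root of the sum over all index tuples of g[i, j, ...] * dx_i * dx_j * ...

    g maps each m-tuple of indices to a coefficient; dx is the list of displacements.
    """
    n = len(dx)
    total = 0.0
    for idx in product(range(n), repeat=m):
        term = g[idx]
        for i in idx:
            term *= dx[i]
        total += term
    return total ** (1.0 / m)

n, m = 4, 4
# Hypothetical coefficients: 1 when all indices coincide, 0 otherwise,
# so that ds^4 = dx1^4 + dx2^4 + dx3^4 + dx4^4.
g = {idx: (1.0 if len(set(idx)) == 1 else 0.0)
     for idx in product(range(n), repeat=m)}

dx = [0.3, 0.1, 0.2, 0.4]
lam = 100.0                                   # e.g. meters to centimeters

n_terms = len(g)
scaled = ds(g, [lam * d for d in dx], m)
unscaled = ds(g, dx, m)

print(n_terms)                  # 256 index combinations for n = m = 4
print(scaled, lam * unscaled)   # the two agree up to rounding
```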
Now at this point, you might well ask, "How come we didn't put λ in front of the x's in f(x_{1}, x_{2}, ..., x_{n}, λ dx_{1}, λ dx_{2}, ..., λ dx_{n})?" I don't know. However, I can tell you that the g coefficients in Equation (2) are dimensionless. The reason they're dimensionless is that they contain scale factors that cancel out the units of distance that go with the x's.
For example, the g_{rr} term that gives the spatial compression in the neighborhood of a star is given by g_{rr} = 1/(1 - r_{s}/r), where r_{s} is the Schwarzschild radius of the star. The Schwarzschild radius is the radius that the star would have if it were to become a black hole. To say it another way, the Schwarzschild radius is the radius at which a star with a given mass would have an escape speed equal to the speed of light.
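As a rough numerical illustration, setting the Newtonian escape speed \sqrt{2GM/r} equal to c gives r_{s} = 2GM/c^{2}. That formula is the standard result and is not spelled out in the text above, and the constants below are rounded textbook values, so treat the output as approximate.

```python
import math

G = 6.674e-11        # gravitational constant, m^3 kg^-1 s^-2 (rounded)
C = 2.998e8          # speed of light, m/s (rounded)
M_SUN = 1.989e30     # mass of the Sun, kg (rounded)

def schwarzschild_radius(mass_kg):
    """Radius at which the Newtonian escape speed sqrt(2GM/r) equals c."""
    return 2.0 * G * mass_kg / C ** 2

def escape_speed(mass_kg, radius_m):
    """Newtonian escape speed from a distance radius_m from the center."""
    return math.sqrt(2.0 * G * mass_kg / radius_m)

r_s = schwarzschild_radius(M_SUN)
ratio = escape_speed(M_SUN, r_s) / C

print(r_s / 1000.0)   # roughly 3 km for the Sun
print(ratio)          # 1 up to rounding
```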
You might also ask, "Aren't measures of distance based upon the Pythagorean Theorem? Shouldn't it be
ds^{2} = \sum_{i,j} g_{ij} \, dx_{i} \, dx_{j}?
Furthermore, we're dealing with four dimensions: x, y, z, and ct. So shouldn't we have 16 terms?"
To the best of my knowledge, that's correct. I kept things in this more general form because that's the way they were presented in the book, but in fact, we do come right down to
(3)
ds^{2} = g_{xx}(x, y, z, ct) dx^{2} + g_{xy}(x, y, z, ct) dx dy + g_{xz}(x, y, z, ct) dx dz + g_{x(ct)}(x, y, z, ct) dx d(ct)
+ g_{yx}(x, y, z, ct) dy dx + g_{yy}(x, y, z, ct) dy^{2} + g_{yz}(x, y, z, ct) dy dz + g_{y(ct)}(x, y, z, ct) dy d(ct)
+ g_{zx}(x, y, z, ct) dz dx + g_{zy}(x, y, z, ct) dz dy + g_{zz}(x, y, z, ct) dz^{2} + g_{z(ct)}(x, y, z, ct) dz d(ct)
+ g_{(ct)x}(x, y, z, ct) d(ct) dx + g_{(ct)y}(x, y, z, ct) d(ct) dy + g_{(ct)z}(x, y, z, ct) d(ct) dz + g_{(ct)(ct)}(x, y, z, ct) d(ct)^{2}
As you can see, the terms in Equation (3) are organized like a 4 × 4 matrix. The expression for ds^{2} isn't itself a matrix, since it's the sum of 16 terms (only 10 of which will turn out to be independent). However, the 16 g's can be arranged as the elements of a 4 × 4 matrix, and ds^{2} at a given point in our curved spacetime is then what we get when we multiply that matrix on both sides by the displacement (dx, dy, dz, d(ct)).
This 4 × 4 matrix, which converts coordinate displacements at a given location in a gravitational field into the distance ds, is called the metric tensor. Why is it called the metric tensor instead of the metric matrix? As far as I can tell, it's because a tensor is defined by how its components change when we change coordinates; the 4 × 4 matrix is the array of components of a tensor of rank 2.
Next, we note that dx_{i} dx_{j} = dx_{j} dx_{i}, i.e., they're always commutative. But that means that we can take g_{ij} = g_{ji}. (I noticed in the book by the Liebers that this is true for all metrics: the dx_{i} dx_{j}'s are always commutative, and all the tensor elements that have the same set of subscripts i, j, ... have the same value independent of the order in which the subscripts appear. The proof for this is beyond the scope of this paper. A proof is given on page 240 of the book, "The Einstein Theory of Relativity", by Lillian R. Lieber and Hugh Gray Lieber, Rinehart Publishing, 1936. It involves showing that the Christoffel symbols {ετ, α} and {τε, α} are equal.)
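The count of 16 terms, of which only 10 are independent once g_{ij} = g_{ji}, can be checked by brute force. The sketch below also evaluates ds^{2} as a sum over the matrix of g's; the flat-spacetime matrix used in the example (+1 for ct, -1 for x, y, z) is an assumed sign convention for illustration.

```python
from itertools import product, combinations_with_replacement

coords = ["x", "y", "z", "ct"]

all_terms = list(product(coords, repeat=2))                    # ordered index pairs
independent = list(combinations_with_replacement(coords, 2))   # unordered index pairs

print(len(all_terms), len(independent))   # 16 10

def ds_squared(g, d):
    """Quadratic form: the sum over i and j of g[i][j] * d[i] * d[j]."""
    return sum(g[i][j] * d[i] * d[j] for i in range(4) for j in range(4))

# Flat-spacetime example, components ordered (x, y, z, ct).
flat = [[-1, 0, 0, 0],
        [0, -1, 0, 0],
        [0, 0, -1, 0],
        [0, 0, 0, 1]]

value = ds_squared(flat, [3.0, 4.0, 0.0, 10.0])
print(value)   # 100 - 9 - 16 = 75.0
```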
Now consider the case of a spherically symmetric body...a star, for instance. Switching to spherical coordinates, we get:
ds^{2} = \sum_{i,j} g_{ij}(r, \theta, \phi, ct) \, dx_{i} \, dx_{j}
But now, because there's spherical symmetry, the g's can't vary with θ or φ but only with r and t. Then if we say that the gravitational field is time-independent...that is, if the star's mass and therefore its gravitational field isn't changing as an explicit function of time...the g's can't vary with time, either, so they can depend only upon r. Thus,
ds^{2} = \sum_{i,j} g_{ij}(r) \, dx_{i} \, dx_{j} = g_{11}(r) dr^{2} + g_{22}(r) r^{2} d\theta^{2} + g_{33}(r) r^{2} \sin^{2}\theta \, d\phi^{2} - g_{44}(r) dt^{2} + g_{12}(r) r \, dr \, d\theta + g_{13}(r) r \sin\theta \, dr \, d\phi + ...
(The g's are functions of r alone only when we use the usual spherical coordinates, dr, r dθ, r sinθ dφ, and d(ct). Otherwise, the g's that go with terms having a φ in them have to contain a sinθ factor.)
At this point, all the books on general relativity I've read make the assumption that spacetime is isotropic...that things look the same in all directions. If we make this assumption, then ds must remain unaltered if we replace dφ by -dφ or dθ by -dθ. But that can be true only if, for example, g_{rθ} dr dθ = -g_{rθ} dr dθ...that is, if all the coefficients of the cross products are zero. The result is that all the off-diagonal terms in the "metric matrix" are zero. The Liebers discuss this on page 235 of their book (cited above). They say,
"Well, obviously, a term like dr dθ (or dθ dφ or dr dφ) would be different for θ (or φ or r) positive or negative, and, consequently, the expression for ds would be different if we turn in opposite directions - which would contradict the experimental evidence that the universe is isotropic. And of course, the use of the same expression for ds from any point reflects the idea of homogeneity. And so we see that it is reasonable to have in (61b) only terms involving dθ^{2}, dφ^{2}, dr^{2}, in which it makes no difference whether we substitute +dθ or -dθ, etc.
"Similarly, since in getting a measure for ds, we are considering a static condition, and not one that is changing over time, we must therefore not include terms that will have different values for +dt and -dt; in other words we must not include product terms like dr dt, etc."
Finally, we end up with the expression
(4) ds^{2} = g_{11}(r) dr^{2} + g_{22}(r) r^{2} d\theta^{2} + g_{33}(r) r^{2} \sin^{2}\theta \, d\phi^{2} - g_{44}(r) dt^{2},
all more or less in one fell swoop, and without the use of tensors.
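The sign-flip argument is simple enough to try with numbers. In the sketch below, every coefficient and displacement value is made up for illustration: ds^{2} is evaluated for a displacement and for the same displacement with dθ reversed, first with a purely diagonal set of g's and then with a hypothetical g_{rθ} cross term added.

```python
def ds_squared(g, d):
    """Quadratic form: the sum over i and j of g[i][j] * d[i] * d[j]."""
    n = len(d)
    return sum(g[i][j] * d[i] * d[j] for i in range(n) for j in range(n))

# Displacement components ordered as (dr, dtheta, dphi, dt); the values are arbitrary.
d = [0.1, 0.2, 0.3, 0.4]
d_flipped = [0.1, -0.2, 0.3, 0.4]          # turn the other way in theta

diagonal = [[1.5, 0, 0, 0],
            [0, 2.0, 0, 0],
            [0, 0, 0.7, 0],
            [0, 0, 0, -1.2]]
with_cross = [row[:] for row in diagonal]
with_cross[0][1] = with_cross[1][0] = 0.5  # hypothetical g_{r theta} cross term

diag_pair = (ds_squared(diagonal, d), ds_squared(diagonal, d_flipped))
cross_pair = (ds_squared(with_cross, d), ds_squared(with_cross, d_flipped))

print(diag_pair)    # the two values are equal
print(cross_pair)   # the two values differ
```

Only the squared terms survive the reversal unchanged, which is why isotropy forces the cross-term coefficients to zero.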
Now all we have to do is to determine g_{11}(r), g_{22}(r), g_{33}(r), and g_{44}(r). g_{22}(r) and g_{33}(r) can be set equal to each other and can be absorbed into r, because g_{11}(r) can be chosen so that it incorporates the effects of the other two. The justification for taking this step is supposed to be given on page 239 of R. C. Tolman's book "Relativity, Thermodynamics and Cosmology". But we pay a price for this maneuver, and that is that r is no longer the distance from the origin to a spherical shell of radius r. I presume that's because r shrinks as we get close to the star, so that the distance we have to use is some integral of dr', integrated from ∞ to r.
Incidentally, what does it really mean when we say that r shrinks when we get close to a star? The answer is that the shrunken r we're talking about is the r that we see when we're looking on from outside the neighborhood of the star. To a person living on the star, things seem perfectly normal. All the covariant "laws" of physics are adjusted so that the star-dweller can't perceive any difference in his environment except that things outside the neighborhood of the star look farther away than they really are...like looking through the wrong end of a telescope. Also, his clocks appear to us to run slower than our clocks and our clocks appear to him to be running faster than his clocks...i.e., we both agree that his clocks are running slower than ours. Note that this is different from the situation in special relativity, where A's clock appears to be running slower than B's, and B's clock appears to be running slower than A's, and either point of view is equally valid. That's because, in special relativity, we're dealing with "hyperbolic rotations", and the apparent distortions of the measuring rods and the clocks actually stem from the fact that we're projecting their space and time coordinates onto ours and projecting our space and time coordinates onto theirs, and looking at, so to speak, perspective effects. But with general relativity, it's like the "twins paradox": there really is an asymmetry in the situation.
Does the term
"general relativity" refer to the fact that we observers sitting
outside the distorted spacetime around the star may ourselves be (and probably
are) in a distorted spacetime? If so...if I'm interpreting this situation
correctly (and I may not be)...we have no way of knowing it by looking at the
local neighborhood around us or by running local physical experiments any more
than the star-dweller can tell that he's in a distorted spacetime without
being able to look outside it. In that sense, there's no way of ever knowing
whether we're in the preferred coordinate system. Also, we can't tell
from the laws of physics that we're in an accelerated frame because the laws of
physics work the same for us as they do for someone in flat spacetime. (The
reason I'm not sure about what I'm saying is that the Riemann curvature tensor
provides a mechanism for determining whether a spacetime is curved or flat, so
I may be speaking out of school. I'll have to check on that.) So that's what's
"relative" about "general relativity".
Anyway, the "r" that's used in the equation for the metric is defined as the radius of curvature of the spherical shell at "radius" r.
Determining g_{11}(r) and g_{44}(r) takes some arithmetic but it's not too bad. I'm going to jump ahead and say that the answer...the Schwarzschild metric...turns out to be fairly simple. It is:
ds^{2} = \left(1 - \frac{r_{S}}{r}\right) c^{2} dt^{2} - \frac{dr^{2}}{1 - r_{S}/r} - r^{2} d\theta^{2} - r^{2} \sin^{2}\theta \, d\phi^{2},
where r_{S} is the Schwarzschild radius, typically of the order of a few kilometers for a star, and is the radius at which a star undergoes gravitational collapse and becomes a black hole.
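Two consequences of this metric can be sketched numerically, assuming the standard form in which the coefficient of dr^{2} is 1/(1 - r_{S}/r) and the coefficient of c^{2}dt^{2} is 1 - r_{S}/r. The measured radial distance between two shells is the integral of dr/\sqrt{1 - r_{S}/r}, which is larger than the difference of their r values, and a clock at rest at radius r ticks at \sqrt{1 - r_{S}/r} times the rate of a distant clock. The radii below are hypothetical values for a compact star.

```python
import math

def proper_radial_distance(r1, r2, r_s, steps=20_000):
    """Midpoint-rule integral of dr / sqrt(1 - r_s/r) from r1 to r2 (both larger than r_s)."""
    h = (r2 - r1) / steps
    total = 0.0
    for k in range(steps):
        r = r1 + (k + 0.5) * h
        total += h / math.sqrt(1.0 - r_s / r)
    return total

def closed_form(r1, r2, r_s):
    """Same integral from the antiderivative sqrt(r(r - r_s)) + r_s*ln(sqrt(r) + sqrt(r - r_s))."""
    def antiderivative(r):
        return (math.sqrt(r * (r - r_s))
                + r_s * math.log(math.sqrt(r) + math.sqrt(r - r_s)))
    return antiderivative(r2) - antiderivative(r1)

r_s = 3.0                 # kilometers, roughly the value for the Sun
r1, r2 = 10.0, 20.0       # coordinate radii in kilometers (hypothetical)

numeric = proper_radial_distance(r1, r2, r_s)
exact = closed_form(r1, r2, r_s)
clock_rate = math.sqrt(1.0 - r_s / r1)   # rate of a clock at rest at r1 relative to one far away

print(numeric, exact)     # both exceed the coordinate difference r2 - r1 = 10
print(clock_rate)         # less than 1: the clock at r1 runs slow
```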
Now for the derivation.
We can derive the equations for the geodesic curves...the curves which minimize the integrals of the distances ds in the warped space in the neighborhood of the star...with the calculus of variations. If we label each point along the path by a parameter ζ, then
dr/d\zeta = r', d\theta/d\zeta = \theta', d\phi/d\zeta = \phi', d(ct)/d\zeta = ct'
Normally, you'd use t or s as the parameter to measure positions along the path, but in this case, ds is the quantity being minimized and t is one of the dependent variables, so some other symbol has to be used. But not to worry. This symbol ζ drops out of the picture almost immediately.
Next, we'll take the customary step in dealing with central-force motion of saying that we can orient our coordinate system any way we please, so we might as well orient it so that the motion takes place in a plane of constant φ. That way, dφ = 0 and the dφ^{2} term drops out. All the motion will occur in one plane in general relativity, as it does in classical mechanics, because, with radial symmetry, nature has no preferential reason to drift in one direction or the other out of a plane of orbital motion...no basis for choosing one plane of motion over another. Then our distance to be minimized becomes:
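(6) \delta \int ds = \delta \int [g_{44}(r) (ct')^{2} - g_{11}(r) r'^{2} - r^{2}\theta'^{2}]^{1/2} \, d\zeta = \delta \int f \, d\zeta = 0
Equation (6) is exactly the same as the equation we used for undistorted spacetime in polar coordinates, except for the factors g_{11}(r) and g_{44}(r), and the Euler-Lagrange equations are again:
\frac{d}{d\zeta} \frac{\partial f}{\partial r'} - \frac{\partial f}{\partial r} = 0
\frac{d}{d\zeta} \frac{\partial f}{\partial \theta'} - \frac{\partial f}{\partial \theta} = 0
\frac{d}{d\zeta} \frac{\partial f}{\partial (ct')} - \frac{\partial f}{\partial (ct)} = 0
Since the metric function in Equation (6) has no θ or ct terms, we know immediately that
\frac{\partial f}{\partial \theta} = 0 and \frac{\partial f}{\partial (ct)} = 0.
However, this time, because of the presence of g_{11}(r) and g_{44}(r), which are functions of r, we won't assume that \partial f/\partial r = 0, since it isn't.
We'll skip trying to work with the first equation for r, since the g's are unknown functions of r. (Actually, we know what they are but we can't prove it yet.) Turning to the second equation, since \partial f/\partial \theta = 0,
\frac{\partial f}{\partial \theta'} = -\frac{r^{2}\theta'}{f} = -r^{2} \frac{d\theta}{ds} = constant,
so that r^{2} dθ/ds = constant = angular momentum = J_{θ} ("J" is a symbol that's frequently used to designate angular momentum), the same as it was for special relativity. Hey! We're wheeling and dealing in general relativity as though it were classical mechanics!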
Having had such smashing success with our first attempt, let's try the last equation. It's also the same as before except that this time, we're carrying along a g_{44}(r) term:
\frac{\partial f}{\partial (ct')} = \frac{g_{44}(r) \, ct'}{f} = g_{44}(r) \frac{d(ct)}{ds} = constant.
This looks formidable but it's not. g_{44}(r) is only a function of r, and d(ct)/ds = 1/(ds/d(ct)) is the reciprocal of the speed of the particle along its trajectory, s. At large values of r, where the conventional ds^{2} = c^{2}dt^{2} - dr^{2} - r^{2}dθ^{2} Minkowski metric is applicable:
\frac{d(ct)}{ds} = \frac{1}{\sqrt{1 - v^{2}/c^{2}}}
and in close to a star, the exact metric is given by
ds^{2} = g_{44}(r) c^{2} dt^{2} - g_{11}(r) dr^{2} - r^{2} d\theta^{2}.
The r^{2}dθ^{2} term has been carried along in order to derive the full equations of motion. However, since the g's are only associated with the dr and dt terms, it will be simpler to evaluate the g's using a simple radial free-fall problem with the angular momentum set to zero. Now the expression for the metric can be written:
ds^{2} = g_{44}(r) c^{2} dt^{2} - g_{11}(r) dr^{2}
Looking just at the time term,
and
If we define
to be
,
then the above equation can be written,
\delta \int ds = \delta \int [g_{44}(r) (ct')^{2} - g_{11}(r) r'^{2} - r^{2}\theta'^{2}]^{1/2} \, d\zeta = \delta \int f \, d\zeta = 0