Dmytro Taranovsky

Date Started: February 12, 2001

Last Modified: January 4, 2002

General Relativity

The paper presents non-quantum physics with emphasis on the general theory of relativity. The entire theory is included, but not many applications or approximation techniques. The required mathematics is rigorously developed.

Part I: Mathematical Background

Section 1: Special Mathematical Notation

To keep formulas simple, special notation is used. When a statement has free indexes (superscripts and subscripts that are variable and not assigned a particular value), the statement is assumed true for all values of the indexes. For example, "x_i=c_it where i can be 1, 2, or 3" means "x₁=c₁t, x₂=c₂t, and x₃=c₃t". Because of the extensive need for indexes, superscripts of variables are typically interpreted as indexes instead of powers. To express a power, use parentheses. For example, x²≠x*x=(x)². Sequences are denoted as follows: x₁, x₂, ..., x_n is x₁, x₂, ..., x_{n-1}, x_n if n>1, is x₁ if n=1, and is nothing when n=0.

Summation convention states that when an index variable appears twice in an expression, summation over the allowed values of the index is assumed. For example, the chain rule simplifies to df/dt=(∂f/∂xⁱ)(dxⁱ/dt). To avoid using the summation convention, state "no summation" or enclose the indexes in parentheses. The expression must be expanded before applying the summation: (x_i+y_i)*(x_i+y_i)=x_ix_i+2x_iy_i+y_iy_i=(x₁+y₁)²+(x₂+y₂)² with i=1, 2. The summation is not assumed when the indexes (after the expression is expanded) are separated by addition or subtraction: a_i+b_i is not summed up over all i. The summation occurs as close to the indexes as reasonable: (x_iy_i)² with i=1, 2 is (x₁y₁+x₂y₂)². When manipulating expressions using the summation convention, you can usually ignore the implicit summation sign.
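The rules of the summation convention can be checked numerically. The following is a minimal sketch in plain Python; the sample values and the variable names are illustrative assumptions, not part of the theory.

```python
# Components of two 2-dimensional quantities; list position 0 holds index 1.
x = [1.0, 2.0]
y = [3.0, 4.0]
indexes = range(2)

# x_i y_i: the repeated index i is summed over.
x_dot_y = sum(x[i] * y[i] for i in indexes)

# (x_i + y_i)*(x_i + y_i): expand first, then sum every term over i.
expanded = sum(x[i] * x[i] + 2 * x[i] * y[i] + y[i] * y[i] for i in indexes)
direct = sum((x[i] + y[i]) ** 2 for i in indexes)

# (x_i y_i) squared: the summation happens inside the parentheses, before squaring.
sum_then_square = x_dot_y ** 2

# Squaring each term before summing is a different quantity and is not
# what the convention prescribes.
square_then_sum = sum((x[i] * y[i]) ** 2 for i in indexes)

print(x_dot_y, expanded, direct, sum_then_square, square_then_sum)
```

The two ways of evaluating (x_i+y_i)*(x_i+y_i) agree, while the last two values differ, which is why the position of the summation matters.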

Section 2: Tensors and Tensor Operations

A tensor T is defined as an ordered collection of its components: T^{i₁i₂...i_m}_{j₁j₂...j_n}, where each of the i and each of the j can assume any integer value from 1 to N and all components are real numbers. N is called the dimension, m is the contravariant order, n is the covariant order, and (m, n) is the rank of T. (The total number of components is N^(m+n).) A tensor field assigns a tensor to every point in a space. Tensors are often referred to by their components.

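Definition: Let T be the value of a tensor field at a certain point. Let T have components T^{a₁a₂...a_m}_{b₁b₂...b_n} in coordinates x¹, x², ..., x^N. Then in the coordinates x̄¹, x̄², ..., x̄^N the components of T are

T̄^{i₁...i_m}_{j₁...j_n} = (J)^W (∂x̄^{i₁}/∂x^{a₁})...(∂x̄^{i_m}/∂x^{a_m}) (∂x^{b₁}/∂x̄^{j₁})...(∂x^{b_n}/∂x̄^{j_n}) T^{a₁...a_m}_{b₁...b_n}

where J=J(x/x̄) is the Jacobian of the old coordinates with respect to the new ones and W is a real number (often a fraction or an integer) that is dependent only on the tensor field. (Do not forget to use the summation convention on every a and every b.) If W=0, the tensor is said to be absolute, otherwise the tensor is a relative tensor of weight W. Unless stated otherwise, W=0.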

Theorem: The definition is non-contradictory for arbitrary tensor fields and coordinates provided that the expression is defined and the Jacobian is not zero.

Tensors are added, multiplied by a scalar, and subtracted component by component: C=A+B means C^{i...j}_{k...l}=A^{i...j}_{k...l}+B^{i...j}_{k...l}; B=c*A means B^{i...j}_{k...l}=c*A^{i...j}_{k...l}; -B=-1*B; A-B=A+(-B). The operations above are undefined if there is a mismatch in rank or dimension. Outer product, C=A⊗B, means C^{i...jp...q}_{k...lr...s}=A^{i...j}_{k...l}B^{p...q}_{r...s}. Contraction is setting a subscript equal to a superscript and invoking the summation convention. For example, contracting A^{ij}_{kl} on j and l gives B^i_k=A^{ij}_{kj}. (The covariant and contravariant orders are each decreased by 1.) Inner product is an outer product followed by contraction(s). Addition, subtraction, multiplication by a scalar, contraction, outer product, and thus inner product of tensors can be shown to transform under coordinate change as tensors would by the definition. For example, C=A⊗B⇔C̄=Ā⊗B̄ where the bars indicate a new coordinate system. Tensor differentiation and integration, if done component by component, do not transform properly: often, ∂Āⁱ/∂x̄^j≠(∂x̄ⁱ/∂x^a)(∂x^b/∂x̄^j)(∂A^a/∂x^b). Section 6 describes how to differentiate tensors.
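As an illustration of the outer product, contraction, and inner product, here is a small sketch using nested Python lists. The storage layout (which list level stands for which index) and the sample values are assumptions made only for this example.

```python
N = 2  # dimension; list position 0 holds index value 1

# A has rank (1, 1); A[i][j] stands for the component with superscript i
# and subscript j.
A = [[1.0, 2.0],
     [3.0, 4.0]]
# B has rank (1, 0); B[k] stands for the component with superscript k.
B = [5.0, 6.0]

# Outer product C = A (x) B of rank (2, 1), stored as C[i][k][j]
# (superscripts i and k, subscript j).
C = [[[A[i][j] * B[k] for j in range(N)] for k in range(N)] for i in range(N)]

# Contraction of A: set the subscript equal to the superscript and sum.
# The result has rank (0, 0), that is, it is a scalar.
contracted_A = sum(A[i][i] for i in range(N))

# Inner product: the outer product C followed by contracting its second
# superscript with its subscript. The result D has rank (1, 0).
D = [sum(C[i][j][j] for j in range(N)) for i in range(N)]

component_count = sum(len(row) for plane in C for row in plane)
print(contracted_A, D, component_count)
```

The component count of C is N^(2+1)=8, and D coincides with the ordinary matrix-vector product of A and B.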

When a quantity with proper components is defined in multiple coordinate systems or in terms of tensors, the quantity is said to be a tensor if it always obeys the transformation equation. Tensors are usually preferred over "nontensors" because a tensor needs to be defined in only one coordinate system and because the change of tensor components under coordinate transformations is predictable and depends only on the coordinates and the nature of the tensor.

Section 3: Examples of Tensors

Kronecker delta, also called identity tensor, δ^ij=δ_i^j=δ_ij=1 if i=j and 0 otherwise. Generalized delta tensor has components δ^{a₁a₂...a_n}_{b₁b₂...b_n}. Each component is 0 if a_i=a_j or b_i=b_j for some unequal i and j. Otherwise, if a₁a₂...a_n is an even permutation of b₁b₂...b_n, the component is 1; if it is an odd permutation, the component is -1. Otherwise (that is if ∃i (1≤i≤n) ∀j (1≤j≤n) a_j≠b_i), the component is zero. T is called symmetric with respect to i_x and i_y (x≠y) if exchange of values of i_x and i_y does not affect any of the components, and anti-symmetric if the exchange multiplies every component by -1. (Both i_x and i_y must be either subscripts or superscripts.) For example, δ_ij is a symmetric tensor; the generalized delta tensor is anti-symmetric with respect to any two subscripts or any two superscripts. Note that if T is anti-symmetric with respect to i_x and i_y, then the value of every component where i_x=i_y is zero. Permutation tensor, e, is anti-symmetric with respect to any two indexes, and e^12..n=e_12..n=1. Weight of e^ij..k, e_ij..k, δ_ij, and δ^ij is not zero.
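The component rules for the generalized delta and the permutation tensor can be sketched in Python as follows. The function names are hypothetical, and the sign is found by counting the interchanges needed to turn one index sequence into the other.

```python
def permutation_sign(seq, ref):
    """Return 1 or -1 if seq is an even or odd permutation of ref, else 0.

    0 is returned when either sequence repeats a value or when seq is not
    a rearrangement of ref.
    """
    if len(set(seq)) != len(seq) or len(set(ref)) != len(ref):
        return 0  # a repeated index value
    if sorted(seq) != sorted(ref):
        return 0  # some value of one sequence is missing from the other
    work = list(seq)
    sign = 1
    for position, target in enumerate(ref):
        if work[position] != target:
            other = work.index(target)
            # One interchange puts the right value at this position.
            work[position], work[other] = work[other], work[position]
            sign = -sign
    return sign


def generalized_delta(upper, lower):
    """Component of the generalized delta with the given index values."""
    return permutation_sign(upper, lower)


def permutation_component(*indexes):
    """Component of e, normalized so that the component 12..n equals 1."""
    return permutation_sign(indexes, range(1, len(indexes) + 1))


print(generalized_delta((2, 1), (1, 2)), permutation_component(2, 3, 1))
```

Exchanging two index values changes the sign of the result, which is the anti-symmetry described above.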

A tensor of rank (0, 0) is called a scalar and has only one component. A tensor of rank (1, 0) is called contravariant vector or simply vector. For example, position and velocity are vectors. A contravariant vector can be displayed as a sequence such as (a₁, a₂, ..., a_n). The transformation equation implies that in smaller coordinates (that is, coordinates with a smaller unit of length) vector components are larger. A tensor of rank (0, 1) is called covariant vector or a 1-form. Covariant vectors can be interpreted as assigning for each contravariant vector a real number in a linear way through a dot product. The transformation equation implies that larger coordinates cause larger components.
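The opposite behavior of contravariant and covariant vectors under a change of coordinate size can be seen in a short sketch. It assumes the simplest possible transformation, in which every coordinate unit becomes a times larger; the function name, the sample values, and the units in the comments are illustrative only.

```python
def rescale(vector, covector, a):
    """Transform components when every coordinate unit becomes a times larger.

    The new coordinates are xbar^i = x^i / a, so the partial derivative of
    xbar^i with respect to x^i is 1/a and that of x^i with respect to
    xbar^i is a. All other partial derivatives are zero.
    """
    new_vector = [component / a for component in vector]      # rank (1, 0)
    new_covector = [component * a for component in covector]  # rank (0, 1)
    return new_vector, new_covector


velocity = [1000.0, 0.0, 250.0]  # contravariant, for example in meters per second
gradient = [0.002, 0.0, 0.004]   # covariant, for example in units per meter

# Switch from meters to kilometers: the coordinate unit is 1000 times larger.
velocity_km, gradient_km = rescale(velocity, gradient, 1000.0)

# The contraction of a covariant with a contravariant vector is a scalar
# and does not change.
before = sum(w * v for w, v in zip(gradient, velocity))
after = sum(w * v for w, v in zip(gradient_km, velocity_km))
print(velocity_km, gradient_km, before, after)
```

The contravariant components shrink, the covariant components grow, and their contraction stays the same.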

Let E_iE_j...E_kE^lE^m...Eⁿ be defined such that (E_iE_j...E_kE^lE^m...Eⁿ)^{ij...k}_{lm...n}=1 (no summation) and other components are zero. E_iE_j...E_kE^lE^m...Eⁿ form a basis since an arbitrary tensor A=A^{ij...k}_{lm...n}E_iE_j...E_kE^lE^m...Eⁿ. (The summation is over all components.) (Visualization of 1-forms: x_iEⁱ as a set of yⁱE_i such that x_iyⁱ=1.)

A multilinear form is a function that takes n vectors and returns a scalar in a linear way with respect to each of the vectors. For example, f(cE₁, E₂) = cf(E₁, E₂), f(E₁, E₂+E₃) = f(E₁, E₂)+f(E₁, E₃). Multilinear forms are, informally, essentially tensors of rank (0, n). That is T_ij...k*uⁱv^j...w^k=f(u, v, ..., w) for an appropriate absolute tensor T.

Note: Determinant of a square n×n matrix A, det A=e^{ij...k}A_{1i}A_{2j}...A_{nk}.
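One standard way to write the determinant with the permutation tensor is det A=e^{ij...k}A_{1i}A_{2j}...A_{nk}. The sketch below evaluates that sum directly; the helper names are hypothetical, and the method is practical only for small matrices because the number of terms grows as n factorial.

```python
from itertools import permutations


def sign(perm):
    """Sign of a permutation of 0..n-1, found by counting inversions."""
    inversions = sum(
        1
        for i in range(len(perm))
        for j in range(i + 1, len(perm))
        if perm[i] > perm[j]
    )
    return -1 if inversions % 2 else 1


def determinant(matrix):
    """Sum of e^{ij...k} A_1i A_2j ... A_nk over all index values.

    Components of e with a repeated index are zero, so only the
    permutations of the column indexes contribute to the sum.
    """
    n = len(matrix)
    total = 0
    for perm in permutations(range(n)):
        term = sign(perm)
        for row, column in enumerate(perm):
            term *= matrix[row][column]
        total += term
    return total


print(determinant([[1, 2], [3, 4]]))
```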

Section 4: Metric Tensor

An inner product between two vectors u and v, a real number <u, v>, is defined such that <u, v> = <v, u>, <u + w, v> = <u, v> + <w, v>, u≠0⇒∃w <u, w>≠0. Metric tensor, g≡<E_i, E_j>EⁱE^j. Conjugate metric tensor, g^ij, is such that g^ijg_jk=δ^i_k. Metric is a smooth (that is infinitely differentiable) tensor field made of metric tensors. Geometry of any space with inner product (including distance, angles, and curvature) is entirely determined by the metric. Length squared, ||u||²≡<u, u>. Note that for some spaces ||u||² can be zero or even negative for nonzero u. Metric is called positive definite if in all cases <u, u> ≥ 0. An associate tensor to tensor T is a tensor to which T can be transformed through inner product(s) with metric tensor and/or conjugate metric tensor. Example: g_abT^a=T_b.

Note: Outside of relativity, inner product is defined so that <u, u> ≥ 0.

Theorem: (a) g_ab is an absolute tensor of rank (0, 2); (b) g_ab=g_ba; (c) if metric tensor exists, conjugate metric tensor also exists; (d) g_abu^av^b = <u, v>; (e) for every symmetric tensor g such that the matrix g_ij is invertible, there exists an inner product such that g_abu^av^b = <u, v>; (f) if metric is positive definite, <u, u> = 0⇔u=0.
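Part (d) of the theorem and the associate tensor example can be tried out numerically. The sketch below uses the metric η=E¹E¹+E²E²+E³E³-E⁴E⁴ that appears in Part II; the function names and the sample vectors are assumptions made for the illustration.

```python
# Components g_ab of the metric eta (list position 3 holds index 4).
ETA = [[1.0, 0.0, 0.0, 0.0],
       [0.0, 1.0, 0.0, 0.0],
       [0.0, 0.0, 1.0, 0.0],
       [0.0, 0.0, 0.0, -1.0]]


def inner(g, u, v):
    """Inner product <u, v> computed as g_ab u^a v^b."""
    n = len(g)
    return sum(g[a][b] * u[a] * v[b] for a in range(n) for b in range(n))


def lower(g, t):
    """Associate covariant vector T_b = g_ab T^a."""
    n = len(g)
    return [sum(g[a][b] * t[a] for a in range(n)) for b in range(n)]


u = [3.0, 0.0, 0.0, 5.0]
v = [0.0, 2.0, 0.0, 1.0]
w = [1.0, 0.0, 0.0, 1.0]

print(inner(ETA, u, u))  # negative length squared for a nonzero vector
print(inner(ETA, w, w))  # zero length squared for a nonzero vector
print(lower(ETA, u))
```

Because η is not positive definite, the length squared of a nonzero vector can be negative or zero, as noted in the text.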

Section 5: Manifolds

Definition: n-dimensional smooth manifold is a set of points combined with a set of coordinate systems such that:

1. Every coordinate system (also called smooth chart) consists of a subset R of the manifold and (x₁, x₂, ..., x_n) for each point in R. All x_i are real numbers.

2. In every coordinate system, different points have different coordinates.

3. Every coordinate system is open and connected. Open means that for each (x₁, x₂, ..., x_n) in the system there exists r>0 such that every (y₁, y₂, ..., y_n) belongs to the coordinate system provided that (x_i-y_i)*(x_i-y_i)<r*r. Connected means that for every pair of coordinates y=(y₁, y₂, ..., y_n) and z=(z₁, z₂, ..., z_n), there exists a continuous function f(t)= (x₁, x₂, ..., x_n)(t) (where 0≤t≤1, and each (x₁, x₂, ..., x_n) are valid coordinates for the coordinate system) such that f(0)=y and f(1)=z.

4. Union of all R is the set of points that make up the manifold.

5. For every 2 coordinate systems that share some of the points, the transformation between the coordinate systems must be smooth (that is infinitely differentiable) in both directions and thus have a nonzero Jacobian.

6. For every 2 points A and B, there exists A₁, A₂, ..., A_n such that A₁=A, A_n=B, and for every integer i (0<i<n) there exists a coordinate system such that A_i and A_{i+1} belong to the system. In other words, the manifold is connected.

Manifold is, informally, a generalization of a smooth surface or a space. Once a manifold is defined, more coordinate systems can be added provided that the Jacobian of all coordinate transformations exists and is nonzero at all points where the coordinates apply. Two manifolds are called topologically equivalent if there is a one-to-one transformation between the manifolds that preserves continuity. For example, all 2-dimensional ellipsoids are topologically equivalent, but a sphere is not topologically equivalent to a torus. Tensor field on a manifold is specified by specifying the field for every coordinate system such that the tensors obey the transformation equation.

Definition: Pseudo-Riemannian manifold is a smooth manifold with a metric.
Definition: Riemannian manifold is a pseudo-Riemannian manifold whose metric is positive definite.

Section 6: Covariant Differentiation and Curvature

The actual change with location of a quantity depends not only on the change of its coordinate components but also on the change of nature (size) of coordinates with location. For example, decrease of length of x¹ (derived from g₁₁) with increasing x² can give incorrect results because ∂x¹/∂x²=0 despite the change in lengths. Therefore, tensor differentiation is adjusted for the coordinates:

T^{i₁...i_m}_{j₁...j_n,k} = ∂T^{i₁...i_m}_{j₁...j_n}/∂x^k + Γ^{i₁}_{ak}T^{a i₂...i_m}_{j₁...j_n} + ... + Γ^{i_m}_{ak}T^{i₁...i_{m-1}a}_{j₁...j_n} - Γ^a_{j₁k}T^{i₁...i_m}_{a j₂...j_n} - ... - Γ^a_{j_nk}T^{i₁...i_m}_{j₁...j_{n-1}a}

where Christoffel symbol of the second kind, Γⁱ_jk=g^ia*[jk, a] and Christoffel symbol of the first kind, [ac, b] ≡ ½(∂g_ab/∂x^c+∂g_cb/∂x^a-∂g_ac/∂x^b). The adjusted differentiation is called covariant. The covariant derivative of an absolute tensor of rank (m, n) is an absolute tensor of rank (m, n+1). The derivatives are denoted by adding a comma after the last subscript (such as A_i,j) and placing the number of the axis/coordinate after the comma. Second and higher derivatives do not require additional commas: Aⁱ_,jk. The derivative is a linear operator. Covariant derivative, especially of a scalar, is sometimes called gradient: B=∇A means B_i=A_,i. Theorem: g_ij,k=0. The theorem confirms that derivatives are adjusted for the change in coordinate size, that is, the metric.
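As a concrete case, the Christoffel symbols of the plane in polar coordinates (metric g₁₁=1, g₂₂=(r)², g₁₂=0) can be estimated numerically. The sketch assumes the standard definitions [ac, b]=½(∂g_ab/∂x^c+∂g_cb/∂x^a-∂g_ac/∂x^b) and Γⁱ_jk=g^ia[jk, a], replaces the partial derivatives by central differences, and uses hypothetical function names.

```python
def polar_metric(x):
    """g_ij for the plane in polar coordinates x = (r, theta)."""
    r = x[0]
    return [[1.0, 0.0], [0.0, r * r]]


def polar_conjugate_metric(x):
    """g^ij, the matrix inverse of polar_metric(x)."""
    r = x[0]
    return [[1.0, 0.0], [0.0, 1.0 / (r * r)]]


def metric_derivative(metric, x, k, h=1e-6):
    """Central-difference estimate of the derivative of g_ij along x^k."""
    forward, backward = list(x), list(x)
    forward[k] += h
    backward[k] -= h
    gf, gb = metric(forward), metric(backward)
    n = len(x)
    return [[(gf[i][j] - gb[i][j]) / (2 * h) for j in range(n)] for i in range(n)]


def christoffel_first(metric, x):
    """Symbols of the first kind, stored as first[a][c][b] for [ac, b]."""
    n = len(x)
    dg = [metric_derivative(metric, x, k) for k in range(n)]  # dg[k][i][j]
    return [[[0.5 * (dg[c][a][b] + dg[a][c][b] - dg[b][a][c])
              for b in range(n)] for c in range(n)] for a in range(n)]


def christoffel_second(metric, conjugate_metric, x):
    """Symbols of the second kind, stored as second[i][j][k]."""
    n = len(x)
    first = christoffel_first(metric, x)
    g_up = conjugate_metric(x)
    return [[[sum(g_up[i][a] * first[j][k][a] for a in range(n))
              for k in range(n)] for j in range(n)] for i in range(n)]


gamma = christoffel_second(polar_metric, polar_conjugate_metric, [2.0, 0.5])
print(gamma[0][1][1], gamma[1][0][1])
```

At r=2 the printed values are close to -r=-2 and 1/r=0.5, the only nonzero symbols of this metric (up to the symmetry in the two lower indexes).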

A smooth curve is defined by a smooth function r(t)=(x¹(t), x²(t), ..., xⁿ(t)). The curve is called normalized if ||dr/dt||²=1 or ||dr/dt||²=-1 for every t on an open interval. Intrinsic derivative along a curve, δT^{i...j}_{k...l}/δt≡T^{i...j}_{k...l,a}(dx^a/dt), is used instead of the non-adjusted derivative to compensate for the change of coordinate sizes and angles.

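Curvature (more precisely intrinsic curvature of the manifold) at a point is fully described by the Riemann Christoffel Tensor: R^a_bcd≡∂Γ^a_bd/∂x^c-∂Γ^a_bc/∂x^d+Γ^a_ucΓ^u_bd-Γ^a_udΓ^u_bc, where Γ^a_bc is the Christoffel symbol of the second kind.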

R_abcd≡g_auR^u_bcd. Theorem: R_abcd=-R_bacd=R_dcba; R_abcd+R_adbc+R_acdb=0; R_abcd,e+ R_abec,d+ R_abde,c=0.

Theorem: For every point in every pseudo-Riemannian manifold, there exists a coordinate system (called geodesic coordinate system) for which ∂g_ij/∂x^k (and thus all of the Christoffel symbols) are zero at the point.

Part II: General Theory of Relativity

Section 1: Geometry of Space-Time

Our world is modeled by a 4-dimensional pseudo-Riemannian manifold called space-time. For every point, there exists a frame of reference, that is a coordinate system, such that in the frame at that point, the metric, g=η where η≡E¹E¹+E²E²+E³E³-E⁴E⁴. Such a frame of reference is called locally normal. If g=η and ∂g_ab/∂x^c=0, the frame of reference (for each point, such a frame can be proven to exist) is called locally inertial. MCRF (momentarily comoving reference frame) is a locally normal frame of reference where the point object is currently at rest.

To be a complete theory, the mathematical model must be related to human perception. If g= η, x⁴=A⁴E₄ refers to a point in time and AⁱE_i = (x₁, x₂, x₃) (i=1, 2, 3) refers to a location in space. Let an observer be at rest in a locally inertial frame of reference. The observer will locally define distance in space between (x₁, x₂, x₃) and (y₁, y₂, y₃) as √((x¹-y¹)²+(x²-y²)²+(x³-y³)²) and interval in time between x⁴ and y⁴ as |x⁴-y⁴|. Observers disagreeing on who is at rest may disagree about distance in space and interval in time. An observer at time t tends to consider space at time T>t as the future (and not yet occurring), space at T=t to be the present real world, and space at T<t as the past (what has already happened) provided that the direction of time is positive; otherwise, reverse '<' and '>'.

If g→η at xⁱ and d→0, then V(R)/(d)³→1, where V(R) is volume in space of all points rⁱ such that xⁱ<rⁱ<xⁱ+d (i=1, 2, 3). For a property F based on a region of space, its density is dF/dV.

Section 2: Dynamics of Point Objects

A point object is a useful approximation for objects whose size is small compared to the distance between the objects. It is described by its smooth curve: r(t). (t is a parameter, r⁴ is time; two curves r(s) and R(t) are equivalent if and only if there exists invertible s(t) such that r(s(t))=R(t).) Four-velocity, u≡dr/dt where r(t) is normalized. Observation shows that when u exists, ||u||²=-1: Objects cannot move in space faster than they move forward in time. 4-velocity is undefined if the curve cannot be normalized. 4-velocity is used because objects move not only in space but also in time. Note: If g=η, 3-velocity, v=(u¹, u², u³)/u⁴. Theorem: (1) MCRF exists unless u is undefined; (2) u is a contravariant vector with value (0, 0, 0, 1) in any MCRF.
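The relation between 3-velocity and 4-velocity can be illustrated with a short sketch. It assumes g=η, units in which the speed of light is 1, and the condition that the length squared of u is -1; the function names are hypothetical.

```python
import math

ETA_DIAGONAL = (1.0, 1.0, 1.0, -1.0)  # g = eta, with time as the 4th coordinate


def norm_squared(u):
    """Length squared <u, u> for the diagonal metric eta."""
    return sum(ETA_DIAGONAL[i] * u[i] * u[i] for i in range(4))


def four_velocity(v):
    """4-velocity whose 3-velocity is v, normalized to length squared -1."""
    speed_squared = sum(component * component for component in v)
    if speed_squared >= 1.0:
        # The curve cannot be normalized to -1, so u is undefined.
        raise ValueError("4-velocity is undefined at or above the speed of light")
    time_component = 1.0 / math.sqrt(1.0 - speed_squared)
    return [time_component * v[0], time_component * v[1],
            time_component * v[2], time_component]


def three_velocity(u):
    """v = (u1, u2, u3)/u4, valid when g = eta."""
    return [u[0] / u[3], u[1] / u[3], u[2] / u[3]]


u = four_velocity([0.6, 0.0, 0.0])
print(u, norm_squared(u), three_velocity(u))
print(four_velocity([0.0, 0.0, 0.0]))  # value in an MCRF
```

A 3-velocity of magnitude 1 (light) raises the error, matching the statement that the 4-velocity is then undefined.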

For a normalized curve, acceleration (a) is the intrinsic derivative of the velocity. When a particle does not interact, its a is 0, and it is said to move on a geodesic. (If the curve cannot be normalized, a=0 when for an appropriate invertible s(t), δr(s(t))/δs=0.) Light has undefined 4-velocity and zero acceleration. Since an event cannot cause itself, for every smooth curve r(s) with ||u||²=-1 at all points, s≠t ⇒ r(s)≠r(t).

Rest mass (an absolute scalar m₀) is mass in MCRF. 4-momentum, p ≡ m₀u. p is a contravariant vector; ||p||²=-(m₀)². If g=η, the first 3 components form the usual momentum and the 4th component is energy (energy is relativistic mass). 4-momentum for a set of non-intersecting objects is defined as the sum of 4-momenta for all objects in the set; that is, p is additive. Force is intrinsic derivative of the momentum (assuming that the curve is normalized). When acceleration is zero, force is zero.
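The definitions of 4-momentum can be exercised on a simple case: two objects of rest mass 1 moving in opposite directions. The sketch assumes g=η and units in which the speed of light is 1; the function names and the sample values are illustrative.

```python
import math


def four_momentum(rest_mass, v):
    """p = m0 * u for 3-velocity v, assuming g = eta and |v| < 1."""
    time_component = 1.0 / math.sqrt(1.0 - sum(c * c for c in v))
    u = [time_component * v[0], time_component * v[1],
         time_component * v[2], time_component]
    return [rest_mass * component for component in u]


def norm_squared(p):
    """Length squared of p for the metric eta."""
    return p[0] ** 2 + p[1] ** 2 + p[2] ** 2 - p[3] ** 2


def rest_mass_of(p):
    """Rest mass recovered from the relation ||p||^2 = -(m0)^2."""
    return math.sqrt(-norm_squared(p))


def total(momenta):
    """4-momentum of a set of objects: the component-wise sum."""
    return [sum(p[i] for p in momenta) for i in range(4)]


p1 = four_momentum(1.0, [0.6, 0.0, 0.0])
p2 = four_momentum(1.0, [-0.6, 0.0, 0.0])
p_set = total([p1, p2])
print(p_set, rest_mass_of(p1), rest_mass_of(p_set))
```

The 4-momentum is additive, but the rest mass is not: the set has rest mass 2.5 although each object has rest mass 1, because the energy of motion contributes to the 4th component.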

Section 3: Relativistic Fluids

Since all objects have sizes, point objects are impossible. Instead, the universe is modeled by tensor fields (sometimes called fluids) such as the stress-energy tensor (T^ab). If a part of the world is modeled by tensor field A and A is a sum of tensor fields B, then each B is called an element of A. Thus, each tensor field is additive for all of its elements. Let 4-momentum density in space (a vector) be denoted P. For a uniform element of 4-velocity U, T=U⊗P (caution: elements of P do not match with elements of T); -g_abT^ab is rest mass density. For an element, the force density, F^a=T^ab_,b (spatial components are pressure and the time component is density of power). Force density is based on area in space on which the force is acting. The task in non-quantum physics is to use some information about tensor fields to obtain more information.

Section 4: Gravitation and the Foundations of Relativity

Gravitation is represented by curvature of space-time. Ricci's tensor, R_bd = R^a_bad; R^ab=g^aug^bvR_uv. Einstein's tensor is G^ab=R^ab-½g^abg^cdR_cd.

(Postulate) Einstein's Field Equation: G=8πT.

Theorem: (a) G^ab=G^ba; (b) G^ab_,b=0.

Corollary: (a) T^ab=T^ba; (b) T^ab_,b=0.

Interpretation of part (b) of the corollary:

1. Locally, energy and momentum are conserved.

2. Whenever two elements interact, the change of T for element A due to element B is exactly opposite to the change of T for element B due to element A.

General relativity is a template model--it specifies how interactions can be described but does not state all equations for all interactions. An interaction is specified by stating the type of tensor fields matter and the interaction have, the equations for the tensor fields, stress-energy tensor of the interaction field, and force density on the element by the field. All equations are expressed in terms of the tensors (including the metric g) through addition, contraction, products, covariant differentiation, and operations derived from these operations. Index variables but not specific index values may be used, and no variable in an expression may appear twice as a subscript or a superscript (except when separated by '+' in the expanded form).

Theorem (stated informally; also called the Principle of Equivalence):

1. Physical laws are the same in all frames of reference.

2. Other operations on tensors may not be included in the equations.

3. Physical laws are local (no actions at a distance).

Justification of part (1): The expressions transform as the appropriate tensors do.

An implication of part (2): Riemann Christoffel tensor (derived from the metric through non-adjusted differentiation and other operations) cannot be included.

Section 5: Laws of Electromagnetism

Let j_i be electric current density in i direction where xⁱ are coordinates of space where g=η; and let electric charge density, dq/dV=ρ_q. Then, 4-current, J = (j₁, j₂, j₃, ρ_q), is a contravariant vector. For a uniform element, J=uρ_q. Electromagnetic field is characterized by Faraday's tensor: F^ab. (Note: F^a_v=g_vbF^ab; F_uv=g_uaF^a_v.) The field obeys F^uv=-F^vu, F^uv_,v=4πk*J^u and F_uv,w+ F_vw,u+ F_wu,v= 0, has stress-energy tensor T^uv = (F^uaF^v_a-g^uvF_abF^ab/4)/(4πk), and force density on an element is F^c = F^c_aJ^a.
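Two consequences of the stress-energy formula can be checked numerically: T^uv is symmetric, and its contraction with the metric, g_uvT^uv, is zero. The sketch below assumes g=η, sets k=1 purely for illustration, and uses an arbitrary anti-symmetric sample for F^ab; none of the sample values come from the text.

```python
import math

ETA = [[1.0, 0.0, 0.0, 0.0],
       [0.0, 1.0, 0.0, 0.0],
       [0.0, 0.0, 1.0, 0.0],
       [0.0, 0.0, 0.0, -1.0]]  # g = eta; the conjugate metric has the same components
K = 1.0  # Coulomb's constant, set to 1 only for this illustration
INDEXES = range(4)


def stress_energy(F):
    """T^uv = (F^ua F^v_a - g^uv F_ab F^ab / 4) / (4 pi k), with F[a][b] = F^ab."""
    # mixed[v][a] holds F^v_a = g_ab F^vb.
    mixed = [[sum(ETA[a][b] * F[v][b] for b in INDEXES) for a in INDEXES]
             for v in INDEXES]
    # lowered[u][v] holds F_uv = g_ua F^a_v.
    lowered = [[sum(ETA[u][a] * mixed[a][v] for a in INDEXES) for v in INDEXES]
               for u in INDEXES]
    invariant = sum(lowered[a][b] * F[a][b] for a in INDEXES for b in INDEXES)
    return [[(sum(F[u][a] * mixed[v][a] for a in INDEXES)
              - ETA[u][v] * invariant / 4.0) / (4.0 * math.pi * K)
             for v in INDEXES] for u in INDEXES]


# An arbitrary anti-symmetric sample field: F^uv = -F^vu.
F = [[0.0, 3.0, 0.0, 1.0],
     [-3.0, 0.0, 0.0, 2.0],
     [0.0, 0.0, 0.0, 0.0],
     [-1.0, -2.0, 0.0, 0.0]]

T = stress_energy(F)
contraction = sum(ETA[u][v] * T[u][v] for u in INDEXES for v in INDEXES)
print(T[3][3], contraction)
```

For this sample the 4th diagonal component of T (the energy density) is positive, and the contraction vanishes up to rounding.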

Note: Coulomb's constant, k=0.007297...ħc/e²= c²*10^‑7(kilogram*meter)/(coulomb)².

Theorem: J^a_,a=0. Interpretation: Electric charge can move but cannot be created or destroyed.

Section 6: Concluding Remarks

A non-quantum physical model is called complete if, given an initial state of the system, its development in time can be (theoretically) fully predicted. Theorem: Gravitation and electromagnetism, as presented above, are complete. In addition to electromagnetism, user-defined interactions can be added to avoid using quantum mechanics: Such interactions usually describe macroscopic predictions of quantum mechanics. If the result of a calculation is in different units than you want the result to be, multiply and/or divide by the following conversion constants: c = 299792458 meter/second, G ≈ 6.67259*10^‑11 (meter)³/(kilogram*(second)²). From this paper, you have learned the entire non-quantum physics in the sense that the knowledge that is not part of the paper can be derived from knowledge that is. However, theorems, relevant approximations in real-world situations, and practice are required to effectively solve problems.
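The use of the conversion constants can be sketched as follows. The sketch assumes that the equations of the paper are written in units where c=1 and G=1 with lengths in meters, so that times and masses are also expressed in meters. The function names are hypothetical, and the mass of the Sun (about 1.989*10^30 kilogram) is an assumed sample value that does not come from the paper.

```python
C = 299792458.0   # meter/second, as quoted in the text
G = 6.67259e-11   # (meter)^3/(kilogram*(second)^2), as quoted in the text


def seconds_to_meters(t):
    """Multiply by c to express a time as a length."""
    return t * C


def kilograms_to_meters(m):
    """Multiply by G and divide by c squared to express a mass as a length."""
    return m * G / (C * C)


def meters_to_kilograms(length):
    """Inverse of kilograms_to_meters."""
    return length * C * C / G


SUN_MASS_KG = 1.989e30  # assumed approximate value, for illustration only
sun_mass_m = kilograms_to_meters(SUN_MASS_KG)
print(seconds_to_meters(1.0), sun_mass_m)
```

With these values the mass of the Sun corresponds to a length of roughly 1.5 kilometers.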

Exercises

1. Show that C=A*B⇔C_ik=A_ijB_jk for all matrices A, B, and C.

2. For a tensor of rank (l, m) and weight W in n-dimensional space, an increase of the coordinates by a factor of a multiplies the tensor components by a factor of b. Prove: W = (l-m+log_a b)/n.

3. How does coordinate change affect
(a) scalars (b) contravariant vectors (c) covariant vectors
(d) tensors of rank (0, 2) (e) tensors of rank (2, 0) (f) tensors of rank (1, 1).

4. Prove that (a) =det A where A_ij= (b).
5. Show that ||u + v||² = ||u||² + ||v||² + 2 <u, v>.

6. In parts (a) and (b), show that the space is a Riemannian manifold.

(a) Euclidean n-dimensional space with metric g_ij=δ_ij.
(b) The set of all x (0≤x<1) such that coordinate system 1: 0<x<1, y₁=x, and system 2:
0≤x<½, y₂=x, or ½<x<1, y₂=x-1. For metric, g₁₁=1 in both coordinate systems.
Note: The manifold can be represented as a circle of circumference 1.

(c) Show that no appropriate coordinate system can span the entire manifold in part (b).
7. For tensors in problem 3, find (a) covariant derivative (b) intrinsic derivative.

8. For an arbitrary 2-dimensional orthogonal (g₁₂=0) coordinate system find
(a) All of the Christoffel symbols.

(b) Riemann Christoffel tensor in terms of the metric.
Hint: Use symmetry in the equations to save time.

9. Lorentz transformation (of coordinates) is

(x¹, x², x³, x⁴)→((x¹-vx⁴)/√(1-(v)²), x², x³, (x⁴-v*x¹)/ √ (1-(v)²)) where ‑1<v<1.
(a) Show how Lorentz transformation affects tensors of rank (0, 2).

(b) Use your answer to (a) to prove that if g=η, then g=η after a Lorentz transformation.

10. Find T for a uniform element of momentum density (p_x, 0, 0, E) where g=η.
11. Find Einstein's tensor (a) for metric in problem 8

(b) for an arbitrary locally inertial (4-dimensional) coordinate system.
Note: Part (b) may be omitted because it may require a long calculation.

Appendix: Preliminary Mathematics

Let x₁, x₂, ..., x_n be independent variables related to y₁, y₂, ..., y_n. Then, ∂y_i/∂x_j is the rate of change of y_i divided by the rate of change of x_j assuming that the other x are held constant. ∂f/∂x is sometimes denoted f_x. Jacobian, J(x/y), is the determinant of the matrix whose cell of index ij is ∂x_i/∂y_j. When coordinates work properly, the Jacobian is not zero.

Let a space have a measure based on a region, m(R) (length, area (A(R)), and volume (V(R)) are measures), and an additive function whose argument is a region: F(R). Then, f(r) = dF/dm = limit as size of R→0 of F(R)/m(R), where r always belongs to R, when such limit exists. Then, F(R)=∫_R f dm. Properties of measure: m(R)≥ 0; measure is additive, that is the measure of the union of two non-intersecting regions is the sum of the measures of the regions. Note: dF/dm is sometimes called F density. f(x) is continuous if as change in x→0, change in f(x)→0. df/dt ≡ f′(t) ≡ limit of (f(t+a)-f(t))/a as a→0 for any f(t) for which the limit exists. dⁿ⁺¹f/dtⁿ⁺¹=d(dⁿf/dtⁿ)/dt with d⁰f/dt⁰=f(t). Function f is smooth if for all positive integers n, dⁿf/dtⁿ exists. In phrases such as "a→b, c→d", symbol '→' means 'approaches arbitrarily close to'.

'√' is a symbol for square root. |x|=√((x)²). a≡b means a is defined to be equal to b.

π ≡ 4*(1/1-1/3+1/5-1/7+1/9-...). ∃x means 'there exists x such that'. ∀x means 'for all x'. When ∃ or ∀ are used, parentheses after the variable, if present, indicate restrictions on the values of the variable. 0 is a zero vector (or matrix or tensor), that is a vector all of whose components are zero. A permutation is any rearrangement (including no rearrangement at all). Permutation is called even if it can be done using an even number of interchanges and odd if it can be done using an odd number of interchanges. Theorem: no permutation is both even and odd.

'A⇒B' means 'A implies B', and 'A⇔B' means 'A⇒B and B⇒A'.