I found this question about special relativity in my email today:

The mathematical principle behind Minkowski’s formula, as described in the [Wikipedia] article, is quite straightforward and logical, and I understand how come time contracts as space expands and vice versa. I also see how this formula mathematically develops into the Lorentz transformation. My quibble, however, is as follows:

In two-dimensional space, we define the distance “d” to equal the square root of x squared plus y squared. In three-dimensional space we add the third coordinate – the z coordinate – to the equation. However, by adding “t” as the fourth dimension, the rules change and different mathematical parameters must be applied as time is an intrinsically different physical term from space. From the formula the article expresses –

d^2 = x^2 + y^2 + z^2 – (ct)^2

– it is deduced that the size of the vector k in 4D space where k: (a,b,c,d) + u(x,y,z,t) can be formulated thus:

d = sqrt(x^2 + y^2 + z^2 + (ict)^2) where “i” is the imaginary unit and the square root of -1, and “c” is the constant of the speed of light. This is all very nice mathematically, yet it sheds no light on why the t coordinate must be multiplied by the term “ic”, and what *physical* relevance does this constant have, other than the fact that it works mathematically in the formulae?

This is an interesting question. Let me start by noting that the *i* is a red herring. It is only there so that every term under the square root can be written with a plus sign. It is more illuminating to write

d = sqrt(x^2 + y^2 + z^2 – (ct)^2)

which breaks the symmetry but gets rid of the imaginary quantities that make it harder to visualise what’s going on. Likewise the speed of light is only there to have consistent units – it is an artifact of using the SI system, which isn’t necessarily useful for relativistic physics. In the natural units used in particle physics – my own field – we define c=1, and get

d = sqrt(x^2 + y^2 + z^2 – t^2).

Now, with the distractions stripped away, what is the physical meaning of that minus sign? It tells you that the coordinate t is decisive in determining whether two spacetime points can be causally connected. If two events have a positive relative Minkowski metric, then neither of them was caused by the other, because the (ordinary, space) distance sqrt(x^2+y^2+z^2) is larger than the distance light will travel in the time available. In other words, it is nothing but the familiar idea you learned in school, that if you look at an object 1 light-year distant, you will see what was happening there 1 year ago. (Of course, you can flip all the signs if you like, and now it is a negative distance that means the events are causally disconnected. Some textbooks do this; hence the convention in many books and articles to say “We use a (-1,1,1,1) metric” – meaning time has the minus sign and comes first in the four-vector. Or (-1,-1,-1,1) if they want time positive and last, and so on.)
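To make the role of the sign concrete, here is a minimal sketch in Python. It assumes natural units (c=1) and the sign convention used above, with space positive and time negative. The function names `interval_squared` and `classify` are my own, chosen for illustration. It works with the squared interval so that no square root of a negative number is ever taken.

```python
def interval_squared(dx, dy, dz, dt):
    """Squared Minkowski interval between two events, natural units (c = 1).

    Sign convention: space positive, time negative, as in the text.
    """
    return dx**2 + dy**2 + dz**2 - dt**2


def classify(dx, dy, dz, dt):
    """Label the separation between two events by the sign of the interval."""
    s2 = interval_squared(dx, dy, dz, dt)
    if s2 > 0:
        return "spacelike"   # too far apart for light to connect them: no causal link
    if s2 < 0:
        return "timelike"    # a slower-than-light signal could connect them
    return "lightlike"       # exactly connected by a light signal


print(classify(2, 0, 0, 1))  # spacelike
print(classify(1, 0, 0, 2))  # timelike
print(classify(1, 0, 0, 1))  # lightlike: 1 light-year away, seen 1 year later
```

The last call is the school example: an object 1 light-year away, observed 1 year after the event, sits exactly on the boundary between the two cases.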

I dismissed the ‘c’ as an artifact of using the SI system; but you might reasonably object that ‘natural’ units are propagandistically named, and that the SI system is just as good. The lack of a ‘c’ in my equation, you would say, is just an artifact of using natural units; but you are a free and inquiring thinker, and won’t be browbeaten by this word ‘natural’ into abandoning your own units. Fair enough; what does this ‘c’ mean, then? It is the scaling factor between space units and time units. That is, it tells you how many space units go with one time unit to give a metric of zero. If you choose to measure space distance in meters and time distance in seconds, then for each 3×10^8 space units you need one time unit for a zero metric. This is why natural units are called that – the space and time units are the same size.
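The same idea in SI units can be sketched as follows. This is only an illustration: the function name `interval_squared_si` is hypothetical, and the only thing that changes relative to natural units is that the time difference is multiplied by c before being squared.

```python
C = 299_792_458.0  # speed of light in metres per second (exact by SI definition)


def interval_squared_si(dx_m, dy_m, dz_m, dt_s):
    """Squared interval with space in metres and time in seconds.

    C converts the time difference into metres so that all four terms
    carry the same unit.
    """
    return dx_m**2 + dy_m**2 + dz_m**2 - (C * dt_s) ** 2


# One second of time pairs with about 3x10^8 metres of space to give zero.
print(interval_squared_si(C, 0.0, 0.0, 1.0))  # 0.0
```

Measuring distance in light-seconds instead of metres would make the conversion factor 1, which is all that the natural units do.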

You might say that the physical meaning of the minus sign is just that time is not the same as space; and the meaning – I hesitate to call it ‘physical’ – of the constant c is just that we conventionally measure space and time in differently sized units. As interpretations of quantities in equations go, these aren’t very interesting ones, nor do they particularly help in understanding special relativity. This is because the Minkowski formulation is just that – it is a formulation, a piece of mathematics useful for the working physicist doing calculations within special relativity. It is not as useful in getting a good qualitative understanding. If you want to do calculations, though, especially of anything even moderately interesting within field theory, you need Minkowski and four-vectors. In my own thesis work, I need to calculate decay times of particles in the particle’s frame of reference, knowing its momentum, production point and decay point in the lab frame of reference; out come the four-vectors! Trying to do it with the proper time, proper length, and so on, would be much more difficult to generalise – particles come out of collisions every which way, they don’t always go along the X axis as in textbook problems. Of course you can redefine the coordinate system for every particle, but it’s simpler to just use four-transformations with general coordinates, especially when you need to know the direction of a particle in the rest frame of another, as when calculating a helicity angle.
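As a rough illustration of the decay-time calculation, here is a sketch in Python. It is not my actual analysis code. It assumes natural units (c=1), a particle with nonzero momentum, and a flight path that lies along the momentum direction; the function name `proper_decay_time` and its arguments are invented for this example. The point is that the invariant interval of the displacement four-vector gives the time in the particle's own frame, whatever direction the particle flies in.

```python
import math


def proper_decay_time(p, m, production, decay):
    """Time between production and decay in the particle's rest frame (c = 1).

    p          -- lab-frame momentum, (px, py, pz), assumed nonzero
    m          -- particle mass
    production -- lab-frame production point (x, y, z)
    decay      -- lab-frame decay point (x, y, z)
    """
    p_mag = math.sqrt(sum(pi**2 for pi in p))
    energy = math.sqrt(p_mag**2 + m**2)

    # Spatial part of the displacement four-vector.
    flight = [d - q for d, q in zip(decay, production)]
    length = math.sqrt(sum(f**2 for f in flight))

    # Time part: the particle moves at speed beta = |p| / E in the lab.
    beta = p_mag / energy
    dt_lab = length / beta

    # The interval is the same in every frame; in the rest frame the
    # spatial part vanishes, so tau^2 = dt^2 - |dx|^2.
    return math.sqrt(dt_lab**2 - length**2)


# A particle not travelling along any coordinate axis.
tau = proper_decay_time(p=(3.0, 0.0, 4.0), m=12.0,
                        production=(0.0, 0.0, 0.0), decay=(0.6, 0.0, 0.8))
print(tau)
```

The result agrees with the shortcut formula length × m / |p|, which is the same invariant written in terms of the momentum.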

It’s useful to keep in mind that although you can derive Lorentz transformations from the Minkowski metric, the causality flows the other way: The Minkowski metric is useful because we can derive Lorentz transformations from it. All the physical insight is contained in the Lorentz parts; Minkowski just gave us a convenient mathematics for dealing with it. If you understand relativity from the Lorentz point of view, there is nothing further to be gained from learning about Minkowski unless you happen to do a lot of calculations with special relativity. In a similar vein, with multiplication, all the understanding lies in looking at five rows of four and counting it out to twenty by repeated addition. Memorising the times table is just a convenient optimisation.

My interlocutor goes on to ask another question about the Wiki article:

Furthermore, in the subsequent proof that the speed of light is a constant, it seems to rely on the assumption of the aforementioned formula whereby “c” is the constant speed of light, a postulation I know is critically integral to Einsteinian special relativity, but which has no mathematical backing – at least in the article.

I think the article is a bit badly phrased, here, and if I could be bothered I’d edit it. It is not proving that the speed of light is constant; rather it is proving that if one observer sees an object moving at one space unit per time unit, then all other observers will see the same speed. Or to put it differently: In a universe where causal distance is described by a Minkowski metric, there exists a speed that all observers will agree on, and that speed is one space unit per time unit. The further observation that in our universe, light moves at this speed, is purely empirical. It has no foundation in the mathematics; it cannot be shown from first principles, but must be referred back to experiment.
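The statement about the agreed-upon speed can be checked numerically. The sketch below assumes natural units (c=1), motion along the x axis only, and an observer speed strictly between -1 and 1. The names `boost_x` and `speed_seen_by` are my own. It applies the standard Lorentz boost to a point on the object's worldline and reads off the speed in the new frame.

```python
import math


def boost_x(t, x, v):
    """Lorentz boost along x to a frame moving at velocity v (c = 1, |v| < 1)."""
    gamma = 1.0 / math.sqrt(1.0 - v * v)
    return gamma * (t - v * x), gamma * (x - v * t)


def speed_seen_by(u, v):
    """Speed of an object moving at u along x, as seen by an observer moving at v."""
    # The object passes through the origin and, one time unit later, is at x = u.
    t_new, x_new = boost_x(1.0, u, v)
    return x_new / t_new


for v in (-0.9, 0.0, 0.5, 0.99):
    # Something moving at one space unit per time unit keeps that speed.
    print(v, speed_seen_by(1.0, v))

# Any other speed does depend on the observer.
print(speed_seen_by(0.5, 0.5))  # 0.0: the observer moves along with the object
```

Nothing in this calculation mentions light. That light happens to move at this particular speed is the empirical input.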