How to solve linear equations

Impossible but true: a new approach to linear systems

Prasad Raghavendra is an expert in many aspects of complexity theory, especially the foundations of approximation theory. He was until recently our colleague at Georgia Tech, but has now moved on to Berkeley. He will be greatly missed at Tech.

Today I want to talk about a brilliant new result that Prasad has on linear equations.

I was recently at the advisory meeting for the Berkeley Simons Theory Institute. During the meeting we had two presentations on new areas, besides talks on planned special projects. One was given by Prasad on theory, and almost in passing he mentioned that he had a way to solve linear systems. I liked the whole talk, but his almost casual comment surprised me. It seemed to me to be an entirely new approach to solving linear systems. My initial thought was that it must be something well known, but something I did not know.

As soon as he finished his talk we had a break, and I ran up to thank him for a wonderful talk. I quickly asked whether his result on linear systems was new, or whether it was something I had missed. He answered, to the several of us who were now waiting to hear, that it was something he had just proved. I was relieved: I am not completely out of it.

I asked if we could write about this result and he agreed. Even better, he wrote up a draft paper with just the algorithm and its analysis, which are part of larger results and projects that were the body of his talk.


Solving Linear Systems

The solving of linear systems of equations is ancient, dating back to at least 600 BCE. It is of extreme importance, and still an active area of research. For arbitrary systems Gaussian Elimination is still quite powerful. A historical note: apparently Carl Gauss only used the method named after him on six-variable problems. In those days there was no notion of "I can solve the general case in time cubic in {n}." There was no "n": of course there was the letter "n," but not the notion of solving a problem of size determined by {n}.

Of course now we have faster methods than this for general systems, methods that descend from the famous breakthrough of Volker Strassen. For special systems there are almost-linear-time methods, but these all require that the system have special structure, and work over the reals.

The Idea, and a Problem

Prasad’s work is for solving linear systems over finite fields, specifically {\mathbb{F}_p} with {p} prime. For example, consider these equations over the field of two elements:

\displaystyle \begin{array}{rcl} u + w + z &=& 1 \\ v + w + x + y &=& 0\\ u + v + w + x + y + z &=& 1 \\ \end{array}

Such systems are of great importance in theory, and it shocked me that he found an entirely new approach to solving them. Note that Gaussian elimination happens to work well in this example: adding the second and third equations yields {u + z = 1}, and substituting this into the first equation gives {w = 0}. However, in general this method requires sifting through each equation multiple times, and it determines all the solutions. Can we do better by considering each equation only once, and by needing only some solution?
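To make the example concrete, here is a minimal sketch of Gaussian elimination over the field of two elements, applied to the three equations above. The function name and row encoding are my own choices for illustration; arithmetic mod 2 is just XOR.

```python
# Gaussian elimination over F_2 for the example system:
#   u + w + z = 1,  v + w + x + y = 0,  u + v + w + x + y + z = 1
# Each row is a bit-list [coefficients..., right-hand side].

def gauss_f2(rows, n):
    """Reduce an augmented system over F_2 in place; return the reduced rows."""
    pivot_row = 0
    for col in range(n):
        # find a row at or below pivot_row with a 1 in this column
        for r in range(pivot_row, len(rows)):
            if rows[r][col]:
                rows[pivot_row], rows[r] = rows[r], rows[pivot_row]
                break
        else:
            continue  # no pivot in this column
        # XOR the pivot row into every other row that has a 1 here
        for r in range(len(rows)):
            if r != pivot_row and rows[r][col]:
                rows[r] = [a ^ b for a, b in zip(rows[r], rows[pivot_row])]
        pivot_row += 1
    return rows

system = [
    [1, 0, 1, 0, 0, 1, 1],  # u + w + z = 1
    [0, 1, 1, 1, 1, 0, 0],  # v + w + x + y = 0
    [1, 1, 1, 1, 1, 1, 1],  # u + v + w + x + y + z = 1
]
reduced = gauss_f2(system, 6)
for row in reduced:
    print(row)
```

Running this reduces the system to the rows for {u + z = 1}, {v + x + y = 0}, and {w = 0}, matching the hand derivation above. Note that each row is revisited several times during elimination, which is exactly the "sifting through each equation multiple times" the text mentions.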

So what does he do?

He starts with a random set of vectors {V_{0}}. Then he checks which ones satisfy the first equation. If the vectors are random, then about half of them will satisfy the equation. He throws away all the vectors that do not satisfy it, and calls the remaining set {S_1}. This is not a typo; there is a reason for using {S} rather than {V}, as you will soon see.


The obvious idea is to iterate with {S_1} on the second equation: About half the vectors in {S_1} will satisfy it; let {S_2} be those that do. Vectors in {S_2} still satisfy the first equation, so the winnowing-down process that continues by taking the vectors in {S_2} that satisfy the third equation finds solutions to all considered equations. The problem is that with high probability it winnows down to zero before all {m} equations are satisfied, unless you start with {V_0} having on the order of {2^m} vectors, which is too many.
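The naive winnowing process above can be sketched in a few lines. This is only an illustration of the halving behavior, not Prasad's actual algorithm; the dimensions, the random equations, and the function names are all made up for the demonstration.

```python
import random

# Naive winnowing over F_2: start with random candidate vectors and keep
# only those satisfying each equation in turn. An equation is a pair
# (coeffs, rhs); a vector v satisfies it when <coeffs, v> = rhs (mod 2).

def satisfies(vec, coeffs, rhs):
    return sum(c & x for c, x in zip(coeffs, vec)) % 2 == rhs

def winnow(vectors, equations):
    sizes = [len(vectors)]
    for coeffs, rhs in equations:
        vectors = [v for v in vectors if satisfies(v, coeffs, rhs)]
        sizes.append(len(vectors))  # roughly halves at each step
    return vectors, sizes

random.seed(0)
n, m = 20, 10                      # 20 variables, 10 equations (illustrative)
equations = [([random.randint(0, 1) for _ in range(n)], random.randint(0, 1))
             for _ in range(m)]
V0 = [[random.randint(0, 1) for _ in range(n)] for _ in range(4096)]
survivors, sizes = winnow(V0, equations)
print(sizes)
```

Since the set roughly halves at each of the {m} steps, surviving all {m} equations requires {|V_0|} on the order of {2^m}, which is exactly the obstacle described above.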

How to solve this problem? The surprising answer is that one does not need to start with a large initial {V_0} to maintain a high probability of catching a solution to all the equations. It suffices to maintain a suitably large set {V_{k-1}} of solutions to the first {k-1} equations. But how does he do this?

The Algorithm

Here is the actual algorithm from his paper draft.

As always, see his paper for the details and the full proof that the method works. That is the hard part. Actually, I take that back: the creation of the algorithm and its proof of correctness are equally tricky. Indeed, the fact that there is a new algorithm at all is perhaps the most surprising thing.
