<D <M <Y
Y> M> D>

I got called for jury duty. Then my jury service got postponed every day for a week. The jury commissioner's office says that, if your jury service is postponed past the end of the week, your jury service is deemed completed and you don't need to appear. That's exactly what's happened to me. So I've now served jury duty in the City and County of San Francisco, without doing anything more than picking up the telephone a couple of times.

That was my first experience with jury duty. I don't believe I would survive voir dire if I were actually summoned to appear on a criminal jury, but it might be vaguely possible in a civil case. The trouble starts with the fact that I work for lawyers who spend so much time representing defendants that, when some of them actually ended up representing plaintiffs in a declaratory judgment case, it took them days to get used to referring to their clients as "Plaintiffs" in court filings!

I used to work for Bill Saphir at Berkeley Lab. He taught me a lot of things. One of them was an argument that using n computers instead of one to attack a problem should never make the solution more than n times faster; in an ideal case, it might make the solution n times faster, but it should never be n+k times faster. Usually, it would be something less than n. I thought about this when reading about the mathematical techniques Hal Finney and others noted could be used to produce hard-to-verify signatures: you might be able to use problems that are particularly hard to parallelize. And it is clear that there are some problems that are far harder than others; problems that have a simple way to adapt them to parallel processing (for example, a brute-force search that doesn't depend on prior results, or independently processing large numbers of data points or records) were called "embarrassingly parallel", but there are problems that are just the reverse. (Searching for prime numbers is very easy to parallelize, but verifying that a given large number is prime is pretty hard to parallelize. Many physics simulations are pretty hard to parallelize. Just how hard is something that could be discussed for quite a while, and of course people are still getting Ph.Ds by discussing it.)

So Bill said, for example, that if you have 10 PCs instead of one, maybe you could solve a problem up to 10 times as fast, but never any faster. (We have to ignore considerations about things like running out of memory -- it's purely a question of the number of identical CPUs.)

Bill's argument for this was that, if you knew a parallel algorithm for computing something that was faster than a sequential algorithm, you ought to be able to design a simulator that would run on the original machine and simulate all the other machines. Or you could have a multitasking operating system, and start 10 processes on the single machine and let them communicate with each other.

A friend of mine made an objection to this later on; basically, she said that it assumes that the overhead of the simulator (or of context switches and interprocess communication, or of whatever corresponds to interprocess communication if you somehow combine the logic of the 10 processes into a single program) is less than the latency of network communications. I think this objection is strong. There might be some machines where context switches or interprocess communication are very slow, and there might be some networks that are extremely fast. In an extreme case, suppose it takes a week to look something up in RAM, but just two seconds to send or receive data between machines on a LAN. Then using the LAN would give a dramatic speedup for some computations relative to using multitasking or writing a simulator (because of all the expensive memory accesses involved in the latter case). So if you're talking about actual machines, you can't really prove rigorously that it might not be n+k times as fast. Bill's argument was incomplete because it depended on a hidden assumption.

But you can try to make the additional assumption that network communications are never faster than any kind of local operation for which they could conceivably substitute (like memory access, or, in the case of a simulator or multitasking OS, task switching overhead plus memory access). If you add that assumption, I think Bill's argument is reasonable.

It's interesting to see that Intel has added what it calls hyperthreading to a number of recent CPUs. Hyperthreading makes a single CPU look like two CPUs to software, so that software thinks it's running on a multi-CPU machine with SMP, even though there's just one chip. You might think that Bill's argument indicates that hyperthreading is useless, because you can't gain anything from trading your existing CPU for 2 CPUs that are half as fast. However, that application of the argument is wrong for at least two reasons. One is that you might want your computer to be able to do something other than just solve a particular math problem as quickly as possible. For example, you might want to have the computer service an interrupt (like respond to network traffic or user input) as quickly as possible while also doing other things. In that case, having a dedicated CPU for the particular task that you want to be really fast would be useful in terms of making that task happen quickly. Similarly, if you wanted to run a real-time operating system, having an entire CPU to dedicate to it might be helpful, because then you could avoid sharing that CPU with other tasks.

The other reason is that the two CPUs are not actually half as fast. Apparently Intel has put in some redundant circuitry so that both CPUs can perform certain operations concurrently; they can't perform all operations concurrently, but they can perform some of them concurrently. So that extra circuitry actually does represent more processing capability and there is no paradox in saying that the total processing capacity of a hyperthreaded CPU is greater. It actually has more processing logic built in, so that some tasks can be parallelized with no slowdown at all.


[Main]
Support Bloggers' Rights!
Support Bloggers' Rights!


Contact: Seth David Schoen