But then something changed. Leakage current reared its ugly head. As transistors became smaller they also became less substantial. Once, the trickle of current that leaked through a transistor when it was off was tiny compared to the stream it took to switch it from off to on. But as transistors shrank they became more and more permeable. And to make it even worse, shrinking transistors meant that even if overall power usage remained the same, that power was concentrated in a smaller and smaller area.
Something had to give, and it did. After the ill-fated Pentium 4, Intel figured out that using more, slower transistors spread the power usage out over a larger area. The lower voltage those transistors operated at also meant that leakage retreated, and was once again less important than the fundamental power needed to drive those transistors from one state to another.
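To put rough numbers on why slower and lower-voltage wins, here is a minimal sketch using the usual textbook approximation for switching power, P = a * C * V^2 * f. The function name and every number below are made up for illustration and are in arbitrary units; this models only switching power, not leakage, and is not a description of any real chip.

```python
def dynamic_power(capacitance, voltage, frequency, activity=1.0):
    """Textbook CMOS switching-power approximation: P = a * C * V^2 * f."""
    return activity * capacitance * voltage ** 2 * frequency


# Illustrative, assumed numbers in arbitrary units (not measurements).
# One core clocked fast at full voltage:
one_fast_core = dynamic_power(capacitance=1.0, voltage=1.0, frequency=2.0)

# Two cores at half the clock each, which lets the voltage drop a bit.
# Together they offer the same nominal number of cycles per second.
two_slow_cores = 2 * dynamic_power(capacitance=1.0, voltage=0.8, frequency=1.0)

print("one fast core: ", one_fast_core)
print("two slow cores:", two_slow_cores)
```

Because voltage is squared, even a modest voltage drop more than pays for the second core in this toy model, and the power that is left gets spread over twice the area.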
Adding more transistors to execute more instructions at once only goes so far, however, because often one instruction depends on the previous one, and finding places where they don't gets harder and harder the more you try to do it. So one way CPU designers tried to use all these extra transistors to speed things up was by duplicating the entire CPU, creating more cores. But what happens when you have more cores than you can really use? That's when we start entering the realm of dark silicon.
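One common way to estimate how quickly extra cores stop helping is Amdahl's law, which I'm bringing in here as an illustration. The sketch below assumes a program where 80% of the work can run in parallel; that figure and the function name are made up for the example.

```python
def amdahl_speedup(parallel_fraction, cores):
    """Amdahl's law: speedup = 1 / (serial + parallel / cores)."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)


# Assumed workload: 80% parallel, 20% stuck waiting on earlier results.
for cores in (1, 2, 4, 8, 16, 64):
    print(cores, "cores ->", round(amdahl_speedup(0.8, cores), 2), "x")
```

With 20% of the work stuck in a serial chain, the speedup can never reach 5x no matter how many cores you add, which is one way to end up with more cores than you can really use.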
It's a well-known fact that specialized hardware is almost always faster than general-purpose hardware. This is why we have graphics cards: a chip designed explicitly to do 3D graphics can make assumptions about what it's doing, it can neglect areas that aren't needed, and it can encode rules directly into silicon instead of having to fetch abstract rules from memory. If you have a graphics chip with a certain number of transistors and a certain power budget, it would take many general-purpose CPUs of the same size to do as much work.
But why stop with graphics? If specialized circuits can do a task faster and more efficiently, why not use them for as many things as possible? As Moore's law keeps working and the number of transistors you can cram onto a chip keeps going up, and as the number of general-purpose CPUs you can usefully cram onto said chip doesn't, it makes more and more sense to put specialized circuitry on board. Now mind you, those same transistors could be used for more cache, but as caches get bigger each additional transistor spent on cache becomes less and less valuable. Sooner or later a specialized processor for media encoding or encryption looks like a better use for those marginal transistors.
And sure enough, that's already happening. Intel's newest chips have specialized circuits for both encryption and media encoding. AMD is planning similar things, and it looks like the age of the co-processor is about to begin. This is called "dark silicon" because most of the time it just sits there, dark and unpowered, not using any power at all. But when you need to do something it's suited for, it thrums to life and tears through whatever task you throw at it before growing silent once again.
As time goes on we're likely to see more and more such accelerators come into being. What will be next, more Java accelerators? Physics accelerators? Haskell accelerators? Configurable blocks that can become whatever is needed? In any event, operating systems will have to evolve to dispatch tasks to whatever specialized hardware can handle them, or else emulate those specialized functions in software. It looks like Apple might be leading the way here, but the final result might end up looking very different.
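The dispatch-or-emulate idea can be sketched in a few lines. Everything here is hypothetical: the registry, the task name, and the function names are invented for illustration and do not correspond to any real operating system API. The only real library call is Python's standard `hashlib`, which stands in for the software path.

```python
import hashlib
from typing import Callable, Dict

# Hypothetical registry of accelerators found on this machine.
# It starts empty, as if no specialized hardware were present.
ACCELERATORS: Dict[str, Callable[[bytes], bytes]] = {}


def software_sha256(data: bytes) -> bytes:
    """Software emulation of the task on the general-purpose CPU."""
    return hashlib.sha256(data).digest()


# Every task we support must have a software fallback.
SOFTWARE_FALLBACKS: Dict[str, Callable[[bytes], bytes]] = {
    "sha256": software_sha256,
}


def dispatch(task: str, data: bytes) -> bytes:
    """Use an accelerator if one is registered, else fall back to software."""
    handler = ACCELERATORS.get(task) or SOFTWARE_FALLBACKS[task]
    return handler(data)


# With no accelerator registered, this takes the software path.
digest = dispatch("sha256", b"dark silicon")
print(digest.hex())
```

The caller never needs to know which path was taken, which is roughly the property an operating system would want: the same request works whether the silicon for it exists or not.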