Is it ever worth spending time on optimizations that yield just a 1-2% performance improvement? It's tempting to answer “just focus on features and quit being so obsessed with performance”, right?
At my previous day job, which was in video games, I once went through an optimization death-march (or maybe death-October, to be more precise). The sole goal was to get the 3D renderer to run at 30 fps minimum. The problem was that it dropped to 20 fps, and sometimes even lower, at certain camera angles. To add some “icing” on the “cake”, when you enable V-sync to avoid tearing, falling to 19.9 fps means you’re actually doing 15, because the monitor runs at 60 Hz, and if you haven’t managed to show a new frame within 3/60ths of a second (20 fps) since the last one, you have to wait until the next V-sync, which happens at 4/60ths of a second (15 fps). Now, the human eye runs at 24 fps (basically), so 30 fps is perfectly smooth, 20 fps is slightly uncomfortable but generally OK, but 15 fps is quite laggy.
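To make that V-sync arithmetic concrete, here is a minimal Python sketch. It assumes the simplest possible model (plain double-buffered V-sync, every frame taking the same time to render), and the function name `vsynced_fps` is made up for illustration; it is not code from the actual renderer.

```python
import math


def vsynced_fps(render_fps: float, refresh_hz: float = 60.0) -> float:
    """Displayed frame rate under a simplified double-buffered V-sync model.

    Assumption: every frame takes exactly 1/render_fps seconds to render,
    and a finished frame can only be shown on a refresh boundary.
    """
    # How many refresh intervals one frame occupies. A frame that misses a
    # boundary, even slightly, has to wait for the next one, hence the ceil.
    intervals = max(1, math.ceil(refresh_hz / render_fps))
    return refresh_hz / intervals


# The frame rates mentioned in the text, on a 60 Hz monitor.
for fps in (30.0, 20.0, 19.9, 19.1):
    print(f"renders at {fps} fps, displays at {vsynced_fps(fps)} fps")
```

Under this model, anything between 15 and just under 20 rendered fps is displayed at 15, which is why the last fraction of a frame per second mattered so much.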
Another problem was that there were no more major optimizations to pull out of the hat and save the day. (A battle-hardened graphics developer would silently insert minor, intentional, hidden “reserves” here and there over the course of the project, so that artists would do their best to hit the budget in that “reserved” version, then quickly use up all those reserves the week before shipping the gold master and boost the frame rate nicely. Well. If I’m ever back in video games, I’m definitely doing that.)
So I started with a certain especially bad camera angle that resulted in 19.1 fps or something like that. And kept trying all the minor changes I could come up with. Some of them weren’t even optimizations, actually, because once implemented, they’d hurt my precious-s-s fps.
Most of those optimizations were tiny. Changes that improved things by 0.1 fps, which is 0.5%, did get committed into trunk. Most of the changes were in 0.1 to 0.5 fps range. I got a huge one once that made a whopping 1.2 fps of an improvement. Huge. Once.
That was pretty exhausting. But a week or two later, we had 25+ fps min. That, in turn, was pretty satisfying. Also, that was a 30% improvement over 19 fps that initially seemed “impossible” to optimize.
Optimizations in general, including tiny 2% optimizations, pile up. And they pile up in a non-linear fashion, because each one multiplies the previous result rather than adding to it. 30 different 2% optimizations result in a 1.81x improvement, not a 1.6x one. 10 different 5% ones result in 1.63x, not 1.5x. Of course, big optimizations pile up even better. But you rarely get many of those if you write your code more or less properly.
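The compounding is easy to check. Here is a small Python sketch that compares multiplying the gains against naively adding them; the helper names `compounded_speedup` and `naive_speedup` are hypothetical, picked just for this example, and it assumes every optimization yields the same relative gain on top of the previous ones.

```python
def compounded_speedup(gain: float, count: int) -> float:
    """Total speedup when `count` optimizations of `gain` each multiply together."""
    return (1.0 + gain) ** count


def naive_speedup(gain: float, count: int) -> float:
    """The (wrong) linear estimate: just add the percentages up."""
    return 1.0 + gain * count


# The two cases from the text: 30 gains of 2%, and 10 gains of 5%.
for gain, count in ((0.02, 30), (0.05, 10)):
    print(
        f"{count} x {gain:.0%}: "
        f"compounded {compounded_speedup(gain, count):.2f}x, "
        f"naive {naive_speedup(gain, count):.2f}x"
    )
```

The gap between the two estimates widens as the number of small wins grows, which is the whole point of not dismissing them.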
So do we hunt down every single possible 2% optimization in Sphinx? No, we definitely don’t. A 20x difference in code that gets executed once on startup and eats 0.001 sec anyway? Could not care less. A new feature that introduces a 1% overall indexing slowdown that is very complicated (if at all possible) to eliminate? Introduce this delay (with a heavy sigh), it’s an extra 36 seconds per hour after all. But we don’t blindly dismiss these tiny things either, and when it takes reasonable effort to write slightly more efficient code, we go for it. Because that piles up.
Not that I wouldn’t do a death-march on Sphinx if I had the chance. But next time I’d rather start from the analogue of “25 fps”.