I think the most important thing to understand is that, because time-to-market and employees' salaries are both valuable, development time is a resource that must be deployed efficiently. Thus if you are going to spend time optimizing, it's important that your work delivers real value. This means two things, I suppose:
- Don't optimize code that's already fast enough.
- Don't optimize code that is so irrelevant to your program's performance that it doesn't translate into improved user experience.
Similarly, we could try to optimize the performance of our menu bar drawing code. The menu bar is drawn often enough that it affects users. But menu bar drawing is such a tiny fraction of total run time that even if we made it 100x faster, the effect on overall program speed would be too small to measure.
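The arithmetic behind this is Amdahl's law: if a piece of code accounts for a fraction p of total run time and is made s times faster, the whole program speeds up by 1 / ((1 - p) + p / s). A minimal C++ sketch follows. The 0.1% share assumed for the menu bar and the 40% share assumed for a hot path are made-up numbers for illustration, not measurements.

```cpp
#include <cassert>

// Amdahl's law: overall speedup when code that takes `fraction` of total
// run time is made `local_speedup` times faster.
double overall_speedup(double fraction, double local_speedup) {
    assert(fraction >= 0.0 && fraction <= 1.0);
    assert(local_speedup > 0.0);
    return 1.0 / ((1.0 - fraction) + fraction / local_speedup);
}

// Hypothetical shares of total run time, chosen only for illustration.
constexpr double kMenuBarShare = 0.001;  // assumed: 0.1% of run time
constexpr double kHotPathShare = 0.40;   // assumed: 40% of run time
```

With these assumed numbers, `overall_speedup(kMenuBarShare, 100.0)` comes out around 1.001, a gain of about 0.1%, while merely doubling the speed of the hot path, `overall_speedup(kHotPathShare, 2.0)`, gives 1.25.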
So in that context, I can only say this about virtual functions: virtual functions are an organization tool that reduces the total time it takes to write code by making that code easier to manage. Since the performance of a program is already limited by how many developer hours are available to optimize:
- Most of the time, the productivity gained from virtual functions will outweigh their performance cost.
- I suspect that if you are really worried about speed, you'll get better program performance by using "more expensive" OOP techniques in most parts of the program and using the freed-up development time to optimize the few parts that really matter.
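One way to picture this trade-off is the sketch below: a virtual interface organizes the scene, so each object costs one virtual call per frame, while the per-vertex work inside the hot object stays a plain loop. The names `Drawable`, `MenuBar`, `Mesh`, and `draw_frame` are hypothetical and are not taken from X-Plane or any real codebase.

```cpp
#include <cassert>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical interface: every scene object knows how to draw itself.
class Drawable {
public:
    virtual ~Drawable() = default;
    // Appends this object's vertex data to the frame's output buffer.
    virtual void draw(std::vector<float>& out) const = 0;
};

// Rarely-critical UI code: clarity matters more than speed here.
class MenuBar : public Drawable {
public:
    void draw(std::vector<float>& out) const override {
        out.insert(out.end(), {0.0f, 1.0f, 2.0f, 3.0f});  // placeholder quad
    }
};

// Hot path: one virtual call to get in, then a tight loop over plain data.
class Mesh : public Drawable {
public:
    explicit Mesh(std::vector<float> vertices) : vertices_(std::move(vertices)) {}
    void draw(std::vector<float>& out) const override {
        for (float v : vertices_) out.push_back(v);  // no virtual call per vertex
    }
private:
    std::vector<float> vertices_;
};

// One virtual call per object per frame, however many vertices each one holds.
std::vector<float> draw_frame(const std::vector<std::unique_ptr<Drawable>>& scene) {
    std::vector<float> out;
    for (const auto& object : scene) {
        assert(object != nullptr);
        object->draw(out);
    }
    return out;
}
```

If a profile later showed that `Mesh::draw` dominated, the tuning effort would go into that one loop, and the rest of the scene code could stay as it is.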
(An instrumenting profiler changes your code to record timing data; an adaptive sampling profiler uses the hardware to interrupt the program at fixed intervals and note which code is running. While instrumenting profilers would seem to provide more accurate data, I don't like them because the profiling overhead often changes performance characteristics so much that you can't tell what the real problem is. This is particularly a problem for OpenGL-based applications, where half the performance is on the GPU and thus the CPU-GPU execution speed ratio matters a lot.
Adaptive sampling profilers work because the most important code to fix is, by definition, running most of the time, so most of the samples will fall into functions you care about. For example, when profiling X-Plane we can find up to 40% of samples falling into OpenGL, mesh and OBJ draw code, which is indeed the most important part of X-Plane to tune. Sure, the adaptive sampling profiler gives us really unrealistic data about the cost of drawing the menu bar, but the time spent there is so small that we don't care.
Shark is strictly an adaptive sampling profiler. VTune comes with both types. GCC and most compiler packages also provide an instrumenting-profiler option.)
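To make the overhead concern concrete, here is a minimal hand-rolled sketch of what instrumentation does to a function. The `ScopedTimer` class, the `timing_table` helper, and the `add_one` function are hypothetical; real instrumenting profilers insert their own hooks automatically, and this is not how any particular tool is implemented.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>
#include <map>
#include <string>

// Per-function totals gathered by the instrumentation (hypothetical layout).
struct TimingRecord {
    std::uint64_t calls = 0;
    std::chrono::nanoseconds total{0};
};

// Global table of records, keyed by function name.
std::map<std::string, TimingRecord>& timing_table() {
    static std::map<std::string, TimingRecord> table;
    return table;
}

// Reads the clock on entry and exit of a scope and records the difference.
class ScopedTimer {
public:
    explicit ScopedTimer(const char* name)
        : name_(name), start_(std::chrono::steady_clock::now()) {
        assert(name != nullptr);
    }
    ~ScopedTimer() {
        auto elapsed = std::chrono::steady_clock::now() - start_;
        TimingRecord& record = timing_table()[name_];
        record.calls += 1;
        record.total += std::chrono::duration_cast<std::chrono::nanoseconds>(elapsed);
    }
    ScopedTimer(const ScopedTimer&) = delete;
    ScopedTimer& operator=(const ScopedTimer&) = delete;
private:
    const char* name_;
    std::chrono::steady_clock::time_point start_;
};

// A tiny function: the timing code around it costs more than the work itself.
int add_one(int x) {
    ScopedTimer timer("add_one");  // this line is the instrumentation
    return x + 1;
}
```

Every call to `add_one` now pays for two clock reads and a map lookup on top of a single addition, so the recorded time for a small, frequently called function mostly measures the instrumentation rather than the function.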
One more thought: X-Plane is fast by design -- that is, some of the things we did to have high throughput from the rendering engine were design decisions that had to be made early on. This goes against the idea of opportunistic optimization. If we were planning these ideas from day one, how did we know from running the program that they would matter? Did we just guess?
The answer is: we knew where to optimize in X-Plane 8 from the performance of X-Plane 7. That is, we use the design-induced performance limitations of a given revision to direct our development into the next version. This is just opportunistic optimization combined with code refactoring.