Lessons learned from Performance Testing a complex application
After two years of being too busy to blog, I now have a bit more time to share some of the lessons I have learned while doing Performance Testing for a complex application (Enterprise-level, several million lines of code kind of application).
- Complex applications usually mean lots of different components: network, hardware, software... When producing performance data, you need to gather data about how each component behaves. It is not useful to say "the application is slow"; you need to say "the application has a bottleneck in the disk I/O".
- You will be able to gather pretty much all the information you need about how your system is working using the performance counters built into your OS (Windows, Linux...).
- Use only the performance counters you need, as they consume resources. Once you see that the values reported by a counter are "good", you can stop collecting it.
- When the application is far from being optimized, run all the counters and make a list of the worst ones, grouping them by logical area (data repository, UI, business processes...).
- With the previous list, test the individual components described by those counters. For example, you may focus on the data repository layer, looking at disk I/O, hard/soft page faults, etc.
- Based on the previous results, then concentrate on a single counter. Using the previous example, you may focus on the disk I/O counter if the other counters look OK.
- Focus first on the low-hanging fruit: most performance problems are related to network bottlenecks, lack of memory, disk I/O and pegged CPUs. Once these are removed, you can focus on secondary problems: too many threads, unoptimized queries, unmaintained databases, etc. After that, you do not have many options. The most readily available one is to throw more hardware at the problem. If the performance is still unsatisfactory, then rearchitecting your application is usually the only option left.
- Unless there are egregious errors, refactoring your code helps very little, if at all.
- Use automation to create load. To measure response time to user actions, use a human with a stopwatch.
This last point may be surprising to some. In my experience, introducing timers in the code doesn't work. First, it is not always possible; second, in practice it is pretty much impossible to capture certain events: you can capture when an image starts to be displayed on the screen, but not when it is shown entirely. If you want to measure the user experience, it is better to do it by hand, assuming the role of your user.
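
The triage step from the list (run all the counters, list the worst ones grouped by logical area, then drop the ones that look "good") can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the counter names, sample values and acceptable limits are invented, and rank_worst_counters and counters_to_keep are hypothetical helpers, not part of any monitoring tool. In practice the samples would come from whatever your OS counters export.

```python
from collections import defaultdict

# Hypothetical samples: (logical area, counter name, observed value, acceptable limit).
# All names, values and limits are invented for illustration only.
SAMPLES = [
    ("data repository", "disk queue length", 9.0, 2.0),
    ("data repository", "hard page faults/sec", 40.0, 50.0),
    ("business processes", "cpu utilisation %", 97.0, 80.0),
    ("UI", "thread count", 120.0, 200.0),
    ("network", "output queue length", 3.0, 2.0),
]


def rank_worst_counters(samples):
    """Group counters that exceed their limit by area, worst first."""
    by_area = defaultdict(list)
    for area, name, value, limit in samples:
        ratio = value / limit  # how far past the limit the counter is
        if ratio > 1.0:  # keep only the counters that look "bad"
            by_area[area].append((name, round(ratio, 2)))
    # Within each area, put the worst counter first.
    for counters in by_area.values():
        counters.sort(key=lambda item: item[1], reverse=True)
    # Order the areas by their single worst counter.
    return sorted(by_area.items(), key=lambda kv: kv[1][0][1], reverse=True)


def counters_to_keep(samples):
    """Counters still worth collecting; the "good" ones can be switched off."""
    return [name for _, name, value, limit in samples if value > limit]


if __name__ == "__main__":
    for area, counters in rank_worst_counters(SAMPLES):
        print(area, counters)
```

With these made-up numbers the data repository area comes out on top because of its disk queue, which matches the "concentrate on a single counter" step: that is the one to dig into first, while the page fault and thread counters can be dropped from the next run.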
 
