High-Frequency Trading (HFT) and low-latency trading are becoming one of the few preserves of C++. Because the language is amenable to extensive optimisation, including micro-optimisation, it has proved highly effective in this domain: some of the major trading systems are hybrid FPGA/C++ solutions or native C++ solutions.
I shall provide an analysis of some micro-optimisation techniques that have been used successfully, together with an investigation of the pitfalls that may arise. For example: performance anomalies led to the discovery of quirks in the assembler generated by different compiler versions. Exactly what is static branch prediction, and how is it (ab)used? Why is counting the number of set bits of even the remotest interest? The "curious case of the switch-statement" will also be investigated. Finally, the performance of a fully functional FIX-to-MIT/BIT format exchange link that uses all of these techniques will be examined.
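To give a flavour of two of the topics named above, here is a minimal sketch, not taken from the exchange link itself. The names count_set_bits and is_valid_price are hypothetical and exist only for illustration. The sketch assumes a GCC- or Clang-compatible compiler for the branch hint (__builtin_expect) and falls back to a plain branch elsewhere. Whether the hint or the bit count actually changes the generated assembler depends on the compiler version and target flags, which is exactly the kind of quirk the talk examines.

```cpp
#include <bitset>
#include <cstdint>

// Hypothetical helper: count the set bits in a 64-bit mask.
// std::bitset::count is portable; whether it becomes a single
// population-count instruction depends on the compiler and target flags.
inline unsigned count_set_bits(std::uint64_t mask) noexcept {
    return static_cast<unsigned>(std::bitset<64>(mask).count());
}

// Hypothetical hot-path check: a non-positive price is assumed to be rare,
// so the compiler is told to lay out the code in favour of the valid case.
inline bool is_valid_price(long price) noexcept {
#if defined(__GNUC__) || defined(__clang__)
    // Static branch hint: the condition is expected to be false (0).
    if (__builtin_expect(price <= 0, 0)) {
        return false;
    }
#else
    // No hint available: plain branch with identical behaviour.
    if (price <= 0) {
        return false;
    }
#endif
    return true;
}
```

The hint changes only the expected code layout, never the result, so a wrong guess costs performance rather than correctness; measuring on the actual compiler version is the only reliable check.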