This is part one of an open-ended series on writing a super fast math library using SIMD. The goal of the project is to write a math library that is both cross-platform and very, very fast.
I've had a lot of fun doing it and have learned quite a bit, so I thought I'd share what I've done so far.
Let's skip the small talk and get down to business. The goal is to write vector, matrix and quaternion types that are 1) correct, 2) cross-platform, and 3) fast. All of the development so far has been done using Microsoft Visual Studio 5 under Windows.
To get timing results, I'm using Agner Fog's test programs for measuring clock cycles and performance monitoring. They use the rdtsc instruction to count clock cycles, and the CPU's performance monitoring counters to track cache misses, branch mispredictions and a host of other stuff.
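To give a feel for what cycle counting with rdtsc looks like, here is a minimal sketch. It is not Agner Fog's harness, just a bare-bones illustration; the helper name `ticks_for` is made up for this post, and I'm assuming an x86 compiler that provides the `__rdtsc` intrinsic (`<intrin.h>` on MSVC, `<x86intrin.h>` on GCC and Clang).

```cpp
#include <cstdint>
#if defined(_MSC_VER)
#include <intrin.h>      // __rdtsc on MSVC
#else
#include <x86intrin.h>   // __rdtsc on GCC/Clang
#endif

// Hypothetical helper (not part of any library): runs fn once and returns
// the number of time stamp counter ticks that elapsed around the call.
template <typename Fn>
std::uint64_t ticks_for(Fn&& fn) {
    const std::uint64_t start = __rdtsc();  // read the counter before the work
    fn();                                   // the code being measured
    const std::uint64_t stop = __rdtsc();   // read it again afterwards
    return stop - start;
}
```

A single reading like this is noisy: rdtsc is not a serializing instruction, and interrupts or cache state can skew the count. A real harness repeats the measurement many times and looks at the best or typical result, which is a big part of why I'm using an existing tool instead of rolling my own.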
For intrinsics, I am strictly limiting myself (at the moment) to SSE version 1. It is supported on Intel, AMD, and VIA processors going back quite a few generations.
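For anyone who hasn't seen SSE1 intrinsics before, here is a small sketch of the kind of code this restriction implies. The `Vec4` type and `add` function are placeholders I'm using for illustration, not the library's actual types; the only real API here is what `<xmmintrin.h>` provides (`__m128`, `_mm_load_ps`, `_mm_add_ps`, `_mm_store_ps`). I'm also assuming a compiler with C++11 `alignas`.

```cpp
#include <xmmintrin.h>  // SSE1 intrinsics: __m128, _mm_load_ps, _mm_add_ps, ...

// Hypothetical 4-float vector; name and layout are illustrative only.
// The 16-byte alignment is required by the aligned load/store used below.
struct Vec4 {
    alignas(16) float v[4];
};

// Adds two vectors component-wise using one SSE addition on all four lanes.
inline Vec4 add(const Vec4& a, const Vec4& b) {
    const __m128 va = _mm_load_ps(a.v);        // aligned load of 4 floats
    const __m128 vb = _mm_load_ps(b.v);
    Vec4 out;
    _mm_store_ps(out.v, _mm_add_ps(va, vb));   // add lanes, store the result
    return out;
}
```

Adding `{1, 2, 3, 4}` to `{10, 20, 30, 40}` this way yields `{11, 22, 33, 44}`, with all four additions done by a single instruction instead of four scalar ones.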
The platforms I'm targeting are the Xbox, PlayStation 3, and, of course, the PC, but all of the information I'm putting here is PC oriented. The platforms are vastly different and, while I'd love to investigate all three in depth... I really don't have the time. :)