hcavolsdsadgadsg I'm a spammer
Reputation: 26
Joined: 11 Jun 2007 Posts: 5801
Posted: Sat Dec 11, 2010 3:04 am Post subject: The compiler always wins
Thought this was interesting.
In the first code snippet, I figured the compiler was going to generate a bunch of movss stores... it didn't. The second code snippet was closer to what I thought would be ideal, and closer to what the compiler actually generated for the first.
But despite the similarities, the first one is massively faster in practice. I'm calling it 1,048,576 times per frame in this little benchmark.
test1: compiler decide: 54.64 fps, 18.30 ms
test2: compiler decide: 32.16 fps, 31.10 ms
test1: mul forced inline: 46.43 fps, 21.54 ms
test2: mul forced inline: 30.10 fps, 33.22 ms
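As a quick sanity check on these numbers, the frame times can be converted back into fps and into a rough per-call cost. This is only a sketch: the helper names fps_from_ms and ns_per_call are made up for illustration, and the per-call figure assumes the whole frame is spent in translate(), which the benchmark may not guarantee.

```cpp
#include <cassert>
#include <cmath>

// Frames per second for a given frame time in milliseconds.
inline double fps_from_ms(double frame_ms) {
    assert(frame_ms > 0.0);
    return 1000.0 / frame_ms;
}

// Average nanoseconds per call, ASSUMING the entire frame time is spent
// inside the calls being measured (an upper bound on the real cost).
inline double ns_per_call(double frame_ms, long calls) {
    assert(calls > 0);
    return frame_ms * 1.0e6 / static_cast<double>(calls);
}
```

Fed the "compiler decide" timings and 1,048,576 calls per frame, this works out to roughly 17.45 ns per call for test1 and 29.66 ns for test2, so the second version costs about 1.7x as much per call.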
Code:
fast!
inline void Matrix::translate(f32 x, f32 y, f32 z)
{
    Matrix m;
    m._11 = 1.0f; m._12 = 0.0f; m._13 = 0.0f; m._14 = 0.0f;
    m._21 = 0.0f; m._22 = 1.0f; m._23 = 0.0f; m._24 = 0.0f;
    m._31 = 0.0f; m._32 = 0.0f; m._33 = 1.0f; m._34 = 0.0f;
    m._41 = x;    m._42 = y;    m._43 = z;    m._44 = 1.0f;
    mul(m);
}
I was really surprised to see that it actually generated movaps.
movaps xmm0, XMMWORD PTR __xmm@2
movaps XMMWORD PTR _m$[esp+64], xmm0
movaps xmm0, XMMWORD PTR __xmm@0
movaps XMMWORD PTR _m$[esp+80], xmm0
movaps xmm0, XMMWORD PTR __xmm@3
movaps XMMWORD PTR _m$[esp+96], xmm0
xorps xmm0, xmm0
movss DWORD PTR _m$[esp+112], xmm0
movss DWORD PTR _m$[esp+116], xmm0
movss xmm0, DWORD PTR _z$[ebp]
movss DWORD PTR _m$[esp+120], xmm0
movss xmm0, DWORD PTR __real@3f800000
lea eax, DWORD PTR _m$[esp+64]
movss DWORD PTR _m$[esp+124], xmm0
call ?mul@Matrix@math@bnr@@QAEXABU123@@Z ; bnr::math::Matrix::mul
slow!
inline void Matrix::translate(f32 x, f32 y, f32 z)
{
    Matrix m;
    f128 zilch = _mm_setzero_ps();
    m.m128[0] = zilch;
    m.m128[1] = zilch;
    m.m128[2] = zilch;
    m.m128[3] = zilch;
    m._11 = 1;
    m._22 = 1;
    m._33 = 1;
    m._41 = x; m._42 = y; m._43 = z; m._44 = 1;
    mul(m);
}
xorps xmm1, xmm1
xorps xmm0, xmm0
movaps XMMWORD PTR _m$[esp+112], xmm0
movaps XMMWORD PTR _m$[esp+64], xmm0
movaps XMMWORD PTR _m$[esp+80], xmm0
movaps XMMWORD PTR _m$[esp+96], xmm0
movss xmm0, DWORD PTR __real@3f800000
movss DWORD PTR _m$[esp+112], xmm1
movss DWORD PTR _m$[esp+116], xmm1
movss xmm1, DWORD PTR _z$[ebp]
lea eax, DWORD PTR _m$[esp+64]
movss DWORD PTR _m$[esp+64], xmm0
movss DWORD PTR _m$[esp+84], xmm0
movss DWORD PTR _m$[esp+104], xmm0
movss DWORD PTR _m$[esp+120], xmm1
movss DWORD PTR _m$[esp+124], xmm0
call ?mul@Matrix@math@bnr@@QAEXABU123@@Z ; bnr::math::Matrix::mul
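For anyone who wants to poke at this themselves, here is a minimal standalone sketch of the two construction strategies. It is not the original Matrix class: Mat4, translation_scalar, translation_zero_then_patch, same_matrix and check_variants_agree are hypothetical names, the layout is assumed to be 16 row-major floats with 16-byte alignment, and on non-SSE targets the zeroing falls back to memset. It only shows that both variants build the same 64 bytes, so any speed difference comes from how the stores are issued, not from the result.

```cpp
#include <cassert>
#include <cstring>
#if defined(__SSE__) || defined(_M_X64) || (defined(_M_IX86_FP) && _M_IX86_FP >= 1)
#include <xmmintrin.h>
#define SKETCH_HAS_SSE 1
#else
#define SKETCH_HAS_SSE 0
#endif

// Hypothetical stand-in for the post's Matrix: 16 floats, row-major,
// 16-byte aligned so that aligned 128-bit stores are legal.
struct Mat4 {
    alignas(16) float f[16];
};

// Variant 1: assign all 16 elements as scalars (mirrors the "fast" listing).
inline Mat4 translation_scalar(float x, float y, float z) {
    Mat4 m;
    m.f[0]  = 1.0f; m.f[1]  = 0.0f; m.f[2]  = 0.0f; m.f[3]  = 0.0f;
    m.f[4]  = 0.0f; m.f[5]  = 1.0f; m.f[6]  = 0.0f; m.f[7]  = 0.0f;
    m.f[8]  = 0.0f; m.f[9]  = 0.0f; m.f[10] = 1.0f; m.f[11] = 0.0f;
    m.f[12] = x;    m.f[13] = y;    m.f[14] = z;    m.f[15] = 1.0f;
    return m;
}

// Variant 2: zero every row with a 128-bit store, then patch the
// non-zero elements with scalar stores (mirrors the "slow" listing).
inline Mat4 translation_zero_then_patch(float x, float y, float z) {
    Mat4 m;
#if SKETCH_HAS_SSE
    const __m128 zero = _mm_setzero_ps();
    for (int row = 0; row < 4; ++row)
        _mm_store_ps(m.f + 4 * row, zero);
#else
    std::memset(m.f, 0, sizeof m.f);  // fallback when SSE is unavailable
#endif
    m.f[0] = 1.0f; m.f[5] = 1.0f; m.f[10] = 1.0f;
    m.f[12] = x; m.f[13] = y; m.f[14] = z; m.f[15] = 1.0f;
    return m;
}

// Bitwise comparison of the 16 floats.
inline bool same_matrix(const Mat4& a, const Mat4& b) {
    return std::memcmp(a.f, b.f, sizeof a.f) == 0;
}

// Sanity check: both variants must produce identical contents.
inline void check_variants_agree(float x, float y, float z) {
    assert(same_matrix(translation_scalar(x, y, z),
                       translation_zero_then_patch(x, y, z)));
}
```

One guess at the gap, which I have not measured: in the second variant each of the first three rows gets a 128-bit store followed by a scalar store into the same 16 bytes, so a 128-bit load of that row inside mul may not be able to forward from a single store. Treat that as a hypothesis to profile, not an explanation.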