[ASM] advanced floating point instructions crash

 
SteveAndrew
Master Cheater
Reputation: 30

Joined: 02 Sep 2012
Posts: 323

Posted: Tue Sep 18, 2012 6:32 am    Post subject: [ASM] advanced floating point instructions crash

Hello! I was recently hacking a game where I had to add a value to a 32-bit float packed inside an xmm register, but it wasn't in the lowest 32 bits. That's where it got a little tricky, as I needed to add to that value only, without affecting the other 96 bits of the xmm register (it was the Z value, where X [bits 0-31] was the first thing packed into the xmm0 register and Z [bits 32-63] was the second).

(Normally an addss [scalar single-precision add] would've done fine, but that would've added to the X value only, and I needed to offset the Z value.)

After looking at the Intel instruction reference, I picked 'addps' and figured I could just add 0 to the other 3 doublewords besides the one I wanted to change. I wasn't sure at first whether a 'packed' instruction was what I needed, but now I think 'packed' just means that several values smaller than the register are packed side by side into the xmm# register and the instruction operates on all of them at once. There also wasn't any equivalent-looking instruction that didn't say packed.

As it turns out, it did work the way I expected, but only after I changed my code a little; my first attempt crashed. I'm posting here because I don't understand why it crashed, and I want to know what's wrong and how to make it work on future occasions.
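To make the scalar/packed distinction concrete, here is a small Python model of the two instructions. It is only a sketch, not how the CPU or Cheat Engine implements anything: the register is modeled as a list of four floats, and the helper names `f32`, `addss` and `addps` are made up for this illustration.

```python
import struct

def f32(x):
    """Round a Python float to single precision, as stored in one 32-bit lane."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

def addss(dst, src):
    """Model of a scalar add: only lane 0 changes, lanes 1-3 of dst pass through."""
    return [f32(dst[0] + src[0])] + list(dst[1:])

def addps(dst, src):
    """Model of a packed add: each of the four lanes is added independently."""
    return [f32(a + b) for a, b in zip(dst, src)]

# Lane 0 = X (bits 0-31), lane 1 = Z (bits 32-63), as described in the post.
xmm0 = [f32(v) for v in (50.0, 2.0, 86.0, 133.7)]
offset = [f32(v) for v in (0.0, -1.0, 0.0, 0.0)]

print("addss:", addss(xmm0, offset))  # lane 1 is untouched
print("addps:", addps(xmm0, offset))  # lane 1 goes from 2.0 to 1.0
```

In this model the packed add with zeros in the other lanes changes only the Z lane, which matches the behaviour described above.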

Here's a test script I made when I was checking to see how it behaved and after debugging this test script I could see that it was going to work...

ADDPS

Ok, the documentation states that the source operand can be either another xmm register or a 128-bit memory location...

I was trying to use it with a 128-bit memory location to avoid what I ended up doing, which was using another xmm register. (I had to push and pop it so as not to change its state.)

Code:

//Push xmm4
sub esp,10
movdqu dqword [esp],xmm4

//Pop xmm4
movdqu xmm4,dqword [esp]
add esp,10


Maybe I don't understand what it means by a 128-bit memory location. I thought it was just some place in memory holding 128 bits (4 * 4 bytes) worth of values that you want to add to the destination operand.
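As a quick check of the "4 * 4 bytes" arithmetic, this Python sketch packs four single-precision floats the way four consecutive `dd (float)` lines would lay them out (little-endian, back to back). The variable names are just illustrative.

```python
import struct

# Four 32-bit floats, little-endian, with no gaps between them.
one_twenty_eight_bits = struct.pack("<4f", 50.0, 2.0, 86.0, 133.7)
add_me = struct.pack("<4f", 0.0, -1.0, 0.0, 0.0)

print(len(one_twenty_eight_bits) * 8)  # 128
print(add_me.hex())                    # 00000000000080bf0000000000000000
```

So the data itself is exactly 128 bits wide in both cases.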

Code:

[enable]
alloc(PackedFloatingPointTest,256)
label(OneTwentyEightBits)
label(Result)
label(AddMe)
registersymbol(PackedFloatingPointTest)
registersymbol(OneTwentyEightBits)
registersymbol(Result)
createthread(PackedFloatingPointTest)

PackedFloatingPointTest:
//jmp PackedFloatingPointTest //Uncomment and place a bp after if desired
mov eax,OneTwentyEightBits  //then nop the jmp
movups xmm0,[eax]
movups xmm1,[eax+10]
addps xmm0,xmm1 //This works fine :)
//addps xmm0,[AddMe]  //Confused why these two instructions crash
//addps xmm0,[eax+10] //either of these crashes
movups [Result],xmm0
ret

OneTwentyEightBits:
dd (float)50.0
dd (float)2.0
dd (float)86.0
dd (float)133.7

AddMe:
dd (float)0
dd (float)-1
dd (float)0
dd (float)0

Result:
dd 0 0 0 0

[disable]

dealloc(PackedFloatingPointTest)
unregistersymbol(PackedFloatingPointTest)
unregistersymbol(OneTwentyEightBits)
unregistersymbol(Result)


Inject it into any executable for testing. As you can see, I set up two memory locations with 128 bits worth of floats each. If you run it in its current form, with the two memory-operand addps lines still commented out, you can see that the result is as expected: it loads the first four floats into xmm0 and the second four floats into xmm1, then adds xmm1 to xmm0 and moves the result into [Result].

The second float value turns into 1.0 from 2.0 because -1 was added to it, and the other values stay intact. Debugging it and peeking into the xmm0 register itself after the 'addps xmm0,xmm1' instruction yields the same thing.

However, what I want is to see it work with a memory location rather than a second xmm register as the source operand...

What did I do wrong with
Code:

addps xmm0,[AddMe]  //Confused why these two instructions crash


OR:

Code:

addps xmm0,[eax+10] //either of these crashes


in place of the 'addps xmm0,xmm1' instruction?

Is neither [AddMe] nor [eax+10] pointing to a 128-bit memory location?

Do I have to feed it a pointer or something? Help me out here! Thanks!

SteveAndrew

Posted: Wed Feb 04, 2015 3:02 pm

I had to deal with this again recently, and I finally came to my senses and realized what was wrong! So I'm answering my own question.

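The problem was alignment! movups works on an unaligned double quadword, but addps with a memory operand still requires the address to be 16-byte (128-bit) aligned. In my test script the data labels came right after the code, so nothing guaranteed that their addresses were multiples of 16.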
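The alignment test itself is simple arithmetic: an address is 16-byte aligned when its low four bits are zero. The Python sketch below shows the check and the number of padding bytes needed to reach the next boundary. The base address and code length are hypothetical numbers chosen for illustration, not values read from a real process.

```python
def is_aligned16(address):
    """True if address is a multiple of 16."""
    return (address & 0xF) == 0

def padding_to_16(address):
    """Number of bytes to skip so that address lands on the next multiple of 16."""
    return (-address) & 0xF

# Hypothetical values: a 16-byte-aligned allocation base and a code
# section of 0x17 bytes placed in front of the data.
base = 0x02A40000
code_length = 0x17

data_address = base + code_length
print(hex(data_address), is_aligned16(data_address))  # 0x2a40017 False
pad = padding_to_16(data_address)
print(pad, is_aligned16(data_address + pad))          # 9 True
```

With data placed directly after code, whether it ends up aligned depends entirely on how many bytes the code happens to take.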

To solve the problem, either put your double quadwords at the start of a block of newly allocated memory, or use mgr.inz.Player's Lua autorun add-on, which adds padding instructions (and multibyte nops) and can be downloaded from here: http://forum.cheatengine.org/viewtopic.php?t=574426

1.
Code:

[enable]
alloc(OneTwentyEightBits,1024)
label(PackedFloatingPointTest)
label(Result)
label(AddMe)
registersymbol(PackedFloatingPointTest)
registersymbol(OneTwentyEightBits)
registersymbol(Result)
createthread(PackedFloatingPointTest)

OneTwentyEightBits:
dd (float)3.1415
dd (float)2.0
dd (float)86.0
dd (float)133.7

AddMe:
dd (float)0
dd (float)-1
dd (float)0
dd (float)0

Result:
dq 0 0

PackedFloatingPointTest:
mov rax,OneTwentyEightBits
movaps xmm0,[rax]
movaps xmm1,[rax+10]
addps xmm0,xmm1
addps xmm0,[AddMe]  //With proper alignment, they work fine
addps xmm0,[rax+10] //rax, not eax: the pointer was loaded into rax above
movaps [Result],xmm0
ret

[disable]

dealloc(OneTwentyEightBits) //free the name that was passed to alloc
unregistersymbol(PackedFloatingPointTest)
unregistersymbol(OneTwentyEightBits)
unregistersymbol(Result)


and

2.
Code:

[enable]
alloc(PackedFloatingPointTest,1024)
label(OneTwentyEightBits)
label(Result)
label(AddMe)
registersymbol(PackedFloatingPointTest)
registersymbol(OneTwentyEightBits)
registersymbol(Result)
createthread(PackedFloatingPointTest)

PackedFloatingPointTest:
//jmp PackedFloatingPointTest //Uncomment and place a bp after if desired, then nop the jmp
mov rax,OneTwentyEightBits
movaps xmm0,[rax]
movaps xmm1,[rax+10]
addps xmm0,xmm1
addps xmm0,[AddMe]  //With proper alignment, they work fine
addps xmm0,[rax+10] //rax, not eax: the pointer was loaded into rax above
movaps [Result],xmm0
ret

padding16 //one of mgr.inz.Player's added padding instructions
OneTwentyEightBits:
dd (float)9.8696
dd (float)2.0
dd (float)86.0
dd (float)133.7

AddMe: //still aligned
dd (float)0
dd (float)-1
dd (float)0
dd (float)0

db 90 90 90 //whoops not aligned anymore
padding16 //128bit re-aligned after this

Result:
dq 0 0

[disable]

dealloc(PackedFloatingPointTest)
unregistersymbol(PackedFloatingPointTest)
unregistersymbol(OneTwentyEightBits)
unregistersymbol(Result)
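
The alignment comments in the second script can be checked with a small layout simulation in Python. This is a sketch under two assumptions: that `padding16` pads up to the next 16-byte boundary (and emits nothing when already aligned), which is what its name and the script comments suggest but is not confirmed here, and that the code section is 0x23 bytes long, a made-up figure. The `lay_out` helper is hypothetical.

```python
def lay_out(items, start=0):
    """Walk a list of layout items and return {label: offset}.

    Items are ("bytes", n), ("label", name) or ("pad16",).
    """
    offsets = {}
    pos = start
    for item in items:
        kind = item[0]
        if kind == "bytes":
            pos += item[1]
        elif kind == "label":
            offsets[item[1]] = pos
        elif kind == "pad16":
            pos += (-pos) & 0xF  # assumed behaviour of padding16
        else:
            raise ValueError("unknown item: %r" % (item,))
    return offsets

script2 = [
    ("bytes", 0x23),                  # hypothetical code length
    ("pad16",),
    ("label", "OneTwentyEightBits"), ("bytes", 16),
    ("label", "AddMe"), ("bytes", 16),
    ("bytes", 3),                     # db 90 90 90
    ("pad16",),
    ("label", "Result"), ("bytes", 16),
]

offsets = lay_out(script2)
for name, off in offsets.items():
    print("%s: +0x%X aligned=%s" % (name, off, off % 16 == 0))
```

Whatever the real code length is, every label that follows a `padding16` (or a whole number of 16-byte blocks after one) comes out at a multiple of 16 in this model, and the three stray bytes are absorbed by the second `padding16`.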
