Cheat Engine Forum Index Cheat Engine
The Official Site of Cheat Engine
 


[Snippet / ASM] Fast MMX alpha blending

 
hcavolsdsadgadsg
I'm a spammer
Reputation: 26

Joined: 11 Jun 2007
Posts: 5801

Posted: Sun Apr 05, 2009 9:07 pm    Post subject: [Snippet / ASM] Fast MMX alpha blending

Last night I sat down and tried my hand at some MMX. After some reading and experimentation, I got to work.

dest = alpha * (source - dest) / 255 + dest;

Previously the best I could do was work on each channel of the pixel separately, so doing the math for the entire pixel at once is probably a bit quicker. ;)

I use a DIB section for drawing, so I get a pointer to the bits (which is dst).
color is a DWORD and serves as the source.
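
For reference, the per-channel approach can be sketched in plain C roughly like this. It is only an illustration of the formula above, assuming a 0x00RRGGBB pixel layout; blend_channel and blend_pixel_scalar are made-up names, not taken from the original code.

```c
#include <stdint.h>

/* Blend one 8-bit channel: dest = alpha * (source - dest) / 255 + dest.
   The difference can be negative, so it is computed in a signed int. */
static uint8_t blend_channel(uint8_t dst, uint8_t src, uint8_t alpha)
{
    int diff = (int)src - (int)dst;
    return (uint8_t)((int)dst + alpha * diff / 255);
}

/* Blend a 0x00RRGGBB pixel one channel at a time (hypothetical helper).
   The unused top byte (XX) is simply left as zero in the result. */
static uint32_t blend_pixel_scalar(uint32_t dst, uint32_t src, uint8_t alpha)
{
    uint32_t out = 0;
    for (int shift = 0; shift <= 16; shift += 8) {
        uint8_t d = (uint8_t)(dst >> shift);
        uint8_t s = (uint8_t)(src >> shift);
        out |= (uint32_t)blend_channel(d, s, alpha) << shift;
    }
    return out;
}
```

With alpha = 0 the destination comes back unchanged, and with alpha = 255 the result is exactly the source colour.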

Basically this is how I tried to lay it out...

Code:
00 XX 00 RR 00 GG 00 BB
-
00 XX 00 RR 00 GG 00 BB
*
00 AA 00 AA 00 AA 00 AA
/
00 FF 00 FF 00 FF 00 FF
+
00 XX 00 RR 00 GG 00 BB


and the code behind it.

Code:
pxor mm3, mm3 //zero register used for unpacking

mov eax, dword ptr [dst]
movd mm0, dword ptr [eax] //dst
movd mm1, dword ptr [color] //src
movd mm2, dword ptr [alpha]

punpcklbw mm0, mm3 //unpack dst to words
punpcklbw mm1, mm3 //unpack color
punpcklbw mm2, mm2 //unpack alpha
punpcklbw mm2, mm2
punpcklbw mm2, mm3

psubw mm1, mm0 //(color - dest) as words, so color < dest also works
pmullw mm1, mm2 //alpha * (color - dest), low 16 bits
psllw mm0, 8 //dest * 256
paddw mm1, mm0 //alpha * (color - dest) + dest * 256
psrlw mm1, 8 //alpha * (color - dest) / 256 + dest (the shift approximates / 255)

packuswb mm1, mm3
movd dword ptr [eax], mm1
//emms is still needed before any x87 floating-point code runs
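
To sanity-check the word math without an assembler, here is a small portable C model of what the comments in the snippet describe: alpha * (color - dest) / 256 + dest, done in wrapping 16-bit lanes. It is a sketch under my own assumptions (0x00RRGGBB layout, dest scaled by 256 before the add so a negative difference still comes out right); blend_lane and blend_pixel_lanes are hypothetical names.

```c
#include <stdint.h>

/* One 16-bit lane: (alpha * (src - dst) + dst * 256) >> 8, all modulo 2^16.
   The true sum alpha*src + (256 - alpha)*dst never exceeds 255 * 256,
   so the wrapped 16-bit result is exact even when src < dst. */
static uint8_t blend_lane(uint8_t dst, uint8_t src, uint8_t alpha)
{
    uint16_t diff = (uint16_t)(src - dst);          /* wraps like psubw     */
    uint16_t prod = (uint16_t)(diff * alpha);       /* low word like pmullw */
    uint16_t sum  = (uint16_t)(prod + (dst << 8));  /* wraps like paddw     */
    return (uint8_t)(sum >> 8);                     /* like psrlw by 8      */
}

/* Apply the lane math to the B, G and R bytes of a 0x00RRGGBB pixel. */
static uint32_t blend_pixel_lanes(uint32_t dst, uint32_t src, uint8_t alpha)
{
    uint32_t out = 0;
    for (int shift = 0; shift <= 16; shift += 8) {
        uint8_t d = (uint8_t)(dst >> shift);
        uint8_t s = (uint8_t)(src >> shift);
        out |= (uint32_t)blend_lane(d, s, alpha) << shift;
    }
    return out;
}
```

One visible effect of dividing by 256 instead of 255: blending white over black with alpha = 255 gives 0xFE per channel rather than 0xFF.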


Have fun.
Fallen`
Expert Cheater
Reputation: 0

Joined: 24 Aug 2008
Posts: 224
Location: United States

Posted: Sun Apr 05, 2009 9:29 pm    Post subject:

Thanks for the information, I'll be sure to check it out later.
_________________
Maplestory's Fun, But It waste's your Time. Your hard work Get's demolished Because You're better than everyone else. Nexon, You truly. Suck.

.::Acomplishments::.
Level 114 Arch Mage (i/l)
Level 100 Sniper (Hacked)
Level 98 Cheif Bandit(Hacked)
hcavolsdsadgadsg
I'm a spammer
Reputation: 26

Joined: 11 Jun 2007
Posts: 5801

Posted: Mon Apr 06, 2009 12:24 am    Post subject:

For the alpha, you can also use the SSE instruction pshufw instead of 3 unpacks.

Code:
pshufw mm2, mm2, 0
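
A quick way to convince yourself the two are equivalent is to model both instructions on a 64-bit integer. This is only a sketch: the model_ and alpha_by_ function names are made up, and it assumes the alpha value fits in one byte (the shuffle copies the whole low word, so any bits above the low byte would be replicated too).

```c
#include <stdint.h>

/* Model of punpcklbw: interleave the low four bytes of a and b
   (a supplies the even bytes of the result, b the odd bytes). */
static uint64_t model_punpcklbw(uint64_t a, uint64_t b)
{
    uint64_t r = 0;
    for (int i = 0; i < 4; i++) {
        r |= ((a >> (8 * i)) & 0xFF) << (16 * i);
        r |= ((b >> (8 * i)) & 0xFF) << (16 * i + 8);
    }
    return r;
}

/* Model of pshufw: result word i is source word ((imm >> 2*i) & 3). */
static uint64_t model_pshufw(uint64_t src, unsigned imm)
{
    uint64_t r = 0;
    for (int i = 0; i < 4; i++) {
        unsigned sel = (imm >> (2 * i)) & 3;
        r |= ((src >> (16 * sel)) & 0xFFFF) << (16 * i);
    }
    return r;
}

/* movd followed by the three unpacks from the first post. */
static uint64_t alpha_by_unpacks(uint8_t alpha)
{
    uint64_t m = alpha;
    m = model_punpcklbw(m, m);
    m = model_punpcklbw(m, m);
    return model_punpcklbw(m, 0);
}

/* movd followed by pshufw mm2, mm2, 0. */
static uint64_t alpha_by_shuffle(uint8_t alpha)
{
    return model_pshufw((uint64_t)alpha, 0);
}
```

Both paths end with the alpha byte in the low half of each of the four words, i.e. the 00 AA 00 AA 00 AA 00 AA layout.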


http://www.tommesani.com/SSEPrimer.html
http://avisynth.org/mediawiki/Filter_SDK/Simple_MMX_optimization
hcavolsdsadgadsg
I'm a spammer
Reputation: 26

Joined: 11 Jun 2007
Posts: 5801

Posted: Fri Apr 10, 2009 2:15 am    Post subject:

Moose wrote:
This is slower than using plain x86 instructions.

The idea of max speed with MMX comes from utilizing all instruction pipelines in parallel. Essentially you can process at least 2 pixels at once, or at least perform 2 instructions in 1 clock cycle.

Good thing you are learning though.


Of course, but drawing can't always be done 2 pixels at a time. The function this is for basically mimics SetPixel.


Later on, drawing for sprites, pictures, etc will probably all be handled like that. Coding seems to come in inspired bursts recently though, so like always, it's done when it's done.


Anyway, there's not really that much chance for instruction pairing here, since it's so short. The emms is expensive, which sucks, but this still seems to be faster than the straight C version, since you have to do it per channel in the pixel.
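
For comparison, plain integer code doesn't strictly have to go channel by channel either. A well-known trick is to keep red and blue in one 32-bit value and green in another, so two multiplies cover the whole pixel. The sketch below is not from my drawing code: blend_swar is a hypothetical name, it assumes a 0x00RRGGBB layout, and it uses the same divide-by-256 shortcut as the MMX version.

```c
#include <stdint.h>

/* Blend a 0x00RRGGBB pixel with two multiplies instead of three.
   Red and blue sit 16 bits apart, so their products cannot collide
   in the bits that survive the final mask. Unsigned wraparound takes
   care of the cases where a source channel is below the destination. */
static uint32_t blend_swar(uint32_t dst, uint32_t src, uint8_t alpha)
{
    uint32_t rb = dst & 0x00FF00FFu;   /* red and blue together */
    uint32_t g  = dst & 0x0000FF00u;   /* green on its own      */

    rb += (((src & 0x00FF00FFu) - rb) * alpha) >> 8;
    g  += (((src & 0x0000FF00u) - g)  * alpha) >> 8;

    return (rb & 0x00FF00FFu) | (g & 0x0000FF00u);
}
```

Whether this beats the MMX path for single pixels would have to be measured. The point is only that the per-channel loop isn't the only option in C.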


I'll also probably try threading the drawing eventually, or maybe something like dirty rectangles.

Really, once I get a bit more experience, I think I'll try my hand at something a little more modern, like DX.
pkedpker
Master Cheater
Reputation: 1

Joined: 11 Oct 2006
Posts: 412

Posted: Sun Apr 12, 2009 11:45 am    Post subject:

One person wrote a new array class for C++ that uses all the registers, including the MMX (mm0-mm7) and SSE (xmm0-xmm7) registers, and his array runs 9-12 times faster than a normal C++ array that uses just the general-purpose registers. So it's not slow.
_________________
Hacks I made for kongregate.
Kongregate Universal Badge Hack: http://forum.cheatengine.org/viewtopic.php?p=4129411
Kongreate Auto Rating/Voter hack: http://forum.cheatengine.org/viewtopic.php?t=263576
Took a test lol
All times are GMT - 6 Hours

 