Cheat Engine Forum Index Cheat Engine
The Official Site of Cheat Engine
 


Pointer scanner: how about GPU computing?
bowbowtap
Newbie cheater
Reputation: 0

Joined: 27 Apr 2013
Posts: 12
Location: 台灣

Posted: Mon Apr 29, 2013 12:12 pm    Post subject: pointer scanner How Gpu computing

Could the pointer scanner use GPU computing?

The CPU is slow...
Dark Byte
Site Admin
Reputation: 470

Joined: 09 May 2003
Posts: 25807
Location: The netherlands

Posted: Mon Apr 29, 2013 12:17 pm

When graphics cards can hold more than 6 GB of RAM for the pointertree, I'll look into it.
Also, a big bottleneck is writing the results to disk, so get a 2 TB SSD.

_________________
Do not ask me about online cheats. I don't know any and wont help finding them.

Like my help? Join me on Patreon so i can keep helping
bowbowtap
Newbie cheater
Reputation: 0

Joined: 27 Apr 2013
Posts: 12
Location: 台灣

Posted: Mon Apr 29, 2013 12:21 pm

Dark Byte wrote:
When graphics cards can hold more than 6 GB of RAM for the pointertree, I'll look into it.
Also, a big bottleneck is writing the results to disk, so get a 2 TB SSD.


Can't the PC's RAM be used for that?
Dark Byte
Site Admin
Reputation: 470

Joined: 09 May 2003
Posts: 25807
Location: The netherlands

Posted: Mon Apr 29, 2013 12:24 pm

From what I've read, GPU computing cannot access CPU memory. The CPU has to send the data to the GPU first.
bowbowtap
Newbie cheater
Reputation: 0

Joined: 27 Apr 2013
Posts: 12
Location: 台灣

Posted: Mon Apr 29, 2013 12:27 pm

Dark Byte wrote:
From what I've read, GPU computing cannot access CPU memory. The CPU has to send the data to the GPU first.


How can it be sped up?


Would storing the game and CE on a RAM disk help?
Dark Byte
Site Admin
Reputation: 470

Joined: 09 May 2003
Posts: 25807
Location: The netherlands

Posted: Mon Apr 29, 2013 12:30 pm

If you have more than 500 GB of RAM you can use a RAM disk, but I doubt you do.
In the future I might add distributed computing to the pointerscan, so you can have 100 computers working on the same pointerscan.

bowbowtap
Newbie cheater
Reputation: 0

Joined: 27 Apr 2013
Posts: 12
Location: 台灣

Posted: Mon Apr 29, 2013 12:37 pm

I thought the GPU would be feasible, but apparently it is not.


I saw the program

「RAR GPU Password Recovery」

Reportedly, cracking a 9-character password takes 43 years on a CPU but only 48 days on a GPU.
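
For scale, the two figures quoted above (43 years on a CPU, 48 days on a GPU) imply a speedup of roughly 327x. The check below takes the quoted numbers at face value and assumes 365-day years; they are the claim as posted, not measurements.

```python
# Speedup implied by the quoted figures. Inputs are the claim as posted,
# not benchmark results. Assumes a 365-day year.
cpu_days = 43 * 365
gpu_days = 48
speedup = cpu_days / gpu_days
print(f"implied speedup: about {speedup:.0f}x")
```

Password cracking tests each candidate independently of every other, which is the kind of workload where such gains are plausible.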


Last edited by bowbowtap on Mon Apr 29, 2013 12:45 pm; edited 1 time in total
Dark Byte
Site Admin
Reputation: 470

Joined: 09 May 2003
Posts: 25807
Location: The netherlands

Posted: Mon Apr 29, 2013 12:42 pm

It's still a theoretical idea.
But you'd probably have to set them up yourself and let 'workers' connect to the 'cloud', where each one fetches jobs and creates new jobs for other workers if needed.
Basically like the current pointerscanner, where each thread can give any other thread a job when it can, but with a variable number of threads.

Of course, every worker will need access to the pointertree, which can be a 6 GB+ file (so a slow initial initialization and high network traffic when a new worker gets added).

Edit: just read about amd's hUMA project which will be useful here
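
The worker scheme described above (workers fetch jobs from a shared queue and may push new jobs for others to pick up) can be sketched on a single machine with threads. This is only an illustration: `expand_job`, the `(path, depth)` job tuples and the depth limit are made up for the example and are not taken from Cheat Engine's source.

```python
import queue
import threading

def expand_job(job, max_depth):
    """Hypothetical job: a (path, depth) pair that spawns two children per level."""
    path, depth = job
    if depth >= max_depth:
        return [], [path]              # no new jobs, one finished result
    children = [(path + (i,), depth + 1) for i in range(2)]
    return children, []

def run_workers(num_workers, max_depth):
    jobs = queue.Queue()
    results = []
    results_lock = threading.Lock()
    jobs.put(((), 0))                  # the initial job

    def worker():
        while True:
            job = jobs.get()
            if job is None:            # sentinel: shut down
                jobs.task_done()
                return
            new_jobs, done = expand_job(job, max_depth)
            for j in new_jobs:         # hand work to whichever worker is free
                jobs.put(j)
            with results_lock:
                results.extend(done)
            jobs.task_done()

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for t in threads:
        t.start()
    jobs.join()                        # returns once every job, spawned ones included, is done
    for _ in threads:
        jobs.put(None)
    for t in threads:
        t.join()
    return results
```

Calling `run_workers(4, 3)` and `run_workers(1, 3)` yields the same set of results; only the number of threads sharing the queue changes. A networked version would additionally have to ship the pointertree to each worker, which is the cost mentioned above.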



Last edited by Dark Byte on Tue Apr 30, 2013 9:11 am; edited 1 time in total
bowbowtap
Newbie cheater
Reputation: 0

Joined: 27 Apr 2013
Posts: 12
Location: 台灣

Posted: Mon Apr 29, 2013 12:48 pm

Dark Byte wrote:
It's still a theoretical idea.
But you'd probably have to set them up yourself and let 'workers' connect to the 'cloud', where each one fetches jobs and creates new jobs for other workers if needed.
Basically like the current pointerscanner, where each thread can give any other thread a job when it can, but with a variable number of threads.

Of course, every worker will need access to the pointertree, which can be a 6 GB+ file (so a slow initial initialization and high network traffic when a new worker gets added).


TY~XD
Dark Byte
Site Admin
Reputation: 470

Joined: 09 May 2003
Posts: 25807
Location: The netherlands

Posted: Tue Nov 26, 2013 11:11 pm

Seeing that Titans have 6 GB of RAM, I've decided to give this a test.

Result: Not fast enough (source: http://code.google.com/p/cheat-engine/source/browse/trunk/Cheat+Engine/CUDA+pointerscan/ )

Anyhow, the next version has the multiple-worker method implemented, which does provide a great speed improvement.

Gniarf
Grandmaster Cheater Supreme
Reputation: 43

Joined: 12 Mar 2012
Posts: 1285

Posted: Thu Dec 12, 2013 4:51 pm

I saw a few CUDA-related commits in the SVN, so before this project goes too far, I'd suggest switching to OpenCL, simply because Radeons do not support CUDA, but all modern GPUs support OpenCL.

Actually, since OpenCL code can also run on some CPUs, you could also use the same code for CPU and GPU pointerscanning (I'm NOT speaking about merged scanning) in the distant future.

Also, I'm not very competent on the matter, but I heard GeForces are more FPU-oriented and Radeons perform faster on logical operations, which is why they are preferred for bitcoin mining. Considering that pointerscanning is more about integer operations, I'd expect Radeons to perform significantly faster, in case you have one lying around.

_________________
DO NOT PM me if you want help on making/fixing/using a hack.
Dark Byte
Site Admin
Reputation: 470

Joined: 09 May 2003
Posts: 25807
Location: The netherlands

Posted: Thu Dec 12, 2013 5:49 pm

Right now I've stopped work on this as the performance is too slow.
(A scan took about as long as a single-threaded scan in CE compiled in debug mode, and that was without this CUDA pointerscanner even writing the results.)

Even with the minor difference between float and int calculations it's way too slow to be usable.
99% of the time is spent on lookups in a map and the other 1% on iterating through a linked list.
There's no complex math that the gpu has to do, so there's no gain there
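
To make the "lookups in a map" point concrete, here is a toy reverse pointer scan. It is a sketch under assumptions, not CE's actual algorithm: memory is modelled as a dict of address -> stored pointer value, `static_addrs` marks the base addresses, and the names `build_reverse_map` and `scan` are invented for this example.

```python
import bisect
from collections import defaultdict

def build_reverse_map(memory):
    """Map each stored pointer value to the addresses that hold it."""
    rev = defaultdict(list)
    for addr, value in memory.items():
        rev[value].append(addr)
    return rev, sorted(rev)            # sorted values allow range lookups

def scan(memory, static_addrs, target, max_offset, max_level):
    """Return pointer paths as (base_address, [offsets...]) tuples."""
    rev, values = build_reverse_map(memory)
    found = []

    def walk(addr, offsets, level):
        # Every stored value v with addr - max_offset <= v <= addr is a candidate.
        lo = bisect.bisect_left(values, addr - max_offset)
        hi = bisect.bisect_right(values, addr)
        for v in values[lo:hi]:        # the map lookup
            for holder in rev[v]:      # the list iteration
                path = [addr - v] + offsets
                if holder in static_addrs:
                    found.append((holder, path))
                elif level + 1 < max_level:
                    walk(holder, path, level + 1)

    walk(target, [], 0)
    return found

# Toy memory: [[0x1000] + 0x10] + 0x20 == 0x7020
memory = {0x1000: 0x5000, 0x5010: 0x7000}
paths = scan(memory, {0x1000}, target=0x7020, max_offset=0x40, max_level=3)
```

Nearly all of the work is the range lookup and the inner loop, and whether a thread recurses or stops depends entirely on the data it finds. There is hardly any arithmetic for a GPU to accelerate.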

One of the reasons is that gpu threads (in nvidia at least) only execute the same line of code at the same time:
Code:

thread 1 executes line 10
thread 2's if statement path didn't lead to executing line 10, so it waits till thread 1 has reached line 45
thread 3's if statement path didn't lead to executing line 10, so it waits till thread 1 has reached line 45
thread 4 executes line 10
...


And since the pointerscanner has a lot of loops and iterations based on a positive result of a map lookup or not, this basically reduces the number of threads actively running to 1
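
A rough way to see the cost of this lockstep behaviour is to count steps in a toy model. The model below is a deliberate simplification (one group of threads, each distinct branch executed in turn while the threads on other branches idle) and the step costs are invented. It does not describe how any particular GPU schedules work.

```python
def lockstep_steps(branch_of_thread, cost_of_branch):
    """Steps a thread group needs when threads on different branches take turns."""
    taken = set(branch_of_thread)      # branches that at least one thread took
    return sum(cost_of_branch[b] for b in taken)

def utilization(branch_of_thread, cost_of_branch):
    """Fraction of available thread-steps that did useful work."""
    useful = sum(cost_of_branch[b] for b in branch_of_thread)
    total = lockstep_steps(branch_of_thread, cost_of_branch) * len(branch_of_thread)
    return useful / total

# Hypothetical costs: a map hit triggers 35 steps of work, a miss only 1.
cost = {"hit": 35, "miss": 1}
uniform = ["hit"] * 32                 # all 32 threads take the same path
mixed = ["hit"] + ["miss"] * 31        # one thread hits, the rest wait for it

print(utilization(uniform, cost))
print(utilization(mixed, cost))
```

With every thread on the same path the group is fully busy. With a single hit among 31 misses, utilization in this model falls below 10%, which is the effect described above.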

Another big problem is that each thread may not run longer than 2 seconds. But the pointerscanner is designed to let a thread run for as long as it needs fetching work commands from a queue if it runs out, and add to that same queue if possible


But if you feel like experimenting or changing it to opencl you can give it a shot. (The pointerscanner lookup method is now basically ported to C)
http://cheatengine.org/temp/test.PTR.scandata is the scandata file I used for testing

mgr.inz.Player
I post too much
Reputation: 222

Joined: 07 Nov 2008
Posts: 4438
Location: W kraju nad Wisla. UTC+01:00

Posted: Fri Dec 13, 2013 9:03 am

I tested a few implementations of the par2cmdline tool.
(a tool that applies the data-recovery concepts of RAID-like systems to multi-part archives)

So, there is the original project - http://sourceforge.net/projects/parchive/ - par2cmdline-0.4-x86-win32

And other builds:
par2cmdline 0.4 with Intel Threading Building Blocks 2.2 - http://chuchusoft.com/par2_tbb/download.html
par2cmdline 0.4 with Intel Threading Building Blocks 2.2 + CUDA - http://chuchusoft.com/par2_tbb/download.html#gpu_version

The TBB version is significantly faster than the original par2cmdline.
TBB+CUDA is slightly faster than TBB (on my 8800 GS).

Results depend on the source block count, the repair block count, and the dataset size.

zm0d
Master Cheater
Reputation: 7

Joined: 06 Nov 2013
Posts: 423

Posted: Fri Dec 13, 2013 9:20 am

Dark Byte wrote:
From what i've read gpu computing can not access cpu memory.


DMA should do the trick, shouldn't it?
http://www.techterms.com/definition/dma
Dark Byte
Site Admin
Reputation: 470

Joined: 09 May 2003
Posts: 25807
Location: The netherlands

Posted: Fri Dec 13, 2013 9:52 am

The cuda implementation of par2cmdline is only for xor'ing a block of data without any conditional checks
So yeah, gpu computing is most likely not suitable for a pointerscan
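
For contrast, the kind of work that does map well to a GPU looks like the XOR below: every element gets the same operation and there are no data-dependent branches. Plain Python is used purely for illustration, and `xor_blocks` is a made-up helper, not par2cmdline code.

```python
def xor_blocks(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length blocks; each byte is independent of all the others."""
    if len(a) != len(b):
        raise ValueError("blocks must have the same length")
    return bytes(x ^ y for x, y in zip(a, b))

data = bytes([0x0F, 0xF0, 0xAA])
key = bytes([0xFF, 0xFF, 0xFF])
mixed = xor_blocks(data, key)
restored = xor_blocks(mixed, key)      # XOR with the same block twice restores the input
```

Each output byte could be computed by its own thread with identical instructions, which is the opposite of the branch-heavy map lookups in a pointerscan.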

(On a sidenote, they use atomicXor in a __global__ function. Funny thing is that atomic functions do not work for some reason in those functions, only in __device__ functions. You won't get any errors during compile time, but i can guarantee it's not going to be atomic.
I wasted 8 hours on this myself trying to figure out why my data kept getting corrupted)

All times are GMT - 6 Hours