Cheat Engine Forum Index Cheat Engine
The Official Site of Cheat Engine
 
 FAQFAQ   SearchSearch   MemberlistMemberlist   UsergroupsUsergroups   RegisterRegister 
 ProfileProfile   Log in to check your private messagesLog in to check your private messages   Log inLog in 


Optimisation challenges
Goto page 1, 2, 3  Next
 
Post new topic   Reply to topic    Cheat Engine Forum Index -> General programming
View previous topic :: View next topic  
Author Message
Polynomial
Grandmaster Cheater
Reputation: 5

Joined: 17 Feb 2008
Posts: 524
Location: Inside the Intel CET shadow stack

PostPosted: Tue Jun 15, 2010 7:31 pm    Post subject: Optimisation challenges Reply with quote

The following are a few optimisation challenges to test your skill at analysing algorithms and optimising their performance. There are a few rules, so I'll start out with them:

Rules
1) All entries must be submitted as both a compiled Windows executable and full source code.
2) You must write your program in VS2008-compatible C++.
3) Plagarism will not be tollerated. I am aware of many solutions to these problems and have given each of them a thorough search. Each entry will be checked thoroughly. However this is mainly a learning competition, so you may base your optimisations off of other people's ideas. Work together to produce the best code.
4) Don't start arguments over mistakes, your opinion on the quality of someone's code, or who invented certain ideas. It's all about learning. Test out things that seem to be unlikely to work anyway, just to see how it fares. I've seen some brilliant ideas miraculously fail for no apparant reason, and things that shouldn't work just happen to work. There's no reason to denounce an idea at face value.
5) No inline ASM.


Right, now onto the challenges!

Challenge 1 - Primality test
Produce a function that given a number, returns a boolean representing the primality of that number - i.e. whether or not it is prime.

Here is a very unoptimal sample:
Code:
bool isPrime(int value)
{
    for(int i = 2; i < value; i++)
    {
        if(value % i == 0) return false;
    }
    return true;
}

void showPrimes()
{
    for(int i = 1; i < 10000000; i++)
    {
        if(isPrime(i)) printf("%n\n", i);
    }
}


You must not use large precomputed lookup tables (e.g. lowPrimes[] array) to find primes. Your code must be able to check the primality of large numbers quickly. You must not modify the code of showPrimes.

Challenge 2 - Semiprime integer factorisation
Semiprimes are numbers which are created when you multiply two primes together. They have the interesting property that their only factors are the two primes that created them. Write a function that, given a semiprime number, determines the two primes that created it.

Here is an incredibly slow and inefficient method:
Code:
void FindFactors(int number)
{
    for(int a = 3; a < number; a++)
    {
        for(int b = 1; b < number; b++)
            if(a*b == number) {
                printf("Found prime factors: %n and %n\n", a, b);
                return;
            }
        }
    }
}


As with challenge 1, you may not use lookup tables. Your code will be tested by timing how long it takes to find the factors of a set of randomly chosen semiprimes.

Challenge 3 - File discovery
Write a program that quickly and efficiently discovers files in a directory based on a partial file name. This program must work recursively through subdirectories, and will be tested in an environment that would likely overflow the stack of any program that attempts to use standard recursion. You may use threads if you need to, as well as any part of the Windows API.


Remember, you are not just trying to create a solution that works quickly, you are trying to create a solution that is THE fastest possible. If someone out-edges your code by 0.01%, they are still winning.


So, that's a wrap! Post your discussions, questions and ideas in this thread.

_________________
It's not fun unless every exploit mitigation is enabled.
Please do not reply to my posts with LLM-generated slop; I consider it to be an insult to my time.


Last edited by Polynomial on Wed Jun 16, 2010 5:15 am; edited 1 time in total
Back to top
View user's profile Send private message
Flyte
Peanuts!!!!
Reputation: 6

Joined: 19 Apr 2006
Posts: 1887
Location: Canada

PostPosted: Tue Jun 15, 2010 7:53 pm    Post subject: Reply with quote

Why is this limited to C++?
Back to top
View user's profile Send private message
Polynomial
Grandmaster Cheater
Reputation: 5

Joined: 17 Feb 2008
Posts: 524
Location: Inside the Intel CET shadow stack

PostPosted: Tue Jun 15, 2010 8:00 pm    Post subject: Reply with quote

I chose a language that most people on here know and that I have a very simple one-button-to-compile IDE for to save me messing about with other people's code when doing benchmarks. If I open it up to ASM, C, C#, VB.NET, Java and every other language out there it becomes less about the process of optimisation and more about the strengths and weaknesses of the chosen language, and makes proper benchmarking difficult.

Did you have a "better" language in mind?

_________________
It's not fun unless every exploit mitigation is enabled.
Please do not reply to my posts with LLM-generated slop; I consider it to be an insult to my time.
Back to top
View user's profile Send private message
Flyte
Peanuts!!!!
Reputation: 6

Joined: 19 Apr 2006
Posts: 1887
Location: Canada

PostPosted: Tue Jun 15, 2010 8:02 pm    Post subject: Reply with quote

Burningmace wrote:
I chose a language that most people on here know and that I have a very simple one-button-to-compile IDE for to save me messing about with other people's code when doing benchmarks. If I open it up to ASM, C, C#, VB.NET, Java and every other language out there it becomes less about the process of optimisation and more about the strengths and weaknesses of the chosen language, and makes proper benchmarking difficult.

Did you have a "better" language in mind?


I was going to do it in F#, personally. A functional language is best suited to these kinds of problems.

Also, I should point out that your first two challenges are open topics in computer science, and it'll just be a constant back and forth in here.
Back to top
View user's profile Send private message
Polynomial
Grandmaster Cheater
Reputation: 5

Joined: 17 Feb 2008
Posts: 524
Location: Inside the Intel CET shadow stack

PostPosted: Tue Jun 15, 2010 8:16 pm    Post subject: Reply with quote

I'm aware that the first two are open topics (semiprime factorisation especially), but the idea of this is to see what only your collective minds can conjure up. Since I don't have an F# compiler to hand and I would assume most people in here don't either, I'm gonna stick with C++ on this.

If you're not going to directly enter, might you provide some sample pseudocode and ideas? Partial participation is better than nothing.

_________________
It's not fun unless every exploit mitigation is enabled.
Please do not reply to my posts with LLM-generated slop; I consider it to be an insult to my time.


Last edited by Polynomial on Tue Jun 15, 2010 8:17 pm; edited 1 time in total
Back to top
View user's profile Send private message
Deltron Z
Expert Cheater
Reputation: 1

Joined: 14 Jun 2009
Posts: 164

PostPosted: Tue Jun 15, 2010 8:16 pm    Post subject: Reply with quote

Nice, I'm glad someone else offering such challanges.

Challange 1:
First of all, let's look at the definition for prime numbers:
Wikipedia wrote:
In mathematics, a prime number (or a prime) is a natural number that has exactly two distinct natural number divisors: 1 and itself.

Alright then, by definition 1 isn't prime since it has only one distinct divisor - itself.
We will also notice that 2 is the only even prime number - as every other even numbers are divisble by 2, so that makes them divisible by atleast 1, 2, themselves and pehaps more.

Given the information above, we can easily optimize the code by:
a. In case of 2, return true. otherwise, compute only odd numbers.
b. It is obvious that no number above n/2 divides n, so we can simply loop untill n/2; however, this is not the upper bound, it can be reduced. just follow along...
c. It is known that if a number, i, divides n without any remainder (aka n is divisible by i), then n / i also divides n. we will notice that in some cases of perfect squares, n / i will be equal to i if i = √n, for example 9 / 3 = 3. and thus, we will only need to loop through odd numbers untill the square-root of n, inclusive. that is, because, if i > √n then we will simply compute numbers for nothing - if no number i <= √n divides n then no such number i > √n exists.
d. We will investigate further more about prime numbers properties. we will notice that:
Wikipedia wrote:
all prime numbers above 3 are of the form 6n − 1 or 6n + 1

And thus, we can make steps of 6 at a time! instead of stepping slowly, just 1 at a time.

Summary:
-All prime numbers are odd - we can make steps of at least 2.
-A number above n/2 cannot be divided by any number but itself - we can stop at least at n/2 or sooner.
-For every number i which divides n, a divisor n / i exists. thus, the upper bound is the square root of n, inclusive.
-A prime number is of the form of either 6n+1 or 6n-1, thus we can make steps of 6.

The final code:
Code:
__forceinline int isPrime(unsigned int n)
{
   if (n == 2)
      return 1;
   if ((n & 1) == 0 || n <= 1 || n % 3 == 0)
      return 0;

   register unsigned int Sqrt = (unsigned int)sqrtf((float)n);
   for (register unsigned int i = 5; i <= Sqrt; i += 6)
   {
      if ((n % i == 0) || (n % (i + 2)) == 0)
         return 0;
   }

   return 1;
}

Code explanation:
__forceinline - why wasting time on a CALL operation in ASM if it can be done directly? no CALL, not RET, not preserving registers or in short - less commands to execute.
unsigned integers - better range. there are no negative prime numbers by definition.
convert float->unsigned int and unsigned int->float - why using extra bytes when not needed?
register keyword - attempt to use registers to hold variables. (ignored in MSVC++)

Very Happy

Challange 2:
Even simpler than in challange 1 - since we are not allowed to use look-up tables (not creating one using isPrime nor using Sieve of Eratosthenes) which limits us to creating a simple, short code at the cost of efficiency.
Anyway, I am assuming it is a semiprime number so I will not perform a check.
Simply factorize the number. I will use the same concepts as before, so no further explanation needed. (or so I hope)
Code:
void PrintSemiprimeFactors(unsigned int n)
{
   if (n % 2 == 0)
   {
      n /= 2;
      printf("2 ");
   }
   else if (n % 3 == 0)
   {
      n /= 3;
      printf("3 ");
   }

   for (int i = 5; n != 1; i += 6)
   {
      if (n % i == 0)
      {
         n /= i;
         printf("%d ", i);
      }

      if (n % (i + 2) == 0)
      {
         n /= (i + 2);
         printf("%d ", i + 2);
      }
   }
}


The call:
Code:
PrintSemiprimeFactors(12709189);

prints 3559 and 3571 instantly.


Last edited by Deltron Z on Tue Jun 15, 2010 8:29 pm; edited 1 time in total
Back to top
View user's profile Send private message
Polynomial
Grandmaster Cheater
Reputation: 5

Joined: 17 Feb 2008
Posts: 524
Location: Inside the Intel CET shadow stack

PostPosted: Tue Jun 15, 2010 8:27 pm    Post subject: Reply with quote

Deltron, your use of the 6k+/-1 rule is somewhat incorrect. All primes are in that form, but that does not mean that you can skip checking if they are divisible. For example, 961 is not prime but your algorithm misses it completely and states that it is prime. It is a semiprime of 31 and 31.
_________________
It's not fun unless every exploit mitigation is enabled.
Please do not reply to my posts with LLM-generated slop; I consider it to be an insult to my time.
Back to top
View user's profile Send private message
Deltron Z
Expert Cheater
Reputation: 1

Joined: 14 Jun 2009
Posts: 164

PostPosted: Tue Jun 15, 2010 8:35 pm    Post subject: Reply with quote

You're wrong! Rolling Eyes
Try executing before assuming wrong assumptions, or at least read the code or make sure you understand the algorithm.

Look, the idea here is not to check if n is in the form of 6n + 1 or 6n - 1, you've got that right. I used the same concept, except that I check if the number n is divisble by 6i + 1 or 6i - 1, because any other number is divisble by 3, and if you look at the code you will notice there's a condition, n % 3 == 0, outside the loop. this concept is somehow derived by the fact that a prime number is in the form of 6n + 1 or 6n - 1.

My code runs fine, returns 0 (false) for 961.

P.S. working on the 3rd challange... what about if I increase my stack's size to, let's say, a mega byte? or 20? that's enough for sure... Rolling Eyes
What should be the search pattern? wild-cards allowed? search in extentions too? if not, is the partial name means the first few characters or some unknown part of the file's name? either way, it doesn't matter, strstr or strncmp or whatever... still a very simple algorithm, but is increasing my stack size allowed in my solution?
Back to top
View user's profile Send private message
Polynomial
Grandmaster Cheater
Reputation: 5

Joined: 17 Feb 2008
Posts: 524
Location: Inside the Intel CET shadow stack

PostPosted: Tue Jun 15, 2010 8:42 pm    Post subject: Reply with quote

My apologies, it's late and I misread your code and missed the n%(i+2) bit. My tiredness is getting the better of me.

And as regards the stack, you should never use recursion. Ever. It's horrible. I've also heard it's rather inefficient. I'm gonna say no recursion. My suggestion would be a loop with a state.

The code should match the partial file name anywhere in the whole name. If I specify "png", that should match "file.png", "pngfile.png" and "this_is_a_not_a_png_file.exe".

_________________
It's not fun unless every exploit mitigation is enabled.
Please do not reply to my posts with LLM-generated slop; I consider it to be an insult to my time.
Back to top
View user's profile Send private message
hcavolsdsadgadsg
I'm a spammer
Reputation: 26

Joined: 11 Jun 2007
Posts: 5801

PostPosted: Tue Jun 15, 2010 9:00 pm    Post subject: Reply with quote

Careful with forceinline. It's usually a good idea to let the compiler decide, forcing it to inline might actually hurt performance in the end.

what do you mean by:
Quote:
convert float->unsigned int and unsigned int->float - why using extra bytes when not needed?


a float is 4 bytes, an unsigned int is 4 bytes. float to int is probably going to generate an ftol call which is "slow". You can just do this instead if rounding towards nearest is ok.

Code:
   inline int round(f32 x)
   {
      int r;

      __asm
      {
         fld      dword ptr [x]
         fistp   dword ptr [r]
      }

      return r;
   }


register keyword doesn't sound too hot. Probably a better idea to just let the compiler figure out what to do.

fsqrt instruction is fast, the sqrtf() probably ends up getting directly turned into it.
Back to top
View user's profile Send private message
Deltron Z
Expert Cheater
Reputation: 1

Joined: 14 Jun 2009
Posts: 164

PostPosted: Tue Jun 15, 2010 9:13 pm    Post subject: Reply with quote

Burningmace wrote:
My apologies, it's late and I misread your code and missed the n%(i+2) bit. My tiredness is getting the better of me.

No harm done.

Burningmace wrote:
And as regards the stack, you should never use recursion. Ever. It's horrible. I've also heard it's rather inefficient. I'm gonna say no recursion. My suggestion would be a loop with a state.

The code should match the partial file name anywhere in the whole name. If I specify "png", that should match "file.png", "pngfile.png" and "this_is_a_not_a_png_file.exe".

I can't belive what I'm hearing! Shocked
Recurrsion sucks? well, in mathematics, any recurrsion can be defined pretty easily without a recurrsion too, but it computer programming this isn't the case. I guess anything is possible without recurrsion, but recurrsion does-not-suck at all! I've solved PE 31 in recurrsion in 3 lines, also brought as a challange in my topic.
I guess I could use a loop, but I would just use a stack (STL) for it. I remember coding such a function a while ago in C, just a few lines with a simple recurrsion - did the trick.

slovach wrote:
Careful with forceinline. It's usually a good idea to let the compiler decide, forcing it to inline might actually hurt performance in the end.

what do you mean by:
Quote:
convert float->unsigned int and unsigned int->float - why using extra bytes when not needed?


a float is 4 bytes, an unsigned int is 4 bytes. float to int is probably going to generate an ftol call which is "slow". You can just do this instead if rounding towards nearest is ok.

Code:
   inline int round(f32 x)
   {
      int r;

      __asm
      {
         fld      dword ptr [x]
         fistp   dword ptr [r]
      }

      return r;
   }


register keyword doesn't sound too hot. Probably a better idea to just let the compiler figure out what to do.

fsqrt instruction is fast, the sqrtf() probably ends up getting directly turned into it.

Thanks for the comments.
I will use __inline next time.
As for register, MSVC++ ignores this keyword - I guess it would use registers in this case anyway.
I didn't know converting to float is slower. Mad
What's the difference between fsqrt and sqrtf? I could as well implement my own binary square-root function and get an integer result in just a few iterations, but the function was pretty fast already.

Oh, well... thanks for the info. helpful. Razz
Back to top
View user's profile Send private message
Slugsnack
Grandmaster Cheater Supreme
Reputation: 71

Joined: 24 Jan 2007
Posts: 1857

PostPosted: Tue Jun 15, 2010 9:13 pm    Post subject: Reply with quote

We're allowed to code the whole thing in inline assembly ? Lol
Back to top
View user's profile Send private message
hcavolsdsadgadsg
I'm a spammer
Reputation: 26

Joined: 11 Jun 2007
Posts: 5801

PostPosted: Tue Jun 15, 2010 9:51 pm    Post subject: Reply with quote

sqrtf() is the crt function, returns a float square root.
sqrt() returns double.
fsqrt is the actual instruction.

in C++, sqrt() is overloaded so it works either way.

you can also try the SSE sqrt to do 4 sqrt's at once.
take a look at sqrtps
Back to top
View user's profile Send private message
Deltron Z
Expert Cheater
Reputation: 1

Joined: 14 Jun 2009
Posts: 164

PostPosted: Tue Jun 15, 2010 9:52 pm    Post subject: Reply with quote

Should I return the first file I encounter or all of them?
Here's what I've done so far:
Code:
#define WIN32_LEAN_AND_MEAN
#include <Windows.h>
#include <stdio.h>

#pragma warning(disable:4996)
int FindFolder(const char* SearchPattern, char* Path)
{
   WIN32_FIND_DATAA Data;
   char FilePath[MAX_PATH];
   int Length, Result = 0;
   if (Path[strlen(Path) - 1] != '\\')
      strcat(Path, "\\");
   strcpy(FilePath, Path);
   strcat(FilePath, "*");

   HANDLE hSearch = FindFirstFileA(FilePath, &Data);
   if (hSearch != NULL && hSearch != INVALID_HANDLE_VALUE)
   {
      do
      {
         if ((Data.dwFileAttributes & FILE_ATTRIBUTE_DIRECTORY) != 0 && Data.cFileName[0] != '.')
         {
            Length = strlen(Data.cFileName);
            strcat(Path, Data.cFileName);
            Result = FindFolder(SearchPattern, Path);
            if (Result)
               break;
            else
               Path[strlen(Path) - Length - 1] = NULL;
         }

         if (strstr(Data.cFileName, SearchPattern))
         {
            strcat(Path, Data.cFileName);
            Result = 1;
            break;
         }
      } while (FindNextFileA(hSearch, &Data));

      FindClose(hSearch);
   }

   return Result;
}

int main()
{
   char Path[MAX_PATH] = "C:\\";
   if (FindFolder("bmp", Path))
      printf(Path);
   else
      printf("Error: %u\n", GetLastError());

   fflush(stdin);
   getchar();

   return 0;
}

Or I could seperate all files with a semicolon. Rolling Eyes
Back to top
View user's profile Send private message
Polynomial
Grandmaster Cheater
Reputation: 5

Joined: 17 Feb 2008
Posts: 524
Location: Inside the Intel CET shadow stack

PostPosted: Wed Jun 16, 2010 5:09 am    Post subject: Reply with quote

Deltron Z wrote:
I can't belive what I'm hearing! Shocked
Recurrsion sucks? well, in mathematics, any recurrsion can be defined pretty easily without a recurrsion too, but it computer programming this isn't the case.


Recursion, whilst useful, is a bad idea in programming. It only takes an unexpected input value to cause your whole program to crash. You also have to remember that the "default stack size" is not a defined value, so if another person were to compile your code on another compiler it may fail.

Slugsnack wrote:
We're allowed to code the whole thing in inline assembly ? Lol

Never thought of that one. Ok, new rule - you cannot use inline ASM. I'll update the rules appropriately.

_________________
It's not fun unless every exploit mitigation is enabled.
Please do not reply to my posts with LLM-generated slop; I consider it to be an insult to my time.
Back to top
View user's profile Send private message
Display posts from previous:   
Post new topic   Reply to topic    Cheat Engine Forum Index -> General programming All times are GMT - 6 Hours
Goto page 1, 2, 3  Next
Page 1 of 3

 
Jump to:  
You cannot post new topics in this forum
You cannot reply to topics in this forum
You cannot edit your posts in this forum
You cannot delete your posts in this forum
You cannot vote in polls in this forum
You cannot attach files in this forum
You can download files in this forum


Powered by phpBB © 2001, 2005 phpBB Group

CE Wiki   IRC (#CEF)   Twitter
Third party websites