Cheat Engine Forum Index Cheat Engine
The Official Site of Cheat Engine
 


Where is a memory address's format stored at?

 
h3x1c
Master Cheater
Reputation: 17

Joined: 27 Apr 2013
Posts: 306

Posted: Mon Jul 25, 2016 9:44 am    Post subject: Where is a memory address's format stored at?

I've tried Googling a bit for this, but I can't seem to come across a "plain English" explanation of this. Put simply, how does a program know how to treat any given memory address? I understand that at compile time, the program is compiled with size and format in mind for everything, but where does it store such information so that at runtime, it knows when a particular address is a 4-byte int vs. 4-byte long?

Or better yet, how does CE know what format a memory address is when it scans?

I may well be thinking about this too deeply, but I'm just not understanding how a program (whether the program itself or a program like CE that can analyze said program) "knows" how to treat each of its memory addresses!

Dark Byte
Site Admin
Reputation: 471

Joined: 09 May 2003
Posts: 25832
Location: The netherlands

Posted: Mon Jul 25, 2016 11:44 am

CE doesn't know what format an address is. It relies on the user to tell it instead (or, if you use "All", it just tries every type).

If you're talking about dissect data, then it's either based on guessing (address alignment, and whether the value looks human-readable or not) or, if there is debugging information available (.NET/Mono, .pdb), it can get the info from there.
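To give a feel for what this kind of guessing could look like, here is a minimal sketch in C. It is not CE's actual heuristic: the function name `guess_type`, the enum, and the thresholds are all made up for illustration, and it assumes 32-bit IEEE-754 floats.

```c
#include <stdint.h>

/* Hypothetical result type; these names are not taken from CE. */
enum guess { GUESS_BYTE, GUESS_4BYTE, GUESS_FLOAT };

/*
 * Guess how to display the 4 raw bytes `raw` found at `address`.
 * The rules and thresholds are illustrative assumptions only.
 */
static enum guess guess_type(uintptr_t address, uint32_t raw) {
    /* Compilers normally place 4-byte values on 4-byte boundaries. */
    if (address % 4 != 0)
        return GUESS_BYTE;

    /* Biased exponent field of a 32-bit IEEE-754 float. */
    unsigned exponent = (raw >> 23) & 0xFFu;

    /* Exponents 117..147 cover magnitudes from 2^-10 up to just under
       2^21, i.e. numbers a human would plausibly read as a float. */
    if (exponent >= 117u && exponent <= 147u)
        return GUESS_FLOAT;

    return GUESS_4BYTE;
}
```

With these made-up rules, the raw value 1088421888 (0x40E00000) at an aligned address is guessed to be a float, while 100 is guessed to be a plain 4-byte integer.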

_________________
Do not ask me about online cheats. I don't know any and wont help finding them.

Like my help? Join me on Patreon so i can keep helping
STN
I post too much
Reputation: 43

Joined: 09 Nov 2005
Posts: 2676

Posted: Mon Jul 25, 2016 11:45 am

When you program, you can define which data type you want to use for your variable, such as int, short int, long int, unsigned int, char etc., depending on the language. This is a feature of statically typed languages, but if you have been using very high-level/managed languages then I can understand your confusion.

CE just guesses; of course CE doesn't know what the proper data type is.

In memory all data is the same: a string is no different from an int unless you treat it as such. A string is a collection of chars (one byte each), and a 4-byte int is a collection of four single bytes. This will make it clear for you: open CE's memory viewer and, in the hex view, change the display type from byte hex to any of the other data types. You will see that all of them are basically just bytes. That's how they are stored in memory.
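The same idea as a small C sketch: one buffer, read three different ways. The helper names (`read_u16`, `read_u32`, `read_strlen`) are invented for this example, and the numeric results mentioned afterwards assume a little-endian machine such as x86.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Read the first 2 bytes at p as an unsigned 16-bit value. */
static uint16_t read_u16(const unsigned char *p) {
    uint16_t v;
    assert(p != NULL);
    memcpy(&v, p, sizeof v);   /* memcpy sidesteps alignment and aliasing issues */
    return v;
}

/* Read the first 4 bytes at p as an unsigned 32-bit value. */
static uint32_t read_u32(const unsigned char *p) {
    uint32_t v;
    assert(p != NULL);
    memcpy(&v, p, sizeof v);
    return v;
}

/* Treat the very same bytes as a NUL-terminated string. */
static size_t read_strlen(const unsigned char *p) {
    assert(p != NULL);
    return strlen((const char *)p);
}
```

For the bytes 41 42 43 00, `read_strlen` sees the string "ABC" (length 3), `read_u16` sees 0x4241, and `read_u32` sees 0x00434241. Nothing in the buffer says which reading is the "right" one.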

_________________
Cheat Requests/Tables- Fearless Cheat Engine
https://fearlessrevolution.com


Last edited by STN on Mon Jul 25, 2016 11:50 am; edited 1 time in total
h3x1c

Posted: Mon Jul 25, 2016 11:49 am

STN wrote:
When you program, you can define which data type you want to use for your variable. Such as int, short int, long int, unsigned int, char etc. depending on the language.


I understand this part. My confusion comes in not understanding how the program itself, after compilation, knows to assign a particular format to an address it allocates. Where is this information stored in the program? Like, when you run the program and it loads into RAM, where does it check within itself to know that a particular address needs to be, say, a 4-byte long instead of a 4-byte int?

STN

Posted: Mon Jul 25, 2016 11:51 am

h3x1c wrote:
I understand this part. My confusion comes in not understanding how the program itself, after compilation, knows to assign a particular format to an address it allocates. Where is this information stored in the program? Like, when you run the program and it loads into RAM, where does it check within itself to know that a particular address needs to be, say, a 4-byte long instead of a 4-byte int?


Ah, that's within each function. For example, strcmp expects strings (chars), so the values you pass to it will be treated as strings. If you pass that same string to, say, a custom function expecting ints, it will treat those chars as ints. This is how typecasts work.

A statically typed language's compiler enforces these rules, i.e. if you define a string, you can't use it as an int unless you cast it, but it is stored in memory just the same as any other data type.

So in memory, values aren't stored as "a 4 byte" or "a double"; they are stored as a collection of bytes. Functions define how they are used. Of course, strict rules are followed at compile time to avoid anarchy and unexpected results! If your function expects a string but you accidentally pass it an int, the results can be disastrous: a 0 byte in a string signifies the end of the string, but it's just another value in an int.

Hope this makes sense. When you are gamehacking, you can modify a string one byte at a time, and the same goes for a 4- or 8-byte value. Look at doubles: a double is 8 bytes, i.e. two DWORDs, so if you want to modify one with 32-bit mov instructions, you modify those two DWORDs to reach your desired value. But you can also modify it one byte at a time.

Compilers do store data in a specific order, though, for example in the case of classes in object-oriented programming. The fields are stored close together, but you can still treat them separately and modify them a byte at a time. So the format info isn't stored anywhere; the functions themselves decide how to use the data (speaking of debugging time; at compile time this is all enforced by strict rules).
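A rough C sketch of the "0 ends a string but is just another value in an int" point. The helper `length_as_string` is a made-up name; it hands the raw bytes of a 4-byte integer to strlen, which knows nothing about ints and simply stops at the first 0 byte. The byte order described afterwards assumes a little-endian machine.

```c
#include <stdint.h>
#include <string.h>

/*
 * Hand the 4 bytes of an integer to a function that expects a string.
 * strlen has no idea these bytes were "meant" to be an int.
 */
static size_t length_as_string(uint32_t value) {
    char bytes[sizeof value + 1];

    memcpy(bytes, &value, sizeof value);
    bytes[sizeof value] = '\0';   /* guard byte so strlen always terminates */
    return strlen(bytes);
}
```

On little-endian hardware the integer 0x00006948 sits in memory as 48 69 00 00, so it reads as the string "Hi" with length 2. The integer 256 sits in memory as 00 01 00 00, so as a string it is empty, even though as an int it is a perfectly ordinary value.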

Last edited by STN on Mon Jul 25, 2016 12:02 pm; edited 1 time in total
h3x1c

Posted: Mon Jul 25, 2016 11:55 am

STN wrote:
Ah, that's within each function.


Why on earth did my brain not put that together?

Right, so then we get into value types and reference types between stack/heap/global, any addresses of which all have their size/format defined from their respective functions, correct?

STN

Posted: Mon Jul 25, 2016 12:28 pm

h3x1c wrote:
Why on earth did my brain not put that together? Laughing

Right, so then we get into value types and reference types between stack/heap/global, any addresses of which all have their size/format defined from their respective functions, correct?


See my edit above; the forums should have a notification of some sort.

Anyway, I'm not sure I understand your question correctly. What makes this all very clear for me is to think of it the way it actually is once you remove all the prettiness that compilers and high-level languages add. Eventually it is just a collection of on/off states; one level above that, 1s and 0s; one level above that, the assembly you know. So when we speak of debugger time, assembly is what we are dealing with.

What do you know about assembly? There are bytes which encode the opcodes/instructions that make it all happen. So, how do different instructions treat a certain data type and the register types? If this is all clear to you, then you already know how everything is stored in memory.

Bytes are stored in the file and then loaded into memory. Those bytes translate to opcodes/instructions, right? THAT IS IT! That is all that is stored. You can treat the bytes encoding instructions as a data type, and you can treat the data in the .data section as a data type. You can treat anything in memory as any data type: a pointer, a reference type, a float, a double etc.

I am over-simplifying things, but you get it now, right?

A function in your source translates to instructions in memory. Say you made a function that expects a float; then an instruction like this will be in memory:
fld dword ptr [game.exe+92]

game.exe+92 is expected to contain a float value. But suppose you made another function where you used game.exe+92 as a 4-byte value, say:
mov dword ptr [game.exe+92],1
Then game.exe+92 will be treated as a 4-byte value! Both are true. A float value of, say, -8.826972961 is stored in memory as 48 3B 0D C1, but those same bytes can just as easily be a 4-byte value (C10D3B48), a single byte (48), a pointer (to C10D3B48), or a reference type.

So nowhere is the format defined; the data is just stored in memory as bytes, and the functions (instructions) use it the way you told them to at compile time.
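The byte example above can be checked with a few lines of C. This is only a sketch: the names `example_bytes`, `view_as_float`, `view_as_4byte` and `view_as_byte` are invented here, and it assumes a little-endian machine with 32-bit IEEE-754 floats.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* The four bytes from the post, in memory order. */
static const unsigned char example_bytes[4] = { 0x48, 0x3B, 0x0D, 0xC1 };

/* Interpret 4 bytes as a float, the way fld would. */
static float view_as_float(const unsigned char *p) {
    float f;
    assert(sizeof f == 4);     /* assumption: 32-bit float */
    memcpy(&f, p, sizeof f);
    return f;
}

/* Interpret the same 4 bytes as a 4-byte integer, the way a dword mov would. */
static uint32_t view_as_4byte(const unsigned char *p) {
    uint32_t v;
    memcpy(&v, p, sizeof v);
    return v;
}

/* Interpret only the first byte. */
static uint8_t view_as_byte(const unsigned char *p) {
    return p[0];
}
```

Called on `example_bytes`, `view_as_float` gives roughly -8.826973, `view_as_4byte` gives 0xC10D3B48 and `view_as_byte` gives 0x48. In a 32-bit process the same four bytes could equally be read as a pointer to address C10D3B48.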

ParkourPenguin
I post too much
Reputation: 152

Joined: 06 Jul 2014
Posts: 4718

Posted: Mon Jul 25, 2016 12:34 pm

You're thinking about this from a high-level perspective far too much. Value types are very useful for sanity checks when developing a program, but when you get down to it, a value type is really just an abstraction over bytes in memory. In other words, every value type is stored in memory as bytes. You're free to interpret those bytes any way you want, be it 4-byte, float, string, or something you make up (i.e. custom value types). There is absolutely nothing you can do to conclusively distinguish an address's value type just from looking at its value.

You can make an educated guess of an address's value type by looking at how the program accesses that address (e.g. fld dword ptr [eax] probably means [eax] is a float), but you still won't know for certain. When you look at the core aspects of reverse engineering, the only thing that's important is what the program does with a value. In order to quickly determine this, most people will make the assumption that a program will only treat a single value as a single type, which doesn't always have to be true. Take this C code for example:
Code:
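#include <stdio.h>

/* Assumes a little-endian machine and a 4-byte IEEE-754 float. */
int main(void) {
   float a = 9.375f;                      /* declared and initialized as a float */
   /* condition: the top byte of a, read as a char, acts as a boolean */
   for (;*((char*)&a+3); a /= 8192.0f){   /* step: a is a float again */
      printf("%d\n",*((char*)&a+3)/6);    /* the top byte used as a 1-byte value */
   }
   *(int*)&a += 0x4346;                   /* the same 4 bytes used as an integer */
   printf("%s",(char*)&a);                /* ...and finally as a string */
   return(0);
}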

This code counts down from 10 to 1 and prints a message to the screen using a single 4-byte address. That address is a float when it is declared and initialized as such, a boolean when used as the condition in the for loop, a float again in the for loop's increment statement, a 1-byte value in the print statement within the for loop, a 4-byte integer after the for loop, and a string at the final print statement. If you were just looking at the disassembly of this, you wouldn't know what to make of that variable's type. Again, the variable's type isn't important- what the code is doing is the important thing.


When CE scans for a particular value type, it looks through all memory you've specified. The only thing that changes between a 4-byte scan and a float scan is how CE interprets the bytes in memory. This is why you can change the value type in the found list on the fly (as long as the types are of the same length). When you dissect a structure CE has no way of gathering information on (mono discussed next), CE guesses the value types based on what looks correct. For example, the 4-byte value 1088421888 would be better interpreted as the float 7.0.
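A toy version of that scanning idea in C, to show that a "4-byte scan" and a "float scan" can share the same loop and differ only in how the searched value is turned into bytes. The names `scan_raw`, `scan_4byte` and `scan_float` are hypothetical, and this is a simplification rather than CE's real scanner: it only checks 4-byte-aligned offsets and compares floats bit-for-bit, with no rounding or tolerance.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Count 4-byte-aligned offsets in buf whose bytes equal the 4 bytes at pattern. */
static size_t scan_raw(const unsigned char *buf, size_t len, const void *pattern) {
    size_t hits = 0;

    for (size_t i = 0; i + 4 <= len; i += 4) {
        if (memcmp(buf + i, pattern, 4) == 0)
            hits++;
    }
    return hits;
}

/* "4-byte scan": the wanted value is encoded as an integer. */
static size_t scan_4byte(const unsigned char *buf, size_t len, uint32_t wanted) {
    return scan_raw(buf, len, &wanted);
}

/* "Float scan": same memory, same loop, the wanted value is encoded as a float. */
static size_t scan_float(const unsigned char *buf, size_t len, float wanted) {
    return scan_raw(buf, len, &wanted);
}
```

On a buffer that contains the 4-byte value 1088421888, both `scan_4byte(buf, len, 1088421888u)` and `scan_float(buf, len, 7.0f)` find the same location, because the two values have identical bytes.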

That's not to say this information can't exist. Some languages or compilers might store this information somewhere, usually for debugging purposes. Object-oriented languages can store information about an object's class at the start of the object, mono software can provide information about different classes and fields (CE makes use of this), and compilers can keep information on variables around if asked to. However, since this information is useless with regards to the execution of a bug-free program, it is commonly omitted for privacy and efficiency.

_________________
I don't know where I'm going, but I'll figure it out when I get there.
h3x1c

Posted: Mon Jul 25, 2016 2:54 pm

Thanks DB, STN, and Parkour! This is crystal clear for me now.

The convolution in my head stems from a weird amalgamation of things I've been studying at the same time lately from low-level and high-level (C#, specifically)--the error being what you led with in your reply, Parkour.

Thanks again for your detailed replies, everyone!!!

mgr.inz.Player
I post too much
Reputation: 222

Joined: 07 Nov 2008
Posts: 4438
Location: W kraju nad Wisla. UTC+01:00

Posted: Mon Jul 25, 2016 3:11 pm    Post subject: Re: Where is a memory address's format stored at?

h3x1c wrote:
how does a program know how to treat any given memory address? I understand that at compile time...

Everything above is true for many programming languages. But there are other languages...


For example, Lua 5.3 (and the customized Lua in games). The type of a variable can change dynamically while its value stays at the same address.
So there must be a few additional bytes that hold the variable's type.


Link

You probably want to know why the type tags are 0, 1, 3 and 19?



From the lua.h header file:
Code:
/*
** basic types
*/

#define LUA_TNIL            0
#define LUA_TBOOLEAN        1
#define LUA_TLIGHTUSERDATA  2
#define LUA_TNUMBER         3
#define LUA_TSTRING         4
#define LUA_TTABLE          5
#define LUA_TFUNCTION       6
#define LUA_TUSERDATA       7
#define LUA_TTHREAD         8


Plus the variant tags for numbers (LUA_TNUMBER) from the lobject.h header file:
Code:
/* Variant tags for numbers */

#define LUA_TNUMFLT  (LUA_TNUMBER | (0 << 4))  /* float numbers */
#define LUA_TNUMINT  (LUA_TNUMBER | (1 << 4))  /* integer numbers */


And the above means:
LUA_TNUMFLT = 3 | (0 << 4) = 3 + 16*0 = 3
LUA_TNUMINT = 3 | (1 << 4) = 3 + 16*1 = 19



So, it looks like this:
LUA_TNIL = 0;
LUA_TBOOLEAN = 1;
LUA_TLIGHTUSERDATA = 2;
LUA_TNUMFLT = 3; -- it is LUA_TNUMBER
LUA_TSTRING = 4;
LUA_TTABLE = 5;
LUA_TFUNCTION = 6;
LUA_TUSERDATA = 7;
LUA_TTHREAD = 8;
...
LUA_TNUMINT = 19; -- it is also LUA_TNUMBER
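To illustrate why a dynamically typed language needs those extra bytes, here is a simplified C sketch of a value slot that carries its own type tag. The tag numbers match the list above, but the `Slot` struct, the `TAG_*` names and the helper functions are invented for this example and are not Lua's real internal layout.

```c
#include <stdint.h>

/* Tag values as listed above; the TAG_ names are hypothetical. */
enum {
    TAG_NIL     = 0,
    TAG_BOOLEAN = 1,
    TAG_NUMBER  = 3,
    TAG_NUMFLT  = TAG_NUMBER | (0 << 4),   /* 3  */
    TAG_NUMINT  = TAG_NUMBER | (1 << 4)    /* 19 */
};

/* A simplified stand-in for a dynamically typed variable: payload plus tag. */
typedef struct {
    union {
        int64_t i;   /* integer payload */
        double  n;   /* float payload   */
        int     b;   /* boolean payload */
    } value;
    int tag;         /* tells the runtime how to read `value` */
} Slot;

static void slot_set_int(Slot *s, int64_t v)  { s->value.i = v; s->tag = TAG_NUMINT; }
static void slot_set_float(Slot *s, double v) { s->value.n = v; s->tag = TAG_NUMFLT; }

/* The tag, not the address, decides how the payload bytes are interpreted. */
static double slot_to_double(const Slot *s) {
    switch (s->tag) {
    case TAG_NUMINT:  return (double)s->value.i;
    case TAG_NUMFLT:  return s->value.n;
    case TAG_BOOLEAN: return s->value.b ? 1.0 : 0.0;
    default:          return 0.0;
    }
}
```

Storing an integer and then a float in the same `Slot` keeps the payload at the same address; only the tag changes, from 19 to 3. That tag is the kind of value you would expect to see next to the variable in memory.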
