WinCE Beginner Tutorial #1
By: Shub-Nigurrath /ARTeam
http://cracking.accessroot.com

Basics on patching WinCE Application

The Target:
none
The Tools:
IDA Professional 4.7, eMbedded Visual Tools, Hex Workshop
The Protection:
none

Other Information:
This tutorial covers some initial elements of patching WinCE applications. Most of the material has been taken from other sources and integrated here into a standalone tutorial. Enjoy the reading.


Introduction:

Many embedded operating systems are stripped-down microversions of their big brothers. An embedded operating system brings the power of a complete OS to small devices such as mobile phones or watches, which suffer from severely restricted processing and memory resources.

Embedded RCE is still in its infancy, partly because there is little demand for it, but here I'll try to introduce the architecture of an embedded OS and how to crack the applications that run on it. I have chosen Windows CE, which powers many Windows Mobile OS flavors such as PocketPC and Smartphone. Windows CE is a semi-open, scalable, 32-bit, true-multitasking operating system that has been designed to run with maximum power on minimum resources. This OS is actually a miniature version of Windows 2000/XP that can run on appliances as small as a watch.

Why did I choose to talk about WinCE? The WinCE scene is not that big, at least compared to the Palm scene, so there are not many cracks around for these applications. IMHO this is mostly because of the large number of processors on which WinCE/PocketPC runs, which obliges a cracker to learn too many assembly languages.

Nowadays cracking for this scene is easier, because most PocketPC/WinCE devices use ARM processors (so there is a single assembly language to learn) and because Microsoft decided to release its embedded Visual Studio for free (almost a complete Visual Studio 6.0). So, first of all, download the free eMbedded Visual Tools (MVT) package from Microsoft.com and get cracking, literally. The second tool of great help to us will be IDA, which is able (in the Pro version) to disassemble programs compiled for ARM (and for many other processors).
 



Windows CE Architecture:

Windows CE is the basis of all Windows Mobile PocketPC and Smartphone devices. In addition, using the CE Platform Builder, any programmer can create his own miniature operating system based on Windows CE. Consequently, CE is starting to control a vast array of consumer devices, ranging from toasters to exercise bicycles. Because of its growing prevalence, if you want to become proficient at reverse engineering applications on mobile devices it is important to understand the basics of how this operating system works. This segment briefly covers the Windows CE architecture, with a deeper look at topics important to understand when reversing.

PROCESSORS
In the world of miniature gadgets, physics is often the rate-limiting step. For example, the intense heat generated by high-speed processors in notebook PCs has been shown to be hot enough to fry eggs.
Windows CE devices are likewise limited in their choice of processors. The following is a list of processors supported by Windows CE:
 
  • ARM - Supported processors include ARM720T, ARM920T, ARM1020T, StrongARM, and XScale. ARM-based processors are by far the most common choice of CE devices at the time of this writing.
  • MIPS - Supported processors include MIPS II/32 w/FP, MIPS II/32 w/o FP, MIPS16, MIPS IV/64 w/FP, and MIPS IV/64 w/o FP.
  • SHx - Supported processors include SH-3, SH-3 DSP, and SH-4.
  • x86 - Supported processors include 486, 586, Geode, and Pentium I/II/III/IV.
     

If heat dissipation is a serious issue, the best choice is one of the non-x86 processors that uses a reduced level of power. The reduction in power consumption reduces the amount of heat created during processor operation, but it also limits the processor speed.

Kernel, Processes, and Threads
The kernel is the key component of a Windows CE OS. It handles all the core functions of the OS, such as processes, threads, and memory management. It also handles scheduling and interrupts. However, it is important to understand that Windows CE uses parts from its big brother—i.e., desktop Windows software. This means its threading, processing, and virtual memory models are similar to those of traditional Windows platforms.

While CE has a lot in common with traditional Windows, there are several items that distinguish it. These differences center on the use of memory and the simple fact that there is no hard drive (as discussed in the next section). In addition, dynamic link libraries (DLLs) in Windows CE are not implemented as they are in other Windows operating systems. Instead, they are used in such a way as to maximize the available memory. Integrating them into the core operating system means that DLLs don't take up precious space when they are executed. This is an important concept to understand before trying to reverse a program in Windows CE. Due to this small difference, attempting to break a program while it is executing a system DLL is not allowed by Microsoft's MVT.

A process in Windows CE represents an executing program. The number of processes is limited to 32, but each process can execute a theoretically unlimited number of threads. Each thread has a 64K memory block assigned to it, in addition to an ID and a set of registers. It is important to understand this concept because when debugging a program, you will be monitoring the execution of a particular thread, its registers, and the allotted memory space. In the process, you will be able to deduce hidden passwords, serial numbers, and more.

Processes can run in two modes: kernel and user. A kernel process has direct access to the OS and the hardware. This gives it more power, but a crash in a kernel process often crashes the whole OS. A user process, on the other hand, operates outside the kernel memory—but a crash only kills the running program, not the whole OS. In Windows CE, any third-party program will operate in user mode, which means it is protected. In other words, if you crash a program while reversing it, the whole OS will not crash (though you still may need to reboot the device).

There are two other important points to understand. First, one process cannot affect the data of another process. While related threads can interact with each other, a process is restricted to its own memory slot. The second point to remember is that each existing thread is continuously being stopped and restarted by a scheduler (discussed next). This is how multitasking is actually performed. While it may appear that more than one program is running at a time, the truth is that only one thread may execute at any one time on single-processor devices.

The scheduler is responsible for managing the thread process times. It does this by giving each thread a chance to use the processor. By continuously moving from thread to thread, the scheduler ensures that each gets a turn. Three key features for adjusting processor time are built into the scheduler.

The first feature is a method that is used to increase the amount of processor time. The secret is found in multithreading an application. Since the scheduler assigns processor time at the thread level, a process with 10 threads will get 10 times the processor time of a process with one thread.
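
To make the "10 threads get 10 times the processor time" claim concrete, here is a toy model in Python. It is only a sketch: it assumes every thread has the same priority and never blocks, and the names (share_of_slices, busy.exe, idle.exe) are made up for the illustration. It is not how the real CE scheduler is implemented.

```python
from collections import Counter
from itertools import cycle, islice


def share_of_slices(threads_per_process, total_slices):
    """Hand out time slices round-robin over threads and count them per process."""
    # One run-queue entry per thread, tagged with the name of its owning process.
    run_queue = [name
                 for name, count in threads_per_process.items()
                 for _ in range(count)]
    # Walk the queue in a circle, giving one slice to each thread in turn.
    return Counter(islice(cycle(run_queue), total_slices))


# Hypothetical processes: one with 10 threads, one with a single thread.
slices = share_of_slices({"busy.exe": 10, "idle.exe": 1}, total_slices=1100)
print(slices["busy.exe"], slices["idle.exe"])  # 1000 100
```

Because the slices are handed out per thread, the process with 10 threads ends up with 10 times as many slices as the single-threaded one.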

Another method for gaining more processor time is to increase the process priority; but it's not encouraged unless necessary. Changing priority levels can cause serious problems in other programs, and it affects the speed of the computing device as a whole. The THREAD_PRIORITY_TIME_CRITICAL priority is important; it forces the processor to complete the critical thread.

The final interesting feature of the scheduler deals with a problem that can arise when priority threading is used. If a low-priority thread is executing and it ties up a resource needed by a higher-priority thread, the system could become unstable. In short, a paradox is created in which the high thread waits for the low thread to finish, which in turn waits on the high to complete. To prevent this situation from occurring, the scheduler will detect such a paradox and boost the lower-priority thread to a higher level, thus allowing it to finish.

Note that all of these problems are issues that every Windows OS must deal with. A Windows Mobile device may seem different, but it is still a Microsoft product, and as such it is limited by those products' common constraints.

Memory Architecture
One of the unique properties of most devices running Windows CE is the lack of a disc hard drive. Instead of spinning discs, pocket PCs use old-fashioned RAM (Random Access Memory) and ROM (Read Only Memory) to store data. While this may seem like a step back in technology, the use of static memory like ROM is on the rise and will eventually make moving storage devices obsolete. The next few paragraphs explain how memory in a Windows CE device is used to facilitate program execution.

In a Windows CE device, the entire operating system is stored in ROM. This type of memory is typically read-only and is not used to store temporary data that can be deleted. On the other hand, data in RAM is constantly being updated and changed. This memory is used to hold all files and programs that are loaded into the Windows CE-based device.

RAM is also used to execute programs. When a third-party game is executed, it is first copied into RAM and is executed from there. This is why a surplus of RAM is important in a Windows CE device. However, the real importance of RAM is that its data can be written to and accessed by an address. This is necessary because a program will often have to move data around. Since each program is allotted a section of RAM to run in when it is executed, it must be able to write directly to its predefined area.

While ROM is typically only used as a static storage area, in Windows CE it can be used to execute programs. This process is known as Execute In Place (XIP). In other words, RAM is not required to hold the ROM's data as a program executes. This freedom allows RAM to be used for other important applications. However, it only works with ROM data that is not compressed. While compression allows more data to be stored in ROM, the decompression will force any execution to be done via RAM.

RAM usage on a Windows CE device is divided between two functions. The first is the object store, which is used to hold files and data that are used by the programs but are not stored in ROM. In particular, the object store holds compressed program files, user files, database files, and the infamous Windows registry file. Although this data is stored in RAM, it remains intact when the device is turned off, because the RAM is kept charged by the power supply. This is the reason it is very important to never let the charge on a Pocket PC device completely die. If this happens, the RAM loses power and resets. It dumps all installed programs and wipes everything on the device except what is stored in ROM. This is referred to as a hard reboot when dealing with a Pocket PC device.

The second function of the RAM is to facilitate program execution. As previously mentioned, when a program is running, it needs to store the information it is using—this is the same function that RAM serves on a typical desktop PC. Any data passing through a program, such as a password or serial number, will be written to the RAM at one time or another.

Windows CE does have a limit on the RAM size. In Windows CE 3.0 it is 256 MB with a 32 MB limit on each file, but in Windows CE .NET this value has been increased to a rather large 4 GB. In addition, there is a limit to the number of files that can be stored in RAM (4 million) and to the number of programs that can operate at the same time. This brings us to multitasking.

Windows CE was designed to be a true multitasking operating system. Just like other modern Windows operating systems, it allows more than one program to be open at a time. In other words, you can listen to an MP3 while taking notes and checking out sites on the Internet. Without multitasking, you would be forced to close one program before opening another. However, you must be careful not to open too many programs on a Windows CE device. Since you are limited by the amount of RAM in the device, and each open program takes up a chunk of the RAM, you can quickly run out of memory.

Finally, the limitation of RAM in a pocket PC also affects the choice of operating system. Since Windows CE devices may only have 32-128 MB of internal RAM, they do not make good platforms for operating systems that use a lot of memory, such as embedded Windows XP. In this OS, the minimum footprint for a program is 5 MB. On the other hand, Windows CE only requires 200K; this is a 2500% difference.

Graphics, Windowing, and Event Subsystem (GWES)
This part of the Windows CE architecture is responsible for handling all the input (e.g., stylus) and output (e.g., screen text and images). Since every program uses windows to receive messages, it is a very important part of Windows CE. It is one of the areas you need to understand to successfully reverse a program.

Without going into too much detail, you should know that every Windows CE process is assigned its own windows messaging queue. The queue is similar to a stack of papers that is added to and read from. This queue is created when the program calls GetMessage, which is very common in Windows CE programs. While the program executes and interacts with the user, messages are placed in and removed from the queue. The following is a list and explanation of the common commands that you will see while reverse engineering:
 

  • PostMessage - Places message on queue of target thread, which is returned immediately to the process/thread
  • SendMessage - Places message on queue, but does not return until it is processed
  • SendThreadMessage - Sends messages directly to thread instead of to queue
     

These Message commands, and others, act as bright, virtual flares when reversing a program. For example, if a "Sorry, wrong serial number" warning is flashed on the screen, you can bet some Message command was used. By looking for the use of this command in a disassembler, you can find the part of the program that needs further research.

We've given you a quick inside look at how Windows CE operates. This information is required reading for the rest of the tutorial. Understanding processing, memory architecture, and how Windows CE uses messages to communicate with the executing program will make it easier for you to understand how CE cracking works. Just as a doctor must understand the entire human body before diagnosing even a headache, a reverse engineer must thoroughly understand the platform he is dissecting to be successful in making a patch or deciphering a serial number.
 

 

 

Practical CE Reverse Engineering:

For this section, you will need to use the usual cracking tools (hex editors and disassemblers). We will start by creating a simple "Hello World!" application, and then use this program to demonstrate several cracking methods. After this discussion, we offer a hands-on tutorial that allows you to walk through real-life examples of how reverse engineering can be used to get to the heart of a program.


Hello, World!
When learning a programming language, the first thing most people do is to create the famous "Hello, World" application. This program is simple, but it helps to get a new programmer familiar with the syntax structure, compiling steps, and general layout of the tool used to create the program. In fact, Microsoft's eMbedded Visual C++ goes so far as to provide its users with a wizard that creates a basic "Hello World" application with the click of a few buttons. The following are the required steps:

  1. Open Microsoft eMbedded Visual C++.

  2. Click File -> New.

  3. Select the Projects tab.

  4. In the "Project Name:" field, type "test", as illustrated in Figure 2. Select WCE Application on the left.

Figure 2. WCE application creation window
images/sn_0402.gif

By default, all compiled executables will be created in the C:\Program Files\Microsoft eMbedded Tools\Common\EVC\MyProjects\ directory.


  5. Click OK.

  6. Ensure "A typical `Hello World!' Application" is selected, and click Finish.

  7. Click OK.

We're running the programs on a PDA synchronized with our computer, but the beauty of Microsoft's eMbedded Visual Tools is that you don't need a real device. The free MVT has an emulator for virtual testing.


After a few seconds, a new "test" class appears on the left side of the screen, under which are all the classes and functions automatically created by the wizard. We aren't making any changes to the code, so next, we compile and build the executable:

  1. Ensure the device is connected via ActiveSync.

  2. Click Build test.exe.

  3. Click Yes/OK through the warnings.

  4. Locate the newly created executable in your C:\Program Files\Microsoft eMbedded Tools\Common\EVC\MyProjects\ directory, or whatever directory you selected during the wizard, and copy it to your device.

Once the steps are complete, find test.exe on your device and execute it. If everything went according to plan, you'll see a screen similar to Figure 3. After a short break to discuss some of the popular methods crackers use to subvert protection, we will take a closer look at test.exe and make some changes to it using our reversing tools.

Figure 3. test.exe screen on the Windows CE device
images/sn_0403.gif

CE Cracking Techniques

Predictable system calls
In about 80% of all software, there is a common flaw that leads to the eventual cracking of the software: predictable code. For example, if you go through the registration process, you will almost always find a message that tells you the wrong serial number was entered. While this is a nice gesture for the honest person who made a mistake, it is a telltale sign that the program is an easy crack.

The problem arises simply because there are a limited number of alert boxes that appear in a program. A cracker has only to open the program in IDA Pro and search the strings for any calls made to MessageBoxW—the name of the function responsible for sending a message to the computer screen.

Once the cracker finds this call, she can use the reference list included with IDA Pro to backtrack through the program until she finds the point where the serial number is verified. In other words, using a message box to warn about an invalid serial gives the cracker the necessary starting point to look for a weakness. Without it, a beginner cracker could spend hours slowly stepping through the program, testing and probing.

Other common calls are Load String (for loading serial number values into a variable), Registry checks (for checking to see if the program is registered or not), and System Time checks (for checking for trial period deadlines). To find these, a cracker only has to use the Names window, which lists all the functions and system calls used in the program. Figure 4 is taken from IDA Pro, with our test.exe program loaded into it. The highlighted function may be a good place to start when looking for a way to alter the displayed message.

Figure 4. Names window in IDA, listing the CE functions used
images/sn_0404.gif

strlen and wcslen
When working with strings such as usernames, serials, or other text entries, it is important to monitor the length. The length of the string is important for two reasons. One, a program that expects a string may generate an error if it receives a variable with no value. For example, if a program is trying to divide two numbers and the denominator is blank, the calculation will fail. To avoid problems like this, a program will include checks to ensure that a value is indeed entered.

The second main use of string length checks is when setting aside memory for a variable. For example, our "Hello, World!" application must set aside enough memory for a 12-character variable. The program checks to see how much space is required using wcslen, as the following code illustrates:

ADD  R0, SP, #0x54;    Points R0 to memory address of 'Hello World!' string.

BL   wcslen;           Tests the length of the string and places that value in R0.

While testing string length is undeniably important, it is also an easy function to find and abuse. Because these types of functions are required when verifying serial numbers, a cracker has only to look in the Names window of the application to start the reversing process. In fact, crackers sometimes target this check and reset the required serial number length to zero, thus bypassing a program's security.

strcmp and CMP
Another popular method of finding serial number checks is through the use of the comparison (CMP) instruction. This type of function is used to compare two values to see if they are equal, and it can flip the Zero flag to true or false accordingly. Again, this is a required function for program execution; however, it comes with a serious risk.

Using strcmp or CMP as the sole method of validation in a registration process is not recommended. This particular function is one of the most abused and exploited functions in assembler. In fact, the use of this one little command can sometimes neuter a program that uses complex serial verification routines with encryption, name checks, and more.

For example, some programs do not actually store their serial numbers in the program file. Instead, an algorithm is used to create a valid serial number on the fly, based on owner names, hardware settings, the date/time, and more. In other words, thousands of lines of code are dedicated to creating a valid registration key. This key is used in the validation process to check any serial number that is entered to unlock a program. However, at the very end of the verification routine, most programs simply perform a simple comparison between the entered serial number and the one generated by the complex algorithm. The results of this check are placed into one of the registers, which are used to determine how the program flows. Typically, the next line includes some conditional branch call that either accepts the entered serial number or rejects it. Let's take a look at the following example, in which strcmp is used to verify a registration value:

Assume R1 = address of correct serial

ADD   R0, SP, #0x12    
   : This points R0 at a location on the stack, which holds the serial 
   : number entered by the user.

BL   strcmp
   : This compares the strings that R0 and R1 point to and returns the 
   : result in R0: 0 for a match and a nonzero value for no match.

MOVS  R2, R0
   : Writes the value of R0 (the result of the strcmp) into R2.

MOV   R0, #0
   : Assigns R0 = 0

CMP  R2, R0
   : The CMP will check R0 against the value held by R2 (the results of the strcmp); 
   : if these values match, the Zero flag is set and the serials match.

Following this function, there would be a branch link to another section of code that would update the serial status and probably alert the user to a success or failure of the registration attempt. This would be done using the status flags, updated when the CMP opcode was executed. The following is an example:

BNE    loc_0011345
BEQ    loc_0011578

Therefore, if a cracker wanted to patch this program, he would only need to ensure that the CMP opcode always worked to his advantage. To do this, he would change a compare of two different registers (first line) into a compare of a register with itself (second line). The listing above compares R2 with R0, but the patch works the same way for any second register:

CMP    R2, R1
CMP    R2, R2

Since R2 will always equal R2, the CMP updates the status flags with an Equal status. This is used in the BNE/BEQ branches, which react with a positive serial check. To do this, a cracker would have to update the hex values as follows:

CMP    R2, R1    Hex: 01 00 52 E1
CMP    R2, R2    Hex: 02 00 52 E1

In other words, thanks to strcmp and the change of one hex character, the protection of this program is nullified.
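
If you want to check such hex values yourself, the encoding of a register-to-register CMP is easy to compute. The Python sketch below assumes the classic 32-bit ARM encoding with the "always" condition and no shift on the second operand; the helper names (encode_cmp_reg, hex_bytes) are ours, not part of any tool.

```python
import struct


def encode_cmp_reg(rn, rm):
    """Encode ARM 'CMP Rn, Rm' (condition AL, no shift) as little-endian bytes."""
    # 0xE15 = condition AL + data-processing opcode CMP with the S bit set;
    # Rn sits in bits 19..16 and Rm in bits 3..0.
    word = 0xE1500000 | (rn << 16) | rm
    return struct.pack("<I", word)


def hex_bytes(data):
    """Format bytes the way a hex editor shows them."""
    return " ".join(f"{b:02X}" for b in data)


original = encode_cmp_reg(2, 1)   # CMP R2, R1
patched = encode_cmp_reg(2, 2)    # CMP R2, R2
print(hex_bytes(original))  # 01 00 52 E1
print(hex_bytes(patched))   # 02 00 52 E1
```

Only the first byte in the file differs between the two instructions, which is why the patch is a one-character change in the hex editor.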

NOP sliding
When attacking a program, there are some situations that require a cracker to overwrite existing code with something known as a nonoperation (NOP). A nonoperation simply tells the processor to move on to the next command. When a series of NOP commands are used in sequence, the processor virtually slides through the code until it hits a command it can perform. This technique is popular in both the hacking and cracking community, but for different reasons.

A hacker typically uses NOP slides to facilitate the execution of inserted code through a buffer overflow. A buffer overflow  is a method of overflowing a variable's intended memory allocation with data. This allows a hacker to write her own code right into the memory, which can be used to create a backdoor, elevate permissions, and more. However, a hacker does not always know where her code ends up in the target computer's memory, so she typically pads her exploit code with NOP commands. This allows a hacker to guess where in the memory to point the execution code. Upon hitting the NOP commands, the processor just slides into the exploit code and executes it.

A cracker, on the other hand, does not use NOP slides to execute code. Instead, he uses NOP commands to overwrite code he does not want executed. For example, many programs include a jump or branch in the assembler code that instructs the processor to validate a serial number. If a cracker can locate this jump in the program, he can overwrite it with a NOP command. This ensures that the program remains the same byte size and bypasses the registration check. Typically, this method will also be used with a slight alteration on a compare or equivalence function, to ensure proper continued code execution.

Traditionally, the NOP command is as simple as typing 0x90 over the hex that needs to be nullified. However, this works only on an x86 processor, not on ARM. If you attempt to use 0x90s on ARM, you end up inserting UMULLLSS (UMULL with the LS condition and the S suffix), which is the command to perform an unsigned multiply long if the LS condition is met, followed by an update of the status flags depending on the result of the calculation. Obviously, this is about as far from a NOP as you can get.
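
You can see why four 0x90 bytes are not a NOP on ARM by decoding them by hand. The following Python sketch only recognizes the multiply-long family of the classic 32-bit ARM encoding, which is all we need here; describe_multiply_long is a name invented for this example, not a real disassembler API.

```python
import struct

# Condition codes in encoding order (bits 31..28 of every ARM instruction).
CONDITIONS = ["EQ", "NE", "CS", "CC", "MI", "PL", "VS", "VC",
              "HI", "LS", "GE", "LT", "GT", "LE", "AL", "NV"]


def describe_multiply_long(raw):
    """Decode four little-endian bytes as an ARM multiply-long, or return None."""
    (word,) = struct.unpack("<I", raw)
    # Multiply-long instructions have bits 27..23 == 00001 and bits 7..4 == 1001.
    if (word >> 23) & 0x1F != 0b00001 or (word >> 4) & 0xF != 0b1001:
        return None
    signed = (word >> 22) & 1       # 0 = unsigned, 1 = signed
    accumulate = (word >> 21) & 1   # 0 = multiply, 1 = multiply-accumulate
    sets_flags = (word >> 20) & 1   # the S suffix
    name = ("S" if signed else "U") + ("MLAL" if accumulate else "MULL")
    return {"op": name, "cond": CONDITIONS[word >> 28], "S": bool(sets_flags)}


print(describe_multiply_long(b"\x90\x90\x90\x90"))
# {'op': 'UMULL', 'cond': 'LS', 'S': True}
```

So the "x86 NOP" pattern really is a conditional unsigned multiply that also touches the status flags.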

Ironically, the ARM processor has no true NOP command. Instead, a cracker would need to use a series of commands that essentially perform no operation. This is accomplished by simply moving a value from a register back into itself, as follows:

(MOV R1, R1)

This method of cracking is common because it is one of the easiest to implement. For example, if a cracker wanted to bypass a "sleep" function in a shareware program, she could easily search for and find something similar to the following code.

Assembler                  HEX
MOV       R0, #0x15        15 00 A0 E3
BL        Sleep            FF 39 00 EB
MOV       R4, R0           00 40 A0 E1

Using a hex editor, a cracker would only have to make the following changes to the code to cause the "sleep" function to be ignored:

Assembler                  HEX
MOV       R0, #0x15        15 00 A0 E3
MOV       R1, R1           01 10 A0 E1
MOV       R4, R0           00 40 A0 E1

Note the missing Sleep command. When you overwrite this command, the revised program will not display, for example, a nag screen that temporarily restricts access. Instead, the user will be taken straight into the program.
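
The same change can be scripted instead of typed into a hex editor. This is a minimal Python sketch, assuming you already know the offset of the BL instruction; here we patch the three instructions from the example held in memory, and the function name nop_out is our own.

```python
ARM_MOV_R1_R1 = bytes.fromhex("0110A0E1")  # MOV R1, R1 in little-endian byte order


def nop_out(image, offset, expected):
    """Return a copy of image with the 4-byte instruction at offset replaced."""
    # Safety check: only patch if the bytes we expect are really there.
    if image[offset:offset + 4] != expected:
        raise ValueError("unexpected bytes at offset; refusing to patch")
    patched = bytearray(image)
    patched[offset:offset + 4] = ARM_MOV_R1_R1
    return bytes(patched)


# The three instructions from the example: MOV R0,#0x15 / BL Sleep / MOV R4,R0
code = bytes.fromhex("1500A0E3" "FF3900EB" "0040A0E1")
patched = nop_out(code, 4, bytes.fromhex("FF3900EB"))
print(" ".join(f"{b:02X}" for b in patched))
# 15 00 A0 E3 01 10 A0 E1 00 40 A0 E1
```

Checking the expected bytes first costs one line and saves you from corrupting a file when the offset is wrong. The patched code has exactly the same size as the original, so nothing else in the executable has to move.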

To our knowledge, at the time of this writing there are no hex editors that work directly on Windows Mobile platforms. However, you can edit the application on the desktop (Figure 5) using a hex editor.

Figure 5. UltraEdit-32 hex output of test.exe
images/sn_0405.gif

Disassembling a CE Program

As discussed previously, a disassembler is a program that interprets machine code into a language that humans can understand. Recall that a disassembler attempts to convert hex/binary into its assembler equivalent. However, there are as many different assembler languages as there are types of processors. AMD, Intel, and RISC processors each have their own languages. In fact, processor upgrades often include changes to the assembler language, to provide greater functionality.

As a result of the many variations between languages, disassembling a program can be challenging. For example, Microsoft's MVT, discussed next, includes a disassembler to allow for CE debugging. However, this program will not debug code meant to run on a Motorola cell phone. This is why choosing the right debugger is an important process—which brings us to IDA Pro.

Once you have obtained a copy of IDA Pro, execute it and select New from the pop-up screen. You will be prompted for a program to disassemble. For this exercise, we will use the test.exe file that we just created. However, we are going to alter the file and control the execution of the program to show a different message than the one it was originally programmed for.

Loading the file
The first thing you need to do is load the test.exe file into IDA Pro. You need to have a local copy of the file on your computer. Step through the following instructions to get the test.exe file disassembled.
  1. Open IDA (click OK through splash screen).

  2. Click New at the Welcome screen and select test.exe from the hard drive; then, click Open.

  3. Check the "Load resources" box, change the "Processor type" drop-down menu selection to "ARM processors: ARM," and click OK, as illustrated in Figure 6.

  4. Click OK again if prompted to change the processor type.

At this point you may be asked for some *.dll files. We recommend that you find the requested files (either from MVT or from your device) and transfer them to a local folder on your PC. This allows IDA to fully disassemble the program. test.exe requires the AYGSHELL.DLL file, which can be downloaded from the Internet.


  5. Locate any requested *.dll files and wait for IDA to disassemble the program.

  6. If the Names window does not open, select it from the View -> Open Subviews -> Names menu.

  7. Locate "LoadStringW" from the list and double-click on it.

Figure 6. IDA Pro startup configuration for test.exe
images/sn_0406.gif

At this point, you should have the following chunk of code listed at the top of the disassembler window:

.text:00011564 ;  S U B R O U T I N E 
.text:00011564 
.text:00011564 
.text:00011564 LoadStringW   ; CODE XREF: sub_110E8+28#p
.text:00011564               ; sub_110E8+40#p ...
.text:00011564      LDR   R12, =__imp_LoadStringW
.text:00011568      LDR   PC, [R12]
.text:00011568 ; End of function LoadStringW

If you look at this code, you can see that LoadStringW is considered a subroutine. A subroutine is a mini-program that performs some action for the main program. In this case, it is loading a string. However, you will want to pay attention to the references that use this subroutine. These will be listed at the top of the routine under the CODE XREF, which stands for cross-reference. In our case, two of the addresses in this program that call this subroutine are shown; they are sub_110E8+28 and sub_110E8+40. While these addresses may appear a bit cryptic, they are easy to understand. In short, the cross-reference sub_110E8+28 tells you that this LoadStringW subroutine was called by another subroutine that is located at address 110E8 in the program. The actual call to LoadStringW was made at the base 110E8 address plus 28 (hex) bytes into the routine.

Not all XREFs are always visible. If there are more than two, there will be a "..." after the second reference.
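
If the label arithmetic is not obvious yet, the Python sketch below turns a cross-reference label into the absolute address of the call. It assumes the label has the form sub_<hex base>+<hex offset>, as in the listing above; resolve_xref is a helper name invented for this example.

```python
def resolve_xref(label):
    """Turn an IDA-style 'sub_XXXX+YY' label into an absolute address."""
    name, _, offset = label.partition("+")
    if not name.startswith("sub_"):
        raise ValueError("expected a label of the form sub_<hex>+<hex>")
    base = int(name[len("sub_"):], 16)   # subroutine start, in hex
    return base + (int(offset, 16) if offset else 0)


for label in ("sub_110E8+28", "sub_110E8+40"):
    print(label, "->", format(resolve_xref(label), "X"))
# sub_110E8+28 -> 11110
# sub_110E8+40 -> 11128
```

Both the base and the offset are hexadecimal, so a plain decimal addition would give you the wrong address.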


While it is possible to scroll up to this memory location, IDA makes it easy by allowing us to click on the reference. Here's the secret: right-click on the "..." and select the "Jump to cross reference" option. Select the third option on the list, which should be 1135C. Without this shortcut, you would have to go to each XREF and check to see where in the display process the code is.

Once at address 1135C, you can see that it looks very promising. Within a short chunk of code, you have several function calls that seem to be part of writing a message to a screen (i.e., BeginPaint, GetClientRect, LoadStringW, wcslen, DrawTextW). Now we will use the lessons we've learned to see what we can do.

As we learned, wcslen is a common point of weakness. We are going to use this knowledge to change the size of our message. Let's take a closer look at this part of the code, assuming that the message is loaded into memory.

.text:0001135C          BL   LoadStringW       ;load string
.text:00011360          ADD   R0, SP, #0x54    ;change value of 
                                               ;R0 to point to string location
.text:00011364          BL   wcslen            ;get length of 
                                               ;string and put value in R0
.text:00011368    MOV   R3, #0x25              ;R3 = 0x25
.text:0001136C    MOV   R2, R0                 ;moves our string 
                                               ;length into R2
.text:00011370    STR   R3, [SP]               ;pushes R3 value 
                                               ;on memory stack
.text:00011374    ADD   R3, SP, #4             ;R3 = memory stack 
                                               ;address + 4
.text:00011378    ADD   R1, SP, #0x54          ;R1 = memory stack 
                                               ;address + 0x54
.text:0001137C    MOV   R0, R5                 ;moves R5 to R0
.text:00011380    BL   DrawTextW               ;writes text to 
                                               ;screen using R0, R1, R2 to define 
                                               ;location of string in memory, 
                                               ;length of string, and type of draw.

Now that we have broken down this part of the code (which you will be able to do with practice), how can we change the length of the string that is drawn to the screen? Since we know that this value was moved into R2, we can assume that R2 is used by the DrawTextW routine to define the length. In other words, if we can control the value in R2, we can control the message on the screen.

To do this, we only need to change the assembler at address 1136C. Since R2 gets its value from R0, we can simply replace the R0 operand with a hardcoded value of our own. Now that we know this, let us edit the program using our hex editor.

Once you get the hex editor open, you will quickly see that the address in IDA does not match the offset in the hex editor: IDA shows the address the code will have once it is loaded into memory, while the hex editor shows positions within the file on disk. However, IDA does provide the matching file position in another part of the screen, as illustrated in Figure 7. The status bar located at the bottom left corner of the IDA window gives the actual location you need to edit.

Figure 7. IDA Pro status bar showing memory address
images/sn_0407.gif
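If you prefer to compute the file position yourself, the conversion can be sketched in Python as follows. The section values used in the example (image base 0x10000, .text at RVA 0x1000, raw data at file offset 0x400) are assumptions for illustration only; read the real ones from the PE header of your own file.

```python
def va_to_file_offset(va, image_base, section_rva, section_raw_ptr):
    """Map a virtual address inside one section to its offset in the file."""
    rva = va - image_base                  # address relative to the image base
    return rva - section_rva + section_raw_ptr

# Hypothetical header values, NOT taken from test.exe.
offset = va_to_file_offset(0x1136C, image_base=0x10000,
                           section_rva=0x1000, section_raw_ptr=0x400)
print(hex(offset))
```

Whatever the result, compare it with the value in IDA's status bar before editing anything.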

Using the opcodes discussed previously in this tutorial, you recreate the hex code you want to use in place of the existing code. The following is the original hex code and the code you will want to replace it with.

Here is the original:

MOV     R2, R0        00 20 A0 E1

And here it is, updated:

MOV     R2, #1        01 20 A0 E3

Note the change from E1 to E3; it differentiates between a MOV of a register value and a MOV of a hardcoded value.
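For readers who want to build such opcodes without a table, here is a small Python sketch that assembles the two MOV forms from the standard ARM data-processing layout (condition AL, no flag update). The function names are ours, and only unrotated 8-bit immediates are handled.

```python
import struct

def encode_mov_reg(rd: int, rm: int) -> bytes:
    """MOV Rd, Rm as little-endian bytes (register form, top byte E1)."""
    return struct.pack("<I", 0xE1A00000 | (rd << 12) | rm)

def encode_mov_imm(rd: int, imm8: int) -> bytes:
    """MOV Rd, #imm8 as little-endian bytes (immediate form, top byte E3)."""
    if not 0 <= imm8 <= 0xFF:
        raise ValueError("only unrotated 8-bit immediates are handled here")
    return struct.pack("<I", 0xE3A00000 | (rd << 12) | imm8)

print(encode_mov_reg(2, 0).hex(" "))   # MOV R2, R0
print(encode_mov_imm(2, 1).hex(" "))   # MOV R2, #1
```

The only difference in the top byte is the "immediate" bit, which is what turns E1 into E3.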

What did this change accomplish? If you download the newest test.exe file to your PDA, you will see that it now has a message of just "R". In other words, we caused the program to only load the first character of the message it had stored in memory. Now, imagine what we could do if we increased the size of the message to something greater than the message in memory. Using this type of trick, a cracker could perform all kinds of manipulation. However, these types of tricks often take more than just a disassembler, which is where MVT comes in handy.
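The same edit can be scripted instead of typed into a hex editor. The sketch below is one cautious way to do it in Python: it refuses to write unless the bytes currently at the offset are the ones you expect. The function name patch_bytes, the scratch file, and the sample bytes are ours and are only there to make the example runnable.

```python
import os
import tempfile

def patch_bytes(path, offset, expected, replacement):
    """Overwrite bytes at offset, but only if the current bytes match expected."""
    if len(expected) != len(replacement):
        raise ValueError("a patch must not change the file size")
    with open(path, "r+b") as f:
        f.seek(offset)
        current = f.read(len(expected))
        if current != expected:
            raise ValueError(f"unexpected bytes at {offset:#x}: {current.hex(' ')}")
        f.seek(offset)
        f.write(replacement)

# Demo on a scratch file filled with sample data, not on a real target.
with tempfile.TemporaryDirectory() as tmp:
    demo = os.path.join(tmp, "demo.bin")
    with open(demo, "wb") as f:
        f.write(bytes(8) + bytes.fromhex("0020a0e1") + bytes(4))
    patch_bytes(demo, 8, bytes.fromhex("0020a0e1"), bytes.fromhex("0120a0e3"))
    with open(demo, "rb") as f:
        patched = f.read()

print(patched[8:12].hex(" "))
```

Always work on a copy of the target so that a wrong offset costs you nothing.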

 

Microsoft's eMbedded Visual Tools

Currently, there are very few tools available for live debugging of Windows CE devices. The choice of free tools is even more limited. However, Microsoft, in its benevolent wisdom, has provided just such a tool. You will need this tool to reverse engineer most Windows CE applications, unless you are intimately familiar with ARM assembler. Even if you do know the ARM code, the debugger will allow you to access parts of a program that you cannot access via a disassembler.

In short, MVT allows you to run a program, one line or opcode at a time. In addition, it allows you to observe the memory stack, register values, and values of variables in the program while it is executing. And if that isn't enough, the debugger allows you to actually change the values of the registers and variables while the program is executing. With this power, you can change a Zero flag from a 1 to a 0 in order to bypass a protection check, or even watch the program compare an entered serial number with the hardcoded number, one character at a time. Needless to say, a debugger gives you total control over the program. It not only lets you look at the heart of its operation, but allows you to redesign a program on the fly.

To illustrate this power, we will use our little example program again. We will change the message on the screen, but this time we will locate a different string in memory and redirect the opcode that computes the string's address to that point in memory. This has the effect of allowing us to write whatever message we want to the screen, provided it exists in memory.

Using the MVT

The first step in debugging a program is to load it into the MVT. This step typically involves the use of the Microsoft eMbedded Visual C++ (MVC) program that is included with the MVT package. Once MVC is open, perform the following steps to load the test.exe file into your debugger. Optionally, if you have a Windows Mobile device, you will want Microsoft ActiveSync loaded, with the device connected. In this case, be sure to have a copy of the test.exe file stored on the CE device, preferably under the root folder.

  1. Open Microsoft eMbedded Visual C++.

  2. Select File > Open.

  3. Change "Files of type:" to "Executable Files (.exe, .dll, .ocx)".

  4. Select the local copy of test.exe.

  5. After a brief delay, select Project > Settings from the top menu.

  6. Click the Debug tab.

  7. In the "Download directory:" text box, type "\" (or point the directory to the folder you have selected on the CE device).

  8. Click OK, and then hit F11.

  9. You will see a Connecting screen (Figure 8) followed by a warning screen (Figure 9). Select Yes on the CPU Mismatch Warning dialog window.

Figure 8. Microsoft eMbedded Visual C++ connecting screen
images/sn_0408.gif
Figure 9. Microsoft eMbedded Visual C++ CPU warning
images/sn_0409.gif
  10. Click OK on the next warning screen (Figure 10).

Figure 10. Microsoft eMbedded Visual C++ platform warning
images/sn_0410.gif
  11. The file will download and some file verification will occur.

  12. Click OK on the debugging information warning screen (Figure 11).

Figure 11. Microsoft eMbedded Visual C++ debugging information alert
images/sn_0411.gif
  13. Patiently wait as the program launches.

  14. You will be asked for several .dll files. For this example, they can be canceled. Note that you may be asked for system .dlls that you do not have; in this case, you can easily find them online for download.

  15. Patiently wait for the program to synchronize.

 

Experiencing the MVC Environment

Once the program is loaded in debug mode, you will notice it is similar to IDA Pro. This is because the program must be disassembled before it can be executed in debug mode. As with any debugger, take a moment to become familiar with the tools and options available to you.

The Registers screen is one of the most useful, after the main Disassembly window. It is also important to note that you can change the conditional flags by double-clicking on their labels. This can easily turn an equal condition into an unequal condition, which will allow you to control the flow of the code.

The Call Stack window provides a means of keeping track of the function in which you currently reside, as well as where the function will return if it was entered through a BL. The Memory window allows you to look right into the RAM and the values it is holding. This is extremely valuable as a means to sniff out a serial number or value to which you want access. We demonstrate this process in our example.

When debugging a complicated program, you may also need to determine where in memory a linked file (such as a .dll) is loaded. Doing so allows you to locate its code and set a breakpoint. Using the Modules window, you can easily find the memory range and jump to that point of code. In addition, pressing Alt-F9 allows you to set breakpoints (BPXs). Use breakpoints when you want to step into the function called by a BL. MVC does not step into a BL; instead, it executes the called code and stops at the next line after the BL in the calling function.

 

Reverse Engineering test.exe

Now that you are familiar with the basic layout of the MVC, let's try it out. For this example, we use the test.exe program, which you have already altered via the hex editor. Our goal is to use this program as a foundation, but we are going to once again alter the displayed text using some of the methods previously discussed. Although this example is simple, it allows you to become familiar with the embedded debugging environment.

The first thing we want to do is to jump to the point in the program where the message is displayed. Since we already found this using IDA Pro, we can easily jump to this part of the program. First, we need to know where in memory our test.exe program resides. We will use the Modules window. Once we open this window, we quickly see that the test.exe program is between 0x2E010000 and 0x2E015FFF. (Note that the first two characters may vary. It is important to adapt the following examples if your address does not match them exactly.) You may have noted that you are already sitting in this memory block, but using the Modules window is a good way to validate that you are in the correct section. Next, hit Alt-G to open the Goto window. Enter the address 2E01135C, which is based on the 2E value combined with the 0001135C address value we deduced from our earlier exploration.
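The address translation can be written down as a one-line Python helper. This is only a sketch: the name rebase is ours, and the IDA image base of 0x00010000 is an assumption inferred from the module range above.

```python
def rebase(ida_address, ida_image_base, runtime_base):
    """Translate an address seen in IDA to where the module sits in the debugger."""
    return runtime_base + (ida_address - ida_image_base)

# 0x2E010000 is the start of test.exe as reported by the Modules window.
print(hex(rebase(0x0001135C, 0x00010000, 0x2E010000)))
```

If your Modules window reports a different start address, pass that value as runtime_base instead.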

Once you find that address, place a breakpoint next to it so the program will stop running at this point: either right-click on the memory address or hit Alt-F9. Make sure to enter the address with a 0x prefix. Without this hex declaration, the breakpoint will not set. If you are successful, you will see a red dot next to the address.

Now, hit the F5 key to execute the program. If all went well, the program stops at the address at which you placed the BPX. At this point in the execution, part of the program has executed. In fact, your Windows CE device may have the blank HACK window loaded on its screen (as shown in Figure 12). However, we are not yet at the place in the code where the actual message is written to the screen.

Figure 12. Results of MVT reverse engineering
images/sn_0412.gif

If you compare the disassembly screen in the MVT with that of the code in the IDA Pro hack we worked on previously, you can see we are at the key part of the code in which the message is written to the screen. However, unlike IDA Pro, the MVT does not provide the function names (e.g., 1135C is the LoadStringW function). This is one reason it is useful to have both programs open in tandem.

Once the program is paused at the BPX, you can see that the register values are all filled. Note that some are red and some are black. The red ones symbolize changes, making it easy to spot values that have been updated. As an example, hit the F11 key. The F11 key executes the BL code at 1135C, which in turn causes the R0-R3, R12, Lr, PC, and Psr values to change.

Since we know that the 1135C address pointed to a function that loaded the string, we can assume that the registers have been updated with this string's information. This is in fact what has happened. R0 now equals C, which is the hex equivalent to the value 12. If you recall, the original message was 12 characters long. R1 also changed, and now holds the memory address of the string. To see the string, hit Alt-6 to open the Memory window. Once the window is open, type in the value held by R1 and hit Enter. This should cause the value TEST to appear at the top of the Memory window.

If you are wondering why our long 12-character string did not appear, you have to remember that memory is written to in reverse order: the value of the string ends at the address 2E015818. In other words, if you scroll up a few lines, you should see your message. So you now know that R1 points to the address in the program's memory where the string is stored, and R0 holds the length of the string.

If we step through the program, we can see that the string is eventually added to the stack and is stored back into memory at 2E06FA60. During this process, the value in R0 is placed in R2, and R5's value is placed in R0. There are some other value updates, but eventually, at 2E011380, the string is written to the screen.

During this process, note that address 11378 contains an ADD opcode that sets R1 to the value of SP plus 0x54. This is used to point to the place in temporary memory where the string is stored. So if we changed the 0x54 value to a value of our choosing, the output screen should reflect the change. To illustrate, let us look through the Memory window to see if we can find a different message. After scrolling down a bit, you should come to memory address 2E06FDAC, which points to the beginning of the word HACK. Now that we have found an alternative message, how can we get this message to display?

This process is a matter of basic math. If our stack pointer is 6FA0C, to which 0x54 is added to point to the original message, we need to determine what value needs to be added to the stack pointer to point to our new address. In other words, 6FA60 - 0x54 = Sp, which means the original address is 6FA60. Using this equation, if the desired address is 6FDAC, then to figure out the difference we simply need to subtract the Sp from 6FDAC (i.e., 6FDAC - 6FA0C = 3A0).
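The same arithmetic, spelled out in Python with the values from this walkthrough (your stack pointer and string addresses will most likely differ):

```python
sp = 0x6FA0C                      # stack pointer observed in the debugger
original_message = sp + 0x54      # where the unpatched ADD points
desired_message = 0x6FDAC         # where the alternative string was found

new_offset = desired_message - sp
print(hex(original_message))      # address of the original message
print(hex(new_offset))            # value to use in place of 0x54
```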

At this point, we have determined the purpose of this hack. We have located a string in the memory that we wish to display and figured out the distance from the Sp to that memory address. We know that the opcode and assembler at address 11378 needs to be changed as follows.

Here's the original:

ADD   R1, SP, #0x54        54 10 8D E2

And here it is, updated:

ADD   R1, SP, #0x3A0         3A 1E 8D E2
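Notice that 0x3A0 does not fit in one byte, yet the patched opcode is still four bytes long. ARM immediates are stored as an 8-bit value plus a 4-bit rotation, and the following Python sketch shows how the two ADD encodings above can be derived. The function names are ours, and the sketch only covers the ADD Rd, Rn, #imm form with condition AL.

```python
import struct

def encode_arm_immediate(value: int) -> int:
    """Return the 12-bit operand field (rotation << 8 | imm8) for value."""
    value &= 0xFFFFFFFF
    for rot in range(16):
        shift = 2 * rot
        # Rotating left by 2*rot undoes the CPU's rotate-right by 2*rot.
        imm8 = ((value << shift) | (value >> (32 - shift))) & 0xFFFFFFFF
        if imm8 <= 0xFF:
            return (rot << 8) | imm8
    raise ValueError(f"{value:#x} cannot be encoded as an ARM immediate")

def encode_add_imm(rd: int, rn: int, value: int) -> bytes:
    """ADD Rd, Rn, #value as little-endian bytes."""
    word = 0xE2800000 | (rn << 16) | (rd << 12) | encode_arm_immediate(value)
    return struct.pack("<I", word)

SP = 13
print(encode_add_imm(1, SP, 0x54).hex(" "))    # ADD R1, SP, #0x54
print(encode_add_imm(1, SP, 0x3A0).hex(" "))   # ADD R1, SP, #0x3A0
```

Not every offset can be expressed this way; if the helper raises an error, you need a different target address or an extra instruction.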

We also can use the lessons we previously learned to reduce the size of the string buffer to four characters. This would simply require us to change the instructions and assembler at 1136C as follows.

Here's the original:

MOV    R2, R0        00 20 A0 E1

And the updated:

MOV     R2, #4        04 20 A0 E3

Once you have completed this exercise, save the new binary file and run it on MVT (or, optionally, upload it to your Windows CE device). If you got everything right, you should be rewarded with a screen similar to Figure 12.


 

 

Reverse Engineering serial.exe:

Now that you've had a simple introduction to RCE on Windows CE, the next section provides a legal and hands-on tutorial of how to bypass serial protection. We describe multiple methods of circumvention of the protection scheme, which shows there's more than one "right" way to do it. We use the previous discussion as a foundation.

OVERVIEW

For our example, we use our own program, called serial.exe. This program was written in Visual C++ to provide you with a real working product on which to test and practice your newly acquired knowledge. Our program simulates a simple serial number check that imitates those of many professional programs. You will see firsthand how a cracker can reverse engineer a program to allow any serial number, regardless of length or value.

Loading the target

You must first load the target file into a disassembler from the local computer, using the steps we covered earlier. In this case, we are targeting a file called serial.exe, written solely for this example (Figure 13).

Figure 13. serial.exe
images/sn_0413.gif

Once the program is open, drill down to a point in the program where you can monitor what is happening. As previously discussed, there are several function calls that flag an event worth inspection. For example, using the Names window, we can locate a wcscmp call, which is probably used to compare the entered serial number with the correct serial number. Using this function's XREF, we can easily locate the chunk of code illustrated in Figure 13.

Since serial.exe is a relatively simple program, all the code we need to review and play with is located within a few lines. They are as follows:

.text:00011224             MOV   R4, R0
.text:00011228             ADD   R0, SP, #0xC
.text:0001122C             BL   CString::CString(void)
.text:00011230             ADD   R0, SP, #8
.text:00011234             BL   CString::CString(void)
.text:00011238             ADD   R0, SP, #4
.text:0001123C             BL   CString::CString(void)
.text:00011240             ADD   R0, SP, #0x10
.text:00011244             BL   CString::CString(void)
.text:00011248             ADD   R0, SP, #0
.text:0001124C             BL   CString::CString(void)
.text:00011250             LDR   R1, =unk_131A4
.text:00011254             ADD   R0, SP, #0xC
.text:00011258             BL   CString::operator=(ushort)
.text:0001125C             LDR   R1, =unk_131B0
.text:00011260             ADD   R0, SP, #8
.text:00011264             BL   CString::operator=(ushort)
.text:00011268             LDR   R1, =unk_131E0
.text:0001126C             ADD   R0, SP, #4
.text:00011270             BL   CString::operator=(ushort)
.text:00011274             LDR   R1, =unk_1321C
.text:00011278             ADD   R0, SP, #0
.text:0001127C             BL   CString::operator=(ushort)
.text:00011280             MOV   R1, #1
.text:00011284             MOV   R0, R4
.text:00011288             BL   CWnd::UpdateData(int)
.text:0001128C             LDR   R1, [R4,#0x7C]
.text:00011290             LDR   R0, [R1,#-8]
.text:00011294             CMP   R0, #8
.text:00011298             BLT   loc_112E4
.text:0001129C             BGT   loc_112E4
.text:000112A0             LDR   R0, [SP,#0xC]
.text:000112A4             BL   wcscmp
.text:000112A8             MOV   R2, #0
.text:000112AC             MOVS  R3, R0
.text:000112B0             MOV   R0, #1
.text:000112B4             MOVNE  R0, #0
.text:000112B8             ANDS  R3, R0, #0xFF
.text:000112BC             LDRNE  R1, [SP,#8]
.text:000112C0             MOV   R0, R4
.text:000112C4             MOV   R3, #0
.text:000112C8             BNE   loc_112F4
.text:000112CC             LDR   R1, [SP,#4]
.text:000112D0             B    loc_112F4
.text:000112E4 
.text:000112E4 loc_112E4                ; CODE XREF: .text:00011298
.text:000112E4                          ; .text:0001129C
.text:000112E4             LDR   R1, [SP]
.text:000112E8             MOV   R3, #0
.text:000112EC             MOV   R2, #0
.text:000112F0             MOV   R0, R4
.text:000112F4 
.text:000112F4 loc_112F4                ; CODE XREF: .text:000112C8
.text:000112F4                          ; .text:000112D0
.text:000112F4             BL   CWnd__MessageBoxW

If you have not touched anything after IDA placed you at address 0x000112A4, then that line should be highlighted blue. If you want to go back to the last address, use the back arrow at the top of the window or hit the Esc key.

Since we want to show you several tricks crackers use when extracting or bypassing protection, let's start by considering what we are viewing. At first glance at the top of our code, you can see there is a pattern. A string value appears to be loaded in from program data, and then a function is called that does something with that value. If we double-click on unk_131A4, we can see that the first value is "12345678", or our serial number. While our serial.exe example is simplified, the fact remains that any data used in a program's validation must be loaded in from the actual program data and stored in RAM. As our example illustrates, it doesn't take much to discover a plain text serial number. In addition, it should be noted that any hex editor can be used to find this value, although it may be difficult to pick out a serial number from the many other character strings that are revealed in a hex editor.
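To give an idea of how such strings can be pulled out of a binary automatically, here is a minimal Python sketch that scans a byte buffer for runs of printable characters stored as 16-bit wide characters, which is how Windows CE programs normally keep their text. The function name and the sample buffer are ours; on a real file you would read its bytes and pass them in.

```python
import re

def find_utf16le_strings(data: bytes, min_chars: int = 4):
    """Return (offset, text) for each run of printable ASCII wide characters."""
    pattern = re.compile(rb"(?:[\x20-\x7e]\x00){%d,}" % min_chars)
    return [(m.start(), m.group().decode("utf-16-le"))
            for m in pattern.finditer(data)]

# Sample buffer standing in for a program's data section.
sample = (b"\x01\x02" + "12345678".encode("utf-16-le") +
          b"\x00\x00\xff" + "Correct serial".encode("utf-16-le"))

for offset, text in find_utf16le_strings(sample):
    print(hex(offset), text)
```

On a real program the list will be long, which is exactly why narrowing the search down with the disassembler is so useful.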

As a result of this plain text problem, many programmers build an algorithm into the program that deciphers the serial number as it is read in from memory. It's typically indicated by a BL to the memory address in the program that handles the encryption/algorithm. An example of another method of protection is to use the device owner's name or some other value to dynamically build a serial number. This completely avoids the problems surrounding the storage of the serial number within the program file, and indirectly adds an extra layer of protection to the program. Despite efforts to create complex and advanced serial number creation schemes, the simple switch of a 1 to a 0 can nullify many antipiracy algorithms, as you will see.

The remaining code from 0x00011250 to 0x0001127C is also used to load values from program data to the device's RAM. If you check the values at the address references, you can quickly see that three messages are loaded into memory as well. One is a "Correct serial" message, and the other two are "Incorrect serial" messages. Knowing that there are two different messages is a minor but important tidbit of information, because it tells us that failure occurs in stages or as a result of two different checks.

Moving through the code, we see that R1 is loaded with some value out of memory, which is used to load another value into R0. After this, in address 0x00011294, we can see that R0 is compared to the number eight (CMP R0, #8). The next two lines check the result of the comparison, and if it is greater than or less than eight, the program jumps to loc_112E4 and continues from there.

If we follow loc_112E4 in IDA Pro, it starts to get a bit more difficult to determine what is happening, which brings us to the second phase of the reverse engineering process: the live debugger.

Debugging serial.exe

As we illustrated when debugging test.exe, the MVT is a very useful tool that can help a debugger, or a cracker, work through a program's execution line by line. This type of intimate relationship allows an in-depth look at the values being processed and can also allow on-the-fly alteration of data that is stored in the registers, flags, and memory.

After the program is loaded, set a breakpoint at 0x00011280, adjusted for the memory block in which the program is actually loaded (as shown in the Modules window). Once the breakpoint is entered, hit the F5 key to execute the program. You should now see a Serial screen on your Pocket PC as in Figure 14. Enter any value in the text box and hit the Submit button.

Figure 14. serial.exe key entry screen
images/sn_0414.gif

After you click the Submit button, your PC should shift focus to the section of code we looked at earlier in IDA. Notice the little yellow arrow on the left side of the window, pointing to the address of the breakpoint. Right-click on the memory address column and note the menu that appears. You will use this menu quite frequently when debugging a program.

The MVT is slow in execution mode when it's using a USB/serial connection. If you are in the habit of jumping between programs, you will quickly become frustrated at the time required for the MVT to redraw the screen. To avoid these delays, ensure the MVT is in break mode before changing window focus.

 

Step-Through Investigation

At this point, serial.exe is loaded on the Pocket PC and the MVT is paused at a breakpoint. The next command the processor executes is MOV R1, #1. This is a simple command to move the value 1 into register 1 (R1).

Before executing this line, look at the Registers window and note the value of R1. You should also note that all the register values are red; this is because they have all changed from the last time the program was paused. Now, hit the F11 key to execute the next line of code. After a short pause, the MVT returns to pause mode, at which time you should notice several things. The first is that most of the register values turned to black, which means they did not change values. The second is that R1 now equals 1.

The next line loads the R0 register with the value in R4. Once again, hit the F11 key to let the program execute this line of code. After a brief pause, you will see that R0 is equal to R4. Step through a few more lines of code until your yellow arrow is at address 0x00011290. At this point, let's take a look at the Registers window.

The last line of code executed was an LDR command that loaded a value (or address representing the value) from memory into a register. In this case, the value was loaded into R1, which should be equal to 0006501C. Locate the Memory window and enter the address stored by R1 into the "Address:" box. Once you hit Enter, you should see the serial number you entered.

After executing the next line, we can see that R0 is given a small integer value. Take a second and see if you can determine its significance. In R0, you should have a value equal to the number of characters in the serial you entered. In other words, if you entered "777", the value of R0 should be 3, which represents the number of characters you entered.

The next line, CMP R0, #8, is a simple comparison opcode. When this opcode is executed, it will compare the value in R0 with the integer 8. Depending on the results of the comparison, the status flags will be updated. These flags are conveniently located at the bottom of the Registers window. Note their values and hit the F11 key. If the values change to N1 Z0 C0 O0, your serial number is not 8 characters long.

At this point, serial.exe is headed for a failure message (unless you happened to enter eight characters). The next two lines of code use the results of the CMP to determine if the value is less than (BLT) or greater than (BGT) eight. If either is true, the program jumps to address 0x000112E4, where a message will be displayed on the screen. If you follow the code, you will see that address 0x000112E4 contains the opcode LDR R1, [SP]. If you follow this through and check the memory address after this line executes, you will see that it points to the start of the following error message at address 0x00065014: "Incorrect serial number. Please verify it was typed correctly."
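The flag values and the two branches can be reproduced with a short Python model. It is only a sketch of how a 32-bit ARM CMP sets its flags; the function names are ours, and V is the overflow flag (the one shown as O in the flag string quoted earlier).

```python
def cmp_flags(rn: int, operand: int):
    """Return (N, Z, C, V) as set by CMP rn, operand on 32-bit values."""
    rn &= 0xFFFFFFFF
    operand &= 0xFFFFFFFF
    result = (rn - operand) & 0xFFFFFFFF
    n = result >> 31                                # result is negative
    z = int(result == 0)                            # operands are equal
    c = int(rn >= operand)                          # carry set means no borrow
    v = ((rn ^ operand) & (rn ^ result)) >> 31      # signed overflow
    return n, z, c, v

def jumps_to_failure(n, z, c, v):
    """True if either BLT or BGT would be taken after the CMP."""
    blt = n != v
    bgt = z == 0 and n == v
    return blt or bgt

for length in (3, 8, 10):
    flags = cmp_flags(length, 8)
    target = "loc_112E4" if jumps_to_failure(*flags) else "wcscmp check"
    print(length, "N%d Z%d C%d V%d" % flags, "->", target)
```

Only a length of exactly eight falls through both branches and reaches the string comparison.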

Abusing the System

Now that we know the details of the first check, we want to break the execution and restart the entire program. Perform the same steps that you previously worked through, but set a breakpoint at address 0x00011294 (CMP R0, #8). Once the program is paused at the CMP opcode, locate the Registers window and note the value of R0. Now, place your cursor on the value and overwrite it with "00000008". This very handy function of the MVT allows you to trick the program into thinking your serial is eight characters long, thus allowing you to bypass the check. While this works temporarily, we will need to make a permanent change to the program to ensure any value is acceptable at a later point.

After the change is made, use the F11 key to watch serial.exe execute through the next few lines of code. Then, continue until the pointer is at address 0x000112A4 (BL 00011754). While this command may not mean much to you in the MVT, if we jump back over to IDA Pro we can see that this is a function call to wcscmp, which is where our serial is compared to the correct serial. Knowing this, we should be able to take a look at the Registers window and determine the correct serial.

Function calls that require data to perform their operations take their arguments from the registers. In other words, wcscmp will compare the string pointed to by R0 with the string pointed to by R1, which means we can easily determine what these values are. It then returns its result in R0: zero if the strings match, nonzero otherwise.


If we look at R0 and R1, we can see that they hold the values 00064E54 and 0006501C, respectively, as illustrated by Figure 15 (these values may be different for your system). While these values are not the actual serial numbers, they do represent the locations in memory where the two serials are located. To verify this, place R1's value in the Memory window's "Address:" field and hit Enter. After a short pause, the Memory window should change, and you should see the serial number you entered. Next, do the same with the value held in R0. This will cause your Memory window to change to a screen similar to Figure 16, in which you should see the value "1.2.3.4.5.6.7.8"—in other words, the correct serial.

Figure 15. The Registers window displays the addresses of the serials
images/sn_0415.gif
Figure 16. The Memory window displays the correct serial
images/sn_0416.gif

At this point, a cracker could stop and simply enter the newfound value to gain full access to the target program, and he could also spread the serial number around on the Internet. However, many serial validations include some form of dynamically generated serial number (based on time, name, or a matching registration key), which means any value determined by viewing it in memory will only work for that local machine. As a result, crackers often note the serial number and continue on to determine where the program can be "patched" in order to bypass the protection, regardless of the dynamic serial number.

Moving on through the program, we know the wcscmp function will compare the values held in memory, which results in an update to the condition flags and R0-R3, as follows:

R0 If the serials are equal, R0 = 0; else R0 = 1.
R1 If equal, address following entered serial number; else, address of failed character.
R2 If equal, R2 = 0; else, hex value of failed character.
R3 If equal, R3 = 0; else, hex value of correct character.

We need to once again trick the program into believing it has the right serial number. This can be done one of two ways. The first method is to actually update your serial number in memory. To do this, note the hex values of the correct serial (i.e., 31 00 32 00 33 00 34 00 35 00 36 00 37 00 38), and overwrite the entered serial number in the Memory window. When you are done, your Memory window should look like Figure 17.

Figure 17. Using the Memory window to update values
images/sn_0417.gif

Be sure to include the 00 spacers. They are necessary.
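The spacers are simply the high bytes of 16-bit characters. A short Python snippet shows the byte sequence for any serial you want to type into the Memory window:

```python
serial = "12345678"
raw = serial.encode("utf-16-le")     # two bytes per character, low byte first
print(raw.hex(" "))
print(len(raw), "bytes for", len(serial), "characters")
```

In memory the string is followed by a 00 00 terminator, which you should leave in place.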


The second method a cracker can use is to update the condition flags after the wcscmp function has updated the status flags. To do this, hit F11 until the pointer is at 0x000112A8. You should note that the Z condition flag changes from 1 (equal) to 0 (not equal). However, if you don't like this condition, you can change the flags back to their original values by overwriting them. Once you do this, the program will once again think the correct serial number was entered. While this temporarily fixes the serial check, a lasting solution requires an update to the program's code.

Fortunately, we do not have to look far to find a weak point. The following explains the rest of the code that is processed until a message is provided on the Pocket PC, alerting the user to a correct (or incorrect) serial number.

This opcode clears out the R2 register so there are no remaining values that could confuse future operations:

260112A8  mov    r2, #0

In the next opcode, two events occur. The first is that R0 is moved into R3. The second event updates the status flags using the new value in R3. As we previously mentioned, R0 is updated from the wcscmp function. If the entered serial number matched the correct serial number, R0 will be updated with a 0. If they didn't match, R0 will be set to 1. R3 is then updated with this value and checked to see if it is negative or zero.

260112AC  movs     r3, r0    Moves R0 into R3 and updates the status flags

Next, the value #1 is moved into R0. This may seem a bit odd, but by moving #1 into R0, the program is setting the stage for the next couple of lines of code.

260112B0  mov     r0, #1    Move #1 into R0

Next, we see a conditional MOV command. In this case, the value #0 will be moved into R0 only if the flags indicate "not equal" (NE), that is, if the Zero flag is 0. This is based on the status update performed by the previous MOVS. In other words, if the serials matched, R3 would have been set to 0 and the Zero flag would have been set to 1, which means the MOVNE opcode would not be executed.

260112B4  movne    r0, #0    If flags are not equal, move #0 into R0

Like the MOVS opcode, the ANDS command first executes and then updates the status flags depending on the result. Looking at the last few lines, we can see that R0 should be 0 if the serials did not match. This is because R0 was set to #1 a few lines up and was then overwritten with #0 by the MOVNE opcode. Therefore, the ANDS opcode would result in R3 being set to #0, and the condition flags would be updated to reflect the "equal" status (Zero flag set to 1). On the other hand, if the serials did match, the MOVNE opcode would have been skipped and R0 would still be equal to 1, which would cause the Zero flag to be set to 0, or "not equal."

260112B8  ands      r3, r0, #0xFF

Next, we see another implementation of the "not equal" conditional opcode. In this case, if the ANDS opcode set the Z flag to 0 (which would occur only if the string check passed), the LDRNE opcode would load R1 with the data in SP+8. Recall from our dissection of code in IDA Pro that address 0x0001125C loaded the "correct message" into this memory location. However, if the condition flags do not indicate "not equal" (or "not zero"), this opcode will be skipped.

260112BC  ldrne   r1, [sp, #8]

This is an example of a straightforward move of R4 into R0:

260112C0  mov    r0, r4    Move R4 into R0

This is another example of a simple move of #0 to R3:

260112C4  mov    r3, #0    Move #0 into R3

Again, we see a conditional opcode. In this case, the program will branch to 0x000112F4 if the "not equal" flag is set. Since the conditional flags have not been updated since the ANDS opcode in address 0x000112B8, a correct serial number would result in the execution of this opcode.

260112C8  bne    260112F4 ;      If flag not equal jump to 0x260112F4

If the wrong eight-character serial number was entered, this line would load the "incorrect" message from memory into R1:

260112CC  ldr    r1, [sp, #4]    Load SP+4 into R1 (incorrect message)

This line tells the program to branch to address 0x260112F4:

260112D0  b      260112F4 ;      Jump to 0x260112F4

The final line we will look at is the call to the MessageBoxW function. This command simply takes the value in R1, which will either be the correct message or the incorrect message, and displays it in a message box.

...
260112F4  bl    26011718 ;       MessageBoxW call to display message in R1
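The flag logic traced above can be condensed into a few lines. The following Python sketch is only a model of the listing, not code taken from serial.exe; the function name and the message strings are placeholders, and the opcodes that do not affect the outcome (mov r2, #0; mov r0, r4; mov r3, #0) are left out:

```python
def validate(wcscmp_result):
    """Model the opcodes from 0x260112AC to 0x260112F4 for a given wcscmp result."""
    r0 = wcscmp_result          # R0 holds the wcscmp return value (0 means match)
    r3 = r0                     # movs r3, r0
    z = (r3 == 0)               #   ...the S suffix sets Z when the result is zero
    r0 = 1                      # mov r0, #1 (no S suffix, flags untouched)
    if not z:                   # movne r0, #0 runs only when Z is clear
        r0 = 0
    r3 = r0 & 0xFF              # ands r3, r0, #0xFF
    z = (r3 == 0)               #   ...which updates Z again
    if not z:                   # ldrne r1, [sp, #8] followed by bne 260112F4
        r1 = "correct message"
    else:                       # ldr r1, [sp, #4] followed by b 260112F4
        r1 = "incorrect message"
    return r1                   # MessageBoxW then displays whatever R1 points to

print(validate(0))   # serials matched
print(validate(1))   # serials differed
```

Note that the Zero flag ends up inverted relative to the wcscmp result: a match leaves Z = 0 after the ANDS, which is why the "correct" path is the one guarded by the NE condition.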

 

THE CRACKS

Now that we have dissected the code, we must alter it to ensure that it will accept any serial number as the correct value. As we have illustrated, when executing the program in the MVT, we can crack the serial fairly easily by changing the register values, memory, or condition flags during program execution. However, this type of legerdemain is not going to help the average user who has no interest in reverse engineering. As a result, a cracker will have to make permanent changes to the code to ensure the serial validation will always validate the entered serial.

To do this, the cracker has to find a weak point in the code that can be changed in order to bypass security checks. Fortunately for us, there is typically more than one method by which a program can be cracked. To illustrate, I demonstrate three distinct ways that serial.exe can be cracked using basic techniques.

Crack 1: Sleight of hand

The first method requires two separate changes to the code. The first change is at address 0x00011294, where R0 is compared to the value #8. If you recall, this is used to ensure that the user-provided serial number is exactly eight characters long. The comparison then updates the condition flags, which are used in the next couple of lines to determine the flow of the program.

To ensure that the flags are set at "equal," we need to alter the compared values. The easiest way to do this is to have the program compare two equal values (i.e., CMP R0, R0). This ensures the comparison returns as "equal," thus tricking the program into passing over the BLT and BGT opcodes in the next two lines.
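To see why comparing a register with itself defeats both branches, it helps to model the flags that CMP produces. The sketch below is a simplified Python model of 32-bit ARM flag behavior; the function names are illustrative, not part of any tool:

```python
def cmp_flags(a, b):
    """Return the N, Z, C, V flags that CMP a, b would produce on 32-bit values."""
    a &= 0xFFFFFFFF
    b &= 0xFFFFFFFF
    result = (a - b) & 0xFFFFFFFF
    n = (result >> 31) == 1
    z = result == 0
    c = a >= b  # carry set means no borrow occurred
    # Signed overflow: the operands differ in sign and the result's sign differs from a
    v = (((a ^ b) & (a ^ result)) >> 31) == 1
    return n, z, c, v

def blt_taken(n, z, c, v):
    return n != v                # LT condition: N differs from V

def bgt_taken(n, z, c, v):
    return (not z) and n == v    # GT condition: Z clear and N equals V

# Original check: CMP R0, #8 with a 5-character serial takes the BLT branch.
print(blt_taken(*cmp_flags(5, 8)))
# Patched check: CMP R0, R0 never takes either branch, whatever the length.
print(any(blt_taken(*cmp_flags(n, n)) or bgt_taken(*cmp_flags(n, n)) for n in range(64)))
```

Because CMP R0, R0 always yields a zero result, Z is always 1 and N equals V, so neither the LT nor the GT condition can hold.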

The next change is at address 0x000112B4, where we find a MOVNE R0, #0 command. As we previously discussed, this command checks the flag conditions, and if they are set at "not equal," the opcode moves the value #0 into R0. The R0 value is then checked when it is ANDed with #0xFF into R3, which updates the status flags once again.

Since the MOVS command at address 0x000112AC will set Z = 0 (unless the correct serial is entered), the MOVNE opcode will then execute, thus triggering a chain of events that results in a failed validation. To correct this, we need to ensure the program thinks R0 is always equal to #1 at address 0x000112B8 (ANDS R3, R0, #0xFF). Since R0 was set to #1 at address 0x000112B0 (MOV R0, #1), keeping that value in place makes the ANDS opcode produce a "not equal" result, exactly as it would for a correct serial.

In other words, we need to change MOVNE R0, #0 to MOVNE R0, #1 to ensure that R0 AND FF outputs 1, which is then used to update the status flags. The program will thus be tricked into validating the incorrect serial.

Here are the changes:

.text:00011294         CMP   R0, #8 -> CMP R0, R0
.text:000112B4         MOVNE  R0, #0 -> MOVNE R0,#1
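The effect of the MOVNE change can be checked with a small model of the validation tail. This is a hedged Python sketch; the function and its movne_immediate parameter are hypothetical names used only to switch between the original and patched behavior:

```python
def message_after_check(wcscmp_result, movne_immediate=0):
    """Model MOVS/MOV/MOVNE/ANDS; movne_immediate is 0 originally, 1 after the patch."""
    z = (wcscmp_result == 0)        # movs r3, r0
    r0 = 1                          # mov r0, #1
    if not z:
        r0 = movne_immediate        # movne r0, #<immediate>
    z = ((r0 & 0xFF) == 0)          # ands r3, r0, #0xFF
    return "incorrect" if z else "correct"

print(message_after_check(1))                     # original code, wrong serial
print(message_after_check(1, movne_immediate=1))  # patched code, wrong serial
```

With the immediate changed to 1, R0 is 1 on both paths, so the ANDS always clears Z and the "correct" message is always selected.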

Determining the necessary changes is the first step to cracking a program. The second step is to actually alter the file. To do this, a cracker uses a hex editor to make changes to the actual .exe file. However, in order to do this, the cracker must know where in the program file she needs to make changes. Fortunately, if she is using IDA Pro, a cracker only has to click on the line she wants to edit and look at the status bar at the bottom of IDA's window, as we previously discussed. As Figure 18 illustrates, IDA clearly displays the memory address of the currently selected line, which can then be used in a hex editor.

Figure 18. Viewing location of 0x00011294 for use in a hex editor
images/sn_0418.gif
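IDA does the translation for you, but the arithmetic is easy to reproduce. A minimal Python sketch follows; the three constants are assumptions inferred from the address pair 0x11294 / 0x694 used in this tutorial, not values read from the real PE header of serial.exe, so check them against your own file:

```python
# Assumed layout values; confirm them against the PE section table of your file.
IMAGE_BASE = 0x10000       # assumed load address shown by IDA
TEXT_RVA = 0x1000          # assumed start of .text relative to the image base
TEXT_RAW_OFFSET = 0x400    # assumed file offset where .text begins

def ida_to_file_offset(address):
    """Convert an IDA address inside .text to an offset usable in a hex editor."""
    return address - IMAGE_BASE - TEXT_RVA + TEXT_RAW_OFFSET

print(hex(ida_to_file_offset(0x11294)))
print(hex(ida_to_file_offset(0x112B4)))
```

Under these assumptions every address in .text is simply shifted down by a constant 0x10C00.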

Once we know the addresses where we want to make our changes, we will need to determine the values with which we want to update the original hex code. (Fortunately, there are several online reference guides that can help.) We want to make the changes shown in Table 4 to the serial.exe file.

Table 4. Changes to serial.exe

IDA address  Hex address  Original opcode  Original hex  New opcode    New hex
0x11294      0x694        CMP R0, #8       08 00 50 E3   CMP R0, R0    00 00 50 E1
0x112B4      0x6B4        MOVNE R0, #0     00 00 A0 13   MOVNE R0, #1  01 00 A0 13
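If no reference guide is at hand, the hex values can be derived from the ARM data-processing instruction layout (condition, immediate bit, opcode, S bit, Rn, Rd, operand 2). The Python sketch below covers only the simple cases needed here, a small immediate or an unshifted register as operand 2; the encode helper is an illustrative name, not part of any assembler:

```python
import struct

COND = {"NE": 0x1, "AL": 0xE}       # condition field, bits 31-28
OPCODE = {"CMP": 0xA, "MOV": 0xD}   # data-processing opcode, bits 24-21

def encode(op, cond, rd, rn, operand2, immediate):
    """Encode a simple ARM data-processing instruction as little-endian hex bytes."""
    s_bit = 1 if op == "CMP" else 0  # CMP always updates the flags
    word = (COND[cond] << 28) | (int(immediate) << 25) | (OPCODE[op] << 21)
    word |= (s_bit << 20) | (rn << 16) | (rd << 12) | operand2
    return " ".join(f"{b:02X}" for b in struct.pack("<I", word))

print(encode("CMP", "AL", rd=0, rn=0, operand2=8, immediate=True))   # CMP R0, #8
print(encode("CMP", "AL", rd=0, rn=0, operand2=0, immediate=False))  # CMP R0, R0
print(encode("MOV", "NE", rd=0, rn=0, operand2=0, immediate=True))   # MOVNE R0, #0
print(encode("MOV", "NE", rd=0, rn=0, operand2=1, immediate=True))   # MOVNE R0, #1
```

The bytes come out reversed relative to the 32-bit word because the file stores each instruction in little-endian order.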

To make the changes, perform the following procedures (using UltraEdit).

  1. Open UltraEdit and then open your local serial.exe file in UltraEdit.

  2. Using the left-most column, locate the desired hex address.

  3. Move to the hex code that needs to be changed, and overwrite it.

  4. Save the file as a new file, in case you made a mistake.

Finding the exact address in the hex editor isn't always easy. You will need to count the character pairs from left to right to find the exact location once you locate the correct line.
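Counting byte pairs by hand is error-prone, so the same edit can be scripted. The following Python sketch is one possible approach, not the method used in this tutorial; apply_patches and CRACK_1 are illustrative names, and the offsets and bytes are the Crack 1 values from Table 4. It refuses to write anything if the original bytes are not where they are expected:

```python
def apply_patches(data, patches):
    """Return a patched copy of data; patches maps file offset -> (old bytes, new bytes)."""
    patched = bytearray(data)
    for offset, (old, new) in patches.items():
        # Verify the expected original bytes first, so a wrong offset is caught early.
        if bytes(patched[offset:offset + len(old)]) != old:
            raise ValueError(f"unexpected bytes at offset {offset:#x}")
        patched[offset:offset + len(new)] = new
    return bytes(patched)

CRACK_1 = {
    0x694: (bytes.fromhex("080050E3"), bytes.fromhex("000050E1")),  # CMP R0, #8 -> CMP R0, R0
    0x6B4: (bytes.fromhex("0000A013"), bytes.fromhex("0100A013")),  # MOVNE R0, #0 -> MOVNE R0, #1
}

# Usage sketch (file names are placeholders):
# with open("serial.exe", "rb") as f:
#     original = f.read()
# with open("serial_patched.exe", "wb") as f:
#     f.write(apply_patches(original, CRACK_1))
```

Writing to a new file mirrors step 4 above: the untouched original remains available if a patch turns out to be wrong.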


Crack 2: The NOP slide

The next example uses some of the same tactics as Crack 1, but it also introduces a new method of bypassing the eight-character validation, known as NOP.

The term NOP is short for "no operation," which means the code does nothing. Many crackers and hackers are familiar with the term NOP due to its prevalence in buffer overflow attacks. In buffer overflows, a NOP slide (as it is often called) is used to make a part of the program do absolutely nothing. The same NOP slide can be used when bypassing a security check in a program.

In our program, we have a CMP opcode that compares the length of the entered serial with the number 8. This results in a status change of the condition flags, which are used by the next two lines to determine if they are executed. While our previous crack bypassed this by ensuring the flags were set at "equal," we can attack the BLT and BGT opcodes by overwriting them with a NOP opcode. Once we do this, the BLT and BGT opcodes no longer exist.

Typical x86 NOPing is done using a series of 0x90s. This will not work on an ARM processor and will result in the following opcode: UMULLLSS R9, R0, R0, R0. This opcode actually performs an unsigned multiply long if the LS condition is met, and then updates the status flags accordingly. It is not a NOP.
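The decoding of four 0x90 bytes can be checked by pulling the bit fields out of the 32-bit word. This Python sketch handles only the ARM long-multiply instruction class and is meant as an illustration, not a full disassembler:

```python
CONDITIONS = ["EQ", "NE", "CS", "CC", "MI", "PL", "VS", "VC",
              "HI", "LS", "GE", "LT", "GT", "LE", "", "NV"]

def decode_long_multiply(word):
    """Decode an ARM long-multiply word; returns None for other instruction classes."""
    if ((word >> 23) & 0x1F) != 0b00001 or ((word >> 4) & 0xF) != 0b1001:
        return None
    signed = (word >> 22) & 1       # 0 = unsigned, 1 = signed
    accumulate = (word >> 21) & 1   # 0 = multiply only, 1 = multiply-accumulate
    set_flags = (word >> 20) & 1    # S bit: update the status flags
    name = ("S" if signed else "U") + ("MLAL" if accumulate else "MULL")
    name += CONDITIONS[word >> 28] + ("S" if set_flags else "")
    rd_hi, rd_lo = (word >> 16) & 0xF, (word >> 12) & 0xF
    rs, rm = (word >> 8) & 0xF, word & 0xF
    return f"{name} R{rd_lo}, R{rd_hi}, R{rm}, R{rs}"

print(decode_long_multiply(0x90909090))
```

The top nibble 0x9 is the LS condition and the S bit is set, which is where the LS and trailing S in the mnemonic come from.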


The trick we learned to perform a NOP on an ARM processor is to simply replace the target code with a MOV R1, R1 operation. This moves the value in R1 back into R1 and does not update the status flags. The following code illustrates the NOPing of these opcodes.

.text:00011298         BLT   loc_112E4 -> MOV R1, R1
.text:0001129C         BGT   loc_112E4 -> MOV R1, R1

The second part of this crack was already explained in Crack 1 and requires only the alteration of the MOVNE opcode, as the following portrays:

.text:000112B4         MOVNE  R0, #0 -> MOVNE R0,#1

Table 5 describes the changes you will have to make in your hex editor.

Table 5. Changes to serial.exe for Crack 2

IDA address  Hex address  Original opcode  Original hex  New opcode    New hex
0x11298      0x698        BLT loc_112E4    11 00 00 BA   MOV R1, R1    01 10 A0 E1
0x1129C      0x69C        BGT loc_112E4    10 00 00 CA   MOV R1, R1    01 10 A0 E1
0x112B4      0x6B4        MOVNE R0, #0     00 00 A0 13   MOVNE R0, #1  01 00 A0 13
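Before overwriting the two branches, it is worth confirming that the original bytes really are the BLT and BGT to loc_112E4. The Python sketch below decodes an ARM branch from its little-endian bytes; the helper names are illustrative:

```python
import struct

def branch_condition(raw_bytes):
    """Return the 4-bit condition field of an instruction (0xB is LT, 0xC is GT)."""
    (word,) = struct.unpack("<I", raw_bytes)
    return word >> 28

def branch_target(address, raw_bytes):
    """Return the destination of an ARM B/BL instruction located at address."""
    (word,) = struct.unpack("<I", raw_bytes)
    offset = word & 0x00FFFFFF
    if offset & 0x00800000:               # sign-extend the 24-bit offset field
        offset -= 0x01000000
    return address + 8 + (offset << 2)    # PC reads as the instruction address + 8

print(hex(branch_target(0x11298, bytes.fromhex("110000BA"))))
print(hex(branch_target(0x1129C, bytes.fromhex("100000CA"))))
```

The two offsets differ by one (0x11 versus 0x10) because the second branch sits four bytes later but jumps to the same destination.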

Crack 3: Preventive maintenance

At this point you are probably wondering what the point of another example is when you already have two crack methods that work just fine. However, we have saved the best example for last—Crack 3 does not attack or overwrite any checks or validation opcodes, like our previous two examples. Instead, it demonstrates how to alter the registers to our benefit before any values are compared.

If you examine the opcode at 0x0001128C using the MVT, you will see that it sets R1 to the address of the serial that you entered. The length of the serial is then loaded into R0 in the next line, using R1 as the input variable. If the value pointed to by the address in R1 is eight characters long, it is then bumped up against the correct serial number in the wcscmp function. Knowing all this, we can see that the value loaded into R1 is a key piece of data. So, what if we could change the value in R1 to something more agreeable to the program, such as the correct serial?

While this is possible by using the stack pointer to guide us, the groundwork has already been done in 0x000112A0, where the correct value is loaded into R0. Logic assumes that if it can be loaded into R0 using the provided LDR command, then we can use the same command to load the correct serial into R1. This would trick our validation algorithm into comparing the correct serial with itself, which would always result in a successful match!

The details of the required changes are as shown in Table 6.

Table 6. Changes to serial.exe for Crack 3

IDA address  Hex address  Original opcode      Original hex  New opcode          New hex
0x1128C      0x68C        LDR R1, [R4, #0x7C]  7C 10 94 E5   LDR R1, [SP, #0xC]  0C 10 9D E5

Note that this crack only requires the changing of two hex characters (i.e., 7 to 0 and 4 to D). This example is by far the most elegant and foolproof of the three, which is why we saved it for last. While the other two examples are just as effective, they are each a reactive type of crack that attempts to fix a problem. This crack, on the other hand, is a preventative crack that corrects the problem before it becomes one.
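The two changed characters map directly onto fields of the LDR encoding: the base register and the offset. The Python sketch below decodes only the simple immediate-offset form used here and lists the hex characters that differ; the helper names are illustrative:

```python
import struct

def decode_ldr(raw_bytes):
    """Extract base register, destination register and 12-bit offset from an LDR word."""
    (word,) = struct.unpack("<I", raw_bytes)
    return {"rn": (word >> 16) & 0xF, "rd": (word >> 12) & 0xF, "offset": word & 0xFFF}

def changed_nibbles(old, new):
    """Return (old, new) pairs for every hex character that differs."""
    return [(a, b) for a, b in zip(old.hex().upper(), new.hex().upper()) if a != b]

original = bytes.fromhex("7C1094E5")   # LDR R1, [R4, #0x7C]
patched = bytes.fromhex("0C109DE5")    # LDR R1, [SP, #0xC]

print(decode_ldr(original))
print(decode_ldr(patched))
print(changed_nibbles(original, patched))
```

Register 13 is SP, so changing the 4 to D swaps the base register from R4 to SP, while changing the 7 to 0 shrinks the offset from 0x7C to 0xC.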
 

 

References:

I suggest the following further reading to complete your beginner's training; then you'll be free to specialize in whatever you like most: unpacking protectors, writing loaders, or other things.

and obviously our tutorial's page ^__^
 

 

Conclusion:

Thanks to the whole ARTeam:
[Nilrem] [JDog45] [Shub - Nigurrath] [MaDMAn_H3rCuL3s] [Ferrari] [Kruger] [Teerayoot] [R@dier] [ThunderPwr] [Eggi] [EJ12N] [Stickman 373] [Bone Enterprise] [KaGra]

Thanks to all the people who take time to write tutorials.
Thanks to all the people who continue to develop better tools.
Thanks to Exetools, Woodmann, SND, TSRH, MP2K, TEAMICU and all the others for being a great place of learning.
Thanks also to The Codebreakers Journal, and the Anticrack forum.

If you have any suggestions, comments, or corrections, contact me in the usual places.