The first thing you may notice about this challenge is that the file is much larger than in the previous challenges: about 12 MB! However, the section table shows that the PE image itself is only 0x30600 bytes (198,144 bytes), far less than 12 MB:
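The same number can be derived with a short script instead of reading the section table by eye. This is only a minimal sketch: the function name `pe_overlay_offset` is my own, it assumes a well-formed PE file, and it ignores special cases such as an appended certificate table. It takes the largest `PointerToRawData + SizeOfRawData` over all sections as the end of the image.

```python
import struct

def pe_overlay_offset(data: bytes) -> int:
    """Return the file offset where the PE image ends (where appended data would start)."""
    if data[:2] != b"MZ":
        raise ValueError("not an MZ executable")
    # e_lfanew at offset 0x3C points to the "PE\0\0" signature.
    (pe_off,) = struct.unpack_from("<I", data, 0x3C)
    if data[pe_off:pe_off + 4] != b"PE\0\0":
        raise ValueError("PE signature not found")
    # COFF header follows the signature: NumberOfSections at +6,
    # SizeOfOptionalHeader at +20 (both relative to the signature).
    (num_sections,) = struct.unpack_from("<H", data, pe_off + 6)
    (opt_size,) = struct.unpack_from("<H", data, pe_off + 20)
    section_table = pe_off + 24 + opt_size
    end = 0
    for i in range(num_sections):
        entry = section_table + 40 * i  # each section header is 40 bytes
        # SizeOfRawData at +16, PointerToRawData at +20 within the header.
        raw_size, raw_ptr = struct.unpack_from("<II", data, entry + 16)
        end = max(end, raw_ptr + raw_size)
    return end
```

Comparing the returned offset with the file size gives the amount of appended data; for this challenge the result should match the 0x30600 read from the section table.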
In other words, a lot of data is appended to the end of this PE file. Let’s see what it is:
At offset 0x30600, the data begins with the magic string “PYZ”, which suggests that the appended data probably contains compressed Python code. Furthermore, the executable itself may have been created by a tool like py2exe and may serve to interpret the Python code inside its appended data.
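The check for the magic string is easy to script as well. The sketch below only tests for the three bytes “PYZ” at the offset found above; the helper name is hypothetical and nothing about the rest of the archive layout is assumed.

```python
PYZ_MAGIC = b"PYZ"  # the marker string seen at the start of the appended data

def overlay_starts_with_pyz(data: bytes, image_end: int = 0x30600) -> bool:
    """Return True if the bytes right after the PE image begin with 'PYZ'."""
    return data[image_end:image_end + len(PYZ_MAGIC)] == PYZ_MAGIC
```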
If the program indeed works like a Python interpreter, analyzing the entire file is not worthwhile because it could waste a lot of time, so let’s look at the strings in this file first:
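If no strings utility is at hand, a rough equivalent takes only a few lines of Python. This is a sketch that extracts printable ASCII runs only (no UTF-16 strings); the function name and the minimum length of 4 are my own choices.

```python
import re

def ascii_strings(data: bytes, min_len: int = 4):
    """Return all runs of at least min_len printable ASCII characters."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [m.group().decode("ascii") for m in pattern.finditer(data)]

def python_api_names(data: bytes):
    """Keep only strings that look like Python C API names (prefix 'Py')."""
    return [s for s in ascii_strings(data) if s.startswith("Py")]
```

Running `python_api_names()` over the raw bytes of the executable narrows the listing down to the names that hint at an embedded interpreter.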
The strings above indicate that our previous guess (the program works as a Python interpreter) is probably correct. Let’s move on and see whether we can find where it interprets the Python code. If we can find such a place, we may be able to read the Python code from memory.
Following the strings, we can find the following piece of code, which retrieves an interesting API named PyRun_SimpleString:
This API is probably used to execute the Python code, and if we can put a breakpoint on it after it has been loaded, we can see what Python code will be executed. So first, I put a breakpoint at address 0x004026DE, right after the GetProcAddress() call, and then ran the program in Immunity Debugger. Unfortunately, the program ran without hitting the breakpoint. A window popped up and presented the following picture:
What happened here? Process Explorer tells the story: the program created a child process of itself!
So why does it create a child process of itself? Or, more precisely: why does the child process behave differently when its executable is the same as the parent’s? The answer could be that it receives different arguments or runs in a different environment. In the Properties panel of the parent process, I found a suspicious environment variable named “_MEIPASS2”, which points to a folder in the %TEMP% directory:
To verify whether this environment variable affects the program’s behavior, I added a system-wide environment variable with the same name and value and ran the program again in the debugger. This time, the breakpoint was hit!
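A system-wide variable is not strictly required. As a less intrusive alternative, the variable can be set only for one process tree. The sketch below is not what I did above: the helper names are hypothetical, and it assumes the launched program (for example the debugger) passes its environment on to the debuggee.

```python
import os
import subprocess

def env_with_meipass(extract_dir: str) -> dict:
    """Copy the current environment and add _MEIPASS2 to the copy only."""
    env = dict(os.environ)
    env["_MEIPASS2"] = extract_dir
    return env

def launch_with_meipass(cmd, extract_dir: str) -> subprocess.Popen:
    """Start cmd (a list such as [debugger_path, target_path]) with _MEIPASS2 set."""
    return subprocess.Popen(cmd, env=env_with_meipass(extract_dir))
```

The value passed as `extract_dir` should be the folder under %TEMP% shown in Process Explorer; the rest of the system is left untouched.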
Now we know the address of PyRun_SimpleString() and can put a breakpoint on this API. If we continue the process, the breakpoint on PyRun_SimpleString() is hit as expected:
From the stack we can see that the first argument passed to PyRun_SimpleString() is the Python code to be executed, and the contents of that code suggest that this program was probably created by PyInstaller.
After several hits on this API, I found a suspicious piece of Python code:
This Python code concatenates a lot of strings and then decodes and executes them. By replacing the function exec() with print(), we can get the code that would be executed:
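The exec-to-print trick can be illustrated with a small stand-in. Everything in this sketch is hypothetical: the payload, the chunk size, and the use of Base64 as the encoding are assumptions for illustration, not the challenge's actual script.

```python
import base64

def reveal(chunks, decode=base64.b64decode):
    """Join the string fragments and decode them, returning the hidden
    source as text instead of handing it to exec()."""
    return decode("".join(chunks)).decode("utf-8", errors="replace")

# Hypothetical stand-in for the obfuscated script: encode a harmless
# statement and split it into fragments, as an obfuscator might.
secret_source = "print('hello from the hidden layer')"
encoded = base64.b64encode(secret_source.encode()).decode()
chunks = [encoded[i:i + 8] for i in range(0, len(encoded), 8)]

# The obfuscated original would end with: exec(reveal(chunks))
# Swapping exec() for print() shows the code without running it.
print(reveal(chunks))
```

Because the decoded text is only printed, nothing from the hidden layer runs, which makes this safe to repeat if the revealed code contains another layer of the same kind.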
In the above code we can easily find what we want:
Now that we know the file was created by PyInstaller, I found a great tool that can decompress all the files in the appended data:
With this tool we could solve this challenge within a minute, because it extracts the Python code immediately.