blog that doesn't shine

Posted by bax at 2020-03-12

Jan 02 2018 in no business, Python language

0x00 pyinstaller

Today Master Wilson said he wanted to pack his Fenghuang scanner into an executable file. He reportedly used a tool called pyinstaller, which made me curious. I had only heard of this kind of tool before, so I gave it a try, mainly to see whether the source code can be recovered from the packed file.

Let's first look at how it works, as described in the documentation: https://pyinstaller.readthedocs.io/en/stable/operating-mode.html

PyInstaller reads a Python script written by you. It analyzes your code to discover every other module and library your script needs in order to execute. Then it collects copies of all those files – including the active Python interpreter! – and puts them with your script in a single folder, or optionally in a single executable file.

In other words, pyinstaller analyzes your Python source code, then copies the required libraries and the current Python interpreter into a single folder, or bundles them into a single executable file. That sounds great. So how does the one-file mode work? The documentation covers that too: https://pyinstaller.readthedocs.io/en/stable/operating-mode.html#how-the-one-file-program-works

I'll only summarize here; see the documentation for details. A one-file package is essentially a bootloader. At run time it creates a folder named _MEIxxxxxx under the temporary directory and decompresses into it the files the Python script needs, most of which are .so files. It then executes the Python script, and deletes the folder once the program finishes.
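To make the temporary folder concrete, here is a minimal sketch of how a script can tell whether it is running from such a bundle. It relies on the `sys.frozen` and `sys._MEIPASS` attributes that the pyinstaller bootloader sets at run time; the function name `bundle_dir` and the fallback to the current working directory are my own choices for illustration.

```python
import os
import sys


def bundle_dir():
    """Return the folder that holds the program's unpacked files."""
    if getattr(sys, "frozen", False) and hasattr(sys, "_MEIPASS"):
        # Running from a one-file bundle: the bootloader has already
        # unpacked everything into the temporary _MEIxxxxxx folder.
        return sys._MEIPASS
    # Running as a plain script: fall back to the working directory
    # (an arbitrary choice made for this sketch).
    return os.path.abspath(os.getcwd())
```

Printing `bundle_dir()` from inside the packed program should show the _MEIxxxxxx path while the program is running.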

The packed file does not contain any Python source code; it contains compiled .pyc files instead. The packed .pyc files do get some special processing, which is discussed later.

0x01 try packing a file

Here I write a small program to look at my IP address. The code is very simple:

#!/usr/bin/env python
# coding: utf-8
# file: myip.py
from __future__ import print_function
import time
import requests
from api import API


def main():
    url = API.url
    r = requests.get(url)
    print(r.content)


if __name__ == "__main__":
    # time.sleep(120)
    main()

#!/usr/bin/env python
# coding: utf-8
# file: api.py


class API(object):
    def __init__(self):
        super(API, self).__init__()

    url = "http://myip.ipip.net"

To give the packer something to work with, I deliberately made the program a little more complicated and split it into two files. Let's try to pack it into a single executable file:

# lightless @ VM-UBUNTU in ~/program/pyinstaller [22:18:27]
$ pyinstaller -F myip.py
15 INFO: PyInstaller: 3.2.1
15 INFO: Python: 2.7.12
16 INFO: Platform: Linux-4.4.0-77-generic-x86_64-with-Ubuntu-16.04-xenial
16 INFO: wrote /home/lightless/program/pyinstaller/myip.spec
19 INFO: UPX is not available.
20 INFO: Extending PYTHONPATH with paths
['/home/lightless/program/pyinstaller', '/home/lightless/program/pyinstaller']
20 INFO: checking Analysis
24 INFO: Building because /home/lightless/program/pyinstaller/myip.py changed
24 INFO: Initializing module dependency graph...
25 INFO: Initializing module graph hooks...
57 INFO: running Analysis out00-Analysis.toc
74 INFO: Caching module hooks...
76 INFO: Analyzing /home/lightless/program/pyinstaller/myip.py
2739 INFO: Loading module hooks...
2740 INFO: Loading module hook "hook-httplib.py"...
2740 INFO: Loading module hook "hook-requests.py"...
2742 INFO: Loading module hook "hook-encodings.py"...
3012 INFO: Looking for ctypes DLLs
3058 INFO: Analyzing run-time hooks ...
3067 INFO: Looking for dynamic libraries
3272 INFO: Looking for eggs
3273 INFO: Python library not in binary depedencies. Doing additional searching...
3296 INFO: Using Python library /usr/lib/x86_64-linux-gnu/libpython2.7.so.1.0
3299 INFO: Warnings written to /home/lightless/program/pyinstaller/build/myip/warnmyip.txt
3357 INFO: checking PYZ
3359 INFO: Building because toc changed
3360 INFO: Building PYZ (ZlibArchive) /home/lightless/program/pyinstaller/build/myip/out00-PYZ.pyz
3777 INFO: Building PYZ (ZlibArchive) /home/lightless/program/pyinstaller/build/myip/out00-PYZ.pyz completed successfully.
3835 INFO: checking PKG
3836 INFO: Building because /home/lightless/program/pyinstaller/build/myip/out00-PYZ.pyz changed
3836 INFO: Building PKG (CArchive) out00-PKG.pkg
5693 INFO: Building PKG (CArchive) out00-PKG.pkg completed successfully.
5699 INFO: Bootloader /usr/local/lib/python2.7/dist-packages/PyInstaller/bootloader/Linux-64bit/run
5699 INFO: checking EXE
5699 INFO: Rebuilding out00-EXE.toc because pkg is more recent
5699 INFO: Building EXE from out00-EXE.toc
5700 INFO: Appending archive to ELF section in EXE /home/lightless/program/pyinstaller/dist/myip
5728 INFO: Building EXE from out00-EXE.toc completed successfully.

# lightless @ VM-UBUNTU in ~/program/pyinstaller [22:36:53]
$ ./dist/myip
当前 IP:1.1.1.1 来自于:中国 浙江 杭州 联通

0x02 extract Python source code

Packing itself is not the focus of this post. Let's look at how to extract the Python code from a packed file.

After some searching, we find that pyinstaller maintains its own archive format called PYZ, whose data begins with the marker PYZ, and appends it to the end of the executable file. The project also appears to provide a tool for reading this data. For details, see this file: https://github.com/pyinstaller/pyinstaller/blob/development/pyinstaller/utils/cliutils/archive_viewer.py
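A quick way to confirm that the marker is present is to scan the executable for it. The sketch below is not how archive_viewer.py works; it is a plain byte search, and it assumes the marker is the four bytes `PYZ\x00`, which is my understanding of the format rather than something stated above. The helper names `find_all` and `scan_file` are made up for this example.

```python
def find_all(data, magic=b"PYZ\x00"):
    """Return every offset at which `magic` occurs in `data`."""
    offsets = []
    start = data.find(magic)
    while start != -1:
        offsets.append(start)
        start = data.find(magic, start + 1)
    return offsets


def scan_file(path, magic=b"PYZ\x00"):
    """Read a whole file and list the offsets of the marker inside it."""
    with open(path, "rb") as fh:
        return find_all(fh.read(), magic)
```

Calling `scan_file("dist/myip")` should report at least one offset if the executable embeds a PYZ archive that is stored uncompressed.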

Take this file and try it:

# lightless @ VM-UBUNTU in ~/program/pyinstaller/dist [22:43:37] C:2
$ python archive_viewer.py myip
Traceback (most recent call last):
  File "archive_viewer.py", line 266, in <module>
    run()
  File "archive_viewer.py", line 258, in run
    PyInstaller.log.__process_options(parser, args)
  File "/usr/local/lib/python2.7/dist-packages/PyInstaller/log.py", line 49, in __process_options
    logger.setLevel(level)
NameError: global name 'logger' is not defined

It fails with an error. I did not dig into the cause; I simply patched log.py myself and moved on.

There are four commands that can be used:

U: go Up one level
O <name>: open embedded archive name
X <name>: extract name
Q: quit

Two entries in the listing deserve attention. One is (3626573, 1576197, 1576197, 0, 'z', u'out00-PYZ.pyz'); the other is (13072, 301, 497, 1, 's', u'myip'). out00-PYZ.pyz holds the various libraries our script imports; you can use the O command to look inside it.
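The six fields of each entry are easier to read with names attached. The sketch below uses my own reading of the tuple layout (offset in the archive, stored length, uncompressed length, compression flag, type code, name); the field names and the `describe` helper are assumptions made for illustration, not part of archive_viewer.py.

```python
from collections import namedtuple

# Field names are illustrative; the viewer itself prints plain tuples.
TocEntry = namedtuple(
    "TocEntry", "offset stored_len full_len compressed typecode name"
)


def describe(entry):
    """Render one table-of-contents tuple as a readable line."""
    e = TocEntry(*entry)
    state = "compressed" if e.compressed else "stored as-is"
    return "%s [%s]: %d -> %d bytes, %s" % (
        e.name, e.typecode, e.stored_len, e.full_len, state)
```

Under this reading, the myip entry is a compressed script of 301 bytes that expands to 497 bytes, while out00-PYZ.pyz is stored without further compression.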

Now let's extract the .pyc file:

? x myip
to filename? myip.pyc
?

0x03 restore Python source code

There are countless ways to restore Python source code from a .pyc file. I use Easy Python Decompiler here. Decompiling the extracted file directly fails with an error.

The error is: Invalid pyc/pyo file - Magic value mismatch. Every .pyc file starts with a magic header, and this is the special processing of .pyc files mentioned earlier: pyinstaller strips the header, so we have to restore it ourselves. For the Python 2 I tested, the header is 8 bytes in total: the first 4 bytes are the magic number identifying the Python version that compiled the file, and the next 4 bytes are a timestamp. Since I compiled this file myself, I know the first four bytes should be \x03\xf3\x0d\x0a. If you are unpacking a file that someone else packed, however, you can only guess by looking the value up in a table of magic numbers.
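The table lookup can be sketched in a few lines. The magic is a 16-bit little-endian number followed by `\r\n`; the table below lists only a handful of values I am confident about and is deliberately incomplete. The names `KNOWN_MAGIC`, `magic_bytes` and `guess_version` are hypothetical helpers for this example.

```python
import struct

# A few well-known magic numbers; extend the table as needed.
KNOWN_MAGIC = {
    62211: "Python 2.7",
    3379: "Python 3.6",
    3394: "Python 3.7",
    3413: "Python 3.8",
}


def magic_bytes(number):
    """Build the 4-byte magic: little-endian 16-bit number plus CR LF."""
    return struct.pack("<H", number) + b"\r\n"


def guess_version(header):
    """Map the first four bytes of a .pyc to a Python version, if known."""
    if len(header) < 4 or header[2:4] != b"\r\n":
        return None
    (number,) = struct.unpack("<H", header[:2])
    return KNOWN_MAGIC.get(number)
```

For the file in this post, `magic_bytes(62211)` yields the same four bytes quoted above for Python 2.7.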

There seems to be a way to check this, though. If we extract the .pyc of a standard-library module from the executable, we find that its first four bytes are still present.

We can then fill in these four bytes directly at the start of the extracted myip.pyc.
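Patching the file by hand in a hex editor works, but the same fix is easy to script. This is a minimal sketch that assumes the extracted file is missing the whole 8-byte Python 2 header, and that a zero timestamp is acceptable to the decompiler; both are assumptions on my part. The function names are made up for this example.

```python
import struct


def add_py2_header(body, magic=b"\x03\xf3\r\n", timestamp=0):
    """Prepend the 8-byte Python 2 .pyc header to a headerless code blob."""
    if body[:4] == magic:
        # The header is already there; leave the data untouched.
        return body
    return magic + struct.pack("<I", timestamp) + body


def repair_file(src, dst, magic=b"\x03\xf3\r\n"):
    """Read a stripped .pyc from `src` and write a repaired copy to `dst`."""
    with open(src, "rb") as fh:
        body = fh.read()
    with open(dst, "wb") as fh:
        fh.write(add_py2_header(body, magic))
```

For example, `repair_file("myip.pyc", "myip_fixed.pyc")` writes a copy that a decompiler should accept, provided the magic matches the Python version used for packing.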

Then you can get the source code.

In the same way, repeating these steps for the api entry inside out00-PYZ.pyz gives us the source code of api.py.
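If you pull an entry out of the PYZ archive by hand instead of through the viewer, one more step may be needed. As far as I understand the format, each module is stored as a zlib-compressed, marshalled code object, so it has to be decompressed before the header is added. The sketch below rests on that assumption, and the helper names are hypothetical.

```python
import marshal
import zlib


def unpack_pyz_entry(blob):
    """Decompress one PYZ entry and return the marshalled code bytes."""
    return zlib.decompress(blob)


def load_code(blob):
    """Turn an entry into a code object.

    This only works under the same Python version that built the
    archive, because the marshal format differs between versions.
    """
    return marshal.loads(unpack_pyz_entry(blob))
```

The bytes returned by `unpack_pyz_entry` are what you would then prefix with the magic header before handing the file to a decompiler.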

0xFF References

pyinstaller documentation
How to decompile files from PyInstaller PYZ file