Jan 02, 2018 in no business, python
0x00 pyinstaller
Today Master Wilson said he wanted to pack his Fenghuang scanner into an executable file, reportedly with a tool called PyInstaller. That made me curious. I had only heard of this kind of tool before, so I gave it a try, mainly to see whether the source code can be recovered from the packed file.
Let's first look at how it works, as described in the documentation: https://pyinstaller.readthedocs.io/en/stable/operating-mode.html
PyInstaller reads a Python script written by you. It analyzes your code to discover every other module and library your script needs in order to execute. Then it collects copies of all those files – including the active Python interpreter! – and puts them with your script in a single folder, or optionally in a single executable file.
According to this description, PyInstaller analyzes your Python source code, then copies the required libraries and the current Python interpreter into a single folder, or bundles them into a single executable file. That sounds good. So how does the single-file mode work? The documentation covers that too: https://pyinstaller.readthedocs.io/en/stable/operating-mode.html#how-the-one-file-program-works
I will only summarize here; see the documentation for details. When everything is packed into a single file, the core of that file is actually a bootloader. At run time it creates a folder named _MEIxxxxxx under the temporary directory and decompresses into it the files the Python script needs, most of which are .so files (shared libraries). It then runs the Python script, and when the program finishes, the folder is deleted.
The packed file contains no Python source code, only compiled .pyc files. The packed .pyc files are also specially processed, which is discussed later.
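As a small illustration of the temporary folder described above: inside a one-file program, the bootloader exposes the extraction folder to the script as sys._MEIPASS and sets sys.frozen. The helper name bundle_dir is made up for this example, and the code is only a minimal sketch, not part of the test program used later.

```python
import os
import sys


def bundle_dir():
    """Return the folder that holds the program's unpacked files.

    In a PyInstaller one-file build the bootloader sets sys._MEIPASS to the
    temporary _MEIxxxxxx folder. When run as a plain script the attribute
    is absent, so fall back to the directory of the script being run.
    """
    if getattr(sys, "frozen", False) and hasattr(sys, "_MEIPASS"):
        return sys._MEIPASS
    return os.path.dirname(os.path.abspath(sys.argv[0]))


print(bundle_dir())
```

Run as a normal script, this prints the script's own directory; inside a packed executable it should print the temporary folder instead.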
0x01 try packing a file
Here I wrote a small program that looks up my IP address. The code is very simple:
#!/usr/bin/env python
# coding: utf-8
# file: myip.py
from __future__ import print_function
import time
import requests
from api import API

def main():
    url = API.url
    r = requests.get(url)
    print(r.content)

if __name__ == "__main__":
    # time.sleep(120)
    main()
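#!/usr/bin/env python
# coding: utf-8
# file: api.py
class API(object):
    def __init__(self):
        # In Python 2, super() needs both the class and the instance.
        super(API, self).__init__()

    # Class attribute, read as API.url without creating an instance.
    url = "http://myip.ipip.net"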
To give the packer something to work on, I deliberately made the program a little more complicated and split it into two files. Now let's pack it into a single executable:
# lightless @ VM-UBUNTU in ~/program/pyinstaller [22:18:27]
$ pyinstaller -F myip.py
15 INFO: PyInstaller: 3.2.1
15 INFO: Python: 2.7.12
16 INFO: Platform: Linux-4.4.0-77-generic-x86_64-with-Ubuntu-16.04-xenial
16 INFO: wrote /home/lightless/program/pyinstaller/myip.spec
19 INFO: UPX is not available.
20 INFO: Extending PYTHONPATH with paths
['/home/lightless/program/pyinstaller', '/home/lightless/program/pyinstaller']
20 INFO: checking Analysis
24 INFO: Building because /home/lightless/program/pyinstaller/myip.py changed
24 INFO: Initializing module dependency graph...
25 INFO: Initializing module graph hooks...
57 INFO: running Analysis out00-Analysis.toc
74 INFO: Caching module hooks...
76 INFO: Analyzing /home/lightless/program/pyinstaller/myip.py
2739 INFO: Loading module hooks...
2740 INFO: Loading module hook "hook-httplib.py"...
2740 INFO: Loading module hook "hook-requests.py"...
2742 INFO: Loading module hook "hook-encodings.py"...
3012 INFO: Looking for ctypes DLLs
3058 INFO: Analyzing run-time hooks ...
3067 INFO: Looking for dynamic libraries
3272 INFO: Looking for eggs
3273 INFO: Python library not in binary depedencies. Doing additional searching...
3296 INFO: Using Python library /usr/lib/x86_64-linux-gnu/libpython2.7.so.1.0
3299 INFO: Warnings written to /home/lightless/program/pyinstaller/build/myip/warnmyip.txt
3357 INFO: checking PYZ
3359 INFO: Building because toc changed
3360 INFO: Building PYZ (ZlibArchive) /home/lightless/program/pyinstaller/build/myip/out00-PYZ.pyz
3777 INFO: Building PYZ (ZlibArchive) /home/lightless/program/pyinstaller/build/myip/out00-PYZ.pyz completed successfully.
3835 INFO: checking PKG
3836 INFO: Building because /home/lightless/program/pyinstaller/build/myip/out00-PYZ.pyz changed
3836 INFO: Building PKG (CArchive) out00-PKG.pkg
5693 INFO: Building PKG (CArchive) out00-PKG.pkg completed successfully.
5699 INFO: Bootloader /usr/local/lib/python2.7/dist-packages/PyInstaller/bootloader/Linux-64bit/run
5699 INFO: checking EXE
5699 INFO: Rebuilding out00-EXE.toc because pkg is more recent
5699 INFO: Building EXE from out00-EXE.toc
5700 INFO: Appending archive to ELF section in EXE /home/lightless/program/pyinstaller/dist/myip
5728 INFO: Building EXE from out00-EXE.toc completed successfully.
# lightless @ VM-UBUNTU in ~/program/pyinstaller [22:36:53]
$ ./dist/myip
当前 IP:1.1.1.1 来自于:中国 浙江 杭州 联通
0x02 extract Python source code
Packing itself is not the focus of this post. Let's look at how to extract the Python code from the packed file.
After some searching, it turns out that PyInstaller maintains its own data format called PYZ and appends it to the end of the executable; the data starts with the bytes PYZ. The project also appears to provide a tool for reading this data. For details, see this file: https://github.com/pyinstaller/pyinstaller/blob/development/pyinstaller/utils/cliutils/archive_viewer.py
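One way to check this is to search the executable for the archive's magic bytes. The sketch below assumes the magic is the four bytes PYZ\0 and that the archive is stored uncompressed inside the executable, so the bytes are visible in a raw scan. The function names are hypothetical.

```python
def find_pyz_offsets(data, magic=b"PYZ\x00"):
    """Return every offset in `data` at which `magic` occurs."""
    offsets = []
    pos = data.find(magic)
    while pos != -1:
        offsets.append(pos)
        pos = data.find(magic, pos + 1)
    return offsets


def scan_file(path):
    """Read an executable from disk and list candidate PYZ offsets."""
    with open(path, "rb") as fh:
        return find_pyz_offsets(fh.read())

# Usage (path taken from the build above):
# print(scan_file("dist/myip"))
```

A hit is only a candidate: the same four bytes could in principle occur elsewhere in the binary, so the offset still needs to be confirmed with a proper archive reader.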
Download archive_viewer.py and try it on the packed executable:
# lightless @ VM-UBUNTU in ~/program/pyinstaller/dist [22:43:37] C:2
$ python archive_viewer.py myip
Traceback (most recent call last):
  File "archive_viewer.py", line 266, in <module>
    run()
  File "archive_viewer.py", line 258, in run
    PyInstaller.log.__process_options(parser, args)
  File "/usr/local/lib/python2.7/dist-packages/PyInstaller/log.py", line 49, in __process_options
    logger.setLevel(level)
NameError: global name 'logger' is not defined
It fails with an error. I did not look for the proper fix; I simply patched log.py myself.
There are four commands that can be used:
U: go Up one level
O <name>: open embedded archive name
X <name>: extract name
Q: quit
Two entries in the list need attention. One is (3626573, 1576197, 1576197, 0, 'z', u'out00-PYZ.pyz'). The other is (13072, 301, 497, 1, 's', u'myip'). The out00-PYZ.pyz entry holds the libraries our script imports; you can open it with the O command to view them. The myip entry is our script.
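The tuples in the listing can be read field by field. The sketch below assumes the usual table-of-contents layout for this kind of archive: position, compressed length, uncompressed length, compression flag, type code, name. The TocEntry name, its field names and the type-code descriptions are chosen for illustration.

```python
from collections import namedtuple

# Field names are illustrative labels for the six values in each entry.
TocEntry = namedtuple(
    "TocEntry",
    "position compressed_len uncompressed_len is_compressed typecode name",
)

# Assumed meaning of the two type codes seen in the listing.
TYPECODES = {"z": "embedded PYZ archive", "s": "Python script"}


def describe(entry):
    """Return a one-line, human-readable summary of a TOC entry."""
    e = TocEntry(*entry)
    kind = TYPECODES.get(e.typecode, "other")
    state = "compressed" if e.is_compressed else "stored as-is"
    return "%s: %s, %d -> %d bytes, %s" % (
        e.name, kind, e.compressed_len, e.uncompressed_len, state)


print(describe((3626573, 1576197, 1576197, 0, "z", "out00-PYZ.pyz")))
print(describe((13072, 301, 497, 1, "s", "myip")))
```

Read this way, the PYZ entry is stored without further compression (both lengths are equal and the flag is 0), while the script entry is compressed from 497 bytes down to 301.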
Now let's extract the .pyc file:
? x myip
to filename? myip.pyc
?
0x03 restore Python source code
There are countless ways to recover Python source code from a .pyc file. I used Easy Python Decompiler here. Decompiling the extracted file directly fails with an error.
The error message is: Invalid pyc/pyo file - Magic value mismatch! Every .pyc file starts with a magic header, and this is the special processing mentioned earlier: PyInstaller strips the magic part from the .pyc, so we have to put it back ourselves. For the Python 2 I tested with, the header is 8 bytes in total: the first 4 bytes identify the Python version that compiled the file, and the next 4 bytes are a timestamp. Since I compiled this file myself, I know the four magic bytes should be \x03\xf3\x0d\x0a. If the file was packed by someone else, however, you can only look the value up in a table and guess.
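The table lookup can be sketched as follows. The magic consists of a 16-bit little-endian number followed by \r\n. The table lists only a handful of CPython releases and is not exhaustive; the helper name guess_version is hypothetical.

```python
import struct

# Magic numbers for a few CPython releases (not an exhaustive table).
MAGIC_NUMBERS = {
    62131: "Python 2.5",
    62161: "Python 2.6",
    62211: "Python 2.7",
    3379: "Python 3.6",
    3394: "Python 3.7",
    3413: "Python 3.8",
}


def guess_version(header):
    """Map the first four bytes of a .pyc file to a Python version.

    Returns None when the bytes do not look like a magic value or the
    number is not in the table.
    """
    if len(header) < 4 or header[2:4] != b"\r\n":
        return None
    (number,) = struct.unpack("<H", header[:2])
    return MAGIC_NUMBERS.get(number)


print(guess_version(b"\x03\xf3\x0d\x0a"))  # Python 2.7
```

When the packing interpreter is available, its own magic can be read directly instead of guessed, for example with imp.get_magic() on Python 2.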
There does seem to be a way to check, though. If we extract the .pyc of a standard library module from the executable, we find that its first four bytes are still present.
Then we can fill in these four bytes directly. After that, it will look like this.
Then you can get the source code.
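The header repair can be sketched in a few lines. This assumes a Python 2.7 file whose whole 8-byte header (4 magic bytes plus a 4-byte timestamp) is missing; if only part of the header was stripped, prepend only the missing part. The function names and the file names in the usage comment are hypothetical.

```python
import struct
import time

PY27_MAGIC = b"\x03\xf3\x0d\x0a"  # magic bytes for Python 2.7


def add_pyc_header(body, magic=PY27_MAGIC, timestamp=None):
    """Return `body` with a Python 2 style 8-byte .pyc header prepended.

    The timestamp field normally records the source file's modification
    time; the current time is used when none is given.
    """
    if timestamp is None:
        timestamp = int(time.time())
    return magic + struct.pack("<I", timestamp & 0xFFFFFFFF) + body


def fix_file(src_path, dst_path):
    """Read a headerless .pyc from src_path and write a repaired copy."""
    with open(src_path, "rb") as src:
        body = src.read()
    with open(dst_path, "wb") as dst:
        dst.write(add_pyc_header(body))

# Usage (hypothetical file names):
# fix_file("myip.pyc", "myip_fixed.pyc")
```

Writing to a separate output file keeps the extracted original intact in case the guessed magic turns out to be wrong.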
In the same way, applying the same steps to the api entry inside out00-PYZ.pyz recovers the source code of api.py.