
Custom PE Binary stripping tool

In my previous article I mentioned a custom PE binary stripping tool that I coded up in C++. I wanted to do a quick write-up about the tool here as well as share out the source.

The basics

The foundation of the tool is fairly simple and is coded directly against the PE specification as defined by Microsoft. The first step to getting it working was simply to #include <Windows.h> and pull in all the relevant structs described in the specification. The only quirk is that the variable-size structs (such as IMAGE_OPTIONAL_HEADER) are defined for the architecture being compiled for, which doesn't necessarily match the architecture of the binary being stripped. This is fairly easily rectified, since the architecture is stored in the binary itself and the header provides structs for both architectures with a 32 or 64 suffix. With the structs and their order identified, the tool simply needs to open the binary as a file and read its contents into each struct in turn.

// .h
IMAGE_DOS_HEADER dos_header;
void* dos_program_blob;
DWORD signature;
IMAGE_FILE_HEADER file_header;
// Either IMAGE_OPTIONAL_HEADER32 or IMAGE_OPTIONAL_HEADER64 depending on the image type
void* optional_header;
IMAGE_SECTION_HEADER* sections;
void** sections_data;
// .cpp
// Zero the destination buffer, then read exactly buffer_size bytes from f into it.
bool clear_and_read(void* buffer, size_t buffer_size, FILE* f, std::string const& error_message) {
    memset(buffer, 0, buffer_size);
    const size_t elements = fread(buffer, buffer_size, 1, f);
    if (elements != 1) {
        std::cout << error_message << '\n';
        return false;
    }
    return true;
}
bool PEEditor::read(std::string const& file) {
    input_binary = fopen(file.c_str(), "rb");
    if (input_binary == nullptr) {
        std::cout << "Failed to open file." << '\n';
        return false;
    }
    if (!clear_and_read(&dos_header, sizeof(dos_header), input_binary, "Failed to read in DOS header")) {
        return false;
    }
    // and so on for the rest of the data fields.
    // [...]
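
The architecture check mentioned above can be sketched without depending on which optional header the host compiler picks. This is only a minimal sketch, assuming the raw bytes of the optional header have already been read into a buffer; OptionalHeaderKind and classify_optional_header are hypothetical names, not part of the tool. The two magic values come from the PE specification, and <Windows.h> exposes them as IMAGE_NT_OPTIONAL_HDR32_MAGIC and IMAGE_NT_OPTIONAL_HDR64_MAGIC.

```cpp
#include <cstddef>
#include <cstdint>

// Magic values from the PE specification (first field of the optional header).
constexpr std::uint16_t kMagicPe32 = 0x10b;      // 32-bit image (PE32)
constexpr std::uint16_t kMagicPe32Plus = 0x20b;  // 64-bit image (PE32+)

// Hypothetical enum for illustration only.
enum class OptionalHeaderKind { Pe32, Pe32Plus, Unknown };

// Hypothetical helper: inspect the little-endian Magic field at the start of
// the optional header to decide which struct layout the rest should use.
OptionalHeaderKind classify_optional_header(const unsigned char* bytes, std::size_t size) {
    if (bytes == nullptr || size < 2) {
        return OptionalHeaderKind::Unknown;
    }
    const std::uint16_t magic =
        static_cast<std::uint16_t>(bytes[0] | (bytes[1] << 8));
    if (magic == kMagicPe32) {
        return OptionalHeaderKind::Pe32;       // use IMAGE_OPTIONAL_HEADER32
    }
    if (magic == kMagicPe32Plus) {
        return OptionalHeaderKind::Pe32Plus;   // use IMAGE_OPTIONAL_HEADER64
    }
    return OptionalHeaderKind::Unknown;
}
```

The number of bytes to read for the optional header is recorded in IMAGE_FILE_HEADER::SizeOfOptionalHeader, so the buffer can be sized from the file rather than from sizeof on a host-dependent struct.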

With the data in place, stripping unwanted data is as easy as adding an enum that specifies the removable sections and passing it to a function that writes the output, which closely mirrors the input function.
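
As a rough illustration of the enum idea, the sketch below uses a bit-flag enum and a predicate the write function could consult per section. The flag names, the section names they map to, and keep_section are all assumptions made for this example; the tool's actual enum is not shown in this post.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical flags; each bit marks one kind of removable data.
enum class Strip : std::uint32_t {
    None         = 0,
    DosStub      = 1u << 0,
    DebugSection = 1u << 1,
    RelocSection = 1u << 2,
};

constexpr Strip operator|(Strip a, Strip b) {
    return static_cast<Strip>(static_cast<std::uint32_t>(a) | static_cast<std::uint32_t>(b));
}

constexpr bool has_flag(Strip set, Strip flag) {
    return (static_cast<std::uint32_t>(set) & static_cast<std::uint32_t>(flag)) != 0;
}

// Hypothetical predicate: name8 points at the 8-byte IMAGE_SECTION_HEADER::Name
// field, which is padded with zeros but not guaranteed to be null-terminated,
// hence the bounded comparison.
bool keep_section(const char* name8, Strip strip) {
    if (has_flag(strip, Strip::DebugSection) && std::strncmp(name8, ".debug", 8) == 0) {
        return false;
    }
    if (has_flag(strip, Strip::RelocSection) && std::strncmp(name8, ".reloc", 8) == 0) {
        return false;
    }
    return true;
}
```

The write function can then loop over the sections exactly as the read function does and skip any section for which keep_section returns false.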

Memory reference fix up

Attempting to run the binary produced by the simple program detailed above will not work (except maybe for an empty binary, but that isn't very useful)! Much of the data in the binary depends directly on fixed sizes and offsets, which removing sections has now upset. The first and most obvious target to rectify is the image optional header, which has a number of fields that need to be updated:

void fixup_optional_header(void*& header, bool x86, size_t net_offset_diff) {
    if (x86) {
        auto optional_header = static_cast<IMAGE_OPTIONAL_HEADER32*>(header);
        optional_header->SizeOfImage -= static_cast<DWORD>(net_offset_diff);
        optional_header->SizeOfHeaders -= static_cast<DWORD>(net_offset_diff);
    }
    // else IMAGE_OPTIONAL_HEADER64
}

This is simple enough to do before writing out each header. The only caveat is that net_offset_diff needs to be tracked for each header separately, since the tool is also capable of removing entire sections. Speaking of which, removing a section also necessitates fixing up IMAGE_FILE_HEADER::NumberOfSections. Code also needs to be added to fix up all the section headers, since any removals are going to relocate their data in the binary file. This is as simple as:

void fixup_section_header(IMAGE_SECTION_HEADER& section, size_t net_offset_diff) {
    section.PointerToRawData -= static_cast<DWORD>(net_offset_diff);
}
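
To make the bookkeeping concrete, here is a minimal sketch of how net_offset_diff might accumulate while walking the section table. SectionInfo is a hypothetical stand-in for the few IMAGE_SECTION_HEADER fields involved, and remove_and_fix_up is not a function from the tool. The sketch assumes the sections are sorted by file offset with their raw data laid out back to back, and it ignores FileAlignment padding and any shrinkage of the header table itself.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the IMAGE_SECTION_HEADER fields used here.
struct SectionInfo {
    std::uint32_t pointer_to_raw_data;  // file offset of the section's data
    std::uint32_t size_of_raw_data;     // bytes the section occupies on disk
    bool remove;                        // marked for stripping
};

// Returns the surviving sections with their file offsets shifted down by the
// total size of every removed section that preceded them in the file.
std::vector<SectionInfo> remove_and_fix_up(const std::vector<SectionInfo>& sections) {
    std::vector<SectionInfo> kept;
    std::uint32_t net_offset_diff = 0;
    for (const SectionInfo& section : sections) {
        if (section.remove) {
            net_offset_diff += section.size_of_raw_data;
            continue;
        }
        SectionInfo moved = section;
        // A section holding only uninitialized data has no raw data on disk
        // and keeps a zero pointer, so there is nothing to shift.
        if (moved.pointer_to_raw_data != 0) {
            moved.pointer_to_raw_data -= net_offset_diff;
        }
        kept.push_back(moved);
    }
    return kept;
}
```

For example, with sections at 0x400 (0x200 bytes), 0x600 (0x400 bytes, removed), and 0xA00, the last section ends up at 0x600, and the size of the returned vector is the new value for NumberOfSections.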

At this point, I'm realizing that a fully robust implementation would probably be structured in three phases: read, edit, write. Initially, I wanted to avoid the additional complexity; however, avoiding it is likely to complicate the code far more, so I'm going to add a strip() phase that makes all the necessary modifications before handing off to the write phase, which writes the data back to disk. With that decided, I refactored the existing code and simplified it a bit thanks to the additional phase.
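
The three-phase split could look roughly like the toy below, which only knows how to drop the DOS program between the DOS header and the PE signature. ToyPeEditor, its members, and strip_dos_stub_from are hypothetical names for this sketch and not the tool's real classes. It deliberately leaves everything after the signature untouched, so a real strip phase would still have to apply the offset fix-ups described above.

```cpp
#include <cstddef>
#include <cstdint>
#include <iterator>
#include <sstream>
#include <string>
#include <vector>

constexpr std::size_t kDosHeaderSize = 64;   // sizeof(IMAGE_DOS_HEADER)
constexpr std::size_t kLfanewOffset = 0x3c;  // offset of e_lfanew within it

class ToyPeEditor {
public:
    // Phase 1: split the file into pieces without modifying anything.
    bool read(std::istream& in) {
        const std::vector<unsigned char> bytes((std::istreambuf_iterator<char>(in)),
                                               std::istreambuf_iterator<char>());
        if (bytes.size() < kDosHeaderSize || bytes[0] != 'M' || bytes[1] != 'Z') {
            return false;
        }
        std::size_t lfanew = 0;  // little-endian file offset of the PE signature
        for (std::size_t i = 0; i < 4; ++i) {
            lfanew |= static_cast<std::size_t>(bytes[kLfanewOffset + i]) << (8 * i);
        }
        if (lfanew < kDosHeaderSize || lfanew > bytes.size()) {
            return false;
        }
        const auto stub_begin = bytes.begin() + static_cast<std::ptrdiff_t>(kDosHeaderSize);
        const auto stub_end = bytes.begin() + static_cast<std::ptrdiff_t>(lfanew);
        dos_header_.assign(bytes.begin(), stub_begin);
        dos_stub_.assign(stub_begin, stub_end);
        rest_.assign(stub_end, bytes.end());
        return true;
    }

    // Phase 2: edit the in-memory model only. Dropping the stub means the PE
    // signature now sits directly after the DOS header.
    void strip() {
        if (dos_header_.size() != kDosHeaderSize) {
            return;  // nothing has been read yet
        }
        dos_stub_.clear();
        for (std::size_t i = 0; i < 4; ++i) {
            dos_header_[kLfanewOffset + i] =
                static_cast<unsigned char>((kDosHeaderSize >> (8 * i)) & 0xff);
        }
    }

    // Phase 3: serialize the pieces in file order.
    bool write(std::ostream& out) const {
        for (const auto* part : {&dos_header_, &dos_stub_, &rest_}) {
            out.write(reinterpret_cast<const char*>(part->data()),
                      static_cast<std::streamsize>(part->size()));
        }
        return static_cast<bool>(out);
    }

private:
    std::vector<unsigned char> dos_header_;
    std::vector<unsigned char> dos_stub_;
    std::vector<unsigned char> rest_;
};

// Usage: run the three phases over an in-memory image; returns an empty
// string if the input could not be parsed or written.
std::string strip_dos_stub_from(const std::string& image) {
    std::istringstream in(image);
    std::ostringstream out;
    ToyPeEditor editor;
    if (!editor.read(in)) {
        return std::string();
    }
    editor.strip();
    return editor.write(out) ? out.str() : std::string();
}
```

Keeping every mutation inside strip() means read and write stay mirror images of each other, which is the simplification the extra phase buys.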

That's all for now

Initially, I was planning to make this tool able to strip out arbitrary sections in a PE binary and ensure it remained runnable. In its current state, it is mostly a shell to read and write out PE format files with a tiny bit of additional functionality for stripping out the DOS program. This is as far as I intend to take the tool for now, as I've since realized that the .pdata section only exists when using types that rely on exceptions (Text-Game still uses a few standard library types, so I didn't realize this immediately), and that was the main space savings I was hoping to realize with the tool. This tool has not been extensively tested, so there may still be bugs in the implementation. Additionally, the tool doesn't recompute the checksum stored in the optional header, so using it on a driver or system binary will result in the binary failing to load.
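
For anyone who wants to close the checksum gap, the sketch below follows the commonly described PE checksum scheme: a 16-bit sum with end-around carry over the whole file, treating the CheckSum field itself as zero, plus the file length. pe_checksum is a hypothetical helper that is not part of the tool, and it has not been validated against the Windows implementation here, so it would be worth comparing its results with CheckSumMappedFile from ImageHlp before relying on it.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper. checksum_offset is the file offset of the optional
// header's CheckSum field: e_lfanew + 4 (signature) + 20 (IMAGE_FILE_HEADER)
// + 64, which is the same for PE32 and PE32+ images.
std::uint32_t pe_checksum(const std::vector<unsigned char>& file, std::size_t checksum_offset) {
    std::uint32_t sum = 0;
    for (std::size_t i = 0; i < file.size(); i += 2) {
        // Assemble one little-endian 16-bit word; a trailing odd byte is
        // padded with zero, and bytes of the CheckSum field count as zero.
        std::uint32_t word = 0;
        for (std::size_t j = i; j < i + 2 && j < file.size(); ++j) {
            const bool in_checksum_field = j >= checksum_offset && j < checksum_offset + 4;
            if (!in_checksum_field) {
                word |= static_cast<std::uint32_t>(file[j]) << (8 * (j - i));
            }
        }
        sum += word;
        sum = (sum & 0xffffu) + (sum >> 16);  // fold the carry back in
    }
    sum = (sum & 0xffffu) + (sum >> 16);      // final fold down to 16 bits
    return sum + static_cast<std::uint32_t>(file.size());
}
```

The result would be written into the CheckSum field as the very last step of the write phase, after every other fix-up has been applied.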

Source

I'm currently bringing up https://git.grahalt.com/. Once it is available, I plan to publish this tool in its current state. As noted above, this code is not production ready but intended as more of an outline.