This Monday I decided to write a Linux executable (ELF) without using a compiler! I’ve always been a huge fan of low-level programming exercises, so this came pretty naturally. There was hardly any information online on how to do this, so it took some real software engineering to get it to work. It would be pretty tedious to explain every line of bytes in this article, but I put everything into a well-commented bash script here: https://github.com/quantumvm/Elf-from-echo/blob/master/elfFromScratch
I used four tools to put things together: vim for writing a bash script to document what I was doing, echo to emit the bytes I needed, readelf to make sure the ELF file was being interpreted correctly, and ht editor as a simple hex editor and a way to view more information about the executable. I had to pass a few extra flags to echo to make sure things were interpreted correctly: “-n” to prevent a newline from being appended, and “-e” so that escapes like “\x90” were emitted as a single byte rather than four ASCII characters.
The man pages do a great job of explaining the structure of an ELF file by giving its representation as a series of structs. The only ones I cared about to get the program to run were the ElfN_Ehdr struct, the Elf32_Phdr struct, and the Elf32_Shdr struct. These correspond to the ELF header, program headers, and section headers.
First up was the ELF header. It’s really important to get this part right. Although the program header describes which parts of your file get loaded into memory, the ELF header describes where the program header is located, where the section header is located, and most importantly where in memory the program will start executing (yes, this is something you can really modify; just wait until the post on modifying msfvenom, part 2).
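To make the effect of those two flags concrete, here is a minimal sketch. It assumes bash's builtin echo (other shells may treat the flags differently), and the function names are made up for illustration.

```shell
#!/usr/bin/env bash
# Sketch: what echo's -n and -e flags change in the output.
# Assumes bash's builtin echo with default settings.

# Without -e the escape is printed literally: four characters plus a newline.
literal_bytes() { echo "\x90"; }

# With -n and -e the escape becomes one raw byte and no newline is appended.
raw_byte() { echo -n -e "\x90"; }

# Dump both as hex to compare.
literal_bytes | od -An -v -tx1   # bytes: 5c 78 39 30 0a
raw_byte      | od -An -v -tx1   # bytes: 90
```

Piping through od is just a quick way to confirm which bytes actually land in the file.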
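    #define EI_NIDENT 16

    typedef struct {
        unsigned char e_ident[EI_NIDENT];
        uint16_t      e_type;
        uint16_t      e_machine;
        uint32_t      e_version;
        ElfN_Addr     e_entry;
        ElfN_Off      e_phoff;
        ElfN_Off      e_shoff;
        uint32_t      e_flags;
        uint16_t      e_ehsize;
        uint16_t      e_phentsize;
        uint16_t      e_phnum;
        uint16_t      e_shentsize;
        uint16_t      e_shnum;
        uint16_t      e_shstrndx;
    } ElfN_Ehdr;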
As you can see, the type on the left gives each field’s size in bytes, while the field name is on the right. For example, “uint32_t e_version;” is a 4-byte field that describes the version of the ELF file.
I found out that some tools, like ht, actually depend on e_entry when projecting the file’s instructions into memory. This is a VERY bad way to do it. It can give very inaccurate results, because if the ELF file is built in a strange way, the instructions are not necessarily loaded at the same address where execution begins.
Since I was building this ELF file myself, I decided to play around with things and build it in a strange way on purpose. Normally the structure of an ELF file will look like the following:
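To show how the struct fields turn into echo calls, here is a rough sketch of a 52-byte 32-bit ELF header. Every value is an illustrative assumption, not copied from my script: the entry point 0x08048054 assumes the code sits right after one 32-byte program header, and the section header fields are zeroed for brevity. Multi-byte fields are written little-endian.

```shell
#!/usr/bin/env bash
# Sketch of an Elf32_Ehdr emitted byte by byte (assumed values throughout).
elf_header() {
    echo -n -e "\x7f\x45\x4c\x46"                  # e_ident: magic 0x7f 'E' 'L' 'F'
    echo -n -e "\x01\x01\x01\x00"                  # e_ident: 32-bit, little-endian, version 1, System V ABI
    echo -n -e "\x00\x00\x00\x00\x00\x00\x00\x00"  # e_ident: ABI version and padding
    echo -n -e "\x02\x00"                          # e_type: ET_EXEC
    echo -n -e "\x03\x00"                          # e_machine: EM_386
    echo -n -e "\x01\x00\x00\x00"                  # e_version: 1
    echo -n -e "\x54\x80\x04\x08"                  # e_entry: 0x08048054 (assumed)
    echo -n -e "\x34\x00\x00\x00"                  # e_phoff: 52, right after this header
    echo -n -e "\x00\x00\x00\x00"                  # e_shoff: 0, no section headers here
    echo -n -e "\x00\x00\x00\x00"                  # e_flags: none
    echo -n -e "\x34\x00"                          # e_ehsize: 52
    echo -n -e "\x20\x00"                          # e_phentsize: 32
    echo -n -e "\x01\x00"                          # e_phnum: 1
    echo -n -e "\x28\x00"                          # e_shentsize: 40
    echo -n -e "\x00\x00"                          # e_shnum: 0
    echo -n -e "\x00\x00"                          # e_shstrndx: 0
}

# Hex dump of the header for inspection.
elf_header | od -An -v -tx1
```

Redirecting `elf_header` into a file and running `readelf -h` on it is a reasonable way to check that each field is read back as intended.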
    ELF header
    Program header table
    .text (the code)
    Section header table
I decided to build the program like this for fun:
[The listing of the rearranged layout is missing from this copy of the post.]
Surprisingly, everything still ran perfectly. Yay!
Writing the program headers and section headers was pretty boring, so I won’t include that process here. It’s easier to just see how that was done by taking a look at the bash file for the project. I will make a comment on the section headers though. When designing the program I initially only wanted one section header, accounting for the text segment of the program. To make sure everything worked I ended up with two section headers: one for the “<.text>” section of the program and one for the names of each section. I could have ended up with three section headers, but I decided to make this ELF file “stripped”.
As I found out, when you assemble a file with nasm and then link it with ld, you normally end up with an additional section header that describes the symbol table. In my program I decided to leave this symbol table out. This was roughly equivalent to running the strip command on the executable.
I didn’t feel like going through the hassle of dealing with loaded libraries, since this was meant to be a simple program. I decided to rely on my shellcode-writing skills and instead write all the needed instructions for a simple hello world program via Linux system calls. First I wrote out my program in assembly, then translated it into bytes:
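For a feel of what one of those headers looks like as bytes, here is a sketch of a single 32-byte Elf32_Phdr describing one loadable, read-and-execute segment. The load address 0x08048000 and the 129-byte size are assumptions for illustration, not the values from my script.

```shell
#!/usr/bin/env bash
# Sketch of one Elf32_Phdr entry (assumed values, little-endian fields).
program_header() {
    echo -n -e "\x01\x00\x00\x00"  # p_type: PT_LOAD
    echo -n -e "\x00\x00\x00\x00"  # p_offset: load from the start of the file
    echo -n -e "\x00\x80\x04\x08"  # p_vaddr: 0x08048000 (assumed)
    echo -n -e "\x00\x80\x04\x08"  # p_paddr: same as p_vaddr
    echo -n -e "\x81\x00\x00\x00"  # p_filesz: 129 bytes (assumed file size)
    echo -n -e "\x81\x00\x00\x00"  # p_memsz: same as p_filesz
    echo -n -e "\x05\x00\x00\x00"  # p_flags: PF_R | PF_X
    echo -n -e "\x00\x10\x00\x00"  # p_align: 0x1000
}

# Hex dump of the program header for inspection.
program_header | od -An -v -tx1
```

A section header is built the same way from the Elf32_Shdr fields; it is just 40 bytes instead of 32.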
[The assembly listing is missing from this copy of the post.]
One odd thing you may notice about the assembly is the four pushes followed by a mov from esp into ecx (mov ecx, esp in nasm syntax). What this was effectively doing was placing the string “hello world!” onto the stack and then moving a pointer to this string into the register ecx. The first system call the assembly makes is write. This is done by moving the value 0x4 into the register eax and then executing int 0x80. The second system call happens just after this, by moving the value 0x1 into eax and executing int 0x80 again. This one is exit, so the program quits cleanly.
Wrapping things up, I was really proud of the fact that I got this to work. Sometimes, with developments such as object-oriented programming and languages like Java, it can get really easy to become so caught up in abstraction that you completely forget you’re just dealing with a simple state machine.
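Here is a sketch of what the push trick can look like once translated to bytes. This is not the exact instruction sequence from my script: using the fourth push for a trailing newline, the register setup order, and the exit status of 0 are all assumptions. Each comment gives the nasm-style instruction the bytes encode.

```shell
#!/usr/bin/env bash
# Sketch: 32-bit x86 machine code for write(1, "hello world!\n", 13) then exit(0).
# The string is pushed back to front, so esp ends up pointing at "hell...".
hello_code() {
    echo -n -e "\x6a\x0a"              # push 0x0a          ; "\n" padded with zeros
    echo -n -e "\x68\x72\x6c\x64\x21"  # push 0x21646c72    ; "rld!"
    echo -n -e "\x68\x6f\x20\x77\x6f"  # push 0x6f77206f    ; "o wo"
    echo -n -e "\x68\x68\x65\x6c\x6c"  # push 0x6c6c6568    ; "hell"
    echo -n -e "\x89\xe1"              # mov ecx, esp       ; ecx = pointer to string
    echo -n -e "\xba\x0d\x00\x00\x00"  # mov edx, 13        ; length
    echo -n -e "\xbb\x01\x00\x00\x00"  # mov ebx, 1         ; fd 1 = stdout
    echo -n -e "\xb8\x04\x00\x00\x00"  # mov eax, 4         ; sys_write
    echo -n -e "\xcd\x80"              # int 0x80
    echo -n -e "\xb8\x01\x00\x00\x00"  # mov eax, 1         ; sys_exit
    echo -n -e "\x31\xdb"              # xor ebx, ebx       ; exit status 0
    echo -n -e "\xcd\x80"              # int 0x80
}

# Hex dump of the code bytes for inspection.
hello_code | od -An -v -tx1
```

Because x86 is little-endian, each pushed constant is the four characters in reverse order, which is why "hell" shows up as 0x6c6c6568.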