ELF Internals - Part I: The ELF Header
Table of Contents
So… You wanna learn something about those weird Linux executables known as ELFs? Well you are in luck, me too! I’ve taken it upon me to seek around the dark dusty corners of the world wide webs to learn about those ELFs so you do not have to look as much and maybe help you out!
But beware young traveler I am learning with you, so there may be some mistakes.
Part I: What in the Great Heavens is an ELF
First off… No its not a elve!
ELF stands for Executable and Linkable Format, and simply put, its just a File Format commonly used on the *UNIX operating systems for… well plain ol’ executables but also for other stuff like kernel modules (.ko), shared libaries (.so) and even core dumps (.core)!
So why should anyone care?
Well, there are several reasons why one would want to learn more about the ELFs… Here are a few (which I can think of):
Malware
Whether you are analysing or developing mischevious computer software, you the analyist or developer should know at least the basics of the ELF! Why??? Because I said so. Aaand also because for example as the developer you could make the analyists life a nightmare by modifying some silly bytes to confuse the disassemblers/decompilers, or as the analyist you may want to know the
evil tricks of the developer to find out his actual intents while reverse engineering. Or if you are like me and think viruses and file infectors are cool you probably need to know about ELFs to properly write a infector/virus.
Binary Golfing
Ahh so you wanna be part of the cool kids club, ey? Well knowing the ELF structure helps a lot when trying to make your silly little binary as small and cute as possible. The standard ELF loader on Linux does not care all that much about your binaries header integrity as we will see later!
Compatibilty
Maybe you are the big brained fella that wants to create his own operating system and want to use the good ol’ ELF as the standard for your executables or you just want to somehow port the ELF to some other weird operating system, like… I don’t know… Windows???
(dont get any stupid ideas…)
Part II: Let the horrors begin!
The ELF consists of 6 main parts:
- The ELF header
- Program headers (table)
- Section headers (table)
- Sections
- Symbols
- Relocation
Now… Before diving in, lets first create a small binary to have a nice little example:
#include <stdio.h>
int main(void) {
printf("Hello World!\n");
return 0;
}
Now… hexdump that sucka (or whatever you use) and let’s begin!
The ELF Header
The good old ELF header consists of 14 little parts, and those tiny parts contain some very important data, let us take a deeper look.
The ELF header is defined like so: [1]
typedef struct {
unsigned char e_ident[16];
uint16_t e_type;
uint16_t e_machine;
uint32_t e_version;
ElfN_Addr e_entry;
ElfN_Off e_phoff;
ElfN_Off e_shoff;
uint32_t e_flags;
uint16_t e_ehsize;
uint16_t e_phentsize;
uint16_t e_phnum;
uint16_t e_shentsize;
uint16_t e_shnum;
uint16_t e_shstrndx;
} ElfN_Ehdr;
now lets see what each member does:
starting off with e_ident
, it’s a kind of a weird combo of multiple
significant bytes so lets try to not get confused here…
we will use the hexdump output as reference:
7f 45 4c 46 02 01 01 00 00 00 00 00 00 00 00 00
<- example
7f 45 4c 46 __ __ __ __ __ __ __ __ __ __ __ __
<- “__” means the loader don’t care
(for more info about golfing and learning about what the loader ignores id check out blogs [1][2])
lets start with the magic bytes 7f 45 4c 46
or just ELF
, this one just
tells us that its a ELF binary, without it, the loader and other stuff will
refuse the binary. Next up, we have the byte after our magic, it tells us in
which format the binary is, so for 32bit executables we have the value 01
and
for 64bit executables we have the value 02
. After that comes the byte that signifies
the endianess of the binary, 01
being little endian and 02
being big endian. Finally
we have the version of the ELF which is (almost) always 01
.
Next we got 2 bytes that are not usually used automatically, the first one
signifies for which OS the binary was compiled for, for example Linux would be 03
and OpenBSD would be 0c
,
and the latter signifies the version of the OS (also not used anymore after Linux kernel version 2.6)
Aaaand finally, we have some padding bytes, used for… well padding.
02 00 3e 00 01 00 00 00 00 10 40 00 00 00 00 00
02 00 3e 00 __ __ __ __ 00 10 40 00 00 00 00 00
next up we have the e_type
member signifying the file type of our binary, 02
meaning a executables, 03
meaning a shared object
and etc. Annd after that we got e_machine
specifying the instruction set, this one can be left out.
Then we have the e_version
, that is 4 bytes long, specifying the version of the ELF (again??). And Finally
we have the e_entry
that just contains the entry point of our file, in this case it is 0x0000000000401000
or just 0x401000
40 00 00 00 00 00 00 00 10 21 00 00 00 00 00 00
40 00 00 00 00 00 00 00 __ __ __ __ __ __ __ __
on to the 3rd line of the dump, we got the e_phoff
or Program Header Offset, is a 8 byte chonk
and a pointer to the program header table. And since it usually just follows after the ELF header
its most of the time at the offset 0x40
on 64bit. Next we have another chonk, the e_shoff
,
this is just a pointer to the Section Header table.
00 00 00 00 40 00 38 00 03 00 40 00 06 00 05 00
__ __ __ __ 40 00 38 00 03 00 __ __ __ __ __ __
Now here the first 4 bytes are architecture specific and belong to the member e_flags
. Next up we have the e_ehsize
which is 2 bytes that just say how big the ELF header is.
After that we got the e_phentsize
member, this one is also 2 bytes and contains the size of “each” program header entry,
this is most likely gonna be 0x38
on 64bit. Right after that we have the number of entries in the program header table (also 2 bytes).
And for the section header table we have the members e_shentsize
, e_shnum
and e_shstrndx
representing the section header
entry size, the number of section header entries and the index of the of the section header table entry that contains the section
names respectfully.
Conclusion
So this is it for the first part of the small series I plan to do on the ELF file format. I hope I did not bore you too much! In the next parts we will cover the program headers and the section headers so be sure to check this site again after a few days.. or weeks… or months. While you wait, I have left you a little assignment you can do to directly use the knowledge you learned today so it helps ya memorise it! ))
P.S. I’ll try my best to make the next post not as boring as this one ))
P.P.S. If you find anything in this article that is not correct be sure to DM me on discord at “deluks.”
ASSIGNMENT:
Pick any language you want and write a simple ELF header parser, keep in mind we will be expanding this as we go. I will be using C and you will see my solution in the last post of the series!
References
[0] https://en.wikipedia.org/wiki/Executable_and_Linkable_Format
[1] https://tmpout.sh/1/1.html *
[2] https://n0.lol/ebm/ *
[3] https://www.man7.org/linux/man-pages/man5/elf.5.html
* be sure to check these awesome posts out, the authors are extremely talented and explain their topics very well!