Introduction to Linux binary exploitation - Part 1
Before we can exploit Linux binaries, we need to become familiar with the target, and that requires the right tools. Gathering information is the most important part of any security research activity, binary exploitation included, so it is essential to know the tools that show what is happening inside a target binary.
In contrast to Windows, where most applications have graphical user interfaces, Linux and most Linux exploitation tools are driven from the command line.
The standard debugger on Linux is gdb. It is preinstalled on many Linux distributions; where it is missing, it can be installed on Debian-based systems with:
sudo apt-get install gdb
Many community-made plugins are freely available on the web to make gdb's output easier to read, pwndbg among them. In this article, PEDA will be used.
PEDA is easy to download and configure. Its source is hosted on GitHub, and the installation instructions in the repository are simple and straightforward.
The ~/.gdbinit file provides startup commands to be executed by gdb. You can compare it to a .bashrc file for bash.
To debug a file with gdb, pass the file to it on the command line:
gdb [file name]
Note that an unprivileged user cannot attach to a more privileged process, and a suid binary started under gdb runs without its elevated privileges.
Once the program has been loaded into gdb, it can be started with the run command. Command line arguments for the program can be specified after run.
Some other handy gdb commands are listed below:
disas [function name] - Displays a disassembly of the specified function.
break *0xaddress - Places a breakpoint at the entry of a specified function or at an address. When a breakpoint is hit, execution stops.
print [name] - Displays the contents of the specified object. The name may represent a function, register, or variable.
info [name] - Displays information regarding the specified name; for instance, info registers prints the contents of all registers.
step - Advances the program until the next source line is reached, entering any function that is called (step into). The next command does the same but steps over function calls.
stepi - Executes exactly one machine instruction.
x - Examine. This command displays memory contents in a variety of formats. It has the syntax x/[number of units][data type][location name].
For example, a command such as x/20w $esi displays 20 words starting from where esi points to.
It is important to note that there are two commonly used x86 assembly syntaxes: AT&T and Intel.
What about gdb? It employs the AT&T syntax by default, which means assembly code is written differently. For instance, a percent sign must precede every register name, and the operands are written in reverse order compared to the Intel syntax (source first, destination last), which may cause confusion.
You can switch to the Intel syntax manually by issuing the following command:
set disassembly-flavor intel
PEDA does this by default, as you can see in its source code below:
def disassemble(self, *arg):
    """
    Wrapper for disassemble command
        - arg: args for disassemble command

    Returns:
        - text code (String)
    """
    code = ""
    modif = ""
    arg = list(arg)
    if len(arg) > 1:
        # A leading argument containing "/" is a modifier such as "/r"
        if "/" in arg[0]:
            modif = arg[0]
            arg = arg[1:]
    # A single numeric argument is expanded to a 32-byte address range
    if len(arg) == 1 and to_int(arg[0]) != None:
        arg += [to_hex(to_int(arg[0]) + 32)]

    # Intel syntax is forced before every disassembly
    self.execute("set disassembly-flavor intel")
    out = self.execute_redirect("disassemble %s %s" % (modif, ",".join(arg)))
    if not out:
        return None
    else:
        code = out

    return code
Other important tools for analyzing Linux binaries include the following:
strace - Traces the system calls made by a target binary; its companion ltrace traces library calls.
readelf - Displays information about an ELF (Linux executable format) file.
objdump - Displays object file information. It can also be used to disassemble Linux executables.
strings - Extracts readable strings from a binary. This can be used to find hardcoded paths, “hidden” strings, or names.
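To make the idea behind strings concrete, here is a minimal Python sketch of the same technique: scanning raw bytes for runs of printable ASCII characters. The function name extract_strings and the sample bytes are made up for illustration, and the minimum length of 4 mirrors the tool's default; the real utility supports more encodings and options.

```python
import re


def extract_strings(data: bytes, min_len: int = 4) -> list:
    """Return runs of printable ASCII at least min_len bytes long."""
    # Printable ASCII spans 0x20 (space) through 0x7e (~).
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [m.group().decode("ascii") for m in pattern.finditer(data)]


# Hypothetical sample: non-printable bytes with two embedded strings.
sample = b"\x7fELF\x01\x00\x00/etc/passwd\x00\x02ab\x00secret_key\x00"
print(extract_strings(sample))  # ['/etc/passwd', 'secret_key']
```

Short runs such as "ELF" and "ab" are dropped because they fall below the minimum length, which keeps random byte noise out of the output.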
ELF (Executable and Linking Format) is the executable file format for Linux and other Unix-like systems.
An ELF file consists of an ELF header followed by file data.
Its contents can be inspected with the readelf command, for example:
readelf -a /bin/ping
The ELF header tells the operating system how to process the file. Its most important fields are:
- Magic: the file begins with the hexadecimal sequence 7f 45 4c 46 (the byte 0x7f followed by the ASCII characters "ELF").
- Class: specifies whether the file targets a 32-bit or 64-bit architecture.
- Data: refers to the endianness type (little or big).
- Type: may be CORE (core dumps), DYN (shared objects), EXEC (executables), or REL (relocatable files).
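The header fields described above can also be read by hand. The following Python sketch decodes the magic, class, data, and type fields from the first 18 bytes of a file. The function name parse_elf_header and the sample bytes are hypothetical; the offsets and numeric values follow the ELF specification.

```python
import struct

ELF_MAGIC = b"\x7fELF"
CLASSES = {1: "ELF32", 2: "ELF64"}
ENDIANNESS = {1: "little", 2: "big"}
TYPES = {1: "REL", 2: "EXEC", 3: "DYN", 4: "CORE"}


def parse_elf_header(data: bytes) -> dict:
    """Decode class, data encoding, and type from the start of an ELF file."""
    if len(data) < 18 or data[:4] != ELF_MAGIC:
        raise ValueError("not an ELF file")
    ei_class, ei_data = data[4], data[5]
    # e_type is a 16-bit field at offset 16, stored in the file's byte order.
    fmt = ">H" if ei_data == 2 else "<H"
    (e_type,) = struct.unpack_from(fmt, data, 16)
    return {
        "class": CLASSES.get(ei_class, "unknown"),
        "data": ENDIANNESS.get(ei_data, "unknown"),
        "type": TYPES.get(e_type, "unknown"),
    }


# Hypothetical header: 64-bit, little-endian, type DYN.
sample = ELF_MAGIC + bytes([2, 1, 1, 0]) + bytes(8) + struct.pack("<H", 3)
print(parse_elf_header(sample))
```

To try it on a real binary, pass in the first 18 bytes of the file, read in binary mode, in place of the sample.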
Core dumps are memory snapshots written when a process crashes, often as a result of memory corruption. They can be loaded into gdb to analyse crashed programs with greater precision. We will use core dumps later on during exploit development.
Before moving on to the last three types, it is necessary to define linkers and loaders.
Linkers are responsible for connecting the names of functions to their actual memory locations. When a program calls a function from a system library, the dynamic linker looks up the function’s address inside the library and writes it into the process memory of the executable, so that the function can be reached at that address. A loader’s responsibility is to load programs from storage into memory.
If an executable requests to be loaded at an already occupied memory address, it must be moved. Relocation is the process of repositioning a module in memory to avoid address clashes.
Because this must be done in an organized manner, ELF files contain relocation sections (named .rel.* or .rela.*, rather than .reloc, which is the name used by the Windows PE format). Whenever the desired loading address is unavailable, the entries in these sections describe how to patch the program with new addresses.
Consequently, relative addressing is employed to describe the addresses of program functions.
Relative addressing identifies a function’s address by its offset from the loading base address, as opposed to its entire address.
For example, if the relative virtual address of a function is 0x123 and its program is loaded at 0x804000, the function can be found at 0x804000 + 0x123 = 0x804123.
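The address arithmetic in this example can be sketched in a few lines of Python. The helper names are hypothetical, and the second base address (0x900000) is an assumed value chosen only to show that the offset stays the same when the module is relocated.

```python
def absolute_address(base: int, rva: int) -> int:
    """Absolute address = loading base address + relative virtual address."""
    return base + rva


def relative_address(base: int, absolute: int) -> int:
    """Recover the offset of an absolute address from the loading base."""
    return absolute - base


rva = 0x123
print(hex(absolute_address(0x804000, rva)))  # 0x804123

# If the module is relocated to a different (assumed) base,
# the offset is unchanged and only the base differs.
print(hex(absolute_address(0x900000, rva)))  # 0x900123
```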
Additionally, executable files may contain symbols. Symbols describe the executable code and include function and variable names, among others. The developer of an executable may disable support for symbols during compilation.
Many function and variable names hint at what they are intended to do; for instance, finding a function named printName() can eliminate a great deal of reverse engineering work.
The removal of symbols from an ELF file is known as stripping.
Now that some concepts of Linux executable handling have been described, let’s define the three remaining ELF file types:
- EXEC (executables)
- REL (relocatable files)
- DYN (shared objects)
Executable files can be mapped directly into memory upon execution.
Relocatable files are object files (typically with a .o extension) that have not yet been linked into an executable or shared object.
They carry relocation information so that the addresses of their components can be adjusted once their final position is known.
Shared objects are function libraries. From a technical standpoint, they combine parts typical of both executable and relocatable files. They are frequently identifiable by their .so extension.
During program launch, shared objects are loaded into the program that will use them.
Sections are standard regions within an ELF file, each with a specific role. They are mapped into process memory during startup: each one is placed in the memory of the newly created process according to its size and the information it contains.
Theoretically, none or all of a section’s permissions (read, write, and execute) can be enabled.
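As an illustration of how such permission combinations are encoded, the sketch below decodes a flags value into the familiar rwx notation. It assumes the bit values used in the p_flags field of ELF program headers (PF_X = 0x1, PF_W = 0x2, PF_R = 0x4); section headers use a related but different set of flags. The function name describe_permissions is hypothetical.

```python
# Bit values as defined for ELF program header flags (p_flags).
PF_X, PF_W, PF_R = 0x1, 0x2, 0x4


def describe_permissions(flags: int) -> str:
    """Render a flags value as an rwx string, with '-' for each unset bit."""
    return "".join([
        "r" if flags & PF_R else "-",
        "w" if flags & PF_W else "-",
        "x" if flags & PF_X else "-",
    ])


print(describe_permissions(PF_R | PF_X))  # r-x, typical for code
print(describe_permissions(PF_R | PF_W))  # rw-, typical for writable data
print(describe_permissions(0))            # ---, no permissions enabled
```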