diff --git a/.gitignore b/.gitignore index 2323b69..9388f97 100644 --- a/.gitignore +++ b/.gitignore @@ -53,3 +53,4 @@ dkms.conf # Our binaries hello/hello +sum/sum diff --git a/README.md b/README.md index 0bfcd04..eef9362 100644 --- a/README.md +++ b/README.md @@ -22,7 +22,7 @@ Have fun! Here are links to each post: * [Part 1. Introduction](https://github.com/0xAX/asm/blob/master/content/asm_1.md) - * [Say hello to x86_64 Assembly part 2](https://github.com/0xAX/asm/blob/master/content/asm_2.md) + * [Part 2. The `x86_64` concepts](https://github.com/0xAX/asm/blob/master/content/asm_2.md) * [Say hello to x86_64 Assembly part 3](https://github.com/0xAX/asm/blob/master/content/asm_3.md) * [Say hello to x86_64 Assembly part 4](https://github.com/0xAX/asm/blob/master/content/asm_4.md) * [Say hello to x86_64 Assembly part 5](https://github.com/0xAX/asm/blob/master/content/asm_5.md) diff --git a/content/asm_1.md b/content/asm_1.md index d62201c..681e6cd 100644 --- a/content/asm_1.md +++ b/content/asm_1.md @@ -165,7 +165,7 @@ _start: syscall ;; Specify the number of the system call (60 is `sys_exit`). mov rax, 60 - ;; Set the first argument of `sys_exit` to `0`. The `0` status code is success. + ;; Set the first argument of `sys_exit` to 0. The 0 status code is success. mov rdi, 0 ;; Call the `sys_exit` system call. syscall diff --git a/content/asm_2.md b/content/asm_2.md index fcb3399..caeb99b 100644 --- a/content/asm_2.md +++ b/content/asm_2.md @@ -1,235 +1,509 @@ +# The `x86_64` concepts -Some days ago I wrote the first blog post - introduction to x64 assembly - Say hello to x64 Assembly [part 1] which to my surprise caused great interest: +Some days ago I wrote the first post about [Introduction to assembly](https://github.com/0xAX/asm/blob/master/content/asm_1.md) which, to my surprise, caused great interest: ![newscombinator](./assets/newscombinator-screenshot.png) ![reddit](./assets/reddit-screenshot.png) -It motivates me even more to describe my way of learning. During this days I got many feedback from different people. There were many grateful words, but what is more important for me, there were many advices and adequate critics. Especially I want to say thank you words for great feedback to: +It motivated me to continue describing my journey through learning assembly programming for Linux x86_64. During these days I got great feedback from people all over the Internet. There were many words of gratitude, but, what is more important to me, there was also much adequate advice and very useful criticism. Especially, I want to say thank you for the great feedback to: -* [Fiennes](https://reddit.com/user/Fiennes) -* [Grienders](https://disqus.com/by/Universal178/) -* [nkurz](https://news.ycombinator.com/user?id=nkurz) +- [Fiennes](https://reddit.com/user/Fiennes) +- [Grienders](https://disqus.com/by/Universal178/) +- [nkurz](https://news.ycombinator.com/user?id=nkurz) -And all who took a part in discussion at Reddit and Hacker News. There were many opinions, that first part was a not very clear for absolute beginner, that's why i decided to write more informative posts. So, let's start with second part of Say hello to x86_64 assembly. +I also want to say thank you to all who took part in the discussion on [reddit](https://www.reddit.com/r/programming/comments/2exmpu/say_hello_to_assembly_part_1_linux_x8464/). There were many different opinions, some of them saying that the first post was not so clear for an absolute beginner. These comments inspired me to rework the first post and make some things more clear, keeping in mind that the main goal was just an introduction without diving too deep. For the future, I will try my best to write more informative posts. -## Terminology and Concepts +So, let's start with the second part of learning assembly programming where I will try to bring closer the basic `x86_64` concepts. -As i wrote above, I got many feedback from different people that some parts of first post are not clear, that's why let's start from description of some terminology that we will see in this and next parts. +## Terminology and concepts -Register - register is a small amount of storage inside processor. Main point of processor is data processing. Processor can get data from memory, but it is slow operation. That's why processor has own internal restricted set of data storage which name is - register. +Now that we've successfully written and run our first assembly program, it's time to learn the basics. Let's start with some terminology and concepts that we will see and use from now on. -Little-endian - we can imagine memory as one large array. It contains bytes. Each address stores one element of the memory "array". Each element is one byte. For example we have 4 bytes: AA 56 AB FF. In little-endian the least significant byte has the smallest address: +### Processor registers + +One of the first concepts we met in the previous post was a **register**. We agreed that we can consider a register as a small memory slot. Following the definition on [Wikipedia](https://en.wikipedia.org/wiki/Processor_register), we can see that it's not so far from truth: + +> A processor register is a quickly accessible location available to a computer's processor. + +The main goal of a processor is data processing. To process data, a processor must access this data somewhere. Of course, a processor can get data from [main memory](https://en.wikipedia.org/wiki/Random-access_memory), but it is a very slow operation. If we take a look at the [Latency Numbers Every Programmer Should Know](https://samwho.dev/numbers), we can see the following picture: ``` - 0 FF - 1 AB - 2 56 - 3 AA +L1 cache reference = 1ns +... +... +... +Main memory reference = 100ns ``` -where 0,1,2 and 3 are memory addresses. +Access to the [L1 CPU cache](https://en.wikipedia.org/wiki/CPU_cache) is `100x` times faster than access to the main memory. The processor registers are even "closer" to the processor. For comparison, you can take a look at the list of latencies for different instructions by [Agner Fog](https://www.agner.org/optimize/#manual_instr_tab). + +There are different types of registers on the `x86_64` processors: + +- General purpose registers +- Segment registers +- RFLAGS registers +- Control registers +- Model-specific registers +- Debug registers +- x87 FPU registers +- MMX registers +- XMM registers +- YMM registers +- ZMM registers +- Bounds registers +- Memory management registers + +You can find a detailed description of registers in the [Intel software developer manuals](https://www.intel.com/content/www/us/en/developer/articles/technical/intel-sdm.html). For now, we will focus only on the **general purpose registers** as we will use them in most examples. If we will use other registers, I will mention it beforehand. We already saw a table with general purpose registers in the [previous chapter](https://github.com/0xAX/asm/blob/master/content/asm_1.md#cpu-registers-and-system-calls): + +![registers](/content/assets/registers.png) + +There are 16 registers of 64 bits size, from `rax` to `r15`. Each register also has smaller parts with their own names. For example, as we may see in the table above, the lower 32 bits of the `rax` register are called `eax`. Similarly, the lower 16 bits of the `eax` register are called `ax`. Finally, the lower 8 bits of the `ax` register are called `al`, while the higher 8 bits are called `ah`. We can visualize this as: + +![rax](./assets/rax.svg) + +The general purpose registers are used in many different cases, like performing arithmetic and logical operations, transferring data, memory address calculation operations, passing parameters to functions and system calls, and many more. When going through these chapters, we will see how to use the general purpose registers to perform different operations. + +### Endianness + +A computer operates with bytes of data. The bytes can be stored in memory in different order. The order in which a computer stores a sequence of bytes is called **endianness**. There are two types of endianness: + +- `big` +- `little` + +We can imagine memory as one large array of bytes, where each byte has its own unique address. For example, let's say we have the following four bytes in memory: `AA 56 AB FF`. In the `little-endian` order, the least significant byte is stored at the smallest memory address: + +| Address | Byte | +|--------------------|------| +| 0x0000000000000000 | 0xFF | +| 0x0000000000000001 | 0xAB | +| 0x0000000000000002 | 0x56 | +| 0x0000000000000003 | 0xAA | + +In the case of the `big-endian` order, the bytes are stored in the opposite order, so the most significant byte is stored at the smallest memory address: + +| Address | Byte | +|--------------------|------| +| 0x0000000000000000 | 0xAA | +| 0x0000000000000001 | 0x56 | +| 0x0000000000000002 | 0xAB | +| 0x0000000000000003 | 0xFF | + +### System calls + +A [system call](https://en.wikipedia.org/wiki/System_call) is a set of APIs provided by an operating system. A user-level program can use these APIs to achieve different functionalities that an operating system kernel provides. As mentioned in the previous [chapter](https://github.com/0xAX/asm/blob/master/content/asm_1.md#cpu-registers-and-system-calls), you can find all the system calls of the Linux kernel for the `x86_64` architecture [here](https://github.com/torvalds/linux/blob/master/arch/x86/entry/syscalls/syscall_64.tbl). + +There are two ways to execute a system call: + +- Using a library function like [printf](https://man7.org/linux/man-pages/man3/fprintf.3.html), which internally calls the OS-level API +- Making a system call directly using the `syscall` instruction + +To call a system call directly, we must prepare the parameters of this function. As we saw in the previous post, some general purpose registers were used for that. The rules that define how the system calls and functions are called and how their parameters are passed are called **calling conventions**. For the Linux `x86_64`, the calling conventions are specified in the [System V Application Binary Interface](https://refspecs.linuxbase.org/elf/x86_64-abi-0.99.pdf) PDF document. + +Let's take a closer look at how arguments are passed to a system call if we trigger a system call using the `syscall` instruction. + +The first six parameters are passed in the following general purpose registers: + +- `rdi` - used to pass the first argument to a function. +- `rsi` - used to pass the second argument to a function. +- `rdx` - used to pass the third argument to a function. +- `r10` - used to pass the fourth argument to a function. +- `r8` - used to pass the fifth argument to a function. +- `r9` - used to pass the sixth argument to a function. + +If we use a user-level wrapper instead of calling a system call directly, the order of registers and parameters will be different. For now, let's focus only on the calling conventions of the system calls using the Linux kernel interface. The system call number is passed in the `rax` register. Once we set up all the parameters in their respective registers, we can call the system call with the instruction `syscall`. After the system call is finished, the result is returned in the `rax` register. -Big-endian - big-endian stores bytes in opposite order than little-endian. So if we have AA 56 AB FF bytes sequence it will be: +### Program sections + +As we saw in the first post, each program consists of program sections (or segments). Each executable file on Linux x86_64 is represented in the [ELF](https://en.wikipedia.org/wiki/Executable_and_Linkable_Format) format. An ELF file has a table of sections that a program consists of. We can see a list of sections in our `hello` program from the previous post using the [readelf](https://man7.org/linux/man-pages/man1/readelf.1.html) utility: ``` - 0 AA - 1 56 - 2 AB - 3 FF +~$ strip hello && readelf -S hello + +There are 4 section headers, starting at offset 0x2028: + +Section Headers: + [Nr] Name Type Address Offset + Size EntSize Flags Link Info Align + [ 0] NULL 0000000000000000 00000000 + 0000000000000000 0000000000000000 0 0 0 + [ 1] .text PROGBITS 0000000000401000 00001000 + 0000000000000027 0000000000000000 AX 0 0 16 + [ 2] .data PROGBITS 0000000000402000 00002000 + 000000000000000d 0000000000000000 WA 0 0 4 + [ 3] .shstrtab STRTAB 0000000000000000 0000200d + 0000000000000017 0000000000000000 0 0 1 +Key to Flags: + W (write), A (alloc), X (execute), M (merge), S (strings), I (info), + L (link order), O (extra OS processing required), G (group), T (TLS), + C (compressed), x (unknown), o (OS specific), E (exclude), + D (mbind), l (large), p (processor specific) ``` -Syscall - is the way a user level program asks the operating system to do something for it. You can find syscall table - [here](https://github.com/torvalds/linux/blob/master/arch/x86/entry/syscalls/syscall_64.tbl). - -Stack - processor has a very restricted count of registers. So stack is a continuous area of ​​memory addressable special registers `RSP`,`SS`,`RIP` and etc. We will take a closer look on stack in next parts. +As we can see, there are four sections. Two of them (`text` and `data`) were defined by us while writing the assembly code. The remaining two sections were added automatically by the compiler. While we can technically define any section in our program, there are some well-known, commonly used sections: -Section - every assembly program consists from sections. There are following sections: +- `data` - used for declaring initialized data or constants. +- `bss` - used for declaring non-initialized variables. +- `text` - used for the code of the program. +- `shstrtab` - stores references to the existing sections. -* `data` - section is used for declaring initialized data or constants -* `bss` - section is used for declaring non initialized variables -* `text` - section is used for code +### Data types -General-purpose registers - there are 16 general-purpose registers - rax, rbx, rcx, rdx, rbp, rsp, rsi, rdi, r8, r9, r10, r11, r12, r13, r14, r15. Of course, it is not a full list of terms and concepts which related with assembly programming. If we will meet another strange and unfamiliar words in next blog posts, there will be explanation of this words. +Assembly is not a [statically typed programming language](https://en.wikipedia.org/wiki/Category:Statically_typed_programming_languages). Usually, we work directly with a raw set of bytes. However, [NASM](https://nasm.us/) provides us with some helpers to define the size of data we are operating. The fundamental data types are: -## Data Types +- `byte` - 8 bits +- `word` - 2 bytes +- `doubleword` - 4 bytes +- `quadword` - 8 bytes -The fundamental data types are bytes, words, doublewords, quadwords, and double quadwords. A byte is eight bits, a word is 2 bytes, a doubleword is 4 bytes, a quadword is 8 bytes and a double quadword is 16 bytes (128 bits). +A byte is eight bits, a word is two bytes, a doubleword is four bytes, and a quadword is eight bytes. NASM provides pseudo-instructions to help us define the size: -Now we will work only with integer numbers, so let's see to it. There two types of integer: unsigned and signed. Unsigned integers are unsigned binary numbers contained in a byte, word, doubleword, and quadword. Their values range from 0 to 255 for an unsigned byte integer, from 0 to 65,535 for an unsigned word integer, from 0 to 2^32 – 1 for an unsigned doubleword integer, and from 0 to 2^64 – 1 for an unsigned quadword integer. Signed integers are signed binary numbers held as unsigned in a byte, word and etc... The sign bit is set for negative integers and cleared for positive integers and zero. Integer values range from –128 to +127 for a byte integer, from –32,768 to +32,767 for a word integer,from –2^31 to +2^31 – 1 for a doubleword integer, and from –2^63 to +2^63 – 1 for a quadword integer. +- `DB` - `byte` - 8 bits +- `DW` - `word` - 2 bytes +- `DD` - `doubleword` - 4 bytes +- `DQ` - `quadword` - 8 bytes +- `DT` - 10 bytes +- `DO` - 16 bytes +- `DY` - 32 bytes +- `DZ` - 64 bytes -## Sections +The pseudo-instructions from `DB` to `DQ` are used to define data with the size from `byte` to `quadword`. Additionally, `DT` is used to define 10 bytes, `DO` is used to define 16 bytes, `DY` is used to define 32 bytes, and `DZ` is used to define 64 bytes. -As i wrote above, every assembly program consists from sections, it can be data section, text section and bss section. Let's look on data section.It's main point - to declare initialized constants. For example: +For example: ```assembly section .data - num1: equ 100 - num2: equ 50 - msg: db "Sum is correct", 10 + ;; Define a byte with the value 100 + num1 db 100 + ;; Define 2 bytes with the value 1024 + num2 dw 1024 + ;; Define a set of characters (10 is a new line symbol in ASCII \n) + msg db "Sum is correct", 10 ``` +There are also alternatives to define uninitialized storage - `RESB`, `RESW`, `RESD`, `RESQ`, `REST`, `RESO`, `RESY`, and `RESZ`. These are used similarly to `DB` - `DZ`, but in this case, we do not provide an initial value for the defined variable. -Ok, it is almost all clear here. 3 constants with name num1, num2, msg and with values 100, 50 and "Sum is correct", 10. But what is it db, equ? Actual NASM supports a number of pseudo-instructions: +For example: + +```assembly +section .bss + ;; Define a buffer with the size 64 bytes + buffer resb 64 +``` -* DB, DW, DD, DQ, DT, DO, DY and DZ - are used for declaring initialized data. For example: +After defining variables, we can start using them in our program's code. To use a variable, we can refer it by name. However, there is a small thing to remember in the NASM assembly syntax: accessing a variable by name gets us its address, not the actual value it stores: ```assembly -;; Initialize 4 bytes 1h, 2h, 3h, 4h -db 0x01,0x02,0x03,0x04 +;; Move the address of the `num1` variable to the al register +move al, num1 +``` + +To get the actual value located in the given address, we need to specify the variable name in square brackets: -;; Initialize word to 0x12 0x34 -dw 0x1234 +```assembly +;; Move the value of num1 to the al register +mov al, [num1] ``` -* RESB, RESW, RESD, RESQ, REST, RESO, RESY and RESZ - are used for declaring non initialized variables -* INCBIN - includes External Binary Files -* EQU - defines constant. For example: +### Stack + +We can not dive into assembly programming without understanding one of the crucial concepts of the `x86_64` (and other) architectures — the [stack](https://en.wikipedia.org/wiki/Stack_(abstract_data_type)). The stack is a specific memory area in a program that is accessed in a last-in, first-out (LIFO) pattern. + +A processor has a very limited number of registers. As we already know, an `x86_64` processor provides access to only 16 general purpose registers. Often, we may need more – or even much more space – to store data. One way to solve this issue is by using the program's stack. We can look at the stack as the usual memory area but with one significant difference – its access pattern. + +In the usual [RAM](https://en.wikipedia.org/wiki/Random-access_memory) model, we can access any byte of memory accessible to our user-level application. However, the stack is accessed using a last-in, first-out (LIFO) pattern. Two specific instructions are used to interact with the stack: one to push a value into it and another to pop a value from it: + +- `push` - pushes the operand into the stack. +- `pop` - pops the top value from the stack. + +The stack grows downwards, from higher memory addresses to lower ones. So, when we refer to `top of the stack`, we actually mean the memory location with the lowest address in the current stack. In other words, the top of the stack is a pointer to the element with the lowest address. The general purpose register `rsp` should point to the top of the stack. + +In the [system call](#system-call) section, we saw that the first six arguments of a system call are passed in the general purpose registers. According to the calling conventions document: + +> System-calls are limited to six arguments, no argument is passed directly on the stack. + +So the available number of general purpose registers should be enough to execute any system call. But what about other functions? What if one has more than six arguments? In this case, the first six parameters are also passed in general purpose registers and all the next parameters are passed on the stack. The set of general purpose registers to call a library function is slightly different from the set of registers used for a system call: + +- `rdi` - used to pass the first argument to a function. +- `rsi` - used to pass the second argument to a function. +- `rdx` - used to pass the third argument to a function. +- `rcx` - used to pass the fourth argument to a function. +- `r8` - used to pass the fifth argument to a function. +- `r9` - used to pass the sixth argument to a function. + +Let’s take a look at the simple code example written in the C programming language to better illustrate the topic: + +```C +int foo(int arg1, int arg2, int arg3, int arg4, int arg5, int arg6, int arg7, int arg8) { + return arg1 + arg2 + arg3 + arg4 + arg5 + arg6 + arg7 + arg8; +} + +int bar() { + return foo(1, 2, 3, 4, 5, 6, 7, 8); +} +``` + +If we compile it and look at the assembly code, we will see such an output of the program: + +```assembly +bar: + ;; Preserve the base pointer + push rbp + ;; Set the new frame base pointer + mov rbp, rsp + ;; Push the eight argument on the stack + push 8 + ;; Push the seventh argument on the stack + push 7 + ;; Push the sixth argument on the stack + mov r9d, 6 + ;; Push the fifth argument on the stack + mov r8d, 5 + ;; Push the fourth argument on the stack + mov ecx, 4 + ;; Push the third argument on the stack + mov edx, 3 + ;; Push the second argument on the stack + mov esi, 2 + ;; Push the first argument on the stack + mov edi, 1 + ;; Call the function `foo` + call foo + ;; Clean up the stack from the 8th and 7th arguments + add rsp, 16 + ;; Restore the old rbp + leave + ;; Return from the function + ret +foo: + ;; Preserve the base pointer + push rbp + ;; Set the new frame base pointer + mov rbp, rsp + ;; Move 4 bytes value from the edi register to the address stored in the rbp register minus 4 bytes offset + mov DWORD PTR [rbp-4], edi + ;; Move 4 bytes value from the esi register to the address stored in the rbp register minus 8 bytes offset + mov DWORD PTR [rbp-8], esi + ;; Move 4 bytes value from the edx register to the address stored in the rbp register minus 12 bytes offset + mov DWORD PTR [rbp-12], edx + ;; Move 4 bytes value from the ecx register to the address stored in the rbp register minus 16 bytes offset + mov DWORD PTR [rbp-16], ecx + ;; Move 4 bytes value from the r8d register to the address stored in the rbp register minus 20 bytes offset + mov DWORD PTR [rbp-20], r8d + ;; Move 4 bytes value from the r9d register to the address stored in the rbp register minus 24 bytes offset + mov DWORD PTR [rbp-24], r9d + ... + ... # Skip arithmetic operations for now + ... + ;; Restore the old rbp + pop rbp + ;; Return from the function + ret +``` + +> [!NOTE] +> The C program should be compiled without any optimization flags because the compiler may eliminate all the calculations and just calculate the sum of arguments in compile time. You can use `-O0 -masm=intel` flags for the compiler to avoid optimization. You can use tools like [godbolt](https://godbolt.org/) to see the assembly output of these functions. + +First of all, let's take a look at the first two lines of code in the function `bar`: ```assembly -;; now one is 1 -one equ 1 +push rbp +mov rbp, rsp ``` -* TIMES - Repeating Instructions or Data. (description will be in next posts) +These two instructions at the beginning of each function are called [function prologue](https://en.wikipedia.org/wiki/Function_prologue_and_epilogue#Prologue). Each function usually operates with a part of the stack. Such a part is called a [stack frame](https://en.wikipedia.org/wiki/Call_stack). To manage the stack, the CPU uses these general purpose registers: + +- `rip` - this register is the so-called `instruction pointer`. It stores the address of the next instruction the CPU is going to execute. When the CPU meets the `call` instruction to call a function, it pushes the address of the next instruction to run after the function call to the stack. This is done so the CPU knows where to continue the program's execution after the function call. +- `rsp` - this register is called a `stack pointer` and should point to the top of the stack. After we push something to the stack using the `push` instruction, the stack pointer address decreases. After we pop something from the stack using the `pop` instruction, the stack pointer address increases. +- `rbp` - this register is the so-called `frame pointer` or `base pointer` that points to the stack frame. As mentioned above, each function has its own stack frame, which is a memory area where the function stores [local variables](https://en.wikipedia.org/wiki/Local_variable) and other data. + +Now that we know the rough meaning of the stack frame and the usage of the `rbp`, `rsp`, and `rip` registers, let's try to understand what happens when we call a function. Let's look at the stack before the `call foo` is executed. Our stack looks like this: + +![stack-before-call](./assets/stack-before-call.svg) + +After executing the `call` instruction, the return address (or address of the next instruction) is pushed to the stack. So our stack layout during the call looks like this: + +![stack-during-call](./assets/stack-during-call.svg) + +At the beginning of every new function, we must preserve the current value of the `rbp` register by pushing it onto the stack. This value acts as a base pointer of the previous function. In other words, the value of the `rbp` at the beginning of each function represents the address of the bottom (or the base) of the caller's stack. Since we are in a new function, it needs a new stack frame and, as a result, a new base. At this point, the stack layout looks like this: + +![stack-preserve-bp](./assets/stack-preserve-bp.svg) + +The next step is to put the current value of the stack pointer (`rsp`) into the `rbp` register. This marks the beginning of the new stack frame for our function `foo`. With the stack frame ready, we can start managing function parameters and local variables. + +The first six parameters of the `foo` function are passed using general purpose registers in the function `bar`. The eighth and seventh parameters of `foo` are pushed into the stack using `push` instructions within `bar`. Note that the eighth and seventh arguments are pushed into the stack in a specific order - the value `8` is pushed first, followed by `7`. As mentioned above, the stack operates with the `last in, first out` pattern. So if we use the `pop` instruction right after pushing these two values, we'd get `7` first, followed by `8`. + +For calculations, we need to access the input parameters. This is done using the address stored in the `rbp` register and offsets from this address. These offsets are negative because the stack grows down towards lower memory addresses. At first, we move the value stored in the `edi` register (the first argument of the `foo` function) to the address stored in the `rbp` register with the `-4` bytes offset. Similarly, we move the value stored in the `esi` register (the second argument) to the address stored in the `rbp` register with the `-8` bytes offset. We repeat this operation for all six input arguments. + +Let's read once again the sentence from the paragraph above: + +> For calculations, we need to access the input parameters. This is done using the address stored in the `rbp` register and offsets from this address. + +What was the address stored in the `rbp` register? Our stack pointer! So after running the last `mov` instruction in the function `foo`, our stack frame looks like this: + +![stack](./assets/stack.svg) + +That is the whole point of the `rbp` register. It plays the role of an anchor or a base point in the function. Using the positive offsets, we can access the return address and parameters pushed onto the stack by the caller. On the other hand, negative offsets allow us to access local variables of the current function. + +Right before returning from the `foo` function, we encounter the so-called [function epilogue](https://en.wikipedia.org/wiki/Function_prologue_and_epilogue#Epilogue). During this step, we restore the initial value of `rbp` by removing it from the stack. Finally, the last `ret` instruction pops the return address from the stack, and the program's execution continues from this address. + +## Example + +Now that we've covered the key concepts, it's time to move on to the most interesting part — writing our second simple assembly program. The program will take two integer numbers, calculate their sum, and compare the result with a third predefined number. If the predefined number is equal to the sum, the program will print a message on the screen; otherwise, it will just exit. + +Before diving into the code, we must understand how to execute basic arithmetic and comparison instructions. -## Arithmetic operations +### Basic arithmetic instructions -There is short list of arithmetic instructions: +Here is the list of common assembly instructions used for arithmetic operations: -* `ADD` - integer add -* `SUB` - substract -* `MUL` - unsigned multiply -* `IMUL` - signed multiply -* `DIV` - unsigned divide -* `IDIV` - signed divide -* `INC` - increment -* `DEC` - decrement -* `NEG` - negate +- `ADD` - Addition +- `SUB` - Subtraction +- `MUL` - Unsigned multiplication +- `IMUL` - Signed multiplication +- `DIV` - Unsigned division +- `IDIV` - Signed division +- `INC` - Increment +- `DEC` - Decrement +- `NEG` - Negation -Some of it we will see at practice in this post. Other will be covered in next posts. +We will use these instructions and explain the details in the following example. -## Control flow +### Basic control flow -Usually programming languages have ability to change order of evaluation (with if statement, case statement, goto and etc...) and assembly has it too. Here we will see some of it. There is cmp instruction for performing comparison between two values. It is used along with the conditional jump instruction for decision making. For example: +Now let's take a look at our first [control flow](https://en.wikipedia.org/wiki/Control_flow) instructions. Usually, programming languages allow you to change the order of evaluation (for example with `if`, `case`, or `goto` statements). The assembly programming language also provides the basic ability to change the flow of our programs. The first such instruction is `cmp`. This instruction takes two values and compares them. Usually, it is used along with the conditional jump instruction. For example: ```assembly -;; compare rax with 50 +;; Compare the value of the rax register with 50 cmp rax, 50 ``` -The `cmp` instruction just compares 2 values, but doesn't affect them and doesn't execute anything depend on result of comparison. For performing any actions after comparison there is conditional jump instructions. It can be one of it: +The `cmp` instruction only compares the parameters without affecting their values. To perform any actions after the comparison, you can use **conditional jump instructions**: -* `JE` - if equal -* `JZ` - if zero -* `JNE` - if not equal -* `JNZ` - if not zero -* `JG` - if first operand is greater than second -* `JGE` - if first operand is greater or equal to second -* `JA` - the same that JG, but performs unsigned comparison -* `JAE` - the same that JGE, but performs unsigned comparison +- `JE/JZ` - Jump if the values are equal. +- `JNE/JNZ` - Jump if the values are not equal. +- `JG` - Jump if the first value is greater than the second. +- `JGE` - Jump if the first value is greater or equal to the second. +- `JA` - The same as `JG`, but performs the unsigned comparison. +- `JAE` - The same as `JGE`, but performs the unsigned comparison. -For example if we want to make something like if/else statement in C: +For example, let's take the following `if/else` statement written in C: ```C if (rax != 50) { - exit(); + foo(); } else { - right(); + bar(); } ``` -will be in assembly: +When translated to assembly, this statement looks like this: ```assembly -;; compare rax with 50 +;; Compare rax with 50 cmp rax, 50 -;; perform .exit if rax is not equal 50 -jne .exit -jmp .right +;; Jump to the label `.foo` if the value of the `rax` register is not equal to 50 +jne .foo +;; Jump to the label `.bar` otherwise +jmp .bar ``` -There is also unconditional jump with syntax: +In addition, we can jump to a label without any conditions using the `jmp` instruction: ```assembly -JMP label +jmp .label ``` -For example: +Unconditional jumps are often used to simulate loops in assembly programming. For example, a label is placed at the beginning of a code block. The code runs, and then a condition is checked. If the condition isn't met, the program jumps back to the label and repeats the code. The topic of loops will be covered in more detail in the next posts. -```assembly -_start: - ;; .... - ;; do something and jump to .exit label - ;; .... - jmp .exit +### Program example -.exit: - mov rax, 60 - mov rdi, 0 - syscall -``` - -Here we have can have some code which will be after _start label, and all of this code will be executed, assembly transfer control to .exit label, and code after .exit: will start to execute. - -Often unconditional jump uses in loops. For example we have label and some code after it. This code executes anything, than we have condition and jump to the start of this code if condition is not successfully. Loops will be covered in next parts. - -## Example +Now that we've covered basic arithmetic and control flow instructions, we can write a simple program. As a reminder, this program will calculate the sum of two integers and compare it to a predefined third number. If the sum equals the third number, a string will be printed; otherwise, the program will just exit. -Let's see simple example. It will take two integer numbers, get sum of these numbers and compare it with predefined number. If predefined number is equal to sum, it will print something on the screen, if not - just exit. Here is the source code of our example: +Here is the source code of our example: ```assembly +;; Definition of the .data section section .data - ; Define constants - num1: equ 100 - num2: equ 50 - ; initialize message - msg: db "Sum is correct\n" - + ;; The first number + num1 dq 0x64 + ;; The second number + num2 dq 0x32 + ;; The message to print if the sum is correct + msg db "The sum is correct!", 10 + +;; Definition of the .text section section .text - + ;; Reference to the entry point of our program global _start -;; entry point +;; Entry point _start: - ; set num1's value to rax - mov rax, num1 - ; set num2's value to rbx - mov rbx, num2 - ; get sum of rax and rbx, and store it's value in rax + ;; Set the value of num1 to rax + mov rax, [num1] + ;; Set the value of num2 to rbx + mov rbx, [num2] + ;; Get the sum of rax and rbx. The result is stored in rax. add rax, rbx - ; compare rax and 150 +.compare: + ;; Compare the rax value with 150 cmp rax, 150 - ; go to .exit label if rax and 150 are not equal + ;; Go to the .exit label if the rax value is not 150 jne .exit - ; go to .rightSum label if rax and 150 are equal - jmp .rightSum - -; Print message that sum is correct -.rightSum: - ;; write syscall - mov rax, 1 - ;; file descritor, standard output - mov rdi, 1 - ;; message address - mov rsi, msg - ;; length of message - mov rdx, 15 - ;; call write syscall + ;; Go to the .correctSum label if the rax value is 150 + jmp .correctSum + +;; Print a message that the sum is correct +.correctSum: + ;; Specify the system call number (1 is `sys_write`). + mov rax, 1 + ;; Set the first argument of `sys_write` to 1 (`stdout`). + mov rdi, 1 + ;; Set the second argument of `sys_write` to the reference of the `msg` variable. + mov rsi, msg + ;; Set the third argument to the length of the `msg` variable's value (20 bytes). + mov rdx, 20 + ;; Call the `sys_write` system call. syscall - ; exit from program + ;; Go to the exit of the program. jmp .exit -; exit procedure +;; Exit procedure .exit: - ; exit syscall - mov rax, 60 - ; exit code - mov rdi, 0 - ; call exit syscall + ;; Specify the number of the system call (60 is `sys_exit`). + mov rax, 60 + ;; Set the first argument of `sys_exit` to 0. The 0 status code is success. + mov rdi, 0 + ;; Call the `sys_exit` system call. syscall ``` -Let's go through the source code. First of all there is data section with two constants num1, num2 and variable msg with "Sum is correct\n" value. Now look at 14 line. There is begin of program's entry point. We transfer num1 and num2 values to general purpose registers rax and rbx. Sum it with add instruction. After execution of add instruction, it calculates sum of values from rax and rbx and store it's value to rax. Now we have sum of num1 and num2 in the rax register. +First, let's build our program using the commands we learned in the previous chapter to see the result: + +```bash +$ nasm -f elf64 -o program.o program.asm +$ ld -o program program.o +``` + +After building the program, we can run it with: + +```bash +~$ ./program +Sum is correct +``` + +Now let's go through the source code of our program. First of all, there is the `.data` section with three variables: + +- `num1` +- `num2` +- `msg` + +The entry point of our program is the `_start` symbol. At the beginning of the source code, we put the values of `num1` and `num2` to the general purpose registers `rax` and `rbx`. Then, we use the `add` instruction to calculate their sum, which will be stored in the `rax` register. + +Based on our program's description, the next step is to compare the sum of the two numbers with the predefined value. We use the `cmp` instruction for this. At this point, there are two possible outcomes: + +- If the value in the `rax` register (which stores the sum of `num1` and `num2`) is not equal to 150, we jump to the `.exit` label. +- If the sum is equal to 150, we jump to the `.correctSum` label. -Ok we have num1 which is 100 and num2 which is 50. Our sum must be 150. Let's check it with cmp instruction. After comparison rax and 150 we check result of comparison, if rax and 150 are not equal (checking it with jne) we go to .exit label, if they are equal we go to .rightSum label. +The source code of both `.correctSum` and `.exit` sub-routines should look familiar, as they are similar to what we covered in the previous chapter. The `.correctSum` sub-routine prints the string from `msg` to the screen, while the `.exit` sub-routine ensures a graceful exit from the program. -Now we have two labels: .exit and .rightSum. First is just sets 60 to rax, it is exit system call number, and 0 to rdi, it is a exit code. Second is .rightSum is pretty easy, it just prints Sum is correct. +We’ve just written our second program — great job 🎉 In the next post, we’ll continue exploring assembly programming and dive deeper into the stack topic. If you have any questions or thoughts, feel free to reach out. See you in the next post! diff --git a/content/assets/rax.svg b/content/assets/rax.svg new file mode 100644 index 0000000..b2d6493 --- /dev/null +++ b/content/assets/rax.svg @@ -0,0 +1,4 @@ + + + +

AH

AH

AL

AL

RAX

RAX

EAX

EAX

AX

AX
64 bits
64 bits
32 bits
32 bits
16 bits
16 bits
8
 bits
8...
8
 bits
8...
Text is not SVG - cannot display
\ No newline at end of file diff --git a/content/assets/stack-before-call.svg b/content/assets/stack-before-call.svg new file mode 100644 index 0000000..fbbbc72 --- /dev/null +++ b/content/assets/stack-before-call.svg @@ -0,0 +1,4 @@ + + + +
RBP
RBP
8 (8th foo arg)
8 (8th foo arg)
7 (7th foo arg)
7 (7th foo arg)
RSP
RSP
Stack grows down
Stack grow...
...
...
Text is not SVG - cannot display
\ No newline at end of file diff --git a/content/assets/stack-during-call.svg b/content/assets/stack-during-call.svg new file mode 100644 index 0000000..beb8db9 --- /dev/null +++ b/content/assets/stack-during-call.svg @@ -0,0 +1,4 @@ + + + +
RBP
RBP
8 (8th foo arg)
8 (8th foo arg)
7 (7th foo arg)
7 (7th foo arg)
Return address to bar
Return address to bar
RSP
RSP
Stack grows down
Stack grow...
...
...
Text is not SVG - cannot display
\ No newline at end of file diff --git a/content/assets/stack-preserve-bp.svg b/content/assets/stack-preserve-bp.svg new file mode 100644 index 0000000..7810538 --- /dev/null +++ b/content/assets/stack-preserve-bp.svg @@ -0,0 +1,4 @@ + + + +
RBP
RBP
8 (8th foo arg)
8 (8th foo arg)
7 (7th foo arg)
7 (7th foo arg)
Return address to bar
Return address to bar
Saved RBP
Saved RBP
RSP
RSP
Stack grows down
Stack grow...
...
...
Text is not SVG - cannot display
\ No newline at end of file diff --git a/content/assets/stack.svg b/content/assets/stack.svg new file mode 100644 index 0000000..ddcfdf0 --- /dev/null +++ b/content/assets/stack.svg @@ -0,0 +1,4 @@ + + + +
Saved RBP
Saved RBP
8 (8th foo arg)
8 (8th foo arg)
7 (7th foo arg)
7 (7th foo arg)
Return address to bar
Return address to bar
Saved RBP
Saved RBP
Parameter 1 of foo(). 
RBP - 4
Parameter 1 of foo()....
Parameter 2 of foo().
RBP - 8
Parameter 2 of foo()....
Parameter 6 of foo().
RBP - 24
Parameter 6 of foo()....
Parameter 3 of foo().
RBP - 12
Parameter 3 of foo()....
Parameter 4 of foo().
RBP - 16
Parameter 4 of foo()....
Parameter 5 of foo().
RBP - 20
Parameter 5 of foo()....
RSP
RSP
Stack grows down
Stack grow...
...
...
Text is not SVG - cannot display
\ No newline at end of file diff --git a/hello/hello.asm b/hello/hello.asm index e7f90af..3890f3f 100644 --- a/hello/hello.asm +++ b/hello/hello.asm @@ -22,7 +22,7 @@ _start: syscall ;; Specify the number of the system call (60 is `sys_exit`). mov rax, 60 - ;; Set the first argument of `sys_exit` to `0`. The `0` status code is success. + ;; Set the first argument of `sys_exit` to 0. The 0 status code is success. mov rdi, 0 ;; Call the `sys_exit` system call. syscall diff --git a/sum/README.md b/sum/README.md index 1b7c3c7..ad975ed 100644 --- a/sum/README.md +++ b/sum/README.md @@ -1,8 +1,11 @@ -This is "sum of 2 numbers" with assembly for Linux x86-64 +# Sum of numbers -Build it with: `make` +This is a basic program that sums two integer numbers and checks if it is correct. If the sum is correct, it prints the message; otherwise, it just exits. -More details at - [Say hello to x64 Assembly [part 2]](https://github.com/0xAX/asm/blob/master/content/asm_2.md) +To build the program, run: -[@0xAX](https://x.com/0xAX) +```bash +make +``` +For more details, read [Part 2. The `x86_64` concepts](https://github.com/0xAX/asm/blob/master/content/asm_2.md). diff --git a/sum/sum.asm b/sum/sum.asm index 77d2c0b..861ee84 100644 --- a/sum/sum.asm +++ b/sum/sum.asm @@ -1,47 +1,53 @@ -;initialised data section +;; Definition of the .data section section .data - ; Define constants - num1: equ 100 - num2: equ 50 - ; http://geekswithblogs.net/MarkPearl/archive/2011/04/13/more-nasm.aspx - msg: db "Sum is correct", 10 + ;; The first number + num1 dq 0x64 + ;; The second number + num2 dq 0x32 + ;; The message to print if the sum is correct + msg db "The sum is correct!", 10 -;; -;; program code -;; +;; Definition of the .text section section .text - global _start + ;; Reference to the entry point of our program + global _start -;; entry point +;; Entry point _start: - ; get sum of num1 and num2 - mov rax, num1 - mov rbx, num2 - add rax, rbx - ; compare rax with correct sum - 150 - cmp rax, 150 - ; if rax is not 150 go to exit - jne .exit - ; if rax is 150 print msg - jmp .rightSum + ;; Set the value of num1 to rax + mov rax, [num1] + ;; Set the value of num2 to rbx + mov rbx, [num2] + ;; Get the sum of rax and rbx. The result is stored in rax. + add rax, rbx +.compare: + ;; Compare the rax value with 150 + cmp rax, 150 + ;; Go to the .exit label if the rax value is not 150 + jne .exit + ;; Go to the .correctSum label if the rax value is 150 + jmp .correctSum -; Print message that sum is correct -.rightSum: - ;; write syscall - mov rax, 1 - ;; file descritor, standard output - mov rdi, 1 - ;; message address - mov rsi, msg - ;; length of message - mov rdx, 15 - ;; call write syscall - syscall - ; exit from program - jmp .exit +;; Print a message that the sum is correct +.correctSum: + ;; Specify the system call number (1 is `sys_write`). + mov rax, 1 + ;; Set the first argument of `sys_write` to 1 (`stdout`). + mov rdi, 1 + ;; Set the second argument of `sys_write` to the reference of the `msg` variable. + mov rsi, msg + ;; Set the third argument to the length of the `msg` variable's value (20 bytes). + mov rdx, 20 + ;; Call the `sys_write` system call. + syscall + ;; Go to the exit of the program. + jmp .exit -; exit procedure +;; Exit procedure .exit: - mov rax, 60 - mov rdi, 0 - syscall \ No newline at end of file + ;; Specify the number of the system call (60 is `sys_exit`). + mov rax, 60 + ;; Set the first argument of `sys_exit` to 0. The 0 status code is success. + mov rdi, 0 + ;; Call the `sys_exit` system call. + syscall