PC architecture
x86 instruction set
gcc calling conventions
PC architecture
CPU runs instructions in a loop:
    fetch and run next instruction
Needs work space: registers
More work space: memory
CPU sends out address on address lines (wires, one bit per wire)
Data comes back on data lines after a fashion
or data is written to data lines
Add more registers: pointers into memory
SP - stack pointer
BP - frame base pointer
SI - source index
DI - destination index
Only a 16-bit machine, but want more than 64kB of memory: segment registers
CS - code segment
DS - data segment
SS - stack segment
ES, FS, GS - extra segments
seg:off means physical address seg*16+off
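A minimal sketch of the seg:off calculation above, in C. The function name phys_addr is made up for illustration.

```c
#include <stdint.h>

/* Real-mode address translation: physical = seg * 16 + off.
 * Both inputs are 16 bits wide; the sum can need up to 21 bits,
 * so the result is returned in a 32-bit integer. */
uint32_t phys_addr(uint16_t seg, uint16_t off)
{
    return ((uint32_t)seg << 4) + off;
}
```

Different seg:off pairs can name the same byte: 0x07C0:0x0000 and 0x0000:0x7C00 both come out as physical address 0x7C00.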
Instructions are in memory too!
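IP - instruction pointer (called PC on the PDP-11 and everything else)
incremented after running each instruction
can be modified by CALL, RET, JMP, conditional jumps
Want conditional jumps: FLAGS register holds condition codes (zero, carry, sign, overflow) set by the last arithmetic operation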
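The fetch-execute loop, the instruction pointer, and a conditional jump can be sketched as a toy interpreter. Everything here is hypothetical: the opcodes, the two-register machine, and the names toy_cpu, toy_run and toy_demo are invented for illustration and are not x86 encodings. Register numbers are not range-checked.

```c
#include <stdint.h>

/* Hypothetical opcodes invented for this sketch; these are not x86 encodings. */
enum { OP_HALT, OP_LOADI, OP_ADD, OP_DEC, OP_JNZ };

struct toy_cpu {
    uint32_t ip;     /* instruction pointer: offset of the next byte to fetch */
    int32_t  r[2];   /* two work registers */
    int      zf;     /* zero flag, set by OP_DEC */
};

/* Fetch-execute loop: runs until OP_HALT (or an unknown opcode). */
void toy_run(struct toy_cpu *cpu, const uint8_t *mem)
{
    for (;;) {
        uint8_t op = mem[cpu->ip++];    /* fetch; IP moves past the opcode */
        uint8_t a, b;
        switch (op) {
        case OP_LOADI:                  /* LOADI reg, imm8 */
            a = mem[cpu->ip++];
            b = mem[cpu->ip++];
            cpu->r[a] = b;
            break;
        case OP_ADD:                    /* ADD dst, src */
            a = mem[cpu->ip++];
            b = mem[cpu->ip++];
            cpu->r[a] += cpu->r[b];
            break;
        case OP_DEC:                    /* DEC reg; updates the zero flag */
            a = mem[cpu->ip++];
            cpu->r[a]--;
            cpu->zf = (cpu->r[a] == 0);
            break;
        case OP_JNZ:                    /* JNZ target: taken if ZF is clear */
            a = mem[cpu->ip++];
            if (!cpu->zf)
                cpu->ip = a;            /* a jump simply overwrites IP */
            break;
        default:                        /* OP_HALT or unknown opcode */
            return;
        }
    }
}

/* Sample program: r1 = 3 + 2 + 1, using a countdown loop on r0. */
int32_t toy_demo(void)
{
    static const uint8_t prog[] = {
        OP_LOADI, 0, 3,    /* offset 0:  r0 = 3          */
        OP_LOADI, 1, 0,    /* offset 3:  r1 = 0          */
        OP_ADD,   1, 0,    /* offset 6:  r1 += r0        */
        OP_DEC,   0,       /* offset 9:  r0--, set ZF    */
        OP_JNZ,   6,       /* offset 11: loop if r0 != 0 */
        OP_HALT            /* offset 13                  */
    };
    struct toy_cpu cpu = {0};
    toy_run(&cpu, prog);
    return cpu.r[1];
}
```

Note that the program and the CPU's work space are both just bytes; the only thing that makes the array "code" is that IP points into it.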
Still not interesting - need I/O to interact with outside world
works like a memory access, but the CPU sets a separate I/O signal
only 1024 I/O addresses
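    /* Write a byte to the parallel port (LPT1, I/O base 0x378).
     * inb/outb are the usual port-I/O helpers; the name lptputc is illustrative. */
    enum {
        Data   = 0x378+0,
        Status = 0x378+1,
          Notbusy = 0x80,
        Ctl    = 0x378+2,
          Strobe = 0x01,
    };
    void
    lptputc(int c)
    {
        /* wait until the printer has consumed the previous byte */
        while((inb(Status)&Notbusy) == 0)
            ;
        outb(Data, c);        /* put the byte on the data lines */
        outb(Ctl, Strobe);    /* pulse strobe so the printer reads it */
        outb(Ctl, 0);
    }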
- Only 1024 I/O addresses, so use memory-mapped I/O (MMIO)
- use normal memory addresses
- no need for special instructions
- "magic" memory
- system controller routes to appropriate device
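A rough sketch of what "magic" memory looks like from C. It assumes the PC text-mode display as the device (not mentioned in these notes): an 80-column buffer of 16-bit cells, character in the low byte and attribute in the high byte, conventionally at physical address 0xB8000. The name mmio_putc and the constant TEXT_COLS are made up here.

```c
#include <stdint.h>

#define TEXT_COLS 80   /* assumed 80-column text mode */

/* Store one character cell. 'buf' stands in for the device's memory window;
 * volatile tells the compiler that every store must really happen. */
void mmio_putc(volatile uint16_t *buf, int row, int col, char c, uint8_t attr)
{
    buf[row * TEXT_COLS + col] = (uint16_t)((attr << 8) | (uint8_t)c);
}
```

In a kernel with that region mapped, buf would be something like (volatile uint16_t *)0xB8000. In an ordinary user program, pass a plain array instead; the store instruction is the same either way, which is the point of MMIO.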
x86 Instruction Set
- Two-operand instruction set
- Intel: op dst, src
- AT&T (gcc/gas): op src, dst
- uses b, w, l suffix on instructions to specify size of operands
- Operands are registers, constant, memory via register, memory via constant
    AT&T                    "C"
    movl %eax, %edx         edx = eax;
    movl $0x123, %edx       edx = 0x123;
    movl (%ebx), %edx       edx = mem[ebx];
    movl 4(%ebx), %edx      edx = mem[ebx+4];
    movl 0x123, %edx        edx = mem[0x123];
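To make the mem[...] notation above concrete: a sketch of what a 32-bit load ("l" suffix) fetches from byte-addressed, little-endian memory. The helper name load32 and the idea of modelling memory as a C byte array are assumptions for illustration.

```c
#include <stdint.h>

/* Model of "movl disp(%reg), %dst": read 4 bytes starting at addr,
 * least significant byte first (x86 is little-endian). */
uint32_t load32(const uint8_t *mem, uint32_t addr)
{
    return (uint32_t)mem[addr]
         | (uint32_t)mem[addr + 1] << 8
         | (uint32_t)mem[addr + 2] << 16
         | (uint32_t)mem[addr + 3] << 24;
}
```

With ebx holding an address, load32(mem, ebx) corresponds to movl (%ebx), %edx and load32(mem, ebx + 4) to movl 4(%ebx), %edx.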
- Instruction classes
- data movement: MOV, PUSH, POP, ...
- arithmetic: TEST, SHL, ADD, AND, ...
- i/o: IN, OUT, ...
- control: JMP, JZ, JNZ, CALL, RET
- string: REP MOVSB, ...
- system: IRET, INT
- Intel architecture manual Volume 2 is the reference
gcc x86 calling conventions
- x86 dictates that stack grows down:
    pushl %eax        =  subl $4, %esp
                         movl %eax, (%esp)
    popl %eax         =  movl (%esp), %eax
                         addl $4, %esp
    call $0x12345     =  pushl %eip              (*)
                         movl $0x12345, %eip     (*)
    ret               =  popl %eip               (*)
    (*) not real instructions: %eip cannot be named as an operand
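The equivalences above can be sketched as a small simulation. The struct sim and the sim_* functions are hypothetical names for this sketch; eip is assumed to already point at the instruction after the call, and there are no bounds checks on the tiny memory.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical miniature machine state for this sketch. */
struct sim {
    uint32_t eip;       /* assumed to already point past the current instruction */
    uint32_t esp;       /* byte offset into mem; the stack grows toward 0 */
    uint8_t  mem[64];
};

void sim_push(struct sim *m, uint32_t v)
{
    m->esp -= 4;                        /* subl $4, %esp  */
    memcpy(&m->mem[m->esp], &v, 4);     /* movl v, (%esp) */
}

uint32_t sim_pop(struct sim *m)
{
    uint32_t v;
    memcpy(&v, &m->mem[m->esp], 4);     /* movl (%esp), v */
    m->esp += 4;                        /* addl $4, %esp  */
    return v;
}

void sim_call(struct sim *m, uint32_t target)
{
    sim_push(m, m->eip);                /* save the return address */
    m->eip = target;                    /* continue at the callee */
}

void sim_ret(struct sim *m)
{
    m->eip = sim_pop(m);                /* resume after the call */
}
```

Starting with esp at the top of mem, a push followed by a call leaves the return address at (%esp) and the pushed value at 4(%esp), which is the layout the calling convention relies on.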
GCC dictates the rest. Contract between caller and callee on x86:
- after call instruction:
- %eip points at first instruction of function
- %esp+4 points at arguments
- %esp points at return address
- after ret instruction:
- %eip contains return address
- %esp points at arguments
- called function may have trashed arguments
- %eax contains return value
- %ecx, %edx may be trashed
- %ebp, %ebx, %esi, %edi must contain contents from time of call
- Terminology:
- %eax, %ecx, %edx are "caller save" registers
- %ebp, %ebx, %esi, %edi are "callee save" registers
A function can do anything that doesn't violate the contract. By convention, GCC does more:
- each function has a stack frame marked by %ebp, %esp
                (higher addresses)
                | arg 2      |
                | arg 1      |
                | ret %eip   |
        %ebp->  | saved %ebp |
                |            |
                | local      |
                | variables, |
                | etc.       |
                |            |
        %esp->  |            |
                (lower addresses)
%esp can move to make stack frame bigger, smaller
%ebp points at saved %ebp from previous function, chain to walk stack
- function prologue:
    pushl %ebp
    movl %esp, %ebp
- function epilogue:
    movl %ebp, %esp
    popl %ebp
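Because every prologue stores the previous %ebp at (%ebp), with the return address just above it at 4(%ebp), the frames form a linked list. A sketch of walking that chain over a simulated stack follows. The function name walk_frames is made up, and ending the chain with a saved %ebp of 0 is an assumption (something the startup code would have to arrange), not part of the convention described here.

```c
#include <stddef.h>
#include <stdint.h>

/* Walk saved-%ebp links in a simulated stack.
 * 'mem' is memory as 32-bit words; 'ebp' is a byte address into it,
 * assumed to be 4-byte aligned. Collects up to 'max' return addresses,
 * innermost frame first, and returns how many were found. */
size_t walk_frames(const uint32_t *mem, uint32_t ebp,
                   uint32_t *ret_addrs, size_t max)
{
    size_t n = 0;
    while (ebp != 0 && n < max) {
        ret_addrs[n++] = mem[ebp / 4 + 1];   /* return %eip is at 4(%ebp) */
        ebp = mem[ebp / 4];                  /* saved %ebp is at 0(%ebp)  */
    }
    return n;
}
```

This is the same loop a debugger's backtrace would run, except that it reads real memory and has to guard against a corrupted chain.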