- Arduino_asm
- Summary
- Register Usage in AVR GCC Calling Convention
- 1. Argument Passing (Registers Used for Function Parameters)
- 2. Registers That Can Be Freely Changed (Caller-Saved)
- 3. Registers That Must Be Preserved (Callee-Saved)
- 4. Special Registers
- Push/Pop Preservation
- What Should Not Be Used?
- Example: Calling an Assembly Function from C++
- Fill RAM on initialisation
- lo8(name), pm_lo8(name), lo8(gs( name ))
- Macros and the assembler
- ATmega2560 and EXTMEM (~64 kB RAM)
- Registers in C and in FORTH
- ATmega stacks
- Casting a function address to a word address
- C - unused parameter
avr-gcc -mmcu=atmega328p -Os -S demo.c -o demo.Os.s
Arduino_asm
http://eleccelerator.com/fusecalc/fusecalc.php?chip=atmega328p
https://gcc.gnu.org/wiki/avr-gcc#Register_Layout
In short:
- 1-byte arguments are treated as 2-byte ones (sizes are rounded up to an even number of bytes)
- arguments are passed in registers, with the lowest byte in the lowest-numbered register
- R26 is not used; everything is stacked up BELOW it
- nothing goes below R8: if an argument would have to, it and everything after it goes on the stack
- R8-R17 are used for arguments, but must hold the same value on return
Summary
| Register | Purpose | Caller-Saved? |
|---|---|---|
| R0 | Temporary register; also receives the low byte of `mul` results | Yes |
| R1 | Must always be zero before returning | Yes (restore to 0) |
| R2-R17 | Callee-saved (preserve if modified) | No |
| R18-R25 | Temporary variables, can be freely changed | Yes |
| R26-R27 (X) | General indirect addressing in SRAM | Yes |
| R28-R29 (Y) | Frame pointer (stack-related) | No |
| R30-R31 (Z) | Used for indirect addressing (e.g., LD, ST) & reading from Flash memory | Yes |
When programming the ATmega328P (used in Arduino Uno) in assembly, especially when interfacing with C++ functions, you need to follow the AVR GCC calling convention.
Register Usage in AVR GCC Calling Convention
avr-gcc splits the registers into two groups: call-clobbered (caller-saved) registers, which a called function may overwrite freely, and call-saved (callee-saved) registers, which a function must restore before it returns.
1. Argument Passing (Registers Used for Function Parameters)
- The first two arguments (for 8-bit values) are passed in R24 and R22.
- For 16-bit values:
  - First argument: R24-R25
  - Second argument: R22-R23
- For 32-bit values:
  - First argument: R22-R25
  - Second argument: R18-R21
- For pointer values (16-bit): Passed in R24-R25
- For return values:
  - 8-bit: R24
  - 16-bit: R24-R25
  - 32-bit: R22-R25
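The allocation rule behind these register assignments can be sketched in portable C. This is a minimal model of the register layout described on the avr-gcc wiki page linked above, not compiler code: the function name `avr_assign_arg_regs` is made up for illustration, and varargs functions (whose arguments all go on the stack) are not modelled.

```c
#include <stddef.h>

/* For each argument size (in bytes), store the number of the register that
   holds its lowest byte, or -1 if the argument is passed on the stack. */
void avr_assign_arg_regs(const size_t *sizes, size_t n, int *low_reg)
{
    int rn = 26;          /* allocation starts just above R25 */
    int on_stack = 0;     /* once set, all later arguments go to the stack */

    for (size_t i = 0; i < n; i++) {
        size_t rounded = (sizes[i] + 1u) & ~(size_t)1;   /* round up to even */

        if (!on_stack && sizes[i] != 0 && rounded <= (size_t)(rn - 8)) {
            rn -= (int)rounded;       /* the argument occupies rn .. rn+size-1 */
            low_reg[i] = rn;
        } else {
            on_stack = 1;
            low_reg[i] = -1;
        }
    }
}
```

Feeding it the sizes 1, 2 and 4 gives R24, R22 and R18 as the low registers, which matches the list above.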
2. Registers That Can Be Freely Changed (Caller-Saved)
- R18-R25
- R26-R27 (X-register, used for indirect addressing)
- R30-R31 (Z-register, used for indirect addressing)
- R0 (temporary register; unlike R1 it does not have to be zero afterwards)
The caller must save these registers itself if it still needs their values after calling a function.
3. Registers That Must Be Preserved (Callee-Saved)
These registers must not be changed by the assembly function unless they are saved/restored (via push/pop or store/load on the stack):
- R2-R17
- R28-R29 (Y-register, used for the frame pointer)
4. Special Registers
- R1: Always assumed to be zero. If modified, it must be cleared before returning.
- R0: Temporary register. It can be modified freely, but any call or compiler-generated code may overwrite it, and `mul` writes its result to R1:R0.
- R30-R31: Often used for pointer arithmetic (Z-register).
Push/Pop Preservation
If your assembly function uses callee-saved registers, you must push them to the stack and restore them before returning. Example:
.global my_asm_function
my_asm_function:
    push r2      ; Save registers that should be preserved
    push r3
    push r4
    ...          ; Function logic
    ...
    pop r4       ; Restore registers before returning
    pop r3
    pop r2
    ret
What Should Not Be Used?
- Do not modify R1 without restoring it to zero (`clr r1`).
- Avoid corrupting callee-saved registers without restoring them (R2-R17, R28-R29).
- Use stack (`push`/`pop`) for temporary storage if you need extra registers.
Example: Calling an Assembly Function from C++
C++ Function Prototype (Arduino Sketch)
extern "C" uint8_t my_asm_function(uint8_t x);

void setup() {
    Serial.begin(9600);
    uint8_t result = my_asm_function(42);
    Serial.println(result);
}

void loop() {}
Assembly Code (`file.S`)
.global my_asm_function
my_asm_function:
    ; Argument comes in R24
    mov r18, r24   ; Save input value in R18
    lsl r18        ; Example operation: Multiply by 2
    mov r24, r18   ; Store result in return register
    ret            ; Return value is in R24
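For host-side testing it can help to have a plain C model of what the assembly routine computes. The sketch below is such a model; the name `my_asm_function_ref` is hypothetical and not part of the original sketch.

```c
#include <stdint.h>

/* C model of my_asm_function: LSL shifts the 8-bit register left by one,
   so bit 7 falls out and the result wraps modulo 256. */
uint8_t my_asm_function_ref(uint8_t x)
{
    return (uint8_t)(x << 1);
}
```

For the value used in the sketch, `my_asm_function_ref(42)` yields 84; inputs of 128 and above wrap around because the return value is only 8 bits wide.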
ldi ZL,pm_lo8(foo)
ldi ZH,pm_hi8(foo)
ijmp
; indirect jmp
ldi zl,low(CommandTable) ;Z points to table
ldi zh,high(CommandTable)
add zl,Mode ;add mode as an offset
brcc DoS1
inc zh ;account for a carry
DoS1:
ijmp ;jump to the command to run
CommandTable: ;table of commands to run from menus
rjmp MainLoop ;blank entry
rjmp DoLearnMode ;learn the maze
rjmp DoSolveMode ;solve the maze
rjmp DoSolveFast ;solve the maze fast
rjmp DoSingleStep ;single step debug
rjmp DoSensors ;display sensor readings
rjmp DoShowCells ;display cell data
rjmp MainLoop ;blank entry
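The same dispatch idea, expressed in C, is an array of function pointers indexed by the mode. The following is only a sketch: the handler names mirror some of the labels in the table, but their bodies are placeholders that return an ID, and the range check is an addition (the assembly version trusts `Mode` to be a valid index).

```c
#include <stddef.h>

typedef int (*command_fn)(void);

/* Hypothetical handlers; each returns an ID so the dispatch can be checked. */
static int main_loop(void)     { return 0; }
static int do_learn_mode(void) { return 1; }
static int do_solve_mode(void) { return 2; }
static int do_solve_fast(void) { return 3; }

/* Counterpart of CommandTable: one entry per mode. */
static const command_fn command_table[] = {
    main_loop,        /* blank entry */
    do_learn_mode,
    do_solve_mode,
    do_solve_fast,
};

int run_command(size_t mode)
{
    size_t count = sizeof command_table / sizeof command_table[0];

    if (mode >= count)
        mode = 0;     /* out of range: fall back to the blank entry */
    return command_table[mode]();
}
```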
Fill RAM on initialisation
003_fill_RAM.S (or any other name)
; .init0 first part on boot, no stack
; .init1
; .init2 SP defined
; .init3 <======================== HERE we are
; .init4 fill .data from FLASH, fill .bss with 0
; .init5
; .init6 C++ static constructors
; .init7
; .init8
; .init9 call main
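.section .init3
.global fill_ram_pattern
fill_ram_pattern:
    ldi r30, lo8(__heap_start)   ; Z = first byte after .bss
    ldi r31, hi8(__heap_start)
    ldi r26, lo8(__stack)        ; X = initial stack pointer
    ldi r27, hi8(__stack)
    ldi r24, 0xA5                ; fill pattern
1:
    cp r30, r26                  ; stop once Z reaches X
    cpc r31, r27
    brsh 2f
    st Z+, r24                   ; write pattern, advance Z
    rjmp 1b
2:                               ; no ret: execution falls through into the next .init section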
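The notes do not say what the pattern is used for. One common use of such a fill is to check later how much of the region was never written, for example to gauge how far the stack has grown down towards the heap. The sketch below assumes that use; `count_untouched` is a made-up helper, and on the target it would be called with the addresses of `__heap_start` and `__stack`.

```c
#include <stddef.h>
#include <stdint.h>

/* Count how many consecutive bytes, starting at 'start' and stopping before
   'end', still hold the fill pattern. */
size_t count_untouched(const uint8_t *start, const uint8_t *end, uint8_t pattern)
{
    size_t n = 0;

    while (start < end && *start == pattern) {
        start++;
        n++;
    }
    return n;
}
```

Because the stack grows down from the top of RAM, the bytes that still carry 0xA5 sit at the low end of the filled region, which is why the scan runs upwards from the start.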
lo8(name), pm_lo8(name), lo8(gs( name ))
gcc generates trampolines (stubs) for functions that lie too high in flash, or when they are asked for with gs()
some_fn:
ldi r16,lo8(some_fn) ; Byte-address, low byte
ldi r17,hi8(some_fn) ; Byte-address, high byte
ldi r18,hh8(some_fn) ; Byte-address, highest byte
ldi r19,pm_lo8(some_fn) ; Word-address, low byte
ldi r20,pm_hi8(some_fn) ; Word-address, high byte
ldi r21,pm_hh8(some_fn) ; Word-address, highest byte
ldi r22,lo8(gs(some_fn)) ; Word-address where the linker will
ldi r23,hi8(gs(some_fn)) ; generate a stub as needed.
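The arithmetic behind these modifiers can be written out in C. This is a sketch with made-up helper names (`avr_lo8`, `avr_pm` and so on); it only models the byte extraction and the byte-to-word conversion, not `gs()`, where the linker decides whether a stub is needed.

```c
#include <stdint.h>

/* lo8 / hi8 / hh8: bytes 0, 1 and 2 of an address. */
uint8_t avr_lo8(uint32_t a) { return (uint8_t)(a & 0xFF); }
uint8_t avr_hi8(uint32_t a) { return (uint8_t)((a >> 8) & 0xFF); }
uint8_t avr_hh8(uint32_t a) { return (uint8_t)((a >> 16) & 0xFF); }

/* pm_*: flash is addressed in 16-bit words by IJMP/ICALL, so the byte
   address is halved before the bytes are taken. */
uint32_t avr_pm(uint32_t byte_address) { return byte_address >> 1; }

uint8_t avr_pm_lo8(uint32_t a) { return avr_lo8(avr_pm(a)); }
uint8_t avr_pm_hi8(uint32_t a) { return avr_hi8(avr_pm(a)); }
uint8_t avr_pm_hh8(uint32_t a) { return avr_hh8(avr_pm(a)); }
```

For a function at byte address 0x01F234 the word address is 0x00F91A, so it still fits in the 16-bit Z register even though the byte address does not.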
How to tell gcc that a symbol is a function:
.global some_fn
.type some_fn,@function
What also works:
subi r30,lo8(-(VRAM))
sbci r31,hi8(-(VRAM)) ; add an address by subtracting its negative
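Why subtracting the negated constant adds it can be checked with a byte-wise model in C. The sketch below imitates the `subi`/`sbci` pair on a 16-bit value; the function name `add_const_via_sub` is an assumption made for this example.

```c
#include <stdint.h>

/* Add k to z the way the subi/sbci pair does: subtract the 16-bit
   two's complement of k, low byte first, then high byte with borrow. */
uint16_t add_const_via_sub(uint16_t z, uint16_t k)
{
    uint16_t neg = (uint16_t)(0u - k);            /* -(K), folded by the assembler */
    uint8_t nlo = (uint8_t)(neg & 0xFF);
    uint8_t nhi = (uint8_t)(neg >> 8);
    uint8_t zlo = (uint8_t)(z & 0xFF);
    uint8_t zhi = (uint8_t)(z >> 8);

    uint8_t borrow = (uint8_t)(zlo < nlo);        /* carry flag after subi */
    uint8_t rlo = (uint8_t)(zlo - nlo);           /* subi r30, lo8(-(K)) */
    uint8_t rhi = (uint8_t)(zhi - nhi - borrow);  /* sbci r31, hi8(-(K)) */

    return (uint16_t)(((uint16_t)rhi << 8) | rlo);
}
```

The result equals `z + k` modulo 65536, which is what makes the trick a usable replacement for an add-immediate on a register pair.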
See:
Macros and the assembler
- A macro can have default values for any of its parameters: `.macro DEFWORD lbl, attr, name, codeword, final_data_label="none"`
- if a parameter is omitted, its default value is used
- in the body, parameters are referenced with a leading backslash (`\lbl`); to glue a parameter to the text that follows, use `\()`, which is an empty string: `\lbl\()_cw:`
- inside strings, parameters are used the same way (`.ascii "\name"`), BUT the assembler first tries to read them as escape sequences and complains if it does not know them
- so macro parameters used inside strings have to start with one of the letters n, r, t, a, b, f, v (yes, it is as dumb as it sounds)
- for a variable not to be listed as code, it must be declared as data: `myLabel:` plus `.type myLabel,@object` (the numeric local labels 1..9 cannot be declared this way, so they switch the listing back to code; inside a macro it is therefore a good idea to put some other label after them)
- `.byte strlen(name)` simply cannot be done without labels
ATmega2560 and EXTMEM (~64 kB RAM)
Linker script
MEMORY
{
text (rx) : ORIGIN = 0x000000, LENGTH = 256K
data (rw!x): ORIGIN = 0x800200, LENGTH = 8K
extmem (rw!x): ORIGIN = 0x802200, LENGTH = 0xDE00 /* 64K - 8k (RAM) - 0x200 (I/O) */
}
SECTION:
.extmem (NOLOAD) :
{
__extmem_start = .;
*(.videoram)
*(.extmem)
*(.extmem*)
__extmem_end = .; /* not needed */
__forth_heap_start = .;
} > extmem
At the end, outside the sections, I can put symbols that I define myself (but then I should not also define them inside the sections):
/* === external RAM === */
__extmem_start = ORIGIN(extmem); /* 0x802200 */
__extmem_end = ORIGIN(extmem) + LENGTH(extmem) - 1; /* 0x80FFFF */
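The numbers in the MEMORY block can be cross-checked with a few lines of C. This is a sketch: the constants are copied from the linker script above, while the helper names `region_last` and `cpu_data_address` are made up. It relies on the fact that the AVR linker places the data address space at offset 0x800000, so the CPU sees a linker address with that offset removed.

```c
#include <stdint.h>

#define AVR_DATA_OFFSET 0x800000UL  /* data space offset used by the AVR linker */
#define DATA_ORIGIN     0x800200UL
#define DATA_LENGTH     0x2000UL    /* 8K internal SRAM */
#define EXTMEM_ORIGIN   0x802200UL
#define EXTMEM_LENGTH   0xDE00UL

/* Last valid linker address of a region. */
uint32_t region_last(uint32_t origin, uint32_t length)
{
    return origin + length - 1u;
}

/* Address as the CPU sees it in LD/ST instructions. */
uint16_t cpu_data_address(uint32_t linker_address)
{
    return (uint16_t)(linker_address - AVR_DATA_OFFSET);
}
```

With these values the internal RAM ends at 0x8021FF, extmem starts directly after it, and the last extmem byte is CPU address 0xFFFF.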
In C/C++ we can place selected variables in extmem, e.g.
#include <stdint.h>
__attribute__((section(".videoram")))
volatile uint8_t videoram[8192];
__attribute__((section(".extmem")))
uint16_t video_cursor_x;
__attribute__((section(".extmem")))
uint16_t video_cursor_y;
...
extern uint8_t __forth_heap_start;
uint8_t *HERE = &__forth_heap_start;
extern uint8_t __extmem_end;
uint8_t *HERE_LIMIT = &__extmem_end;
...
extern uint8_t __extmem_end;
uint8_t *extmem_end = &__extmem_end;
ASM:
ldi ZL, lo8(__extmem_end)
ldi ZH, hi8(__extmem_end)
and HERE points to the start of free RAM (which extends up to 0xFFFF)
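How HERE and its limit might be used can be sketched as a small bump allocator. Everything here is hypothetical (the type `forth_heap` and the functions `heap_init` and `heap_allot` are not from the notes), and it assumes the limit is the last valid byte, as in the `ORIGIN + LENGTH - 1` definition. A byte count is tracked instead of an end pointer because one past 0xFFFF would wrap around in a 16-bit pointer.

```c
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint8_t *here;       /* next free byte (FORTH HERE) */
    size_t   remaining;  /* bytes left up to and including the limit */
} forth_heap;

/* 'last' is the last valid byte of the region (inclusive). */
void heap_init(forth_heap *h, uint8_t *start, uint8_t *last)
{
    h->here = start;
    h->remaining = (size_t)(last - start) + 1u;
}

/* Reserve n bytes; returns the old HERE, or NULL if the region is full. */
uint8_t *heap_allot(forth_heap *h, size_t n)
{
    if (n > h->remaining)
        return NULL;

    uint8_t *p = h->here;
    h->here += n;
    h->remaining -= n;
    return p;
}
```

On the target, `heap_init` would be given `&__forth_heap_start` and `&__extmem_end`; on a host it can be exercised with any byte array.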
Registers in C and in FORTH
| C/C++ reg | C/C++ preserve | C/C++ means | FORTH reg | FORTH preserve | FORTH means | Comment |
|---|---|---|---|---|---|---|
| r0 | No | | r0 | No | Ty | Another scratch register, 1 B |
| r1 | No/0 | zero | r1 | No/0 | zero | Should be set to zero on exit from any C / FORTH routine |
| r2 | Yes | | r2 | Yes | RST 0 | Pointer to Return STack, 2 bytes - native RAM only |
| r3 | Yes | | r3 | Yes | RST 1 | ditto |
| r4 | Yes | | r4 | Yes | TOS 0 | Top of Data STack value (for faster access) |
| r5 | Yes | | r5 | Yes | TOS 1 | ditto |
| r6 | Yes | | r6 | Yes | TOS 2 | ditto |
| r7 | Yes | | r7 | Yes | DT 2 | W = DT pointer - generated for each function again (= may be used as any other temp) |
| r8 | Yes | par 9 | r8 | Yes | IP 0 | IP Instruction pointer |
| r9 | Yes | par 9 | r9 | Yes | IP 1 | ditto |
| r10 | Yes | par 8 | r10 | Yes | IP 2 | ditto |
| r11 | Yes | par 8 | r11 | Yes | | Free for later use |
| r12 | Yes | par 7 | r12 | Yes | TCB 0 | Thread_Controll_Block - also User - in RAM |
| r13 | Yes | par 7 | r13 | Yes | TCB 1 | ditto |
| r14 | Yes | par 6 | r14 | Yes | LST 0 | LST LOOP fix stack, secondary stack for data >L L> |
| r15 | Yes | par 6 | r15 | Yes | LST 1 | ditto |
| r16 | Yes | par 5 | r16 | Yes | | Free for later use |
| r17 | Yes | par 5 | r17 | Yes | | |
| r18 | No | par 4 | r18 | No | Temp 0 | Scratch register (3 bytes), used in Ldi*, freely destroyed anywhere |
| r19 | No | par 4 | r19 | No | Temp 1 | ditto |
| r20 | No | par 3 | r20 | No | Temp 2 | ditto |
| r21 | No | par 3 | r21 | No | Tx | Another scratch register, 1 B |
| r22 | No | Cpar2 0 | r22 | No | Parsx 0 | FORTH parameter, 3 bytes, used in B?at and W?at |
| r23 | No | Cpar2 1 | r23 | No | Parsx 1 | ditto |
| r24 | No | Cpar1 0 | r24 | No | Parsx 2 | ditto |
| r25 | No | Cpar1 1 | r25 | No | Zx 2 | Temporary variable/pointer |
| r26 Xl | No | | r26 | No | DT 0 | W = DT pointer - generated for each function again (= may be used as any other temp), used in Pop_Z* |
| r27 Xh | No | | r27 | No | DT 1 | ditto |
| r28 Yl | Yes | | r28 | Yes | DST 0 | Pointer to Data STack, 2 bytes - native RAM only - points to the second value (the first is in TOS) - used often, Y+, -Y, Y+q |
| r29 Yh | Yes | | r29 | Yes | DST 1 | ditto |
| r30 Zl | No | | r30 | No | Zx 0 | Temporary variable/pointer |
| r31 Zh | No | | r31 | No | Zx 1 | ditto |
- `movw rd, rs` needs both the source and the destination register to be even, so all multi-byte vectors should start at an even register (0, 2, 4, 6, ...); then Set3 A,B == movw A0,B0; mov A2,B2;
- Y is DST because it can be used like a stack and is preserved: Y+, -Y, Y+q
- X can be used in "LD rx, X+" for data, so it is W = DT pointer, generated again for each function; if no longer needed, it may be reused as scratch
- Z is probably overwritten just after entering a "function"; it is also the only register usable with ELPM and EIJMP/EICALL
ATmega stacks
- SP grows down
- init TOPMEM = $FFFF

| before | push new | after push | pop = new | after pop |
|---|---|---|---|---|
| old | | old | | old |
| old | | old | | old |
| --- <---- | | new | | --- <---- |
| --- | | --- <---- | | --- |
| --- | | --- | | --- |
| --- | | --- | | --- |
- registers X, Y, Z growing down
- I will use THIS model for FORTH
- init TOPMEM+1 = $10000 = $0000
- push = ST -X, rx
- pop = LD rx, X+

| before | push new | after push | pop = new | after pop |
|---|---|---|---|---|
| old | | old | | old |
| old <---- | | old | | old <---- |
| --- | | new <---- | | --- |
| --- | | --- | | --- |
| --- | | --- | | --- |
| --- | | --- | | --- |
- registers X, Y, Z growing up
- init LOWMEM = $0000
- push = ST X+, rx
- pop = LD rx, -X

| before | push new | after push | pop = new | after pop |
|---|---|---|---|---|
| --- | | --- | | --- |
| --- | | --- <---- | | --- |
| --- <---- | | new | | --- <---- |
| old | | old | | old |
| old | | old | | old |
| old | | old | | old |
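The model chosen for FORTH (pointer register growing down, push with pre-decrement, pop with post-increment) maps directly onto C pointer operations. The following is a sketch for experimenting on a host; the type `down_stack` and the function names are made up, and there are no overflow or underflow checks, just as in the bare `ST`/`LD` instructions.

```c
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint8_t *x;   /* plays the role of the X (or Y, Z) pointer register */
} down_stack;

/* Start one past the top of the region (the TOPMEM+1 of the notes). */
void ds_init(down_stack *s, uint8_t *mem, size_t size)
{
    s->x = mem + size;
}

/* push = ST -X, rx : decrement first, then store. */
void ds_push(down_stack *s, uint8_t value)
{
    *--s->x = value;
}

/* pop = LD rx, X+ : load first, then increment. */
uint8_t ds_pop(down_stack *s)
{
    return *s->x++;
}
```

With this model the pointer always addresses the most recently pushed value, so the top of the stack can be read without moving the pointer.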
Casting a function address to a word address
extern void f_dup(void);
uint32_t addr = (uint32_t)(uintptr_t)&f_dup; // eijmp addr
C - unused parameter
uint8_t serial_getc(void *state, char *out_char) { // HW Serial, no state needed
(void)state; // state is UNUSED; the cast keeps the compiler from complaining
....
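A complete, compilable version of the same idiom might look like this. It is a sketch only: `serial_getc_stub` is a hypothetical stand-in, and the fixed character replaces whatever the real function reads from the hardware.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a hardware-serial getc: the interface passes a
   state pointer, but this implementation has no per-instance state. */
uint8_t serial_getc_stub(void *state, char *out_char)
{
    (void)state;        /* mark the parameter as intentionally unused */
    *out_char = 'A';    /* placeholder for reading the received byte */
    return 1;           /* 1 = one character delivered */
}
```

Casting the parameter to `void` evaluates it and discards the result, which is enough to silence unused-parameter warnings without changing the function signature.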