DynASM Toolchain Features
- DynASM is a pre-processing assembler.
- DynASM converts mixed C/Assembler source to plain C code.
- The primary knowledge about instruction names, operand modes, registers, opcodes and how to encode them is only needed in the pre-processor.
- The generated C code is extremely small and fast.
- A tiny embeddable C library helps with the process of dynamically assembling, relocating and linking machine code.
- There are no outside dependencies on other tools (such as stand-alone assemblers or linkers).
- Internal consistency checks catch runtime errors (e.g. undefined labels).
- The toolchain is split into a portable subset and CPU-specific modules.
- DynASM itself (the pre-processor) is written in Lua.
- The pre-processor itself has no machine dependencies. It should work everywhere you can get Lua 5.1 and Lua BitOp up and running (e.g. Linux, *BSD, Windows, ... you name it).
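The linking step and the undefined-label check mentioned in the list above can be pictured with a small sketch. This is not DynASM's implementation or API: `MiniAsm`, `mini_jmp`, `mini_label` and `mini_link` are hypothetical names invented for illustration, and the only instruction encoded is the x86 `jmp rel32` (opcode `E9` plus a 32-bit displacement relative to the end of the instruction). The idea is that a jump to a label records a fixup, and a final link pass patches each fixup or reports an error if the label was never defined.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { MAX_LABELS = 8, MAX_FIXUPS = 8, LABEL_UNDEF = -1 };

/* Hypothetical miniature assembler state; not DynASM's data structures. */
typedef struct {
    uint8_t code[64];
    size_t  len;
    int     label_pos[MAX_LABELS];          /* code offset or LABEL_UNDEF */
    struct { size_t at; int label; } fixup[MAX_FIXUPS];
    size_t  nfixup;
} MiniAsm;

static void mini_init(MiniAsm *a)
{
    a->len = 0;
    a->nfixup = 0;
    for (int i = 0; i < MAX_LABELS; i++)
        a->label_pos[i] = LABEL_UNDEF;
}

static void mini_byte(MiniAsm *a, uint8_t b)
{
    assert(a->len < sizeof a->code);
    a->code[a->len++] = b;
}

/* Define a label at the current code position. */
static void mini_label(MiniAsm *a, int label)
{
    assert(label >= 0 && label < MAX_LABELS);
    a->label_pos[label] = (int)a->len;
}

/* Emit `jmp rel32` with a zero placeholder and remember where to patch. */
static void mini_jmp(MiniAsm *a, int label)
{
    assert(label >= 0 && label < MAX_LABELS);
    assert(a->nfixup < MAX_FIXUPS);
    mini_byte(a, 0xE9);
    a->fixup[a->nfixup].at = a->len;
    a->fixup[a->nfixup].label = label;
    a->nfixup++;
    for (int i = 0; i < 4; i++)
        mini_byte(a, 0);
}

/* Patch all recorded jumps. Returns 0 on success, or -1 if a jump
 * refers to a label that was never defined. */
static int mini_link(MiniAsm *a)
{
    for (size_t i = 0; i < a->nfixup; i++) {
        size_t at = a->fixup[i].at;
        int target = a->label_pos[a->fixup[i].label];
        if (target == LABEL_UNDEF)
            return -1;
        /* Displacement is relative to the byte after the 4-byte field. */
        int32_t rel = (int32_t)target - (int32_t)(at + 4);
        for (int b = 0; b < 4; b++)
            a->code[at + b] = (uint8_t)((uint32_t)rel >> (8 * b));
    }
    return 0;
}
```

In this sketch, calling `mini_jmp(&a, 0)` before `mini_label(&a, 0)` is a forward reference that `mini_link` resolves afterwards, while a jump to a label that is never defined makes `mini_link` return -1 instead of leaving a bogus displacement in the code.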
DynASM Assembler Features
- C code and assembler code can be freely mixed, and the result stays readable.
- All the usual syntax for instructions and operand modes you have come to expect from a standard assembler.
- Access to C variables and CPP defines in assembler statements.
- Access to C structures and unions via type mapping.
- Convenient shortcuts for accessing C structures.
- Local and global labels.
- Numbered labels (e.g. for mapping bytecode instruction numbers).
- Multiple code sections (e.g. for tailcode).
- Defines/substitutions (inline and from command line).
- Conditionals (translation time) with proper nesting.
- Macros with parameters.
- Macros can mix assembler statements and C code.
- Captures (output diversion for code reordering).
- Simple and extensible template system for instruction definitions.
Currently the x86, x64, ARM, ARM64, PowerPC and MIPS instruction sets are supported. This includes most user-mode instructions available on modern CPUs. For x86/x64 this includes SSE, SSE2, SSE3, SSSE3, SSE4a, SSE4.1, SSE4.2, AVX, AVX2, BMI, ADX, AES-NI and FMA3. For PPC this also includes the e500 instruction set extension.
The whole toolchain has been designed to support multiple CPU architectures. As LuaJIT gets support for more architectures, DynASM will be extended with new CPU-specific modules.
Note that runtime conditionals are not really needed, since you can just use plain C code for that (and LuaJIT does this a lot). Having the embedded C library evaluate conditionals would not be any faster (though it might be a bit more space-efficient).
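A minimal sketch of that idea in plain C follows. To stay self-contained it uses a hypothetical byte-buffer emitter (`CodeBuf`, `emit_byte` and `emit_load_eax` are illustrative names, not part of DynASM); in an actual DynASM source file the two branches of the `if` would hold assembler lines rather than `emit_byte` calls. The x86 encodings assumed here are `31 C0` for `xor eax, eax` and `B8` followed by a 32-bit immediate for `mov eax, imm32`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical fixed-size code buffer; not part of DynASM. */
typedef struct {
    uint8_t bytes[16];
    size_t  len;
} CodeBuf;

/* Append one byte to the buffer. */
static void emit_byte(CodeBuf *cb, uint8_t b)
{
    assert(cb->len < sizeof cb->bytes);
    cb->bytes[cb->len++] = b;
}

/* Emit x86 code that loads the constant k into eax.
 * The choice between the two encodings is an ordinary C `if`,
 * evaluated while the code generator runs. The emitted machine
 * code itself contains no test of k. */
static void emit_load_eax(CodeBuf *cb, uint32_t k)
{
    if (k == 0) {
        emit_byte(cb, 0x31);            /* xor eax, eax */
        emit_byte(cb, 0xC0);
    } else {
        emit_byte(cb, 0xB8);            /* mov eax, imm32 */
        for (int i = 0; i < 4; i++)     /* little-endian immediate */
            emit_byte(cb, (uint8_t)(k >> (8 * i)));
    }
}
```

With an empty `CodeBuf`, `emit_load_eax(&cb, 0)` appends the two-byte `xor` form, and any other value appends the five-byte `mov` form. The decision costs nothing in the generated code because it is made by the generator.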
Please don't shy away just because DynASM itself is written in Lua. This only applies to the pre-processor, which is pure text processing; writing such stuff in C would be a waste of time. Pre-processing is done only once, when your code generator itself is compiled. There are no dependencies on Lua at runtime, i.e. when your code generator is in action.
Apart from that, a full Lua distribution is around 200K and can be compiled in a few seconds. Consider it a part of the toolchain, if you want.
Alternatively, bundle src/host/minilua.c from the LuaJIT source tree, which is a minified Lua 5.1 + BitOp in a single file (45K compressed).