TPDE
TPDE-LLVM is a TPDE-based LLVM back-end focused on fast compilation for x86-64 and AArch64. Compilation is typically 10–20x faster than with the LLVM -O0 back-end, at similar execution time; code size is ~10–30% larger for -O0 IR and similar for -O1 IR. The focus is on efficiently supporting a commonly used subset of LLVM IR and target platforms, so many IR features are not supported; in such cases, the intention is to fall back to the full LLVM back-end. Code generated by Clang (-O0/-O1) will typically compile; -O2 and higher will typically fail due to unsupported vector operations.
Standalone usage is possible through the tool tpde-llc (similar to llc), which compiles LLVM IR or bitcode to an ELF object file.
Library usage is possible through tpde_llvm::LLVMCompiler, which supports compiling a module to an object file or mapping it into the memory of the current process for JIT execution. The JIT mapper only supports very typical ELF constructs (e.g., no TLS); if this is not sufficient, the object file can also be mapped through LLVM's ORC JIT (see tpde-llvm/tools/tpde-lli.cpp for an example).
Note that compilation is likely to modify the module: all constant expressions inside functions are replaced with instruction sequences, and all accesses to thread-local variables are rewritten to use llvm.threadlocal.address.
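The fallback strategy described above (try the fast back-end, hand unsupported input to the full one) can be sketched in isolation. This is a minimal illustration, not TPDE-LLVM's actual API: `Module`, `ObjectFile`, `compile_fast`, `compile_full`, and `compile_with_fallback` are hypothetical stand-ins, and the "module" is just a list of feature names.

```cpp
#include <optional>
#include <string>
#include <vector>

// Hypothetical stand-ins: a "module" is a list of feature names, and an
// "object file" is a string naming the back-end that produced it.
using Module = std::vector<std::string>;
using ObjectFile = std::string;

// Hypothetical fast path: declines (returns nullopt) when it sees a feature
// it does not handle, rather than producing wrong code.
std::optional<ObjectFile> compile_fast(const Module &mod) {
  for (const std::string &feature : mod) {
    if (feature == "indirectbr" || feature == "x86_fp80") {
      return std::nullopt;
    }
  }
  return ObjectFile("fast");
}

// Hypothetical full back-end: accepts every input.
ObjectFile compile_full(const Module &) { return ObjectFile("full"); }

// Try the fast back-end first; fall back only when it declines.
ObjectFile compile_with_fallback(const Module &mod) {
  if (std::optional<ObjectFile> obj = compile_fast(mod)) {
    return *obj;
  }
  return compile_full(mod);
}
```

Calling `compile_with_fallback({"add", "load"})` takes the fast path, while a module containing `"indirectbr"` is routed to the full back-end. Because real compilation may modify the module, an actual integration has to decide whether the fallback sees the modified module or a copy; the sketch takes the module by const reference and sidesteps that question.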
We provide a patch to integrate TPDE-LLVM into Clang/Flang. Apply the patch from the root directory of the repository and add this repository under clang/lib/CodeGen/tpde2 (e.g., via a symlink). This adds two options to the clang and flang drivers:
- -ftpde: Use TPDE instead of the regular LLVM back-end. Inputs that TPDE can't handle cause a fallback to LLVM and emit a warning.
- -ftpde-abort: Abort when the input is not supported instead of falling back to LLVM.

Note: most LLVM-specific code-gen options will be ignored.
Unsupported features currently include:
- Integer types larger than i64, except i128, which is supported; pointers with non-zero address space; half, bfloat, ppc_fp128, x86_fp80, x86_amx. Code with x86-64 long double needs to be compiled with -mlong-double-64.
- (… <32 x i8> on x86-64); icmp/fcmp; pointer element type; getelementptr with vector types; select with vector predicate, integer extension/truncation, select aggregate type other than {i64, i64}.
- bitcast larger than 64 bit.
- (… seqcst for atomicrmw).
- fp128: fneg, fcmp one/ueq, many intrinsics.
- Computed goto (blockaddress, indirectbr).
- landingpad with non-empty filter clause.
- (… llvm.cttz only for 8/16/32/64-bit).
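To make one entry of the list concrete, the goto item (blockaddress, indirectbr) corresponds to the GNU "labels as values" extension accepted by GCC and Clang. The function below is an illustrative sketch only; the name `sum_with_computed_goto` is made up for this example. Clang represents the `&&label` expressions as blockaddress constants and the `goto *` as an indirectbr, so a function like this is the kind of input expected to fall back to the regular LLVM back-end.

```cpp
// Labels-as-values is a GNU extension; this compiles with GCC and Clang.
// Sums the integers 1..n using a two-entry jump table.
int sum_with_computed_goto(int n) {
  // Table of label addresses (blockaddress constants in LLVM IR).
  static void *const targets[] = {&&done, &&step};
  int total = 0;
  int i = 1;
dispatch:
  // Indirect jump through the table (indirectbr in LLVM IR):
  // index 1 continues the loop, index 0 leaves it.
  goto *targets[i <= n];
step:
  total += i;
  ++i;
  goto dispatch;
done:
  return total;
}
```

Rewriting such dispatch loops as a `switch` inside a `for` loop avoids both constructs, at the cost of the hand-written jump table.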