This tool statically (AOT) translates (or raises) binaries to LLVM IR.
by Sami Tolvanen, Staff Software Engineer, Android Security
Android’s security model is enforced by the Linux kernel, which makes it a tempting target for attackers. We have put a lot of effort into hardening the kernel in previous Android releases and in Android 9, we continued this work by focusing on compiler-based security mitigations against code reuse attacks. Google’s Pixel 3 will be the first Android device to ship with LLVM’s forward-edge Control Flow Integrity (CFI) enforcement in the kernel, and we have made CFI support available in Android kernel versions 4.9 and 4.14. This post describes how kernel CFI works and provides solutions to the most common issues developers might run into when enabling the feature.[…]
[…]It is the result of the community’s work over the past six months, including: function multiversioning in Clang with the ‘target’ attribute for ELF-based x86/x86_64 targets, improved PCH support in clang-cl, preliminary DWARF v5 support, basic support for OpenMP 4.5 offloading to NVPTX, OpenCL C++ support, MSan, X-Ray and libFuzzer support for FreeBSD, early UBSan, X-Ray and libFuzzer support for OpenBSD, UBSan checks for implicit conversions, many long-tail compatibility issues fixed in lld which is now production ready for ELF, COFF and MinGW, new tools llvm-exegesis, llvm-mca and diagtool. And as usual, many optimizations, improved diagnostics, and bug fixes.[…]
Early support for UBsan, X-Ray instrumentation and libFuzzer (x86 and x86_64) for OpenBSD. Support for MSan (x86_64), X-Ray instrumentation and libFuzzer (x86 and x86_64) for FreeBSD.
AArch64 target: Assembler and disassembler support for the ARM Scalable Vector Extension has been added.
A new Implicit Conversion Sanitizer (-fsanitize=implicit-conversion) group was added. Please refer to the Undefined Behavior Sanitizer (UBSan) section of the release notes for the details.
An existing tool named diagtool has been added to the release. As the name suggests, it helps with dealing with diagnostics in clang, such as finding out the warning hierarchy, and which of them are enabled by default or for a particular compiler invocation.
clang-tidy: New module zircon for checks related to Fuchsia’s Zircon kernel.
The DEBUG macro has been renamed to LLVM_DEBUG, the interface remains the same.
A new tool named llvm-mca has been added. llvm-mca is a static performance analysis tool that uses information available in LLVM to statically predict the performance of machine code for a specific CPU.
LLVM-based compiler to create artificial software diversity to protect software from code-reuse attacks.
PANDA is an open-source Platform for Architecture-Neutral Dynamic Analysis. It is built upon the QEMU whole system emulator, and so analyses have access to all code executing in the guest and all data. PANDA adds the ability to record and replay executions, enabling iterative, deep, whole system analyses. Further, the replay log files are compact and shareable, allowing for repeatable experiments. A nine billion instruction boot of FreeBSD, e.g., is represented by only a few hundred MB. PANDA leverages QEMU’s support of thirteen different CPU architectures to make analyses of those diverse instruction sets possible within the LLVM IR. In this way, PANDA can have a single dynamic taint analysis, for example, that precisely supports many CPUs. PANDA analyses are written in a simple plugin architecture which includes a mechanism to share functionality between plugins, increasing analysis code re-use and simplifying complex analysis development.
LAVA (Large Scale Automated Vulnerability Addition) for PANDA:
Evaluating and improving bug-finding tools is currently difficult due to a shortage of ground truth corpora (i.e., software that has known bugs with triggering inputs). LAVA attempts to solve this problem by automatically injecting bugs into software. Every LAVA bug is accompanied by an input that triggers it whereas normal inputs are extremely unlikely to do so. These vulnerabilities are synthetic but, we argue, still realistic, in the sense that they are embedded deep within programs and are triggered by real inputs. Our work forms the basis of an approach for generating large ground-truth vulnerability corpora on demand, enabling rigorous tool evaluation and providing a high-quality target for tool developers.
PANDA’s LAVA is separate from the Linaro LAVA project, which the Tags on this blog points to.
Phasar is a LLVM-based static analysis framework written in C++. It allows users to specify arbitrary data-flow problems which are then solved in a fully-automated manner on the specified LLVM IR target code. Computing points-to information, call-graph(s), etc. is done by the framework, thus you can focus on what matters.
Symbiotic is a tool for analysis of computer programs. It can check all common safety properties like assertion violations, invalid pointer dereference, double free, memory leaks, etc. Symbiotic uses three well-know techniques: instrumentation, program slicing, and symbolic execution. We use LLVM as program representation.
This release is the result of the community’s work over the past six months, including: retpoline Spectre variant 2 mitigation, significantly improved CodeView debug info for Windows, GlobalISel by default for AArch64 at -O0, improved scheduling on several x86 micro-architectures, Clang defaults to -std=gnu++14 instead of -std=gnu++98, support for some upcoming C++2a features, improved optimizations, new compiler warnings, many bug fixes, and more.
Heavy lifting with McSema 2.0
Four years ago, we released McSema, our x86 to LLVM bitcode binary translator. Since then, it has stretched and flexed; we added x86-64 support, put it on a performance-focused diet, and improved its usability and documentation. McSema wasn’t the only thing improving these past years, though. At the same time, programs were increasingly adopting modern x86 features like the advanced vector extensions (AVX) instructions, which operate on 256-bit wide vector registers. Adjusting to these changes was back-breaking but achievable work. Then our lifting goals expanded to include AArch64, the architecture used by modern smartphones. That’s when we realized that we needed to step back and strengthen McSema’s core. This change in focus paid off; now McSema can transpile AArch64 binaries into x86-64! Keep reading for more details.[…]
Lots of changes for Intel/AMD/ARM/MIPS/PowerPC, eg AMD Rhyzen support. And new PDB tool. Clang has new diagnostic/”lint” abilities. The static analyzer uses Microsoft’s Z3 solver. New C and C++ features (wow, C++ is at C++17 already!). Many other changes! I wish I had time to look at it more detail today… 😦
PDBs are the sidecar symbol files for Windows. The spec used to be private, now is public, and now it is great to see Clang supporting them. Last time I looked, GCC does not support them.
“Epona provides advanced software protection using LLVM, an industrial strength toolchain.”
ARM has updated it’s C/C++ compiler toolchains.
C and C++ update for Arm Compiler 6:
As you are hopefully aware, Arm Compiler 6 has been available for 3+ years now, and has grown in maturity, and optimization quality release on release. As I write this, the latest available version is 6.8, and 6.6 has been qualified for use in safety-related development. We offer full support for the latest Arm processors, across the Cortex-A, R, and M, and SecureCore families. Arm Compiler 6 is available within DS-5 and Keil MDK toolchains. Furthermore the qualified version is available for purchase stand-alone. Arm Compiler 6 is based on the LLVM framework, using the modern Clang compiler front-end, and this is reflected in the name of the executable, Armclang. The compiler is then integrated into the full Arm tools suite, enabling use of legacy assembler code built with Armasm, as well as gas format assembler directly with Armclang. Finally the Arm linker (Armlink) brings in the optimized C and C++ libraries, or if desired the size optimized Arm C MicroLib library, as well as (optionally) implementing link-time optimizations across the source code.[…]
Cristian Cadar announced the 1.4.0 release of KLEE.
KLEE 1.4.0 is now available at
Lots of new changes, in particular a new CMake build system, support for some missing features for LLVM 3.4 (and partial support for 3.5 and 3.6), better support for MacOS, support for release documentation (as in http://klee.github.io/releases/docs/v1.4.0/) and many other optimizations, features and bug fixes.[…]
This repo contains some materials related to a talk on security r&d projects that in some way use LLVM for doing their work.
Steven Shi of Intel\SSG\STO\UEFI Firmware has created an LLVM Clang-centric fork of the EDK2 project. The EDK2 project already supports LLVM clang alongside GCC and ICC and MCS, but this fork appears to be taking advantage of some of LLVM’s security features, like Clang’s Static Analyzer, “special checkers for edk2 security” sounds interesting!
I forked a edk2 branch to apply the LLVM compiler and tool chain technologies (http://www.llvm.org/) on the edk2 codebase in below link. If you are also interested in the LLVM/Clang support, please take a look. I welcome and appreciate any feedback or contribution to this branch.
So far, this branch focus on below items:
* Clang compiler optimization for edk2 code size improvement, e.g. Link Time Optimization (LTO)
* Clang Static Analyzer (scan-build) for edk2, e.g. special checkers for edk2 security, checkers for Intel Firmware Engine automation
There are 4 new tool chains are introduced in branch llvm:
* CLANG38: Clang3.8.0 build tool chain, and enable code size optimization flag (-Os) by default on both Ia32 and X64.
* CLANGLTO38: Base on CLANG38 to enable LLVM Link Time Optimization (LTO) for more aggressive code size improvement.
* CLANGSCAN38: Base on CLANG38 to seamlessly integrate Clang scan-build analyzer infrastructure into edk2 build infrastructure.
* GCCLTO53: Enabled GCC Link Time Optimization (LTO) and code size optimization (-Os) for more aggressive code size improvement.
There are several known issues as below. WELCOME and APPRECIATE any suggestion to them:
* Not use gold linker for now, but directly use standard ld. GNU gold linker ld-new (GNU Binutils 2.26.20160125) 1.11 fails to link edk2 static library file (*.dll) with error message: “ld: internal error in do_layout, at ../../binutils-2.26/gold/object.cc:1819” Have submitted a bug in Bug 20062 – Gold2.26 fail to link Uefi firmware with internal error in do_layout, but ld works (https://sourceware.org/bugzilla/show_bug.cgi?id=20062)
* CLANG LTO optimization (on ld, not on gold) can generate incorrect code. Current CLANGLTO38 LTO X64 debug build will generate wrong code for BasePrintLib.inf and LzmaCustomDecompressLib.inf modules, and the Ovmf boot will hang in these two modules. Already add work around to disable the lto optimization in these two modules’ inf files. Please see the log of commit 6a55aa9c3fa58f275041bf8cda138643f04baf5c
* GCC LTO optimization can generate incorrect code. Current GCCLTO53 is even worse than CLANGLTO38, and there are more modules need to disable the LTO optimization to work around the CPU exceptions during boot time.
I wonder if this project is related to the Intel LLVM KLEE/S2E static analysis project that they are hopefully going to open source this year?
I hope they take the handful of metrics that William’s LangToolUEFIZBIOS(sp) — his grad project — did. It’ll be a lot simpler as a LLVM filter, no need for all the Java ANTLR code!