LAVA: Large-scale Automated Vulnerability Addition for PANDA

Re: https://firmwaresecurity.com/2015/11/23/panda-vm/ and https://firmwaresecurity.com/2016/12/01/panda-2-0-released/

PANDA is an open-source Platform for Architecture-Neutral Dynamic Analysis. It is built upon the QEMU whole-system emulator, so analyses have access to all code executing in the guest and all data. PANDA adds the ability to record and replay executions, enabling iterative, deep, whole-system analyses. Further, the replay log files are compact and shareable, allowing for repeatable experiments. For example, a nine-billion-instruction boot of FreeBSD is represented by only a few hundred MB. PANDA leverages QEMU’s support of thirteen different CPU architectures to make analyses of those diverse instruction sets possible within the LLVM IR. In this way, PANDA can have a single dynamic taint analysis, for example, that precisely supports many CPUs. PANDA analyses are written in a simple plugin architecture which includes a mechanism to share functionality between plugins, increasing analysis code re-use and simplifying complex analysis development.
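As a rough illustration of the record-and-replay idea described above (not PANDA's actual implementation or log format), the toy Python sketch below logs the only non-deterministic values a computation consumes and then feeds them back to reproduce the run exactly. The names `run_guest`, `record`, and `replay` are hypothetical and invented for this example.

```python
import random


def run_guest(read_input, steps=5):
    """Toy 'guest': fully deterministic except for what read_input() returns."""
    state = 0
    trace = []
    for step in range(steps):
        value = read_input(step)  # the only source of non-determinism
        state = (state * 31 + value) % 1_000_003
        trace.append(state)
    return trace


def record(steps=5):
    """Run live, logging every non-deterministic value with its step number."""
    log = []

    def live_input(step):
        value = random.randrange(256)  # stands in for a device read or interrupt
        log.append((step, value))
        return value

    trace = run_guest(live_input, steps)
    return log, trace


def replay(log, steps=5):
    """Re-run the guest, serving inputs from the log instead of the outside world."""
    events = dict(log)
    return run_guest(lambda step: events[step], steps)


log, original = record()
replayed = replay(log)
print(replayed == original)  # True: the replay reproduces the recorded run
```

Because only the external inputs are stored, the log stays small relative to the amount of computation it reproduces, which is the property the paragraph above attributes to PANDA's replay logs.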

LAVA (Large-scale Automated Vulnerability Addition) for PANDA:

Evaluating and improving bug-finding tools is currently difficult due to a shortage of ground-truth corpora (i.e., software that has known bugs with triggering inputs). LAVA attempts to solve this problem by automatically injecting bugs into software. Every LAVA bug is accompanied by an input that triggers it, whereas normal inputs are extremely unlikely to do so. These vulnerabilities are synthetic but, we argue, still realistic, in the sense that they are embedded deep within programs and are triggered by real inputs. Our work forms the basis of an approach for generating large ground-truth vulnerability corpora on demand, enabling rigorous tool evaluation and providing a high-quality target for tool developers.
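To make the "triggering input" property concrete, here is a minimal Python sketch of what an injected bug of this general shape might look like. It is an assumption-laden toy, not LAVA's actual output (LAVA works on real programs' source code): the parser, the `MAGIC` trigger value, and the `triggers` helper are all hypothetical, and an `IndexError` stands in for memory corruption.

```python
import random
import struct

BUF_SIZE = 16
MAGIC = 0x6C617661  # hypothetical trigger value chosen for this example


def parse_record(data: bytes) -> bytearray:
    """Toy parser: a 4-byte big-endian field, then up to 16 payload bytes."""
    (reserved,) = struct.unpack(">I", data[:4])
    payload = data[4:4 + BUF_SIZE]
    buf = bytearray(BUF_SIZE)
    # Injected bug (illustrative): an otherwise unused input field shifts the
    # write index past the buffer, but only when it equals the magic value.
    offset = BUF_SIZE if reserved == MAGIC else 0
    for i, byte in enumerate(payload):
        buf[offset + i] = byte  # out-of-range write raises IndexError
    return buf


def triggers(data: bytes) -> bool:
    """Return True if the input reaches the injected out-of-bounds write."""
    try:
        parse_record(data)
    except IndexError:
        return True
    return False


trigger_input = struct.pack(">I", MAGIC) + b"AAAA"  # the known triggering input
benign_input = struct.pack(">I", 0) + b"AAAA"       # an ordinary input

rng = random.Random(0)
crashes = sum(
    triggers(bytes(rng.randrange(256) for _ in range(8))) for _ in range(1000)
)

print(triggers(trigger_input))  # True
print(triggers(benign_input))   # False
print(crashes)                  # random inputs hit a 32-bit magic value with probability 2**-32 each
```

The point of the sketch is the asymmetry: the bug's author holds an input that reliably reaches the fault, while a bug-finding tool has to discover the 32-bit condition on its own, which gives a known answer to score the tool against.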

https://github.com/panda-re/lava

https://github.com/panda-re/panda

PANDA’s LAVA is separate from the Linaro LAVA project, which the tags on this blog point to.

PANDA 2.0 released

“The PANDA team is pleased to announce the initial release of PANDA 2.0. It’s been roughly four years since we first released PANDA, and it’s come a long way, becoming more stable, featureful, and easier to use — in large part because of fantastic contributions from developers around the world. At the same time, though, QEMU has undergone huge changes, and PANDA hasn’t kept up. QEMU now supports new platforms like Mac OS X, has improved the TCG emulator’s performance, and includes countless security fixes. The main goal of PANDA 2.0 is to re-sync with upstream QEMU, allowing us to take advantage of all of these improvements. We’ve also restructured the repository, which will make it easier to keep up with upstream changes in the future.”

“[PANDA] is currently being developed in collaboration with MIT Lincoln Laboratory, NYU, and Northeastern University.”

https://github.com/panda-re/panda