Summary of "The Original Sin of Computing...that no one can fix"
This video explores a foundational and unsettling concept in computer science known as the "original sin" of computing, first introduced by Ken Thompson in his Turing Award lecture, "Reflections on Trusting Trust" (published in 1984). The main technological focus is compiler trustworthiness and the possibility of undetectable, self-propagating backdoors embedded in compilers: a meta-backdoor, or Trojan horse, that can infect every program compiled thereafter, including subsequent compilers.
Key Technological Concepts and Analysis
- Turing Award & Ken Thompson’s Thought Experiment: The video begins by explaining the prestige of the Turing Award and Ken Thompson’s speech, in which he proposed that compilers themselves could be compromised in a way that no amount of source code auditing could detect. This "original sin" would be passed down through generations of compilers, creating a persistent and invisible security risk.
- Self-hosting Compilers and Bootstrapping: The video explains that many compilers are self-hosted, meaning they are written in the language they compile (e.g., the C compiler written in C). This creates a paradox and a chain of trust that traces back to the earliest compilers, which humans assembled by hand. This lineage makes it theoretically possible for a malicious backdoor to be perpetuated indefinitely.
- Compiler Family Trees: A complex family tree of programming languages and compilers is shown, highlighting the extensive derivations and dependencies that make auditing every compiler nearly impossible.
- Quines and Self-reproducing Code: To demonstrate the concept, the creator writes quines (programs that output their own source code) in C and Fortran. This illustrates how code can reproduce itself exactly, a key property that enables a malicious compiler to replicate its backdoor code.
- Trojanized Compiler Proof of Concept: The video shows a modified quine called "TrojanCoin" that inserts malicious changes (e.g., replacing the word "right" with "wrong") into the output on subsequent runs. This simulates how a Trojanized compiler could insert malicious payloads into compiled programs or into the other compilers it compiles.
- Code Obfuscation Techniques: Basic obfuscation is demonstrated by encoding source code as decimal ASCII values, making detection harder. The video notes that professional reverse engineers can apply far more advanced obfuscation, and that malicious compilers are typically distributed only as binaries, making source code inspection ineffective.
- LLVM Project and Compiler Infrastructure: The video highlights the LLVM project as a massive, widely used compiler infrastructure that underpins compilers for many languages (C, C++, Objective-C, Swift, Rust, Fortran). This amplifies the risk because a Trojanized LLVM compiler could affect a huge ecosystem of software, including iOS apps compiled from Objective-C with clang.
- Challenges in Auditing Compilers: The sheer size and complexity of modern compiler codebases like GCC and LLVM make manual auditing impractical.
- Mitigation Technique: Diverse Double Compiling (DDC): Introduced by David A. Wheeler, DDC detects compiler tampering by building the suspect compiler’s source with a second, independently developed compiler, using the resulting binary to compile that same source again, and comparing the final output with the binary the suspect compiler produces from its own source. If the two binaries match bit for bit, the suspect binary corresponds to its source, unless the second compiler carries the same backdoor. The method therefore requires a trusted independent compiler and deterministic builds, and it is complex to implement.
- Bootstrapping a Trusted Compiler: Another mitigation is to write a small, simple compiler by hand and use it to build larger compilers, ensuring a trusted lineage.
- Real-world Example: XcodeGhost Malware (2015): The video references XcodeGhost, a Trojanized version of Apple’s Xcode IDE that infected iOS apps by injecting malicious code during compilation without developers’ knowledge, demonstrating the real-world feasibility of compiler-based attacks.
- Ken Thompson’s Admission and Legacy: Thompson admitted to inserting a Trojan backdoor in the Unix compiler at Bell Labs as a prank, which was cleverly designed to avoid detection even when inspecting assembly output. This historical anecdote underscores the seriousness of the threat.
- Philosophical and Practical Implications: The video closes with the sobering realization that trusting code you did not write yourself is inherently risky, yet writing and auditing all code personally is infeasible in modern software ecosystems.
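The Diverse Double Compiling procedure described in the list above can be illustrated with a toy model. This is only a sketch under loud assumptions: real compilers are replaced by a hypothetical `Binary` record and a hypothetical `compile_with` function, output bytes are simulated with a hash, and builds are assumed to be fully deterministic. None of these names come from the video or from Wheeler's work.

```python
import hashlib
from dataclasses import dataclass

# Hypothetical stand-in for the clean, audited source code of compiler A.
COMPILER_SOURCE = "clean, audited source of compiler A"


@dataclass(frozen=True)
class Binary:
    behaves_like: str  # the source whose code generator this binary implements
    image: bytes       # the bytes that would sit on disk
    trojaned: bool     # hidden behaviour that appears in no source file


def compile_with(compiler: Binary, source: str) -> Binary:
    """Toy deterministic build step (an assumption, not a real compiler).

    The output bytes depend only on the code generator of the compiler doing
    the work and on the source being compiled. A trojaned compiler re-inserts
    its backdoor whenever it recognises the compiler's own source.
    """
    digest = hashlib.sha256(f"{compiler.behaves_like}|{source}".encode()).digest()
    reinfect = compiler.trojaned and source == COMPILER_SOURCE
    image = digest + (b"<backdoor>" if reinfect else b"")
    return Binary(behaves_like=source, image=image, trojaned=reinfect)


def ddc_matches(suspect: Binary, trusted: Binary, source: str) -> bool:
    """Return True if the suspect binary matches a rebuild via a trusted compiler."""
    stage1 = compile_with(trusted, source)  # built by the independent compiler
    stage2 = compile_with(stage1, source)   # rebuilt by that result
    return stage2.image == suspect.image    # bit-for-bit comparison


# An independent compiler that we assume is free of this particular backdoor.
TRUSTED = Binary("independent compiler T", b"T", False)

# Two candidate binaries of compiler A, both claiming to come from the same source.
HONEST_A = compile_with(Binary(COMPILER_SOURCE, b"", False), COMPILER_SOURCE)
TROJANED_A = compile_with(Binary(COMPILER_SOURCE, b"", True), COMPILER_SOURCE)
```

In this model, `ddc_matches(HONEST_A, TRUSTED, COMPILER_SOURCE)` succeeds and the same check on `TROJANED_A` fails, because the backdoor lives only in the suspect binary and the rebuild through the independent compiler never reproduces it. The first-stage binary differs from the second-stage one, which is why the source has to be compiled twice before comparing.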
Guides, Tutorials, and Demonstrations Provided
- Explanation of compiler self-hosting and bootstrapping
- Writing and running quine programs in C and Fortran
- Creating a Trojanized quine to simulate compiler backdoors
- Basic code obfuscation using ASCII decimal encoding
- Demonstration of compiling Objective-C iOS apps manually with LLVM
- Overview of the Diverse Double Compiling (DDC) mitigation technique
- Reference to
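The quine, Trojanized quine, and decimal ASCII encoding demonstrations listed above can be sketched in a few lines. The video works in C and Fortran; the version below uses Python for brevity, and every name in it (`run`, `QUINE`, `TROJAN`, `encode`, `decode`) is a hypothetical choice for illustration rather than the creator's code.

```python
import contextlib
import io


def run(source: str) -> str:
    """Execute Python source text and return everything it prints."""
    captured = io.StringIO()
    with contextlib.redirect_stdout(captured):
        exec(source, {})
    return captured.getvalue()


# 1. A plain quine: the template `s` is formatted with its own repr, so the
#    program prints exactly its own source text.
QUINE = r'''s = 's = %r\nprint(s %% s)'
print(s % s)
'''

# 2. A Trojanized variant: it reproduces itself, but swaps "right" for
#    "wrong" in the copy it emits. The next generation therefore carries
#    the tampered value and reproduces it faithfully from then on.
TROJAN = r'''s = 's = %r\nverdict = "right"\nprint((s %% s).replace("right", "wrong"))'
verdict = "right"
print((s % s).replace("right", "wrong"))
'''


# 3. Basic obfuscation: store the source as decimal character codes so the
#    payload is not visible to a casual text search.
def encode(source: str) -> list[int]:
    return [ord(ch) for ch in source]


def decode(codes: list[int]) -> str:
    return "".join(chr(code) for code in codes)


ENCODED_TROJAN = encode(TROJAN)
```

Calling `run(QUINE)` returns the quine's own text, while `run(TROJAN)` returns a second generation in which every "right" has become "wrong"; running that second generation reproduces it unchanged. `run(decode(ENCODED_TROJAN))` behaves the same as `run(TROJAN)`, even though the encoded form is just a list of integers. This kind of encoding only hides the payload from casual inspection and is trivially reversible.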