Summary of "Programming in Assembly without an Operating System"
Technological focus & main takeaways
- Goal: Build and run a bare-metal-style game in x86-64 Assembly using no operating system, executed directly as a UEFI application (not a BIOS bootloader). The intent is total control of hardware with minimum OS interference.
- Target firmware layer: UEFI acts as a hardware/software abstraction layer.
- BIOS vs UEFI (Unified Extensible Firmware Interface)
- UEFI exposes an API via tables and protocols, letting code interact with platform features without hard-coding chip specifics.
UEFI services used
- Boot Services
- Memory-related services
- Loading other UEFI applications
- Waiting for events
- Runtime Services
- Getting date/time
- System reset
- Console I/O
- Uses Simple Text Output and Simple Text Input for output and key waiting
- Graphics
- Uses Graphics Output Protocol (GOP) for:
- Mode setting
- Framebuffer access
- Blt operations
- Parallel rendering
- Later uses EFI MP Services to spread rendering across multiple cores
- BSP (Bootstrap Processor) remains the main controller
- APs (Application Processors) perform rendering work
- Input improvement
- Uses Simple Pointer Protocol (mouse/joystick-like mapping)
- Left click = unlock
- Right click = fire
Product features / tooling / implementation details
UEFI program structure in assembly
- Entry point described as EFI main
- Console output approach:
- Manually navigates UEFI tables to locate:
- The Simple Text Output protocol pointer
- The OutputString function
- 64-bit assumptions
- On a 64-bit system, UEFI pointers/table entries are 8 bytes
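The table walk described above can be sketched in C. This is a minimal illustration, not the video's assembly: the struct and function names are hypothetical, and the field layout follows the UEFI specification's EFI_SYSTEM_TABLE and Simple Text Output protocol as I understand them for a 64-bit platform. Pointer-sized fields are modeled as uint64_t so the offsets can be checked on any host.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Common header at the start of every UEFI table (24 bytes). */
typedef struct {
    uint64_t signature;
    uint32_t revision;
    uint32_t header_size;
    uint32_t crc32;
    uint32_t reserved;
} table_header_t;

/* First fields of the system table; pointers and handles are 8 bytes each. */
typedef struct {
    table_header_t hdr;
    uint64_t firmware_vendor;
    uint32_t firmware_revision;
    uint32_t padding;               /* explicit alignment padding */
    uint64_t console_in_handle;
    uint64_t con_in;
    uint64_t console_out_handle;
    uint64_t con_out;               /* Simple Text Output protocol pointer */
    uint64_t standard_error_handle;
    uint64_t std_err;
    uint64_t runtime_services;
    uint64_t boot_services;
} system_table_prefix_t;

/* First two entries of the Simple Text Output protocol. */
typedef struct {
    uint64_t reset;
    uint64_t output_string;         /* OutputString function pointer */
} text_output_prefix_t;

/* Read one 8-byte table entry at a byte offset, the way an
   assembly load such as `mov rax, [rcx + offset]` would. */
static uint64_t read_entry(const void *table, size_t offset) {
    uint64_t value;
    assert(offset % 8 == 0);        /* entries are 8-byte aligned */
    memcpy(&value, (const unsigned char *)table + offset, sizeof value);
    return value;
}
```

Under these assumptions the console-out pointer sits at byte offset 64 of the system table and OutputString at offset 8 of the protocol, which are the kinds of constants an assembly program would hard-code.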
Build and packaging flow
- UEFI executables are PE/COFF images (similar family to Windows executables)
- Build flow:
- Use ML64 (Microsoft Macro Assembler) to create an object file
- Use the Visual C++ linker configured with:
- subsystem = EFI application
- entry = EFI main
Boot/placement requirements
- Requires GPT (GUID Partition Table)
- Uses the EFI System Partition with directory: EFI/BOOT/
- Program name must be: BOOTX64.EFI
Testing setup
- Primarily tested on real hardware
- Emulator testing via QEMU was mentioned as possible, but required additional hardware/effort
- Testing device cited:
- GEEKOM A9 MAX
Performance / timing analysis (problem + solution)
Initial timing approach and issue
- Uses UEFI runtime service GetTime()
- Problem:
- UEFI time resolution can be 1 second (based on EFI time capabilities)
- Result: when the resolution is that coarse, the nanosecond-level fields of UEFI time cannot drive smooth, high-frequency loops such as 60 Hz
Fix: CPU timing using RDTSC
- Uses RDTSC (Read Time Stamp Counter)
- Notes:
- Invariant TSC ticks at a stable rate regardless of CPU frequency scaling
- Strategy:
- Use UEFI GetTime() to detect second boundaries
- Measure ticks per second with RDTSC
- Compute CPU cycles per frame for a target FPS
- Targets mentioned:
- 64 FPS
- then attempts 128 FPS
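The calibration strategy above reduces to a little arithmetic, sketched here in C. All names are hypothetical, and the counter is passed in as a function pointer so the sketch runs on any host; in the real program that role would be played by RDTSC. The fake counter exists only to make the example executable.

```c
#include <assert.h>
#include <stdint.h>

/* Ticks elapsed between two counter readings taken at
   consecutive second boundaries reported by the clock. */
static uint64_t ticks_per_second(uint64_t at_first_boundary,
                                 uint64_t at_next_boundary) {
    return at_next_boundary - at_first_boundary;
}

/* Counter ticks available to each frame at the target rate. */
static uint64_t ticks_per_frame(uint64_t tps, uint32_t target_fps) {
    assert(target_fps > 0);
    return tps / target_fps;
}

typedef uint64_t (*counter_fn)(void);

/* Busy-wait until frame_ticks have passed since frame_start.
   Returns the start of the next frame on a fixed cadence, so
   small overshoots do not accumulate into drift. */
static uint64_t wait_for_next_frame(counter_fn read_counter,
                                    uint64_t frame_start,
                                    uint64_t frame_ticks) {
    uint64_t now;
    do {
        now = read_counter();
    } while (now - frame_start < frame_ticks);
    return frame_start + frame_ticks;
}

/* Stand-in counter for demonstration: advances 1000 ticks per read. */
static uint64_t fake_counter_value;
static uint64_t fake_counter(void) {
    fake_counter_value += 1000;
    return fake_counter_value;
}
```

For example, with a measured rate of 3,000,000,000 ticks per second (an illustrative figure, not one from the video), a 64 FPS target leaves 46,875,000 ticks per frame and a 128 FPS target leaves half of that.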
Parallel rendering rationale
- At higher scaling (duplicating each pixel into a 4×4 block), performance drops
- Cause:
- Too many GOP Blt calls (~65,000 calls, roughly one per pixel of the 256×256 game buffer)
- Solution:
- Use EFI MP Services to render in parallel across multiple cores
Graphics pipeline features (GOP usage + custom “PPU” design)
GOP mode selection and framebuffer behavior
- Queries supported modes and selects 1280×1024
- GOP mode setting may clear the buffer depending on the system
Framebuffer writing via Blt
Supported operations include:
- Filling rectangles with a color
- 24-bit color, stored in memory in blue, green, red byte order (a GOP Blt pixel occupies 32 bits, with the fourth byte reserved)
- Copying raw pixel buffers into the framebuffer
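To make the pixel layout concrete, here is a small C sketch of a Blt-style pixel and a software rectangle fill. The struct mirrors the byte order of the UEFI Blt pixel (blue, green, red, reserved); the function names and the idea of filling a plain memory buffer are illustrative assumptions, since the real fill is performed by the firmware's Blt call.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One pixel as laid out in memory: blue first, then green, red,
   and a reserved byte. */
typedef struct {
    uint8_t blue;
    uint8_t green;
    uint8_t red;
    uint8_t reserved;
} blt_pixel_t;

/* Build a pixel from conventional R, G, B arguments. */
static blt_pixel_t make_pixel(uint8_t red, uint8_t green, uint8_t blue) {
    blt_pixel_t p = { blue, green, red, 0 };
    return p;
}

/* Fill a w x h rectangle whose top-left corner is (x, y) in a
   buffer that holds `stride` pixels per row. */
static void fill_rect(blt_pixel_t *buf, size_t stride,
                      size_t x, size_t y, size_t w, size_t h,
                      blt_pixel_t color) {
    assert(x + w <= stride);        /* rectangle must fit in a row */
    for (size_t row = y; row < y + h; row++) {
        for (size_t col = x; col < x + w; col++) {
            buf[row * stride + col] = color;
        }
    }
}
```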
Tile/sprite engine (“PPU-like” design)
Planned retro-style rendering pipeline:
- Two background tile layers
- A sprite layer rendered in between
- Top layer supports transparency:
- Skips transparent/black tiles
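The layer order described above can be sketched as a per-pixel compositor in C. This is a simplification under stated assumptions: the summary mentions skipping transparent/black tiles, whereas this sketch tests individual pixels, treats the value 0 as transparent for both the sprite layer and the top layer, and uses hypothetical names throughout.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TRANSPARENT_PIXEL 0u   /* assumption: black doubles as transparent */

/* Combine three layers of `count` pixels each, back to front:
   lower background, sprites, then the top background layer. */
static void compose_layers(const uint32_t *back,
                           const uint32_t *sprites,
                           const uint32_t *front,
                           uint32_t *out, size_t count) {
    assert(back && sprites && front && out);
    for (size_t i = 0; i < count; i++) {
        uint32_t px = back[i];                 /* always drawn */
        if (sprites[i] != TRANSPARENT_PIXEL) {
            px = sprites[i];                   /* sprite covers background */
        }
        if (front[i] != TRANSPARENT_PIXEL) {
            px = front[i];                     /* top layer covers both */
        }
        out[i] = px;
    }
}
```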
Buffer/scaling details referenced:
- Game output buffer: 256×256
- Scaling via GOP:
- Expands each pixel into a 4×4 square
- Aims to reach 1024×1024 within a larger 1280×1024 display
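The 4×4 expansion is a nearest-neighbor upscale, sketched below in C with hypothetical names. The centering helper is an assumption: the summary only says the 1024×1024 image sits within a 1280×1024 display, not where it is placed horizontally.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Expand every source pixel into a scale x scale square.
   dst must hold (src_w * scale) * (src_h * scale) pixels. */
static void upscale(const uint32_t *src, size_t src_w, size_t src_h,
                    size_t scale, uint32_t *dst) {
    assert(scale > 0);
    size_t dst_w = src_w * scale;
    size_t dst_h = src_h * scale;
    for (size_t y = 0; y < dst_h; y++) {
        for (size_t x = 0; x < dst_w; x++) {
            /* Integer division maps each output pixel back to its source. */
            dst[y * dst_w + x] = src[(y / scale) * src_w + (x / scale)];
        }
    }
}

/* Left margin that would center an image on a wider screen. */
static size_t centered_offset(size_t screen_size, size_t image_size) {
    assert(image_size <= screen_size);
    return (screen_size - image_size) / 2;
}
```

With a 256×256 source and a scale of 4 the output is 1024×1024, and centering it on a 1280-pixel-wide mode would leave a 128-pixel margin on each side.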
Scaling performance strategy
- Naive approach (per-pixel Blt) reduced FPS
- Optimized approach:
- APs render into an intermediate larger buffer (e.g., 1024×1024) by assigning scanlines to cores
- BSP then blits the final rendered buffer once per frame
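One way to assign scanlines to cores is to give each worker a contiguous band of output rows, as in the C sketch below. The names, the band-based split, and the worker counts are assumptions for illustration; the summary does not say how rows are distributed, and the actual dispatch through EFI MP Services is not shown. Here the "workers" are just successive calls that could run independently because their bands never overlap.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Half-open range of output rows: [first, end). */
typedef struct {
    size_t first;
    size_t end;
} row_range_t;

/* Split total_rows into worker_count contiguous bands whose sizes
   differ by at most one row. */
static row_range_t rows_for_worker(size_t total_rows, size_t worker_count,
                                   size_t worker_index) {
    assert(worker_count > 0 && worker_index < worker_count);
    size_t base = total_rows / worker_count;
    size_t extra = total_rows % worker_count;  /* first `extra` bands get +1 */
    row_range_t r;
    r.first = worker_index * base
            + (worker_index < extra ? worker_index : extra);
    r.end = r.first + base + (worker_index < extra ? 1 : 0);
    return r;
}

/* Upscale only the output rows in `rows`; safe to run concurrently
   with other bands because each writes a disjoint part of dst. */
static void render_band(const uint32_t *src, size_t src_w, size_t scale,
                        uint32_t *dst, row_range_t rows) {
    size_t dst_w = src_w * scale;
    for (size_t y = rows.first; y < rows.end; y++) {
        for (size_t x = 0; x < dst_w; x++) {
            dst[y * dst_w + x] = src[(y / scale) * src_w + (x / scale)];
        }
    }
}
```

Once every band has been rendered, the finished buffer can be handed to a single Blt call, which is what removes the per-pixel call overhead.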
Input system findings (and workaround)
Keyboard input limitation
- UEFI keyboard input event model is described as poor for real-time games
- Behavior issues:
- Key press sets an expiration timer and repeats based on event behavior
- Holding/tapping can cause repeated press states longer than desired
More game-friendly input via mouse protocol
- Uses Simple Pointer Protocol → GetState()
- Converts relative mouse movement into joystick-style control:
- Left click = unlock
- Movement = joystick axes
- Right click = fire
- Result:
- Much more responsive ship control
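The mouse-to-joystick mapping can be sketched in C as follows. The input struct mirrors the fields of the UEFI Simple Pointer state (three signed relative movements and two button flags); the output struct, the clamping limit, and all names are hypothetical, and the summary does not explain what "unlock" does beyond tying it to the left button.

```c
#include <assert.h>
#include <stdint.h>

/* Fields reported by a pointer device since the last query. */
typedef struct {
    int32_t relative_x;
    int32_t relative_y;
    int32_t relative_z;
    uint8_t left_button;
    uint8_t right_button;
} pointer_state_t;

/* Joystick-style view of the same data (hypothetical). */
typedef struct {
    int32_t axis_x;
    int32_t axis_y;
    int unlock;   /* left click */
    int fire;     /* right click */
} joystick_t;

/* Limit a relative movement to the range [-limit, limit]. */
static int32_t clamp_axis(int32_t movement, int32_t limit) {
    assert(limit >= 0);
    if (movement > limit) return limit;
    if (movement < -limit) return -limit;
    return movement;
}

/* Translate one pointer reading into joystick axes and buttons. */
static joystick_t to_joystick(const pointer_state_t *s, int32_t limit) {
    joystick_t j;
    j.axis_x = clamp_axis(s->relative_x, limit);
    j.axis_y = clamp_axis(s->relative_y, limit);
    j.unlock = s->left_button != 0;
    j.fire = s->right_button != 0;
    return j;
}
```

Because the state is polled rather than delivered through key-repeat events, each frame sees the current button and movement values directly, which fits the responsiveness the summary reports.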
Hardware / product context from the video
- Device used: GEEKOM A9 MAX
- AMD Ryzen AI 9 HX370
- 2TB SSD
- Dual-channel 32GB DDR5
- Wi‑Fi 7
- Multiple USB/Ethernet/HDMI ports, SD slot
- Multi-core CPU supports aggressive parallel rendering
- Firmware requirement handled:
- Disables Secure Boot to allow custom UEFI app execution
- Performance context:
- The system was used earlier for CPU/GPU benchmarking and is described as capable for heavy tasks and high-FPS testing in the demo
Key “review / guide / tutorial” style elements (what the video teaches)
- Writing a UEFI application in (x64) assembly
- Navigating tables/protocols to call OutputString
- Using boot/runtime services via UEFI tables
- Packaging and booting
- GPT + EFI System Partition layout
- Naming/placement: EFI/BOOT/BOOTX64.EFI
- Timing correctly in UEFI
- Why UEFI GetTime() may have low resolution
- Using RDTSC for frame-accurate loops
- Rendering using GOP
- Set modes and query supported resolutions
- Use Blt for framebuffer writes
- Parallelizing inside UEFI
- Use EFI MP Services and APs for concurrent scanline rendering
- Getting better input for a game
- Keyboard event shortcomings
- Use Simple Pointer Protocol as an alternative input source
Main speakers / sources
- Speaker: The video creator (author of the “Programming in Assembly without an Operating System” demo), who narrates and implements “Space Game for x64”
- Hardware source: GEEKOM (GEEKOM A9 MAX)
- Firmware/tooling sources referenced:
- EFI/UEFI's origins at Intel, and the open-source implementation maintained by TianoCore volunteers
- Tools mentioned: EFI Development Kit II, ML64, MSVC linker
- Project/source code location:
- GitHub (Space Game for x64)
- Patreon/YouTube membership follow-up
Category
Technology