ARM Semihosting (Part 1) - Introduction
In this series I explore ARM Semihosting and work toward a small C library and host-side tools to track and visualize heap activity on a live target. For development and testing I’ll use the RP2350 (Raspberry Pi Pico 2), but the concepts apply broadly to other ARM platforms.
Introduction to ARM Semihosting
ARM semihosting lets code running on a target MCU request services from the host (console I/O, file I/O, system queries, etc.) via the debugger. The target signals a semihosting request by executing a trap instruction (typically BKPT or SVC, depending on the architecture profile). The debugger intercepts the trap, performs the requested operation on behalf of the target, returns the result (usually in R0), and resumes execution.
On M-profile devices (ARMv6-M, ARMv7-M, and ARMv8-M, which includes the RP2350’s Cortex-M33), semihosting traps are typically generated with BKPT. On other ARM profiles the mechanism may use SVC/SWI. The observable effect is the same: the debugger handles the request and returns a result to the target.
Semihosting is often useful during board bring-up when serial/UART hardware is not yet available, since it lets you get printf-like output and file operations over the debug connection. There are alternatives with different trade-offs (SWO, Segger RTT, dedicated UART/USB logging), and semihosting has limitations I cover below, but it remains a convenient tool during early development.
How it works (brief)
There are various blog posts[1][2] and documentation[3] that do a much better job of describing in detail how semihosting works, but at a high level we have the following:
- The target executes a special instruction that marks a semihosting operation. At that point R0 contains the operation number, and any parameter is passed via R1 (either a single value or a pointer to a block of arguments).
- This causes the CPU to halt. It is now the debugger’s turn to read R0 (and optionally R1), perform the requested operation, return the result to the target in R0, and resume the core.
- Some common operations that the debugger might support via semihosting are: SYS_OPEN / SYS_CLOSE, SYS_READ / SYS_WRITE…
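The steps above can be sketched in C roughly as follows. This is a minimal sketch, not a definitive implementation: the names semihost_call and semihost_puts are hypothetical, the BKPT 0xAB path applies to M-profile cores only, and the non-ARM branch is a stand-in I added purely so the snippet also builds and runs on a desktop machine.

```c
#include <stdint.h>
#include <stdio.h>

/* Operation number from the ARM semihosting specification. */
#define SYS_WRITE0 0x04u /* write a NUL-terminated string to the debug console */

/*
 * Issue one semihosting request (hypothetical helper name).
 * R0 carries the operation number, R1 the parameter or a pointer to it,
 * and the debugger leaves the result in R0 before resuming the core.
 */
static inline int32_t semihost_call(uint32_t op, const void *arg)
{
#if defined(__ARM_ARCH_PROFILE) && (__ARM_ARCH_PROFILE == 'M')
    register uint32_t r0 __asm__("r0") = op;
    register const void *r1 __asm__("r1") = arg;
    /* BKPT 0xAB is the M-profile semihosting trap; the core halts here. */
    __asm__ volatile("bkpt 0xAB" : "+r"(r0) : "r"(r1) : "memory");
    return (int32_t)r0;
#else
    /* Off-target stand-in (assumption for illustration): emulate SYS_WRITE0 only. */
    if (op == SYS_WRITE0 && arg != NULL) {
        fputs((const char *)arg, stdout);
        return 0;
    }
    return -1; /* operation not emulated */
#endif
}

/* Hypothetical convenience wrapper: print a string on the host console. */
static inline void semihost_puts(const char *s)
{
    /* SYS_WRITE0 defines no return value, so the result is ignored. */
    (void)semihost_call(SYS_WRITE0, s);
}
```

On a real target, calling semihost_puts("hello\n") with no debugger attached would hit the fault case discussed under Pros and Cons, so a wrapper like this should only be reached when a debugger is known to be present.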
Pros and Cons
- Pros: You usually already have SWD/JTAG connected during development, so semihosting requires no additional board wiring. It provides convenient, host-backed I/O without a serial port.
- Cons: Semihosting is blocking: the CPU halts while the debugger performs the operation. If no debugger is attached (or the debugger doesn’t implement the requested call), the trap can cause a fault unless you explicitly handle or detect that case. Production firmware therefore needs defensive code or build configuration to ensure it never relies on semihosting.
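One common way to detect the "no debugger attached" case on Cortex-M3/M4/M33-class cores is to check the C_DEBUGEN bit of the Debug Halting Control and Status Register (DHCSR) before trapping. The sketch below assumes that approach. The helper names are hypothetical, the register may not be readable from software on ARMv6-M parts, and the bit can stay set after a probe disconnects until the next power-on reset, so treat this as a hint rather than a guarantee.

```c
#include <stdbool.h>
#include <stdint.h>

/* Debug Halting Control and Status Register in the System Control Space. */
#define DHCSR_ADDR      0xE000EDF0u
#define DHCSR_C_DEBUGEN (1u << 0) /* set by the debugger to enable halting debug */

/* Pure helper (hypothetical name) so the bit test can be exercised off-target. */
static inline bool dhcsr_debug_enabled(uint32_t dhcsr)
{
    return (dhcsr & DHCSR_C_DEBUGEN) != 0u;
}

/* Returns true when halting debug appears to be enabled on this core. */
static inline bool debugger_attached(void)
{
#if defined(__ARM_ARCH_PROFILE) && (__ARM_ARCH_PROFILE == 'M')
    return dhcsr_debug_enabled(*(volatile const uint32_t *)DHCSR_ADDR);
#else
    return false; /* off-target build: there is no debug register to read */
#endif
}
```

Guarding every semihosting call with a check like debugger_attached() lets the firmware skip the trap instead of faulting when it runs standalone.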
Alternatives to semihosting:
- SWO: Good for streaming debug output with lower overhead than semihosting, but it requires the SWO pin to be available and a debugger that supports it.
- Segger RTT: Non-blocking and very flexible for real-time logging and data transfer over SWD. It was designed by Segger around their RTT library and J-Link tooling, although other debug tools such as OpenOCD and probe-rs also implement the protocol.
Semihosting Project Idea
Heap instrumentation and visualization
The goal is to use semihosting to instrument heap activity on a running target and produce a visual timeline of allocations and deallocations. With this I want to be able to answer two questions for a running target:
- How many allocations and deallocations occur over time?
- What does the heap’s layout look like as the system runs?
My hope is to have an implementation that looks as follows:
- Reserve (statically) a small circular buffer in RAM to record heap events.
- Wrap malloc/realloc/free (or the platform’s allocator hooks) to capture basic metadata for each operation and append an event to the buffer.
- When the buffer fills or a flush condition is met, issue a semihosting transfer to dump the buffered events to the host (only if a debugger is attached).
- Implement a host-side tool to parse the event stream and generate time-series graphs and heap-layout visualizations.
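To make the plan a little more concrete, here is a rough sketch of the target-side pieces: a statically reserved circular buffer, an event record, and wrappers around malloc and free. Every name, the event layout, and the buffer size are assumptions for illustration only, and the flush is a placeholder that just counts and discards events where a real version would hand them to the host over semihosting.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical event kinds and record layout. */
typedef enum { HEAP_EV_MALLOC = 1, HEAP_EV_FREE = 2 } heap_ev_kind_t;

typedef struct {
    uint32_t  seq;   /* monotonically increasing event number */
    uint32_t  kind;  /* one of heap_ev_kind_t */
    uintptr_t addr;  /* block address returned by or passed to the allocator */
    uint32_t  size;  /* requested size in bytes; 0 for free */
} heap_event_t;

#define HEAP_LOG_CAPACITY 64u /* assumed size; tune to the available RAM */

static heap_event_t heap_log[HEAP_LOG_CAPACITY]; /* statically reserved ring */
static uint32_t heap_log_head;    /* next slot to write */
static uint32_t heap_log_count;   /* events currently buffered */
static uint32_t heap_log_seq;     /* total events recorded so far */
static uint32_t heap_log_flushes; /* number of flushes performed */

/* Placeholder flush: a real one would transfer the buffered events to the
 * host (only if a debugger is attached) before marking the ring empty. */
static void heap_log_flush(void)
{
    if (heap_log_count == 0u) {
        return;
    }
    heap_log_flushes++;
    heap_log_count = 0u;
}

/* Append one event, flushing first if the ring is full. */
static void heap_log_push(heap_ev_kind_t kind, const void *addr, size_t size)
{
    if (heap_log_count == HEAP_LOG_CAPACITY) {
        heap_log_flush();
    }
    heap_event_t *ev = &heap_log[heap_log_head];
    ev->seq  = heap_log_seq++;
    ev->kind = (uint32_t)kind;
    ev->addr = (uintptr_t)addr;
    ev->size = (uint32_t)size;
    heap_log_head = (heap_log_head + 1u) % HEAP_LOG_CAPACITY;
    heap_log_count++;
}

/* Wrappers that record an event around the real allocator calls. */
void *traced_malloc(size_t size)
{
    void *p = malloc(size);
    heap_log_push(HEAP_EV_MALLOC, p, size);
    return p;
}

void traced_free(void *p)
{
    heap_log_push(HEAP_EV_FREE, p, 0u);
    free(p);
}
```

With a GNU toolchain, one possible way to route existing malloc and free calls through wrappers like these without touching application code is the linker's --wrap option, which redirects calls to __wrap_malloc and exposes the original as __real_malloc. Whether that or an allocator hook fits best will depend on the platform's C library.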
Development target
The library should be portable across ARM platforms. For development I’ll use the Raspberry Pi Pico 2 (RP2350). Initially I’ll use a single Cortex‑M33 core for the instrumented firmware; later I may explore offloading data transfer to the second core or using a non-blocking transport to avoid halting the main application.