Kernel: ELF Loader
Overview
The IRIX kernel's ELF loader manages the execution of ELF-format binaries on MIPS-based systems, accommodating 32-bit and 64-bit architectures along with multiple ABIs. It validates file headers, prepares the user stack and arguments, maps memory segments, handles dynamic linking, and applies hardware-specific adjustments. To do this it relies on other kernel components: address space handling, virtual memory allocation, process control, and resource management.
The loader supports ABI transitions during execution, resource pre-allocation for constrained environments, and extensions for MIPS hardware errata. It processes both static and dynamic executables, coordinating with the runtime linker for shared libraries.
Key Functions
The loader comprises several core operations that together carry out an exec. Each is described below by its role, inputs, logic, and outputs.
Header Retrieval and Validation
This function reads the ELF header and program headers from the executable file. It validates essential fields such as the magic number, class (32-bit or 64-bit), data encoding, file type (executable or dynamic), and MIPS-specific architecture flags. It determines the ABI version based on machine type and flags, ensuring compatibility with the system's capabilities. On success, it provides the parsed headers for further processing; failures result in errors like invalid executable format or bad request due to unsupported architecture.
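The checks described above can be illustrated with a minimal sketch. It is not the kernel's code: the function name `check_elf_header` and the `HDR_*` result codes are hypothetical, the MIPS architecture flag and ABI checks are left out, and big-endian encoding is assumed because IRIX runs MIPS in big-endian mode. The ELF constants and field offsets come from the generic ELF specification.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Standard ELF constants (generic ELF specification). */
enum { EI_CLASS = 4, EI_DATA = 5, EI_NIDENT = 16 };
enum { ELFCLASS32 = 1, ELFCLASS64 = 2, ELFDATA2MSB = 2 };
enum { ET_EXEC = 2, ET_DYN = 3, EM_MIPS = 8 };

/* Hypothetical result codes for this sketch; not the kernel's errno values. */
enum hdr_status { HDR_OK = 0, HDR_BAD_FORMAT = -1, HDR_BAD_ARCH = -2 };

/* Read a big-endian 16-bit field. */
static uint16_t be16(const unsigned char *p)
{
    return (uint16_t)((p[0] << 8) | p[1]);
}

/*
 * Validate the leading bytes of an ELF file image.
 * On success, *elf_class receives ELFCLASS32 or ELFCLASS64.
 */
enum hdr_status check_elf_header(const unsigned char *buf, size_t len,
                                 int *elf_class)
{
    assert(buf != NULL && elf_class != NULL);

    /* Need e_ident plus e_type and e_machine (2 bytes each). */
    if (len < EI_NIDENT + 4)
        return HDR_BAD_FORMAT;
    if (buf[0] != 0x7f || buf[1] != 'E' || buf[2] != 'L' || buf[3] != 'F')
        return HDR_BAD_FORMAT;
    if (buf[EI_CLASS] != ELFCLASS32 && buf[EI_CLASS] != ELFCLASS64)
        return HDR_BAD_FORMAT;
    if (buf[EI_DATA] != ELFDATA2MSB)
        return HDR_BAD_FORMAT;

    /* e_type and e_machine sit at the same offsets in both ELF classes. */
    uint16_t type = be16(buf + 16);
    uint16_t machine = be16(buf + 18);
    if (type != ET_EXEC && type != ET_DYN)
        return HDR_BAD_FORMAT;
    if (machine != EM_MIPS)
        return HDR_BAD_ARCH;

    *elf_class = buf[EI_CLASS];
    return HDR_OK;
}
```

A caller would use the returned class to choose between the 32-bit and 64-bit handling paths.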
ELF Execution Dispatch
The primary entry point for ELF binary execution dispatches on the file's class to handle 32-bit or 64-bit specifics. It retrieves the headers and scans the program headers for dynamic linking indicators, auxiliary setup needs, and hardware options. It then calculates the auxiliary vector size, allocates kernel space for argument building, constructs the user stack, and reserves memory if required for batch processing. The work is split into two phases to limit stack usage; the second phase, described next, removes the old image and creates the new address space. On error, the loader cleans up and restores the original state.
Execution Completion
Continuing from the dispatch, this phase removes the existing process image, creates a new address space with affinity considerations, and maps segments. For dynamic binaries, it loads the runtime linker, builds extended auxiliary vectors, and sets up the stack with execution parameters. It configures registers, including floating-point unit settings based on architecture, and transfers control to the entry point. Failures trigger process termination with detailed logging.
Segment Mapping
This operation maps ELF segments into the address space, focusing on loadable (PT_LOAD) segments. It computes protections (read, write, execute) and flags (local mapping, heap consideration), handles zero-fill regions, and applies workarounds for hardware issues in executable pages. It also identifies the special program headers for dynamic linking, shared libraries, and options. Segments are mapped as file-backed regions, and the mapping calls support checkpointing. Errors include memory shortages and invalid segment ordering.
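As a rough illustration of the protection and zero-fill computation, the sketch below derives both from a program header's fields. The `PF_*` bits are the standard ELF segment flags; the `SK_PROT_*` values, `struct seg_plan`, and `plan_segment` are names invented for this example and do not reflect the kernel's internal types.

```c
#include <assert.h>
#include <stdint.h>

/* ELF segment permission bits (generic ELF specification). */
#define PF_X 0x1u
#define PF_W 0x2u
#define PF_R 0x4u

/* Hypothetical protection bits for this sketch (mmap-style values). */
#define SK_PROT_READ  0x1u
#define SK_PROT_WRITE 0x2u
#define SK_PROT_EXEC  0x4u

/* What a mapper needs to know about one loadable segment. */
struct seg_plan {
    uint32_t prot;        /* SK_PROT_* bits */
    uint64_t file_bytes;  /* bytes backed by the file */
    uint64_t zero_bytes;  /* trailing bytes to zero-fill (e.g. .bss) */
};

struct seg_plan plan_segment(uint32_t p_flags, uint64_t p_filesz,
                             uint64_t p_memsz)
{
    struct seg_plan plan = {0, 0, 0};

    /* A segment can never occupy less memory than it has file data. */
    assert(p_memsz >= p_filesz);

    if (p_flags & PF_R)
        plan.prot |= SK_PROT_READ;
    if (p_flags & PF_W)
        plan.prot |= SK_PROT_WRITE;
    if (p_flags & PF_X)
        plan.prot |= SK_PROT_EXEC;

    plan.file_bytes = p_filesz;
    plan.zero_bytes = p_memsz - p_filesz;
    return plan;
}
```

For a data segment with `p_filesz` of 0x1000 and `p_memsz` of 0x3000, this yields a read-write plan with 0x2000 bytes of zero-fill.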
User-Level Segment Mapping
Accessible from user space, this function maps segments for shared objects and is typically invoked by the runtime linker. It validates the file descriptor and permissions, copies the program headers in from user space, and performs the mappings, applying hardware workarounds where needed. If a mapping fails for lack of memory at the requested addresses, it attempts to relocate the object to an alternate address range. On success it returns the base address; on failure it unmaps any partial allocations.
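The relocation step can be pictured as shifting every segment by the same delta so that the object's internal layout is preserved. The following is only a sketch under that assumption; `struct seg` and `relocate_segments` are hypothetical names, and the search for a free address range is not shown.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Minimal view of a loadable segment for this sketch. */
struct seg {
    uint64_t vaddr;
    uint64_t memsz;
};

/*
 * Move all segments so that the lowest one starts at new_base,
 * keeping the distances between segments unchanged.
 * Returns the new base address.
 */
uint64_t relocate_segments(struct seg *segs, size_t n, uint64_t new_base)
{
    assert(segs != NULL && n > 0);

    /* Find the lowest requested address: the object's original base. */
    uint64_t lowest = segs[0].vaddr;
    for (size_t i = 1; i < n; i++) {
        if (segs[i].vaddr < lowest)
            lowest = segs[i].vaddr;
    }

    /* Apply the same offset from the base to every segment. */
    for (size_t i = 0; i < n; i++)
        segs[i].vaddr = (segs[i].vaddr - lowest) + new_base;

    return new_base;
}
```

Because only the base changes, the caller can report the returned address to the runtime linker, which then applies its own relocations relative to it.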
Memory Requirement Calculation
These utilities estimate the physical memory needed for execution by summing the requirements of the writable segments, the user stack, and the runtime linker. The computation accounts for page alignment. The result is cached for repeated use and recomputed when the underlying file changes. This supports early reservation in resource-managed environments, so that an exec does not fail for lack of memory partway through.
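A simplified version of this estimate is sketched below. It counts the pages touched by each writable PT_LOAD segment and adds fixed stack and runtime linker figures supplied by the caller. The structure and function names are illustrative, not the kernel's, and the real computation may account for more than is shown here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PT_LOAD 1u    /* loadable segment (generic ELF specification) */
#define PF_W    0x2u  /* writable segment flag */

/* Reduced program header for this sketch. */
struct phdr {
    uint32_t type;
    uint32_t flags;
    uint64_t vaddr;
    uint64_t memsz;
};

/* Number of pages touched by [vaddr, vaddr + memsz). */
uint64_t pages_spanned(uint64_t vaddr, uint64_t memsz, uint64_t page_size)
{
    assert(page_size > 0);
    if (memsz == 0)
        return 0;
    uint64_t first = vaddr / page_size;
    uint64_t last = (vaddr + memsz - 1) / page_size;
    return last - first + 1;
}

/* Sum pages for writable PT_LOAD segments, the stack, and the linker. */
uint64_t exec_pages_needed(const struct phdr *ph, size_t n,
                           uint64_t stack_pages, uint64_t rld_pages,
                           uint64_t page_size)
{
    assert(ph != NULL || n == 0);

    uint64_t total = stack_pages + rld_pages;
    for (size_t i = 0; i < n; i++) {
        if (ph[i].type == PT_LOAD && (ph[i].flags & PF_W))
            total += pages_spanned(ph[i].vaddr, ph[i].memsz, page_size);
    }
    return total;
}
```

Note that a small segment straddling a page boundary still costs two pages, which is why the count is based on the first and last page touched rather than on `memsz` alone.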
Hardware Workaround Check
For specific CPU errata, this check scans the program headers for flags declaring that the binary is free of the problematic instructions. If no such flag is present on an affected CPU, it marks the mapping to go through a proxy mechanism that mitigates the problem in read-only executable regions.
Undocumented or IRIX-Specific Interfaces and Behaviors
MIPS Hardware Workarounds in PT_MIPS_OPTIONS
IRIX scans PT_MIPS_OPTIONS segments for hardware patch flags:
- OHW_R5KCVTL (0x8): Indicates the binary is clean of the R5000 cvt.[ds].l bug. If absent on affected CPUs (with R5000_cvt_war enabled, typically n32/n64 ABIs), the kernel uses the mtext subsystem (proxy vnodes) to rewrite problematic instructions in read-only executable pages during mapping.
- OHW_R8KPFETCH (TFP_PREFETCH_WAR): On TFP (R8000) CPUs, non-dynamic MIPS4 binaries with prefetch instructions are rejected (EBADRQC) unless chk_ohw_r8kpfetch is non-zero. For dynamic binaries, the kernel sets a flag to add AT_PFETCHFD to auxv (see below).
These flags trigger kernel-level instruction patching or rejection, behavior that is not documented in the public ELF ABI supplements.
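The R5000 decision described above reduces to a small predicate, sketched here. The value 0x8 for OHW_R5KCVTL is taken from the text; the function name and parameters are hypothetical, and the parsing of the options data that would produce `hwpatch_flags` is not shown.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define OHW_R5KCVTL 0x8u  /* binary is clean of the R5000 cvt.[ds].l bug */

/*
 * Decide whether read-only executable pages must be mapped through
 * the instruction-rewriting proxy.
 *
 * r5000_cvt_war      - workaround enabled on this system
 * have_hwpatch_flags - a hardware-patch option was found in the binary
 * hwpatch_flags      - its flag word (must be 0 if none was found)
 */
bool needs_cvt_rewrite(bool r5000_cvt_war, bool have_hwpatch_flags,
                       uint32_t hwpatch_flags)
{
    assert(have_hwpatch_flags || hwpatch_flags == 0);

    if (!r5000_cvt_war)
        return false;  /* CPU not affected or workaround disabled */
    if (have_hwpatch_flags && (hwpatch_flags & OHW_R5KCVTL))
        return false;  /* binary declares itself clean */
    return true;       /* unmarked binary on an affected CPU */
}
```

The default is deliberately conservative: a binary with no options data at all is treated as potentially affected.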
Extended Auxiliary Vector Entries
- AT_PFETCHFD: Provides an open file descriptor to the executable for the runtime linker (rld) to apply prefetch instruction patches (no-ops) on TFP systems. Added when a dynamic binary requires the prefetch workaround.
Standard auxv entries (AT_PHDR, AT_BASE, etc.) are used, but AT_PFETCHFD is an IRIX-specific extension.
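To show how the optional entry affects the vector's size, here is a minimal sketch of building an auxiliary vector. The standard tag numbers are those of the System V ABI. The numeric value of AT_PFETCHFD is not given in this document, so `SK_AT_PFETCHFD` below is a placeholder, and `struct exec_info` and `build_auxv` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Standard auxiliary vector tags (System V ABI). */
enum { AT_NULL = 0, AT_PHDR = 3, AT_PHENT = 4, AT_PHNUM = 5,
       AT_PAGESZ = 6, AT_BASE = 7, AT_ENTRY = 9 };

/* Placeholder only: the real AT_PFETCHFD number is not stated here. */
enum { SK_AT_PFETCHFD = 1000 };

struct auxv_entry {
    uint64_t tag;
    uint64_t val;
};

/* Hypothetical bundle of values gathered during exec. */
struct exec_info {
    uint64_t phdr, phent, phnum, pagesz, base, entry;
    int needs_pfetch_fd;  /* dynamic binary needs the prefetch patch */
    int pfetch_fd;        /* open descriptor on the executable */
};

/*
 * Fill out[] with the auxiliary vector, terminated by AT_NULL.
 * Returns the number of entries written, or 0 if cap is too small.
 */
size_t build_auxv(const struct exec_info *ei, struct auxv_entry *out,
                  size_t cap)
{
    assert(ei != NULL && out != NULL);

    /* Six fixed entries, one optional entry, one terminator. */
    size_t need = 6 + (ei->needs_pfetch_fd ? 1 : 0) + 1;
    if (cap < need)
        return 0;

    size_t n = 0;
    out[n++] = (struct auxv_entry){AT_PHDR, ei->phdr};
    out[n++] = (struct auxv_entry){AT_PHENT, ei->phent};
    out[n++] = (struct auxv_entry){AT_PHNUM, ei->phnum};
    out[n++] = (struct auxv_entry){AT_PAGESZ, ei->pagesz};
    out[n++] = (struct auxv_entry){AT_BASE, ei->base};
    out[n++] = (struct auxv_entry){AT_ENTRY, ei->entry};
    if (ei->needs_pfetch_fd)
        out[n++] = (struct auxv_entry){SK_AT_PFETCHFD,
                                       (uint64_t)ei->pfetch_fd};
    out[n++] = (struct auxv_entry){AT_NULL, 0};
    return n;
}
```

Computing the entry count before writing anything mirrors the need to size the vector up front, before the user stack is laid out.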
Memory Reservation (availsmem) for Exec
- Functions elf_compute_availsmem and elf_get_availsmem precompute physical memory needs for writable segments (PF_W in PT_LOAD), user stack, and the runtime linker.
- Cached globally in runtime_loader_availsmem (recomputed if the runtime linker's vnode changes).
- For Miser batch jobs, memory is reserved early via vm_pool_exec_reservemem; if the reservation fails, the exec returns EMEMRETRY so that it can be retried.
This mechanism prevents mid-exec failures in memory-constrained environments.
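The caching behavior for the runtime linker's figure can be sketched as a single-entry cache keyed on the vnode's identity. Everything below is illustrative: `struct rld_cache`, `rld_pages_cached`, and the `demo_*` stand-ins are invented for the example, and any locking the kernel needs around the global value is omitted.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Single-entry cache: remembers which vnode the figure was computed for. */
struct rld_cache {
    const void *vnode;    /* identity of the runtime linker's file */
    uint64_t pages;       /* cached memory requirement */
    int valid;            /* nonzero once a value has been computed */
    unsigned recomputes;  /* how many times the value was (re)computed */
};

typedef uint64_t (*compute_fn)(const void *vnode);

/* Return the cached figure, recomputing if the vnode has changed. */
uint64_t rld_pages_cached(struct rld_cache *c, const void *vnode,
                          compute_fn compute)
{
    assert(c != NULL && compute != NULL);

    if (!c->valid || c->vnode != vnode) {
        c->pages = compute(vnode);
        c->vnode = vnode;
        c->valid = 1;
        c->recomputes++;
    }
    return c->pages;
}

/* Stand-in for a vnode and for the expensive header scan. */
struct demo_vnode {
    uint64_t pages;
};

uint64_t demo_compute(const void *vnode)
{
    return ((const struct demo_vnode *)vnode)->pages;
}
```

Since most execs use the same runtime linker, repeated lookups with an unchanged vnode return the stored value without rescanning the file.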
Mapping Interfaces
- execmap / mtext_execmap_vnode: Low-level mapping of file-backed regions with ZFOD, protections, and flags (e.g., MAP_LOCAL, MAP_BRK, MAP_PRIMARY).
- elfmap: User-called (from rld) to map DSO PT_LOAD segments. Supports alternate address relocation on ENOMEM by finding a free range and adjusting vaddrs.
- Integration with checkpointing (ckpt handles passed to mapping calls).
Other Internal Behaviors
- ABI transition handling: Temporarily sets process ABI proxy during exec; resets on failure.
- NUMA/affinity: New address space creation via aspm_exec_create and affinity link setup.
- FPU flags from PT_MIPS_OPTIONS (ODK_EXCEPTIONS): Sets precise exceptions or speculative mode in process FP flags.
- Runtime linker loading: Caches availsmem needs; sticky bit (VSVTX) on rld allows saving pregions.
- Error reporting: Detailed kill messages with reason codes for failed execs.
These features extend standard ELF handling with IRIX-specific resource management, hardware errata mitigation, and kernel-runtime linker coordination.