Null-Terminated Data Handling
Purpose and Scope
This document explains how the axptr library safely handles null-terminated data structures in user memory, such as C-style strings and arrays. These special data structures have variable length and are terminated by a sentinel "null" value rather than having an explicit length parameter. For information about general memory region checking, see Memory Region Checking.
Overview
Null-terminated data structures present unique challenges for safe memory access. Unlike fixed-size arrays, their length cannot be determined without scanning the memory until a null terminator is found. This requires special handling to ensure memory safety while efficiently accessing these structures.
flowchart TD
subgraph subGraph0["Null-terminated Data Handling"]
C["check_null_terminated()"]
D["Alignment Verification"]
E["Page-by-Page Scan"]
F["Return validated pointer + length"]
end
A["Kernel Code"]
B["User Memory Pointer"]
G["Safe Access Methods"]
H["Null-terminated arrays"]
I["C-strings"]
A --> B
B --> C
B --> G
C --> D
D --> E
E --> F
G --> H
G --> I
Sources: src/lib.rs(L56 - L107) src/lib.rs(L204 - L217) src/lib.rs(L282 - L292) src/lib.rs(L294 - L303)
Core Mechanism
The axptr library implements a specialized mechanism for safely handling null-terminated data from user space. This is performed by the check_null_terminated function.
sequenceDiagram
participant KernelCode as "Kernel Code"
participant UserPtrUserConstPtr as "UserPtr/UserConstPtr"
participant check_null_terminated as "check_null_terminated()"
participant UserMemory as "User Memory"
KernelCode ->> UserPtrUserConstPtr: get_as_null_terminated(aspace)
UserPtrUserConstPtr ->> check_null_terminated: check address space & memory
check_null_terminated ->> check_null_terminated: Check alignment
check_null_terminated ->> check_null_terminated: Set up page tracking
loop For each byte until null terminator
check_null_terminated ->> check_null_terminated: Check if current position crosses page boundary
alt Crosses page boundary
check_null_terminated ->> check_null_terminated: Check if new page is accessible
check_null_terminated ->> check_null_terminated: Move to next page
end
check_null_terminated ->> UserMemory: Read memory (with fault handling)
UserMemory -->> check_null_terminated: Return value
alt Value equals terminator
check_null_terminated ->> check_null_terminated: Stop scanning
else Value not terminator
check_null_terminated ->> check_null_terminated: Increment position & counter
end
end
check_null_terminated ->> UserPtrUserConstPtr: Return pointer & length
UserPtrUserConstPtr ->> KernelCode: Return safe slice reference
Sources: src/lib.rs(L56 - L107)
Memory Layout Processing
The function processes null-terminated data by checking memory one page at a time, efficiently handling arbitrarily long data structures without needing to know their size in advance.
- Alignment Check: Ensures the starting address is properly aligned for the specified type.
- Page-by-Page Processing: Handles memory in page-sized chunks, validating each page before access.
- Safe Memory Reading: Uses the access_user_memory function to safely read user memory with proper fault handling.
- Terminator Detection: Scans until it finds the terminator value (default value of type T).
The function returns a raw pointer to the start of the data and its length (excluding the terminator).
Sources: src/lib.rs(L56 - L107)
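The steps above can be sketched in a self-contained form. This is a simplified model, not the code in src/lib.rs: `FakeMemory`, `ScanError`, `scan_null_terminated`, and the 4096-byte `PAGE_SIZE` are hypothetical names and values introduced only for illustration, and the simulated address space stands in for real page-table checks and fault handling.

```rust
use core::mem::{align_of, size_of};

/// Page size assumed for this sketch only.
const PAGE_SIZE: usize = 4096;

#[derive(Debug, PartialEq)]
enum ScanError {
    Unaligned,
    Fault,
}

/// Hypothetical stand-in for a user address space: `data[i]` lives at address
/// `base + i * size_of::<T>()`, and only the first `mapped_pages` pages
/// starting at `base` (assumed page-aligned) are accessible.
struct FakeMemory<T> {
    base: usize,
    data: Vec<T>,
    mapped_pages: usize,
}

impl<T: Copy> FakeMemory<T> {
    /// Models the per-page permission check.
    fn page_accessible(&self, page_addr: usize) -> bool {
        page_addr >= self.base && page_addr < self.base + self.mapped_pages * PAGE_SIZE
    }

    /// Models a fault-tolerant read of one element.
    fn read(&self, addr: usize) -> Result<T, ScanError> {
        let offset = addr.checked_sub(self.base).ok_or(ScanError::Fault)?;
        self.data
            .get(offset / size_of::<T>())
            .copied()
            .ok_or(ScanError::Fault)
    }
}

/// Returns the number of elements before the terminator (`T::default()`).
fn scan_null_terminated<T: Copy + Eq + Default>(
    mem: &FakeMemory<T>,
    start: usize,
) -> Result<usize, ScanError> {
    assert!(size_of::<T>() > 0, "zero-sized types are not supported");
    // Step 1: alignment check.
    if start % align_of::<T>() != 0 {
        return Err(ScanError::Unaligned);
    }
    let terminator = T::default();
    let mut addr = start;
    let mut len = 0;
    // Exclusive end of the address range whose pages have been validated.
    let mut checked_until = start & !(PAGE_SIZE - 1);
    loop {
        let end = addr + size_of::<T>();
        // Step 2: validate a page only when the scan first reaches it.
        while checked_until < end {
            if !mem.page_accessible(checked_until) {
                return Err(ScanError::Fault);
            }
            checked_until += PAGE_SIZE;
        }
        // Steps 3 and 4: read one element and compare with the terminator.
        if mem.read(addr)? == terminator {
            return Ok(len);
        }
        addr = end;
        len += 1;
    }
}
```

In this model, a byte string that runs 10 bytes into a second page scans successfully when two pages are mapped and fails with `ScanError::Fault` when only the first page is mapped, because the permission check runs as soon as the scan reaches the second page.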
Access Methods for Null-Terminated Data
The library provides specialized methods for both UserPtr<T> and UserConstPtr<T> to handle null-terminated data.
Methods for UserPtr
UserPtr<T> provides the get_as_null_terminated method for accessing mutable null-terminated arrays.
For types that implement Eq + Default, this method:
- Calls check_null_terminated with the appropriate access flags
- Converts the raw pointer and length into a safe mutable slice
- Returns the slice wrapped in a LinuxResult
Sources: src/lib.rs(L204 - L217)
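The pointer-to-slice conversion in the second step can be illustrated with ordinary process memory. The function below, `as_null_terminated_mut`, is a hypothetical helper written for this sketch. It is not the library method, it performs none of the address-space or permission checks described above, and it is therefore `unsafe` with the validity requirements pushed onto the caller.

```rust
use core::slice;

/// Builds a mutable slice covering the elements before the terminator.
///
/// # Safety
/// `ptr` must be non-null, aligned for `T`, and point to initialized,
/// readable and writable elements that end with `T::default()`. The caller
/// must also ensure nothing else accesses that memory while the returned
/// slice is in use.
unsafe fn as_null_terminated_mut<'a, T: Eq + Default>(ptr: *mut T) -> &'a mut [T] {
    let terminator = T::default();
    let mut len = 0;
    // Count elements up to, but not including, the terminator.
    // SAFETY: the caller guarantees a terminator exists within valid memory.
    while unsafe { &*ptr.add(len) } != &terminator {
        len += 1;
    }
    // SAFETY: the `len` elements starting at `ptr` were just read successfully,
    // and the caller guarantees exclusive access for the lifetime `'a`.
    unsafe { slice::from_raw_parts_mut(ptr, len) }
}
```

The returned slice excludes the terminator, matching the length convention described for check_null_terminated. Writes through the slice land in the original buffer.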
Methods for UserConstPtr
Similarly, UserConstPtr<T> provides a read-only version of the same functionality through its own get_as_null_terminated method, which returns an immutable slice.
Sources: src/lib.rs(L282 - L292)
C-String Handling
The library includes specialized handling for C-style strings through the get_as_str method on UserConstPtr<c_char>.
Processing Flow
flowchart TD
A["UserConstPtr"]
B["get_as_null_terminated()"]
C["Memory transmute to &[u8]"]
D["str::from_utf8()"]
E["Return &str"]
F["Return EILSEQ error"]
A --> B
B --> C
C --> D
D --> E
D --> F
Sources: src/lib.rs(L294 - L303)
This method:
- Gets the null-terminated array of c_char characters
- Transmutes the slice from &[c_char] to &[u8] (sound because c_char is a single-byte type with the same size and alignment as u8)
- Attempts to parse the byte slice as a UTF-8 string
- Returns either a valid string slice or an error if the string is not valid UTF-8
Sources: src/lib.rs(L294 - L303)
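The last three steps can be sketched without the library. `c_chars_as_str` and `StrError` are hypothetical names for this illustration. The real method returns a LinuxResult with EILSEQ, which the enum variant below merely stands in for, and this sketch reinterprets the slice with a pointer cast rather than a transmute.

```rust
use core::ffi::c_char;
use core::{slice, str};

/// Stand-in for the EILSEQ error code mentioned above.
#[derive(Debug, PartialEq)]
enum StrError {
    IllegalSequence,
}

/// Reinterprets an already-validated `c_char` slice (terminator excluded)
/// as UTF-8 text.
fn c_chars_as_str(chars: &[c_char]) -> Result<&str, StrError> {
    // SAFETY: `c_char` is `i8` or `u8` depending on the target; both have
    // size 1 and alignment 1, so the same memory is a valid `[u8]`.
    let bytes: &[u8] =
        unsafe { slice::from_raw_parts(chars.as_ptr().cast::<u8>(), chars.len()) };
    // UTF-8 validation is the only step that can fail here.
    str::from_utf8(bytes).map_err(|_| StrError::IllegalSequence)
}
```

Because the validation borrows from the input slice, the returned `&str` points into the same memory and no copy is made.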
Technical Implementation Details
Accessing User Memory Safely
The check_null_terminated function uses the access_user_memory helper to safely access user memory while handling page faults properly. This ensures that:
- The ACCESSING_USER_MEM flag is set to true during memory access
- Any page faults occurring during the operation are handled correctly
- The flag is reset to false after the operation completes
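The set, run, reset pattern described above can be sketched as follows. This is an assumption-laden model: the thread-local `Cell` stands in for whatever storage the library actually uses for ACCESSING_USER_MEM, and `with_user_access_flag` and `is_accessing_user_mem` are hypothetical names, not the signature of access_user_memory.

```rust
use std::cell::Cell;

thread_local! {
    // Stand-in for the ACCESSING_USER_MEM flag; the real storage may differ.
    static ACCESSING_USER_MEM: Cell<bool> = Cell::new(false);
}

/// Reports whether the current thread is inside a user-memory access.
/// A page-fault handler could consult this to decide how to treat a fault.
fn is_accessing_user_mem() -> bool {
    ACCESSING_USER_MEM.with(|flag| flag.get())
}

/// Runs `f` with the flag set, then clears the flag again.
fn with_user_access_flag<R>(f: impl FnOnce() -> R) -> R {
    ACCESSING_USER_MEM.with(|flag| flag.set(true));
    let result = f();
    ACCESSING_USER_MEM.with(|flag| flag.set(false));
    result
}
```

Wrapping the access in a closure keeps the set and reset paired in one place, so callers cannot forget to clear the flag on the normal return path.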
Type Constraints
The null-terminated handling functions require that the type T implements both:
- Eq - To compare values for equality with the terminator
- Default - To create the terminator value (usually zero/null)
This allows the system to work with different types of null-terminated data beyond just strings.
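A small generic function shows why these two bounds are enough. `null_terminated_prefix` is a hypothetical helper that works on an ordinary slice rather than user memory. It only illustrates how `Default` supplies the terminator and `Eq` finds it.

```rust
/// Returns the elements before the first terminator (`T::default()`),
/// or `None` if the slice contains no terminator.
fn null_terminated_prefix<T: Eq + Default>(data: &[T]) -> Option<&[T]> {
    // `Default` produces the terminator: 0 for integers.
    let terminator = T::default();
    // `Eq` is all that is needed to locate it.
    data.iter()
        .position(|value| *value == terminator)
        .map(|len| &data[..len])
}
```

The same function handles a byte string such as `b"abc\0def"` and a zero-terminated array of `usize` addresses, since both element types default to zero.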
Memory Safety Guarantees
The null-terminated data handling system provides the following safety guarantees:
| Aspect | Guarantee |
|---|---|
| Memory Alignment | Ensures the pointer is properly aligned for type T |
| Access Permissions | Verifies each page has appropriate read/write permissions |
| Page Faults | Handles page faults during user memory access |
| Memory Boundaries | Safely traverses page boundaries |
| Data Validation | Ensures data is properly terminated |
| UTF-8 Validation | Validates UTF-8 encoding for strings |
Sources: src/lib.rs(L56 - L107) src/lib.rs(L204 - L217) src/lib.rs(L282 - L292) src/lib.rs(L294 - L303)
Practical Considerations
Performance Characteristics
Scanning for null terminators can potentially traverse many pages of memory, especially for long strings or arrays. The implementation optimizes this by:
- Checking page boundaries only when necessary
- Validating permissions at the page level, not for each element
- Using volatile reads, which the compiler cannot optimize away, so each element is actually read from user memory at little extra cost
Error Handling
The null-terminated data methods return LinuxResult values with appropriate error codes:
- EFAULT - If memory is inaccessible or improperly aligned
- EILSEQ - If string data is not valid UTF-8 (for get_as_str)