Overview
Relevant source files
This document introduces the weak-map repository, a Rust library that implements a specialized B-Tree map storing weak references to its values. Entries are automatically removed when the referenced values are dropped, preventing memory leaks and dangling references.
For detailed information about specific components, see Core Components, WeakMap and StrongMap, and Reference Traits. For practical applications, refer to Usage Guide.
Sources: README.md(L1 - L7) src/lib.rs(L1 - L3)
What is weak-map?
The weak-map library offers a Rust implementation of `WeakMap`, which wraps the standard `BTreeMap` to store weak references to values rather than the values themselves. This approach enables memory-efficient collections whose entries are automatically removed when the referenced values are dropped elsewhere in the program.
Key characteristics:
- No standard library dependency (works in `no_std` environments)
- Support for both single-threaded (`Rc`) and thread-safe (`Arc`) reference counting
- Similar to the weak-table library, but uses `BTreeMap` as its underlying implementation
Sources: README.md(L1 - L7) src/lib.rs(L1 - L6) Cargo.toml(L2 - L11)
Core Components
The library consists of four main components defined in the `src/map.rs` and `src/traits.rs` files and exposed through `src/lib.rs`:
- WeakMap: A map that stores weak references to values, automatically cleaning up entries when values are dropped
- StrongMap: A simpler wrapper around `BTreeMap` for storing strong references
- WeakRef: Trait defining the interface for weak reference types
- StrongRef: Trait defining the interface for strong reference types
Component Architecture
flowchart TD A["WeakMap"] B["BTreeMap"] C["WeakRef Trait"] D["StrongMap"] E["StrongRef Trait"] A --> B A --> C D --> B E --> C
Sources: src/lib.rs(L9 - L13)
Working Mechanism
The `WeakMap` data structure operates through a reference conversion process:
- Insertion: When a value is inserted, it is first converted to a weak reference using the `downgrade` method from the `StrongRef` trait
- Storage: The weak reference is stored in the underlying `BTreeMap`
- Retrieval: When retrieving a value, the weak reference is obtained from the `BTreeMap`
- Upgrade Attempt: The system attempts to upgrade the weak reference to a strong reference using the `upgrade` method from the `WeakRef` trait
- Result: If the original value has been dropped, the upgrade fails and returns `None`
- Cleanup: Periodically, after a certain number of operations (defined by `OPS_THRESHOLD`), the `WeakMap` removes expired references
Operation Flow
flowchart TD Client["Client"] WeakMap["WeakMap"] WeakRef["WeakRef"] BTreeMap["BTreeMap"] Cleanup["Cleanup Process"] Cleanup --> BTreeMap Client --> WeakMap WeakMap --> BTreeMap WeakMap --> Cleanup WeakMap --> Client WeakMap --> WeakRef
Sources: src/map.rs
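The steps above can be sketched with the standard library alone, using `BTreeMap` plus `Rc`/`Weak` directly. This is an illustration of the mechanism, not the crate's actual code:

```rust
use std::collections::BTreeMap;
use std::rc::{Rc, Weak};

fn main() {
    // Storage holds weak references, mirroring WeakMap's inner BTreeMap.
    let mut inner: BTreeMap<u32, Weak<String>> = BTreeMap::new();

    let value = Rc::new(String::from("hello"));

    // Insertion: downgrade the strong reference before storing it.
    inner.insert(1, Rc::downgrade(&value));

    // Retrieval: upgrade the stored weak reference back to a strong one.
    let fetched = inner.get(&1).and_then(Weak::upgrade);
    assert_eq!(fetched.as_deref(), Some(&String::from("hello")));

    // Once the last strong reference is dropped, the upgrade fails.
    drop(value);
    drop(fetched);
    assert!(inner.get(&1).and_then(Weak::upgrade).is_none());

    // Cleanup: remove entries whose referent is gone.
    inner.retain(|_, w| w.strong_count() > 0);
    assert!(inner.is_empty());
}
```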
Reference Management System
The library defines two core traits that abstract over reference-counted types:
- StrongRef: Implemented for reference-counted types like `Rc` and `Arc`
  - Provides `downgrade()` to convert a strong reference to a weak reference
  - Provides `ptr_eq()` to check whether two references point to the same value
- WeakRef: Implemented for weak reference types like `Weak<T>` from both `Rc` and `Arc`
  - Provides `upgrade()` to attempt converting a weak reference to a strong reference
  - Provides `is_expired()` to check whether the referenced value has been dropped
This trait-based design allows `WeakMap` to work flexibly with different reference-counted types.
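Mirroring the trait shapes shown in this page, a minimal sketch of the two traits together with an `Rc` implementation looks like this (illustrative; the real definitions live in src/traits.rs):

```rust
use std::rc::{Rc, Weak};

trait StrongRef {
    type Weak: WeakRef<Strong = Self>;
    fn downgrade(&self) -> Self::Weak;
    fn ptr_eq(&self, other: &Self) -> bool;
}

trait WeakRef {
    type Strong: StrongRef<Weak = Self>;
    fn upgrade(&self) -> Option<Self::Strong>;
    // Default implementation: expired means the upgrade fails.
    fn is_expired(&self) -> bool {
        self.upgrade().is_none()
    }
}

impl<T> StrongRef for Rc<T> {
    type Weak = Weak<T>;
    fn downgrade(&self) -> Weak<T> {
        Rc::downgrade(self)
    }
    fn ptr_eq(&self, other: &Self) -> bool {
        Rc::ptr_eq(self, other)
    }
}

impl<T> WeakRef for Weak<T> {
    type Strong = Rc<T>;
    fn upgrade(&self) -> Option<Rc<T>> {
        Weak::upgrade(self)
    }
}

fn main() {
    let strong = Rc::new(42);
    let weak = StrongRef::downgrade(&strong);
    assert!(strong.ptr_eq(&strong.clone()));
    assert!(!weak.is_expired());
    drop(strong);
    assert!(weak.is_expired());
}
```

An analogous `impl` pair for `Arc` and `sync::Weak` gives the thread-safe variant.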
Reference Traits Implementation
classDiagram class StrongRef { <<trait>> type Weak downgrade() -~ Self::Weak ptr_eq(other: &Self) -~ bool } class WeakRef { <<trait>> type Strong upgrade() -~ Option is_expired() -~ bool } class Rc { downgrade() -~ Weak } class Arc { downgrade() -~ Weak } class RcWeak { } class ArcWeak { } StrongRef ..|> Rc : "implements for" StrongRef ..|> Arc : "implements for" WeakRef ..|> RcWeak : "implements for" WeakRef ..|> ArcWeak : "implements for"
Sources: src/traits.rs
Common Use Cases
The `WeakMap` is particularly useful in scenarios where:
- Caching: Storing objects that may be dropped elsewhere without creating memory leaks
- Observer Pattern: Implementing observers without creating reference cycles
- Object Registry: Maintaining a registry of objects without keeping them alive
- Graph Data Structures: Working with graphs while avoiding circular reference memory leaks
- Resource Management: Tracking resources without extending their lifetime
Sources: README.md src/lib.rs(L1 - L3)
Project Structure
The weak-map library is organized into the following key files:
| File | Purpose |
|---|---|
| src/lib.rs | Entry point of the library, re-exporting the main components |
| src/map.rs | Contains the implementations of `WeakMap` and `StrongMap` |
| src/traits.rs | Defines the `StrongRef` and `WeakRef` traits |
Project Structure Diagram
flowchart TD lib["src/lib.rs"] map["src/map.rs"] traits["src/traits.rs"] WeakMap["WeakMap implementation"] StrongMap["StrongMap implementation"] StrongRef["StrongRef trait"] WeakRef["WeakRef trait"] lib --> map lib --> traits map --> StrongMap map --> WeakMap traits --> StrongRef traits --> WeakRef
Sources: src/lib.rs(L7 - L13)
Related Documentation
For more detailed information about specific aspects:
- For implementation details of `WeakMap` and `StrongMap`, see WeakMap and StrongMap
- For more information about reference traits, see Reference Traits
- For usage examples, see Basic Usage Examples
- For performance considerations and memory management details, see Implementation Details
Sources: README.md
Core Components
This document provides an overview of the main components that make up the weak-map library and their relationships. It explains the architectural structure and key mechanisms that enable the library's functionality of maintaining maps with weak references.
For detailed implementation details of each component, see WeakMap and StrongMap and Reference Traits. For usage examples, refer to the Usage Guide.
System Architecture
The weak-map library is built around several core components that work together to provide a map data structure that automatically removes entries when referenced values are dropped.
flowchart TD subgraph subGraph2["Internal Management"] OC["OpsCounter"] CU["Cleanup mechanism"] end subgraph subGraph1["Reference Abstraction Layer"] WR["WeakRef trait"] SR["StrongRef trait"] end subgraph subGraph0["Core Data Structures"] WM["WeakMap<K,V>"] SM["StrongMap<K,V>(alias for BTreeMap)"] end BT["BTreeMap<K,V>"] CU --> BT OC --> CU SM --> BT WM --> BT WM --> OC WM --> WR WR --> SR
Sources: src/map.rs(L57 - L65) src/traits.rs(L3 - L40)
Key Components
1. WeakMap
`WeakMap<K, V>` is the primary data structure provided by this library. It wraps a `BTreeMap` and stores weak references to values, automatically cleaning up entries when the referenced values are dropped.
Key characteristics:
- Generic over key type `K` and weak reference type `V`
- `V` must implement the `WeakRef` trait
- Contains an operations counter to trigger periodic cleanup
- Provides methods to insert, retrieve, and remove entries that handle weak reference conversion
classDiagram class WeakMap~K,V~ { -BTreeMap~K,V~ inner -OpsCounter ops +new() WeakMap +get(key) Option~V::Strong~ +insert(key, value) Option~V::Strong~ +remove(key) Option~V::Strong~ +cleanup() +len() usize +is_empty() bool } class OpsCounter { -AtomicUsize counter +bump() +reset() +reach_threshold() bool } WeakMap o-- OpsCounter
Sources: src/map.rs(L62 - L307) src/map.rs(L13 - L55)
2. StrongMap
`StrongMap<K, V>` is a simple type alias for the standard `BTreeMap<K, V>`. It serves as a counterpart to `WeakMap` for situations where strong references are needed.

```rust
pub type StrongMap<K, V> = btree_map::BTreeMap<K, V>;
```
Sources: src/map.rs(L57 - L58)
3. Reference Traits
The library defines two key traits that abstract over reference types:
WeakRef Trait
Defines the interface for weak references:
classDiagram class WeakRef { <<trait>> type Strong upgrade() Option~Self::Strong~ is_expired() bool } class RcWeak~T~ { upgrade() Option~Rc~T~~ is_expired() bool } class ArcWeak~T~ { upgrade() Option~Arc~T~~ is_expired() bool } WeakRef ..|> RcWeak : implements WeakRef ..|> ArcWeak : implements
StrongRef Trait
Defines the interface for strong references:
classDiagram class StrongRef { <<trait>> type Weak downgrade() Self::Weak ptr_eq(other) bool } class Rc~T~ { downgrade() Weak~T~ ptr_eq(other) bool } class Arc~T~ { downgrade() Weak~T~ ptr_eq(other) bool } StrongRef ..|> Rc : implements StrongRef ..|> Arc : implements
Sources: src/traits.rs(L3 - L19) src/traits.rs(L21 - L40) src/traits.rs(L42 - L88)
4. Operations Counter and Cleanup Mechanism
The `OpsCounter` is an internal component that:

- Tracks the number of operations performed on a `WeakMap`
- Triggers cleanup after a threshold (1000 operations)
- Uses atomic operations for thread safety
The cleanup mechanism removes expired weak references from the map, preventing memory leaks and maintaining map efficiency.
Sources: src/map.rs(L13 - L48) src/map.rs(L158 - L169)
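A sketch of such a counter, assuming the threshold of 1000 operations described above; the name `OpsCounter` and its method names follow the diagram, and the real type in src/map.rs may differ in detail:

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

const OPS_THRESHOLD: usize = 1000; // assumed value, per the text above

// Thread-safe operation counter: bump on every map operation,
// trigger cleanup when the threshold is reached, then reset.
struct OpsCounter(AtomicUsize);

impl OpsCounter {
    const fn new() -> Self {
        Self(AtomicUsize::new(0))
    }
    fn bump(&self) {
        self.0.fetch_add(1, Ordering::Relaxed);
    }
    fn reach_threshold(&self) -> bool {
        self.0.load(Ordering::Relaxed) >= OPS_THRESHOLD
    }
    fn reset(&self) {
        self.0.store(0, Ordering::Relaxed);
    }
}

fn main() {
    let ops = OpsCounter::new();
    for _ in 0..OPS_THRESHOLD {
        ops.bump();
    }
    assert!(ops.reach_threshold());
    ops.reset(); // after cleanup, counting starts over
    assert!(!ops.reach_threshold());
}
```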
Component Interactions
The core components interact to provide the `WeakMap` functionality:
sequenceDiagram participant Client as Client participant WeakMap as WeakMap participant BTreeMap as BTreeMap participant WeakRefinstance as "WeakRef instance" participant OpsCounter as OpsCounter Client ->> WeakMap: insert(key, value) WeakMap ->> OpsCounter: bump() OpsCounter -->> WeakMap: reach_threshold()? alt threshold reached WeakMap ->> WeakMap: cleanup() loop for each entry WeakMap ->> WeakRefinstance: is_expired()? alt expired WeakMap ->> BTreeMap: remove(key) end end WeakMap ->> OpsCounter: reset() end WeakMap ->> WeakRefinstance: downgrade(value) WeakMap ->> BTreeMap: insert(key, weak_ref) BTreeMap -->> WeakMap: old_weak_ref? alt had old value WeakMap ->> WeakRefinstance: upgrade(old_weak_ref) WeakRefinstance -->> WeakMap: upgraded_value? end WeakMap -->> Client: result Note over WeakMap,WeakRefinstance: Later when retrieving... Client ->> WeakMap: get(key) WeakMap ->> OpsCounter: bump() WeakMap ->> BTreeMap: get(key) BTreeMap -->> WeakMap: weak_ref? alt has weak_ref WeakMap ->> WeakRefinstance: upgrade(weak_ref) WeakRefinstance -->> WeakMap: strong_ref? WeakMap -->> Client: strong_ref? else no entry WeakMap -->> Client: None end
Sources: src/map.rs(L152 - L277) src/traits.rs(L3 - L40)
Implementation Details
WeakMap Implementation
The `WeakMap` is implemented using a `BTreeMap` with the following key mechanisms:

| Component | Purpose | Implementation |
|---|---|---|
| `inner` field | Stores the actual map data | `BTreeMap<K, V>` |
| `ops` field | Tracks operations for cleanup | `OpsCounter` |
| `cleanup` method | Removes expired references | Calls `is_expired()` on each value |
| `get` method | Retrieves and upgrades references | Uses `upgrade()` from `WeakRef` |
| `insert` method | Stores new weak references | Uses `downgrade()` from `StrongRef` |
Sources: src/map.rs(L62 - L65) src/map.rs(L158 - L169) src/map.rs(L207 - L214) src/map.rs(L258 - L263)
Reference Trait Implementations
The library implements the reference traits for both `Rc`/`Weak` (for single-threaded use) and `Arc`/`Weak` (for multi-threaded use):
flowchart TD subgraph Multi-threaded["Multi-threaded"] Arc["Arc<T>"] ArcWeak["Weak<T>"] end subgraph Single-threaded["Single-threaded"] Rc["Rc<T>"] SR["StrongRef"] RcWeak["Weak<T>"] WR["WeakRef"] end Arc --> ArcWeak Arc --> SR ArcWeak --> Arc ArcWeak --> WR Rc --> RcWeak Rc --> SR RcWeak --> Rc RcWeak --> WR
Sources: src/traits.rs(L42 - L88)
Operations Counter
The operations counter uses atomic operations to ensure thread safety when tracking operations:
- Increments a counter with each operation
- Triggers cleanup when the threshold of 1000 operations is reached
- Resets after cleanup
Sources: src/map.rs(L13 - L48) src/map.rs(L16)
Iterator Support
`WeakMap` provides several iterator types to access its contents:

| Iterator | Description | Returned By |
|---|---|---|
| `Iter` | References to keys and upgraded values | `iter()` |
| `Keys` | References to just the keys | `keys()` |
| `Values` | Just the upgraded values | `values()` |
| `IntoIter` | Owned keys and upgraded values | `into_iter()` |
| `IntoKeys` | Owned keys | `into_keys()` |
| `IntoValues` | Just the upgraded values | `into_values()` |
Each iterator handles weak reference upgrading automatically, skipping expired values.
Sources: src/map.rs(L382 - L623)
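The skipping behavior can be illustrated with plain std types, upgrading each stored `Weak` and filtering out the ones that fail (not the crate's iterator code):

```rust
use std::collections::BTreeMap;
use std::rc::{Rc, Weak};

fn main() {
    let mut inner: BTreeMap<u32, Weak<i32>> = BTreeMap::new();

    let alive = Rc::new(10);
    let doomed = Rc::new(20);
    inner.insert(1, Rc::downgrade(&alive));
    inner.insert(2, Rc::downgrade(&doomed));
    drop(doomed); // the entry for key 2 is now expired

    // Iteration upgrades each weak reference and silently skips expired ones.
    let visible: Vec<(u32, Rc<i32>)> = inner
        .iter()
        .filter_map(|(k, w)| w.upgrade().map(|v| (*k, v)))
        .collect();

    assert_eq!(visible.len(), 1);
    assert_eq!((visible[0].0, *visible[0].1), (1, 10));
}
```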
WeakMap and StrongMap
This document provides a detailed explanation of the `WeakMap` and `StrongMap` data structures, their implementation, and their usage within the weak-map library. These structures are core components that enable efficient memory management through the use of weak references. For information about the reference traits that power these structures, see Reference Traits.
Overview
`WeakMap` is a specialized B-Tree map that stores weak references to values, automatically removing entries when the referenced values are dropped. `StrongMap` is a simpler alias for the standard `BTreeMap` structure. Together, they provide solutions for storing mappings with either reference semantics.
classDiagram class WeakMap { inner: BTreeMap ops: OpsCounter +new() +get(key) +insert(key, value) +remove(key) +cleanup() } class StrongMap { "Alias for BTreeMap" } class OpsCounter { count: AtomicUsize +bump() +reset() +reach_threshold() } class BTreeMap { "Standard Rust BTreeMap" } WeakMap *-- OpsCounter : contains WeakMap *-- BTreeMap : stores data in StrongMap --> BTreeMap : type alias for
Sources: src/map.rs(L57 - L65) src/lib.rs(L9 - L10)
Internal Structure
`WeakMap<K, V>` is implemented as a wrapper around a `BTreeMap<K, V>` with an additional `OpsCounter` to track operations for cleanup purposes.
flowchart TD subgraph subGraph0["WeakMap"] A["inner: BTreeMap"] B["ops: OpsCounter"] end C["BTreeMap"] D["AtomicUsize"] E["Weak Reference"] F["Strong Reference"] G["OPS_THRESHOLD (1000)"] C --> A D --> B E --> A E --> F F --> E G --> B
Sources: src/map.rs(L62 - L65) src/map.rs(L13 - L16)
The key features of `WeakMap`:

- Inner Storage: Uses a standard `BTreeMap` to store key-value pairs
- Operations Counter: Tracks the number of operations to trigger periodic cleanup
- Weak References: Values are stored as weak references, allowing them to be automatically collected when no strong references remain
- Cleanup Mechanism: Periodically removes expired weak references after `OPS_THRESHOLD` operations
StrongMap
`StrongMap` is a simple type alias for the standard Rust `BTreeMap`:

```rust
pub type StrongMap<K, V> = btree_map::BTreeMap<K, V>;
```

It provides a counterpart to `WeakMap` for cases where strong references are needed.
Sources: src/map.rs(L57 - L58)
Operations Counter
The `OpsCounter` structure manages automatic cleanup of expired references:
sequenceDiagram participant Client as Client participant WeakMap as WeakMap participant OpsCounter as OpsCounter participant BTreeMap as BTreeMap Client ->> WeakMap: insert/get/remove operation WeakMap ->> OpsCounter: bump() OpsCounter ->> OpsCounter: increment counter OpsCounter ->> WeakMap: check if reach_threshold() alt Threshold reached (1000 operations) WeakMap ->> WeakMap: cleanup() loop for all entries WeakMap ->> BTreeMap: get entry WeakMap ->> WeakMap: check if is_expired() alt is expired WeakMap ->> BTreeMap: remove entry end end WeakMap ->> OpsCounter: reset() end
Sources: src/map.rs(L13 - L48) src/map.rs(L152 - L169)
The cleanup mechanism has these characteristics:
- Operations are counted using an atomic counter
- After `OPS_THRESHOLD` (1000) operations, cleanup is triggered
- During cleanup, all expired weak references are removed
- The operations counter is reset after cleanup
Core API
Creation and Basic Operations
`WeakMap` provides the following core methods:

| Method | Description | Source |
|---|---|---|
| `new()` | Creates a new, empty `WeakMap` | src/map.rs 72-77 |
| `insert(key, value)` | Inserts a key-value pair, storing a weak reference to the value | src/map.rs 258-263 |
| `get(key)` | Returns the value corresponding to the key, if it exists and hasn't been dropped | src/map.rs 207-214 |
| `remove(key)` | Removes a key from the map, returning the value if present | src/map.rs 270-277 |
| `clear()` | Removes all entries from the map | src/map.rs 106-109 |
| `len()` | Returns the number of valid (non-expired) elements | src/map.rs 177-179 |
| `raw_len()` | Returns the total number of elements, including expired references | src/map.rs 113-115 |
Sources: src/map.rs(L103 - L307)
Iteration
`WeakMap` provides various iterators, all of which automatically filter out expired references:
flowchart TD A["WeakMap"] B["iter()"] C["Iter"] D["keys()"] E["Keys"] F["values()"] G["Values"] H["into_iter()"] I["IntoIter"] J["into_keys()"] K["IntoKeys"] L["into_values()"] M["IntoValues"] A --> B A --> D A --> F A --> H A --> J A --> L B --> C D --> E F --> G H --> I J --> K L --> M
Sources: src/map.rs(L119 - L149) src/map.rs(L382 - L623)
Key aspects of iterators:
- All iterators automatically filter out expired weak references
- Both borrowing and consuming iterators are provided
- Iterators for keys, values, and key-value pairs
Weak Reference Management
The core functionality of `WeakMap` revolves around its handling of weak references:
sequenceDiagram participant Client as Client participant WeakMap as WeakMap participant BTreeMap as BTreeMap participant WeakRef as WeakRef participant StrongRef as StrongRef Client ->> WeakMap: insert(key, &strong_ref) WeakMap ->> StrongRef: downgrade() StrongRef -->> WeakMap: weak_ref WeakMap ->> BTreeMap: store(key, weak_ref) Note over Client,BTreeMap: Later... Client ->> WeakMap: get(key) WeakMap ->> BTreeMap: retrieve weak_ref BTreeMap -->> WeakMap: weak_ref WeakMap ->> WeakRef: upgrade() alt Reference still valid WeakRef -->> WeakMap: Some(strong_ref) WeakMap -->> Client: strong_ref else Reference expired WeakRef -->> WeakMap: None WeakMap -->> Client: None end
Sources: src/map.rs(L207 - L214) src/map.rs(L258 - L263)
When the last strong reference to a value is dropped:

- The weak reference in the map becomes expired
- Future calls to `get()` will return `None`
- The entry will be removed during the next cleanup cycle
Conversion Operations
`WeakMap` provides several conversion methods and implementations:

| From | To | Method/Trait |
|---|---|---|
| `BTreeMap<K, V>` | `WeakMap<K, V>` | `From` trait |
| `WeakMap<K, V>` | `BTreeMap<K, V>` | `From` trait |
| `WeakMap<K, V>` | `StrongMap<K, V::Strong>` | `upgrade()` |
| `&StrongMap<K, V::Strong>` | `WeakMap<K, V>` | `From` trait |
| Iterator of `(K, &V::Strong)` | `WeakMap<K, V>` | `FromIterator` trait |
| Array `[(K, &V::Strong); N]` | `WeakMap<K, V>` | `From` trait |
Sources: src/map.rs(L86 - L101) src/map.rs(L296 - L307) src/map.rs(L341 - L380)
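The `upgrade()` conversion in the table can be approximated with std types: collect every entry whose weak reference still upgrades. This is illustrative only, not the crate's implementation:

```rust
use std::collections::BTreeMap;
use std::rc::{Rc, Weak};

fn main() {
    let mut weak_side: BTreeMap<u32, Weak<i32>> = BTreeMap::new();
    let live = Rc::new(7);
    let dead = Rc::new(8);
    weak_side.insert(1, Rc::downgrade(&live));
    weak_side.insert(2, Rc::downgrade(&dead));
    drop(dead);

    // "Upgrading" the whole map keeps only entries that are still alive,
    // which is what the WeakMap -> StrongMap conversion is described as doing.
    let strong_side: BTreeMap<u32, Rc<i32>> = weak_side
        .iter()
        .filter_map(|(k, w)| w.upgrade().map(|v| (*k, v)))
        .collect();

    assert_eq!(strong_side.len(), 1);
    assert_eq!(*strong_side[&1], 7);
}
```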
Example Use Cases
- Cache with Automatic Cleanup:
  - Store computation results keyed by input parameters
  - Results are automatically removed when no longer referenced elsewhere
- Observer Pattern:
  - Track observers without keeping them alive
  - Automatically clean up references to observers that have been dropped
- Resource Pooling:
  - Maintain a pool of resources without keeping them alive indefinitely
  - Resources are automatically removed from the pool when no longer in use
Memory Management Considerations
- Memory Leaks: `WeakMap` helps prevent memory leaks by not keeping values alive when they're no longer needed elsewhere
- Cleanup Overhead: The periodic cleanup process introduces some overhead, but it is amortized over many operations
- Reference Counting Overhead: Using weak references incurs the cost of reference counting, which is generally acceptable for most applications
Sources: src/map.rs(L152 - L169) src/map.rs(L625 - L660)
Performance Characteristics
For more detailed information about performance considerations, see Performance Considerations.
| Operation | Time Complexity | Notes |
|---|---|---|
| `insert()` | O(log n) | Plus potential O(n) cleanup once per `OPS_THRESHOLD` operations |
| `get()` | O(log n) | |
| `remove()` | O(log n) | |
| `len()` | O(n) | Linear, as it must check each entry for expiration |
| `raw_len()` | O(1) | |
| `cleanup()` | O(n) | |
Sources: src/map.rs(L176 - L179) src/map.rs(L158 - L161)
Reference Traits
This document explains the reference trait system that forms the foundation of the weak-map library. The reference traits provide a flexible abstraction layer for working with different types of references (both strong and weak) in a generic way, enabling the core functionality of `WeakMap`. For details about the map implementations themselves, see WeakMap and StrongMap.
Overview of Reference Traits
The weak-map library defines two fundamental traits that abstract over reference types:
- `StrongRef` - Represents a strong reference that keeps a value alive
- `WeakRef` - Represents a weak reference that doesn't prevent a value from being dropped

These traits allow the `WeakMap` to work with different types of reference-counted values without being tied to specific implementations.
Sources: src/traits.rs(L3 - L40)
The StrongRef Trait
The `StrongRef` trait defines an interface for types that represent strong references to heap-allocated values. A strong reference keeps the referenced value alive for as long as the reference exists.

```rust
pub trait StrongRef {
    type Weak: WeakRef<Strong = Self>;

    fn downgrade(&self) -> Self::Weak;
    fn ptr_eq(&self, other: &Self) -> bool;
}
```
The trait requires:
| Member | Type | Purpose |
|---|---|---|
| `Weak` | Associated type | Specifies the corresponding weak reference type |
| `downgrade()` | Method | Converts a strong reference to a weak reference |
| `ptr_eq()` | Method | Compares two strong references for pointer equality |
The `Weak` associated type establishes a relationship with the `WeakRef` trait, ensuring that both traits are implemented in a compatible way.
Sources: src/traits.rs(L3 - L19)
The WeakRef Trait
The `WeakRef` trait defines an interface for types that represent weak references to heap-allocated values. A weak reference does not keep the referenced value alive; it can be used to access the value only while strong references elsewhere keep it alive.

```rust
pub trait WeakRef {
    type Strong: StrongRef<Weak = Self>;

    fn upgrade(&self) -> Option<Self::Strong>;

    fn is_expired(&self) -> bool {
        self.upgrade().is_none()
    }
}
```
The trait requires:
| Member | Type | Purpose |
|---|---|---|
| `Strong` | Associated type | Specifies the corresponding strong reference type |
| `upgrade()` | Method | Attempts to convert a weak reference to a strong reference |
| `is_expired()` | Method | Checks whether the weak reference is expired (has a default implementation) |
The `Strong` associated type complements the `Weak` type in `StrongRef`, creating a bidirectional relationship between the two traits.
Sources: src/traits.rs(L21 - L40)
Trait Implementations
The library provides implementations of these traits for standard Rust reference-counting types, allowing `WeakMap` to work with both single-threaded and thread-safe reference types.
flowchart TD subgraph subGraph2["Thread-safe RC"] Arc["std::sync::Arc<T>"] ArcWeak["std::sync::Weak<T>"] end subgraph subGraph1["Single-threaded RC"] Rc["std::rc::Rc<T>"] RcWeak["std::rc::Weak<T>"] end subgraph subGraph0["Reference Traits"] S["StrongRef Trait"] W["WeakRef Trait"] end Arc --> ArcWeak ArcWeak --> Arc Rc --> RcWeak RcWeak --> Rc S --> Arc S --> Rc W --> ArcWeak W --> RcWeak
Sources: src/traits.rs(L42 - L88)
Implementation for std::rc
For single-threaded reference counting, the traits are implemented for `std::rc::Rc<T>` and `std::rc::Weak<T>`:

| Type | Trait | Implementation Details |
|---|---|---|
| `Rc<T>` | `StrongRef` | Uses `Rc::downgrade` and `Rc::ptr_eq` |
| `Weak<T>` | `WeakRef` | Uses `Weak::upgrade` and a custom `is_expired` that checks `strong_count` |
The `is_expired` implementation for `Weak<T>` optimizes the check by directly testing `strong_count() == 0` instead of trying to upgrade the reference.
Sources: src/traits.rs(L42 - L64)
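The two checks can be compared directly with `std::rc` (plain std code, not the crate's impl):

```rust
use std::rc::Rc;

fn main() {
    let strong = Rc::new("value");
    let weak = Rc::downgrade(&strong);

    // While a strong reference exists, both checks agree: not expired.
    assert_eq!(weak.strong_count() == 0, weak.upgrade().is_none());
    assert!(weak.upgrade().is_some());

    drop(strong);

    // After the last strong reference is dropped, strong_count() == 0
    // answers the question without constructing a temporary Rc.
    assert_eq!(weak.strong_count(), 0);
    assert!(weak.upgrade().is_none());
}
```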
Implementation for std::sync
For thread-safe reference counting, the traits are implemented for `std::sync::Arc<T>` and `std::sync::Weak<T>`:

| Type | Trait | Implementation Details |
|---|---|---|
| `Arc<T>` | `StrongRef` | Uses `Arc::downgrade` and `Arc::ptr_eq` |
| `Weak<T>` | `WeakRef` | Uses `Weak::upgrade` and a custom `is_expired` that checks `strong_count` |
Similar to the `std::rc` implementation, the `is_expired` implementation for `std::sync::Weak<T>` directly checks the strong count for efficiency.
Sources: src/traits.rs(L66 - L88)
Reference Traits in Action
The reference traits enable `WeakMap` to work with different reference types in a generic way. Here's how these traits are used in the workflow of `WeakMap`:
sequenceDiagram participant Client as Client participant WeakMap as WeakMap participant StrongRef as StrongRef participant WeakRef as WeakRef Client ->> WeakMap: insert(key, strong_ref) WeakMap ->> StrongRef: downgrade() StrongRef -->> WeakMap: weak_ref WeakMap ->> WeakMap: store(key, weak_ref) Client ->> WeakMap: get(key) WeakMap ->> WeakMap: retrieve weak_ref WeakMap ->> WeakRef: upgrade() alt Value still alive WeakRef -->> WeakMap: Some(strong_ref) WeakMap -->> Client: Some(strong_ref) else Value dropped WeakRef -->> WeakMap: None WeakMap -->> Client: None end Client ->> WeakMap: cleanup() loop for each entry WeakMap ->> WeakRef: is_expired() alt Expired WeakRef -->> WeakMap: true WeakMap ->> WeakMap: remove entry else Not expired WeakRef -->> WeakMap: false end end
The use of traits allows `WeakMap` to be generic over the specific reference type, supporting both `Rc` and `Arc` with the same implementation.
Sources: src/traits.rs(L3 - L40)
Type Relationships
The following diagram illustrates the relationships between the trait types and concrete implementations:
classDiagram class StrongRef { <<trait>> type Weak downgrade() -~ Self::Weak ptr_eq(other: &Self) -~ bool } class WeakRef { <<trait>> type Strong upgrade() -~ Option is_expired() -~ bool } class Rc~T~ { downgrade() -~ Weak ptr_eq(other: &Rc) -~ bool } class Weak~T~ { upgrade() -~ Option~ strong_count() -~ usize } class Arc~T~ { downgrade() -~ Weak ptr_eq(other: &Arc) -~ bool } class ArcWeak~T~ { upgrade() -~ Option~ strong_count() -~ usize } Rc ..|> StrongRef : implements Arc ..|> StrongRef : implements Weak ..|> WeakRef : implements ArcWeak ..|> WeakRef : implements
This abstraction allows the `WeakMap` implementation to remain independent of the specific reference type, supporting both `Rc` for single-threaded use cases and `Arc` for multi-threaded scenarios.
Sources: src/traits.rs(L42 - L88)
Summary
The reference traits (`StrongRef` and `WeakRef`) provide a flexible abstraction for working with different types of references in the weak-map library. By implementing these traits for the standard Rust reference types (`Rc`/`Weak` and `Arc`/`Weak`), the library allows generic handling of references while keeping the ability to use weak references that don't prevent values from being dropped and freed.
These traits are fundamental to the operation of `WeakMap`, enabling it to store weak references to values and automatically clean up entries when the referenced values are dropped.
Usage Guide
This guide provides comprehensive instructions on how to use the weak-map library effectively. The library offers a specialized `WeakMap` implementation: a B-Tree map that stores weak references to values, automatically removing entries when the referenced values are dropped.
For detailed information about the core components, see Core Components and for implementation details, see Implementation Details.
Basic Concepts
The weak-map library centers around the `WeakMap` data structure, which combines the ordered key-value storage of a B-Tree map with automatic memory management through weak references.
flowchart TD subgraph subGraph1["Reference Types"] E["StrongRef Trait"] F["Upgrade to Strong"] G["WeakRef Trait"] H["Downgrade to Weak"] end subgraph subGraph0["Key Concepts"] A["WeakMap<K, V>"] B["Weak References"] C["Automatic Cleanup"] D["B-Tree Structure"] end A --> B A --> C A --> D B --> G E --> F F --> H G --> H H --> F
Key Benefits:
- Prevents memory leaks in cyclic reference scenarios
- Automatically cleans up entries when referenced values are dropped
- Provides a familiar map interface with weak reference handling
Sources: src/map.rs(L60 - L65) README.md(L1 - L6)
Creating a WeakMap
There are several ways to create a `WeakMap` instance:
Basic Creation
The simplest way to create a `WeakMap` is using the `new()` method:

```rust
let map = WeakMap::<u32, Weak<String>>::new();
```
From Existing Collections
You can create a `WeakMap` from various sources:
- From a `BTreeMap`:

```rust
let btree_map = BTreeMap::<u32, Weak<String>>::new();
let weak_map = WeakMap::from(btree_map);
```

- From an iterator:

```rust
let items = [(1, &arc_value1), (2, &arc_value2)];
let weak_map = WeakMap::from_iter(items);
```

- From an array:

```rust
let weak_map = WeakMap::from([(1, &arc_value1), (2, &arc_value2)]);
```

- From a `StrongMap`:

```rust
let strong_map = StrongMap::<u32, Arc<String>>::new();
let weak_map = WeakMap::from(&strong_map);
```
Sources: src/map.rs(L68 - L77) src/map.rs(L86 - L101) src/map.rs(L341 - L380)
Basic Operations
`WeakMap` provides standard map operations with weak reference handling:
Inserting Elements
To insert elements into a `WeakMap`, use the `insert` method:

```rust
use std::sync::{Arc, Weak};

let mut map = WeakMap::<u32, Weak<String>>::new();

// Create a strong reference
let value = Arc::new(String::from("example"));

// Insert into map (automatically creates weak reference)
map.insert(1, &value);
```
Note that `insert` takes a strong reference (`&V::Strong`) but stores it as a weak reference internally.
Retrieving Elements
To get an element from the map:

```rust
// Returns Option<Arc<String>> (or None if expired or not found)
if let Some(strong_ref) = map.get(&1) {
    println!("Value: {}", strong_ref);
}
```
The `get` method returns:

- `Some(value)` if the key exists and the weak reference can be upgraded
- `None` if the key doesn't exist or the reference has expired
Removing Elements
To remove elements:

```rust
// Remove and return the value if it exists and hasn't expired
let removed_value = map.remove(&1);

// Remove and return both key and value
let removed_entry = map.remove_entry(&1);
```
Sources: src/map.rs(L203 - L293)
Working with WeakMap Iterators
`WeakMap` provides various iterators that automatically filter out expired references:
flowchart TD subgraph subGraph0["WeakMap Iterator Methods"] A["iter()"] D["Iter<K, V>"] B["keys()"] E["Keys<K, V>"] C["values()"] F["Values<K, V>"] G["into_iter()"] H["IntoIter<K, V>"] I["into_keys()"] J["IntoKeys<K, V>"] K["into_values()"] L["IntoValues<K, V>"] end M["(&K, V::Strong)"] N["&K"] O["V::Strong"] P["(K, V::Strong)"] Q["K"] R["V::Strong"] A --> D B --> E C --> F D --> M E --> N F --> O G --> H H --> P I --> J J --> Q K --> L L --> R
Non-consuming Iterators
```rust
// Iterate over key-value pairs
for (key, value) in map.iter() {
    // value is a strong reference (expired references are skipped)
}

// Iterate over keys only
for key in map.keys() {
    // ...
}

// Iterate over values only
for value in map.values() {
    // value is a strong reference
}
```
Consuming Iterators
```rust
// Convert map into iterator and consume it
for (key, value) in map.into_iter() {
    // value is a strong reference
}

// Or just get keys
for key in map.into_keys() {
    // ...
}

// Or just get values
for value in map.into_values() {
    // value is a strong reference
}
```
Sources: src/map.rs(L118 - L149) src/map.rs(L382 - L622)
Checking Map State
`WeakMap` provides methods to check its state:

```rust
// Number of valid entries (excludes expired references)
let valid_count = map.len();

// Total number of entries (including expired references)
let total_count = map.raw_len();

// Check if the map is empty (contains no valid entries)
let is_empty = map.is_empty();

// Check if the map contains a specific key
let has_key = map.contains_key(&1);
```
Note that `len()` is an O(n) operation, as it must check whether each reference is still valid.
Sources: src/map.rs(L112 - L185) src/map.rs(L235 - L246)
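The `len()` vs `raw_len()` distinction can be mimicked with std types (a hypothetical equivalent, not the crate's code):

```rust
use std::collections::BTreeMap;
use std::rc::{Rc, Weak};

fn main() {
    let mut inner: BTreeMap<u32, Weak<i32>> = BTreeMap::new();

    let kept = Rc::new(1);
    let dropped = Rc::new(2);
    inner.insert(1, Rc::downgrade(&kept));
    inner.insert(2, Rc::downgrade(&dropped));
    drop(dropped);

    // raw_len(): O(1), counts every stored entry, expired or not.
    let raw_len = inner.len();
    // len(): O(n), must test each weak reference for expiration.
    let len = inner.values().filter(|w| w.strong_count() > 0).count();

    assert_eq!(raw_len, 2);
    assert_eq!(len, 1);
}
```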
Converting Between Map Types
You can convert between `WeakMap` and `StrongMap`:

```rust
// Convert WeakMap to StrongMap (includes only valid references)
let strong_map: StrongMap<K, V::Strong> = weak_map.upgrade();

// Convert StrongMap to WeakMap
let weak_map = WeakMap::from(&strong_map);

// Convert WeakMap to standard BTreeMap
let btree_map: BTreeMap<K, V> = weak_map.into();

// Convert BTreeMap to WeakMap
let weak_map = WeakMap::from(btree_map);
```
Sources: src/map.rs(L86 - L101) src/map.rs(L296 - L306) src/map.rs(L368 - L380)
Understanding Automatic Cleanup
The WeakMap
implements an automatic cleanup mechanism to remove expired weak references:
sequenceDiagram participant Client as Client participant WeakMap as WeakMap participant OpsCounter as OpsCounter participant BTreeMap as BTreeMap Note over WeakMap: OPS_THRESHOLD = 1000 Client ->> WeakMap: insert/get/remove operation WeakMap ->> OpsCounter: increment counter OpsCounter ->> WeakMap: check if threshold reached alt Threshold reached WeakMap ->> BTreeMap: retain only non-expired entries WeakMap ->> OpsCounter: reset counter end Note over Client,BTreeMap: When referenced value is dropped elsewhere Note over WeakMap: During next operation that reaches threshold... WeakMap ->> BTreeMap: expired entry is automatically removed
The cleanup process works as follows:
- Each operation (get, insert, remove) increments an internal operations counter
- When the operations counter reaches OPS_THRESHOLD (1000), cleanup is triggered
- During cleanup, all expired references are removed from the map
- The operations counter is reset to zero
This amortizes the cost of cleanup across operations, preventing performance spikes.
Sources: src/map.rs(L14 - L48) src/map.rs(L158 - L169)
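The counter-plus-retain mechanism described above can be sketched with plain standard-library types. This is a simplified stand-in, not the crate's actual code: `MiniWeakMap` and `demo` are illustrative names, and only `OPS_THRESHOLD` and the bump/retain/reset flow come from the documentation above.

```rust
use std::collections::BTreeMap;
use std::rc::{Rc, Weak};

const OPS_THRESHOLD: usize = 1000;

/// Simplified stand-in for WeakMap's amortized cleanup:
/// count operations, and purge expired entries at the threshold.
struct MiniWeakMap<K: Ord> {
    inner: BTreeMap<K, Weak<i32>>,
    ops: usize,
}

impl<K: Ord> MiniWeakMap<K> {
    fn new() -> Self {
        Self { inner: BTreeMap::new(), ops: 0 }
    }

    fn bump(&mut self) {
        self.ops += 1;
        if self.ops >= OPS_THRESHOLD {
            // Drop entries whose weak reference can no longer be upgraded.
            self.inner.retain(|_, v| v.upgrade().is_some());
            self.ops = 0;
        }
    }

    fn insert(&mut self, key: K, value: &Rc<i32>) {
        self.bump();
        self.inner.insert(key, Rc::downgrade(value));
    }

    fn get(&mut self, key: &K) -> Option<Rc<i32>> {
        self.bump();
        self.inner.get(key).and_then(Weak::upgrade)
    }

    fn raw_len(&self) -> usize {
        self.inner.len()
    }
}

fn demo() -> (usize, usize) {
    let mut map = MiniWeakMap::new();
    {
        let v = Rc::new(7);
        map.insert("temp", &v);
    } // v dropped here: the weak reference is now expired

    assert!(map.get(&"temp").is_none()); // expired, cannot upgrade
    let before = map.raw_len(); // entry still occupies the map

    // Enough operations to cross OPS_THRESHOLD trigger the purge.
    for _ in 0..OPS_THRESHOLD {
        let _ = map.get(&"temp");
    }
    (before, map.raw_len())
}

fn main() {
    assert_eq!(demo(), (1, 0));
}
```

Note how the expired entry lingers (raw length 1) until enough operations accumulate, then disappears without any explicit cleanup call.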
Practical Example
Here's a complete example demonstrating typical WeakMap
usage:
use std::sync::{Arc, Weak};
use weak_map::WeakMap;
// Create a new WeakMap
let mut map = WeakMap::<String, Weak<i32>>::new();
// Create some values with separate lifetimes
let value1 = Arc::new(42);
let value2 = Arc::new(100);
// Insert values (automatically creates weak references)
map.insert("first".to_string(), &value1);
map.insert("second".to_string(), &value2);
// Verify both values are accessible
assert_eq!(map.get(&"first".to_string()), Some(value1.clone()));
assert_eq!(map.get(&"second".to_string()), Some(value2.clone()));
assert_eq!(map.len(), 2);
// Drop one of the strong references
drop(value2);
// The weak reference is now expired
assert_eq!(map.get(&"second".to_string()), None);
assert_eq!(map.len(), 1); // Only one valid entry remains
// After enough operations, expired entries are automatically removed
// (This happens after OPS_THRESHOLD operations)
This example demonstrates how entries are automatically managed based on the lifecycle of the referenced values.
Sources: src/map.rs(L625 - L646)
Performance Considerations
When using WeakMap
, keep these performance aspects in mind:
- Cleanup Frequency: Cleanup occurs after every 1000 operations (OPS_THRESHOLD), which balances overhead with memory efficiency
- Length Operations: The len() method is O(n) since it must check each reference's validity, while raw_len() is O(1)
- Iterator Performance: All iterators filter out expired references, so iteration complexity is affected by the number of expired items
- Memory Usage: WeakMap holds weak references, which don't keep the referenced values alive, but the map entries themselves remain until cleanup
These characteristics make WeakMap
particularly suitable for caching scenarios where you want to avoid memory leaks.
Sources: src/map.rs(L14 - L16) src/map.rs(L158 - L169) src/map.rs(L171 - L179)
When to Use WeakMap
WeakMap
is especially useful in these scenarios:
- Caching Systems: When you need to cache values but don't want to prevent them from being garbage collected when no longer needed elsewhere
- Observer Patterns: When tracking objects that may be destroyed independently from the tracking system
- Breaking Reference Cycles: When you need to break reference cycles that could cause memory leaks
- Resource Management: When associating metadata with resources without extending their lifetime
If you don't need weak reference semantics, consider using StrongMap
(which is just an alias for BTreeMap
) for better performance.
Sources: src/map.rs(L57 - L65)
Summary
The weak-map library provides an elegant solution for scenarios requiring weak references with map semantics. The WeakMap
implementation automatically handles reference lifecycle management while providing a familiar map interface.
Key takeaways:
- Use WeakMap when you need map functionality with weak reference semantics
- Weak references are automatically upgraded when accessed, and expired entries are cleaned up periodically
- The API closely mirrors standard map interfaces but handles weak reference conversion internally
- Performance considerations include periodic automatic cleanup and the O(n) len() operation
For more advanced usage patterns, see Advanced Usage Patterns.
Basic Usage Examples
Relevant source files
This page provides practical examples demonstrating how to use the WeakMap
and StrongMap
data structures in the weak-map library. For advanced usage patterns and optimization techniques, see Advanced Usage Patterns.
Introduction to WeakMap
WeakMap
is a B-Tree map that stores weak references to values, automatically removing entries when the referenced values are dropped. This makes it ideal for caching scenarios where you want to access objects as long as they're in use elsewhere, without preventing them from being garbage collected.
flowchart TD subgraph subGraph1["Object Lifecycle"] A["Arc"] AW["Weak"] AD["Arc dropped"] CE["Cleanup on next operation"] end subgraph subGraph0["WeakMap Data Structure"] WM["WeakMap>"] BT["BTreeMap>"] API["Automatic Cleanup API"] end A --> AW AD --> CE AW --> A AW --> WM WM --> API WM --> BT
Sources: src/map.rs(L60 - L65) src/map.rs(L158 - L169)
Creating a WeakMap
Creating a new WeakMap
is straightforward:
sequenceDiagram participant ClientCode as "Client Code" participant WeakMap as "WeakMap" ClientCode ->> WeakMap: "new()" Note over WeakMap: "Creates empty map" ClientCode ->> WeakMap: "default()" Note over WeakMap: "Creates empty map (alternative)" ClientCode ->> WeakMap: "from(btree_map)" Note over WeakMap: "Creates from existing BTreeMap" ClientCode ->> WeakMap: "from_iter()" Note over WeakMap: "Creates from key-value pairs"
Basic Creation Examples:
- Creating an empty WeakMap:
let map = WeakMap::<String, Weak<u32>>::new();
- Using the Default trait:
let map: WeakMap<String, Weak<u32>> = WeakMap::default();
- Creating from an iterator of key-value pairs (the strong references must outlive the statement that borrows them):
let one = Arc::new("one");
let two = Arc::new("two");
let map = WeakMap::from_iter([(1, &one), (2, &two)]);
Sources: src/map.rs(L68 - L77) src/map.rs(L80 - L84) src/map.rs(L341 - L355)
Basic Operations
Inserting Values
To insert values into a WeakMap
, you need to provide a key and a strong reference to the value. The map will store a weak reference to the value internally.
let mut map = WeakMap::<u32, Weak<String>>::new();
let value = Arc::new(String::from("example"));
map.insert(1, &value);
The insert
method returns an Option<V::Strong>
containing the previous strong reference associated with the key, if one existed and hasn't been dropped.
Sources: src/map.rs(L258 - L263)
Retrieving Values
To retrieve values, use the get
method with a reference to the key:
if let Some(strong_ref) = map.get(&1) {
// Use strong_ref
}
The get
method returns Option<V::Strong>
, containing the strong reference if the key exists and the weak reference could be upgraded.
Sources: src/map.rs(L207 - L214)
Checking for Keys
To check if a key exists in the map without retrieving the value:
if map.contains_key(&1) {
// Key exists and reference is not expired
}
Sources: src/map.rs(L239 - L246)
Removing Entries
To remove an entry from the map:
if let Some(strong_ref) = map.remove(&1) {
// Entry was removed and reference was still valid
}
Sources: src/map.rs(L270 - L277)
Handling of Dropped References
The key feature of WeakMap
is its ability to automatically clean up entries whose values have been dropped:
flowchart TD subgraph subGraph1["Value Lifecycle"] AC["Arc Created"] ST["Strong References Exist"] WK["Weak Reference in Map"] NR["No Strong References"] EX["Expired Weak Reference"] end subgraph subGraph0["Automatic Cleanup Process"] OP["Operations Counter"] TH["Threshold Check"] CL["Cleanup"] EXP["Check expired"] REM["Remove Entry"] KEEP["Keep Entry"] RST["Reset Counter"] end AC --> ST CL --> EXP CL --> RST EX --> CL EXP --> KEEP EXP --> REM NR --> EX OP --> TH ST --> NR ST --> WK TH --> CL
Sources: src/map.rs(L14 - L48) src/map.rs(L158 - L169)
Example of Automatic Cleanup
When a referenced value is dropped, it doesn't immediately get removed from the WeakMap
. Instead, the map detects and removes expired references during subsequent operations:
let mut map = WeakMap::<u32, Weak<String>>::new();
// Create a value in a nested scope so it gets dropped
{
let value = Arc::new(String::from("temporary"));
map.insert(1, &value);
} // value is dropped here
// The entry still exists in the underlying BTreeMap
assert_eq!(map.raw_len(), 1);
// But it won't be returned when getting or counting valid entries
assert_eq!(map.len(), 0);
assert_eq!(map.get(&1), None);
After a certain number of operations (defined by OPS_THRESHOLD, which is 1000), the map will automatically perform a cleanup to remove all expired references:
// After many operations, the expired references are cleaned up
assert_eq!(map.raw_len(), 0);
Sources: src/map.rs(L16) src/map.rs(L158 - L161) src/map.rs(L625 - L660)
Converting Between WeakMap and StrongMap
Upgrading to StrongMap
You can convert a WeakMap
to a StrongMap
(which contains only strong references) using the upgrade
method:
let strong_map: StrongMap<K, Arc<V>> = weak_map.upgrade();
This creates a new StrongMap
containing only the keys that have valid references in the WeakMap
.
Sources: src/map.rs(L296 - L306)
Creating WeakMap from StrongMap
Conversely, you can create a WeakMap
from a StrongMap
:
let weak_map = WeakMap::from(&strong_map);
Sources: src/map.rs(L368 - L380)
Iterating Over WeakMap Contents
WeakMap
provides various iteration methods that only yield entries with valid references:
flowchart TD subgraph subGraph1["Iterator Behaviors"] KV["(&K, V::Strong)"] K["&K"] V["V::Strong"] OKV["(K, V::Strong)"] OK["K"] OV["V::Strong"] end subgraph subGraph0["Iteration Methods"] WM["WeakMap"] IT["Iter"] KS["Keys"] VS["Values"] II["IntoIter"] IK["IntoKeys"] IV["IntoValues"] end II --> OKV IK --> OK IT --> KV IV --> OV KS --> K VS --> V WM --> II WM --> IK WM --> IT WM --> IV WM --> KS WM --> VS
Sources: src/map.rs(L119 - L149) src/map.rs(L383 - L623)
Iterating Over Key-Value Pairs
for (key, value) in map.iter() {
// key is a reference to the key, value is a strong reference
}
Iterating Over Keys or Values Only
// Iterating over keys
for key in map.keys() {
// Process key
}
// Iterating over values
for value in map.values() {
// Process value
}
Sources: src/map.rs(L383 - L528)
Complete Usage Example
Here's a more complete example demonstrating the key features of WeakMap
:
use std::sync::{Arc, Weak};
use weak_map::WeakMap;
// Create a new WeakMap
let mut cache = WeakMap::<String, Weak<Vec<u8>>>::new();
// Create some data and insert it into the map
let data1 = Arc::new(vec![1, 2, 3]);
let data2 = Arc::new(vec![4, 5, 6]);
cache.insert("data1".to_string(), &data1);
cache.insert("data2".to_string(), &data2);
// Data can be retrieved as long as strong references exist
assert_eq!(cache.get(&"data1".to_string()).unwrap(), data1);
// When all strong references to a value are dropped, it can't be retrieved
drop(data1);
assert_eq!(cache.get(&"data1".to_string()), None);
// But data2 is still accessible
assert_eq!(cache.get(&"data2".to_string()).unwrap(), data2);
// The map can tell us how many valid entries it has
assert_eq!(cache.len(), 1);
Sources: src/map.rs(L625 - L646)
Practical Use Cases
WeakMap
is ideal for several common scenarios:
Use Case | Description |
---|---|
Caches | Store cached data without preventing garbage collection |
Object Registries | Track objects by ID without affecting their lifecycle |
Observers | Maintain a list of observers without creating reference cycles |
Resource Pools | Track resources that can be released when no longer needed |
Example: Simple Cache Implementation
struct Cache {
    data: WeakMap<String, Weak<Vec<u8>>>,
}

impl Cache {
    fn new() -> Self {
        Self { data: WeakMap::new() }
    }

    fn store(&mut self, key: String, value: &Arc<Vec<u8>>) {
        self.data.insert(key, value);
    }

    fn retrieve(&self, key: &str) -> Option<Arc<Vec<u8>>> {
        self.data.get(key)
    }
}
Sources: src/map.rs(L60 - L65) src/map.rs(L207 - L214) src/map.rs(L258 - L263)
Advanced Usage Patterns
Relevant source files
This page explores complex usage patterns, optimization techniques, and best practices for the WeakMap
implementation. While Basic Usage Examples covers fundamental operations, this section focuses on advanced scenarios that leverage the full potential of weak references in map data structures.
Understanding the Cleanup Mechanism
The WeakMap
implementation includes an automatic cleanup mechanism that purges expired weak references. Understanding this mechanism is crucial for optimizing performance.
flowchart TD OP["Operation on WeakMap"] BUMP["Counter.bump()"] CHECK["reach_threshold() check"] CLEAN["cleanup()"] RETAIN["retain(!is_expired())"] RESET["Counter.reset()"] EXIT["Continue"] BUMP --> CHECK CHECK --> CLEAN CHECK --> EXIT CLEAN --> RESET CLEAN --> RETAIN OP --> BUMP
By default, cleanup occurs after OPS_THRESHOLD (1000) operations on the map. This value is defined as a constant in src/map.rs(L16).
Sources: src/map.rs(L14 - L47) src/map.rs(L158 - L169)
Examining Cleanup Performance
Understanding the performance characteristics of the cleanup process is important for applications with stringent timing requirements:
- Cleanup Complexity: The cleanup operation is O(n) as it iterates through all entries in the map.
- Lazy Cleanup: Entries are only removed during cleanup operations, not immediately when they expire.
- Actual vs. Raw Length: The len() method reports only valid entries, while raw_len() includes expired entries.
sequenceDiagram participant Client as Client participant WeakMap as WeakMap participant OpsCounter as OpsCounter participant BTreeMap as BTreeMap Client ->> WeakMap: Multiple operations loop Each operation WeakMap ->> OpsCounter: bump() end WeakMap ->> OpsCounter: reach_threshold()? OpsCounter -->> WeakMap: true WeakMap ->> WeakMap: cleanup() WeakMap ->> BTreeMap: retain(!is_expired()) WeakMap ->> OpsCounter: reset() Note over Client,BTreeMap: After cleanup Client ->> WeakMap: raw_len() WeakMap ->> BTreeMap: len() BTreeMap -->> Client: Count of all entries Client ->> WeakMap: len() WeakMap ->> WeakMap: iter().count() WeakMap -->> Client: Count of valid entries only
Sources: src/map.rs(L158 - L169) src/map.rs(L113 - L115) src/map.rs(L171 - L185)
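The valid-versus-raw length distinction can be demonstrated with plain standard-library types. This is an illustrative sketch (the helper names `raw_len`, `live_len`, and `demo` are not from the crate): counting valid entries requires an upgrade attempt per entry, while the raw count is just the underlying map's size.

```rust
use std::collections::BTreeMap;
use std::rc::{Rc, Weak};

/// O(1): just the underlying BTreeMap size (expired entries included).
fn raw_len(map: &BTreeMap<u32, Weak<String>>) -> usize {
    map.len()
}

/// O(n): every entry's weak reference must be checked for validity.
fn live_len(map: &BTreeMap<u32, Weak<String>>) -> usize {
    map.values().filter(|w| w.upgrade().is_some()).count()
}

fn demo() -> (usize, usize) {
    let mut map = BTreeMap::new();
    let kept = Rc::new("kept".to_string());
    map.insert(1, Rc::downgrade(&kept));
    {
        let gone = Rc::new("gone".to_string());
        map.insert(2, Rc::downgrade(&gone));
    } // `gone` dropped: entry 2 is now expired

    (raw_len(&map), live_len(&map))
}

fn main() {
    assert_eq!(demo(), (2, 1));
}
```

The gap between the two counts (2 raw vs. 1 live here) is exactly what the periodic cleanup eventually closes.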
Custom Reference Types
The WeakMap
can work with any reference type that implements the WeakRef
trait, while the values must be from types implementing the StrongRef
trait.
classDiagram class StrongRef { <<trait>> type Weak downgrade() -> Self::Weak ptr_eq(other: &Self) -> bool } class WeakRef { <<trait>> type Strong upgrade() -> Option is_expired() -> bool } class CustomStrongRef { data: T downgrade() -> CustomWeakRef ptr_eq(other: &Self) -> bool } class CustomWeakRef { reference: WeakInner~T~ upgrade() -> Option~CustomStrongRef~ is_expired() -> bool } CustomStrongRef ..|> StrongRef : implements CustomWeakRef ..|> WeakRef : implements CustomStrongRef --> CustomWeakRef : creates CustomWeakRef --> CustomStrongRef : may upgrade to
Implementing these traits for custom reference types allows you to integrate them with WeakMap
:
flowchart TD A["Custom Type Conversion"] B["Implement StrongRef for CustomStrong"] C["Implement WeakRef for CustomWeak"] D["Use in WeakMap<K, CustomWeak<T>>"] A --> B A --> C B --> D C --> D
Sources: src/traits.rs(L3 - L19) src/traits.rs(L21 - L40)
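A minimal sketch of such an integration, assuming trait shapes that mirror the diagram above (the trait bodies are re-declared locally here for a self-contained example; the real definitions live in the crate's src/traits.rs, and `Handle`/`WeakHandle` are hypothetical custom types):

```rust
use std::rc::{Rc, Weak};

// Local re-declarations mirroring the trait shapes shown above.
trait StrongRef {
    type Weak;
    fn downgrade(&self) -> Self::Weak;
    fn ptr_eq(&self, other: &Self) -> bool;
}

trait WeakRef {
    type Strong;
    fn upgrade(&self) -> Option<Self::Strong>;
    fn is_expired(&self) -> bool;
}

// A hypothetical custom strong/weak pair: newtypes around Rc/Weak.
struct Handle<T>(Rc<T>);
struct WeakHandle<T>(Weak<T>);

impl<T> StrongRef for Handle<T> {
    type Weak = WeakHandle<T>;
    fn downgrade(&self) -> WeakHandle<T> {
        WeakHandle(Rc::downgrade(&self.0))
    }
    fn ptr_eq(&self, other: &Self) -> bool {
        Rc::ptr_eq(&self.0, &other.0)
    }
}

impl<T> WeakRef for WeakHandle<T> {
    type Strong = Handle<T>;
    fn upgrade(&self) -> Option<Handle<T>> {
        self.0.upgrade().map(Handle)
    }
    fn is_expired(&self) -> bool {
        // A weak reference is expired once no strong references remain.
        self.0.strong_count() == 0
    }
}

fn demo() -> (bool, bool) {
    let strong = Handle(Rc::new(5));
    let weak = strong.downgrade();
    // Upgrade succeeds while `strong` is alive (the upgraded Handle
    // here is a temporary and is dropped immediately).
    let alive = weak.upgrade().is_some();
    drop(strong);
    (alive, weak.is_expired())
}

fn main() {
    assert_eq!(demo(), (true, true));
}
```

Any pair of types with this downgrade/upgrade/expiry relationship can play the same role as Rc/Weak or Arc/Weak.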
Advanced Conversion Operations
Converting Between WeakMap and StrongMap
The WeakMap
implementation provides methods for converting between weak and strong maps:
flowchart TD A["WeakMap<K, V::Weak>"] B["StrongMap<K, V::Strong>"] A -->|"upgrade()"| B B -->|"from()"| A
The upgrade() method creates a new StrongMap containing only the valid entries.
Sources: src/map.rs(L296 - L306) src/map.rs(L368 - L380)
Working with Iterators
WeakMap
provides various iterator types to work with different aspects of the map:
Iterator Type | Description | Returns | Implementation |
---|---|---|---|
Iter | References to entries | (&'a K, V::Strong) | src/map.rs382-430 |
Keys | References to keys | &'a K | src/map.rs444-485 |
Values | Valid values | V::Strong | src/map.rs487-528 |
IntoIter | Owned entries | (K, V::Strong) | src/map.rs530-571 |
IntoKeys | Owned keys | K | src/map.rs573-597 |
IntoValues | Owned values | V::Strong | src/map.rs599-623 |
Note that all iterators automatically filter out expired references, so you only get valid entries.
flowchart TD WM["WeakMap<K, V>"] Iter["Iter<'a, K, V>"] Keys["Keys<'a, K, V>"] Values["Values<'a, K, V>"] IntoIter["IntoIter<K, V>"] IntoKeys["IntoKeys<K, V>"] IntoValues["IntoValues<K, V>"] EntryRef["(&'a K, V::Strong)"] KeyRef["&'a K"] Value["V::Strong"] Entry["(K, V::Strong)"] Key["K"] OwnedValue["V::Strong"] IntoIter --> Entry IntoKeys --> Key IntoValues --> OwnedValue Iter --> EntryRef Keys --> KeyRef Values --> Value WM --> IntoIter WM --> IntoKeys WM --> IntoValues WM --> Iter WM --> Keys WM --> Values
Sources: src/map.rs(L119 - L149) src/map.rs(L382 - L623)
Memory Management Strategies
Minimizing Memory Overhead
When working with WeakMap
, consider these strategies to minimize memory overhead:
- Preemptive Cleanup: For large maps, consider manually triggering cleanup before critical operations.
- Monitoring Raw Size: Use raw_len() to monitor the total size including expired entries.
- Strategic Insert/Remove: Batch insertions and removals to minimize cleanup frequency.
flowchart TD A["Memory Optimization"] B["Preemptive Cleanup"] C["Size Monitoring"] D["Batch Operations"] B1["Call retain() manually"] C1["raw_len() vs len()"] D1["Insert/remove in batches"] A --> B A --> C A --> D B --> B1 C --> C1 D --> D1
Sources: src/map.rs(L113 - L115) src/map.rs(L158 - L169) src/map.rs(L187 - L201)
Thread Safety Considerations
The WeakMap
can be used with both single-threaded (Rc
/Weak
) and thread-safe (Arc
/Weak
) reference types.
flowchart TD A["Reference Type Selection"] B["Single-Threaded"] C["Multi-Threaded"] B1["Rc<T> / rc::Weak<T>"] B2["Implements StrongRef/WeakRef"] B3["Use in WeakMap<K, Weak<T>>"] C1["Arc<T> / sync::Weak<T>"] C2["Implements StrongRef/WeakRef"] C3["Use in WeakMap<K, Weak<T>>"] A --> B A --> C B --> B1 B --> B2 B --> B3 C --> C1 C --> C2 C --> C3
Selection depends on your concurrency requirements:
Reference Type | Thread-Safe | Use Case |
---|---|---|
Rc/Weak | No | Single-threaded applications, better performance |
Arc/Weak | Yes | Multi-threaded applications, safe concurrent access |
Sources: src/traits.rs(L42 - L64) src/traits.rs(L66 - L88)
Advanced Usage Patterns
Caching with Automatic Cleanup
WeakMap
is particularly well-suited for implementing caches that automatically evict entries when they are no longer used elsewhere:
flowchart TD Client["Client"] Cache["Cache System"] WeakMap["WeakMap"] Compute["Compute Value"] StoreRef["Store in Application"] AppData["Application Data"] Expire["Entry Expires"] NextCleanup["Next Cleanup"] AppData --> Expire Cache --> WeakMap Client --> Cache Compute --> StoreRef Compute --> WeakMap Expire --> NextCleanup NextCleanup --> WeakMap StoreRef --> AppData WeakMap --> Client WeakMap --> Compute
Sources: src/map.rs(L203 - L214) src/map.rs(L258 - L263)
Observer Pattern Implementation
WeakMap
can be used to implement observer patterns without memory leaks:
flowchart TD Subject["Observable Subject"] Observer1["Observer 1"] Observer2["Observer 2"] ObserverN["Observer N"] WeakMap["WeakMap<Id, Weak<Observer>>"] Expired["Reference Expired"] Removed["Entry Removed"] Expired --> Removed Observer2 --> Expired Subject --> WeakMap WeakMap --> Observer1 WeakMap --> Observer2 WeakMap --> ObserverN
Sources: src/map.rs(L203 - L214) src/map.rs(L258 - L263)
Breaking Reference Cycles
WeakMap
is ideal for breaking reference cycles in complex data structures:
flowchart TD ParentNode["Parent Node (Strong)"] ChildNodes["Child Nodes (Strong)"] ParentRefs["Parent References (Weak)"] ChildNodes --> ParentRefs ParentNode --> ChildNodes ParentRefs --> ParentNode
This pattern avoids memory leaks while maintaining bidirectional relationships.
Sources: src/traits.rs(L3 - L19) src/traits.rs(L21 - L40)
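The parent/child pattern in the diagram can be sketched with plain Rc/Weak (no WeakMap involved; `Node` and `demo` are illustrative names): children hold only weak back-edges, so the strong reference count never forms a cycle.

```rust
use std::cell::RefCell;
use std::rc::{Rc, Weak};

struct Node {
    name: String,
    parent: RefCell<Weak<Node>>,      // weak: does not keep the parent alive
    children: RefCell<Vec<Rc<Node>>>, // strong: a parent owns its children
}

fn demo() -> (bool, usize) {
    let parent = Rc::new(Node {
        name: "parent".into(),
        parent: RefCell::new(Weak::new()),
        children: RefCell::new(Vec::new()),
    });
    let child = Rc::new(Node {
        name: "child".into(),
        parent: RefCell::new(Rc::downgrade(&parent)),
        children: RefCell::new(Vec::new()),
    });
    parent.children.borrow_mut().push(Rc::clone(&child));

    // The child can still reach its parent through the weak back-edge...
    let parent_name = child.parent.borrow().upgrade().map(|p| p.name.clone());

    // ...but the back-edge adds no strong count, so there is no cycle:
    // dropping `parent` would free the whole structure.
    (parent_name == Some("parent".to_string()), Rc::strong_count(&parent))
}

fn main() {
    assert_eq!(demo(), (true, 1));
}
```

With strong references in both directions, the two counts would never reach zero and both nodes would leak.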
Performance Optimizations
Choosing the Right Cleanup Strategy
The default cleanup strategy may not be optimal for all use cases:
Usage Pattern | Recommended Approach |
---|---|
High churn (many entries added/removed) | Lower OPS_THRESHOLD or manual cleanup |
Mostly static data with few expirations | Default cleanup is adequate |
Memory-constrained environments | Preemptive cleanup after critical operations |
Performance-critical code paths | Consider manual cleanup during idle periods |
Optimizing Map Operations
For performance-critical applications, consider these strategies:
- Pre-sizing: If approximate size is known, create with appropriate capacity
- Batch Processing: Group insertions and retrievals to minimize cleanup overhead
- Strategic Cleanup: Trigger cleanup during low-activity periods
- Monitoring: Track raw_len() vs. len() to gauge cleanup effectiveness
Sources: src/map.rs(L158 - L169) src/map.rs(L113 - L115) src/map.rs(L171 - L185)
Conclusion
Advanced usage of WeakMap
requires understanding its internal cleanup mechanism, reference type interactions, and memory management characteristics. By applying the patterns and strategies outlined in this document, you can leverage WeakMap
effectively in complex applications while maintaining optimal performance.
Implementation Details
Relevant source files
This document provides a deep dive into the internal implementation of the weak-map
library. It covers the core mechanisms that enable automatic cleanup of expired references, the internal data structures, and how different components interact to provide efficient weak reference management. For information about usage patterns and API, see Usage Guide.
Internal Structure
The WeakMap
is implemented as a wrapper around Rust's standard BTreeMap
with additional logic to handle weak references and their lifecycle management.
classDiagram class WeakMap { inner: BTreeMap ops: OpsCounter new() cleanup() try_bump() insert(key, value) get(key) len() raw_len() } class OpsCounter { 0: AtomicUsize new() add(ops) bump() reset() get() reach_threshold() } class BTreeMap { <<standard library>> } WeakMap --> OpsCounter : contains WeakMap --> BTreeMap : wraps
Sources: src/map.rs(L11 - L65) src/map.rs(L13 - L55)
The WeakMap
structure has two main components:
- inner: A standard BTreeMap<K, V> that stores the actual key-value pairs
- ops: An OpsCounter that tracks operations to trigger cleanup at appropriate intervals
The OpsCounter
is a simple wrapper around an atomic counter that helps determine when to perform cleanup operations.
Cleanup Mechanism
One of the most important aspects of the WeakMap
implementation is its automatic cleanup mechanism, which ensures that expired references are removed.
flowchart TD A["Operation on WeakMap"] B["ops.bump()"] C["ops.reach_threshold()?"] D["cleanup()"] E["Continue operation"] F["ops.reset()"] G["Remove expired entries"] A --> B B --> C C --> D C --> E D --> F D --> G G --> E
Sources: src/map.rs(L157 - L169) src/map.rs(L13 - L47)
The cleanup process:
- Each operation increments the operation counter
- When the counter reaches the threshold (1000 operations, defined as OPS_THRESHOLD), cleanup is triggered
- The cleanup process resets the counter and removes all expired references from the map
This approach balances performance with memory usage:
- The map doesn't need to check every entry on every operation
- Cleanup is amortized over multiple operations
- Expired entries will eventually be removed without manual intervention
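The counter side of this mechanism can be sketched with an AtomicUsize, using the method names described in this document (`bump`, `reach_threshold`, `reset`); the bodies here are a plausible stand-in, not the crate's actual code.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

const OPS_THRESHOLD: usize = 1000;

/// Stand-in for the operations counter described above.
struct OpsCounter(AtomicUsize);

impl OpsCounter {
    fn new() -> Self {
        OpsCounter(AtomicUsize::new(0))
    }
    fn bump(&self) {
        // Relaxed ordering suffices here: the count only gates an
        // opportunistic cleanup, it does not synchronize other data.
        self.0.fetch_add(1, Ordering::Relaxed);
    }
    fn reach_threshold(&self) -> bool {
        self.0.load(Ordering::Relaxed) >= OPS_THRESHOLD
    }
    fn reset(&self) {
        self.0.store(0, Ordering::Relaxed);
    }
}

fn demo() -> (bool, bool, bool) {
    let ops = OpsCounter::new();
    for _ in 0..OPS_THRESHOLD - 1 {
        ops.bump();
    }
    let before = ops.reach_threshold(); // 999 ops: not yet
    ops.bump();
    let at = ops.reach_threshold(); // 1000 ops: cleanup would run now
    ops.reset();
    (before, at, ops.reach_threshold())
}

fn main() {
    assert_eq!(demo(), (false, true, false));
}
```

Because the counter takes `&self`, it can be bumped through a shared reference, which is why an atomic is used rather than a plain usize.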
Reference Management
The weak-map
library relies on two core traits for reference management:
classDiagram class StrongRef { <<trait>> type Weak downgrade() -> Self::Weak ptr_eq(other: &Self) -> bool } class WeakRef { <<trait>> type Strong upgrade() -> Option is_expired() -> bool } class Rc { downgrade() -> Weak ptr_eq(other: &Self) -> bool } class Weak { upgrade() -> Option strong_count() -> usize } class Arc { downgrade() -> Weak ptr_eq(other: &Self) -> bool } class ArcWeak { upgrade() -> Option strong_count() -> usize } Rc ..|> StrongRef : implements Weak ..|> WeakRef : implements Arc ..|> StrongRef : implements ArcWeak ..|> WeakRef : implements StrongRef --> WeakRef : associated type
Sources: src/traits.rs(L3 - L40) src/traits.rs(L42 - L88)
The trait implementations enable the WeakMap
to work with different types of weak references:
- StrongRef is implemented for Rc<T> and Arc<T>
- WeakRef is implemented for Weak<T> (from Rc) and Weak<T> (from Arc)
This abstraction allows the WeakMap
to be agnostic about the specific reference type being used, as long as it conforms to the trait requirements.
Operation Flow
When performing operations on a WeakMap
, there's a specific flow that handles the weak references correctly:
sequenceDiagram participant Client as Client participant WeakMap as WeakMap participant BTreeMap as BTreeMap participant WeakRef as WeakRef participant StrongRef as StrongRef Client ->> WeakMap: insert(key, strong_ref) WeakMap ->> WeakMap: try_bump() WeakMap ->> StrongRef: downgrade(strong_ref) StrongRef -->> WeakMap: weak_ref WeakMap ->> BTreeMap: insert(key, weak_ref) BTreeMap -->> WeakMap: optional old_weak_ref WeakMap ->> WeakRef: upgrade(old_weak_ref) WeakRef -->> WeakMap: optional old_strong_ref WeakMap -->> Client: optional old_strong_ref Note over WeakMap,BTreeMap: Later... Client ->> WeakMap: get(key) WeakMap ->> WeakMap: ops.bump() WeakMap ->> BTreeMap: get(key) BTreeMap -->> WeakMap: optional weak_ref WeakMap ->> WeakRef: upgrade(weak_ref) WeakRef -->> WeakMap: optional strong_ref WeakMap -->> Client: optional strong_ref
Sources: src/map.rs(L203 - L214) src/map.rs(L258 - L263)
Key points in the operation flow:
- For insertion (insert):
  - The strong reference is downgraded to a weak reference
  - The weak reference is stored in the map
  - If an existing reference is replaced, it's upgraded before being returned
- For retrieval (get):
  - The weak reference is retrieved from the map
  - The weak reference is upgraded to a strong reference if still valid
  - If the reference has expired, None is returned
Iterator Implementation
Iterators in WeakMap
are designed to filter out expired references automatically:
flowchart TD A["WeakMap Iter"] B["BTreeMap Iter"] C["For each entry"] D["Is referenceexpired?"] E["Yield (key, value)"] F["Skip entry"] A --> B B --> C C --> D D --> E D --> F E --> C F --> C
Sources: src/map.rs(L382 - L430) src/map.rs(L445 - L485) src/map.rs(L488 - L528)
The library provides several iterator types:
Iterator Type | Description | Returns |
---|---|---|
Iter | References to entries | (&'a K, V::Strong) |
Keys | References to keys | &'a K |
Values | Values as strong references | V::Strong |
IntoIter | Owned entries | (K, V::Strong) |
IntoKeys | Owned keys | K |
IntoValues | Owned values as strong references | V::Strong |
Each iterator automatically filters out entries with expired references by attempting to upgrade the weak reference. If the upgrade fails, the entry is skipped.
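The skip-on-failed-upgrade behavior can be sketched over a plain BTreeMap of weak references (an illustrative stand-in, not the crate's iterator code; `live_entries` and `demo` are names introduced here):

```rust
use std::collections::BTreeMap;
use std::rc::{Rc, Weak};

/// Yield only entries whose weak reference still upgrades,
/// mirroring how a weak-reference map's iterators skip expired entries.
fn live_entries(map: &BTreeMap<u32, Weak<String>>) -> Vec<(u32, Rc<String>)> {
    map.iter()
        .filter_map(|(k, w)| w.upgrade().map(|v| (*k, v)))
        .collect()
}

fn demo() -> Vec<u32> {
    let mut map = BTreeMap::new();
    let a = Rc::new("a".to_string());
    map.insert(1, Rc::downgrade(&a));
    {
        let b = Rc::new("b".to_string());
        map.insert(2, Rc::downgrade(&b));
    } // `b` dropped: entry 2 expires and is skipped during iteration

    live_entries(&map).into_iter().map(|(k, _)| k).collect()
}

fn main() {
    assert_eq!(demo(), vec![1]);
}
```

Each yielded value is a freshly upgraded strong reference, which is why iteration cost includes one upgrade attempt per entry, expired or not.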
Performance Considerations
The performance of WeakMap
is influenced by several implementation choices:
- Cleanup threshold: The cleanup process only runs after a certain number of operations (OPS_THRESHOLD = 1000), which amortizes the cost of cleanup.
- BTreeMap as the underlying data structure: The choice of BTreeMap provides O(log n) complexity for most operations.
flowchart TD subgraph subGraph0["Performance Trade-offs"] A["Immediate Cleanup"] B["High Operation Cost"] C["Minimal Memory Usage"] D["Lazy Cleanup"] E["Fast Operations"] F["Temporary Memory Overhead"] G["Current Approach(Threshold-based)"] H["Amortized Cost"] I["Bounded Memory Overhead"] end A --> B A --> C D --> E D --> F G --> H G --> I
Sources: src/map.rs(L15 - L16) src/map.rs(L157 - L169) src/map.rs(L625 - L660)
The current implementation strikes a balance between operation speed and memory usage:
- Operations are fast most of the time (no cleanup)
- Memory overhead is bounded (cleanup happens periodically)
- The cost of cleanup is amortized over multiple operations
The test cases in the codebase demonstrate this behavior, showing that after many operations the map will clean up expired references automatically.
Memory Management
The core memory management feature of WeakMap
is its ability to automatically handle expired references.
flowchart TD A["Object Creation"] B["Strong Reference(Rc/Arc)"] C["WeakMap storesWeak Reference"] D["Object Drop"] E["Strong Count = 0"] F["Weak ReferenceExpires"] G["Operation on WeakMap"] H["Cleanup Triggered?"] I["Remove ExpiredReferences"] J["Continue Operation"] A --> B B --> C C --> F D --> E E --> F F --> I G --> H H --> I H --> J
Sources: src/map.rs(L157 - L161) src/traits.rs(L33 - L39) src/map.rs(L632 - L660)
When an object is dropped elsewhere in the program:
- Its strong count reaches zero
- Any weak references to it become expired
- The next time the cleanup mechanism runs in WeakMap, these expired references will be removed
This ensures that WeakMap
doesn't hold onto memory for objects that are no longer needed elsewhere in the program.
Memory Management
Relevant source files
Purpose and Scope
This document explains how memory is managed in the weak-map library, focusing on weak references and the automatic cleanup process. It details how the WeakMap
data structure prevents memory leaks while allowing values to be deallocated when they're no longer needed elsewhere.
For information about the core components and API of the WeakMap
and StrongMap
implementations, see WeakMap and StrongMap. For details about the reference traits, see Reference Traits.
Weak vs Strong References
The weak-map library is built around the concept of weak references, which are references that don't prevent the referenced object from being deallocated.
flowchart TD subgraph subGraph0["Reference Types"] A["StrongReference (Rc/Arc)"] B["Referenced Object"] C["WeakReference (Weak)"] end A --> B A --> C C --> A C --> B
Sources: src/traits.rs(L3 - L40)
Key characteristics:
- Strong references (Rc<T>, Arc<T>) increase the reference count and prevent the object from being deallocated
- Weak references (Weak<T>) don't affect the reference count used for deallocation decisions
- Weak references can be "upgraded" to strong references, but this will fail if the object has been deallocated
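The four characteristics above can be observed directly with the standard library's Rc/Weak (the `demo` helper is just for illustration):

```rust
use std::rc::Rc;

fn demo() -> (usize, usize, bool, bool) {
    let strong = Rc::new(42);
    let weak = Rc::downgrade(&strong);

    // Weak references don't contribute to the strong count.
    let strong_count = Rc::strong_count(&strong); // 1
    let weak_count = Rc::weak_count(&strong);     // 1

    // Upgrading succeeds while a strong reference exists...
    let ok_before = weak.upgrade().is_some();

    // ...and fails once the last strong reference is dropped,
    // because the value has been deallocated.
    drop(strong);
    let ok_after = weak.upgrade().is_some();

    (strong_count, weak_count, ok_before, ok_after)
}

fn main() {
    assert_eq!(demo(), (1, 1, true, false));
}
```

The same behavior holds for Arc and its Weak counterpart in multi-threaded code.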
Reference Management Architecture
The library provides traits that abstract over different reference types:
classDiagram class StrongRef { <<trait>> type Weak downgrade() -> Self::Weak ptr_eq(other: &Self) -> bool } class WeakRef { <<trait>> type Strong upgrade() -> Option is_expired() -> bool } class Rc { downgrade() -> Weak ptr_eq(other: &Rc) -> bool } class RcWeak { upgrade() -> Option is_expired() -> bool } class Arc { downgrade() -> Weak ptr_eq(other: &Arc) -> bool } class ArcWeak { upgrade() -> Option is_expired() -> bool } StrongRef --> WeakRef : associated types Rc ..|> StrongRef : implements Arc ..|> StrongRef : implements RcWeak ..|> WeakRef : implements ArcWeak ..|> WeakRef : implements
Sources: src/traits.rs(L3 - L88)
This architecture allows the WeakMap
to work with different types of references, making it flexible for various use cases:
- Single-threaded applications can use Rc/Weak references
- Multi-threaded applications can use Arc/Weak references
Automatic Cleanup Mechanism
A key feature of WeakMap
is its automatic cleanup of expired weak references:
flowchart TD subgraph subGraph0["Cleanup Process"] C["OpsCounter"] D["cleanup()"] end A["WeakMap"] B["BTreeMap"] A --> B A --> C C --> D D --> B
Sources: src/map.rs(L13 - L47) src/map.rs(L158 - L169)
Operations Counter
The operations counter tracks the number of operations performed on the map:
Component | Purpose | Implementation |
---|---|---|
OpsCounter | Tracks operations on the map | Uses an atomic counter (AtomicUsize) |
OPS_THRESHOLD | Determines when cleanup occurs | Constant set to 1000 operations |
try_bump() | Increments counter and checks threshold | Called on mutations like insert/remove |
cleanup() | Removes expired entries | Retains only non-expired entries |
Sources: src/map.rs(L13 - L47)
The operations counter uses atomic operations to ensure thread safety. Each mutation operation (insert, remove) increments the counter, and when it reaches the threshold (1000 operations), the cleanup process is triggered.
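A rough sketch of this counter pattern using only std atomics — the names OpsCounter, OPS_THRESHOLD, and try_bump mirror the table above, but the actual code in src/map.rs may differ in detail:

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

const OPS_THRESHOLD: usize = 1000;

/// Counts mutations; callers run cleanup when try_bump returns true.
struct OpsCounter(AtomicUsize);

impl OpsCounter {
    const fn new() -> Self {
        Self(AtomicUsize::new(0))
    }

    /// Increment the counter and report whether the cleanup threshold
    /// has been reached since the last reset.
    fn try_bump(&self) -> bool {
        // fetch_add returns the previous value, so +1 is the new count.
        self.0.fetch_add(1, Ordering::Relaxed) + 1 >= OPS_THRESHOLD
    }

    /// Called after cleanup to start a fresh counting window.
    fn reset(&self) {
        self.0.store(0, Ordering::Relaxed);
    }
}
```

Using Relaxed ordering is sufficient here because the counter only gates an opportunistic cleanup; no other data depends on its ordering.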
sequenceDiagram participant Client as Client participant WeakMap as WeakMap participant OpsCounter as OpsCounter participant BTreeMap as BTreeMap Client ->> WeakMap: insert(key, value) WeakMap ->> OpsCounter: bump() OpsCounter ->> OpsCounter: increment count OpsCounter ->> WeakMap: check if count >= OPS_THRESHOLD alt Threshold reached WeakMap ->> OpsCounter: reset() WeakMap ->> BTreeMap: retain(!v.is_expired()) end WeakMap ->> BTreeMap: insert(key, downgraded_value)
Sources: src/map.rs(L164 - L169) src/map.rs(L258 - L263)
Reference Lifecycle in WeakMap
The lifecycle of references in the WeakMap
follows this pattern:
flowchart TD A["Client code"] B["WeakMap"] C["Weak reference"] D["BTreeMap"] E["Client code"] F["Strong reference (if still valid)"] G["Original object"] H["Weak reference becomes expired"] I["Any operation"] J["Remove expired references"] A --> B B --> C B --> D B --> E B --> F E --> B G --> H I --> J
Sources: src/map.rs(L207 - L214) src/map.rs(L258 - L263)
Key Stages:
- Storage Phase:
  - The client provides a key and a strong reference (&V::Strong)
  - WeakMap downgrades the strong reference to a weak reference
  - The key and weak reference are stored in the underlying BTreeMap
- Retrieval Phase:
  - The client requests a value by key
  - WeakMap retrieves the weak reference from the BTreeMap
  - It attempts to upgrade the weak reference to a strong reference
  - If successful (the object still exists), it returns the strong reference
  - If unsuccessful (the object has been deallocated), it returns None
- Cleanup Phase:
  - After a threshold number of operations, cleanup is triggered
  - All expired weak references are removed from the map
  - The operations counter is reset
Sources: src/map.rs(L158 - L161)
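The three phases can be sketched with std types only — an illustrative miniature, not the real weak-map API:

```rust
use std::collections::BTreeMap;
use std::rc::{Rc, Weak};

/// Illustrative miniature of the storage/retrieval/cleanup phases.
struct MiniWeakMap<K: Ord, V> {
    inner: BTreeMap<K, Weak<V>>,
}

impl<K: Ord, V> MiniWeakMap<K, V> {
    fn new() -> Self {
        Self { inner: BTreeMap::new() }
    }

    /// Storage phase: keep only a downgraded copy of the strong reference.
    fn insert(&mut self, key: K, value: &Rc<V>) {
        self.inner.insert(key, Rc::downgrade(value));
    }

    /// Retrieval phase: upgrade on access; None if the value was dropped.
    fn get(&self, key: &K) -> Option<Rc<V>> {
        self.inner.get(key).and_then(Weak::upgrade)
    }

    /// Cleanup phase: drop entries whose referent no longer exists.
    fn cleanup(&mut self) {
        self.inner.retain(|_, weak| weak.strong_count() > 0);
    }
}
```

The real WeakMap follows the same shape but triggers cleanup automatically via the operations counter rather than requiring an explicit call.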
Memory Management in Practice
The following test demonstrates how memory is automatically managed:
sequenceDiagram participant Test as Test participant WeakMap as WeakMap participant InnerScope as "Inner Scope" Test ->> WeakMap: create WeakMap<u32, Weak<&str>> Test ->> Test: create elem1 = Arc::new("1") Test ->> WeakMap: insert(1, &elem1) Test ->> InnerScope: enter inner scope InnerScope ->> InnerScope: create elem2 = Arc::new("2") InnerScope ->> WeakMap: insert(2, &elem2) InnerScope ->> Test: exit scope (elem2 is dropped) Test ->> WeakMap: get(1) WeakMap ->> Test: return Some(elem1) Test ->> WeakMap: get(2) WeakMap ->> Test: return None (elem2 was dropped) Test ->> WeakMap: len() WeakMap ->> Test: return 1 (only elem1 is still valid)
Sources: src/map.rs(L632 - L646)
In this test:
- Two values are inserted into the WeakMap
- The second value (elem2) goes out of scope and is dropped
- When retrieving the values, only the first one is still available
- The len() method accurately reports only one valid element, even though there are two entries in the underlying map
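The same scenario can be reproduced with plain std types — a hedged reconstruction of the behavior the diagram describes, not the actual test code from src/map.rs:

```rust
use std::collections::BTreeMap;
use std::sync::{Arc, Weak};

/// Returns (value 1 alive?, value 2 alive?, logical length, raw length).
fn lifecycle_demo() -> (bool, bool, usize, usize) {
    let mut map: BTreeMap<u32, Weak<&str>> = BTreeMap::new();

    let elem1 = Arc::new("1");
    map.insert(1, Arc::downgrade(&elem1));

    {
        let elem2 = Arc::new("2");
        map.insert(2, Arc::downgrade(&elem2));
        // elem2 is dropped here, at the end of the inner scope
    }

    let got1 = map.get(&1).and_then(Weak::upgrade).is_some();
    let got2 = map.get(&2).and_then(Weak::upgrade).is_some();

    // Logical length counts only still-valid references;
    // the raw length still reports every stored entry.
    let len = map.values().filter(|w| w.upgrade().is_some()).count();
    let raw_len = map.len();

    (got1, got2, len, raw_len)
}
```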
Performance Considerations
The automatic cleanup mechanism balances memory usage with performance:
Consideration | Implementation | Trade-off |
---|---|---|
Delayed cleanup | Cleanup occurs after OPS_THRESHOLD operations | Amortizes cleanup cost across operations |
Lazy iteration | Iterators skip expired references | Avoids unnecessary memory allocations |
On-demand length | len() counts only valid references | More expensive than raw_len() but accurate |
Targeted cleanup | retain() function used to filter | More efficient than rebuilding the map |
Sources: src/map.rs(L158 - L161) src/map.rs(L172 - L179)
For high-performance scenarios where cleanup frequency needs to be tuned, the OPS_THRESHOLD
constant (set to 1000) determines how often the cleanup process runs. This value represents a balance between memory usage (keeping expired references around) and CPU usage (cleaning up frequently).
For more in-depth performance considerations, see Performance Considerations.
Performance Considerations
Relevant source files
This document covers the performance characteristics of the WeakMap
implementation, including its automatic cleanup mechanism, operation complexity, and memory management considerations. For information about memory management internals, see Memory Management.
Automatic Cleanup Mechanism
WeakMap
implements an automatic garbage collection mechanism that periodically removes expired weak references from its internal storage. This prevents the map from accumulating dead entries indefinitely.
flowchart TD subgraph subGraph0["Cleanup Process"] D["Run cleanup()"] F["Reset counter"] G["Remove expired references"] end A["Operation on WeakMap"] B["Increment OpsCounter"] C["Counter ≥ OPS_THRESHOLD?"] E["Continue"] A --> B B --> C C --> D C --> E D --> F D --> G G --> E
The cleanup process is controlled by an operations counter that triggers garbage collection after a set threshold:
- Each map operation increments a counter
- When the counter reaches OPS_THRESHOLD (1000 operations), cleanup runs
- Cleanup removes all entries with expired weak references
- The counter resets after cleanup
Sources: src/map.rs(L13 - L47) src/map.rs(L158 - L169)
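The cleanup step itself amounts to a retain call over the underlying map, roughly as follows (std-only sketch; the real implementation lives in src/map.rs):

```rust
use std::collections::BTreeMap;
use std::rc::Weak;

/// Remove every entry whose weak reference can no longer be upgraded.
fn cleanup(map: &mut BTreeMap<u32, Weak<String>>) {
    // An entry is expired when its referent has no strong references left.
    map.retain(|_, weak| weak.strong_count() > 0);
}
```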
Operation Complexity
WeakMap
is built on top of BTreeMap
and inherits its characteristics, with additional overhead for weak reference handling.
Operation | Time Complexity | Notes |
---|---|---|
get(key) | O(log n) | Plus weak reference upgrade cost |
insert(key, value) | O(log n) | Plus weak reference downgrade cost |
remove(key) | O(log n) | Plus weak reference upgrade cost |
len() | O(n) | Must iterate all entries to filter expired refs |
is_empty() | O(n) | Calls len() under the hood |
contains_key(key) | O(log n) | Must check if reference is expired |
cleanup() | O(n) | Full scan removing expired references |
upgrade() to StrongMap | O(n) | Must attempt to upgrade all references |
Iteration | O(n) | Filters out expired references during iteration |
Sources: src/map.rs(L158 - L161) src/map.rs(L176 - L179) src/map.rs(L383 - L430)
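The cost gap between the logical and raw lengths can be sketched with std types — len and raw_len here mirror the table's names, not the exact weak-map signatures:

```rust
use std::collections::BTreeMap;
use std::rc::Weak;

/// Logical length: O(n), because every entry must be checked for expiry.
fn len(map: &BTreeMap<u32, Weak<String>>) -> usize {
    map.values().filter(|w| w.strong_count() > 0).count()
}

/// Raw length: O(1), but it counts expired entries too.
fn raw_len(map: &BTreeMap<u32, Weak<String>>) -> usize {
    map.len()
}
```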
Memory Management Model
The key performance advantage of WeakMap
is its ability to avoid memory leaks by not keeping values alive when they're no longer needed elsewhere in the program.
sequenceDiagram participant Client as Client participant WeakMapKV as "WeakMap<K, V>" participant BTreeMapKV as "BTreeMap<K, V>" participant OpsCounter as OpsCounter Client ->> WeakMapKV: insert(key, strong_ref) WeakMapKV ->> WeakMapKV: downgrade(strong_ref) WeakMapKV ->> BTreeMapKV: store(key, weak_ref) WeakMapKV ->> OpsCounter: bump() Note over Client,BTreeMapKV: Later, value dropped elsewhere Client ->> WeakMapKV: get(key) WeakMapKV ->> BTreeMapKV: get(key) BTreeMapKV ->> WeakMapKV: return weak_ref WeakMapKV ->> WeakMapKV: attempt upgrade() WeakMapKV ->> Client: return None (reference expired) Note over WeakMapKV,OpsCounter: After OPS_THRESHOLD operations WeakMapKV ->> OpsCounter: check threshold OpsCounter ->> WeakMapKV: threshold reached WeakMapKV ->> WeakMapKV: cleanup() WeakMapKV ->> BTreeMapKV: remove expired entries
This diagram illustrates how WeakMap
interacts with its references and automatic cleanup mechanism throughout the lifecycle of operations.
Sources: src/map.rs(L158 - L169) src/map.rs(L207 - L214) src/map.rs(L258 - L263)
Performance Implications
Cleanup Overhead
While the automatic cleanup provides memory safety, it comes with performance costs:
- Periodic O(n) cleanup operations that scan all entries
- Unpredictable timing of these operations can cause occasional latency spikes
- Cleanup frequency depends on operation patterns (thrashing can occur with certain workloads)
In the worst case, if a WeakMap
contains many expired references and few valid ones, a significant portion of operations can be spent on cleanup rather than useful work.
Sources: src/map.rs(L158 - L161) src/map.rs(L16)
Iterator Performance
Iterators in WeakMap
must filter out expired references during iteration, adding overhead compared to regular collection iterators:
flowchart TD subgraph subGraph0["Regular BTreeMap iteration"] G["BTreeMap::iter()"] H["Iterator.next() called"] I["Return (key, value) directly"] end A["WeakMap::iter()"] B["BTreeMap::iter()"] C["Wrapped in WeakMap::Iter"] D["Iterator.next() called"] E["Reference expired?"] F["Return (key, upgraded_value)"] A --> B B --> C C --> D D --> E E --> D E --> F G --> H H --> I
This filtering during iteration means:
- Size hints are less accurate (0 to n rather than exact counts)
- Iteration may be slower than with regular collections
- Memory usage during iteration remains efficient due to lazy evaluation
Sources: src/map.rs(L383 - L405) src/map.rs(L390 - L399)
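Such a filtering adapter can be sketched with std iterators (illustrative only; the real Iter type in src/map.rs wraps the BTreeMap iterator in a similar way):

```rust
use std::collections::BTreeMap;
use std::rc::{Rc, Weak};

/// Iterate only live entries, yielding upgraded strong references.
/// Expired entries are silently skipped, which is why the yield count
/// can be anywhere from 0 to map.len().
fn live_entries<K: Ord, V>(
    map: &BTreeMap<K, Weak<V>>,
) -> impl Iterator<Item = (&K, Rc<V>)> {
    map.iter().filter_map(|(k, w)| w.upgrade().map(|v| (k, v)))
}
```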
Memory Usage vs Regular Collections
The weak reference approach offers significant memory advantages in certain scenarios:
Collection Type | Memory Behavior | Reference Behavior |
---|---|---|
BTreeMap<K, V> | Stores full values | Values kept alive even when unused elsewhere |
WeakMap<K, V> | Stores weak references | Values collected when no strong refs exist |
When storing large objects that may be dropped elsewhere in the program, WeakMap
allows for automatic reclamation of memory without manual bookkeeping.
Sources: src/map.rs(L60 - L65)
Performance Tuning
The main tunable parameter for WeakMap
performance is the OPS_THRESHOLD
constant:
flowchart TD A["OPS_THRESHOLD"] B["Cleanup Frequency"] C["More frequent cleanup"] D["Less frequent cleanup"] E["(+) Less memory overhead"] F["(–) More CPU overhead"] G["(–) More latency spikes"] H["(+) Less CPU overhead"] I["(+) Fewer latency spikes"] J["(–) More memory overhead"] A --> B B --> C B --> D C --> E C --> F C --> G D --> H D --> I D --> J
The default threshold (1000 operations) aims to balance:
- Memory usage (keeping expired references consumes memory)
- CPU overhead (running cleanup too frequently is expensive)
- Latency consistency (avoiding frequent pauses for cleanup)
Sources: src/map.rs(L16)
Real-World Performance Behavior
The test cases demonstrate key performance characteristics:
- Basic Functionality Test: Shows how expired references are automatically excluded from operations like len() and get().
- Cleanup Trigger Test: Shows how the cleanup mechanism is automatically triggered after OPS_THRESHOLD operations, removing expired references from the map's internal storage.
Testing shows that after many operations, the map correctly maintains its state:
- Only counts valid references in its logical length (len())
- Still tracks the total entries in its raw length (raw_len())
- Automatically cleans up entries when the threshold is reached
Sources: src/map.rs(L625 - L660)
Optimization Recommendations
When using WeakMap
in performance-sensitive code, consider these guidelines:
- Avoid frequent len() calls: Since this operation is O(n), cache the length if needed repeatedly.
- Be aware of operation count: Operations that might trigger cleanup can cause occasional performance spikes.
- Use raw_len() for debugging: This gives you the total entry count, including expired ones, without the O(n) scan.
- Consider selective cleanup: For very large maps, consider manually cleaning up at strategic times rather than relying solely on the automatic threshold.
- Use appropriate data structures: If you don't need the weak reference behavior, consider using StrongMap, which avoids the overhead of reference handling.
Sources: src/map.rs(L113 - L115) src/map.rs(L176 - L179)
Project Information
Relevant source files
This document provides essential information about the weak-map project structure, development workflow, contribution guidelines, and licensing. For detailed technical information about the implementation, please refer to Core Components and Implementation Details.
Project Overview
The weak-map repository provides a Rust implementation of WeakMap
- a B-Tree map data structure that stores weak references to values, automatically removing entries when referenced values are dropped. It is hosted on GitHub at https://github.com/Starry-OS/weak-map and published as a crate on crates.io.
flowchart TD A["weak-map Repository"] B["Source Code"] C["Project Metadata"] D["CI Configuration"] E["lib.rs"] F["map.rs"] G["traits.rs"] H["README.md"] I["Cargo.toml"] J["License Files"] K[".github/workflows/ci.yml"] A --> B A --> C A --> D B --> E B --> F B --> G C --> H C --> I C --> J D --> K
Sources: README.md, .github/workflows/ci.yml
Repository Structure
The weak-map project follows a standard Rust crate organization with a clean separation between the core implementations and trait definitions.
classDiagram class SourceCode { src/lib.rs src/map.rs src/traits.rs } class Implementation { WeakMap StrongMap } class Traits { StrongRef WeakRef } class ProjectFiles { README.md Cargo.toml LICENSE-MIT LICENSE-APACHE-2.0 } class CIConfig { .github/workflows/ci.yml } SourceCode --> Implementation : contains SourceCode --> Traits : contains Implementation ..> Traits : uses
Sources: README.md
Development Workflow
The weak-map project employs GitHub Actions for continuous integration to ensure code quality and test coverage.
CI Process
The CI workflow runs automatically on:
- Push to the main branch
- Pull requests targeting the main branch
flowchart TD A["Push/PR to main branch"] B["CI Workflow Triggered"] C["Matrix Setup"] D["Rust Toolchain Versions"] E["stable"] F["nightly"] G["nightly-2025-01-18"] H["Run Clippy"] I["Run Tests"] J["Build Success/Failure Report"] A --> B B --> C C --> D D --> E D --> F D --> G E --> H F --> H G --> H H --> I I --> J
Sources: .github/workflows/ci.yml(L3 - L10)
CI Actions
The CI performs these specific checks:
- Clippy Linting: Runs with all features and targets, with warnings treated as errors:
cargo clippy --all-features --all-targets -- -Dwarnings
- Comprehensive Testing: Runs all tests with all features enabled:
cargo test --all-features
Sources: .github/workflows/ci.yml(L28 - L31)
Contributing Guidelines
Contributions to the weak-map project are welcome. Based on the repository structure and CI configuration, here are the recommended steps for contributing:
- Fork the repository on GitHub
- Create a feature branch for your changes
- Make your changes following the code style of the project
- Add tests for your changes to ensure they work correctly
- Run the checks locally that will be performed by CI:
cargo clippy --all-features --all-targets -- -Dwarnings
cargo test --all-features
- Submit a pull request to the main branch
sequenceDiagram participant Developer as Developer participant Repository as Repository participant CI as CI Developer ->> Repository: Fork repository Developer ->> Developer: Create feature branch Developer ->> Developer: Make changes Developer ->> Developer: Add tests Developer ->> Developer: Run local checks Developer ->> Repository: Submit pull request Repository ->> CI: Trigger CI checks CI ->> Repository: Report results alt Tests Pass Repository ->> Developer: Approve and merge else Tests Fail Repository ->> Developer: Request changes end
Sources: .github/workflows/ci.yml(L14 - L31)
License Information
The weak-map project is dual-licensed under both the MIT License and the Apache License 2.0, allowing users to choose the license that best suits their needs.
Dual License Approach
flowchart TD A["weak-map Project"] B["MIT License"] C["Apache License 2.0"] D["Simple, permissive license"] E["Includes explicit patent grants"] F["Users can choose either license"] A --> B A --> C B --> D C --> E D --> F E --> F
License Usage
- MIT License: A permissive license that allows users to do almost anything with the code, including using it in proprietary software, as long as they provide attribution.
- Apache License 2.0: Also permissive, but includes explicit patent grants and more detailed terms around trademark usage.
The license files (LICENSE-MIT and LICENSE-APACHE-2.0) are included in the repository root directory, as indicated by project structure diagrams.
Sources: README.md
Package Information
The weak-map package is published on crates.io and documentation is available on docs.rs.
flowchart TD A["weak-map Package"] B["crates.io"] C["docs.rs"] D["Rust dependency management"] E["API documentation"] F["Your Project"] G["Add dependency in Cargo.toml"] A --> B A --> C B --> D C --> E F --> G G --> B
Project Origins
As noted in the README, weak-map is "similar to and inspired by weak-table but using BTreeMap
as underlying implementation."
Sources: README.md(L6)
Contributing Guide
Relevant source files
This document provides guidelines and instructions for contributing to the weak-map library. It covers the development workflow, code standards, CI process, and pull request procedures. For information on how to use the library, see the Usage Guide or Core Components for implementation details.
Development Environment Setup
Prerequisites
To contribute to weak-map, you'll need:
- Rust toolchain (stable, though the project is tested on nightly as well)
- Cargo (Rust's package manager)
- Git
Getting Started
flowchart TD A["Fork Repositoryon GitHub"] B["Clone Repositorygit clone <your-fork-url>"] C["Set Upstreamgit remote add upstream https://github.com/Starry-OS/weak-map"] D["Create Branchgit checkout -b feature/your-feature"] E["Make Changes"] F["Run Testscargo test --all-features"] G["Run Clippycargo clippy --all-features --all-targets"] H["Commit Changesgit commit -m 'Add feature X'"] I["Push Changesgit push origin feature/your-feature"] J["Create Pull Requeston GitHub"] A --> B B --> C C --> D D --> E E --> F F --> G G --> H H --> I I --> J
Sources: .github/workflows/ci.yml(L1 - L32)
Code Standards and Guidelines
The weak-map codebase follows standard Rust coding conventions. All contributions should:
- Pass Clippy checks with no warnings (cargo clippy --all-features --all-targets -- -Dwarnings)
- Include appropriate tests
- Maintain or improve test coverage
- Include documentation for public API items
- Follow the existing code style
Project Structure
When making changes, it's important to understand the project's structure:
flowchart TD A["src/lib.rsMain Entry Point"] B["src/map.rsWeakMap & StrongMap Implementations"] C["src/traits.rsWeakRef & StrongRef Traits"] D["WeakMap Struct"] E["StrongMap Struct"] F["WeakRef Trait"] G["StrongRef Trait"] A --> B A --> C B --> D B --> E C --> F C --> G
Sources: README.md(L1 - L7)
Testing Requirements
All contributions must include appropriate tests:
- Unit Tests: Test individual functions and methods
- Integration Tests: Test interactions between components
- Edge Cases: Include tests for boundary conditions
Run tests locally before submitting a PR:
cargo test --all-features
CI Process
The weak-map repository uses GitHub Actions for continuous integration:
flowchart TD A["Pull RequestCreated/Updated"] B["GitHub ActionsCI Workflow Triggered"] C["Check Job"] D["Rust StableClippy + Tests"] E["Rust NightlyClippy + Tests"] F["Rust Nightly-2025-01-18Clippy + Tests"] G["All ChecksPass?"] H["Ready for Review"] I["Fix Issues"] J["Push Updates"] A --> B B --> C C --> D C --> E C --> F D --> G E --> G F --> G G --> H G --> I I --> J J --> B
The CI process checks:
- Clippy static analysis with warnings treated as errors
- All tests passing across multiple Rust versions
- All features enabled during testing
Sources: .github/workflows/ci.yml(L1 - L32)
Pull Request Guidelines
PR Submission
When submitting a pull request:
- Provide a clear, descriptive title
- Include a detailed description of changes
- Reference any related issues
- Explain your testing approach
- Highlight any breaking changes
Review Process
The review process typically involves:
- CI checks passing
- Code review by maintainers
- Addressing feedback
- Final approval and merge
sequenceDiagram participant Contributor as Contributor participant CISystem as CI System participant Maintainer as Maintainer Contributor ->> CISystem: Submit PR CISystem ->> CISystem: Run tests & checks CISystem ->> Maintainer: Report results Maintainer ->> Contributor: Provide feedback Contributor ->> CISystem: Address feedback CISystem ->> CISystem: Re-run tests CISystem ->> Maintainer: Report updated results Maintainer ->> Maintainer: Final review Maintainer ->> Contributor: Approve/request changes Contributor ->> Maintainer: Address final requests Maintainer ->> Contributor: Merge PR
Sources: .github/workflows/ci.yml(L1 - L32)
Documentation
Documentation is a crucial part of the weak-map project:
- Code Documentation: All public APIs should have rustdoc comments
- Examples: Include examples for non-trivial functionality
- Wiki Contributions: Update relevant wiki pages when changing functionality
Documentation Style
/// A map containing weak references to values.
///
/// Values are automatically removed when the original reference is dropped.
///
/// # Examples
///
/// ```
/// use weak_map::WeakMap;
/// use std::rc::Rc;
///
/// let mut map = WeakMap::new();
/// let value = Rc::new("value");
///
/// map.insert("key", &value);
/// assert!(map.contains_key("key"));
///
/// drop(value); // Drop the strong reference
/// assert!(map.get("key").is_none());
/// ```
Licensing
The weak-map project is dual-licensed under MIT and Apache 2.0 licenses. By contributing to this project, you agree that your contributions will be licensed under both licenses.
For details about the project's licenses, see the License Information page.
Technical Requirements Checklist
Before submitting your PR, ensure you've completed the following:
Requirement | Description | Status |
---|---|---|
Clippy Checks | cargo clippy --all-features --all-targets passes with no warnings | ☐ |
Tests | All existing tests pass and new functionality has tests | ☐ |
Documentation | Public APIs are documented with rustdoc comments | ☐ |
CI Passing | All CI checks pass on GitHub | ☐ |
Code Style | Code follows existing style and conventions | ☐ |
Breaking Changes | Breaking changes are clearly documented | ☐ |
Sources: .github/workflows/ci.yml(L1 - L32)
License Information
Relevant source files
This document details the licensing structure of the weak-map library, explaining the dual-licensing approach that allows users to choose between two open-source licenses when using or modifying the codebase.
Dual-Licensing Model
The weak-map library is dual-licensed, allowing users to choose between the MIT License and the Apache License 2.0. This is specified in the Cargo.toml configuration file:
license = "MIT OR Apache-2.0"
The "OR" operator indicates that users may select either license according to their preferences and requirements, without needing to comply with both.
Sources: Cargo.toml(L7)
License Comparison
The following table compares key aspects of both licenses:
Feature | MIT License | Apache License 2.0 |
---|---|---|
License length | Brief (22 lines) | Comprehensive (200+ lines) |
Patent protection | No explicit patent grant | Explicit patent grant (Section 3) |
Trademark provisions | None | Explicit restrictions (Section 6) |
Modification notices | Requires copyright notice preservation | Requires indicating significant changes (Section 4) |
Contribution terms | Not specified | Explicitly addressed (Section 5) |
Warranty disclaimer | Simple disclaimer | Detailed disclaimer (Section 7) |
Liability limitation | Simple limitation | Detailed limitation (Section 8) |
Sources: LICENSE-MIT LICENSE-APACHE-2.0
License Files
The repository contains two license files:
- LICENSE-MIT: Contains the full text of the MIT License, dated 2025 with copyright attributed to Asakura Mizu.
- LICENSE-APACHE-2.0: Contains the full text of the Apache License 2.0, with copyright attributed to Asakura Mizu.
These files serve as the authoritative license texts for the project.
Sources: LICENSE-MIT LICENSE-APACHE-2.0
Dual-Licensing Structure
flowchart TD A["weak-map library"] B["Dual License Structure"] C["MIT License"] D["Apache License 2.0"] E["LICENSE-MIT file"] F["LICENSE-APACHE-2.0 file"] G["Cargo.toml license declaration"] A --> B B --> C B --> D C --> E D --> F G --> B
Sources: Cargo.toml(L7) LICENSE-MIT LICENSE-APACHE-2.0
Compliance Requirements
MIT License Compliance
To comply with the MIT License when using weak-map:
- Include the following copyright notice: "Copyright (c) 2025 Asakura Mizu"
- Include the complete MIT license text from the LICENSE-MIT file
- Include both items in all copies or substantial portions of the software
Apache License 2.0 Compliance
To comply with the Apache License 2.0 when using weak-map:
- Include the copyright notice: "Copyright 2025 Asakura Mizu"
- Include a complete copy of the Apache License 2.0
- For modified files, add notices stating that you changed the files
- Retain all copyright, patent, trademark, and attribution notices
- If the original contains a NOTICE file, include a readable copy of its attribution notices
Sources: LICENSE-MIT(L3 - L21) LICENSE-APACHE-2.0(L89 - L201)
License Selection Decision Flow
The following diagram provides guidance on selecting the appropriate license for your use case:
flowchart TD A["License selection factors"] B["Need patent protection?"] C["Apache 2.0 preferred"] D["Simple project needs?"] E["MIT preferred"] F["Need explicit contribution terms?"] G["Either license acceptable"] H["Use under Apache License 2.0"] I["Use under MIT License"] J["Choose based on ecosystem or preference"] A --> B B --> C B --> D C --> H D --> E D --> F E --> I F --> C F --> G G --> J
Sources: LICENSE-MIT LICENSE-APACHE-2.0 Cargo.toml(L7)
Implications for Contributors
Contributors to the weak-map project should understand:
- Their contributions will be available under both licenses
- By submitting a contribution, they agree their work may be distributed under either license
- The project maintainers can relicense their contributions as needed within these two options
- Any separate agreements with the project maintainers take precedence
This follows standard practice for dual-licensed Rust projects, ensuring maximum flexibility for users of the library.
Sources: LICENSE-MIT LICENSE-APACHE-2.0 Cargo.toml(L7)
License Coverage
The dual-licensing covers all components of the weak-map library, including:
- The core implementation files in src/
- Documentation and examples
- Build configurations and metadata
All these components can be used, modified, and distributed according to either license at the user's discretion.
Sources: Cargo.toml(L7)
Project Metadata License Information
The licensing information is also reflected in the project metadata, which is important for users who install the library via Cargo. The specification in Cargo.toml ensures that the licensing information is properly included in the package registry (crates.io) and documentation (docs.rs).
[package]
name = "weak-map"
version = "0.1.0"
edition = "2024"
authors = ["Asakura Mizu <asakuramizu111@gmail.com>"]
description = "BTreeMap with weak references"
license = "MIT OR Apache-2.0"
repository = "https://github.com/Starry-OS/weak-map"
documentation = "https://docs.rs/weak-map"
This metadata ensures transparency about the licensing terms and helps users make informed decisions about incorporating the library into their projects.
Sources: Cargo.toml(L1 - L11)
Overview
Relevant source files
AXNS (Resource Namespace System) is a Rust library providing a unified interface for managing and controlling access to system resources across different deployment scenarios. It enables configurable resource sharing and isolation between processes and threads in various operating system environments, from unikernels with shared resources to monolithic kernels or containerized environments requiring isolation.
For more detailed information about specific components, see Core Concepts and Thread-Local Features.
Purpose and Scope
AXNS addresses several key requirements for flexible resource management:
- Unified Resource Access: Providing a consistent interface to system resources
- Configurable Isolation: Supporting varying degrees of resource sharing between threads
- Deployment Flexibility: Working effectively in different system architectures
- Memory Safety: Ensuring proper resource initialization and cleanup
- Type Safety: Providing strongly-typed access to resources
The system manages resources such as virtual address spaces, working directories, file descriptors, and other system facilities that might need to be shared or isolated.
Sources: README.md(L5 - L14)
Core Architecture
AXNS follows a modular design with several key architectural patterns:
flowchart TD subgraph subGraph2["Access Patterns"] F["Global Namespace"] G["Shared resources"] H["Thread-local Namespaces"] I["Isolated resources"] end subgraph subGraph1["Namespace Management Layer"] C["Namespace"] D["ResArc references"] E["Resource instances"] end subgraph subGraph0["Resource Definition Layer"] A["def_resource! macro"] B["Static Resource"] end J["Unikernel Mode"] K["Process/Container Mode"] A --> B B --> E C --> D D --> E F --> G G --> J H --> I I --> K
The architecture consists of these primary components:
Component | Description | Role |
---|---|---|
Namespace | Container for resources | Stores and provides access to system resources |
Resource | Resource type metadata | Defines memory layout and lifecycle functions |
ResWrapper | Static resource handle | Provides the public API for resource access |
ResArc | Reference-counted pointer | Manages resource lifecycle and memory |
def_resource! | Resource definition macro | Simplifies creation of new resource types |
Sources: src/lib.rs(L10 - L14)
Component Relationships
classDiagram class Namespace { +ptr: NonNull~ResArc~ +new() Namespace +get(Resource) &ResArc +get_mut(Resource) &mut ResArc } class Resource { +layout: Layout +init: fn pointer +drop: fn pointer +index() usize } class ResWrapper~T~ { +res: &'static Resource +current() ResCurrent~T~ +get(Namespace) &T +get_mut(Namespace) &mut T +share_from(dst, src) +reset(Namespace) } class ResArc~T~ { +ptr: NonNull~ResInner~ +as_ref() &T +get_mut() Option~&mut T~ } class CurrentNs { <<trait>> +new() Self +as_ref() &Namespace } Namespace "1" --> "*" ResArc : contains ResArc "*" --> Resource : references ResWrapper "1" --> Resource : describes ResWrapper ..> "1" Namespace : accesses through CurrentNs ..> "1" Namespace : provides context
Sources: src/lib.rs(L10 - L14) src/lib.rs(L32 - L59)
Resource Access Flow
Accessing resources in AXNS follows this pattern:
sequenceDiagram participant ApplicationCode as "Application Code" participant ResWrapperT as "ResWrapper<T>" participant Namespace as "Namespace" participant ResArcT as "ResArc<T>" ApplicationCode ->> ResWrapperT: Define with def_resource! ApplicationCode ->> Namespace: Create namespace ApplicationCode ->> ResWrapperT: resource.get(&namespace) ResWrapperT ->> Namespace: namespace.get(resource) Namespace ->> ResArcT: Get ResArc ResArcT -->> ApplicationCode: Return reference to T ApplicationCode ->> ResWrapperT: resource.get_mut(&mut namespace) ResWrapperT ->> Namespace: namespace.get_mut(resource) Namespace ->> ResArcT: Get mutable ResArc ResArcT -->> ApplicationCode: Return mutable reference to T ApplicationCode ->> ResWrapperT: resource.current() ResWrapperT ->> Namespace: Get current_ns() Note over Namespace: Uses thread-local or global NS Namespace -->> ApplicationCode: Access through current namespace
Sources: src/lib.rs(L16 - L59)
Thread-Local Feature
AXNS provides an optional thread-local feature for fine-grained resource isolation:
stateDiagram-v2 state UseThreadLocalNS { [*] --> CheckTLS CheckTLS --> InitializeNew : First access CheckTLS --> UseExisting : Subsequent access } state AccessResources { [*] --> GetResource GetResource --> ModifyResource : get_mut() GetResource --> ShareResource : share_from() GetResource --> ResetResource : reset() } [*] --> FeatureCheck FeatureCheck --> UseGlobalNS : thread-local OFF FeatureCheck --> UseThreadLocalNS : thread-local ON UseGlobalNS --> AccessResources UseThreadLocalNS --> AccessResources
This feature is controlled by the `thread-local` feature flag in Cargo.toml:

```toml
[features]
thread-local = ["dep:extern-trait"]
```

When enabled, AXNS uses the `CurrentNs` trait to provide thread-local namespaces. When disabled, all access goes through the global namespace.
Sources: src/lib.rs(L32 - L59) Cargo.toml(L14 - L15)
Deployment Scenarios
AXNS supports various deployment models by adjusting namespace isolation:
flowchart TD subgraph subGraph0["Deployment Models"] A["Unikernel"] D["Single Global Namespace"] B["Monolithic Kernel"] E["Per-Process Namespaces"] C["Container Environment"] F["Grouped Namespaces"] end G["Shared Resources"] H["Process-Isolated Resources"] I["Container-Isolated Resources"] A --> D B --> E C --> F D --> G E --> H F --> I
- Unikernel Mode: A single global namespace shared by all threads (default)
- Monolithic Kernel Mode: Each process has its own namespace, with threads in the same process sharing resources
- Container Mode: System resources grouped into namespaces that are shared between specific processes
Sources: README.md(L5 - L14)
Summary
AXNS provides a flexible, efficient system for managing resource namespaces across different operating system environments. Its architecture balances the need for shared resources with isolation requirements, providing a consistent API regardless of the deployment scenario. The system's design ensures proper resource lifecycle management through reference counting, while the optional thread-local feature provides additional isolation when needed.
For practical guidance on using AXNS, see Usage Guide.
Sources: src/lib.rs(L1 - L59) README.md(L1 - L14)
Core Concepts
Relevant source files
This page explains the fundamental concepts of the AXNS namespace system, providing an overview of key components and their relationships. For detailed implementation information, see Namespaces, Resources and ResWrapper, and The def_resource! Macro.
System Purpose
AXNS (Axiomatic Namespace System) provides a unified interface for managing system resources in a structured, namespace-based approach. Its core purpose is to enable:
- Resource isolation or sharing between different parts of a system
- Consistent access to resources through well-defined interfaces
- Proper resource lifecycle management including initialization and cleanup
- Flexibility across different deployment scenarios from shared unikernel environments to containerized systems
Sources: src/lib.rs(L1 - L15)
Key Components Overview
The following diagram illustrates the primary components of the AXNS system and their relationships:
Sources: src/res.rs(L11 - L15) src/res.rs(L53 - L56) src/res.rs(L107 - L119) src/ns.rs(L7 - L10)
Component Descriptions
Namespace
A `Namespace` is a collection of resources that can be managed as a unit. It serves as a container that holds references to various system resources and provides controlled access to them.
Key characteristics:
- Contains an array of `ResArc` pointers (one for each defined resource)
- Provides methods to access resources both immutably and mutably
- Can be created explicitly or accessed implicitly via the current namespace
- Manages the lifecycle of its contained resources
Sources: src/ns.rs(L7 - L10) src/ns.rs(L22 - L36)
Resource
A `Resource` represents system resource metadata, including memory layout and lifecycle functions. Resources are defined statically and stored in a special program section called "axns_resources".
Key characteristics:
- Contains memory layout information
- Provides initialization and cleanup functions
- Stored in a special section of the compiled program
- Referenced by index in a namespace
Sources: src/res.rs(L11 - L15) src/res.rs(L36 - L44)
ResWrapper
`ResWrapper<T>` provides a type-safe interface to a specific resource. It acts as the primary API for interacting with resources across namespaces.
Key characteristics:
- References a static `Resource` instance
- Provides methods to access the resource in different namespaces
- Enables resource sharing between namespaces
- Allows resetting resources to their default values
Sources: src/res.rs(L53 - L56) src/res.rs(L58 - L105)
ResCurrent
`ResCurrent<T>` provides access to a resource in the "current" namespace, which might be a global namespace or a thread-local namespace depending on configuration.
Key characteristics:
- References a static `Resource` instance
- Contains a reference to the current namespace
- Implements `Deref` for convenient access to the resource
Sources: src/res.rs(L107 - L119) src/res.rs(L121 - L128)
Resource Access Flow
The following diagram shows how resources are accessed within the AXNS system:
flowchart TD subgraph subGraph1["Resource Management"] J["resource.reset(&mut ns)"] L["Reset to initial value"] K["resource.share_from(&mut dst, &src)"] M["Clone ResArc pointer"] end subgraph subGraph0["Resource Access Methods"] D["resource.get(&ns)"] G["Read resource data"] E["resource.get_mut(&mut ns)"] H["Modify resource data"] F["resource.current()"] I["Get resource from current namespace"] end A["Client Code"] B["Define Static Resource"] C["Create Namespace"] N["Feature enabled?"] O["Thread-local namespace"] P["Global namespace"] A --> B A --> C B --> D B --> E B --> F B --> J B --> K C --> D C --> E D --> G E --> H F --> I I --> N J --> L K --> M N --> O N --> P
Sources: src/res.rs(L70 - L76) src/res.rs(L80 - L82) src/res.rs(L90 - L92) src/res.rs(L96 - L98) src/res.rs(L102 - L104) src/lib.rs(L54 - L59)
Resource Definition
Resources are defined using the `def_resource!` macro, which creates both a static `Resource` instance and a corresponding `ResWrapper<T>` accessor:
flowchart TD A["def_resource! macro"] B["Static Resource"] C["ResWrapper instance"] D["Memory layout"] E["Init function"] F["Drop function"] G["API to access resource in namespaces"] H["Client code"] I["Namespace"] J["ResArc to resource data"] A --> B A --> C B --> D B --> E B --> F C --> B C --> G C --> I H --> C I --> J
Sources: src/res.rs(L144 - L168)
Thread-Local vs. Global Namespace Behavior
AXNS supports both global and thread-local namespaces through a feature flag:
sequenceDiagram participant ClientCode as Client Code participant ResWrapper as ResWrapper participant ResCurrent as ResCurrent participant current_ns as current_ns() participant ThreadLocal as Thread-Local participant Global as Global ClientCode ->> ResWrapper: resource.current() ResWrapper ->> ResCurrent: create ResCurrent ResCurrent ->> current_ns: crate::current_ns() alt thread-local feature enabled current_ns ->> ThreadLocal: CurrentNsImpl::new() ThreadLocal -->> current_ns: Thread-local namespace else thread-local feature disabled current_ns ->> Global: global_ns() Global -->> current_ns: Global namespace end current_ns -->> ResCurrent: Return CurrentNsImpl ResCurrent -->> ResWrapper: Return ResCurrent<T> ResWrapper -->> ClientCode: Resource access via deref
Sources: src/lib.rs(L16 - L59) src/res.rs(L70 - L76)
Key Operational Patterns
The AXNS system revolves around several key operational patterns:
- Resource Definition: Static resources are defined using the `def_resource!` macro, which creates metadata and accessor objects.
- Namespace Creation: Namespaces can be created explicitly (`Namespace::new()`) or accessed implicitly via the current namespace.
- Resource Access: Resources can be accessed in four main ways:
  - `resource.get(&ns)`: Immutable access in a specific namespace
  - `resource.get_mut(&mut ns)`: Mutable access in a specific namespace (if not shared)
  - `resource.current()`: Access in the current namespace
  - Direct namespace access: `ns.get(res)` and `ns.get_mut(res)`
- Resource Sharing: Resources can be shared between namespaces using `resource.share_from(&mut dst, &src)`.
- Resource Reset: Resources can be reset to their default values using `resource.reset(&mut ns)`.
Sources: src/res.rs(L58 - L105) src/ns.rs(L22 - L46)
Memory Management Model
AXNS implements careful memory management to ensure resources are properly initialized and cleaned up:
flowchart TD A["Namespace Creation"] B["Array of ResArc pointers"] C["Initialize ResArc"] D["Resource Access"] E["Get ResArc"] F["Share safely"] G["Resource Mutation"] H["Shared?"] I["Return None"] J["Return mutable reference"] K["Namespace Destruction"] L["Decrement ref count"] M["Call resource drop fn"] N["Deallocate memory"] A --> B B --> C D --> E E --> F G --> H H --> I H --> J K --> L L --> M M --> N
Sources: src/ns.rs(L22 - L36) src/ns.rs(L55 - L62)
Summary
The core concepts of AXNS revolve around:
- Resources - System objects with defined memory layouts and lifecycle functions
- Namespaces - Collections of resources that can be managed as units
- Wrappers - Type-safe interfaces for accessing resources in namespaces
- Current Namespace - A concept that provides easy access to resources without explicit namespace references
These components work together to provide a flexible, memory-safe system for managing resources in various deployment environments, from shared global resources to fully isolated per-thread resources.
Sources: src/lib.rs(L1 - L15) src/res.rs(L11 - L15) src/res.rs(L53 - L56) src/ns.rs(L7 - L10)
Namespaces
Relevant source files
Purpose and Scope
This document provides a detailed explanation of the `Namespace` struct in the AXNS system, which serves as a container for resources. It covers the internal structure, creation, resource access methods, and memory management of namespaces. For information about the resources themselves and how they're wrapped, see Resources and ResWrapper. For details on thread-local namespace features, see Thread-Local Features.
Namespace Structure
In AXNS, a `Namespace` is a collection of resources, each accessed through a reference-counted pointer (`ResArc`). The `Namespace` struct is defined in `src/ns.rs` and consists of a single pointer field that points to an array of `ResArc` instances.
flowchart TD subgraph subGraph0["Namespace Structure"] Namespace["Namespace {ptr: NonNull}"] ResArcArray["Array of ResArcs (size = Resources.len())"] ResArc1["ResArc[0]"] ResArc2["ResArc[1]"] ResArcN["ResArc[n-1]"] Resource1["Resource Data 1"] Resource2["Resource Data 2"] ResourceN["Resource Data n"] end Namespace --> ResArcArray ResArc1 --> Resource1 ResArc2 --> Resource2 ResArcArray --> ResArc1 ResArcArray --> ResArc2 ResArcArray --> ResArcN ResArcN --> ResourceN
Sources: src/ns.rs(L6 - L13)
Namespace Creation and Initialization
When a new `Namespace` is created using `Namespace::new()`, it:

- Allocates memory for an array of `ResArc` instances (one for each resource in the system)
- Initializes each `ResArc` with its corresponding resource's default value
- Returns the constructed `Namespace`
sequenceDiagram participant CodecreatingNamespace as "Code creating Namespace" participant Namespacenew as "Namespace::new()" participant MemoryAllocator as "Memory Allocator" participant ResourcesCollection as "Resources Collection" CodecreatingNamespace ->> Namespacenew: Call new() Namespacenew ->> ResourcesCollection: Get resources count (Resources.len()) ResourcesCollection -->> Namespacenew: Return count Namespacenew ->> MemoryAllocator: Allocate array of ResArc (size) MemoryAllocator -->> Namespacenew: Return allocated memory loop For each resource in Resources Namespacenew ->> ResourcesCollection: Get resource ResourcesCollection -->> Namespacenew: Return resource Namespacenew ->> Namespacenew: Initialize ResArc for resource end Namespacenew -->> CodecreatingNamespace: Return new Namespace
Sources: src/ns.rs(L16 - L36)
Resource Access
The `Namespace` provides two primary methods for accessing resources:

- `get(&self, res: &'static Resource) -> &ResArc`: Returns a reference to the `ResArc` for a given resource.
- `get_mut(&mut self, res: &'static Resource) -> &mut ResArc`: Returns a mutable reference to the `ResArc` for a given resource.

Both methods use the resource's index (obtained via `res.index()`) to locate the corresponding `ResArc` in the array.
| Method | Description | Implementation |
|---|---|---|
| `get` | Returns a reference to a resource's `ResArc` | Uses the resource's index to find the corresponding `ResArc` in the array |
| `get_mut` | Returns a mutable reference to a resource's `ResArc` | Uses the resource's index to find the corresponding `ResArc` in the array |
Sources: src/ns.rs(L38 - L46)
Global and Thread-Local Namespaces
AXNS supports two namespace access patterns:
- Global Namespace: A singleton namespace accessible from anywhere via the `global_ns()` function
- Thread-Local Namespaces: When the "thread-local" feature is enabled, each thread can have its own namespace
flowchart TD subgraph subGraph0["Namespace Resolution"] A["current_ns()"] B["thread-local feature?"] C["Thread-Local CurrentNsImpl"] D["Global CurrentNsImpl"] E["Thread's Namespace"] F["Global Namespace (from global_ns())"] G["Access Resources"] end A --> B B --> C B --> D C --> E D --> F E --> G F --> G
Sources: src/lib.rs(L16 - L59)
Memory Management
The `Namespace` struct carefully manages memory for all its resources. When a `Namespace` is dropped:

- It calls `drop_in_place()` on the array of `ResArc` instances, which decrements the reference count for each resource
- It deallocates the memory used for the array itself
This ensures that resources are properly cleaned up when they're no longer needed.
Sources: src/ns.rs(L55 - L63)
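The allocate/initialize/drop-in-place/deallocate sequence described above can be sketched in plain std Rust. This is a simplified model using `Arc<u32>` slots, not the real axns types:

```rust
use std::alloc::{alloc, dealloc, Layout};
use std::sync::Arc;

// Allocate an array of reference-counted slots, initialize each one,
// then drop the elements in place and free the array -- the same
// sequence Namespace::new() and Drop for Namespace perform for ResArc.
fn demo() -> u32 {
    const N: usize = 3;
    let layout = Layout::array::<Arc<u32>>(N).unwrap();
    unsafe {
        let ptr = alloc(layout) as *mut Arc<u32>;
        assert!(!ptr.is_null());
        // Initialize each slot with its "default value".
        for i in 0..N {
            ptr.add(i).write(Arc::new(i as u32));
        }
        let sum: u32 = (0..N).map(|i| *(*ptr.add(i))).sum();
        // Drop each element (decrementing its refcount), then free the array.
        std::ptr::drop_in_place(std::ptr::slice_from_raw_parts_mut(ptr, N));
        dealloc(ptr as *mut u8, layout);
        sum
    }
}

fn main() {
    println!("{}", demo()); // sums the three defaults: 0 + 1 + 2
}
```

Skipping either `drop_in_place` (leaking the refcounts) or `dealloc` (leaking the array) would break the cleanup guarantee the real implementation provides.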
Namespace in the AXNS Architecture
The `Namespace` is a central component in the AXNS system, working closely with other components:
flowchart TD subgraph subGraph0["AXNS Component Relationships"] Namespace["Namespace(Container for resources)"] ResArc["ResArc(Reference-counted resource)"] Resource["Resource(Resource metadata)"] ResWrapper["ResWrapper(Type-safe resource access)"] GlobalNs["global_ns()(Returns static Namespace)"] CurrentNs["current_ns()(Thread-local or global)"] end CurrentNs --> Namespace GlobalNs --> Namespace Namespace --> ResArc ResArc --> Resource ResWrapper --> Namespace ResWrapper --> Resource
Sources: src/lib.rs(L10 - L14) src/ns.rs(L1 - L4)
Implementation Details
The `Namespace` implementation includes several important features:

- Memory Efficiency: Uses a single pointer to an array rather than a standard Rust collection to minimize overhead
- Safety Markers: Implements `Send` and `Sync` traits to indicate thread safety
- Default Implementation: Provides a `Default` implementation that calls `new()`
- Manual Memory Management: Performs explicit allocation and deallocation to maintain control over memory layout
API Summary
| Method | Description | Example Usage |
|---|---|---|
| `Namespace::new()` | Creates a new `Namespace` with default values | `let ns = Namespace::new();` |
| `ns.get(resource)` | Gets a reference to a resource | `let r = ns.get(&MY_RESOURCE);` |
| `ns.get_mut(resource)` | Gets a mutable reference to a resource | `let r = ns.get_mut(&MY_RESOURCE);` |
| `global_ns()` | Gets the global namespace | `let ns = global_ns();` |
| `current_ns()` | Gets the current namespace (global or thread-local) | `let ns = current_ns();` |
Sources: src/ns.rs(L15 - L63) src/lib.rs(L16 - L59)
Resources and ResWrapper
Relevant source files
This page documents the resource system in AXNS, focusing on the `Resource` struct and the `ResWrapper<T>` container that provide typed access to resources in namespaces. For information about how resources are defined using macros, see The def_resource! Macro.
Resource System Overview
Resources in AXNS are statically defined objects that can be accessed through namespaces. The resource system provides:
- Type-safe access to resources
- Reference counting for memory management
- Sharing capabilities between namespaces
- Thread-local or global contexts depending on configuration
Sources: src/res.rs(L11 - L43) src/res.rs(L53 - L105) src/res.rs(L115 - L128) src/arc.rs(L17 - L47) src/arc.rs(L49 - L120)
The Resource Struct
The `Resource` struct is the foundational building block of the resource system:
flowchart TD subgraph subGraph1["Storage Mechanism"] E["Resources Collection"] F["Link-Time Section"] G["Resources::deref()"] end subgraph subGraph0["Resource Definition"] A["Resource Struct"] B["Memory Layout"] C["Init Function"] D["Drop Function"] end A --> B A --> C A --> D A --> E E --> F F --> G
Sources: src/res.rs(L11 - L15) src/res.rs(L17 - L44)
The `Resource` struct is defined in `src/res.rs` and contains three key components:

- `layout`: Specifies the memory layout for the resource type
- `init`: A function pointer to initialize the resource
- `drop`: A function pointer to clean up the resource when it's dropped

Resources are stored in a special link-time section named "axns_resources", which allows them to be accessed as a collection through the `Resources` struct. This implementation mimics the behavior of the `linkme` crate.
The ResWrapper<T>
struct provides a typed interface to access resources:
flowchart TD A["Client Code"] B["ResWrapper::get()"] C["Namespace"] D["ResArc"] E["Resource Instance"] F["ResWrapper::get_mut()"] G["Can be mutated?"] H["&mut Resource Instance"] I["None"] J["ResWrapper::current()"] K["current_ns()"] L["ResCurrent"] A --> B A --> F A --> J B --> C C --> D D --> E F --> C F --> G G --> H G --> I J --> K J --> L
Sources: src/res.rs(L53 - L105)
`ResWrapper<T>` is a wrapper around a static `Resource` that provides type-safe access methods:

- `get`: Obtains a reference to the resource in a given namespace
- `get_mut`: Attempts to get a mutable reference (only if the resource isn't shared)
- `current`: Creates a `ResCurrent<T>` that references the resource in the current namespace
- `share_from`: Shares a resource from one namespace to another
- `reset`: Resets a resource to its default value in a namespace
ResCurrent: Accessing the Current Namespace
ResCurrent<T>
provides a convenient way to access resources in the current namespace:
flowchart TD A["ResCurrent<T>"] B["Deref<Target=T>"] C["res: &'static Resource"] D["ns: CurrentNsImpl"] E["Client Code"] F["creates ResCurrent"] G["auto-derefs to T"] H["current_ns()"] I["Thread-local or global namespace"] A --> B A --> C A --> D E --> F E --> G F --> H H --> I
Sources: src/res.rs(L115 - L128)
`ResCurrent<T>` implements `Deref<Target = T>`, which allows for transparent access to the underlying resource value. When you dereference a `ResCurrent<T>` (using `*` or accessing fields/methods), it automatically retrieves the resource from the current namespace.
Memory Management with ResArc
Resources are managed using a reference-counted smart pointer called `ResArc`:
flowchart TD subgraph subGraph0["ResArc Structure"] A["ResArc"] B["NonNull<ResInner>"] C["ResInner"] D["res: &'static Resource"] E["strong: AtomicUsize"] F["Resource Data"] end subgraph subGraph1["Memory Layout"] G["Memory Block"] H["ResInner Header"] I["Resource Value"] end J["clone()"] K["drop()"] L["deallocate memory"] A --> B B --> C C --> D C --> E C --> F G --> H H --> I J --> E K --> E K --> L
Sources: src/arc.rs(L17 - L47) src/arc.rs(L49 - L120)
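The single-block layout shown above (a header followed by the value) can be computed with `Layout::extend`. This is a sketch with an illustrative `Header` type, not the actual `ResInner` definition:

```rust
use std::alloc::Layout;

// Illustrative header carrying a strong reference count,
// standing in for the ResInner metadata.
#[allow(dead_code)]
struct Header {
    strong: usize,
}

// Compute the layout of one block holding a Header followed by a value,
// plus the offset at which the value begins inside the block.
fn combined_layout(value: Layout) -> (Layout, usize) {
    let header = Layout::new::<Header>();
    // extend() appends the value's layout after the header, inserting
    // padding as needed, and reports the value's offset.
    let (layout, offset) = header.extend(value).unwrap();
    (layout.pad_to_align(), offset)
}

fn main() {
    let (layout, offset) = combined_layout(Layout::new::<u64>());
    // The value lands after the header, correctly aligned for u64.
    assert!(offset >= std::mem::size_of::<Header>());
    assert_eq!(offset % std::mem::align_of::<u64>(), 0);
    println!("size={} value_offset={}", layout.size(), offset);
}
```

A single allocation for header plus value keeps the refcount and the data on the same cache line and needs only one pointer per slot, which is why `Arc`-style types use this shape.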
`ResArc` uses a pattern similar to Rust's standard `Arc<T>` to implement reference counting:

- When a new `ResArc` is created, it allocates memory for both the `ResInner` header and the resource data
- The `ResInner` contains a reference to the static `Resource` definition and an atomic reference counter
- Cloning a `ResArc` increments the reference counter
- Dropping a `ResArc` decrements the counter; when it reaches zero, the memory is deallocated
This system ensures that resources are properly cleaned up when they're no longer needed, while allowing efficient sharing between namespaces.
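The same clone/drop semantics can be observed with std's `Arc`, which `ResArc` mirrors. This is a simplified analogy using a plain `u32`, not the axns type itself:

```rust
use std::sync::Arc;

fn demo() -> u32 {
    let mut a = Arc::new(41u32);

    // Uniquely owned: mutable access is possible.
    *Arc::get_mut(&mut a).unwrap() += 1;

    // Cloning increments the strong count, like cloning a ResArc.
    let b = Arc::clone(&a);
    assert_eq!(Arc::strong_count(&a), 2);

    // Once shared, mutable access is refused -- the same rule the
    // get_mut() path enforces for resources shared between namespaces.
    assert!(Arc::get_mut(&mut a).is_none());

    drop(b); // decrement; memory is freed when the count reaches zero
    assert_eq!(Arc::strong_count(&a), 1);
    *a
}

fn main() {
    println!("{}", demo());
}
```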
Resource Sharing and Reset
`ResWrapper<T>` provides methods to share resources between namespaces and reset them to their default values:
sequenceDiagram participant ClientCode as "Client Code" participant SourceNamespace as "Source Namespace" participant DestNamespace as "Dest Namespace" participant ResWrapper as "ResWrapper" ClientCode ->> ResWrapper: share_from(dst, src) ResWrapper ->> SourceNamespace: get(res) SourceNamespace -->> ResWrapper: Return ResArc ResWrapper ->> ResWrapper: clone ResArc ResWrapper ->> DestNamespace: get_mut(res) ResWrapper ->> DestNamespace: Replace with cloned ResArc ClientCode ->> ResWrapper: reset(ns) ResWrapper ->> ResWrapper: ResArc::new(res) ResWrapper ->> DestNamespace: get_mut(res) ResWrapper ->> DestNamespace: Replace with new ResArc
Sources: src/res.rs(L94 - L104)
- `share_from`: Clones the `ResArc` from the source namespace and assigns it to the destination namespace. This creates a shared reference to the same resource value.
- `reset`: Creates a new `ResArc` with the default value for the resource and assigns it to the namespace, effectively resetting it to its initial state.
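A minimal model of these two operations, using a hypothetical `Slot` type and std's `Arc` in place of `ResArc`:

```rust
use std::sync::Arc;

// One namespace slot holding a reference-counted value.
struct Slot(Arc<u32>);

// share_from: the destination takes a clone of the source's pointer,
// so both "namespaces" now see the same value.
fn share_from(dst: &mut Slot, src: &Slot) {
    dst.0 = Arc::clone(&src.0);
}

// reset: the slot is replaced with a freshly initialized default.
fn reset(slot: &mut Slot) {
    slot.0 = Arc::new(0);
}

fn main() {
    let src = Slot(Arc::new(42));
    let mut dst = Slot(Arc::new(0));

    share_from(&mut dst, &src);
    assert!(Arc::ptr_eq(&dst.0, &src.0)); // shared, not copied
    assert_eq!(*dst.0, 42);

    reset(&mut dst);
    assert!(!Arc::ptr_eq(&dst.0, &src.0));
    assert_eq!(*dst.0, 0);

    println!("ok");
}
```

Note that sharing aliases the pointer rather than copying the value: a later write through one namespace (where the type permits interior mutability) is visible through the other.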
Resource Collection System
Resources are stored in a special link-time section and accessed through the `Resources` collection:
flowchart TD subgraph subGraph0["Link-Time Storage"] A["__start_axns_resources"] B["Resource Array"] C["__stop_axns_resources"] end D["Resources::deref()"] E["Array Length"] F["&[Resource]"] G["Resource::index()"] H["Resource Index"] I["Namespace"] A --> B B --> C D --> E D --> F G --> H H --> I
Sources: src/res.rs(L17 - L44)
The resource collection system uses link-time sections to create an array of `Resource` objects:

- `Resources::deref()` calculates the length of the resource array by finding the difference between the start and end addresses
- The `Resource::index()` method calculates the index of a resource in this array
- The index is used by the `Namespace` to store and retrieve resource instances

This approach allows for efficient storage and lookup of resources with minimal runtime overhead.
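The pointer arithmetic behind this lookup can be illustrated with an ordinary static array. The types here are illustrative; the real code derives the array bounds from the section's start/stop linker symbols rather than a named array:

```rust
// Illustrative resource metadata.
struct Res {
    id: u32,
}

// Stand-in for the "axns_resources" section: a contiguous static array.
static RESOURCES: [Res; 3] = [Res { id: 10 }, Res { id: 20 }, Res { id: 30 }];

// Mirrors the idea of Resource::index(): recover an element's index
// from its address relative to the array's start.
fn index_of(res: &'static Res) -> usize {
    let start = RESOURCES.as_ptr() as usize;
    let this = res as *const Res as usize;
    (this - start) / std::mem::size_of::<Res>()
}

fn main() {
    assert_eq!(index_of(&RESOURCES[0]), 0);
    assert_eq!(index_of(&RESOURCES[2]), 2);
    println!("{}", RESOURCES[index_of(&RESOURCES[2])].id); // prints 30
}
```

Because every resource lives in one contiguous region, the index is a constant-time address subtraction, with no hashing or registration table.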
Complete Resource Flow
The complete flow of defining and accessing resources involves several components:
flowchart TD A["def_resource! macro"] B["static Resource"] C["static ResWrapper"] D["Client Code"] E["ns.get(res)"] F["ResArc"] G["&T"] H["ns.get_mut(res)"] I["&mut ResArc"] J["Option<&mut T>"] K["ResCurrent"] L["&T"] A --> B A --> C C --> E C --> H C --> K D --> C D --> K E --> F F --> G H --> I I --> J K --> L
Sources: src/res.rs(L53 - L128) src/res.rs(L144 - L168)
This diagram illustrates the complete flow from resource definition to access:
- Resources are defined using the `def_resource!` macro
- The macro generates a static `Resource` object and a `ResWrapper<T>` to access it
- Client code can access the resource in multiple ways:
  - Using `get()` to get a reference in a specific namespace
  - Using `get_mut()` to get a mutable reference if possible
  - Using `current()` to access the resource in the current namespace
Summary
The Resource system in AXNS provides a flexible and type-safe way to define and access resources in different namespaces. Key components include:
| Component | Purpose |
|---|---|
| `Resource` | Defines the layout and lifecycle functions for a resource |
| `ResWrapper` | Provides type-safe access to resources in namespaces |
| `ResCurrent` | Enables access to resources in the current namespace |
| `ResArc` | Implements reference counting for resource management |
This system enables efficient sharing of resources between namespaces while maintaining memory safety and proper cleanup when resources are no longer needed.
The def_resource! Macro
Relevant source files
The `def_resource!` macro is a core component of the AXNS system that provides a declarative syntax for defining static resources that can be managed within namespaces. This page explains how the macro works, its syntax, and how it integrates with the rest of the AXNS resource namespace system.
For information about the general resource system and the `ResWrapper` structure, see Resources and ResWrapper.
Purpose and Function
The `def_resource!` macro serves as the primary entry point for users to define resources that can be managed by the AXNS namespace system. It:
- Creates static resources with type safety and initialization logic
- Registers these resources in a global resource registry
- Generates wrapper objects that provide a consistent interface for resource access
- Handles proper memory layout and lifecycle management for resources
Sources: src/res.rs(L144 - L168)
Macro Syntax and Usage
The `def_resource!` macro follows this syntax pattern:
```rust
def_resource! {
    /// Optional documentation
    [visibility] static RESOURCE_NAME: ResourceType = default_value;

    // Multiple resources can be defined in a single macro invocation
    [visibility] static ANOTHER_RESOURCE: AnotherType = another_default_value;
}
```
Key components:
- Visibility modifier (`pub`, `pub(crate)`, etc.) - controls access to the resource
- Resource name - a static identifier for accessing the resource
- Resource type - any valid Rust type
- Default value - the initial value assigned to the resource in each namespace
Sources: src/res.rs(L144 - L168) tests/all.rs(L11 - L13) tests/all.rs(L31 - L33)
How the Macro Works
flowchart TD subgraph subGraph0["For each resource definition"] B["Generate static Resource struct"] C["Place in 'axns_resources' section"] D["Create ResWrapper<T> static variable"] E["Expose resource access methods"] end A["def_resource! macro invocation"] F["Resources collected in global registry"] G["Resource accessible via namespaces"] A --> B A --> D B --> C C --> F D --> E E --> G
Sources: src/res.rs(L144 - L168) src/res.rs(L17 - L34)
When you use the `def_resource!` macro, it expands to create two key components for each defined resource:

- A static `Resource` structure that contains:
  - Memory layout information for the resource type
  - An initialization function that creates the default value
  - A drop function that handles cleanup when resources are deallocated
- A static `ResWrapper<T>` that provides methods to access the resource in different namespaces
The macro places each `Resource` structure in a special ELF section named "axns_resources", which allows the system to discover all resources at runtime without requiring explicit registration.
Sources: src/res.rs(L144 - L168) src/res.rs(L10 - L15)
Generated Code Structure
The expanded macro code generates:
- A static `Resource` instance with fixed layout and lifecycle functions
- A static `ResWrapper<T>` instance that wraps the resource
- Assertions to ensure the resource type isn't zero-sized
Sources: src/res.rs(L144 - L168) src/res.rs(L53 - L56)
Generated Code Example
When you write:
```rust
def_resource! {
    pub static COUNTER: AtomicUsize = AtomicUsize::new(0);
}
```
The macro expands to something equivalent to:
```rust
pub static COUNTER: ResWrapper<AtomicUsize> = {
    #[unsafe(link_section = "axns_resources")]
    static RES: Resource = Resource {
        layout: Layout::new::<AtomicUsize>(),
        init: |ptr| {
            let val: AtomicUsize = AtomicUsize::new(0);
            unsafe { ptr.cast().write(val) }
        },
        drop: |ptr| unsafe {
            ptr.cast::<AtomicUsize>().drop_in_place();
        },
    };
    assert!(RES.layout.size() != 0, "Resource has zero size");
    ResWrapper::new(&RES)
};
```
This generated code creates a static `Resource` structure with the proper memory layout and lifecycle functions for an `AtomicUsize`, then wraps it in a `ResWrapper<AtomicUsize>` that provides type-safe access methods.
Sources: src/res.rs(L144 - L168)
Memory Layout and Resource Registry
flowchart TD subgraph subGraph0["axns_resources section"] A["Resource #1"] B["Resource #2"] C["Resource #3"] D["..."] end E["__start_axns_resources"] F["__stop_axns_resources"] G["Resources struct"] H["Slice of all resources"] I["User code"] J["Resources.as_ptr()"] A --> B B --> C C --> D D --> F E --> A G --> H H --> E H --> F I --> J J --> H
The `def_resource!` macro places all resources in a dedicated ELF section called "axns_resources". The AXNS system uses special linker symbols (`__start_axns_resources` and `__stop_axns_resources`) to locate the beginning and end of this section, allowing it to create a slice of all defined resources.
The `Resources` struct (implementing `Deref<Target = [Resource]>`) provides access to all registered resources as a contiguous array, which is used for resource indexing and lookup.
Sources: src/res.rs(L17 - L34) src/res.rs(L36 - L44)
Resource Access Methods
The `ResWrapper<T>` generated by the macro provides several methods to access the wrapped resource:
| Method | Purpose | Return Type |
|---|---|---|
| `get(ns)` | Get immutable reference from namespace | `&T` |
| `get_mut(ns)` | Get mutable reference if not shared | `Option<&mut T>` |
| `current()` | Access resource in current namespace | `ResCurrent<T>` |
| `share_from(dst, src)` | Share resource between namespaces | `()` |
| `reset(ns)` | Reset resource to default value | `()` |
These methods provide a complete interface for interacting with resources in both specific and current namespaces.
Sources: src/res.rs(L58 - L105)
Usage Examples
Basic Resource Definition and Access
```rust
// Define resources
def_resource! {
    pub static CONFIG: Config = Config::default();
    pub static COUNTER: AtomicUsize = AtomicUsize::new(0);
}

// Create a namespace
let mut ns = Namespace::new();

// Access resources
let config = CONFIG.get(&ns);
let counter_val = COUNTER.get(&ns).load(Relaxed);

// Modify resources (if not shared)
if let Some(counter) = COUNTER.get_mut(&mut ns) {
    counter.store(42, Relaxed);
}
```
Sources: tests/all.rs(L11 - L24) tests/all.rs(L31 - L38)
Using the Current Namespace
The `current()` method provides convenient access to resources in the current namespace, which is determined by the `CurrentNs` implementation (globally shared or thread-local depending on feature flags):
```rust
// Access and modify a resource in the current namespace
let counter = COUNTER.current();
counter.fetch_add(1, Relaxed);
```
Sources: src/res.rs(L69 - L76) tests/all.rs(L35 - L37)
Resource Sharing and Resetting
```rust
// Create two namespaces
let mut ns1 = Namespace::new();
let mut ns2 = Namespace::new();

// Modify resource in ns1
if let Some(counter) = COUNTER.get_mut(&mut ns1) {
    counter.store(42, Relaxed);
}

// Share the resource from ns1 to ns2
COUNTER.share_from(&mut ns2, &ns1);

// Later, reset the resource in ns2 to its default value
COUNTER.reset(&mut ns2);
```
Sources: src/res.rs(L94 - L104) tests/all.rs(L96 - L123)
Thread-Local Considerations
When the `thread-local` feature flag is enabled, resources defined with `def_resource!` can be accessed in a thread-local context, providing isolation between threads.
sequenceDiagram participant Thread1 as Thread 1 participant Thread2 as Thread 2 participant RESOURCE as RESOURCE Note over Thread1,Thread2: With thread-local feature enabled Thread1 ->> RESOURCE: RESOURCE.current() RESOURCE -->> Thread1: Access Thread 1's instance Thread2 ->> RESOURCE: RESOURCE.current() RESOURCE -->> Thread2: Access Thread 2's instance Note over Thread1,Thread2: Without thread-local feature Thread1 ->> RESOURCE: RESOURCE.current() RESOURCE -->> Thread1: Access global instance Thread2 ->> RESOURCE: RESOURCE.current() RESOURCE -->> Thread2: Access same global instance
The `current()` method on resources defined with `def_resource!` respects the thread-local configuration of the AXNS system, making it easier to write code that works in both shared and isolated resource modes.
Sources: src/res.rs(L69 - L76) tests/all.rs(L40 - L159)
Implementation Details
Resource Indexing
Each `Resource` defined by the macro has an `index()` method that computes its position in the global resource registry. This index is used by the `Namespace` to efficiently store and retrieve resources without requiring hash lookups.
Sources: src/res.rs(L36 - L44)
Zero-Sized Types
The macro includes an assertion to ensure the resource type isn't zero-sized, as this would cause issues with the memory management system:
```rust
assert!(RES.layout.size() != 0, "Resource has zero size");
```
This prevents potential errors when defining resources with types like `()` or empty structs.
Sources: src/res.rs(L162)
Resource Initialization
The `init` function generated by the macro creates the default value and writes it to memory when a namespace is initialized. This ensures that every namespace starts with properly initialized resources.
Sources: src/res.rs(L153 - L156)
Summary
The `def_resource!` macro is the foundation of the AXNS resource system, providing a clean, type-safe interface for defining resources that can be managed within namespaces. It generates the code needed to handle resource lifecycle management, access control, and integration with the namespace system.
By placing all resources in a special ELF section and providing wrapper types with access methods, it enables efficient resource lookup and manipulation while maintaining a simple user interface.
Thread-Local Features
Relevant source files
Purpose and Scope
This page documents the thread-local namespace functionality in AXNS, which provides isolation of resources between threads. When enabled, this feature allows each thread to maintain its own separate namespace, as opposed to sharing a global namespace. For information about namespaces in general, see Namespaces; for information about resource lifecycle management, see Resource Lifecycle.
Sources: src/lib.rs(L20 - L59) Cargo.toml(L14 - L15)
Feature Flag Overview
Thread-local functionality is controlled by the `thread-local` feature flag in AXNS. When this feature is enabled, each thread can have its own isolated namespace for resources.
flowchart TD subgraph subGraph0["Implementation Details"] B["Thread-local Namespaces"] C["Global Namespace Only"] F["CurrentNs trait with extern_trait"] G["Simple CurrentNsImpl struct"] end A["Feature Check: thread-local"] D["Each thread has isolated resources"] E["All threads share the same resources"] A --> B A --> C B --> D B --> F C --> E C --> G
Thread-local Feature Control Flow
Sources: Cargo.toml(L14 - L15) src/lib.rs(L35 - L59)
Implementation Architecture
The thread-local feature implementation centers around the `CurrentNs` trait, which is only defined when the feature flag is enabled. This trait abstracts the retrieval of the current namespace for a thread.
With Thread-Local Feature Disabled
When the thread-local feature is disabled, all resource accesses use the global namespace. This is implemented through a simple `CurrentNsImpl` struct that returns the global namespace when asked for the current one.
classDiagram class CurrentNsImpl { +as_ref() &Namespace } class current_ns { +() CurrentNsImpl } class global_ns { +() &'static Namespace } class Namespace { +new() Namespace } current_ns --> CurrentNsImpl : returns CurrentNsImpl --> global_ns : calls global_ns --> Namespace : returns static ref
Implementation without Thread-Local Feature
Sources: src/lib.rs(L44 - L59) src/lib.rs(L17 - L25)
With Thread-Local Feature Enabled
When the thread-local feature is enabled, the system uses the `CurrentNs` trait, which can be implemented by user code to provide thread-local namespaces. The trait is marked `unsafe` because implementations must uphold thread-safety guarantees.
classDiagram class CurrentNs { <<trait>> +new() Self +as_ref() &Namespace } class CurrentNsImpl { +impl CurrentNs } class current_ns { +() CurrentNsImpl } class external_implementation { user provided } current_ns --> CurrentNs : returns CurrentNsImpl --|> CurrentNs : implements external_implementation --|> CurrentNs : implements
Implementation with Thread-Local Feature
Sources: src/lib.rs(L27 - L42) src/lib.rs(L54 - L59)
Accessing Resources with Thread-Local Namespaces
When a resource is accessed through its `current()` method, the behavior depends on whether the thread-local feature is enabled:
sequenceDiagram participant ClientCode as "Client Code" participant ResCurrentT as "ResCurrent<T>" participant CurrentNsImpl as "CurrentNsImpl" participant ThreadLocalStorage as "Thread Local Storage" participant GlobalNamespace as "Global Namespace" ClientCode ->> ResCurrentT: current() ResCurrentT ->> CurrentNsImpl: current_ns() alt thread-local feature enabled CurrentNsImpl ->> ThreadLocalStorage: Check if thread has namespace alt First access ThreadLocalStorage ->> ThreadLocalStorage: Initialize thread-local namespace end ThreadLocalStorage -->> CurrentNsImpl: Return thread's namespace else thread-local feature disabled CurrentNsImpl ->> GlobalNamespace: global_ns() GlobalNamespace -->> CurrentNsImpl: Return global static namespace end CurrentNsImpl -->> ResCurrentT: as_ref() -> &Namespace ResCurrentT -->> ClientCode: Return resource in current namespace
Resource Access Flow with Thread-Local Feature
Sources: src/lib.rs(L17 - L59) tests/all.rs(L40 - L159)
Implementation Example
The tests in the codebase provide a practical example of implementing thread-local namespaces. This implementation uses a thread-local `Once` value to store an `Arc<RwLock<Namespace>>`.
Example Thread-Local Implementation
classDiagram class ThreadLocal { static NS: Once~~ } class CurrentNsImpl { -Option~ guard +new() Self +as_ref() &Namespace } class CurrentNs { <<trait>> +new() Self +as_ref() &Namespace } CurrentNs --|> CurrentNsImpl : implements CurrentNs --> ThreadLocal : accesses
Thread-Local Implementation Example
Sources: tests/all.rs(L49 - L70)
Resource Isolation Between Threads
When the thread-local feature is enabled, resources can be isolated between threads. Each thread can create its own namespace and modify resources independently of other threads.
flowchart TD subgraph subGraph1["Thread 2"] D["Namespace 2"] E["Resource A: value=100"] F["Resource B: value=456"] end subgraph subGraph0["Thread 1"] A["Namespace 1"] B["Resource A: value=42"] C["Resource B: value=123"] end G["Thread-Local Storage"] H["Resource Definitions"] I["static DATA_A: Type = initial_value"] J["static DATA_B: Type = initial_value"] A --> B A --> C D --> E D --> F G --> A G --> D H --> I H --> J I --> B I --> E J --> C J --> F
Thread Isolation with Thread-Local Namespaces
Sources: tests/all.rs(L40 - L159)
Resource Sharing Between Thread-Local Namespaces
Even with thread isolation, AXNS allows controlled sharing of resources between namespaces using the `share_from` method. This creates a shared reference to the same resource instance across different namespaces.
sequenceDiagram participant Thread1 as "Thread 1" participant Namespace1 as "Namespace 1" participant Thread2 as "Thread 2" participant Namespace2 as "Namespace 2" participant Resource as "Resource" Thread1 ->> Namespace1: Create namespace Thread1 ->> Namespace1: Modify resource Thread2 ->> Namespace2: Create namespace Thread2 ->> Resource: share_from(&mut NS2, &NS1) Note over Namespace1,Namespace2: Both namespaces now reference the same resource instance Thread1 ->> Namespace1: Access resource Thread2 ->> Namespace2: Access resource (sees same value)
Resource Sharing Between Thread-Local Namespaces
Sources: tests/all.rs(L125 - L159)
Thread-Local Features in Action
The following table summarizes key operations and their behavior with thread-local features enabled:
| Operation | With Thread-Local Enabled | With Thread-Local Disabled |
|---|---|---|
| `resource.current()` | Returns resource from thread's namespace | Returns resource from global namespace |
| `current_ns()` | Returns thread-specific `CurrentNsImpl` | Returns global-referencing `CurrentNsImpl` |
| Resource creation | Created in thread-local namespace | Created in global namespace |
| Resource sharing | Must explicitly use `share_from()` | Automatically shared (single namespace) |
| Resource reset | Only affects thread's namespace | Affects all threads |
Sources: src/lib.rs(L17 - L59) tests/all.rs(L40 - L159)
Best Practices for Thread-Local Features
When implementing thread-local namespaces, consider the following best practices:
- Initialization: Initialize thread-local namespaces lazily (on first access)
- Thread Safety: Ensure proper locking or synchronization when accessing the namespace
- Resource Management: Be mindful of resource lifecycle with thread-local namespaces
- Custom Implementation: Implement the `CurrentNs` trait for your specific thread-local storage needs
flowchart TD A["Define Resources"] B["Implement CurrentNs trait"] C["Thread-Local Storage Setup"] D["Lazy Initialization"] E["Resource Access Patterns"] F["Thread-Local Access"] G["Cross-Thread Sharing"] H["Resource Cleanup"] I["Reset Resources"] J["Drop Thread-Local Storage"] A --> B B --> C C --> D E --> F E --> G H --> I H --> J
Thread-Local Feature Usage Guide
Sources: tests/all.rs(L40 - L159) src/lib.rs(L27 - L42)
Implementation Considerations
When designing an implementation of thread-local namespaces, you should address:
- Storage Strategy: How thread-local namespaces are stored and retrieved
- Initialization Logic: When and how thread-local namespaces are created
- Default Behavior: What happens when a thread doesn't have a namespace
- Thread Cleanup: Ensuring resources are properly cleaned up when threads exit
The test implementation provides one approach, using thread-local storage with `Once` and `Arc<RwLock<Namespace>>`, but other approaches may be more suitable depending on your specific requirements.
Sources: tests/all.rs(L49 - L70) src/lib.rs(L27 - L42)
Resource Lifecycle
Relevant source files
Purpose and Scope
This document explains in detail how resources are created, accessed, shared, and cleaned up throughout their lifecycle in the AXNS system. Understanding the resource lifecycle is crucial for effectively utilizing the namespace system and ensuring proper resource management.
For information about the specific implementation of reference counting used for resource management, see Resource Reference Counting.
Resource Creation Process
Definition and Initialization
Resources in AXNS begin their lifecycle when they are defined using the `def_resource!` macro. This macro generates static resource definitions along with their accessor wrappers.
flowchart TD A["def_resource! macro"] B["Static Resource object"] C["Layout information"] D["init function"] E["drop function"] F["def_resource! macro"] G["ResWrapper instance"] A --> B B --> C B --> D B --> E F --> G G --> B
Sources: src/res.rs(L144 - L168)
When a resource is defined using the `def_resource!` macro, several key components are created:
- A static `Resource` object with:
  - Memory layout information for the resource type via `Layout::new::<T>()`
  - An initialization function that creates the default value
  - A drop function that properly cleans up the resource when it is no longer needed
- A static `ResWrapper<T>` that provides the API for accessing this resource across namespaces
Namespace Resource Initialization
When a namespace is created, it initializes all resources with their default values.
flowchart TD A["Namespace::new()"] B["Memory block for all ResArcs"] C["ResArc::new(res)"] D["ResInner + Resource data"] E["res.init function"] F["strong count = 1"] A --> B A --> C C --> D C --> E D --> F
Sources: src/ns.rs(L22 - L36) src/arc.rs(L57 - L72)
The initialization flow works as follows:
1. `Namespace::new()` allocates a single contiguous memory block to hold all `ResArc` instances for every defined resource
2. For each resource in the resource list:
   - It creates a new `ResArc` using `ResArc::new(res)`
   - `ResArc::new` allocates memory for both the `ResInner` structure and the resource data
   - The resource is initialized by calling its init function with the allocated memory
   - The reference count (strong count) is set to 1
Resource Access Patterns
AXNS provides several ways to access resources, each designed for different use cases.
sequenceDiagram participant ClientCode as "Client Code" participant ResWrapperT as "ResWrapper<T>" participant Namespace as "Namespace" participant ResArc as "ResArc" participant ResourceData as "Resource Data" ClientCode ->> ResWrapperT: get(namespace) ResWrapperT ->> Namespace: ns.get(resource) Namespace ->> ResArc: Get ResArc reference ResArc -->> ClientCode: Immutable &T reference ClientCode ->> ResWrapperT: get_mut(namespace) ResWrapperT ->> Namespace: ns.get_mut(resource) Namespace ->> ResArc: Get mutable ResArc ResArc ->> ResArc: Check if strong count == 1 alt Strong count == 1 ResArc -->> ClientCode: Mutable &mut T reference else Resource is shared ResArc -->> ClientCode: None (cannot mutate shared resource) end ClientCode ->> ResWrapperT: current() ResWrapperT ->> ResWrapperT: Gets current namespace ResWrapperT -->> ClientCode: ResCurrent<T> for current namespace access
Sources: src/res.rs(L69 - L128) src/arc.rs(L79 - L85)
Access Methods
AXNS provides three primary methods for accessing resources:
1. Immutable Access (`ResWrapper::get(&Namespace) -> &T`):
   - Always succeeds, providing read-only access to the resource
   - Safe to use regardless of whether the resource is shared
2. Mutable Access (`ResWrapper::get_mut(&mut Namespace) -> Option<&mut T>`):
   - Only succeeds if the resource has a reference count of 1 (not shared)
   - Returns `None` if the resource is shared with other namespaces
   - Ensures memory safety by preventing concurrent mutation
3. Current Namespace Access (`ResWrapper::current() -> ResCurrent<T>`):
   - Provides access to the resource in the current namespace
   - The current namespace is determined by the thread-local feature status
   - Returns a `ResCurrent<T>` that implements `Deref` for transparent access
Resource Sharing Mechanism
AXNS allows resources to be shared between namespaces, which is useful for both memory efficiency and communication.
flowchart TD A["Namespace A"] B["ResArc A"] C["Resource instance X"] D["Not shared"] E["Namespace B"] F["ResArc B"] G["Resource instance Y"] H["Not shared"] I["share_from(B, A)"] J["ResArc A'"] K["A.strong count = 2"] L["B now points to X"] M["If count=0, Y is freed"] A --> B B --> C B --> D E --> F F --> G F --> H I --> J I --> L J --> K L --> M
Sources: src/res.rs(L96 - L98) src/arc.rs(L95 - L102)
The sharing process works as follows:
1. Initially, each namespace has its own independent resource instances
2. When `ResWrapper::share_from(dst, src)` is called:
   - It gets the `ResArc` from the source namespace
   - Clones it, which increments the reference count
   - Replaces the destination namespace's existing `ResArc` with this clone
   - The destination's original resource may be freed if no other references to it exist
This creates a situation where multiple namespaces point to the same underlying resource data, with reference counting ensuring it remains alive until all namespaces are done with it.
Resource Cleanup Process
Resources are automatically cleaned up when they are no longer needed, which happens when their reference count reaches zero.
flowchart TD A["Namespace dropped"] B["ResArc::drop() called"] C["ResArc::drop()"] D["strong.fetch_sub(1)"] E["Last reference gone"] F["Memory barrier"] G["Clean up resource data"] H["Free memory"] I["Other references exist"] J["Resource stays alive"] A --> B C --> D D --> E D --> I E --> F F --> G G --> H I --> J
Sources: src/arc.rs(L104 - L120) src/ns.rs(L55 - L63)
The cleanup process has several stages:
1. When a `Namespace` is dropped:
   - It drops all of its `ResArc` instances in its destructor
   - The memory for the namespace's `ResArc` array is deallocated
2. When a `ResArc` is dropped:
   - Its reference count is atomically decremented
   - If the previous count was 1 (meaning this was the last reference):
     - A memory fence is executed for proper synchronization
     - The resource's drop function is called to clean up the resource data
     - The memory for the `ResInner` and resource data is deallocated
   - If other references remain, the resource stays alive
Resource Reinitialization
AXNS allows resources to be reset to their default values using the `reset` method.
flowchart TD A["ResWrapper::reset(&mut ns)"] B["New ResArc"] C["Fresh resource instance"] D["*ns.get_mut(res) = new ResArc"] E["Decrement reference count"] F["Clean up old resource"] A --> B A --> D B --> C D --> E E --> F
Sources: src/res.rs(L100 - L104)
The reset process works as follows:
1. A new `ResArc` is created with a fresh instance of the resource
2. The namespace's existing `ResArc` is replaced with this new one
3. The reference count of the old `ResArc` is decremented
4. If the old reference count reaches 0, that resource instance is cleaned up
This provides a way to return resources to their initial state without affecting other namespaces that might be sharing the previous instance.
Memory Layout and Management
Understanding the memory layout of resources is important for comprehending the complete lifecycle.
flowchart TD A["Memory Layout"] B["ResInner structure"] C["res: &'static Resource"] D["strong: AtomicUsize"] E["Resource Data"] F["Memory Management"] G["Allocation"] H["Single contiguous block"] I["ResInner at start"] J["Resource data after offset"] K["Access"] L["Calculate offset"] M["body() returns pointer to data"] N["Deallocation"] O["One operation frees both"] P["ResInner and resource data"] A --> B A --> E B --> C B --> D F --> G F --> K F --> N G --> H H --> I H --> J K --> L L --> M N --> O O --> P
Sources: src/arc.rs(L17 - L47) src/arc.rs(L23 - L27)
The resource memory system works as follows:
1. Memory Layout: Each resource allocation consists of:
   - A `ResInner` structure containing the metadata and reference count
   - The actual resource data, placed after the `ResInner` at a calculated offset
2. Memory Allocation:
   - A single contiguous memory block is allocated for both the `ResInner` and the resource data
   - The layout is calculated using `Layout::new::<ResInner>().extend(body)`
   - This approach minimizes allocations and improves memory locality
3. Memory Access:
   - The `body()` method calculates the offset to the resource data
   - This provides direct access to the data portion without extra indirection
4. Memory Deallocation:
   - When the reference count reaches 0, the entire memory block is deallocated
   - The resource's drop function is called first to clean up any internal resources
This approach minimizes allocations while providing safe, efficient memory management for resources.
Reference Counting Safeguards
AXNS implements several safeguards to ensure reference counting works correctly:
- Overflow Prevention: The reference counter is checked against `MAX_REFCOUNT` to prevent overflow
- Safe Mutation: Mutable access is only allowed when a resource has a reference count of 1
- Atomic Operations: All reference count operations use atomic operations with appropriate memory ordering
- Memory Fences: Proper memory barriers ensure visibility across threads
These safeguards ensure that the resource lifecycle is managed correctly and safely, preventing memory leaks and use-after-free errors.
Resource Reference Counting
Relevant source files
Purpose and Scope
This document details how AXNS implements reference counting for resources through the `ResArc` type, which provides memory management and safe resource sharing between namespaces. This page covers the internal memory layout of resources, the reference counting mechanism, and the resource lifecycle from allocation to deallocation.
For information about the broader resource lifecycle, see Resource Lifecycle. For details on how resources are defined, see Resources and ResWrapper.
Memory Layout
The AXNS resource system uses a custom memory layout that combines metadata with the actual resource data in a contiguous memory block.
Resource Memory Structure
flowchart TD subgraph subGraph1["ResInner Structure"] C["res: &'static Resource"] D["strong: AtomicUsize"] end subgraph subGraph0["Memory Block"] A["ResInner (Metadata)"] B["Resource Data"] end A --> B A --> C A --> D
The memory layout consists of two key sections:
- Metadata Section (ResInner): Contains:
- A reference to the static resource descriptor
- An atomic counter for tracking references
- Resource Data Section: Contains the actual data of the resource, with its layout defined during resource creation
Sources: src/arc.rs(L17 - L21) src/arc.rs(L23 - L27)
Memory Allocation Process
sequenceDiagram participant ClientCode as "Client Code" participant ResArcnew as "ResArc::new" participant MemoryAllocator as "Memory Allocator" participant ResInner as "ResInner" ClientCode ->> ResArcnew: "new(&'static Resource)" ResArcnew ->> ResArcnew: "Calculate layout requirements" ResArcnew ->> MemoryAllocator: "alloc(layout)" MemoryAllocator -->> ResArcnew: "raw memory pointer" ResArcnew ->> ResInner: "Initialize metadata" ResArcnew ->> ResInner: "Initialize resource data (res.init)" ResArcnew -->> ClientCode: "Return ResArc instance"
When a resource is created, `ResArc::new`:
1. Calculates the combined layout of the metadata and resource data
2. Allocates a single memory block
3. Initializes the metadata section with a reference count of 1
4. Calls the resource's init function to initialize the data section
Sources: src/arc.rs(L57 - L72)
Reference Counting Implementation
The `ResArc` type implements a custom atomic reference counting mechanism to track resource usage.
ResArc Architecture
classDiagram class ResArc { +NonNull~ResInner~ ptr +new(Resource) Self +get_mut() Option~&mut T~ +as_ref() &T +clone() Self } class ResInner { +&'static Resource res +AtomicUsize strong +body() NonNull~() ~ } ResArc --> ResInner : references
The reference counting is implemented through the `strong: AtomicUsize` field in `ResInner`, which ensures thread-safe operations on the reference count.
Sources: src/arc.rs(L49 - L51) src/arc.rs(L18 - L21)
Reference Counter Operations
flowchart TD A["Start"] B["ResArc::clone()"] C["Atomically increment strong count"] D["Count > MAX_REFCOUNT?"] E["Panic: Counter overflow"] F["Return new ResArc with same ptr"] G["ResArc::drop()"] H["Atomically decrement strong count"] I["Count == 0?"] J["Return - Resource still in use"] K["Call resource drop function"] L["Deallocate memory"] A --> B A --> G B --> C C --> D D --> E D --> F G --> H H --> I I --> J I --> K K --> L
Key aspects of the reference counting implementation:
- Incrementing: When `clone()` is called, the strong count is atomically incremented with relaxed ordering
- Overflow Protection: Checks ensure the counter doesn't exceed `MAX_REFCOUNT` (`isize::MAX`)
- Decrementing: When `drop()` is called, the strong count is atomically decremented
- Cleanup: When the count reaches zero, the resource's drop function is called and the memory is deallocated
Sources: src/arc.rs(L95 - L102) src/arc.rs(L104 - L120)
Thread Safety
`ResArc` implements both the `Send` and `Sync` traits, allowing it to be safely shared between threads. The atomic operations ensure that reference counting works correctly in a multi-threaded environment.
flowchart TD subgraph subGraph1["Ordering Used"] E["Relaxed: For increment"] F["Release: For decrement"] G["Acquire: For get_mut and drop fence"] end subgraph subGraph0["Thread Safety Mechanisms"] A["impl Send for ResArc"] B["impl Sync for ResArc"] C["AtomicUsize for reference counting"] D["Memory ordering guarantees"] end C --> E C --> F C --> G
The implementation uses specific memory orderings to balance performance and correctness:
- `Relaxed` ordering for incrementing the counter (lighter weight)
- `Release` ordering when decrementing, to ensure visibility of all previous operations
- An `Acquire` fence after the last reference is dropped, to ensure all operations complete before deallocation
Sources: src/arc.rs(L53 - L54) src/arc.rs(L5 - L9) src/arc.rs(L95 - L102) src/arc.rs(L104 - L120)
Resource Access Patterns
Accessing and Mutating Resources
flowchart TD A["Start"] B["ResArc::as_ref()"] C["Access resource data immutably"] D["ResArc::get_mut()"] E["strong count == 1?"] F["Return mutable reference to resource data"] G["Return None - Resource is shared"] A --> B A --> D B --> C D --> E E --> F E --> G
`ResArc` provides two primary access patterns:
- Immutable access (`as_ref()`): Always available; returns a reference to the resource data
- Mutable access (`get_mut()`): Only available when the reference count is exactly 1, ensuring exclusive access
Sources: src/arc.rs(L79 - L85) src/arc.rs(L88 - L93)
Integration with Namespace System
The reference counting mechanism integrates with the namespace system to enable safe resource sharing between namespaces.
sequenceDiagram participant ClientCode as "Client Code" participant ResWrapper as "ResWrapper" participant SourceNamespace as "Source Namespace" participant DestinationNamespace as "Destination Namespace" participant ResArc as "ResArc" ClientCode ->> ResWrapper: "share_from(dst, src)" ResWrapper ->> SourceNamespace: "get(resource)" SourceNamespace -->> ResWrapper: "ResArc reference" ResWrapper ->> ResArc: "clone()" ResArc -->> ResWrapper: "New ResArc with incremented count" ResWrapper ->> DestinationNamespace: "Update with cloned ResArc" Note over SourceNamespace,DestinationNamespace: Both namespaces now share the same resource instance
When a resource is shared between namespaces:
1. The `share_from` method obtains a reference to the resource in the source namespace
2. This reference is cloned, incrementing the strong count
3. The destination namespace's reference is replaced with the cloned reference
4. Both namespaces now point to the same underlying resource data
Sources: src/res.rs(L94 - L98)
Resource Reset and Memory Management
flowchart TD A["Start"] B["ResWrapper::reset()"] C["Get mutable reference to ResArc in namespace"] D["Create new ResArc instance"] E["Replace old ResArc with new one"] F["Old ResArc is dropped"] G["Was this the last reference?"] H["Resource memory is deallocated"] I["Resource continues to exist for other namespaces"] A --> B B --> C C --> D D --> E E --> F F --> G G --> H G --> I
When a resource is reset in a namespace:
1. A new `ResArc` instance is created with a fresh copy of the resource
2. The old `ResArc` in the namespace is replaced and dropped
3. If this was the last reference to the old resource, its memory is deallocated
4. Otherwise, the resource continues to exist for the other namespaces that share it
Sources: src/res.rs(L100 - L104)
Technical Limitations and Safeguards
The `ResArc` implementation includes several important safeguards:
- Reference Count Maximum: Limited to `isize::MAX` to prevent overflow
- Mutable Access Safety: Mutable access is only granted when the reference count is exactly 1
- Memory Layout Handling: Carefully manages the layout and offset calculations for resource data
- Drop Sequence: Ensures proper ordering of deallocation operations
Sources: src/arc.rs(L14 - L15) src/arc.rs(L79 - L85)
Summary
The resource reference counting system in AXNS provides:
- Memory Safety: Ensuring resources are only deallocated when all references are dropped
- Thread Safety: Allowing resources to be safely shared between threads
- Efficient Sharing: Enabling namespaces to share resources without unnecessary duplication
- Controlled Mutability: Preventing data races by restricting mutable access to unshared resources
This system forms the foundation for the resource lifecycle management in AXNS, balancing safety, performance, and flexibility.
Usage Guide
Relevant source files
This guide provides practical instructions for using the AXNS library to manage resources across namespaces. It covers how to define resources, create namespaces, access and modify resources, and advanced operations like sharing and resetting resources. For conceptual understanding of the AXNS architecture, see Core Concepts, and for deeper details on resource lifecycle, see Resource Lifecycle.
Defining Resources
The first step in using AXNS is defining your resources with the `def_resource!` macro. This macro creates static resource instances with proper initialization and cleanup.
def_resource! {
/// A simple integer resource
pub static COUNTER: i32 = 0;
/// A more complex resource with custom type
pub static CONFIG: Configuration = Configuration::default();
}
The macro creates a static `ResWrapper<T>` for each resource, providing methods to access and manipulate the resource within namespaces.
Sources: src/res.rs(L144 - L168)
Creating and Managing Namespaces
A namespace is a container for resource instances. Create a new namespace using the `Namespace::new()` constructor:
let mut my_namespace = Namespace::new();
When using the thread-local feature, you can also access the current namespace through resource wrappers:
let counter_ref = COUNTER.current(); // Gets the resource in the current namespace
Sources: src/lib.rs(L16 - L59)
flowchart TD A["Application Code"] B["Create Namespace"] C["Namespace::new()"] D["Define Resources"] E["def_resource! macro"] F["Access Resources"] G["resource.get(&ns)"] H["resource.get_mut(&mut ns)"] I["resource.current()"] A --> B A --> D A --> F B --> C D --> E F --> G F --> H F --> I
Basic Resource Access
AXNS provides several methods to access resources within namespaces:
Read-only Access
To get a reference to a resource in a namespace:
// Get reference to the resource in the specific namespace
let counter = COUNTER.get(&my_namespace);
Mutable Access
To modify a resource in a namespace:
// Get mutable reference if the resource isn't shared
if let Some(counter) = COUNTER.get_mut(&mut my_namespace) {
*counter += 1;
}
Note that `get_mut()` returns `Option<&mut T>`: it returns `None` if the resource is shared with other namespaces.
Current Namespace Access
When the thread-local feature is enabled, you can access resources in the current namespace:
// Access resource in current namespace
let current_value = COUNTER.current();
Sources: src/res.rs(L69 - L92) tests/all.rs(L15 - L25)
sequenceDiagram participant Application as "Application" participant ResWrapperT as "ResWrapper<T>" participant Namespace as "Namespace" participant ResArcT as "ResArc<T>" Application ->> ResWrapperT: get(&namespace) ResWrapperT ->> Namespace: ns.get(self.res) Namespace ->> ResArcT: Returns ResArc<T> ResArcT ->> Application: Returns &T Application ->> ResWrapperT: get_mut(&mut namespace) ResWrapperT ->> Namespace: ns.get_mut(self.res) Namespace ->> ResArcT: Returns ResArc<T> ResArcT -->> Application: Returns Option<&mut T>
Advanced Operations
Sharing Resources Between Namespaces
To share a resource from one namespace to another:
let src_namespace = Namespace::new();
let mut dst_namespace = Namespace::new();
// Share the COUNTER resource from src to dst
COUNTER.share_from(&mut dst_namespace, &src_namespace);
This creates a shared reference to the same resource instance in both namespaces. Note that once shared, you won't be able to get mutable access to the resource in either namespace via `get_mut()`.
Resetting Resources
To reset a resource in a namespace to its default value:
// Reset the COUNTER resource to its default value
COUNTER.reset(&mut my_namespace);
This discards the current resource instance and creates a new one with the default value specified in the `def_resource!` macro.
Sources: src/res.rs(L94 - L104) tests/all.rs(L96 - L123)
Using Thread-Local Namespaces
When the `thread-local` feature is enabled, AXNS provides thread-local namespaces for resource isolation:
// With thread-local feature enabled:
let counter = COUNTER.current(); // Gets from thread-local namespace
// You can implement the CurrentNs trait to define how thread-local
// namespaces are managed
The thread-local feature is particularly useful in multi-threaded applications where you want to isolate resources between threads.
Sources: src/lib.rs(L35 - L42) tests/all.rs(L40 - L159)
flowchart TD subgraph subGraph2["Resource Access Patterns"] A["Application Code"] F["Feature Check"] G["Current Thread?"] subgraph subGraph1["Feature: thread-local ON"] C["Thread 1 Namespace"] D["Thread 2 Namespace"] E["Thread 3 Namespace"] end subgraph subGraph0["Feature: thread-local OFF"] B["Global Namespace"] end end A --> F F --> B F --> G G --> C G --> D G --> E
Working with Custom Resource Types
AXNS can work with any type as a resource, including custom structs, atomics, and reference-counted types:
use std::sync::atomic::{AtomicUsize, Ordering};

struct Configuration {
    name: String,
    enabled: bool,
}

impl Configuration {
    fn default() -> Self {
        Self {
            name: "default".to_string(),
            enabled: false,
        }
    }
}

def_resource! {
    // Integer resource
    static COUNTER: i32 = 0;
    // Atomic for thread-safe access
    static ATOMIC_COUNTER: AtomicUsize = AtomicUsize::new(0);
    // Custom struct
    static CONFIG: Configuration = Configuration::default();
}

// Working with atomic resources
ATOMIC_COUNTER.current().fetch_add(1, Ordering::SeqCst);
Sources: tests/all.rs(L1 - L38)
Best Practices
- Use appropriate resource types: For shared resources that need concurrent access, consider atomic types or other thread-safe structures.
- Minimize resource sharing: While AXNS makes it easy to share resources between namespaces, excessive sharing can reduce the benefits of namespace isolation.
- Reset when done: Reset resources when they're no longer needed to free up memory.
- Implement CurrentNs carefully: If using the thread-local feature, ensure your CurrentNs implementation correctly handles the namespace lifecycle.
- Check the return value of get_mut(): Always check whether get_mut() returns Some before attempting to modify a resource, as it may be shared.
flowchart TD A["Resource Definition"] B["Namespace Creation"] C["Resource Access"] D["Modify Resource?"] E["Is Shared?"] F["Use get_mut()"] G["Can't modify directly"] H["Options"] I["Unsupported markdown: list"] J["Unsupported markdown: list"] K["Unsupported markdown: list"] L["Use get()"] A --> B B --> C C --> D D --> E D --> L E --> F E --> G G --> H H --> I H --> J H --> K
For more detailed information on specific aspects of AXNS, refer to:
- Basic Resource Access for more examples of resource access patterns
- Sharing and Resetting Resources for advanced resource management techniques
Basic Resource Access
Relevant source files
This page provides practical instructions for the fundamental operations in AXNS: defining resources, accessing them from namespaces, and modifying their values. For more advanced operations such as sharing resources between namespaces or resetting them to initial values, see Sharing and Resetting Resources.
Defining Resources
Resources in AXNS are defined using the def_resource!
macro, which creates statically allocated resources with their default values.
Syntax
def_resource! {
/// Documentation for the resource
pub static RESOURCE_NAME: ResourceType = default_value;
// Multiple resources can be defined in a single macro call
pub static ANOTHER_RESOURCE: AnotherType = another_default_value;
}
Behind the scenes, this macro creates a static ResWrapper<T>
instance that provides methods for accessing the resource in different namespaces.
flowchart TD A["def_resource! macro"] B["Define Static Resource"] C["Create Resource Struct"] D["Define layout, init and drop functions"] E["Create ResWrapper instance"] F["Resource accessible via RESOURCE_NAME"] A --> B B --> C B --> D B --> E E --> F
Sources: src/res.rs(L144 - L168)
Examples
Simple value types:
def_resource! {
/// A static integer resource
pub static MY_NUMBER: i32 = 42;
}
Complex types:
def_resource! {
/// A custom data structure
pub static MY_DATA: MyStruct = MyStruct {
field1: "default",
field2: 100
};
}
Atomic types for thread-safe access:
def_resource! {
/// An atomic counter
pub static COUNTER: AtomicUsize = AtomicUsize::new(0);
}
Sources: src/res.rs(L130 - L168) tests/all.rs(L11 - L13) tests/all.rs(L31 - L33)
Creating Namespaces
Before accessing resources, you need to create a namespace:
let mut ns = Namespace::new();
Resources will be automatically initialized with their default values when first accessed in a namespace.
sequenceDiagram participant ClientCode as "Client Code" participant Namespace as "Namespace" participant ResArc as "ResArc" participant Resource as "Resource" ClientCode ->> Namespace: "Namespace::new()" Note over Namespace: Creates empty namespace ClientCode ->> Namespace: "resource.get(&ns)" Namespace ->> ResArc: "Look up resource" alt First access Namespace ->> ResArc: "Create new ResArc" ResArc ->> Resource: "Initialize with default value" else Subsequent access Namespace ->> ResArc: "Return existing ResArc" end ResArc -->> ClientCode: "Return resource reference"
Sources: tests/all.rs(L15)
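The first-access initialization described above can be sketched with std's `OnceLock`. This is an analogy only — the `Slot` type below is hypothetical and not part of AXNS:

```rust
use std::sync::OnceLock;

// A minimal sketch of lazy, first-access initialization: the default
// initializer runs the first time the value is requested.
struct Slot {
    value: OnceLock<i32>,
}

impl Slot {
    fn new() -> Self {
        Slot { value: OnceLock::new() }
    }

    // First call runs the default initializer; later calls reuse it.
    fn get(&self) -> &i32 {
        self.value.get_or_init(|| 42)
    }
}

fn main() {
    let slot = Slot::new();
    assert_eq!(*slot.get(), 42);
    assert_eq!(*slot.get(), 42); // no re-initialization on later access
}
```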
Accessing Resources
AXNS provides two main methods to access resources:
Direct Access with Namespace Reference
// Get a reference to the resource in the given namespace
let value = RESOURCE_NAME.get(&namespace);
This method requires explicitly passing the namespace reference.
Current Namespace Access
// Access the resource in the current namespace
let current_value = RESOURCE_NAME.current();
The current() method uses the current namespace, which depends on the thread-local feature:
- When enabled: uses a thread-local namespace
- When disabled: uses a global namespace
flowchart TD A["Client Code"] B["Access Method"] C["Direct Access with Namespace"] D["Current Namespace Access"] E["thread-local feature"] F["Thread-local Namespace"] G["Global Namespace"] A --> B B --> C B --> D D --> E E --> F E --> G
Sources: src/res.rs(L78 - L82) src/res.rs(L69 - L76) tests/all.rs(L22 - L24) tests/all.rs(L35 - L37)
Modifying Resources
To modify a resource, you need a mutable reference to the namespace and the resource must not be shared with other namespaces:
// Try to get a mutable reference
if let Some(mut_value) = RESOURCE_NAME.get_mut(&mut namespace) {
// Modify the resource
*mut_value = new_value;
} else {
// Resource is shared, cannot modify
}
For atomic types, you can modify them without a mutable reference:
// No mutable reference needed for atomic operations
COUNTER.current().fetch_add(1, Ordering::Relaxed);
flowchart TD A["Client Code"] B["Resource Type"] C["resource.get_mut(&mut ns)"] D["Is resource shared?"] E["Returns Some(&mut T)"] F["Returns None"] G["Modify resource value"] H["resource.current()"] I["Call atomic methods"] J["Resource modified safely"] A --> B B --> C B --> H C --> D D --> E D --> F E --> G H --> I I --> J
Sources: src/res.rs(L89 - L92) tests/all.rs(L17 - L20) tests/all.rs(L36 - L37)
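The reason atomics need no mutable reference can be shown with a std-only sketch (no AXNS types involved): atomics offer interior mutability, so they can be updated through a shared reference.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

// An atomic can be updated through &AtomicUsize — no &mut needed,
// which is why atomic resources bypass get_mut() entirely.
fn bump(counter: &AtomicUsize) -> usize {
    counter.fetch_add(1, Ordering::Relaxed) + 1
}

fn main() {
    let counter = AtomicUsize::new(0);
    assert_eq!(bump(&counter), 1);
    assert_eq!(bump(&counter), 2);
}
```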
Complete Example
Here's a complete example demonstrating resource definition, access, and modification:
use axns::{Namespace, def_resource};
// Define resources
def_resource! {
/// A custom data structure
static DATA: MyStruct = MyStruct {
value: 100,
name: "hello".to_string()
};
}
// Create a namespace
let mut ns = Namespace::new();
// Access the resource (will have default value)
let data = DATA.get(&ns);
assert_eq!(data.value, 100);
assert_eq!(data.name, "hello");
// Modify the resource
if let Some(mut_data) = DATA.get_mut(&mut ns) {
mut_data.value = 42;
mut_data.name = "world".to_string();
}
// Verify changes
let modified_data = DATA.get(&ns);
assert_eq!(modified_data.value, 42);
assert_eq!(modified_data.name, "world");
// Access via current() method
let current_data = DATA.current();
// Note: If using the default global namespace, this would still
// have the default values, not the modified ones from our local namespace
Sources: tests/all.rs(L4 - L25)
Key Considerations
- Resource Initialization: Resources are initialized with their default values when first accessed.
- Thread Safety:
- Use atomic types for thread-safe modifications.
- Regular types require a mutable reference and cannot be modified if shared.
- Current Namespace:
- The behavior of current() depends on the thread-local feature.
- Be aware of which namespace you're accessing when using current().
- Resource Sharing:
- Shared resources cannot be modified through get_mut().
- For resource sharing, see Sharing and Resetting Resources.
Sources: src/res.rs(L53 - L105)
Sharing and Resetting Resources
Relevant source files
This page explains how to share resources between namespaces and how to reset resources to their default values in the AXNS system. These operations are essential for effective resource management in multi-namespace environments. For information about basic resource access and modification, see Basic Resource Access.
Resource Sharing
AXNS allows resources to be shared between namespaces using the share_from
method. When a resource is shared, both namespaces reference the same underlying data, though through separate ResArc
instances. This means changes to the resource will be visible across all namespaces that share it.
How Resource Sharing Works
Resource sharing transfers the reference from one namespace to another by cloning the ResArc
that wraps the resource:
flowchart TD subgraph subGraph0["After Sharing"] A2["Namespace A"] ResA2["ResArc (ref_count=2)"] B2["Namespace B"] ResB2["Cloned ResArc"] RD["Shared Resource Data X"] end A["Namespace A"] ResA["ResArc"] B["Namespace B"] ResB["Different ResArc"] RD1["Resource Data X"] RD2["Resource Data Y"] S["ResWrapper.share_from(&mut B, &A)"] C["Clone operation"] A --> ResA A2 --> ResA2 B --> ResB B2 --> ResB2 ResA --> RD1 ResA2 --> RD ResB --> RD2 ResB2 --> RD S --> C
The share_from
method implementation is straightforward:
pub fn share_from<'ns>(&self, dst: &'ns mut Namespace, src: &'ns Namespace) {
    *dst.get_mut(self.res) = src.get(self.res).clone();
}
This method clones the ResArc
from the source namespace and replaces the existing ResArc
in the destination namespace. The reference count for the resource increases, ensuring it won't be deallocated until all namespaces release their references.
Sources: src/res.rs(L96 - L98)
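Assuming ResArc counts references like std's Arc, the essential effect of share_from can be sketched with std alone (the names here are illustrative, not AXNS API):

```rust
use std::sync::Arc;

fn main() {
    // Two owners of the same allocation, as after share_from.
    let src = Arc::new(String::from("shared"));
    assert_eq!(Arc::strong_count(&src), 1);

    let dst = Arc::clone(&src);          // the clone share_from performs
    assert_eq!(Arc::strong_count(&src), 2);
    assert!(Arc::ptr_eq(&src, &dst));    // both point at the same data

    // A shared value refuses unique mutable access, mirroring why
    // get_mut() returns None for shared resources.
    let mut dst = dst;
    assert!(Arc::get_mut(&mut dst).is_none());
}
```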
Example of Resource Sharing
Here's a practical example of sharing a resource between namespaces:
// Define a resource
def_resource! {
static DATA: Arc<()> = Arc::new(());
}
// Set up source namespace with custom data
let mut src_ns = Namespace::new();
DATA.get_mut(&mut src_ns).unwrap().clone_from(&MY_CUSTOM_DATA);
// Create destination namespace
let mut dst_ns = Namespace::new();
// Share the resource from source to destination
DATA.share_from(&mut dst_ns, &src_ns);
// Now both namespaces reference the same underlying data
// Changes in one namespace will be visible in the other
Sources: tests/all.rs(L126 - L158)
Resource Resetting
The reset
method allows you to discard the current state of a resource in a namespace and reinitialize it to the default value defined in the def_resource!
macro.
How Resource Resetting Works
Resetting creates a new ResArc
instance with freshly initialized resource data:
flowchart TD subgraph subGraph1["After Reset"] OLDRA["Old ResArc (ref_count=n-1)"] OLDRD["Original Resource Data"] subgraph subGraph0["Before Reset"] NS2["Namespace"] RA3["New ResArc"] RD3["Default Resource Data"] NS1["Namespace"] RA1["ResArc (ref_count=n)"] RD1["Modified Resource Data"] end end RST["ResWrapper.reset(&mut Namespace)"] RA2["New ResArc (ref_count=1)"] RD2["Default Resource Data"] NS1 --> RA1 NS2 --> RA3 OLDRA --> OLDRD RA1 --> RD1 RA2 --> RD2 RA3 --> RD3 RST --> RA2
The implementation of reset
is equally straightforward:
pub fn reset(&self, ns: &mut Namespace) {
    *ns.get_mut(self.res) = ResArc::new(self.res);
}
This method creates a new ResArc
for the resource with default initialization and replaces the existing one in the namespace. The reference count of the old ResArc
decreases, potentially triggering deallocation if no other namespaces reference it.
Sources: src/res.rs(L102 - L104)
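Under the same Arc analogy, a reset amounts to overwriting the namespace's slot with a freshly initialized value, releasing the old reference (a sketch, not the AXNS implementation):

```rust
use std::sync::Arc;

fn main() {
    let original = Arc::new(10);
    let mut slot = Arc::clone(&original);   // the namespace's current reference
    assert_eq!(Arc::strong_count(&original), 2);

    // "Reset": replace the slot with a freshly initialized default.
    slot = Arc::new(0);
    assert_eq!(Arc::strong_count(&original), 1); // old reference released
    assert_eq!(*slot, 0);                        // back to the default value
}
```

Note how the old allocation survives as long as any other owner (here, `original`) still references it — the same rule the memory-management table below describes.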
Example of Resource Resetting
Here's how to reset a resource to its default value:
// Define a resource
def_resource! {
static DATA: Arc<()> = Arc::new(());
}
// Get a namespace and potentially modify the resource
let mut ns = Namespace::new();
// ...modify resource...
// Reset the resource to its default value
DATA.reset(&mut ns);
// Now the resource has been reinitialized with the default value
// Any previous sharing relationships are broken for this namespace
Sources: tests/all.rs(L97 - L123)
Use Cases and Considerations
When to Use Sharing vs. Resetting
- Use sharing when:
  - Multiple namespaces need to access the same resource instance
  - You want to propagate changes from one namespace to others
  - You need to conserve memory by avoiding duplication of large resources
- Use resetting when:
  - You need to return a resource to its initial state
  - You want to break sharing relationships with other namespaces
  - You're reinitializing a namespace for reuse
Memory Management Implications
Both operations affect resource reference counting:
Operation | Effect on Reference Count | Memory Impact |
---|---|---|
share_from | Increases reference count for source resource | Prevents deallocation as long as any namespace references it |
reset | Decreases reference count of old resource | May trigger deallocation if count reaches zero |
Thread Safety Considerations
When using these operations in a multi-threaded environment:
- Ensure proper synchronization when accessing shared resources
- For thread-local namespaces (with the thread-local feature enabled), be aware that resources are isolated by default
- Consider using thread-safe types (atomics, mutexes) for resources that may be shared across threads
Sources: src/res.rs(L53 - L105)
Example: Resource Lifecycle with Sharing and Resetting
The following sequence diagram illustrates a typical resource lifecycle involving sharing and resetting:
sequenceDiagram participant NamespaceA as "Namespace A" participant NamespaceB as "Namespace B" participant Resource as "Resource" Note over NamespaceA,NamespaceB: Initial state: each namespace has its own resource instance NamespaceA ->> Resource: get_mut() - Modify resource NamespaceA ->> NamespaceB: share_from(&mut NS2, &NS1) Note over NamespaceA,NamespaceB: Both namespaces now reference the same resource instance NamespaceB ->> Resource: get() - Access shared resource Note over NamespaceB: Changes made in NS1 are visible in NS2 NamespaceB ->> NamespaceB: reset() Note over NamespaceB: NS2 now has a fresh resource instance Note over NamespaceA,NamespaceB: Sharing relationship is broken NamespaceA ->> Resource: get() - Still has modified resource NamespaceB ->> Resource: get() - Has default resource
Sources: src/res.rs(L96 - L104)
Best Practices
- Be mindful of sharing: Shared resources cannot be safely mutated through get_mut(), so plan your resource usage accordingly.
- Consider reference counting: When sharing or resetting resources that hold external allocations (like Arc), be aware of the reference counting implications.
- Share judiciously: While sharing can save memory, it can make resource management more complex by creating implicit dependencies between namespaces.
Sources: src/res.rs(L90 - L92) tests/all.rs(L97 - L158)
Development and Testing
Relevant source files
This page provides comprehensive information for developers working on the AXNS resource namespace library itself. It covers the development environment, testing methodology, CI/CD workflow, and guidelines for contributing to the project. For information about using the AXNS library in your applications, please see Usage Guide.
Development Environment
AXNS is developed as a Rust library with minimal dependencies. To work on AXNS, you need:
- Rust Toolchain: The project uses the nightly Rust toolchain for development to leverage advanced features and documentation tools.
- Cargo: For building, testing, and package management.
- Git: For version control.
The repository is organized in a standard Rust project structure:
flowchart TD A["src/"] B["lib.rs (Core library code)"] C[".github/workflows/"] D["ci.yml (CI pipeline configuration)"] E["tests/"] F["all.rs (Test suite)"] G["Cargo.toml (Dependencies and metadata)"] A --> B C --> D E --> F
Sources: .github/workflows/ci.yml tests/all.rs
Testing Methodology
AXNS employs a comprehensive testing methodology to ensure correctness and reliability of the namespace system. The test suite in tests/all.rs
validates the core functionality through various test cases.
Test Categories
The test suite includes several categories of tests:
Sources: tests/all.rs
Test Case Examples
The test suite validates several key aspects of the AXNS system:
- Basic namespace operations tests/all.rs(L4 - L25)
- Creating a namespace
- Defining resources with
def_resource!
- Getting and modifying resources
- Current resource access tests/all.rs(L27 - L38)
- Accessing the current value of a resource
- Modifying resources in the current namespace
- Thread-local feature tests tests/all.rs(L40 - L159)
- Resource cleanup and recycling
- Resetting resources
- Sharing resources between namespaces
Sources: tests/all.rs
Feature Flag Testing
AXNS uses feature flags to enable optional functionality. The most significant feature is thread-local
, which enables thread-local namespace support.
Thread-Local Feature Testing
The thread-local feature is tested in a dedicated module that is only compiled when the feature is enabled:
flowchart TD subgraph subGraph0["Thread-Local Tests"] D["recycle() - Tests resource cleanup in thread-local context"] E["reset() - Tests resetting resources in thread-local context"] F["clone_from() - Tests sharing resources between thread-local namespaces"] end A["Test Suite"] B["Basic Tests"] C["Basic Tests + Thread-Local Tests"] A --> B A --> C C --> D C --> E C --> F
The thread-local tests validate several important aspects:
- Resource lifecycle - Ensuring resources are properly cleaned up when threads terminate
- Resource sharing - Verifying resources can be shared between namespaces
- Resource resetting - Testing the reset functionality in thread-local contexts
Sources: tests/all.rs(L40 - L159)
CI/CD Pipeline
AXNS employs a comprehensive CI/CD pipeline implemented with GitHub Actions to ensure code quality and consistency.
CI Workflow
flowchart TD A["Push/PR to main branch"] B["CI Workflow Trigger"] C["check job"] C1["Run cargo fmt"] C2["Run cargo clippy"] D["test job"] D1["Run cargo test"] D2["Run cargo test with thread-local"] E["doc job"] E1["Build documentation"] F["deploy job"] F1["Deploy to GitHub Pages"] G["Successful Check"] H["Successful Tests"] A --> B B --> C B --> D B --> E C --> C1 C --> C2 C1 --> G C2 --> G D --> D1 D --> D2 D1 --> H D2 --> H E --> E1 E1 --> F F --> F1
The CI pipeline consists of the following jobs:
Job | Description | Commands |
---|---|---|
check | Validates code formatting and checks for linting issues | cargo fmt --all --checkcargo clippy --all-targets --all-features -- -D warnings |
test | Runs the test suite in both standard and thread-local modes | cargo test --verbosecargo test --verbose -F thread-local |
doc | Builds the documentation with all features | cargo doc --all-features --no-deps |
deploy | Deploys the documentation to GitHub Pages | GitHub Actions deployment task |
Sources: .github/workflows/ci.yml
Test-Driven Development
The AXNS development process follows test-driven development principles:
- Write Tests First: New features should be accompanied by tests that validate their behavior
- Validate Core Functionality: Tests should cover the full range of functionality
- Feature Flag Testing: Both standard and feature-enabled configurations must be tested
Test to Code Relationship
flowchart TD subgraph subGraph1["Core Components"] B["Namespace Struct"] C["Resource and ResWrapper"] D["def_resource! Macro"] E["Thread-Local Features"] F["Resource Lifecycle"] end subgraph subGraph0["Test Files"] A["tests/all.rs"] end A --> B A --> C A --> D A --> E A --> F
Sources: tests/all.rs
Testing ResArc Reference Counting
A critical aspect of AXNS is its reference counting mechanism implemented through ResArc
. The test suite verifies that reference counting works correctly to prevent memory leaks.
sequenceDiagram participant Test as Test participant Namespace as Namespace participant Resource as Resource participant ResArc as ResArc Test ->> Namespace: Create Namespace Test ->> Resource: Define Resource (def_resource!) Test ->> Namespace: Get resource (DATA.get_mut()) Namespace ->> ResArc: Clone ResArc Test ->> Namespace: Modify resource Note over Test,ResArc: Thread-local tests Test ->> ResArc: Share resource (share_from) ResArc ->> ResArc: Increment reference count Test ->> ResArc: Reset resource (reset) ResArc ->> ResArc: Decrement reference count Note over Test,ResArc: Verify reference counts Test ->> ResArc: Check strong_count matches expectations
The thread-local tests specifically validate reference counting by tracking the strong_count
of Arc
instances and ensuring they are properly incremented and decremented.
Sources: tests/all.rs(L40 - L159)
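The cleanup-on-thread-exit behavior these tests verify can be sketched with std's Arc and threads alone (no AXNS types; this mirrors the kind of strong_count check the test suite performs):

```rust
use std::sync::Arc;
use std::thread;

fn main() {
    let data = Arc::new(());
    let clone = Arc::clone(&data);

    let handle = thread::spawn(move || {
        // While the thread is alive it owns one extra reference.
        assert_eq!(Arc::strong_count(&clone), 2);
    });
    handle.join().unwrap();

    // After the thread terminates, its reference has been dropped.
    assert_eq!(Arc::strong_count(&data), 1);
}
```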
Development Guidelines
When developing AXNS, follow these guidelines:
- Write Tests: All new functionality should be accompanied by appropriate tests.
- Feature Flags: When adding features that can be optional, use feature flags and add tests for both configurations.
- Documentation: Document all public APIs with doc comments.
- Code Quality: Ensure code passes cargo fmt and cargo clippy checks.
- Compatibility: Maintain backward compatibility when possible.
Adding New Resources
When adding new resource types to the system:
- Define the resource using the
def_resource!
macro - Implement tests that validate the resource behavior in various scenarios
- Ensure proper cleanup and reference counting
Testing Thread-Local Features
When working with the thread-local feature:
- Place thread-local specific tests in the
#[cfg(feature = "thread-local")]
module - Verify resources are properly cleaned up when threads terminate
- Test interactions between thread-local and global namespaces
Sources: tests/all.rs(L40 - L159)
Test Code Structure
The test structure in AXNS follows a modular pattern where base functionality is tested first, followed by feature-specific tests:
flowchart TD A["tests/all.rs"] B["Base Tests"] C["#[cfg(feature = thread-local)]"] B1["ns()"] B2["current()"] C1["thread_local! { static NS }"] C2["CurrentNsImpl struct"] C3["Feature-specific tests"] D1["recycle()"] D2["reset()"] D3["clone_from()"] A --> B A --> C B --> B1 B --> B2 C --> C1 C --> C2 C --> C3 C3 --> D1 C3 --> D2 C3 --> D3
Sources: tests/all.rs
Conclusion
The development and testing infrastructure of AXNS is designed to ensure the correctness and reliability of the resource namespace system. By following the guidelines and leveraging the existing test framework, developers can contribute to AXNS while maintaining its quality standards.
For information on using AXNS in your applications, please refer to the Usage Guide section.
Overview
Relevant source files
Purpose and Scope
The axsignal
crate implements a Unix-like signal handling system for ArceOS. It provides a comprehensive framework for managing signals at both process and thread levels, supporting standard operations such as sending, blocking, and handling signals. This crate enables applications running on ArceOS to use familiar signal handling patterns similar to those found in POSIX systems.
For detailed explanations of specific components, see Signal Management System, Signal Types and Structures, and Architecture Support.
Sources: src/lib.rs(L1 - L16) Cargo.toml(L1 - L31)
High-Level Architecture
The axsignal
crate is organized into several interconnected modules that together form a complete signal handling system.
flowchart TD subgraph subGraph2["axsignal Crate"] lib["lib.rs"] subgraph subGraph1["Architecture Support"] x86_64["arch/x86_64.rs"] aarch64["arch/aarch64.rs"] riscv["arch/riscv.rs"] loongarch64["arch/loongarch64.rs"] end subgraph subGraph0["Core Modules"] action["action.rs"] types["types.rs"] pending["pending.rs"] api["api.rs"] arch["arch/mod.rs"] end end arch --> aarch64 arch --> loongarch64 arch --> riscv arch --> x86_64 lib --> action lib --> api lib --> arch lib --> pending lib --> types
High-Level Architecture of axsignal
Sources: src/lib.rs(L7 - L15)
Key Components
The axsignal
crate consists of several key components that work together to provide signal handling functionality:
Signal Types
classDiagram class Signo { +value: u8 +const SIGHUP, SIGINT, SIGQUIT, etc. +is_standard() +is_realtime() } class SignalSet { +bits: u64 +new() +add_signal(Signo) +del_signal(Signo) +contains(Signo) +is_empty() } class SignalInfo { +signo: Signo +code: i32 +errno: i32 +fields: SignalFields } class SignalStack { +ss_sp: usize +ss_flags: i32 +ss_size: usize } SignalInfo --> Signo
Signal Type Components
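A minimal std-only sketch of the SignalSet idea, assuming signal numbers 1..=64 packed into a u64 bitmask (the crate's real type differs in detail and uses bitflags):

```rust
// Each signal number n occupies bit n-1 of a 64-bit mask.
#[derive(Clone, Copy, Default)]
struct SignalSet(u64);

impl SignalSet {
    fn add(&mut self, signo: u8) {
        self.0 |= 1 << (signo - 1);
    }
    fn del(&mut self, signo: u8) {
        self.0 &= !(1 << (signo - 1));
    }
    fn contains(&self, signo: u8) -> bool {
        self.0 & (1 << (signo - 1)) != 0
    }
    fn is_empty(&self) -> bool {
        self.0 == 0
    }
}

fn main() {
    let mut set = SignalSet::default();
    set.add(2);  // SIGINT
    set.add(15); // SIGTERM
    assert!(set.contains(2) && set.contains(15));
    set.del(2);
    assert!(!set.contains(2));
    assert!(!set.is_empty());
}
```

Packing the set into a single integer makes blocking checks a bitwise AND, which is what makes masking during delivery cheap.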
Signal Managers
classDiagram class ProcessSignalManager { +pending: Mutex +actions: Arc~ +send_signal(sig: SignalInfo) +dequeue_signal(mask: SignalSet) +pending() +wait_signal() } class ThreadSignalManager { -proc: Arc -pending: Mutex -blocked: Mutex -stack: Mutex +send_signal(sig: SignalInfo) +dequeue_signal(mask: SignalSet) +check_signals(tf, restore_blocked) +handle_signal(tf, restore_blocked, sig, action) } class PendingSignals { +set: SignalSet +info_std: Option[32] +info_rt: VecDeque +put_signal(sig: SignalInfo) +dequeue_signal(mask: SignalSet) } ThreadSignalManager --> ProcessSignalManager ProcessSignalManager --> PendingSignals ThreadSignalManager --> PendingSignals
Signal Management Components
Sources: src/lib.rs(L7 - L15)
Signal Processing Flow
The signal handling process in axsignal
follows a well-defined flow from generation to handling:
flowchart TD subgraph subGraph2["Signal Handling"] CheckAction["Check SignalAction"] Default["Execute Default Action"] Ignore["Ignore Signal"] Handler["Execute Custom Handler"] SaveContext["Save CPU Context"] ExecuteHandler["Run Handler Function"] RestoreContext["Restore CPU Context"] end subgraph subGraph1["Signal Queuing"] CheckBlocked["Is Signal Blocked?"] Queue["Add to PendingSignals"] Deliver["Deliver Immediately"] end subgraph subGraph0["Signal Generation"] Start["Signal Generated"] SendToProcess["ProcessSignalManager.send_signal()"] SendToThread["ThreadSignalManager.send_signal()"] end CheckPending["check_signals()"] CheckAction --> Default CheckAction --> Handler CheckAction --> Ignore CheckBlocked --> Deliver CheckBlocked --> Queue CheckPending --> Deliver Deliver --> CheckAction ExecuteHandler --> RestoreContext Handler --> SaveContext Queue --> CheckPending SaveContext --> ExecuteHandler SendToProcess --> CheckBlocked SendToThread --> CheckBlocked Start --> SendToProcess Start --> SendToThread
Signal Processing Flow
Sources: src/lib.rs(L7 - L15)
System Dependencies and Integration
The axsignal
crate integrates with several core components of ArceOS to provide its functionality:
flowchart TD subgraph subGraph1["External Dependencies"] axerrno["axerrno - Error Codes"] bitflags["bitflags - Bit Flag Manipulation"] log["log - Logging Infrastructure"] end subgraph subGraph0["ArceOS Core Components"] axconfig["axconfig - System Configuration"] axhal["axhal - Hardware Abstraction Layer"] axtask["axtask - Task Management"] end axsignal["axsignal"] axsignal --> axconfig axsignal --> axerrno axsignal --> axhal axsignal --> axtask axsignal --> bitflags axsignal --> log
Dependencies and Integration
The crate relies on:
axconfig
: For system configuration parametersaxhal
: For hardware abstraction with userspace supportaxtask
: For multitasking integrationaxerrno
: For error handlingbitflags
: For efficient signal set implementation- Additional utilities for logging and synchronization
Sources: Cargo.toml(L6 - L26)
Architecture Support
The axsignal
crate provides platform-specific implementations for different CPU architectures, ensuring proper signal context management:
flowchart TD subgraph subGraph2["Architecture-Specific Features"] context["Machine Context Management"] trampoline["Signal Trampoline Code"] setup["Signal Handler Setup"] restore["Context Restoration"] end subgraph subGraph1["Architecture Support"] arch_mod["arch/mod.rs - Common Interface"] subgraph subGraph0["Platform-Specific Implementations"] x86_64["arch/x86_64.rs"] aarch64["arch/aarch64.rs"] riscv["arch/riscv.rs"] loongarch64["arch/loongarch64.rs"] end end aarch64 --> context aarch64 --> restore aarch64 --> setup aarch64 --> trampoline arch_mod --> aarch64 arch_mod --> loongarch64 arch_mod --> riscv arch_mod --> x86_64 loongarch64 --> context loongarch64 --> restore loongarch64 --> setup loongarch64 --> trampoline riscv --> context riscv --> restore riscv --> setup riscv --> trampoline
Architecture Support System
Each architecture implementation provides specialized functionality for:
- Saving and restoring CPU registers during signal handling
- Setting up signal trampolines (code that transfers control to user-defined handlers)
- Managing signal stacks
- Handling architecture-specific signal delivery requirements
Sources: src/lib.rs(L8 - L9)
Summary
The axsignal
crate provides a comprehensive signal handling system for ArceOS, implementing familiar Unix-like functionality across multiple processor architectures. It manages signals at both process and thread levels, supports standard and real-time signals, and offers a flexible framework for defining custom signal actions.
Key features include:
- Process and thread level signal management
- Support for multiple architectures (x86_64, AArch64, RISC-V, LoongArch64)
- Signal masking and prioritization
- Custom signal handlers with context management
- Integration with ArceOS task management
For detailed information about specific components, refer to the dedicated pages on Signal Management System and Signal Types and Structures.
Sources: src/lib.rs(L1 - L16) Cargo.toml(L1 - L31)
Signal Management System
Relevant source files
Purpose and Scope
This page documents the signal management architecture in the axsignal
crate, focusing on the core components that handle signal delivery, queuing, and processing at both process and thread levels. The system implements a Unix-like signal handling framework that coordinates signal delivery across multiple threads within a process.
For information about specific signal types and structures, see Signal Types and Structures. For details on architecture-specific implementations, see Architecture Support.
Signal Management Architecture
The signal management system in axsignal
adopts a two-level architecture, consisting of:
- Process Signal Manager: Handles signals at the process level, maintaining process-wide pending signals and signal actions
- Thread Signal Manager: Manages signals at the thread level, with per-thread signal masks, stacks, and pending signals
This design allows signals to be directed either to a specific thread or to the process as a whole, following the standard Unix signal model.
classDiagram class ThreadSignalManager { -Arc proc -Mutex pending -Mutex blocked -Mutex stack +new(proc) +dequeue_signal(mask) +handle_signal(tf, restore_blocked, sig, action) +check_signals(tf, restore_blocked) +restore(tf) +send_signal(sig) } class ProcessSignalManager { +Mutex pending +Arc~ actions +WaitQueue wq +usize default_restorer +new(actions, default_restorer) +dequeue_signal(mask) +send_signal(sig) +pending() +wait_signal() } class PendingSignals { +SignalSet set +Option[32] info_std +VecDeque[33] info_rt +put_signal(SignalInfo) +dequeue_signal(SignalSet) } class SignalActions { +[SignalAction; 64] actions } class WaitQueue { <<trait>> +wait_timeout(timeout) +wait() +notify_one() +notify_all() } ThreadSignalManager --> ProcessSignalManager : references ProcessSignalManager --> PendingSignals : contains ThreadSignalManager --> PendingSignals : contains ProcessSignalManager --> SignalActions : contains ProcessSignalManager --> WaitQueue : uses
Sources: src/api/thread.rs(L20 - L240) src/api/process.rs(L32 - L82) src/api/mod.rs(L9 - L30)
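Assuming dequeue_signal delivers the lowest-numbered deliverable signal first (an assumption for illustration; the crate may order standard and real-time signals differently), the masking step can be sketched as:

```rust
// Deliver the lowest-numbered pending signal that is not blocked.
// Signal numbers are 1..=64, stored as bits 0..=63 of a u64.
fn dequeue(pending: &mut u64, blocked: u64) -> Option<u8> {
    let deliverable = *pending & !blocked;
    if deliverable == 0 {
        return None; // everything pending is blocked
    }
    let signo = deliverable.trailing_zeros() as u8 + 1;
    *pending &= !(1 << (signo - 1)); // remove it from the pending set
    Some(signo)
}

fn main() {
    let mut pending = (1u64 << 1) | (1u64 << 14); // signals 2 and 15 pending
    let blocked = 1u64 << 1;                      // signal 2 blocked
    assert_eq!(dequeue(&mut pending, blocked), Some(15));
    assert_eq!(dequeue(&mut pending, blocked), None); // only blocked left
}
```

This mirrors the flow above: the thread's blocked set masks the pending set, and a blocked signal stays queued until it is unblocked.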
Process Signal Manager
The ProcessSignalManager
is responsible for managing signals at the process level. It's a shared resource accessible by all threads within a process.
Structure and Components
flowchart TD subgraph ProcessSignalManager["ProcessSignalManager"] A["pending: Mutex"] B["Tracks process-wide pending signals"] C["actions: Arc>"] D["Defines how each signal is handled"] E["wq: WaitQueue"] F["Synchronization primitive for signal waiting"] G["default_restorer: usize"] H["Address of default signal return handler"] end A --> B C --> D E --> F G --> H
Sources: src/api/process.rs(L32 - L48)
Key Methods
Method | Purpose |
---|---|
new | Creates a new process signal manager with given actions and default restorer |
dequeue_signal | Removes and returns a pending signal that matches the given mask |
send_signal | Sends a signal to the process and notifies waiting threads |
pending | Returns the set of pending signals for the process |
wait_signal | Suspends the current thread until a signal is delivered |
Sources: src/api/process.rs(L49 - L82)
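The queueing semantics behind `send_signal` and `dequeue_signal` follow from the `PendingSignals` layout shown above: one slot per standard signal, a queue per real-time signal. The following is a minimal, illustrative model of that coalescing behavior (simplified types, not the crate's actual code — the real type keeps per-signal real-time queues and full `SignalInfo` payloads):

```rust
use std::collections::VecDeque;

// Toy model of PendingSignals (illustrative only).
// Standard signals (1..=31) coalesce: a signal already pending is dropped.
// Real-time signals (32..=64) queue every delivery in FIFO order.
struct PendingSignals {
    set: u64,               // bit n-1 set <=> signal n pending
    info_rt: VecDeque<u32>, // queued real-time signal numbers
}

impl PendingSignals {
    fn new() -> Self {
        Self { set: 0, info_rt: VecDeque::new() }
    }

    /// Returns true if the signal was queued, false if coalesced away.
    fn put_signal(&mut self, signo: u32) -> bool {
        let bit = 1u64 << (signo - 1);
        if signo >= 32 {
            // Real-time: every instance is queued.
            self.info_rt.push_back(signo);
            self.set |= bit;
            true
        } else if self.set & bit == 0 {
            self.set |= bit;
            true
        } else {
            false // standard signal already pending: coalesced
        }
    }
}
```

This coalescing is why standard signals can be "lost" if delivered repeatedly before being handled, while real-time signals are not.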
Thread Signal Manager
The `ThreadSignalManager` handles signals targeted at specific threads, maintaining thread-specific signal state while coordinating with the process-level manager.
Structure and Components
flowchart TD subgraph ThreadSignalManager["ThreadSignalManager"] A["proc: Arc"] B["Reference to the process-level manager"] C["pending: Mutex"] D["Thread-specific pending signals"] E["blocked: Mutex"] F["Signals currently blocked for this thread"] G["stack: Mutex"] H["Stack used for signal handlers"] end A --> B C --> D E --> F G --> H
Sources: src/api/thread.rs(L21 - L31)
Key Methods
| Method | Purpose |
| --- | --- |
| `new` | Creates a new thread signal manager with a reference to the process manager |
| `dequeue_signal` | Dequeues a signal from the thread or process pending queues |
| `handle_signal` | Processes a signal based on its action (default, ignore, handler) |
| `check_signals` | Checks and handles pending signals for the thread |
| `restore` | Restores the execution context after a signal handler returns |
| `send_signal` | Sends a signal to the thread |
| `wait_timeout` | Waits for a signal with an optional timeout |
Sources: src/api/thread.rs(L33 - L240)
Signal Processing Flow
The signal handling flow involves coordination between the process and thread signal managers, checking signal masks, and executing the appropriate actions based on signal dispositions.
flowchart TD subgraph subGraph0["Signal Delivery (check_signals)"] CheckSignals["ThreadSignalManager::check_signals()"] GetBlocked["Get thread's blocked signals"] Mask["Create mask of unblocked signals"] DequeueLoop["Start dequeue loop"] TryDequeue["Try dequeue signal from thread, then process"] SignalFound["Signal found?"] Done["No signal to handle"] GetAction["Get SignalAction for this signal"] HandleSignal["handle_signal()"] CheckDisposition["Check disposition"] DefaultAction["Execute default action"] NextSignal["Continue to next signal"] SetupHandler["Set up signal handler"] CreateFrame["Create SignalFrame"] SetupTrapFrame["Modify trap frame"] UpdateBlocked["Update blocked signals"] Result["Return signal and action"] end Start["Signal Generated"] SendDecision["Send to Thread or Process?"] ThreadSend["ThreadSignalManager::send_signal()"] ProcessSend["ProcessSignalManager::send_signal()"] ThreadPending["Add to Thread's pending signals"] ProcessPending["Add to Process's pending signals"] NotifyWQ["Notify process wait queue"] CheckDisposition --> DefaultAction CheckDisposition --> NextSignal CheckDisposition --> SetupHandler CheckSignals --> GetBlocked CreateFrame --> SetupTrapFrame DequeueLoop --> TryDequeue GetAction --> HandleSignal GetBlocked --> Mask HandleSignal --> CheckDisposition Mask --> DequeueLoop NextSignal --> DequeueLoop ProcessPending --> NotifyWQ ProcessSend --> ProcessPending SendDecision --> ProcessSend SendDecision --> ThreadSend SetupHandler --> CreateFrame SetupTrapFrame --> UpdateBlocked SignalFound --> Done SignalFound --> GetAction Start --> SendDecision ThreadPending --> NotifyWQ ThreadSend --> ThreadPending TryDequeue --> SignalFound UpdateBlocked --> Result
Sources: src/api/thread.rs(L119 - L143) src/api/thread.rs(L43 - L48) src/api/thread.rs(L50 - L117) src/api/thread.rs(L157 - L163) src/api/process.rs(L64 - L70)
Signal Handler Execution
When a signal with a custom handler is processed, the system prepares a special execution environment for the handler:
sequenceDiagram participant KernelThread as "Kernel/Thread" participant ThreadSignalManager as "ThreadSignalManager" participant SignalHandler as "Signal Handler" participant SignalRestorer as "Signal Restorer" KernelThread ->> ThreadSignalManager: check_signals() ThreadSignalManager ->> ThreadSignalManager: dequeue_signal() ThreadSignalManager ->> ThreadSignalManager: handle_signal() Note over ThreadSignalManager: Signal has Handler disposition ThreadSignalManager ->> ThreadSignalManager: Create SignalFrame on stack ThreadSignalManager ->> ThreadSignalManager: Save current context ThreadSignalManager ->> ThreadSignalManager: Set up handler arguments ThreadSignalManager ->> SignalHandler: Jump to handler (modify trap frame) SignalHandler ->> SignalRestorer: Return from handler SignalRestorer ->> ThreadSignalManager: restore() ThreadSignalManager ->> ThreadSignalManager: Restore original trap frame ThreadSignalManager ->> ThreadSignalManager: Restore original signal mask ThreadSignalManager ->> KernelThread: Return to normal execution
Sources: src/api/thread.rs(L50 - L117) src/api/thread.rs(L145 - L155)
SignalFrame Structure
When preparing to execute a signal handler, the system creates a special `SignalFrame` structure on the stack:
flowchart TD subgraph SignalFrame["SignalFrame"] A["ucontext: UContext"] B["Contains saved machine context"] C["siginfo: SignalInfo"] D["Information about the signal"] E["tf: TrapFrame"] F["Saved trap frame"] end A --> B C --> D E --> F
Sources: src/api/thread.rs(L14 - L18)
Wait Queue Interface
The `WaitQueue` trait provides a synchronization mechanism for threads waiting on signals. It defines methods for waiting with an optional timeout and for notifying waiting threads.
| Method | Description |
| --- | --- |
| `wait_timeout` | Waits for a notification with an optional timeout; returns whether a notification came |
| `wait` | Waits indefinitely for a notification |
| `notify_one` | Notifies a single waiting thread; returns whether a thread was notified |
| `notify_all` | Notifies all waiting threads |
This interface is used by both the process and thread signal managers to coordinate waiting for and receiving signals.
Sources: src/api/mod.rs(L9 - L30)
Signal Handling Process
The signal handling process from generation to execution follows this sequence:
- A signal is generated and sent via `send_signal()` to either a thread or a process
- The signal is added to the appropriate pending queue
- Waiting threads are notified via the wait queue
- When a thread checks for signals, it:
  - Determines which signals are not blocked
  - Dequeues pending signals from thread and process queues
  - For each signal, checks its action (disposition)
  - Executes the appropriate handler or default action
- For custom handlers, the system:
  - Creates a `SignalFrame` to save the current execution context
  - Sets up the stack and arguments for the handler
  - Modifies the trap frame to transfer control to the handler
  - When the handler returns, restores the original context
This comprehensive system allows for Unix-like signal handling with support for default actions, custom handlers, and signal masking at both process and thread levels.
Sources: src/api/thread.rs(L50 - L117) src/api/thread.rs(L119 - L143) src/api/thread.rs(L157 - L163) src/api/process.rs(L64 - L70)
Thread Signal Manager
Relevant source files
The Thread Signal Manager is a component of the axsignal crate that provides thread-level signal handling capabilities in a Unix-like signal handling system. It manages signal delivery, blocking, and handling for individual threads, working in coordination with the Process Signal Manager. For process-level signal handling, see Process Signal Manager.
Overview
The Thread Signal Manager implements thread-specific signal handling functionality including:
- Managing thread-specific pending signals
- Controlling which signals are blocked for a thread
- Setting up and managing signal handler stacks
- Handling signal delivery to user-defined handlers
- Coordination with the process-level signal manager
classDiagram class ThreadSignalManager { -Arc~ProcessSignalManager~ proc -Mutex~PendingSignals~ pending -Mutex~SignalSet~ blocked -Mutex~SignalStack~ stack +new(proc) +dequeue_signal(mask) +handle_signal(tf, restore_blocked, sig, action) +check_signals(tf, restore_blocked) +restore(tf) +send_signal(sig) +blocked() +with_blocked_mut(f) +stack() +with_stack_mut(f) +pending() +wait_timeout(set, timeout) } class ProcessSignalManager { +Mutex~PendingSignals~ pending +Arc~Mutex~SignalActions~~ actions +WaitQueue wq +usize default_restorer } class PendingSignals { +SignalSet set +Option~SignalInfo~[32] info_std +VecDeque~SignalInfo~[33] info_rt } class SignalSet { +u64 bits } class SignalStack { +usize sp +usize size +u32 flags } ThreadSignalManager --> ProcessSignalManager : references ThreadSignalManager *-- PendingSignals : contains ThreadSignalManager *-- SignalSet : contains ThreadSignalManager *-- SignalStack : contains
Sources: src/api/thread.rs(L21 - L31)
Core Components
SignalFrame
Before a signal handler is invoked, the current execution context is saved in a `SignalFrame` structure on the stack:
classDiagram class SignalFrame { +UContext ucontext +SignalInfo siginfo +TrapFrame tf } class UContext { +MContext mcontext +SignalSet sigmask } class SignalInfo { +Signo signo +i32 errno +i32 code } class TrapFrame { +architecture-specific registers } SignalFrame *-- UContext SignalFrame *-- SignalInfo SignalFrame *-- TrapFrame
Sources: src/api/thread.rs(L14 - L18)
Signal Handling Flow
The `ThreadSignalManager` follows a specific flow when handling signals:
flowchart TD ThreadReceive["Thread receives signal"] CheckBlocked["Is signalblocked?"] AddPending["Add to thread'spending signals"] CheckDisposition["Check signaldisposition"] Later["Later when unblocked"] DefaultAction["Execute default action(Terminate/CoreDump/Stop/Ignore/Continue)"] DoNothing["Do nothing"] SetupHandler["Setup handler execution"] SaveContext["Save current contextin SignalFrame"] ModifyTF["Modify trap framefor handler execution"] UpdateBlocked["Update blocked signals"] ExecuteHandler["Execute signal handler"] RestoreContext["Restore original contextwhen handler returns"] AddPending --> Later CheckBlocked --> AddPending CheckBlocked --> CheckDisposition CheckDisposition --> DefaultAction CheckDisposition --> DoNothing CheckDisposition --> SetupHandler ExecuteHandler --> RestoreContext Later --> CheckDisposition ModifyTF --> UpdateBlocked SaveContext --> ModifyTF SetupHandler --> SaveContext ThreadReceive --> CheckBlocked UpdateBlocked --> ExecuteHandler
Sources: src/api/thread.rs(L50 - L117) src/api/thread.rs(L119 - L143)
Key Methods
Constructor
The `ThreadSignalManager` is initialized with a reference to a `ProcessSignalManager`:
pub fn new(proc: Arc<ProcessSignalManager<M, WQ>>) -> Self {
    Self {
        proc,
        pending: Mutex::new(PendingSignals::new()),
        blocked: Mutex::new(SignalSet::default()),
        stack: Mutex::new(SignalStack::default()),
    }
}
Sources: src/api/thread.rs(L34 - L41)
Signal Dequeuing
The `dequeue_signal` method attempts to retrieve a signal from the thread's pending signals. If none are found, it falls back to the process-level signal manager:
fn dequeue_signal(&self, mask: &SignalSet) -> Option<SignalInfo> {
    self.pending
        .lock()
        .dequeue_signal(mask)
        .or_else(|| self.proc.dequeue_signal(mask))
}
Sources: src/api/thread.rs(L43 - L48)
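The thread-then-process fallback is just `Option::or_else` over two queues. A stripped-down sketch of the same pattern, with plain `Vec`s standing in for the pending-signal queues (hypothetical types, not the crate's API):

```rust
// Illustrative thread→process fallback mirroring ThreadSignalManager::dequeue_signal.
// The closure passed to or_else only runs when the thread queue is empty.
fn dequeue(thread_pending: &mut Vec<u32>, process_pending: &mut Vec<u32>) -> Option<u32> {
    // Try the thread-local queue first; only consult the process-wide
    // queue if the thread has nothing pending.
    thread_pending.pop().or_else(|| process_pending.pop())
}
```

This ordering gives thread-directed signals priority over process-directed ones when both are pending.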
Signal Handling
The `handle_signal` method is responsible for processing a signal based on its disposition:
flowchart TD HandleSignal["handle_signal()"] CheckDisposition["Check disposition"] DefaultAction["Get default actionfor signal number"] ReturnOSAction["Return appropriateSignalOSAction"] ReturnNone["Return None"] SetupStack["Setup stack forsignal handler"] CreateFrame["Create SignalFrameon stack"] ModifyTF["Modify trap frameto call handler"] AddArguments["Set arguments(signo, siginfo, ucontext)"] SetRestorer["Set signal restorer"] UpdateBlockedSignals["Update blocked signals"] CheckResethand["Has RESETHANDflag?"] ResetAction["Reset signal actionto default"] Skip["Skip reset"] ReturnHandler["ReturnSignalOSAction::Handler"] AddArguments --> SetRestorer CheckDisposition --> DefaultAction CheckDisposition --> ReturnNone CheckDisposition --> SetupStack CheckResethand --> ResetAction CheckResethand --> Skip CreateFrame --> ModifyTF DefaultAction --> ReturnOSAction HandleSignal --> CheckDisposition ModifyTF --> AddArguments ResetAction --> ReturnHandler SetRestorer --> UpdateBlockedSignals SetupStack --> CreateFrame Skip --> ReturnHandler UpdateBlockedSignals --> CheckResethand
Sources: src/api/thread.rs(L50 - L117)
Check and Handle Signals
The `check_signals` method checks for pending signals and handles them:
pub fn check_signals(
&self,
tf: &mut TrapFrame,
restore_blocked: Option<SignalSet>,
) -> Option<(SignalInfo, SignalOSAction)> {
let actions = self.proc.actions.lock();
let blocked = self.blocked.lock();
let mask = !*blocked;
let restore_blocked = restore_blocked.unwrap_or_else(|| *blocked);
drop(blocked);
loop {
let Some(sig) = self.dequeue_signal(&mask) else {
return None;
};
let action = &actions[sig.signo()];
if let Some(os_action) = self.handle_signal(tf, restore_blocked, &sig, action) {
break Some((sig, os_action));
}
}
}
Sources: src/api/thread.rs(L119 - L143)
Signal Frame Restoration
The `restore` method restores the original context from a signal frame:
pub fn restore(&self, tf: &mut TrapFrame) {
let frame_ptr = tf.sp() as *const SignalFrame;
// SAFETY: pointer is valid
let frame = unsafe { &*frame_ptr };
*tf = frame.tf;
frame.ucontext.mcontext.restore(tf);
*self.blocked.lock() = frame.ucontext.sigmask;
}
Sources: src/api/thread.rs(L145 - L155)
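Conceptually, `restore` undoes what `handle_signal` set up: the saved trap frame and the pre-handler signal mask are copied back out of the frame. A safe, self-contained model of that round trip (simplified placeholder types, no unsafe pointer reads through the stack pointer as the real code does):

```rust
// Simplified model of the SignalFrame save/restore round trip.
// The real code stores the frame on the handler stack and reads it back
// through the trap frame's stack pointer; here it is passed by value.
#[derive(Clone, Copy, PartialEq, Debug)]
struct TrapFrame {
    pc: usize,
    sp: usize,
}

struct SignalFrame {
    sigmask: u64,   // signal mask in effect before the handler ran
    tf: TrapFrame,  // interrupted execution context
}

// handle_signal side: capture the interrupted context and current mask.
fn save_frame(tf: &TrapFrame, blocked: u64) -> SignalFrame {
    SignalFrame { sigmask: blocked, tf: *tf }
}

// restore side: put both back exactly as they were.
fn restore_frame(frame: &SignalFrame, tf: &mut TrapFrame, blocked: &mut u64) {
    *tf = frame.tf;
    *blocked = frame.sigmask;
}
```

The invariant the real implementation maintains is the same: after `restore`, execution resumes as if the handler had never run, with the original mask reinstated.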
Signal Waiting
The `wait_timeout` method allows a thread to wait for a specific set of signals:
pub fn wait_timeout(
&self,
mut set: SignalSet,
timeout: Option<Duration>,
) -> Option<SignalInfo> {
// Non-blocked signals cannot be waited
set &= self.blocked();
if let Some(sig) = self.dequeue_signal(&set) {
return Some(sig);
}
let wq = &self.proc.wq;
let deadline = timeout.map(|dur| axhal::time::wall_time() + dur);
// There might be false wakeups, so we need a loop
loop {
match &deadline {
Some(deadline) => {
match deadline.checked_sub(axhal::time::wall_time()) {
Some(dur) => {
if wq.wait_timeout(Some(dur)) {
// timed out
break;
}
}
None => {
// deadline passed
break;
}
}
}
_ => wq.wait(),
}
if let Some(sig) = self.dequeue_signal(&set) {
return Some(sig);
}
}
// TODO: EINTR
None
}
Sources: src/api/thread.rs(L196 - L239)
Integration with the Signal Handling System
The `ThreadSignalManager` is a key component in the overall signal handling architecture:
Sources: src/api/thread.rs(L21 - L31) src/api/thread.rs(L43 - L48)
Performance and Synchronization
The `ThreadSignalManager` uses mutexes to protect its internal state:
| Protected Resource | Purpose |
| --- | --- |
| `pending` | Protects the thread's pending signals queue |
| `blocked` | Protects the set of signals blocked from delivery |
| `stack` | Protects the signal handler stack configuration |
The manager also interacts with the process-level wait queue for signal notification across threads.
Sources: src/api/thread.rs(L21 - L31) src/api/thread.rs(L196 - L239)
Summary
The Thread Signal Manager is a crucial component of the axsignal crate that handles thread-specific signal management. It works in coordination with the Process Signal Manager to provide a complete signal handling solution, supporting both standard Unix-like signals and real-time signals across multiple architectures.
Key responsibilities include:
- Managing thread-specific pending signals
- Handling signal blocking and unblocking
- Setting up and executing signal handlers
- Managing signal stacks
- Coordinating with the process-level signal manager
- Supporting signal waiting operations with timeouts
The thread-specific nature of the `ThreadSignalManager` allows for fine-grained control over signal handling within multi-threaded applications, while still maintaining compatibility with process-level signal delivery mechanisms.
Process Signal Manager
Relevant source files
Purpose and Scope
The Process Signal Manager is a core component of the `axsignal` crate that handles signal management at the process level. It provides mechanisms for managing, queuing, and delivering signals to processes in a Unix-like manner. This component serves as the foundation for process-wide signal operations while working alongside thread-specific signal handling. For thread-level signal management, see Thread Signal Manager.
Sources: src/api/process.rs(L32 - L82)
Structure and Components
The Process Signal Manager consists of several key components that work together to manage signals at the process level.
classDiagram class ProcessSignalManager { +Mutex pending +Arc~ actions +WaitQueue wq +usize default_restorer +new(actions, default_restorer) +dequeue_signal(mask) +send_signal(sig) +pending() +wait_signal() } class SignalActions { +[SignalAction; 64] 0 +index(Signo) +index_mut(Signo) } class PendingSignals { +SignalSet set +Option[32] info_std +VecDeque[33] info_rt +new() +put_signal(SignalInfo) +dequeue_signal(SignalSet) } class WaitQueue { +wait() +notify_one() } ProcessSignalManager --> SignalActions : contains Arc> ProcessSignalManager --> PendingSignals : contains Mutex<> ProcessSignalManager --> WaitQueue : contains
Diagram: Process Signal Manager Structure
Sources: src/api/process.rs(L32 - L48) src/api/process.rs(L13 - L30)
Core Components
- Pending Signals: A mutex-protected `PendingSignals` instance that stores signals queued for the process.
- Signal Actions: An atomically reference-counted, mutex-protected `SignalActions` that defines how the process responds to different signals.
- Wait Queue: Provides synchronization for tasks waiting on signals, used in operations like `rt_sigtimedwait`.
- Default Restorer: A function pointer (as `usize`) that serves as the default signal handler restorer.
Sources: src/api/process.rs(L32 - L48)
Signal Actions
The `SignalActions` structure maintains an array of 64 signal actions, providing indexed access to actions for each signal number.
SignalActions[1] -> Action for SIGHUP
SignalActions[2] -> Action for SIGINT
...
SignalActions[64] -> Action for signal 64
The structure implements the `Index` and `IndexMut` traits to allow convenient access to signal actions by their signal number (`Signo`).
Sources: src/api/process.rs(L13 - L30)
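A minimal sketch of what such `Index`/`IndexMut` implementations look like (illustrative only; the real definitions index by `Signo` rather than a raw integer, and `SignalAction` carries flags, mask, disposition, and restorer):

```rust
use std::ops::{Index, IndexMut};

// Placeholder action type standing in for the crate's SignalAction.
#[derive(Clone, Copy, Default, PartialEq, Debug)]
struct SignalAction(u8);

struct SignalActions([SignalAction; 64]);

impl Index<u32> for SignalActions {
    type Output = SignalAction;
    fn index(&self, signo: u32) -> &SignalAction {
        // Signals are numbered 1..=64, so shift down by one for the array.
        &self.0[(signo - 1) as usize]
    }
}

impl IndexMut<u32> for SignalActions {
    fn index_mut(&mut self, signo: u32) -> &mut SignalAction {
        &mut self.0[(signo - 1) as usize]
    }
}
```

Indexing by signal number rather than array offset keeps call sites like `actions[sig.signo()]` readable and avoids scattering off-by-one adjustments across the codebase.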
Signal Flow and Management
The Process Signal Manager plays a central role in the signal handling flow, serving as an intermediary between signal sources and handler execution.
flowchart TD subgraph subGraph0["Signal Flow"] SendSignal["send_signal(SignalInfo)"] AddPending["Add to pending signals"] NotifyWaiters["Notify waiting tasks"] CheckPending["pending()"] GetPendingSet["Return SignalSet of pending signals"] DequeueSignal["dequeue_signal(mask)"] CheckMask["Signal in mask?"] RemoveSignal["Remove from pending"] ReturnNone["Return None"] ReturnSignal["Return SignalInfo"] WaitSignal["wait_signal()"] SuspendTask["Suspend current task"] WakeupTask["Wake up waiting tasks"] end AddPending --> NotifyWaiters CheckMask --> RemoveSignal CheckMask --> ReturnNone CheckPending --> GetPendingSet DequeueSignal --> CheckMask NotifyWaiters --> WakeupTask RemoveSignal --> ReturnSignal SendSignal --> AddPending WaitSignal --> SuspendTask WakeupTask --> SuspendTask
Diagram: Process Signal Manager Operations
Sources: src/api/process.rs(L60 - L81)
Key Operations
- Signal Queueing: When a signal is sent to a process using `send_signal()`, it's added to the pending signals queue and triggers a notification on the wait queue.
- Signal Retrieval: The `dequeue_signal()` method allows retrieving a pending signal if it's not blocked by the provided mask.
- Pending Signal Management: The `pending()` method returns the set of signals currently pending for the process.
- Signal Waiting: The `wait_signal()` method suspends the current task until a signal is delivered to the process.
Sources: src/api/process.rs(L60 - L81)
Integration with Signal Management System
The Process Signal Manager integrates with the broader signal management system, particularly with the Thread Signal Manager and other signal handling components.
flowchart TD subgraph subGraph0["Signal Management System"] PSM["ProcessSignalManager"] ProcessPending["Process Pending Signals"] SignalActions["Signal Actions"] TSM["ThreadSignalManager"] ThreadPending["Thread Pending Signals"] SignalSource["Signal Source"] SignalInfo["SignalInfo"] Action["SignalAction"] Pending["PendingSignals"] end PSM --> ProcessPending PSM --> SignalActions ProcessPending --> SignalInfo SignalActions --> SignalInfo SignalSource --> PSM SignalSource --> TSM TSM --> PSM TSM --> ThreadPending ThreadPending --> SignalInfo
Diagram: Process Signal Manager in the Signal System
Sources: src/api/process.rs(L32 - L82)
Key Relationships
- Thread Signal Manager: Thread Signal Managers reference a Process Signal Manager, allowing them to check for process-level signals when no thread-specific signals are pending.
- Signal Actions: The Process Signal Manager maintains the signal actions table that defines how signals are handled. This table is shared across all threads in the process.
- Wait Queue: The Process Signal Manager provides a wait queue that allows tasks to wait for signals, with potential false wakeups due to its shared nature.
- Signal Delivery: When signals are sent to a process, they're queued in the Process Signal Manager's pending signals queue. Threads can then dequeue these signals based on their signal masks.
Sources: src/api/process.rs(L32 - L82)
Implementation Details
The Process Signal Manager is a generic structure parameterized by two types:
- `M`: A type that implements the `RawMutex` trait, used for synchronization
- `WQ`: A type that implements the `WaitQueue` trait, used for signal waiting
This allows flexibility in the underlying synchronization mechanisms while maintaining a consistent API.
Constructor
pub fn new(actions: Arc<Mutex<M, SignalActions>>, default_restorer: usize) -> Self
Creates a new Process Signal Manager with the given signal actions and default restorer function.
Signal Handling
The `send_signal` method adds a signal to the pending queue and notifies waiting tasks:
pub fn send_signal(&self, sig: SignalInfo) {
    self.pending.lock().put_signal(sig);
    self.wq.notify_one();
}
This simple mechanism ensures that signals are properly queued and waiting tasks are notified, allowing them to check for and potentially handle the new signal.
Sources: src/api/process.rs(L49 - L82)
Usage Considerations
When using the Process Signal Manager, consider these important points:
- Shared Access: The Process Signal Manager is shared across all threads in a process, requiring proper synchronization (provided by the mutex implementations).
- Wait Queue Behavior: The wait queue may cause false wakeups since it's shared by all threads in the process. Applications should be designed to handle this case.
- Default Restorer: The default restorer function is architecture-specific and is used when a signal handler doesn't provide its own restorer.
- Signal Actions: Signal actions define the behavior for each signal and are shared across the process, ensuring consistent handling regardless of which thread receives a signal.
Sources: src/api/process.rs(L32 - L82)
Wait Queue Interface
Relevant source files
Purpose and Scope
The Wait Queue Interface is a synchronization mechanism used within the axsignal crate to enable threads to efficiently wait for signals. It provides the fundamental building blocks for implementing signal suspension operations like `sigsuspend()` and `sigtimedwait()`. This document covers the Wait Queue trait definition, its implementation requirements, and how it's used within the signal management system.
For information about the overall signal management architecture, see Signal Management System, and for process-level and thread-level signal management, see Process Signal Manager and Thread Signal Manager respectively.
Wait Queue Trait Definition
The `WaitQueue` trait defines an abstract interface for a thread waiting mechanism that can be used across different parts of the signal handling system.
classDiagram note for WaitQueue "Implemented by concrete wait queuecomponents in the OS" class WaitQueue { <<trait>> +wait_timeout(timeout: Option~Duration~) bool +wait() +notify_one() bool +notify_all() }
Sources: src/api/mod.rs(L9 - L30)
The trait provides four essential methods:
| Method | Description | Return Value |
| --- | --- | --- |
| `wait_timeout` | Blocks the current thread until notified or the timeout expires | `true` if a notification came, `false` if the timeout expired |
| `wait` | Blocks the current thread indefinitely until notified | None (calls `wait_timeout` with `None`) |
| `notify_one` | Wakes up a single waiting thread, if any | `true` if a thread was notified |
| `notify_all` | Wakes up all waiting threads | None (repeatedly calls `notify_one`) |
Sources: src/api/mod.rs(L9 - L30)
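To make the contract concrete, here is a hedged sketch of one possible implementation backed by `std::sync::Condvar`. The trait shape and default method bodies follow the table above; everything else (the counter-based bookkeeping, the name `CondvarQueue`) is an assumption for illustration, not how kernel-side implementors necessarily do it:

```rust
use std::sync::{Condvar, Mutex};
use std::time::{Duration, Instant};

// Trait shape as documented; default bodies mirror the described behavior.
trait WaitQueue {
    /// Returns true if a notification arrived, false if the timeout expired.
    fn wait_timeout(&self, timeout: Option<Duration>) -> bool;
    fn wait(&self) {
        self.wait_timeout(None);
    }
    fn notify_one(&self) -> bool;
    fn notify_all(&self) {
        while self.notify_one() {}
    }
}

// Illustrative std-based queue: tracks (waiting threads, unconsumed
// notifications) so notify_one can report whether anyone was woken.
struct CondvarQueue {
    state: Mutex<(usize, usize)>,
    cv: Condvar,
}

impl CondvarQueue {
    fn new() -> Self {
        Self { state: Mutex::new((0, 0)), cv: Condvar::new() }
    }
}

impl WaitQueue for CondvarQueue {
    fn wait_timeout(&self, timeout: Option<Duration>) -> bool {
        let deadline = timeout.map(|d| Instant::now() + d);
        let mut st = self.state.lock().unwrap();
        st.0 += 1; // register as a waiter
        loop {
            if st.1 > 0 {
                st.1 -= 1;
                st.0 -= 1;
                return true; // a notification arrived
            }
            match deadline {
                Some(dl) => {
                    let now = Instant::now();
                    if now >= dl {
                        st.0 -= 1;
                        return false; // timed out
                    }
                    st = self.cv.wait_timeout(st, dl - now).unwrap().0;
                }
                None => st = self.cv.wait(st).unwrap(),
            }
        }
    }

    fn notify_one(&self) -> bool {
        let mut st = self.state.lock().unwrap();
        if st.0 > st.1 {
            st.1 += 1; // leave a notification for one waiter
            self.cv.notify_one();
            true
        } else {
            false // nobody is waiting
        }
    }
}
```

The loop around `Condvar::wait_timeout` also illustrates the spurious-wakeup handling the signal managers must perform at their own level.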
Integration with Signal Management System
The Wait Queue is a critical component in the signal management architecture, enabling signal-based thread suspension and notification.
flowchart TD subgraph subGraph1["Signal Operations"] SendSignal["send_signal()"] WaitSignal["wait_timeout()"] Notify["notify_all()"] WaitMethod["wait_timeout()"] end subgraph subGraph0["Signal Management System"] ProcessSigMgr["ProcessSignalManager"] ThreadSigMgr["ThreadSignalManager"] WaitQ["WaitQueue"] end Notify --> WaitMethod ProcessSigMgr --> WaitQ SendSignal --> Notify ThreadSigMgr --> SendSignal ThreadSigMgr --> WaitQ ThreadSigMgr --> WaitSignal WaitQ --> Notify WaitQ --> WaitMethod WaitSignal --> WaitMethod
Sources: src/api/thread.rs(L22 - L24) src/api/thread.rs(L197 - L239) src/api/thread.rs(L157 - L163)
Wait Queue Usage in Signal Waiting
The Wait Queue is primarily used to implement signal waiting functionality in the `ThreadSignalManager`:
sequenceDiagram participant Thread as Thread participant ThreadSignalManager as ThreadSignalManager participant ProcessSignalManager as ProcessSignalManager participant WaitQueue as WaitQueue Thread ->> ThreadSignalManager: wait_timeout(set, timeout) ThreadSignalManager ->> ThreadSignalManager: Check if signal already pending alt Signal already pending ThreadSignalManager -->> Thread: Return signal else No signal pending ThreadSignalManager ->> ProcessSignalManager: Access wait queue ThreadSignalManager ->> WaitQueue: wait_timeout(timeout) WaitQueue -->> ThreadSignalManager: Return (may be false wakeup) loop Until signal or timeout ThreadSignalManager ->> ThreadSignalManager: Check for pending signals alt Signal received ThreadSignalManager -->> Thread: Return signal else False wakeup or timeout alt Timeout expired ThreadSignalManager -->> Thread: Return None else Timeout not expired ThreadSignalManager ->> WaitQueue: wait_timeout(remaining) end end end end
Sources: src/api/thread.rs(L197 - L239)
Implementation Details
Signal Waiting with Timeout
The `wait_timeout` method in `ThreadSignalManager` demonstrates how the Wait Queue is used to implement signal waiting functionality:
- First checks if a relevant signal is already pending
- If not, calculates a deadline based on the timeout
- Enters a loop that:
  - Waits on the process's wait queue with a timeout
  - Checks if a relevant signal is now pending after each wakeup
  - Handles cases of false wakeups by continuing to wait
  - Manages the remaining timeout duration
Sources: src/api/thread.rs(L197 - L239)
Signal Notification
When a signal is sent to a thread, the wait queue is notified:
send_signal() → put_signal() → wq.notify_all()
This ensures that any threads waiting for signals are woken up to check if one of their waited-for signals is now pending.
Sources: src/api/thread.rs(L157 - L163)
Key Considerations for Wait Queue Implementations
The `WaitQueue` trait is defined as a generic interface, allowing different concrete implementations to be used. Implementations must consider:
- Timeout handling: Must support both indefinite waiting and time-limited waiting
- False wakeup handling: The signal management code is designed to handle spurious wakeups by rechecking conditions
- Efficiency: Should efficiently wake only necessary threads when possible
- Fairness: Ideally should wake threads in a fair manner (e.g., FIFO order)
The default implementations of `wait()` and `notify_all()` are provided for convenience, but concrete implementations may override them for better performance.
Sources: src/api/mod.rs(L16 - L29)
Wait Queue in the Signal Processing Flow
The Wait Queue plays a crucial role in the overall signal processing flow:
flowchart TD subgraph subGraph2["Signal Waiting"] Wait["wait_timeout(set, timeout)"] WaitMethod["WaitQueue.wait_timeout()"] CheckSig["Check for pending signals"] DeqSig["Dequeue signal"] Return["Return None"] end subgraph subGraph1["Signal Queuing"] PendQ["PendingSignals"] Notify["WaitQueue.notify_all()"] end subgraph subGraph0["Signal Generation"] SigSend["send_signal()"] end SigInfo["SignalInfo"] CheckSig --> DeqSig CheckSig --> Return DeqSig --> SigInfo Notify --> WaitMethod PendQ --> CheckSig SigSend --> Notify SigSend --> PendQ Wait --> WaitMethod WaitMethod --> CheckSig
Sources: src/api/thread.rs(L197 - L239) src/api/thread.rs(L157 - L163)
Summary
The Wait Queue Interface provides a critical synchronization mechanism for the axsignal crate, enabling efficient signal waiting and notification. By abstracting the waiting and notification operations through a trait, the system allows for flexible implementation while maintaining a consistent interface. The `ThreadSignalManager` leverages this interface to implement signal waiting functionality, with proper handling of timeouts and false wakeups.
Signal Types and Structures
Relevant source files
Purpose and Scope
This page documents the core data structures used to represent and manage signals in the axsignal crate. These structures form the foundation of the signal handling system in ArceOS, providing a Unix-like signal framework that's compatible with Linux signal interfaces. For information on how signals are managed at the process and thread levels, see Signal Management System.
Core Signal Types
The axsignal crate defines several fundamental types that represent different aspects of signals in the system.
classDiagram class Signo { +enum values(SIGHUP=1 to SIGRT32=64) +is_realtime() bool +default_action() DefaultSignalAction } class SignalSet { +u64 value +add(Signo) bool +remove(Signo) bool +has(Signo) bool +dequeue(SignalSet) Option~Signo~ +to_ctype(kernel_sigset_t) } class SignalInfo { +siginfo_t raw_value +new(Signo, i32) SignalInfo +signo() Signo +set_signo(Signo) +code() i32 +set_code(i32) } class SignalStack { +usize sp +u32 flags +usize size +disabled() bool } SignalInfo --> Signo : contains SignalSet --> Signo : operates on
Sources: src/types.rs(L12 - L77) src/types.rs(L123 - L182) src/types.rs(L185 - L215) src/types.rs(L218 - L240)
Signal Numbers (Signo)
The `Signo` enum represents signal numbers compatible with Unix-like systems. It defines constants for standard signals (1-31) and real-time signals (32-64).
flowchart TD subgraph subGraph2["Real-time Signal Examples"] SIGRTMIN["SIGRTMIN (32)"] SIGRT1["SIGRT1 (33)"] SIGRT32["SIGRT32 (64)"] end subgraph subGraph1["Standard Signal Examples"] SIGHUP["SIGHUP (1)"] SIGINT["SIGINT (2)"] SIGTERM["SIGTERM (15)"] SIGKILL["SIGKILL (9)"] end subgraph subGraph0["Signal Categories"] StandardSignals["Standard Signals (1-31)"] RealTimeSignals["Real-time Signals (32-64)"] end RealTimeSignals --> SIGRT1 RealTimeSignals --> SIGRT32 RealTimeSignals --> SIGRTMIN StandardSignals --> SIGHUP StandardSignals --> SIGINT StandardSignals --> SIGKILL StandardSignals --> SIGTERM
Key features of the `Signo` enum:

- Represents 64 different signal types (1-64)
- Distinguishes between standard signals (1-31) and real-time signals (32-64)
- Provides the `is_realtime()` method to identify signal categories
- Associates default actions with each signal through the `default_action()` method
The default actions for signals include:
- Terminate: End the process
- CoreDump: End the process and generate a core dump
- Ignore: Do nothing
- Stop: Pause the process
- Continue: Resume a stopped process
Sources: src/types.rs(L12 - L77) src/types.rs(L80 - L119)
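As an illustration, here is a hypothetical excerpt of such a signal-number-to-default-action mapping, using the conventional Linux numbers for a handful of well-known signals (the crate's full table in src/types.rs covers all 64 signals and operates on `Signo`, not raw integers):

```rust
#[derive(Debug, PartialEq)]
enum DefaultSignalAction {
    Terminate,
    CoreDump,
    Ignore,
    Stop,
    Continue,
}

// Hypothetical slice of the Signo -> default action mapping.
fn default_action(signo: u32) -> DefaultSignalAction {
    match signo {
        1 | 2 | 15 => DefaultSignalAction::Terminate, // SIGHUP, SIGINT, SIGTERM
        3 | 11 => DefaultSignalAction::CoreDump,      // SIGQUIT, SIGSEGV
        17 => DefaultSignalAction::Ignore,            // SIGCHLD
        19 | 20 => DefaultSignalAction::Stop,         // SIGSTOP, SIGTSTP
        18 => DefaultSignalAction::Continue,          // SIGCONT
        _ => DefaultSignalAction::Terminate,          // fallback for this sketch
    }
}
```

This mapping is what `handle_signal` falls back to when a signal's disposition is `Default`.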
Signal Sets (SignalSet)
The `SignalSet` structure represents a set of signals, compatible with the Linux `sigset_t` type. It uses a 64-bit integer internally, where each bit corresponds to a signal number.
Key operations on SignalSet
:
add(signal)
: Adds a signal to the setremove(signal)
: Removes a signal from the sethas(signal)
: Checks if a signal is in the setdequeue(mask)
: Removes and returns a signal from the set that is also in the provided mask
The structure provides conversion to and from the Linux kernel_sigset_t
type, ensuring compatibility with Linux syscalls and ABI.
Sources: src/types.rs(L123 - L182)
Signal Information (SignalInfo)
The `SignalInfo` structure encapsulates detailed information about a signal, compatible with the Linux `siginfo_t` type. It provides a transparent wrapper around the raw Linux type with convenient methods for accessing and modifying signal properties.

Key features:
- Retrieves and sets the signal number (`signo`)
- Retrieves and sets the signal code (`code`)
- Preserves compatibility with the Linux ABI for signal handlers that expect a `siginfo_t` parameter
Sources: src/types.rs(L185 - L215)
Signal Stack (SignalStack)
The `SignalStack` structure defines an alternate stack for signal handlers, compatible with the Linux `sigaltstack` structure. Signal stacks provide a dedicated memory area for signal handlers to execute, which is useful for handling stack overflow situations.

Fields:
- `sp`: Stack pointer (address)
- `flags`: Stack flags (e.g., `SS_DISABLE` to disable the alternate stack)
- `size`: Size of the stack in bytes

The `disabled()` method checks if the alternate stack is disabled.
Sources: src/types.rs(L218 - L240)
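As a concrete illustration, the alternate-stack check can be sketched as follows. This is a minimal model of the `SignalStack` described above, not the crate's exact code; the field layout and the `SS_DISABLE` value follow the Linux ABI convention.

```rust
// Illustrative sketch of an alternate-stack descriptor modeled on the
// Linux sigaltstack structure; SS_DISABLE follows the Linux ABI value.
const SS_DISABLE: u32 = 2;

#[repr(C)]
#[derive(Clone, Copy, Debug)]
pub struct SignalStack {
    pub sp: usize,   // stack pointer (base address of the alternate stack)
    pub flags: u32,  // stack flags, e.g. SS_DISABLE
    pub size: usize, // stack size in bytes
}

impl SignalStack {
    /// True when the alternate stack is disabled and the handler should
    /// run on the current stack instead.
    pub fn disabled(&self) -> bool {
        self.flags & SS_DISABLE != 0
    }
}

fn main() {
    let alt = SignalStack { sp: 0x7fff_0000, flags: 0, size: 8192 };
    assert!(!alt.disabled());
    let off = SignalStack { sp: 0, flags: SS_DISABLE, size: 0 };
    assert!(off.disabled());
}
```

The `disabled()` check is what lets the signal delivery path fall back to the interrupted thread's own stack when no alternate stack is configured.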
Signal Action Components
The signal action subsystem defines how signals are handled when they are delivered.
classDiagram class SignalAction { +SignalActionFlags flags +SignalSet mask +SignalDisposition disposition +__sigrestore_t restorer +to_ctype(kernel_sigaction) } class SignalActionFlags { +SIGINFO +NODEFER +RESETHAND +RESTART +ONSTACK +RESTORER } class SignalDisposition { <<enum>> +Default +Ignore +Handler(fn(i32)) } class DefaultSignalAction { <<enum>> +Terminate +Ignore +CoreDump +Stop +Continue } class SignalOSAction { <<enum>> +Terminate +CoreDump +Stop +Continue +Handler } class SignalSet { } SignalAction --> SignalActionFlags : contains SignalAction --> SignalSet : contains SignalAction --> SignalDisposition : contains SignalAction ..> DefaultSignalAction : uses when disposition is Default SignalDisposition ..> SignalOSAction : converted to
Sources: src/action.rs(L16 - L156)
Default Signal Actions
The `DefaultSignalAction` enum defines the possible default behaviors for signals:
Action | Description |
---|---|
Terminate | End the process |
Ignore | Do nothing when the signal is received |
CoreDump | End the process and generate a core dump |
Stop | Pause the process |
Continue | Resume a stopped process |
Each signal has a predefined default action as specified by the `default_action()` method in the `Signo` enum.
Sources: src/action.rs(L16 - L31) src/types.rs(L84 - L119)
Signal Action Flags
The `SignalActionFlags` bitflags define modifiers for signal handling behavior:
Flag | Description |
---|---|
SIGINFO | Handler expects additional signal information |
NODEFER | Signal is not blocked during handler execution |
RESETHAND | Reset handler to default after execution |
RESTART | Automatically restart interrupted system calls |
ONSTACK | Use alternate signal stack |
RESTORER | Custom signal restorer function is provided |
These flags match the Linux SA_* constants and modify how signals are handled and processed.
Sources: src/action.rs(L50 - L59)
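Because these flags mirror the Linux `SA_*` constants, they can be modeled as plain bit constants. The crate itself uses the `bitflags` macro, but the semantics are the same; the numeric values below are the generic Linux ABI values, shown here for illustration.

```rust
// Plain-constant sketch of the SA_* flag bits (generic Linux ABI values).
const SA_SIGINFO: u32 = 0x0000_0004;
const SA_RESTORER: u32 = 0x0400_0000;
const SA_ONSTACK: u32 = 0x0800_0000;
const SA_RESTART: u32 = 0x1000_0000;
const SA_NODEFER: u32 = 0x4000_0000;
const SA_RESETHAND: u32 = 0x8000_0000;

/// Does the handler expect the three-argument (signo, siginfo, ucontext) form?
fn uses_siginfo(flags: u32) -> bool {
    flags & SA_SIGINFO != 0
}

fn main() {
    let flags = SA_SIGINFO | SA_RESTART;
    assert!(uses_siginfo(flags));
    assert_eq!(flags & SA_NODEFER, 0); // signal will be blocked during its handler
    let _ = (SA_RESTORER, SA_ONSTACK, SA_RESETHAND);
}
```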
Signal Disposition
The `SignalDisposition` enum defines how a specific signal should be handled:
- `Default`: Use the default action for the signal
- `Ignore`: Ignore the signal
- `Handler(fn)`: Execute a custom handler function

This is part of the `SignalAction` structure and determines the action taken when a signal is delivered.
Sources: src/action.rs(L73 - L83)
Signal Action Structure
The `SignalAction` structure combines all aspects of signal handling configuration:
- `flags`: Bitflags that modify signal handling behavior
- `mask`: Set of signals blocked during handler execution
- `disposition`: How the signal should be handled
- `restorer`: Function to restore context after handler execution

This structure is compatible with the Linux `sigaction` structure and provides conversion methods for Linux ABI compatibility.
Sources: src/action.rs(L85 - L156)
Pending Signal Management
The pending signal subsystem manages signals that have been generated but not yet delivered or handled.
flowchart TD subgraph subGraph1["Signal Flow"] PutSignal["put_signal(SignalInfo)"] DequeueSignal["dequeue_signal(SignalSet)"] StandardSig["Standard Signal"] RTSig["Real-time Signal"] StandardSet["Set bit in SignalSet"] RTSet["Set bit in SignalSet"] StandardStore["Store in info_std array"] RTQueue["Push to info_rt queue"] end subgraph PendingSignals["PendingSignals"] SignalSet["SignalSet (all pending signals)"] StandardQueue["Standard Signal Queue (info_std)"] RealTimeQueue["Real-time Signal Queue (info_rt)"] end DequeueSignal --> SignalSet PutSignal --> RTSig PutSignal --> StandardSig RTSig --> RTQueue RTSig --> RTSet SignalSet --> RealTimeQueue SignalSet --> StandardQueue StandardSig --> StandardSet StandardSig --> StandardStore
Sources: src/pending.rs(L8 - L66)
PendingSignals Structure
The `PendingSignals` structure maintains a queue of signals that are waiting to be delivered and processed:
- `set`: A `SignalSet` indicating which signals are pending
- `info_std`: An array of `Option<SignalInfo>` for standard signals (1-31)
- `info_rt`: An array of queues for real-time signals (32-64)
Key differences in handling standard vs. real-time signals:
- Standard signals are not queued (at most one instance of each signal can be pending)
- Real-time signals are fully queued (multiple instances of the same signal can be pending)
Sources: src/pending.rs(L8 - L29)
Signal Queueing Mechanisms
The `PendingSignals` structure implements two primary operations:

`put_signal(sig)`: Adds a signal to the pending queue
- For standard signals, if the signal is already pending, the new instance is ignored
- For real-time signals, each signal is queued regardless of existing pending signals of the same type

`dequeue_signal(mask)`: Removes and returns a signal from the pending queue
- Only returns signals that are included in the provided mask
- For standard signals, it clears the corresponding bit in the signal set
- For real-time signals, it removes one instance from the queue and only clears the bit if the queue becomes empty
This two-tier design provides different quality-of-service levels for standard and real-time signals, matching the behavior of Unix-like systems.
Sources: src/pending.rs(L30 - L66)
Linux Compatibility Model
The signal types and structures in axsignal are designed to be binary-compatible with their Linux counterparts.
flowchart TD subgraph subGraph1["Linux Types"] LSigSet["kernel_sigset_t"] LSigInfo["siginfo_t"] LSigStack["sigaltstack"] LSigAction["kernel_sigaction"] end subgraph subGraph0["axsignal Types"] ASigSet["SignalSet"] ASigInfo["SignalInfo"] ASigStack["SignalStack"] ASigAction["SignalAction"] end ASigAction --> LSigAction ASigInfo --> LSigInfo ASigSet --> LSigSet ASigStack --> LSigStack
Key compatibility features:
- `#[repr(transparent)]` ensures binary compatibility for `SignalSet` and `SignalInfo`
- `#[repr(C)]` ensures memory layout compatibility for `SignalStack`
- Conversion methods (`to_ctype`, `TryFrom`) provide interoperability with the Linux ABI
This compatibility layer enables the axsignal crate to interact seamlessly with Linux syscalls and application code that expects Linux-compatible signal structures.
Sources: src/types.rs(L123 - L182) src/types.rs(L185 - L215) src/types.rs(L218 - L240) src/action.rs(L85 - L156)
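The role of `#[repr(transparent)]` can be sketched with a toy version of the `SignalSet` wrapper. Here `kernel_sigset_t` is modeled as a bare `u64` purely for illustration (the real kernel type is a struct); the point is that the transparent wrapper has exactly the layout of the wrapped integer, so it crosses the syscall ABI unchanged.

```rust
// Hypothetical stand-in for the kernel type, modeled as a bare u64.
#[allow(non_camel_case_types)]
type kernel_sigset_t = u64;

// A transparent wrapper has the same layout as its single field.
#[repr(transparent)]
#[derive(Clone, Copy, PartialEq, Eq, Debug)]
struct SignalSet(u64);

impl From<kernel_sigset_t> for SignalSet {
    fn from(raw: kernel_sigset_t) -> Self {
        SignalSet(raw)
    }
}

impl SignalSet {
    fn to_ctype(&self, dest: &mut kernel_sigset_t) {
        *dest = self.0;
    }
}

fn main() {
    let set = SignalSet::from(0b101u64);
    let mut raw: kernel_sigset_t = 0;
    set.to_ctype(&mut raw);
    assert_eq!(raw, 0b101);
    // Identical size is what makes the wrapper ABI-safe.
    assert_eq!(std::mem::size_of::<SignalSet>(), std::mem::size_of::<u64>());
}
```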
Signal Numbers and Sets
Relevant source files
This document details the signal numbers and signal sets implementation in the axsignal crate, which provides the foundation for signal handling in ArceOS. For information about signal actions and handling, see Signal Actions and Dispositions. For information about pending signal management, see Pending Signals.
Overview
The signal system in axsignal implements Unix-compatible signal numbers and sets that are used throughout the signal handling framework. Signal numbers (represented by the `Signo` enum) identify specific signals, while signal sets (represented by the `SignalSet` struct) provide an efficient way to manage collections of signals.
flowchart TD subgraph subGraph1["Usage in System"] TSM["ThreadSignalManager"] PSM["ProcessSignalManager"] PS["PendingSignals"] SA["SignalAction"] end subgraph subGraph0["Signal Numbers and Sets"] Signo["Signo Enum"] SignalSet["SignalSet Struct"] end SignalSet --> PS SignalSet --> PSM SignalSet --> TSM Signo --> PS Signo --> SA
Sources: src/types.rs(L9 - L182)
Signal Numbers (Signo)
The `Signo` enum defines all standard Unix signals and real-time signals. It is implemented as a `u8` enum with explicit numeric values that correspond to standard Unix signal numbers.
Signal Categories
Signal numbers in axsignal are divided into two main categories:
- Standard Signals (1-31): Traditional Unix signals with predefined behaviors
- Real-time Signals (32-64): Extended signals for application-defined purposes
classDiagram class Signo { <<enum>> SIGHUP = 1 SIGINT = 2 ... SIGSYS = 31 SIGRTMIN = 32 ... SIGRT32 = 64 is_realtime() bool default_action() DefaultSignalAction } class DefaultSignalAction { <<enum>> Terminate CoreDump Ignore Stop Continue } Signo --> DefaultSignalAction : returns
Sources: src/types.rs(L9 - L77) src/types.rs(L79 - L120)
Standard Signals
Standard signals (1-31) represent traditional Unix signals, each with a specific purpose and default behavior:
Signal Number | Name | Default Action | Description |
---|---|---|---|
1 | SIGHUP | Terminate | Hangup detected on controlling terminal |
2 | SIGINT | Terminate | Interrupt from keyboard (Ctrl+C) |
3 | SIGQUIT | CoreDump | Quit from keyboard (Ctrl+\) |
4 | SIGILL | CoreDump | Illegal instruction |
5 | SIGTRAP | CoreDump | Trace/breakpoint trap |
6 | SIGABRT | CoreDump | Abort signal |
7 | SIGBUS | CoreDump | Bus error |
8 | SIGFPE | CoreDump | Floating-point exception |
9 | SIGKILL | Terminate | Kill signal (cannot be caught or ignored) |
10 | SIGUSR1 | Terminate | User-defined signal 1 |
11 | SIGSEGV | CoreDump | Invalid memory reference |
12 | SIGUSR2 | Terminate | User-defined signal 2 |
13 | SIGPIPE | Terminate | Broken pipe |
14 | SIGALRM | Terminate | Timer signal |
15 | SIGTERM | Terminate | Termination signal |
16 | SIGSTKFLT | Terminate | Stack fault |
17 | SIGCHLD | Ignore | Child stopped or terminated |
18 | SIGCONT | Continue | Continue if stopped |
19 | SIGSTOP | Stop | Stop process (cannot be caught or ignored) |
20 | SIGTSTP | Stop | Stop typed at terminal (Ctrl+Z) |
21 | SIGTTIN | Stop | Terminal input for background process |
22 | SIGTTOU | Stop | Terminal output for background process |
23 | SIGURG | Ignore | Urgent condition on socket |
24 | SIGXCPU | CoreDump | CPU time limit exceeded |
25 | SIGXFSZ | CoreDump | File size limit exceeded |
26 | SIGVTALRM | Terminate | Virtual alarm clock |
27 | SIGPROF | Terminate | Profiling timer expired |
28 | SIGWINCH | Ignore | Window resize signal |
29 | SIGIO | Terminate | I/O now possible |
30 | SIGPWR | Terminate | Power failure |
31 | SIGSYS | CoreDump | Bad system call |
Sources: src/types.rs(L12 - L43) src/types.rs(L84 - L118)
Real-time Signals
Real-time signals (32-64) are numbered from `SIGRTMIN` (32) to `SIGRT32` (64) and are primarily for application-defined purposes. Unlike standard signals, real-time signals:
- Have no predefined meanings
- Default to the `Ignore` action
- Are queued (multiple instances of the same signal can be pending)
flowchart TD subgraph subGraph1["Default Actions"] Terminate["Terminate Process"] CoreDump["Terminate with Core Dump"] Ignore["Ignore Signal"] Stop["Stop Process"] Continue["Continue Process"] end subgraph subGraph0["Signal Number Range"] StandardSignals["Standard Signals (1-31)"] RealTimeSignals["Real-time Signals (32-64)"] end RealTimeSignals --> Ignore StandardSignals --> Continue StandardSignals --> CoreDump StandardSignals --> Ignore StandardSignals --> Stop StandardSignals --> Terminate
Sources: src/types.rs(L44 - L76) src/types.rs(L80 - L82) src/types.rs(L117 - L118)
Signo Implementation
The `Signo` enum provides two key methods:

1. `is_realtime()`: Determines if a signal is a real-time signal by checking if its value is greater than or equal to `SIGRTMIN` (32).

```rust
pub fn is_realtime(&self) -> bool {
    *self >= Signo::SIGRTMIN
}
```

2. `default_action()`: Returns the default action for a signal (as a `DefaultSignalAction` enum).

```rust
pub fn default_action(&self) -> DefaultSignalAction {
    match self {
        Signo::SIGHUP => DefaultSignalAction::Terminate,
        // ... other cases ...
        _ => DefaultSignalAction::Ignore, // For real-time signals
    }
}
```
Sources: src/types.rs(L79 - L120)
Signal Sets (SignalSet)
A `SignalSet` is a bit vector representation of a set of signals, compatible with the C `sigset_t` type. It provides an efficient way to represent and manipulate collections of signals.
Representation
The `SignalSet` is implemented as a transparent wrapper around a `u64`, where:
- Each bit position corresponds to a signal number minus 1
- Bit is set (1) if the signal is in the set
- Bit is clear (0) if the signal is not in the set
flowchart TD subgraph subGraph0["SignalSet Representation"] B1["Bit 0"] S1["SIGHUP (1)"] B2["Bit 1"] S2["SIGINT (2)"] B3["Bit 2"] S3["SIGQUIT (3)"] D["..."] DS["..."] B30["Bit 30"] S31["SIGSYS (31)"] B31["Bit 31"] S32["SIGRTMIN (32)"] B63["Bit 63"] S64["SIGRT32 (64)"] end B1 --> S1 B2 --> S2 B3 --> S3 B30 --> S31 B31 --> S32 B63 --> S64 D --> DS
Sources: src/types.rs(L122 - L126)
Operations
The `SignalSet` struct provides several operations for manipulating signal sets:

- Adding a signal: `add(&mut self, signal: Signo) -> bool`
  - Sets the bit corresponding to the signal
  - Returns true if the signal was not already in the set
- Removing a signal: `remove(&mut self, signal: Signo) -> bool`
  - Clears the bit corresponding to the signal
  - Returns true if the signal was in the set
- Checking for a signal: `has(&self, signal: Signo) -> bool`
  - Returns true if the bit corresponding to the signal is set
- Dequeueing a signal: `dequeue(&mut self, mask: &SignalSet) -> Option<Signo>`
  - Finds and removes the lowest-numbered signal that is both in the set and in the mask
  - Returns the removed signal, or `None` if no matching signal exists
- Bitwise operations: The struct implements `Not`, `BitOr`, `BitOrAssign`, `BitAnd`, and `BitAndAssign`
  - Allows combining and modifying signal sets using standard bit operations
flowchart TD subgraph subGraph1["SignalSet Operations"] A["Original Set"] B["Modified Set"] Result["Boolean Result"] Result2["Option"] subgraph Operations["Operations"] Add["add(SIGINT)"] Remove["remove(SIGHUP)"] Has["has(SIGTERM)"] Dequeue["dequeue(mask)"] BitwiseAnd["set1 & set2"] BitwiseOr["set1 | set2"] BitwiseNot["!set"] end end A --> Add A --> BitwiseAnd A --> BitwiseNot A --> BitwiseOr A --> Dequeue A --> Has A --> Remove Add --> B BitwiseAnd --> B BitwiseNot --> B BitwiseOr --> B Dequeue --> Result2 Has --> Result Remove --> B
Sources: src/types.rs(L126 - L166)
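The operations above can be sketched over a plain `u64`, using signal numbers as bare `u8` values instead of the `Signo` enum. This is an illustrative model of the described semantics (bit N-1 represents signal N; `dequeue` prefers the lowest signal number), not the crate's actual implementation.

```rust
// Minimal sketch of a sigset_t-style bit set, following the SignalSet
// semantics described above. Signal numbers are plain u8 (1..=64) here.
#[derive(Clone, Copy, Default, PartialEq, Eq, Debug)]
struct SignalSet(u64);

impl SignalSet {
    fn bit(signal: u8) -> u64 {
        1u64 << (signal - 1)
    }

    /// Returns true if the signal was not already present.
    fn add(&mut self, signal: u8) -> bool {
        let was_absent = self.0 & Self::bit(signal) == 0;
        self.0 |= Self::bit(signal);
        was_absent
    }

    /// Returns true if the signal was present.
    fn remove(&mut self, signal: u8) -> bool {
        let was_present = self.0 & Self::bit(signal) != 0;
        self.0 &= !Self::bit(signal);
        was_present
    }

    fn has(&self, signal: u8) -> bool {
        self.0 & Self::bit(signal) != 0
    }

    /// Removes and returns the lowest signal that is also present in `mask`.
    fn dequeue(&mut self, mask: &SignalSet) -> Option<u8> {
        let candidates = self.0 & mask.0;
        if candidates == 0 {
            return None;
        }
        // trailing_zeros gives the lowest set bit index, i.e. signal - 1.
        let signal = candidates.trailing_zeros() as u8 + 1;
        self.0 &= !Self::bit(signal);
        Some(signal)
    }
}

fn main() {
    let mut set = SignalSet::default();
    assert!(set.add(2));  // SIGINT
    assert!(set.add(15)); // SIGTERM
    let everything = SignalSet(!0);
    assert_eq!(set.dequeue(&everything), Some(2)); // lowest number first
    assert!(set.has(15) && !set.has(2));
    assert!(set.remove(15));
}
```

The `trailing_zeros` trick is why dequeueing naturally yields lower-numbered (higher-priority) signals first.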
C API Compatibility
The `SignalSet` includes methods for conversion to and from the C `kernel_sigset_t` type, ensuring compatibility with system calls and C libraries:
- `to_ctype(&self, dest: &mut kernel_sigset_t)`: Converts the `SignalSet` to a C `kernel_sigset_t`
- `From<kernel_sigset_t> for SignalSet`: Converts a C `kernel_sigset_t` to a `SignalSet`
flowchart TD subgraph subGraph1["C API"] kernel_sigset_t["kernel_sigset_t"] end subgraph subGraph0["Rust Code"] SignalSet["SignalSet (u64)"] end SignalSet --> kernel_sigset_t kernel_sigset_t --> SignalSet
Sources: src/types.rs(L169 - L181)
Usage in the Signal System
Signal numbers and sets form the foundation of the signal handling system in axsignal:
- Signal identification: `Signo` enumerates all possible signals that can be sent and received.
- Signal masking: `SignalSet` is used to represent blocked signals in `ThreadSignalManager`.
- Pending signals: `SignalSet` tracks which signals are pending in `PendingSignals`.
- Signal delivery control: `SignalSet` determines which signals can be dequeued during signal delivery.
flowchart TD subgraph subGraph1["Signal Management"] TSM["ThreadSignalManager"] PSM["ProcessSignalManager"] Pending["PendingSignals"] Action["SignalAction"] end subgraph subGraph0["Signal Numbers & Sets"] Signo["Signo"] SignalSet["SignalSet"] end SendSignal["send_signal(sig)"] TSM_Blocked["ThreadSignalManager::blocked"] Pending_Set["PendingSignals::set"] Dequeue["dequeue_signal(mask)"] Dequeue --> PSM Dequeue --> TSM Pending_Set --> Pending SendSignal --> PSM SendSignal --> TSM SignalSet --> Dequeue SignalSet --> Pending_Set SignalSet --> TSM_Blocked Signo --> Action Signo --> SendSignal TSM_Blocked --> TSM
Sources: src/types.rs(L9 - L182)
Summary
Signal numbers and sets are fundamental components of the axsignal crate:
- `Signo` provides a type-safe enumeration of all signal numbers, with additional functionality to determine signal characteristics and default actions.
- `SignalSet` provides an efficient, bit-based representation of signal collections with operations for adding, removing, checking, and dequeueing signals.
- Together, they form the foundation for signal identification, blocking, and delivery throughout the signal handling system.
These components follow Unix/POSIX signal conventions while providing Rust-specific advantages like type safety and clear semantics.
Signal Actions and Dispositions
Relevant source files
This document describes the signal action and disposition system in the `axsignal` crate, which determines how signals are handled when they are delivered to processes or threads. It covers the core data structures that represent signal handling behaviors and how they interact with the signal processing flow.
For information about the signal numbers and signal sets, see Signal Numbers and Sets. For details about how pending signals are queued, see Pending Signals.
Signal Disposition Types
The `SignalDisposition` enum defines what happens when a signal is received:
classDiagram class SignalDisposition { <<enum>> Default Ignore Handler(unsafe extern "C" fn(i32)) } class DefaultSignalAction { <<enum>> Terminate Ignore CoreDump Stop Continue } SignalDisposition "Default" --> DefaultSignalAction : maps to
- Default: Uses the predefined action for the signal (terminate, ignore, etc.)
- Ignore: The signal is completely ignored
- Handler: A custom function is called when the signal is delivered
When `Default` is selected, the actual behavior depends on the signal's default action as defined by the `DefaultSignalAction` enum.
Sources: src/action.rs(L15 - L31) src/action.rs(L73 - L82)
Signal Action Structure
The `SignalAction` structure represents the complete configuration for how a signal should be handled:
classDiagram class SignalAction { +SignalActionFlags flags +SignalSet mask +SignalDisposition disposition +__sigrestore_t restorer +to_ctype(kernel_sigaction) void } class SignalActionFlags { <<bitflags>> +SIGINFO +NODEFER +RESETHAND +RESTART +ONSTACK +RESTORER +from_bits(value) } class SignalDisposition { <<enum>> Default Ignore Handler(extern "C" fn) } class SignalSet { } SignalAction --> SignalActionFlags : contains SignalAction --> SignalDisposition : contains SignalAction --> SignalSet : contains
- flags: Bitflags that modify the behavior of signal handlers
- mask: Set of signals to block while the handler is running
- disposition: What to do with the signal (default, ignore, or handle)
- restorer: Function to restore context after signal handler returns
Sources: src/action.rs(L84 - L112)
Signal Action Flags
The `SignalActionFlags` bitflags control aspects of signal handling behavior:
Flag | Description |
---|---|
SIGINFO | Handler uses the SA_SIGINFO interface (3 arguments instead of 1) |
NODEFER | Don't block the signal when handling it |
RESETHAND | Reset to default action after handling the signal once |
RESTART | Automatically restart certain system calls interrupted by the signal |
ONSTACK | Use the alternate signal stack for the handler |
RESTORER | The `restorer` field in `SignalAction` is valid |
Sources: src/action.rs(L50 - L60)
OS Actions for Signal Handling
When a signal is delivered, the system must take one of several actions based on the signal disposition:
flowchart TD Signal["Signal Delivered"] Disposition["Check Signal Disposition"] DefaultAction["Check Default Action"] NoAction["No Action"] SetupHandler["Set Up Signal Handler"] TerminateProcess["OS: Terminate Process"] CoreDump["OS: Generate Core Dump"] StopProcess["OS: Stop Process"] ContinueProcess["OS: Continue Process"] ExecuteHandler["Execute Handler"] RestoreContext["Restore Context"] DefaultAction --> ContinueProcess DefaultAction --> CoreDump DefaultAction --> NoAction DefaultAction --> StopProcess DefaultAction --> TerminateProcess Disposition --> DefaultAction Disposition --> NoAction Disposition --> SetupHandler ExecuteHandler --> RestoreContext SetupHandler --> ExecuteHandler Signal --> Disposition
The `SignalOSAction` enum represents the actions that the OS should take after signal disposition is determined:
- Terminate: End the process
- CoreDump: Generate a core dump and terminate the process
- Stop: Suspend the process execution
- Continue: Resume a stopped process
- Handler: A handler function has been set up (no OS action needed)
Sources: src/action.rs(L36 - L48)
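The flow above can be sketched as a mapping from a signal's disposition (and its default action) to the OS-level outcome. The enum names mirror those described in this document, but the function itself is hypothetical; in the crate, this logic lives inside the signal managers.

```rust
// Illustrative mapping from disposition to OS action; None means the
// signal is swallowed and no OS action is needed.
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
enum DefaultSignalAction { Terminate, Ignore, CoreDump, Stop, Continue }

#[derive(Debug, PartialEq, Eq, Clone, Copy)]
enum SignalDisposition { Default, Ignore, Handler(fn(i32)) }

#[derive(Debug, PartialEq, Eq, Clone, Copy)]
enum SignalOSAction { Terminate, CoreDump, Stop, Continue, Handler }

fn os_action(
    disposition: SignalDisposition,
    default: DefaultSignalAction,
) -> Option<SignalOSAction> {
    match disposition {
        SignalDisposition::Ignore => None, // nothing for the OS to do
        SignalDisposition::Handler(_) => Some(SignalOSAction::Handler),
        SignalDisposition::Default => match default {
            DefaultSignalAction::Terminate => Some(SignalOSAction::Terminate),
            DefaultSignalAction::CoreDump => Some(SignalOSAction::CoreDump),
            DefaultSignalAction::Stop => Some(SignalOSAction::Stop),
            DefaultSignalAction::Continue => Some(SignalOSAction::Continue),
            DefaultSignalAction::Ignore => None,
        },
    }
}

fn main() {
    fn handler(_sig: i32) {}
    assert_eq!(os_action(SignalDisposition::Ignore, DefaultSignalAction::Terminate), None);
    assert_eq!(
        os_action(SignalDisposition::Default, DefaultSignalAction::CoreDump),
        Some(SignalOSAction::CoreDump)
    );
    assert_eq!(
        os_action(SignalDisposition::Handler(handler), DefaultSignalAction::Ignore),
        Some(SignalOSAction::Handler)
    );
}
```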
Signal Handler Execution Flow
When a signal with a custom handler is delivered, the system performs these steps:
sequenceDiagram participant ThreadSignalManager as "ThreadSignalManager" participant SignalStack as "Signal Stack" participant SignalFrame as "Signal Frame" participant SignalHandler as "Signal Handler" participant TrapFrame as "Trap Frame" ThreadSignalManager ->> ThreadSignalManager: handle_signal() ThreadSignalManager ->> SignalStack: Check if stack.disabled() || !ONSTACK flag alt Use current stack SignalStack -->> ThreadSignalManager: Use tf.sp() else Use alternate stack SignalStack -->> ThreadSignalManager: Use stack.sp end ThreadSignalManager ->> SignalFrame: Create new SignalFrame ThreadSignalManager ->> SignalFrame: Store UContext (saved state) ThreadSignalManager ->> SignalFrame: Store SignalInfo ThreadSignalManager ->> SignalFrame: Store original TrapFrame ThreadSignalManager ->> TrapFrame: Set IP to handler ThreadSignalManager ->> TrapFrame: Set SP to frame location ThreadSignalManager ->> TrapFrame: Set arguments (signo, siginfo, ucontext) ThreadSignalManager ->> TrapFrame: Set return address to restorer alt If RESETHAND flag alt set ThreadSignalManager ->> ThreadSignalManager: Reset signal action to default end end alt If !NODEFER flag ThreadSignalManager ->> ThreadSignalManager: Add signal to blocked set end ThreadSignalManager -->> SignalHandler: Return (signal handler will execute) SignalHandler -->> ThreadSignalManager: Handler returns to restorer ThreadSignalManager ->> ThreadSignalManager: restore() ThreadSignalManager ->> SignalFrame: Get original TrapFrame ThreadSignalManager ->> TrapFrame: Restore original context ThreadSignalManager ->> ThreadSignalManager: Restore original signal mask ThreadSignalManager -->> ThreadSignalManager: Continue execution
This diagram shows the complete lifecycle of signal handling, from determining the disposition to executing the handler and restoring the original context.
Sources: src/api/thread.rs(L50 - L117) src/api/thread.rs(L145 - L155)
Converting Between C and Rust Types
The `SignalAction` structure provides methods to convert to and from the Linux kernel's `kernel_sigaction` structure:
From Rust to C Type
The `to_ctype` method converts a `SignalAction` to a `kernel_sigaction`:
- Copies flags
- Converts the signal mask
- Sets the handler based on disposition
- Sets the restorer function if supported
From C to Rust Type
The `TryFrom<kernel_sigaction>` implementation converts a `kernel_sigaction` to a `SignalAction`:
- Validates flags
- Interprets the handler value (None for Default, 1 for Ignore, others as Handler)
- Extracts the signal mask
- Extracts the restorer function if supported
Sources: src/action.rs(L93 - L112) src/action.rs(L115 - L156)
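The handler-value encoding described above (a null handler means Default, the value 1 means Ignore, anything else is a handler address) follows the classic `SIG_DFL`/`SIG_IGN` convention and can be sketched as a round-trippable pair of functions. Addresses are modeled as `usize` for illustration; the crate stores an actual function pointer.

```rust
// Sketch of the sigaction handler-field encoding: 0 = SIG_DFL (Default),
// 1 = SIG_IGN (Ignore), anything else = handler address. Illustrative only.
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
enum SignalDisposition { Default, Ignore, Handler(usize) }

fn to_handler_value(d: SignalDisposition) -> usize {
    match d {
        SignalDisposition::Default => 0,        // SIG_DFL
        SignalDisposition::Ignore => 1,         // SIG_IGN
        SignalDisposition::Handler(addr) => addr,
    }
}

fn from_handler_value(raw: usize) -> SignalDisposition {
    match raw {
        0 => SignalDisposition::Default,
        1 => SignalDisposition::Ignore,
        addr => SignalDisposition::Handler(addr),
    }
}

fn main() {
    assert_eq!(from_handler_value(to_handler_value(SignalDisposition::Ignore)),
               SignalDisposition::Ignore);
    assert_eq!(from_handler_value(0), SignalDisposition::Default);
    assert_eq!(from_handler_value(0x4000), SignalDisposition::Handler(0x4000));
}
```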
Signal Handler Function Execution Context
When a signal handler executes, it receives:
- Signal number (`signo`) as the first argument
- Pointer to a `SignalInfo` structure as the second argument (if the `SIGINFO` flag is set)
- Pointer to a `UContext` structure as the third argument (if the `SIGINFO` flag is set)
The `UContext` contains:
- The machine context (`MContext`) with saved CPU registers
- The signal mask that was in effect before the handler was called
- Information about the signal stack
Sources: src/api/thread.rs(L14 - L18) src/api/thread.rs(L85 - L95)
Pending Signals
Relevant source files
Overview
This document describes the pending signals system in the axsignal crate, which manages signals that have been delivered but not yet processed by their handlers. The pending signals system is responsible for queuing signals, maintaining their associated information, and dequeuing them when they are ready to be handled.
For information about signal types and representations, see Signal Numbers and Sets. For details on the actions taken when signals are handled, see Signal Actions and Dispositions.
Sources: src/pending.rs(L1 - L66)
PendingSignals Structure
The core of the pending signals system is the `PendingSignals` structure, which manages two types of signals:
- Standard signals (1-31): At most one instance of each standard signal can be pending at any time.
- Real-time signals (32-64): Multiple instances of each real-time signal can be queued.
Data Structure Components
classDiagram class PendingSignals { +SignalSet set +Option~SignalInfo~[32] info_std +VecDeque~SignalInfo~[33] info_rt +new() +put_signal(SignalInfo) bool +dequeue_signal(SignalSet) Option~SignalInfo~ } class SignalSet { +u64 bits +add(Signo) bool +dequeue(SignalSet) Option~Signo~ } class SignalInfo { +Signo signo +int32_t si_code +union sigval si_value +pid_t si_pid +uid_t si_uid +... } PendingSignals --> SignalSet : contains PendingSignals --> SignalInfo : stores
The `PendingSignals` structure consists of:
- `set`: A bit field representing which signals are currently pending
- `info_std`: An array storing information for standard signals (indices 1-31)
- `info_rt`: An array of queues storing information for real-time signals (indices 32-64)
Sources: src/pending.rs(L8 - L21)
Signal Queuing Process
Adding Signals to the Queue
When a signal is sent to a process or thread, it's added to the pending queue using the `put_signal` method:
flowchart TD Start["put_signal(sig)"] GetSigno["Get signal number"] AddToSet["Add to SignalSet"] IsRT["Is real-time signal?"] QueueRT["Add to info_rt queue"] ReturnTrue["Return true"] AlreadyPending["Was signalalready pending?"] ReturnFalse["Return false"] SetSTD["Store in info_std array"] AddToSet --> IsRT AlreadyPending --> ReturnFalse AlreadyPending --> SetSTD GetSigno --> AddToSet IsRT --> AlreadyPending IsRT --> QueueRT QueueRT --> ReturnTrue SetSTD --> ReturnTrue Start --> GetSigno
Key points about signal queuing:
- Standard signals (1-31) will only be queued once, with repeated signals being ignored
- Real-time signals (32-64) are queued in order of arrival, with multiple instances allowed
- The `put_signal` method returns a boolean indicating whether the signal was added to the queue
Sources: src/pending.rs(L31 - L49) src/api/thread.rs(L157 - L163) src/api/process.rs(L64 - L70)
Signal Dequeuing Process
Retrieving Signals from the Queue
Signals are dequeued when they are ready to be handled, using the `dequeue_signal` method:
flowchart TD Start["dequeue_signal(mask)"] DequeueSet["Dequeue a signal number from set"] SignalFound["Signal found?"] ReturnNone["Return None"] IsRT["Is real-time signal?"] PopQueue["Pop from info_rt queue"] QueueEmpty["Queue empty?"] ResetBit["Reset bit in set"] Skip[""] ReturnRT["Return signal info"] TakeSTD["Take from info_std array"] ReturnSTD["Return signal info"] DequeueSet --> SignalFound IsRT --> PopQueue IsRT --> TakeSTD PopQueue --> QueueEmpty QueueEmpty --> ResetBit QueueEmpty --> Skip ResetBit --> ReturnRT SignalFound --> IsRT SignalFound --> ReturnNone Skip --> ReturnRT Start --> DequeueSet TakeSTD --> ReturnSTD
Key points about signal dequeuing:
- Signals are dequeued according to priority (lower signal numbers first)
- Only signals that match the provided mask are considered
- For real-time signals, the queue maintains signal delivery order
- After dequeuing, the signal is removed from the pending set unless more instances exist
Sources: src/pending.rs(L50 - L65)
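The two-tier queuing and dequeuing behavior described above can be modeled in a few dozen lines. This is a simplified sketch (signal numbers and payloads are plain `u8`, the mask is a bare `u64`), not the crate's implementation, but it reproduces the key rules: standard signals collapse to one pending instance, real-time signals queue every instance, and dequeuing prefers the lowest signal number.

```rust
use std::collections::VecDeque;

const SIGRTMIN: u8 = 32;

// Simplified two-tier pending queue: slots for standard signals (1-31),
// per-signal queues for real-time signals (32-64).
struct PendingSignals {
    set: u64,                   // bit N-1 set => signal N is pending
    info_std: [Option<u8>; 32], // one slot per standard signal
    info_rt: Vec<VecDeque<u8>>, // one queue per real-time signal
}

impl PendingSignals {
    fn new() -> Self {
        PendingSignals { set: 0, info_std: [None; 32], info_rt: vec![VecDeque::new(); 33] }
    }

    fn put_signal(&mut self, sig: u8) -> bool {
        let bit = 1u64 << (sig - 1);
        if sig >= SIGRTMIN {
            self.set |= bit;
            self.info_rt[(sig - SIGRTMIN) as usize].push_back(sig);
            true
        } else if self.set & bit != 0 {
            false // standard signal already pending: drop the new instance
        } else {
            self.set |= bit;
            self.info_std[sig as usize] = Some(sig);
            true
        }
    }

    fn dequeue_signal(&mut self, mask: u64) -> Option<u8> {
        let candidates = self.set & mask;
        if candidates == 0 {
            return None;
        }
        let sig = candidates.trailing_zeros() as u8 + 1; // lowest number first
        let bit = 1u64 << (sig - 1);
        if sig >= SIGRTMIN {
            let queue = &mut self.info_rt[(sig - SIGRTMIN) as usize];
            let info = queue.pop_front();
            if queue.is_empty() {
                self.set &= !bit; // only clear the bit once the queue drains
            }
            info
        } else {
            self.set &= !bit;
            self.info_std[sig as usize].take()
        }
    }
}

fn main() {
    let mut pending = PendingSignals::new();
    assert!(pending.put_signal(2));
    assert!(!pending.put_signal(2)); // standard: second instance dropped
    assert!(pending.put_signal(34));
    assert!(pending.put_signal(34)); // real-time: both instances queued
    assert_eq!(pending.dequeue_signal(!0), Some(2));
    assert_eq!(pending.dequeue_signal(!0), Some(34));
    assert_eq!(pending.dequeue_signal(!0), Some(34));
    assert_eq!(pending.dequeue_signal(!0), None);
}
```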
Hierarchy of Pending Signal Management
The pending signals system operates at two levels:
flowchart TD subgraph subGraph1["Thread Level"] ThreadManager["ThreadSignalManager"] ThreadPending["Thread PendingSignals"] ThreadBlocked["Thread Blocked SignalSet"] end subgraph subGraph0["Process Level"] ProcessManager["ProcessSignalManager"] ProcessPending["Process PendingSignals"] ProcessWaitQueue["WaitQueue"] end DequeueSignal["dequeue_signal()"] SendSignal["send_signal()"] DequeueSignal --> ThreadPending ProcessManager --> ProcessPending ProcessManager --> ProcessWaitQueue SendSignal --> ProcessPending SendSignal --> ThreadPending ThreadManager --> ProcessManager ThreadManager --> ThreadBlocked ThreadManager --> ThreadPending ThreadPending --> ProcessPending
Process-Level Pending Signals
The `ProcessSignalManager` maintains a process-wide pending signals queue that is shared among all threads in the process. Signals sent to the process are queued here.
Sources: src/api/process.rs(L33 - L35) src/api/process.rs(L60 - L62) src/api/process.rs(L64 - L70)
Thread-Level Pending Signals
Each `ThreadSignalManager` maintains its own pending signals queue for thread-specific signals. When checking for signals to handle, a thread will:
- First check its own pending queue
- Then check the process-level pending queue if no signals are found
This hierarchical approach allows for both process-wide and thread-specific signal delivery.
Sources: src/api/thread.rs(L22 - L26) src/api/thread.rs(L43 - L48) src/api/thread.rs(L157 - L163) src/api/thread.rs(L185 - L188)
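The thread-before-process lookup order can be sketched as a simple fallback. Queues are modeled here as plain `Vec<u8>` stacks of signal numbers purely for illustration; the real managers use the `PendingSignals` structure under locks.

```rust
// Sketch of the two-level dequeue order: the thread-local queue is
// consulted before falling back to the shared process queue.
struct ProcessSignals { pending: Vec<u8> }
struct ThreadSignals { pending: Vec<u8> }

fn dequeue(thread: &mut ThreadSignals, process: &mut ProcessSignals) -> Option<u8> {
    // Thread-specific signals take precedence...
    if let Some(sig) = thread.pending.pop() {
        return Some(sig);
    }
    // ...then fall back to process-wide signals.
    process.pending.pop()
}

fn main() {
    let mut process = ProcessSignals { pending: vec![15] };
    let mut thread = ThreadSignals { pending: vec![2] };
    assert_eq!(dequeue(&mut thread, &mut process), Some(2));  // thread first
    assert_eq!(dequeue(&mut thread, &mut process), Some(15)); // then process
    assert_eq!(dequeue(&mut thread, &mut process), None);
}
```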
Signal Handling Process
When the system checks for signals to handle, it combines the pending signals system with the blocked signals mask:
flowchart TD CheckSignals["check_signals()"] GetBlocked["Get blocked signals"] CreateMask["Create mask of unblocked signals"] Loop["Loop until no signals or handler found"] DequeueSignal["Dequeue signal from thread or process queue"] SignalFound["Signal found?"] ReturnNone["Return None"] GetAction["Get signal action"] HandleSignal["Handle signal based on action"] SignalHandled["Signal handled?"] ReturnAction["Return signal info and action"] CheckSignals --> GetBlocked CreateMask --> Loop DequeueSignal --> SignalFound GetAction --> HandleSignal GetBlocked --> CreateMask HandleSignal --> SignalHandled Loop --> DequeueSignal SignalFound --> GetAction SignalFound --> ReturnNone SignalHandled --> Loop SignalHandled --> ReturnAction
Key points about signal handling:
- Only unblocked signals are considered for handling
- Signals are handled in priority order (lower signal numbers first)
- Standard signals are processed before real-time signals
- The action taken depends on the signal's disposition (default, ignore, or handler)
Sources: src/api/thread.rs(L119 - L143)
Waiting for Signals
The signal system provides mechanisms to wait for signals, implemented through wait queues:
flowchart TD WaitTimeout["wait_timeout(set, timeout)"] CheckDequeue["Check if signal already pending"] Found["Signal found?"] ReturnSignal["Return signal"] SetupWait["Setup wait with timeout"] WaitLoop["Wait on process wait queue"] WakeUp["Woken up"] Timeout["Timed out?"] ReturnNone["Return None"] CheckAgain["Check for pending signal"] SignalFound["Signal found?"] ReturnFound["Return signal"] CheckAgain --> SignalFound CheckDequeue --> Found Found --> ReturnSignal Found --> SetupWait SetupWait --> WaitLoop SignalFound --> ReturnFound SignalFound --> WaitLoop Timeout --> CheckAgain Timeout --> ReturnNone WaitLoop --> WakeUp WaitTimeout --> CheckDequeue WakeUp --> Timeout
When waiting for signals:
- The thread first checks if any of the requested signals are already pending
- If not, it waits on the process wait queue
- When a signal arrives, the queue is notified and the thread wakes up
- The thread checks again for the requested signals
- If found, it returns; otherwise, it continues waiting until timeout
Sources: src/api/thread.rs(L190 - L239) src/api/process.rs(L76 - L81)
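The check-wait-recheck pattern above can be sketched with std's `Condvar` standing in for the kernel wait queue. `PendingSignals` and its fields are hypothetical names, not the crate's types; the point is the loop structure: check first, block with a deadline, and recheck on every wake-up.

```rust
use std::sync::{Condvar, Mutex};
use std::time::{Duration, Instant};

// Illustrative stand-in for a process's pending-signal queue plus its
// wait queue (here: Mutex + Condvar).
struct PendingSignals(Mutex<Vec<u8>>, Condvar);

impl PendingSignals {
    fn new() -> Self {
        Self(Mutex::new(Vec::new()), Condvar::new())
    }

    /// Deliver a signal: queue it and notify all waiters.
    fn send(&self, signo: u8) {
        self.0.lock().unwrap().push(signo);
        self.1.notify_all();
    }

    /// Wait until a signal in `set` is pending, or `timeout` elapses.
    fn wait_timeout(&self, set: u64, timeout: Duration) -> Option<u8> {
        let deadline = Instant::now() + timeout;
        let mut queue = self.0.lock().unwrap();
        loop {
            // First (and after every wake-up): is a requested signal pending?
            if let Some(i) = queue.iter().position(|&s| set & (1 << s) != 0) {
                return Some(queue.remove(i));
            }
            let now = Instant::now();
            if now >= deadline {
                return None; // timed out
            }
            // Otherwise block on the wait queue until notified or timed out.
            let (q, _) = self.1.wait_timeout(queue, deadline - now).unwrap();
            queue = q;
        }
    }
}
```

Spurious wake-ups are harmless: the loop simply rechecks the queue and goes back to sleep if nothing relevant arrived.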
Standard vs. Real-Time Signals Comparison
Feature | Standard Signals (1-31) | Real-Time Signals (32-64) |
---|---|---|
Storage | Single slot per signal number | Queue for each signal number |
Queuing | At most one instance pending | Multiple instances can be queued |
Overwriting | New signals overwrite older ones | Signals queued in arrival order |
Information | Minimal signal info stored | Full signal info preserved for each instance |
Typical Use | Common system signals (SIGINT, SIGTERM, etc.) | Application-specific signals with data |
Sources: src/pending.rs(L8 - L21) src/pending.rs(L31 - L49) src/pending.rs(L50 - L65)
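The storage difference in the table can be sketched as follows. The types are illustrative, not the crate's actual `PendingSignals` layout: one slot per standard signal (a newer instance overwrites an older pending one) versus a FIFO queue per real-time signal.

```rust
use std::collections::VecDeque;

// Hypothetical signal payload; the real crate stores richer signal info.
#[derive(Clone, Debug, PartialEq)]
struct SignalInfo {
    signo: u8,
    data: i32,
}

struct Pending {
    standard: [Option<SignalInfo>; 32],  // signals 1..=31: at most one instance
    realtime: Vec<VecDeque<SignalInfo>>, // signals 32..=64: queued in order
}

impl Pending {
    fn new() -> Self {
        Self {
            standard: std::array::from_fn(|_| None),
            realtime: (0..33).map(|_| VecDeque::new()).collect(),
        }
    }

    fn put(&mut self, info: SignalInfo) {
        let n = info.signo as usize;
        if n < 32 {
            // Standard: a newer instance overwrites any older pending one.
            self.standard[n] = Some(info);
        } else {
            // Real-time: every instance is kept, in arrival order.
            self.realtime[n - 32].push_back(info);
        }
    }

    fn get(&mut self, signo: u8) -> Option<SignalInfo> {
        let n = signo as usize;
        if n < 32 {
            self.standard[n].take()
        } else {
            self.realtime[n - 32].pop_front()
        }
    }
}
```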
Architecture Support
Relevant source files
This document covers the architecture-specific implementation layer of the axsignal
crate, which enables signal handling across multiple CPU architectures. The architecture support subsystem provides platform-specific code for handling CPU context during signal delivery and processing, allowing the signal handling system to work consistently across different hardware platforms.
For information about specific architecture implementations, see x86_64 Implementation, ARM64 Implementation, RISC-V Implementation, and LoongArch64 Implementation.
Architecture Abstraction Layer
The architecture support subsystem employs conditional compilation to select the appropriate implementation based on the target architecture. It provides a consistent interface to the rest of the signal handling system while handling architecture-specific details internally.
flowchart TD subgraph subGraph1["arch Module"] arch_mod["arch/mod.rs"] signal_trampoline["signal_trampoline()"] signal_addr["signal_trampoline_address()"] subgraph subGraph0["Architecture-Specific Implementations"] x86_64["x86_64.rs"] riscv["riscv.rs"] aarch64["aarch64.rs"] loongarch64["loongarch64.rs"] end end trampoline_extern["Arch-specific assembly implementation"] arch_mod --> aarch64 arch_mod --> loongarch64 arch_mod --> riscv arch_mod --> signal_addr arch_mod --> signal_trampoline arch_mod --> x86_64 signal_addr --> signal_trampoline signal_trampoline --> trampoline_extern
Diagram: Architecture Module Structure
Sources: src/arch/mod.rs(L1 - L25) src/lib.rs(L8 - L9)
The architecture abstraction layer is implemented using Rust's conditional compilation feature through the cfg_if
macro. Each supported architecture has its own implementation file that is selected at compile time based on the target architecture.
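The selection pattern can be sketched with plain `#[cfg]` attributes. The real crate uses the `cfg_if` macro over the separate files `x86_64.rs`, `riscv.rs`, `aarch64.rs`, and `loongarch64.rs`; here inline modules stand in for those files, and the fallback exists only so the sketch builds on any host.

```rust
// Compile-time architecture selection, sketched with inline modules.

#[cfg(target_arch = "x86_64")]
mod arch_impl {
    pub const ARCH: &str = "x86_64";
}

#[cfg(any(target_arch = "riscv32", target_arch = "riscv64"))]
mod arch_impl {
    pub const ARCH: &str = "riscv";
}

#[cfg(target_arch = "aarch64")]
mod arch_impl {
    pub const ARCH: &str = "aarch64";
}

#[cfg(target_arch = "loongarch64")]
mod arch_impl {
    pub const ARCH: &str = "loongarch64";
}

// Fallback so this illustration compiles everywhere (the real crate
// simply does not support other architectures).
#[cfg(not(any(
    target_arch = "x86_64",
    target_arch = "riscv32",
    target_arch = "riscv64",
    target_arch = "aarch64",
    target_arch = "loongarch64"
)))]
mod arch_impl {
    pub const ARCH: &str = "unsupported";
}

// The rest of the crate sees one uniform interface, whichever module won.
pub use arch_impl::ARCH;
```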
Common Architecture Interface
Every architecture-specific implementation must provide the following key components:
Component | Purpose |
---|---|
MContext | Machine context - architecture-specific CPU state |
UContext | User context - complete execution context including signal mask |
signal_trampoline | Assembly routine for calling signal handlers |
Context manipulation functions | Save/restore CPU state during signal handling |
Sources: src/arch/mod.rs(L19 - L25)
Signal Context Management
One of the crucial aspects of signal handling is saving and restoring the execution context. The architecture support layer defines two main structures for this purpose:
classDiagram class UContext { +MContext mcontext +SignalStack stack +SignalSet mask +usize flags } class MContext { +registers +program_counter +stack_pointer +other arch-specific state } class TrapFrame { +architecture-specific +register state } UContext "1" *-- "1" MContext : contains MContext "1" --> TrapFrame : converts to/from
Diagram: Signal Context Data Structures
When a signal is delivered to a process or thread, the current execution context must be saved to allow the signal handler to run. After the signal handler completes, the original context is restored. The architecture-specific implementation handles how CPU registers and other hardware state are saved and restored.
Signal Trampoline Mechanism
A critical component provided by the architecture layer is the signal trampoline:
sequenceDiagram participant KernelMode as "Kernel Mode" participant ThreadSignalManager as "ThreadSignalManager" participant signal_trampoline as "signal_trampoline" participant UserSignalHandler as "User Signal Handler" KernelMode ->> ThreadSignalManager: Trap/Exception ThreadSignalManager ->> ThreadSignalManager: check_signals() ThreadSignalManager ->> ThreadSignalManager: handle_signal() ThreadSignalManager ->> KernelMode: Save current context ThreadSignalManager ->> KernelMode: Set up stack for handler ThreadSignalManager ->> signal_trampoline: Jump to signal_trampoline signal_trampoline ->> UserSignalHandler: Call user handler UserSignalHandler ->> signal_trampoline: Return signal_trampoline ->> KernelMode: Call sigreturn syscall KernelMode ->> ThreadSignalManager: restore() ThreadSignalManager ->> KernelMode: Restore original context
Diagram: Signal Trampoline Flow
Sources: src/arch/mod.rs(L19 - L25)
The signal_trampoline
function is a small assembly routine that:
- Calls the user's signal handler with appropriate arguments
- After the handler returns, performs a
sigreturn
syscall to restore the original execution context
This function is critical because it bridges between the kernel's signal delivery mechanism and the user-space signal handler, ensuring proper setup and cleanup.
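A sketch of how `signal_trampoline_address()` can expose the trampoline's location: it simply returns the symbol's address so the signal manager can install it as the handler's return address. Here a plain Rust function stands in for the real page-aligned assembly routine.

```rust
// Stand-in for the assembly trampoline. Real versions load the
// sigreturn syscall number and trap into the kernel
// (e.g. `li a7, 139; ecall` on RISC-V).
extern "C" fn signal_trampoline() {}

/// Return the trampoline's address for use when setting up a handler.
pub fn signal_trampoline_address() -> usize {
    signal_trampoline as usize
}
```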
Build System Integration
The architecture support layer also interacts with the build system to enable or disable certain features based on the target architecture:
flowchart TD subgraph build.rs["build.rs"] target_detection["Detect Target Architecture"] sa_restorer_check["Check sa_restorer Support"] cfg_alias["Set Cargo Configuration"] end architecture_code["Architecture-specific Code"] cfg_alias --> architecture_code sa_restorer_check --> cfg_alias target_detection --> sa_restorer_check
Diagram: Build System Integration
Sources: build.rs(L1 - L25)
The build script (build.rs
) checks whether the target architecture supports the sa_restorer
feature, which is needed for proper signal handler return in some architectures. This configuration is used by the architecture-specific code to adapt its implementation.
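A build script with this behavior might be structured as below. The helper and its architecture list are assumptions for illustration, not taken from the crate's actual `build.rs`.

```rust
// Hypothetical helper a build.rs could use to decide whether the target
// supports sa_restorer; the architecture list here is an assumption.
fn supports_sa_restorer(arch: &str) -> bool {
    matches!(arch, "x86_64" | "powerpc" | "powerpc64")
}

// In build.rs's fn main():
//     let arch = std::env::var("CARGO_CFG_TARGET_ARCH").unwrap_or_default();
//     println!("cargo:rustc-check-cfg=cfg(sa_restorer)"); // declare the cfg
//     if supports_sa_restorer(&arch) {
//         println!("cargo:rustc-cfg=sa_restorer"); // enables #[cfg(sa_restorer)]
//     }
```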
Architecture-Specific Features
While all architectures implement the common interface, they differ in several important ways:
Feature | Variations Across Architectures |
---|---|
Register Set | Number and types of registers vary by architecture |
Context Size | x86_64 and ARM64 typically have more registers than RISC-V |
Signal Frame | Different memory layout for saved context |
Return Mechanism | Some use sa_restorer, others use direct jumps |
Stack Alignment | Requirements differ (e.g., 16-byte for x86_64) |
Sources: src/arch/mod.rs(L1 - L17) build.rs(L3 - L15)
Integration with Signal Managers
The architecture support layer integrates with the signal management system as follows:
flowchart TD subgraph subGraph1["Architecture Support Layer"] signal_trampoline["signal_trampoline"] save_context["Save Context Functions"] restore_context["Restore Context Functions"] end subgraph subGraph0["Thread Signal Manager"] check_signals["check_signals()"] handle_signal["handle_signal()"] restore["restore()"] end check_signals --> handle_signal handle_signal --> save_context handle_signal --> signal_trampoline restore --> restore_context
Diagram: Integration with Signal Management
When the ThreadSignalManager
needs to deliver a signal, it uses the architecture-specific functions to:
- Save the current execution context
- Set up the stack frame for the signal handler
- Jump to the architecture-specific
signal_trampoline
- Upon return from the signal handler, restore the original context
This design allows the higher-level signal management logic to remain architecture-independent while delegating platform-specific operations to the architecture support layer.
Summary
The architecture support subsystem provides a critical abstraction layer that enables the signal handling system to work consistently across different CPU architectures. By encapsulating architecture-specific details and providing a uniform interface, it allows the rest of the system to operate in an architecture-agnostic manner while still benefiting from hardware-specific optimizations.
Each architecture implementation provides specialized routines for:
- Context saving and restoration
- Signal trampoline implementation
- Conversion between trap frames and user contexts
- Stack management for signal handlers
This modular design makes it easier to add support for new architectures while maintaining compatibility with existing code.
Sources: src/arch/mod.rs(L1 - L25) src/lib.rs(L8 - L9) build.rs(L1 - L25)
x86_64 Implementation
Relevant source files
This page documents the x86_64-specific implementation of the signal handling mechanism in the axsignal crate. It covers the architecture-specific data structures, context management, and assembly code used for handling signals on the x86_64 architecture. For information about other architectures, see ARM64 Implementation, RISC-V Implementation, or LoongArch64 Implementation.
Overview
The x86_64 implementation provides architecture-specific components required for signal handling, including:
- The signal trampoline assembly code
- Machine context (MContext) for saving/restoring CPU registers
- User context (UContext) structure for the complete signal handling context
These components work together to allow saving the current execution state when a signal occurs, executing a signal handler, and then restoring the original state to resume normal execution.
Sources: src/arch/mod.rs(L1 - L26) src/arch/x86_64.rs(L1 - L4)
Signal Trampoline
The signal trampoline is a small assembly function that serves as the return mechanism after a signal handler completes execution. It's designed to be a fixed-address function that can be reliably used by the signal handling system.
flowchart TD A["Signal Handler"] B["signal_trampoline"] C["syscall(15)"] D["Return to Original Execution"] A --> B B --> C C --> D
Implementation Details
The signal trampoline is implemented in assembly and placed in its own 4KB-aligned section:
- It executes syscall 15 (0xF), which is designated for signal return
- The assembly code is padded to occupy a full 4KB page
The trampoline's address is exposed through the signal_trampoline_address()
function, allowing the signal handling system to set up the return address for signal handlers.
Sources: src/arch/mod.rs(L19 - L25) src/arch/x86_64.rs(L5 - L17)
Machine Context (MContext)
The MContext
structure represents the complete CPU register state for x86_64 architecture. This structure is crucial for:
- Saving the processor state when a signal is delivered
- Restoring the processor state when returning from a signal handler
Structure Layout
The MContext
structure contains all general-purpose registers, instruction pointer, stack pointer, flags, segment registers, and other CPU state information:
classDiagram class MContext { +usize r8 +usize r9 +usize r10 +usize r11 +usize r12 +usize r13 +usize r14 +usize r15 +usize rdi +usize rsi +usize rbp +usize rbx +usize rdx +usize rax +usize rcx +usize rsp +usize rip +usize eflags +u16 cs +u16 gs +u16 fs +u16 _pad +usize err +usize trapno +usize oldmask +usize cr2 +usize fpstate +[usize; 8] _reserved1 +new(tf: &TrapFrame) +restore(&self, tf: &mut TrapFrame) }
Conversion Methods
The MContext
structure provides methods to convert between the trap frame format and the machine context format:
new()
: Creates a newMContext
by copying register values from aTrapFrame
restore()
: Updates aTrapFrame
with register values from theMContext
These methods enable seamless conversion between the kernel's internal representation of CPU state (TrapFrame) and the architecture-specific representation used for signal handling (MContext).
Sources: src/arch/x86_64.rs(L19 - L109)
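A minimal sketch of the `TrapFrame` to `MContext` conversion, reduced to a handful of registers for brevity; the real structures carry the full register set shown above, and these field subsets are assumptions for illustration.

```rust
// Reduced register set; the real x86_64 structures hold all registers.
#[derive(Default, Clone, Copy, Debug, PartialEq)]
struct TrapFrame {
    rax: usize,
    rsp: usize,
    rip: usize,
    rflags: usize,
}

#[derive(Default, Clone, Copy, Debug, PartialEq)]
struct MContext {
    rax: usize,
    rsp: usize,
    rip: usize,
    eflags: usize,
}

impl MContext {
    /// Capture the CPU state saved in the trap frame.
    fn new(tf: &TrapFrame) -> Self {
        Self { rax: tf.rax, rsp: tf.rsp, rip: tf.rip, eflags: tf.rflags }
    }

    /// Write the saved state back so execution resumes where it stopped.
    fn restore(&self, tf: &mut TrapFrame) {
        tf.rax = self.rax;
        tf.rsp = self.rsp;
        tf.rip = self.rip;
        tf.rflags = self.eflags;
    }
}
```

A round trip through `new` and `restore` must leave the trap frame unchanged, which is exactly what makes signal delivery transparent to the interrupted code.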
User Context (UContext)
The UContext
structure combines the machine context with additional information needed for signal handling, providing a complete context for signal handlers.
Structure Layout
classDiagram class UContext { +usize flags +usize link +SignalStack stack +MContext mcontext +SignalSet sigmask +new(tf: &TrapFrame, sigmask: SignalSet) } class SignalStack { // Signal stack information } class SignalSet { // Signal mask information } class MContext { // Machine context(register state) } UContext --> MContext : contains UContext --> SignalStack : contains UContext --> SignalSet : contains
The UContext
structure includes:
flags
: Used for various control flagslink
: Pointer to linked context (for nested signals)stack
: Information about the signal stackmcontext
: The machine context (CPU registers)sigmask
: The signal mask to be applied during handler execution
Context Creation
The UContext::new()
method creates a new user context from a trap frame and signal mask:
- It initializes the flags and link fields to zero
- Sets up a default signal stack
- Creates a new machine context from the provided trap frame
- Stores the provided signal mask
This combined context provides all the information a signal handler needs to execute properly and allows for correct state restoration afterward.
Sources: src/arch/x86_64.rs(L111 - L131)
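The construction steps above can be sketched as follows. The field types are simplified stand-ins for the crate's `SignalStack`, `SignalSet`, and `MContext`, chosen only to show the initialization pattern.

```rust
// Simplified stand-ins for the crate's real types.
#[derive(Default, Clone, Copy, Debug, PartialEq)]
struct SignalStack {
    sp: usize,
    size: usize,
}

type SignalSet = u64;

#[derive(Default, Clone, Copy, Debug, PartialEq)]
struct MContext {
    rip: usize,
    rsp: usize,
}

#[derive(Default, Clone, Copy, Debug, PartialEq)]
struct TrapFrame {
    rip: usize,
    rsp: usize,
}

struct UContext {
    flags: usize,
    link: usize,
    stack: SignalStack,
    mcontext: MContext,
    sigmask: SignalSet,
}

impl UContext {
    fn new(tf: &TrapFrame, sigmask: SignalSet) -> Self {
        Self {
            flags: 0,                      // initialized to zero
            link: 0,                       // no linked (nested) context
            stack: SignalStack::default(), // default signal stack
            mcontext: MContext { rip: tf.rip, rsp: tf.rsp },
            sigmask,                       // mask applied during the handler
        }
    }
}
```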
Signal Handling Flow on x86_64
The following diagram illustrates the complete flow of signal handling on x86_64, from signal delivery to handler execution and context restoration.
sequenceDiagram participant KernelExecution as "Kernel Execution" participant ThreadSignalManager as "ThreadSignalManager" participant MContext as "MContext" participant UContext as "UContext" participant SignalHandler as "Signal Handler" participant signal_trampoline as "signal_trampoline" KernelExecution ->> ThreadSignalManager: Trap occurs (signal generated) ThreadSignalManager ->> MContext: Save current context ThreadSignalManager ->> MContext: MContext::new(trap_frame) MContext ->> UContext: Create user context MContext ->> UContext: UContext::new(trap_frame, sigmask) ThreadSignalManager ->> KernelExecution: Modify trap frame to point to handler KernelExecution ->> SignalHandler: Resume execution (now in handler) Note over SignalHandler: Handler executes SignalHandler ->> signal_trampoline: Return when complete signal_trampoline ->> KernelExecution: syscall(15) - Signal return KernelExecution ->> ThreadSignalManager: Handle signal return ThreadSignalManager ->> UContext: Retrieve saved context UContext ->> MContext: Extract machine context MContext ->> KernelExecution: Restore trap frame MContext ->> KernelExecution: mcontext.restore(trap_frame) KernelExecution ->> KernelExecution: Resume original execution
The key steps in this process are:
- When a signal is delivered, the current CPU state is saved into an
MContext
- A full
UContext
is created, including the machine context, signal mask, and stack info - The trap frame is modified to point to the signal handler
- When the signal handler returns, it goes to
signal_trampoline
- The trampoline executes syscall 15 to return to the kernel
- The saved context is restored, and normal execution resumes
This architecture-specific implementation ensures that signals can be properly handled on x86_64 systems without corrupting the execution state of the process.
Sources: src/arch/x86_64.rs(L1 - L131)
Register State Mapping
The following table shows how register state is mapped between the TrapFrame
and MContext
structures:
TrapFrame Field | MContext Field | Description |
---|---|---|
r8 | r8 | General purpose register R8 |
r9 | r9 | General purpose register R9 |
r10 | r10 | General purpose register R10 |
r11 | r11 | General purpose register R11 |
r12 | r12 | General purpose register R12 |
r13 | r13 | General purpose register R13 |
r14 | r14 | General purpose register R14 |
r15 | r15 | General purpose register R15 |
rdi | rdi | First function argument register |
rsi | rsi | Second function argument register |
rbp | rbp | Base pointer register |
rbx | rbx | General purpose register (callee saved) |
rdx | rdx | Third function argument register |
rax | rax | Return value register |
rcx | rcx | Fourth function argument register |
rsp | rsp | Stack pointer register |
rip | rip | Instruction pointer register |
rflags | eflags | CPU flags register |
cs | cs | Code segment register |
error_code | err | Error code from exception |
vector | trapno | Interrupt/exception vector number |
This mapping ensures that all necessary register state is preserved during signal handling.
Sources: src/arch/x86_64.rs(L53 - L108)
Integration with Signal Management System
The x86_64 implementation integrates with the broader signal management system through the following mechanisms:
flowchart TD subgraph subGraph2["Architecture Interface"] AM["arch/mod.rs"] STA["signal_trampoline_address()"] end subgraph subGraph1["x86_64 Implementation"] MCT["MContext"] UCT["UContext"] ST["signal_trampoline"] end subgraph subGraph0["Signal Management System"] TSM["ThreadSignalManager"] PSM["ProcessSignalManager"] end AM --> MCT AM --> STA AM --> UCT MCT --> UCT ST --> STA STA --> TSM TSM --> MCT UCT --> MCT
Key integration points:
- The
signal_trampoline_address()
function exposes the address of the architecture-specific trampoline - The
MContext
andUContext
structures are used by theThreadSignalManager
to save and restore execution context - The architecture module (
arch/mod.rs
) selects and exports the appropriate implementation based on the target architecture
This modular design allows the signal management system to work consistently across different architectures while handling the architecture-specific details appropriately.
Sources: src/arch/mod.rs(L1 - L26) src/arch/x86_64.rs(L1 - L131)
ARM64 Implementation
Relevant source files
This document describes the ARM64 (AArch64) architecture-specific implementation of the signal handling system in ArceOS. It details how signal context management, trampolines, and architecture-specific data structures are implemented for ARM64 processors. For information about other architecture implementations, see x86_64 Implementation, RISC-V Implementation, or LoongArch64 Implementation.
Overview
The ARM64 implementation provides the architecture-specific components needed for signal handling, including:
- A signal trampoline for transferring control to user signal handlers
- Context management structures for saving and restoring CPU state
- Context conversion utilities between trap frames and signal contexts
flowchart TD subgraph subGraph1["Key Functions"] save["Context Saving"] restore["Context Restoration"] syscall["rt_sigreturn Syscall"] end subgraph subGraph0["ARM64 Signal Implementation"] trampoline["signal_trampoline()"] mcontext["MContext"] ucontext["UContext"] end mcontext --> restore mcontext --> save restore --> mcontext save --> ucontext trampoline --> syscall ucontext --> mcontext
Sources: src/arch/aarch64.rs src/arch/mod.rs
Signal Trampoline
The signal trampoline is a small piece of assembly code that serves as the return path from signal handlers. When a signal handler completes execution, the trampoline is called to restore the original execution context and return to the interrupted code.
The ARM64 signal trampoline is implemented as:
- A page-aligned assembly function that makes syscall 139 (typically
rt_sigreturn
in Unix-like systems) - The function is padded to fill an entire 4096-byte page
The implementation in assembly is:
signal_trampoline:
    mov x8, #139    // Load syscall number 139 into x8 register
    svc #0          // Trigger supervisor call (system call)
This trampoline is accessed via the signal_trampoline_address()
function, which returns its memory address for use during signal handler setup.
Sources: src/arch/aarch64.rs(L5 - L16) src/arch/mod.rs(L19 - L25)
Machine Context (MContext)
The MContext
structure is responsible for storing the complete CPU state necessary to restore execution after signal handling. It captures all registers and processor state flags.
classDiagram class MContext { +u64 fault_address +u64[31] regs +u64 sp +u64 pc +u64 pstate +MContextPadding __reserved +new(TrapFrame) MContext +restore(TrapFrame) } class MContextPadding { +u8[4096] 0 } MContext --> MContextPadding
The MContext
structure:
- Is 16-byte aligned for optimal performance on ARM64
- Contains all 31 general-purpose registers (x0-x30)
- Stores critical CPU state including stack pointer, program counter, and processor state
- Includes a large reserved padding area
- Provides methods to create from and restore to a trap frame
This structure effectively captures the entire execution state that must be preserved during signal handling.
Sources: src/arch/aarch64.rs(L18 - L51)
User Context (UContext)
The UContext
structure provides a higher-level abstraction that combines the machine context with additional signal-related information. This matches the structure expected by user-level signal handlers.
classDiagram class UContext { +usize flags +usize link +SignalStack stack +SignalSet sigmask +u8[] __unused +MContext mcontext +new(TrapFrame, SignalSet) UContext } class MContext { +u64 fault_address +u64[31] regs +u64 sp +u64 pc +u64 pstate +padding } class SignalStack { +stack attributes } class SignalSet { +signal mask bits } UContext --> MContext UContext --> SignalStack UContext --> SignalSet
The UContext
structure includes:
- Flags for context management
- A link field that can point to another context
- A
SignalStack
for defining the stack used during signal handling - A
SignalSet
representing the signal mask during handler execution - Reserved space to ensure proper sizing and alignment
- The
MContext
containing all CPU registers and state
During signal handling, this structure is used to:
- Save the current execution context before calling the handler
- Configure the signal environment for the handler execution
- Restore the original context when the handler completes
Sources: src/arch/aarch64.rs(L53 - L75)
Context Management Flow
The following diagram illustrates how the ARM64 implementation manages context during signal handling:
sequenceDiagram participant UserProcess as "User Process" participant KernelArceOS as "Kernel/ArceOS" participant SignalHandler as "Signal Handler" participant signal_trampoline as "signal_trampoline" UserProcess ->> KernelArceOS: Normal Execution KernelArceOS ->> KernelArceOS: Signal Received KernelArceOS ->> KernelArceOS: Create MContext from TrapFrame KernelArceOS ->> KernelArceOS: Create UContext with MContext KernelArceOS ->> SignalHandler: Set up and jump to handler with UContext SignalHandler ->> SignalHandler: Handle signal SignalHandler ->> signal_trampoline: Return via trampoline signal_trampoline ->> KernelArceOS: syscall rt_sigreturn (139) KernelArceOS ->> KernelArceOS: Extract MContext from UContext KernelArceOS ->> KernelArceOS: Restore TrapFrame from MContext KernelArceOS ->> UserProcess: Resume original execution
When a signal is delivered:
- The current CPU state is captured in a
TrapFrame
- This state is converted to an
MContext
- An
UContext
is built including theMContext
and signal information - The signal handler is called with this context
- When the handler returns, the signal trampoline is executed
- The syscall in the trampoline triggers the kernel to restore the original context
- Regular execution continues from where it was interrupted
Sources: src/arch/aarch64.rs(L34 - L45) src/arch/aarch64.rs(L45 - L50) src/arch/aarch64.rs(L65 - L74)
Context Conversion Process
The ARM64 implementation provides efficient methods for converting between trap frames and contexts:
Creation Process
When creating an MContext
from a TrapFrame
, the following fields are mapped:
- General registers (r0-r30) are copied directly
- The user stack pointer (usp) becomes the stack pointer (sp)
- The exception link register (elr) becomes the program counter (pc)
- The saved program status register (spsr) becomes the processor state (pstate)
Restoration Process
When restoring a TrapFrame
from an MContext
, the reverse mappings occur:
- General registers are copied back
- The stack pointer is restored to usp
- The program counter is restored to elr
- The processor state is restored to spsr
This bidirectional conversion ensures that execution context is properly preserved during signal handling.
Sources: src/arch/aarch64.rs(L34 - L45) src/arch/aarch64.rs(L45 - L50)
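The mapping described above (usp to sp, elr to pc, spsr to pstate) can be sketched as follows; the field names mirror the description, but the structures are reduced illustrations, not the crate's exact definitions.

```rust
// Reduced aarch64 trap frame: 31 general registers plus the three
// special fields named in the mapping above.
#[derive(Clone, Copy, Debug, PartialEq)]
struct TrapFrame {
    r: [u64; 31],
    usp: u64,  // user stack pointer
    elr: u64,  // exception link register
    spsr: u64, // saved program status register
}

#[derive(Clone, Copy)]
struct MContext {
    regs: [u64; 31],
    sp: u64,
    pc: u64,
    pstate: u64,
}

impl MContext {
    fn new(tf: &TrapFrame) -> Self {
        Self { regs: tf.r, sp: tf.usp, pc: tf.elr, pstate: tf.spsr }
    }

    fn restore(&self, tf: &mut TrapFrame) {
        tf.r = self.regs;
        tf.usp = self.sp;
        tf.elr = self.pc;
        tf.spsr = self.pstate;
    }
}
```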
Integration with Signal Handling System
The ARM64 implementation integrates with the overall signal handling system through the architecture abstraction layer defined in arch/mod.rs
. This layer provides a unified interface for all supported architectures while allowing architecture-specific implementations of critical components.
flowchart TD subgraph subGraph2["Signal Handling System"] thread_manager["ThreadSignalManager"] process_manager["ProcessSignalManager"] end subgraph subGraph1["ARM64 Implementation"] aarch64["aarch64.rs"] mcontext["MContext"] ucontext["UContext"] trampoline["signal_trampoline"] end subgraph subGraph0["Architecture Module"] arch_mod["arch/mod.rs"] trampoline_addr["signal_trampoline_address()"] end aarch64 --> mcontext aarch64 --> trampoline aarch64 --> ucontext arch_mod --> aarch64 process_manager --> arch_mod thread_manager --> arch_mod trampoline_addr --> trampoline
The key integration points are:
- The architecture module exposes the
signal_trampoline_address()
function - The signal handling system uses this function to set up signal handlers
- The
MContext
andUContext
structures are used to manage execution state - The architecture-specific context conversion methods are used during signal delivery and return
This abstraction allows the core signal handling logic to remain architecture-agnostic while leveraging the ARM64-specific implementation for context management.
Sources: src/arch/mod.rs(L1 - L17) src/arch/mod.rs(L19 - L25)
Summary
The ARM64 implementation provides the architecture-specific components required for signal handling on AArch64 processors:
- Signal Trampoline: A carefully positioned assembly function that makes the
rt_sigreturn
syscall - Machine Context (MContext): A structure capturing all ARM64 CPU registers and state
- User Context (UContext): A higher-level structure combining machine context with signal information
- Context Management Methods: Functions to convert between trap frames and contexts
These components work together to ensure that signal handling can properly save and restore execution state on ARM64 platforms.
Sources: src/arch/aarch64.rs src/arch/mod.rs
RISC-V Implementation
Relevant source files
This document details the RISC-V architecture-specific implementation of signal handling in the axsignal
crate. It covers the signal trampoline mechanism, context saving/restoring operations, and the data structures specific to RISC-V processors. For information about other architectures, see the corresponding implementation pages: x86_64 Implementation, ARM64 Implementation, and LoongArch64 Implementation.
RISC-V Signal Handling Architecture
The RISC-V signal handling implementation provides the architecture-specific components needed to save CPU state before executing a signal handler and to restore that state afterward. It consists of two main components:
- A signal trampoline implementation in assembly language
- Data structures for storing CPU context
The implementation supports both 32-bit (riscv32) and 64-bit (riscv64) RISC-V architectures through a unified module.
flowchart TD subgraph subGraph1["Integration Points"] TSM["ThreadSignalManager"] TF["TrapFrame"] end subgraph subGraph0["Signal Handling System"] ARCH["arch/mod.rs"] RISCV["arch/riscv.rs"] TRAMP["signal_trampoline"] MCTX["MContext"] UCTX["UContext"] end ARCH --> RISCV MCTX --> TF MCTX --> UCTX RISCV --> MCTX RISCV --> TRAMP RISCV --> UCTX TF --> MCTX TSM --> TRAMP TSM --> UCTX
Sources: src/arch/mod.rs(L1 - L26) src/arch/riscv.rs(L1 - L64)
Signal Trampoline Implementation
The signal trampoline is a small piece of assembly code that serves as the bridge between signal handler execution and returning to normal execution. In RISC-V, it's implemented as a simple syscall wrapper that invokes syscall number 139 (sigreturn).
flowchart TD subgraph subGraph0["Signal Trampoline Flow"] SH["Signal Handler"] ST["signal_trampoline"] SC["Syscall 139 (sigreturn)"] KR["Kernel Return Processing"] RT["Return to Normal Execution"] end KR --> RT SC --> KR SH --> ST ST --> SC
The trampoline is defined in assembly and aligned to a 4096-byte page boundary:
flowchart TD ASM["Assembly Code"] TRAM["signal_trampoline"] ECALL["syscall 139 (sigreturn)"] ASM --> TRAM TRAM --> ECALL
Sources: src/arch/riscv.rs(L5 - L16) src/arch/mod.rs(L19 - L25)
Context Data Structures
The RISC-V implementation defines two key structures for context management:
MContext Structure
MContext
stores the essential machine context that needs to be saved and restored during signal handling.
classDiagram class MContext { +usize pc -GeneralRegisters regs -usize[66] fpstate +new(TrapFrame) MContext +restore(TrapFrame) void } class TrapFrame { +usize sepc +GeneralRegisters regs } MContext --> TrapFrame : converts from/to
The structure contains:
pc
: Program counter (stored assepc
in the trap frame)regs
: General-purpose registers from theGeneralRegisters
structurefpstate
: Floating-point state (66 words of storage)
UContext Structure
UContext
is a higher-level structure that encapsulates MContext
along with additional signal-related information.
The structure contains:
flags
: Context flags (not currently used, set to 0)link
: Link to another context (not currently used, set to 0)stack
: Signal stack information (typeSignalStack
)sigmask
: Signal mask (typeSignalSet
)__unused
: Padding to ensure proper structure alignmentmcontext
: The machine context described above
Sources: src/arch/riscv.rs(L18 - L63)
Context Operations
The RISC-V implementation provides two primary operations on context:
- Context Creation: Converting from a trap frame to an
MContext
/UContext
- Context Restoration: Restoring a trap frame from an
MContext
Context Creation
When a signal is delivered, the current CPU state (represented by a TrapFrame
) is saved into an MContext
and then into a UContext
.
sequenceDiagram participant ThreadSignalManager as ThreadSignalManager participant TrapFrame as TrapFrame participant MContext as MContext participant UContext as UContext ThreadSignalManager ->> UContext: new(tf, sigmask) UContext ->> MContext: new(tf) MContext ->> TrapFrame: read sepc to pc MContext ->> TrapFrame: copy regs MContext ->> MContext: initialize fpstate to zeros UContext ->> UContext: initialize other fields
Context Restoration
After signal handler execution, the saved context is restored to continue normal execution.
sequenceDiagram participant ThreadSignalManager as ThreadSignalManager participant TrapFrame as TrapFrame participant MContext as MContext ThreadSignalManager ->> MContext: restore(tf) MContext ->> TrapFrame: write pc to sepc MContext ->> TrapFrame: copy regs
Sources: src/arch/riscv.rs(L27 - L38) src/arch/riscv.rs(L53 - L62)
Integration with Signal Handling System
The RISC-V implementation integrates with the rest of the signal handling system through the architecture abstraction layer defined in arch/mod.rs
. This layer selects the appropriate architecture-specific implementation at compile time based on the target architecture.
flowchart TD subgraph subGraph1["Signal Processing"] TSM["ThreadSignalManager"] TADDR["signal_trampoline_address()"] TRAMP["signal_trampoline"] SH["Signal Handler"] end subgraph subGraph0["Architecture Selection"] ARCH["arch/mod.rs"] X86["x86_64.rs"] RISCV["riscv.rs"] ARM["aarch64.rs"] LOONG["loongarch64.rs"] end ARCH --> ARM ARCH --> LOONG ARCH --> RISCV ARCH --> X86 SH --> TRAMP TADDR --> TRAMP TSM --> SH TSM --> TADDR
Key integration points:
- The signal_trampoline_address() function provides the address of the architecture-specific trampoline implementation
- ThreadSignalManager uses the context structures to save and restore CPU state
Sources: src/arch/mod.rs(L1 - L26)
Technical Details
Signal Trampoline Memory Layout
The signal trampoline is carefully aligned to a 4096-byte page boundary and padded to fill an entire page. This is important for security and memory protection:
.section .text
.balign 4096
.global signal_trampoline
signal_trampoline:
li a7, 139 # Load syscall number 139 (sigreturn) into a7
ecall # Execute syscall
.fill 4096 - (. - signal_trampoline), 1, 0 # Fill remainder of page with zeros
The trampoline simply loads the rt_sigreturn syscall number (139) into register a7 and executes the ecall instruction.
RISC-V Register Handling
The MContext structure saves the program counter (PC) separately from the general registers. During restoration:
- The program counter is restored to the sepc (Supervisor Exception Program Counter) field of the trap frame
- The general registers are copied directly between the trap frame and MContext
The floating-point state (fpstate) is currently initialized to zeros but provides space for future implementations to save floating-point registers.
Sources: src/arch/riscv.rs(L5 - L16) src/arch/riscv.rs(L27 - L38)
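The creation and restoration steps described above can be sketched in Rust. This is a hedged model: the field names (regs, sepc, pc, fpstate) and the fpstate size are assumptions based on the description, not the crate's exact definitions.

```rust
// Illustrative sketch of the RISC-V context conversion described above.
// Field names and the fpstate size are assumptions, not the crate's
// exact layout.
#[allow(dead_code)]
pub struct TrapFrame {
    pub regs: [usize; 32], // general-purpose registers x0..x31
    pub sepc: usize,       // Supervisor Exception Program Counter
}

#[allow(dead_code)]
pub struct MContext {
    pc: usize,
    regs: [usize; 32],
    fpstate: [usize; 66], // reserved; currently zeroed
}

impl MContext {
    pub fn new(tf: &TrapFrame) -> Self {
        Self {
            pc: tf.sepc,      // read sepc into pc
            regs: tf.regs,    // copy all 32 general registers
            fpstate: [0; 66], // floating-point state initialized to zeros
        }
    }

    pub fn restore(&self, tf: &mut TrapFrame) {
        tf.sepc = self.pc; // write pc back to sepc
        tf.regs = self.regs;
    }
}
```

A round trip through new() and restore() leaves the trap frame's sepc and registers unchanged, which is exactly the property signal return depends on.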
Summary
The RISC-V implementation in the axsignal crate provides the architecture-specific components needed for signal handling on RISC-V processors. It defines the data structures for saving and restoring CPU context (MContext and UContext) and implements the signal trampoline needed to return from signal handlers. The implementation supports both 32-bit and 64-bit RISC-V architectures through a single module.
The architecture-specific implementation is selected at compile time based on the target architecture, ensuring that the appropriate code is used without runtime overhead.
Sources: src/arch/mod.rs(L1 - L26) src/arch/riscv.rs(L1 - L64)
LoongArch64 Implementation
Relevant source files
Purpose and Scope
This document describes the LoongArch64-specific implementation of signal handling in the axsignal crate. It details the architecture-specific data structures, register context management, and signal trampoline mechanism that enable Unix-like signal handling on the LoongArch64 architecture. For a general overview of the architecture support system, see Architecture Support.
Signal Context Management
The LoongArch64 implementation provides specialized structures for managing CPU context during signal handling operations. These structures are critical for preserving and restoring the CPU state when a signal handler is invoked and when it returns.
Context Structures
classDiagram class TrapFrame { +regs: [u64; 32] +era: usize +Other architecture-specific registers } class MContext { +sc_pc: u64 +sc_regs: [u64; 32] +sc_flags: u32 +new(tf: &TrapFrame) +restore(&self, tf: &mut TrapFrame) } class UContext { +flags: usize +link: usize +stack: SignalStack +sigmask: SignalSet +__unused: [u8; ...] +mcontext: MContext +new(tf: &TrapFrame, sigmask: SignalSet) } class SignalSet { } class SignalStack { } TrapFrame --> MContext : converted to MContext --> UContext : contained in SignalSet --> UContext : contained in SignalStack --> UContext : contained in
The LoongArch64 implementation defines two main context structures:
- MContext (Machine Context): Stores the CPU register state for LoongArch64
  - sc_pc: Program counter (instruction pointer)
  - sc_regs: Array of 32 general-purpose registers
  - sc_flags: Context flags
- UContext (User Context): Encapsulates the complete execution context
  - flags: Context flags
  - link: Pointer to linked context
  - stack: Signal stack information
  - sigmask: Signal mask in effect
  - mcontext: Machine context (CPU registers)
Sources: src/arch/loongarch64.rs(L20 - L67)
Signal Trampoline
The signal trampoline is a critical piece of assembly code that provides a reliable mechanism for returning from signal handlers. It executes a system call (rt_sigreturn) to restore the original execution context.
flowchart TD SignalDelivery["Signal Delivery"] SetupStack["Set up Handler Stack"] SaveContext["Save Current Context"] InvokeHandler["Invoke Signal Handler"] SignalTrampoline["Signal Trampoline"] SyscallRtSigreturn["Syscall rt_sigreturn (139)"] RestoreContext["Restore Original Context"] ResumeExecution["Resume Original Execution"] InvokeHandler --> SignalTrampoline RestoreContext --> ResumeExecution SaveContext --> InvokeHandler SetupStack --> SaveContext SignalDelivery --> SetupStack SignalTrampoline --> SyscallRtSigreturn SyscallRtSigreturn --> RestoreContext
The LoongArch64 signal trampoline is implemented in assembly:
signal_trampoline:
li.w $a7, 139 # Load syscall number 139 (rt_sigreturn)
syscall 0 # Make syscall
The trampoline is aligned on a 4096-byte boundary and padded to fill a full page, ensuring it has a predictable memory layout. When the signal handler completes, execution flows to this trampoline, which performs syscall 139 (rt_sigreturn) to restore the original execution context.
Sources: src/arch/loongarch64.rs(L7 - L18) src/arch/mod.rs(L19 - L25)
Context Conversion and Restoration
The LoongArch64 implementation provides methods to convert between the TrapFrame structure (used by the kernel) and the MContext structure (used for signal handling).
sequenceDiagram participant Kernel as "Kernel" participant SignalManager as "Signal Manager" participant SignalHandler as "Signal Handler" participant SignalTrampoline as SignalTrampoline Kernel ->> SignalManager: Deliver signal with TrapFrame SignalManager ->> SignalManager: Create MContext from TrapFrame SignalManager ->> SignalManager: Create UContext with MContext SignalManager ->> SignalHandler: Invoke with UContext pointer SignalHandler -->> SignalTrampoline: Return SignalTrampoline ->> Kernel: rt_sigreturn syscall Kernel ->> SignalManager: Restore TrapFrame from UContext SignalManager ->> SignalManager: MContext.restore(TrapFrame) SignalManager ->> Kernel: Resume execution
Context Creation
When a signal is delivered, the system creates an MContext from the current TrapFrame:
- The MContext::new method creates a new machine context from a trap frame
- It copies the program counter (era) and all 32 general-purpose registers
Context Restoration
When a signal handler returns, the system restores the original TrapFrame from the saved MContext:
- The MContext::restore method updates the trap frame with saved values
- It restores the program counter (era) and all 32 general-purpose registers
Sources: src/arch/loongarch64.rs(L28 - L43)
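The conversion path from TrapFrame through MContext into UContext can be sketched as follows. The sc_pc/sc_regs field names and the era register follow the structures listed earlier; SignalSet and the exact layout are simplified stand-ins, not the crate's real types.

```rust
// Illustrative model of the LoongArch64 context handling described above.
type SignalSet = u64; // stand-in for the real bitflags type

#[allow(dead_code)]
pub struct TrapFrame {
    pub regs: [u64; 32], // general-purpose registers r0..r31
    pub era: usize,      // Exception Return Address (program counter)
}

#[allow(dead_code)]
pub struct MContext {
    sc_pc: u64,
    sc_regs: [u64; 32],
    sc_flags: u32,
}

impl MContext {
    pub fn new(tf: &TrapFrame) -> Self {
        Self {
            sc_pc: tf.era as u64, // era becomes the saved program counter
            sc_regs: tf.regs,
            sc_flags: 0,
        }
    }

    pub fn restore(&self, tf: &mut TrapFrame) {
        tf.era = self.sc_pc as usize;
        tf.regs = self.sc_regs;
    }
}

#[allow(dead_code)]
pub struct UContext {
    flags: usize,
    link: usize,
    sigmask: SignalSet,
    pub mcontext: MContext,
}

impl UContext {
    pub fn new(tf: &TrapFrame, sigmask: SignalSet) -> Self {
        Self {
            flags: 0,
            link: 0,
            sigmask,
            mcontext: MContext::new(tf), // CPU state captured on creation
        }
    }
}
```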
Memory Layout for Signal Handling
When a signal is delivered, the system sets up a specific memory layout on the user stack to facilitate signal handling.
flowchart TD subgraph subGraph0["Signal Handler Stack Layout"] signalHandler["Signal Handler Function"] signoArg["Signal Number (Argument 1)"] siginfoArg["SignalInfo Pointer (Argument 2)"] ucontextArg["UContext Pointer (Argument 3)"] returnAddress["Return Address (signal_trampoline)"] savedRegisters["Saved Registers (MContext)"] end stackGrowth["Stack Growth Direction ↓"] returnAddress --> savedRegisters siginfoArg --> ucontextArg signalHandler --> signoArg signoArg --> siginfoArg stackGrowth --> signalHandler ucontextArg --> returnAddress
The key components of this memory layout are:
- Signal Handler Function: The entry point for the signal handler
- Arguments: Three arguments are passed to the handler:
  - Signal number
  - Pointer to signal information
  - Pointer to user context (UContext)
- Return Address: Set to the signal_trampoline function
- Saved Context: The complete user context (UContext) including:
  - Signal mask
  - Signal stack information
  - Machine context (registers)
This layout ensures that when the signal handler returns, it will jump to the signal trampoline, which will restore the original execution context through the rt_sigreturn syscall.
Sources: src/arch/loongarch64.rs(L7 - L18) src/arch/loongarch64.rs(L45 - L67)
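As a rough illustration, reserving an aligned UContext frame below the current stack pointer might look like this. The 16-byte alignment and the size value are illustrative assumptions (typical psABI stack rules), not taken from the crate.

```rust
// Hypothetical sketch: carving a UContext frame out of the user stack.
fn reserve_ucontext(sp: usize, ucontext_size: usize) -> usize {
    // move down by the context size, then align down to 16 bytes,
    // as typical psABI stack-pointer rules require
    (sp - ucontext_size) & !0xf
}
```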
Comparison with Other Architectures
The LoongArch64 implementation shares many similarities with other RISC architectures in the axsignal crate, particularly with RISC-V. However, there are architecture-specific differences in register naming and context structure.
Feature | LoongArch64 | RISC-V | x86_64 | AArch64 |
---|---|---|---|---|
PC Register | era | sepc | rip | elr_el1 |
Register Count | 32 GP registers | 32 GP registers | 16 GP registers | 31 GP registers |
Context Flags | Simple 32-bit flags | Simple 32-bit flags | EFLAGS/XSAVE | PSTATE flags |
Signal Trampoline | Syscall 139 | Syscall 139 | Syscall 15 | Syscall 139 |
The main architecture-specific aspects of the LoongArch64 implementation include:
- Register Set: LoongArch64 has 32 general-purpose registers (like RISC-V)
- Program Counter: Called era (Exception Return Address)
- Assembly Instructions: Uses LoongArch64-specific instructions like li.w and syscall
Sources: src/arch/loongarch64.rs(L20 - L43)
Integration with Signal Handling System
The LoongArch64 implementation integrates with the broader signal handling system through the architecture abstraction layer in src/arch/mod.rs.
The integration points include:
- Architecture Selection: Conditional compilation selects the LoongArch64 implementation based on the target architecture
- Signal Trampoline Address: Exposed through a common function to get the address of the architecture-specific signal trampoline
- Context Management: The architecture-specific UContext and MContext structures are used by the signal manager to save and restore execution context
Sources: src/arch/mod.rs(L1 - L25)
Summary
The LoongArch64 implementation in the axsignal crate provides the architecture-specific components needed for Unix-like signal handling on LoongArch64 processors. It includes:
- A signal trampoline mechanism for returning from signal handlers
- Machine context (MContext) and user context (UContext) structures for saving and restoring CPU state
- Methods for converting between trap frames and machine contexts
- Integration with the architecture-independent signal handling system
These components enable the axsignal crate to provide a consistent signal handling API across different architectures, including LoongArch64.
Build Configuration and Dependencies
Relevant source files
Purpose and Scope
This document details the build configuration and dependency management aspects of the axsignal crate. It explains how the crate is configured for different target architectures, its external dependencies, build-time configuration mechanisms, and integration with the ArceOS ecosystem. For information about the signal handling implementation details, refer to Signal Management System and Architecture Support.
Dependency Structure
The axsignal crate is designed with carefully selected dependencies to provide Unix-like signal handling functionality within the ArceOS framework.
flowchart TD subgraph subGraph2["Patched Dependencies"] page_table_multiarch["page_table_multiarch"] page_table_entry["page_table_entry"] end subgraph subGraph1["ArceOS Dependencies"] axconfig["axconfig"] axhal["axhal (with uspace feature)"] axtask["axtask (with multitask feature)"] end subgraph subGraph0["Core Dependencies"] axerrno["axerrno (0.1.0)"] bitflags["bitflags (2.6)"] cfg_if["cfg-if (1.0.0)"] linux_raw_sys["linux-raw-sys (0.9.3)"] log["log (0.4)"] strum_macros["strum_macros (0.27.1)"] lock_api["lock_api (0.4.12)"] derive_more["derive_more (2.0.1)"] end axsignal["axsignal Crate"] axhal --> page_table_entry axhal --> page_table_multiarch axsignal --> axconfig axsignal --> axerrno axsignal --> axhal axsignal --> axtask axsignal --> bitflags axsignal --> cfg_if axsignal --> derive_more axsignal --> linux_raw_sys axsignal --> lock_api axsignal --> log axsignal --> strum_macros
Diagram: Dependency Structure of axsignal Crate
Sources: Cargo.toml(L6 - L26)
Standard Dependencies
The axsignal crate relies on several standard Rust crates:
Dependency | Version | Purpose |
---|---|---|
axerrno | 0.1.0 | Provides error code definitions for system calls |
bitflags | 2.6 | Used for creating type-safe bit flags (e.g., signal sets) |
cfg-if | 1.0.0 | Simplifies conditional compilation |
linux-raw-sys | 0.9.3 | Provides low-level Linux system call definitions |
log | 0.4 | Logging functionality |
strum_macros | 0.27.1 | Used for enum string conversions |
lock_api | 0.4.12 | Abstractions for synchronization primitives |
derive_more | 2.0.1 | Additional derive macros for common traits |
The linux-raw-sys dependency is configured with default-features = false and explicitly enables the general and no_std features, ensuring compatibility with the no_std environment of ArceOS.
Sources: Cargo.toml(L6 - L26)
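In Cargo.toml, this configuration corresponds to a dependency declaration of the form:

```toml
linux-raw-sys = { version = "0.9.3", default-features = false, features = ["general", "no_std"] }
```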
ArceOS Dependencies
The crate integrates with ArceOS through the following dependencies:
Dependency | Features | Purpose |
---|---|---|
axconfig | (none) | Configuration constants and parameters from ArceOS |
axhal | uspace | Hardware abstraction layer with userspace support |
axtask | multitask | Task/thread management system |
These dependencies are sourced directly from the ArceOS GitHub repository:
axconfig = { git = "https://github.com/oscomp/arceos.git" }
axhal = { git = "https://github.com/oscomp/arceos.git", features = ["uspace"] }
axtask = { git = "https://github.com/oscomp/arceos.git", features = ["multitask"] }
Sources: Cargo.toml(L10 - L14)
Dependency Patches
The axsignal crate applies patches to two dependencies:
[patch.crates-io]
page_table_multiarch = { git = "https://github.com/Mivik/page_table_multiarch.git", rev = "19ededd" }
page_table_entry = { git = "https://github.com/Mivik/page_table_multiarch.git", rev = "19ededd" }
These patches ensure compatibility with the specific memory management requirements of ArceOS by using patched versions of the page table libraries.
Sources: Cargo.toml(L28 - L30)
Architecture-Specific Build Configuration
The axsignal crate is designed to support multiple CPU architectures, with different implementation details for each. The build system automatically configures the appropriate architecture-specific code based on the target platform.
flowchart TD subgraph subGraph2["Implementation Files"] arch_mod["arch/mod.rs"] x86_64_impl["arch/x86_64.rs"] aarch64_impl["arch/aarch64.rs"] riscv_impl["arch/riscv.rs"] loongarch64_impl["arch/loongarch64.rs"] end subgraph subGraph1["Supported Architectures"] x86_64["x86_64"] x86["x86"] powerpc["powerpc"] powerpc64["powerpc64"] s390x["s390x"] arm["arm"] aarch64["aarch64"] other["Other Architectures"] end subgraph subGraph0["Architecture-Specific Config"] sa_restorer_cfg["sa_restorer cfg flag"] end build_rs["build.rs"] target["CARGO_CFG_TARGET_ARCH"] aarch64 --> sa_restorer_cfg arch_mod --> aarch64_impl arch_mod --> loongarch64_impl arch_mod --> riscv_impl arch_mod --> x86_64_impl arm --> sa_restorer_cfg build_rs --> target other --> sa_restorer_cfg powerpc --> sa_restorer_cfg powerpc64 --> sa_restorer_cfg s390x --> sa_restorer_cfg sa_restorer_cfg --> arch_mod target --> sa_restorer_cfg x86 --> sa_restorer_cfg
Diagram: Architecture-Specific Build Configuration
Sources: build.rs(L1 - L25)
The sa_restorer Configuration
The build.rs script creates a configuration flag called sa_restorer that is enabled only for specific architectures. This flag is used to conditionally compile code that handles the signal return trampoline mechanism:
fn main() {
let target_arch = std::env::var("CARGO_CFG_TARGET_ARCH").unwrap();
alias(
"sa_restorer",
[
"x86_64",
"x86",
"powerpc",
"powerpc64",
"s390x",
"arm",
"aarch64",
]
.contains(&target_arch.as_str()),
);
}
The sa_restorer feature is architecture-dependent because only certain architectures support or require a dedicated signal return trampoline. This configuration flag allows the signal handling implementation to adapt to the specifics of each architecture.
Sources: build.rs(L1 - L15)
Build Script Helper Function
The build script uses a helper function called alias to create the configuration flag:
fn alias(alias: &str, has_feature: bool) {
    println!("cargo:rustc-check-cfg=cfg({alias})");
    if has_feature {
        println!("cargo:rustc-cfg={alias}");
    }
}
This function:
- Declares the existence of the configuration option via cargo:rustc-check-cfg
- Conditionally enables the configuration via cargo:rustc-cfg
Sources: build.rs(L18 - L25)
Conditional Compilation Structure
The axsignal crate makes extensive use of Rust's conditional compilation features to adapt to different environments and architectures. This approach allows the code to maintain compatibility with multiple platforms while minimizing redundancy.
flowchart TD subgraph subGraph2["Conditional Code Paths"] sa_restorer_code["Signal Restorer Logic"] arch_specific["Architecture-Specific Signal Context"] common_code["Common Signal Code"] end subgraph subGraph1["Architecture Implementations"] x86_64_impl["x86_64 Implementation"] aarch64_impl["aarch64 Implementation"] riscv_impl["RISC-V Implementation"] loongarch_impl["LoongArch64 Implementation"] end subgraph subGraph0["Compilation Conditions"] target_arch["Target Architecture"] sa_restorer["sa_restorer Feature"] feature_flags["Feature Flags"] end crate["axsignal Crate"] build["build.rs"] aarch64_impl --> arch_specific arch_specific --> common_code build --> sa_restorer crate --> target_arch loongarch_impl --> arch_specific riscv_impl --> arch_specific sa_restorer --> sa_restorer_code target_arch --> aarch64_impl target_arch --> loongarch_impl target_arch --> riscv_impl target_arch --> x86_64_impl x86_64_impl --> arch_specific
Diagram: Conditional Compilation Structure
Sources: build.rs(L1 - L25) Cargo.toml(L6 - L26)
Target Architecture Selection
The cfg-if crate is used throughout the codebase to selectively include architecture-specific implementations based on the target architecture. For example, in the arch/mod.rs file, different architecture-specific modules would be conditionally included:
cfg_if::cfg_if! {
if #[cfg(target_arch = "x86_64")] {
mod x86_64;
pub use self::x86_64::*;
} else if #[cfg(target_arch = "aarch64")] {
mod aarch64;
pub use self::aarch64::*;
} else if #[cfg(any(target_arch = "riscv32", target_arch = "riscv64"))] {
mod riscv;
pub use self::riscv::*;
} else if #[cfg(target_arch = "loongarch64")] {
mod loongarch64;
pub use self::loongarch64::*;
} else {
compile_error!("Unsupported target architecture");
}
}
This pattern ensures that only the appropriate architecture-specific code is compiled into the final binary.
Sources: Cargo.toml(L16)
The sa_restorer Configuration
The sa_restorer configuration flag created by the build script enables conditional compilation of code related to the signal restoration mechanism. In architectures that support sa_restorer, the signal action structure will include an additional field for the restorer function pointer.
For example, code using this flag might look like:
pub struct SignalOSAction {
pub handler: usize,
pub flags: SaFlags,
pub mask: SignalSet,
#[cfg(sa_restorer)]
pub restorer: usize,
}
This conditional field ensures that the signal action structure is correctly defined for each supported architecture.
Sources: build.rs(L1 - L15)
Integration with ArceOS
The axsignal crate is designed to integrate seamlessly with the ArceOS operating system kernel. This integration is facilitated by the dependency specifications in the Cargo.toml file and the design of the signal handling interfaces.
flowchart TD subgraph subGraph1["axsignal Integration"] axsignal["axsignal Crate"] managers["Signal Managers"] arch_support["Architecture Support"] signal_types["Signal Types"] end subgraph subGraph0["ArceOS Ecosystem"] arceos["ArceOS Kernel"] axconfig["axconfig"] axhal["axhal"] axtask["axtask"] other_modules["Other ArceOS Modules"] end arceos --> axconfig arceos --> axhal arceos --> axsignal arceos --> axtask arceos --> other_modules axhal --> arch_support axsignal --> axconfig axsignal --> axhal axsignal --> axtask axtask --> managers
Diagram: Integration with ArceOS Ecosystem
Sources: Cargo.toml(L10 - L14)
Dependency on axhal
The axhal dependency is included with the uspace feature enabled:
axhal = { git = "https://github.com/oscomp/arceos.git", features = ["uspace"] }
This dependency provides the hardware abstraction layer functionalities required for signal handling, such as:
- Access to trap frames and CPU context management
- Architecture-specific operations for signal handling
- Userspace support for delivering signals to user applications
The uspace feature specifically enables the userspace support components in axhal that are necessary for implementing signal handling in a user/kernel separated environment.
Sources: Cargo.toml(L11)
Dependency on axtask
The axtask dependency is included with the multitask feature enabled:
axtask = { git = "https://github.com/oscomp/arceos.git", features = ["multitask"] }
This dependency provides the task/thread management system that axsignal uses to:
- Associate signal handlers with specific threads
- Manage signal delivery to the appropriate targets
- Coordinate the execution and scheduling of signal handlers
The multitask feature ensures that proper thread management capabilities are available, which is essential for implementing per-thread signal handling.
Sources: Cargo.toml(L12 - L14)
Dependency on axconfig
The axconfig dependency contains ArceOS configuration constants and parameters:
axconfig = { git = "https://github.com/oscomp/arceos.git" }
This dependency provides configuration settings that affect signal handling behavior, such as:
- Maximum number of concurrent signals
- Sizes of signal-related buffers
- System-wide constants affecting signal delivery
Sources: Cargo.toml(L10)
Build-Time Configuration Summary
The following table summarizes the key build-time configuration aspects of the axsignal crate:
Configuration Aspect | Mechanism | Purpose |
---|---|---|
Architecture Support | Target architecture detection | Select appropriate architecture-specific implementation |
sa_restorer Feature | build.rs script | Enable/disable restorer functionality based on architecture |
ArceOS Integration | Git dependencies | Connect with other ArceOS components |
Feature Flags | Cargo features | Enable specific functionality (e.g., uspace, multitask) |
Dependency Patching | Cargo [patch] section | Ensure compatibility with specific dependency versions |
Sources: build.rs(L1 - L25) Cargo.toml(L1 - L31)
Conclusion
The build configuration and dependency management of the axsignal crate are designed to support a flexible, cross-architecture signal handling implementation that integrates seamlessly with the ArceOS ecosystem. The crate uses conditional compilation extensively to adapt to different target architectures while maintaining a clean and maintainable codebase.
The build script provides architecture-specific configurations, while carefully selected dependencies enable the crate to leverage existing ArceOS components for tasks such as thread management and hardware abstraction. This approach allows axsignal to provide Unix-like signal handling capabilities across multiple architectures with minimal redundancy and maximum compatibility.
Overview
Relevant source files
The axptr library provides a safe abstraction for kernel code to access user-space memory. It prevents the kernel from crashing when accessing potentially invalid user memory while providing a convenient API for common user memory operations. This page introduces the main components of axptr and provides a high-level understanding of its architecture.
For detailed information about the specific pointer types used for memory safety, see User Space Pointers. For comprehensive information about the safety mechanisms, see Safety Mechanisms.
Purpose and Scope
axptr addresses a common challenge in operating system development: safely accessing memory that belongs to user processes. User-provided pointers can't be trusted directly because they might:
- Point to invalid memory addresses
- Have insufficient access permissions
- Be improperly aligned
- Cause page faults that could crash the kernel
This library provides a robust solution by wrapping raw pointers with safety checks and contextual page fault handling.
Sources: src/lib.rs(L1)
Key Components
flowchart TD subgraph subGraph0["axptr Library"] A["UserPtr"] D["User-space Memory"] B["UserConstPtr"] C["AddrSpaceProvider"] E["Address Space Management"] F["Safety Mechanisms"] G["Kernel from Crashes"] end H["Kernel Code"] A --> D B --> D C --> E F --> G H --> A H --> B
The library consists of several key components that work together to provide safe access to user-space memory:
- User pointers:
  - UserPtr<T>: A wrapper around *mut T for safe mutable access to user memory
  - UserConstPtr<T>: A wrapper around *const T for safe read-only access to user memory
- Address space abstraction:
  - AddrSpaceProvider: A trait that abstracts operations for working with address spaces
- Safety mechanisms:
  - Alignment checking
  - Access permission validation
  - Page table population
  - Context-aware page fault handling
Sources: src/lib.rs(L128 - L170) src/lib.rs(L219 - L254) src/lib.rs(L119 - L126) src/lib.rs(L31 - L54)
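A stripped-down model of the two wrapper types makes the split concrete. The u8 flag bits here stand in for the real MappingFlags values from page_table_multiarch, and the method bodies are illustrative, not the crate's implementation.

```rust
// Simplified stand-ins for the access flags; the real crate uses
// MappingFlags from page_table_multiarch.
const READ: u8 = 0b01;
const WRITE: u8 = 0b10;

#[allow(dead_code)]
pub struct UserPtr<T>(pub *mut T);
#[allow(dead_code)]
pub struct UserConstPtr<T>(pub *const T);

impl<T> UserPtr<T> {
    // mutable access requires both read and write permission
    pub const ACCESS_FLAGS: u8 = READ | WRITE;

    pub fn is_null(&self) -> bool {
        self.0.is_null()
    }

    // treat a null user pointer as "absent" rather than an error
    pub fn nullable(self) -> Option<Self> {
        if self.is_null() { None } else { Some(self) }
    }
}

impl<T> UserConstPtr<T> {
    // read-only access only needs read permission
    pub const ACCESS_FLAGS: u8 = READ;
}
```

Keeping the permission requirement in an associated constant lets the region check be written once and parameterized by pointer type.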
Memory Access Flow
The following diagram illustrates the typical flow when kernel code accesses user-space memory through axptr:
sequenceDiagram participant KernelCode as "Kernel Code" participant UserPtrUserConstPtr as "UserPtr/UserConstPtr" participant AddrSpaceProvider as "AddrSpaceProvider" participant check_region as "check_region()" participant UserMemory as "User Memory" KernelCode ->> UserPtrUserConstPtr: Request access (get/get_as_slice) UserPtrUserConstPtr ->> check_region: check_region_with() check_region ->> AddrSpaceProvider: with_addr_space() AddrSpaceProvider ->> check_region: Provide AddrSpace check_region ->> check_region: Check alignment check_region ->> check_region: Verify access permissions check_region ->> check_region: Populate page tables alt Memory checks pass check_region -->> UserPtrUserConstPtr: Return Ok(()) UserPtrUserConstPtr ->> UserMemory: Set ACCESSING_USER_MEM = true UserPtrUserConstPtr ->> UserMemory: Access memory safely UserPtrUserConstPtr ->> UserMemory: Set ACCESSING_USER_MEM = false UserPtrUserConstPtr -->> KernelCode: Return reference to user memory else Memory checks fail check_region -->> UserPtrUserConstPtr: Return Err(EFAULT) UserPtrUserConstPtr -->> KernelCode: Propagate error end
When a kernel function wants to access user memory:
- It calls a method like get() or get_as_slice() on a user pointer
- The user pointer performs safety checks through check_region()
- If checks pass, the pointer accesses memory with special handling for page faults
- A reference to the memory is returned, or an error if access is invalid
Sources: src/lib.rs(L171 - L198) src/lib.rs(L256 - L277) src/lib.rs(L31 - L54) src/lib.rs(L22 - L29)
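The first of those safety checks, alignment, can be modeled in isolation. The function name here is illustrative; the real check_region() additionally verifies permissions and populates page tables.

```rust
// Standalone model of the alignment check performed before any user
// access. EFAULT is the Linux "Bad address" errno.
const EFAULT: i32 = 14;

fn check_aligned<T>(addr: usize) -> Result<(), i32> {
    if addr % core::mem::align_of::<T>() != 0 {
        return Err(EFAULT); // misaligned user pointer rejected up front
    }
    Ok(())
}
```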
Code Architecture
The following diagram shows the relationship between the main types and their important methods:
classDiagram class UserPtr~T~ { +*mut T pointer +const ACCESS_FLAGS: MappingFlags +address() VirtAddr +as_ptr() *mut T +cast~U~() UserPtr~U~ +is_null() bool +nullable() Option~Self~ +get() LinuxResult~&mut T~ +get_as_slice() LinuxResult~&mut [T]~ +get_as_null_terminated() LinuxResult~&mut [T]~ } class UserConstPtr~T~ { +*const T pointer +const ACCESS_FLAGS: MappingFlags +address() VirtAddr +as_ptr() *const T +cast~U~() UserConstPtr~U~ +is_null() bool +nullable() Option~Self~ +get() LinuxResult~&T~ +get_as_slice() LinuxResult~&[T]~ +get_as_null_terminated() LinuxResult~&[T]~ +get_as_str() LinuxResult~&str~ } class AddrSpaceProvider { <<trait>> +with_addr_space(f) R } class SafetyFunctions { <<functions>> +check_region() +check_null_terminated() +is_accessing_user_memory() +access_user_memory() } UserPtr --> SafetyFunctions : uses UserConstPtr --> SafetyFunctions : uses UserPtr --> AddrSpaceProvider : requires UserConstPtr --> AddrSpaceProvider : requires
The architecture follows these principles:
- Separate types for mutable (UserPtr) and read-only (UserConstPtr) access
- A trait (AddrSpaceProvider) to abstract address space operations
- Helper functions to manage safety checks and context-aware page fault handling
- Methods on user pointer types for common operations like getting a single value, a slice, or a null-terminated array
Sources: src/lib.rs(L128 - L217) src/lib.rs(L219 - L303) src/lib.rs(L119 - L126) src/lib.rs(L18 - L107)
Context-Aware Page Fault Handling
One of the key safety features of axptr is context-aware page fault handling:
flowchart TD A["Kernel attempts to access user memory"] B["axptr sets ACCESSING_USER_MEM = true"] C["Memory access occurs"] D["Page fault?"] E["OS page fault handler checks is_accessing_user_memory()"] F["is_accessing_user_memory() == true?"] G["Handle as user memory fault (non-fatal)"] H["Handle as kernel fault (potentially fatal)"] I["Memory access completes normally"] J["axptr sets ACCESSING_USER_MEM = false"] A --> B B --> C C --> D D --> E D --> I E --> F F --> G F --> H G --> J I --> J
This mechanism allows the OS to distinguish between:
- Page faults that occur when intentionally accessing user memory (expected and should be handled gracefully)
- Page faults in kernel code (may indicate a kernel bug and could be treated more severely)
The is_accessing_user_memory() function is provided for OS page fault handlers to check this context.
Sources: src/lib.rs(L11 - L20) src/lib.rs(L22 - L29)
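The flag-toggling pattern can be modeled as follows. Note that the real crate keeps the flag in a per-CPU variable via the percpu crate; this sketch uses a global atomic purely for illustration.

```rust
use core::sync::atomic::{AtomicBool, Ordering};

// Simplified stand-in for axptr's per-CPU ACCESSING_USER_MEM flag.
static ACCESSING_USER_MEM: AtomicBool = AtomicBool::new(false);

fn is_accessing_user_memory() -> bool {
    ACCESSING_USER_MEM.load(Ordering::Relaxed)
}

// Run `f` with the flag set, so a page fault handler can tell that any
// fault raised inside belongs to an intentional user-memory access.
fn access_user_memory<R>(f: impl FnOnce() -> R) -> R {
    ACCESSING_USER_MEM.store(true, Ordering::Relaxed);
    let result = f();
    ACCESSING_USER_MEM.store(false, Ordering::Relaxed);
    result
}
```

A page fault handler would call is_accessing_user_memory() to decide between the graceful user-fault path and the fatal kernel-fault path.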
Dependencies
axptr has the following key dependencies:
Dependency | Purpose |
---|---|
axerrno | Provides error codes and result types (LinuxError, LinuxResult) |
axmm | Memory management, provides AddrSpace |
memory_addr | Virtual address manipulation |
page_table_multiarch | Page table and memory mapping flags |
percpu | Per-CPU variable support |
Sources: src/lib.rs(L4 - L7) Cargo.toml(L7 - L12)
Conclusion
The axptr library provides a comprehensive solution for safely accessing user-space memory from kernel code. By wrapping raw pointers in smart container types that perform necessary safety checks and implement context-aware page fault handling, it helps prevent kernel crashes while providing a convenient API.
For more detailed information about specific components, refer to the following pages:
- Memory Safety Architecture
- User Space Pointers
- Address Space Management
- Safety Mechanisms
- API Reference
Memory Safety Architecture
Relevant source files
This document explains the core architecture and design principles of the memory safety system in the axptr library. It focuses on how the system provides a safe interface for kernel code to access user-space memory while preventing potential security vulnerabilities or system crashes. For details about specific pointer types, see User Space Pointers, and for information about safety mechanisms, see Safety Mechanisms.
Overview of Memory Safety Architecture
The axptr library implements a robust architecture to ensure memory operations across privilege boundaries (kernel accessing user memory) remain safe. The architecture is built around three key principles:
- Type-safe access - Using strongly-typed pointer wrappers
- Memory region validation - Ensuring pointers reference valid user memory regions
- Context-aware fault handling - Managing page faults during user memory access
Sources: src/lib.rs(L129 - L183) src/lib.rs(L219 - L254)
Core Components
The memory safety architecture consists of these fundamental components:
Component | Description | Role |
---|---|---|
UserPtr | Typed wrapper for mutable user pointers | Provides safe access to user memory with read/write permissions |
UserConstPtr | Typed wrapper for immutable user pointers | Provides safe access to user memory with read-only permissions |
AddrSpaceProvider | Trait for address space operations | Abstracts address space lookup and access control |
Memory checking functions | Safety validation utilities | Verifies memory region alignment, permissions, and availability |
Context tracking | Page fault handling mechanism | Manages page faults during user memory access |
classDiagram class UserPtr~T~ { +*mut T pointer +const ACCESS_FLAGS: MappingFlags +get() +get_as_slice() +get_as_null_terminated() } class UserConstPtr~T~ { +*const T pointer +const ACCESS_FLAGS: MappingFlags +get() +get_as_slice() +get_as_null_terminated() } class AddrSpaceProvider { <<trait>> +with_addr_space() } UserPtr --> AddrSpaceProvider : uses UserConstPtr --> AddrSpaceProvider : uses
Sources: src/lib.rs(L119 - L126) src/lib.rs(L129 - L134) src/lib.rs(L219 - L225)
Memory Access Workflow
The core workflow for safely accessing user memory follows these steps:
sequenceDiagram participant KernelCode as "Kernel Code" participant UserPtrUserConstPtr as "UserPtr/UserConstPtr" participant check_region as "check_region()" participant ACCESSING_USER_MEMflag as "ACCESSING_USER_MEM flag" participant UserMemory as "User Memory" KernelCode ->> UserPtrUserConstPtr: Request user memory access UserPtrUserConstPtr ->> check_region: Validate memory region check_region ->> check_region: Check alignment check_region ->> check_region: Verify access permissions check_region ->> check_region: Populate page tables alt Memory region valid check_region ->> UserPtrUserConstPtr: Access permitted UserPtrUserConstPtr ->> ACCESSING_USER_MEMflag: Set to true UserPtrUserConstPtr ->> UserMemory: Access memory UserPtrUserConstPtr ->> ACCESSING_USER_MEMflag: Set to false UserPtrUserConstPtr ->> KernelCode: Return memory reference else Memory region invalid check_region ->> UserPtrUserConstPtr: Return EFAULT UserPtrUserConstPtr ->> KernelCode: Propagate error end
Sources: src/lib.rs(L31 - L54) src/lib.rs(L11 - L29) src/lib.rs(L175 - L183)
Memory Region Validation
Before any user memory access, a series of validation steps ensure memory safety:
- Alignment Checking: Ensures the pointer is properly aligned for the requested type
- Access Permission Verification: Checks that the memory region has appropriate read/write permissions
- Page Table Population: Ensures that all required pages are mapped in the address space
flowchart TD start["Memory Access Request"] align["Check Alignment"] error["Return EFAULT"] perms["Check Access Permissions"] populate["Populate Page Tables"] access["Set ACCESSING_USER_MEM flag"] read["Access Memory"] clear["Clear ACCESSING_USER_MEM flag"] finish["Return Result"] access --> read align --> error align --> perms clear --> finish perms --> error perms --> populate populate --> access populate --> error read --> clear start --> align
Sources: src/lib.rs(L31 - L54) src/lib.rs(L110 - L117)
Context-Aware Page Fault Handling
A key aspect of the memory safety architecture is handling page faults during user memory access. This is accomplished through the ACCESSING_USER_MEM flag, which indicates when the kernel is accessing user memory.
stateDiagram-v2 state AccessingUser { [*] --> Reading Reading --> PageFault : Page not present PageFault --> Reading : Handle fault safely } [*] --> Normal Normal --> AccessingUser : set ACCESSING_USER_MEM = true AccessingUser --> Normal : set ACCESSING_USER_MEM = false
The architecture uses a per-CPU variable to track this state:
```rust
#[percpu::def_percpu]
static mut ACCESSING_USER_MEM: bool = false;
```
When set to true, the OS knows that any page faults occurring should be handled differently than regular kernel page faults, preventing kernel crashes from invalid user memory accesses.
Sources: src/lib.rs(L11 - L29) src/lib.rs(L22 - L29)
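The handler-side decision this flag enables can be sketched as follows. This is a hypothetical userspace model, not the real axptr code: the per-CPU flag is replaced by a global `AtomicBool`, and `classify_kernel_page_fault` is an invented name standing in for whatever dispatch the host OS performs.

```rust
use std::sync::atomic::{AtomicBool, Ordering};

// Stand-in for the per-CPU flag; real code uses percpu's accessors.
static ACCESSING_USER_MEM: AtomicBool = AtomicBool::new(false);

fn is_accessing_user_memory() -> bool {
    ACCESSING_USER_MEM.load(Ordering::Relaxed)
}

// Hypothetical fault-handler dispatch: a fault taken while the flag is
// set is a legitimate user-memory fault; otherwise it is a kernel bug.
fn classify_kernel_page_fault() -> &'static str {
    if is_accessing_user_memory() {
        "handle as user-memory fault (map the page, retry)"
    } else {
        "kernel bug: panic/oops"
    }
}

fn main() {
    assert_eq!(classify_kernel_page_fault(), "kernel bug: panic/oops");
    ACCESSING_USER_MEM.store(true, Ordering::Relaxed);
    assert!(classify_kernel_page_fault().starts_with("handle"));
}
```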
Null-Terminated Data Handling
The architecture includes specialized handling for null-terminated data like C strings, which is particularly important for OS interfaces:
flowchart TD request["Request null-terminated data"] check["check_null_terminated()"] page["Process page by page"] scan["Scan for null terminator"] return["Return validated slice"] check --> page page --> scan request --> check scan --> return
This process efficiently handles null-terminated structures while maintaining safety guarantees by:
- Validating pages incrementally as needed
- Handling page faults appropriately during traversal
- Returning the correctly sized slice or string when the null terminator is found
Sources: src/lib.rs(L56 - L107) src/lib.rs(L202 - L217) src/lib.rs(L280 - L303)
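The page-by-page scan described above can be modeled over an ordinary byte slice. In this sketch, `PAGE_SIZE` and the boundary hook are stand-ins for the real per-page permission checks; `find_null_terminated` is a hypothetical name, not the library's API.

```rust
const PAGE_SIZE: usize = 16; // toy page size for illustration

fn find_null_terminated(buf: &[u8]) -> Result<usize, &'static str> {
    let mut len = 0;
    loop {
        if len >= buf.len() {
            // Real code would have failed the next page's permission check.
            return Err("EFAULT");
        }
        if len % PAGE_SIZE == 0 {
            // Page boundary: a real implementation validates access
            // permissions for the page starting at `len` here.
        }
        if buf[len] == 0 {
            return Ok(len); // length excluding the null terminator
        }
        len += 1;
    }
}

fn main() {
    assert_eq!(find_null_terminated(b"hi\0rest"), Ok(2));
    assert_eq!(find_null_terminated(b"no terminator"), Err("EFAULT"));
}
```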
Security Implications
The memory safety architecture provides critical security guarantees:
- Protection against invalid memory access: Prevents kernel crashes from accessing invalid user memory
- Defense against privilege escalation: Ensures kernel code can only access user memory with proper permissions
- Safety from malicious user input: Validates user-provided pointers before use
By combining strong typing, rigorous validation, and context-aware fault handling, the architecture creates a comprehensive barrier against memory-related security vulnerabilities when crossing privilege boundaries.
Sources: src/lib.rs(L31 - L54) src/lib.rs(L11 - L29)
Integration with Operating System
The architecture is designed to integrate with operating systems through:
flowchart TD subgraph subGraph1["Process Management"] errnoSys["axerrno"] percpu["percpu"] end subgraph subGraph0["Memory Subsystem"] mmSys["axmm"] memAddr["memory_addr"] pageTable["page_table_multiarch"] end axptr["axptr"] memorySubsystem["Memory Subsystem"] os["Operating System Kernel"] processManagement["Process Management"] axptr --> errnoSys axptr --> memAddr axptr --> mmSys axptr --> pageTable axptr --> percpu memorySubsystem --> os processManagement --> os
The architecture's dependencies enable it to work with the underlying memory management system while providing a consistent error handling mechanism through Linux-compatible error codes.
Sources: Cargo.toml(L7 - L12)
User Space Pointers
Relevant source files
This page details the UserPtr and UserConstPtr types provided by the axptr library, which serve as safe abstractions for accessing user-space memory from kernel code. These pointer types prevent common errors that could lead to kernel crashes when handling user memory, such as invalid pointers, improper alignment, and unauthorized memory access.
For information about address space management that these pointers rely on, see Address Space Management. For details on the safety mechanisms they implement, see Safety Mechanisms.
Core Pointer Types
The axptr library provides two primary pointer types for accessing user-space memory:
- UserPtr<T>: for read-write access to user-space memory
- UserConstPtr<T>: for read-only access to user-space memory
Both types are represented as transparent wrappers around raw pointers (*mut T and *const T respectively), providing a safe interface for kernel code to access user-space memory.
classDiagram class UserPtr~T~ { +*mut T pointer +const ACCESS_FLAGS: MappingFlags +address() VirtAddr +as_ptr() *mut T +cast~U~() UserPtr~U~ +is_null() bool +nullable() Option~Self~ +get() LinuxResult~&mut T~ +get_as_slice() LinuxResult~&mut [T]~ +get_as_null_terminated() LinuxResult~&mut [T]~ } class UserConstPtr~T~ { +*const T pointer +const ACCESS_FLAGS: MappingFlags +address() VirtAddr +as_ptr() *const T +cast~U~() UserConstPtr~U~ +is_null() bool +nullable() Option~Self~ +get() LinuxResult~&T~ +get_as_slice() LinuxResult~&[T]~ +get_as_null_terminated() LinuxResult~&[T]~ +get_as_str() LinuxResult~&str~ } class RawPointer { <<Rust raw pointer>> } UserPtr --|> RawPointer : "Wraps *mut T" UserConstPtr --|> RawPointer : "Wraps *const T"
Sources: src/lib.rs(L128 - L130) src/lib.rs(L219 - L221)
Creating User Space Pointers
Both pointer types can be created from a user-space address represented as a usize:
flowchart TD A["User-space address (usize)"] B["From::from()"] C["UserPtr or UserConstPtr"] A --> B B --> C
Sources: src/lib.rs(L130 - L134) src/lib.rs(L221 - L225)
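The wrapper-plus-conversion pattern can be sketched as follows. This is a minimal model under stated assumptions, not the real axptr definitions: a `#[repr(transparent)]` newtype over a raw pointer, built `From` a user-supplied usize address, with the null-handling helpers from the table below.

```rust
// Hypothetical minimal re-creation of the wrapper pattern.
#[repr(transparent)]
struct UserPtr<T>(*mut T);

impl<T> From<usize> for UserPtr<T> {
    fn from(addr: usize) -> Self {
        UserPtr(addr as *mut T)
    }
}

impl<T> UserPtr<T> {
    fn address(&self) -> usize {
        self.0 as usize
    }
    fn is_null(&self) -> bool {
        self.0.is_null()
    }
    // Turn a possibly-null pointer into an Option so callers must
    // handle the null case explicitly.
    fn nullable(self) -> Option<Self> {
        if self.is_null() { None } else { Some(self) }
    }
}

fn main() {
    let p: UserPtr<u32> = UserPtr::from(0x1000usize);
    assert_eq!(p.address(), 0x1000);
    assert!(!p.is_null());
    assert!(UserPtr::<u32>::from(0usize).nullable().is_none());
}
```

The `#[repr(transparent)]` attribute guarantees the wrapper has the same layout as the raw pointer it contains, so conversion is free.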
Memory Access Flow
The main purpose of these pointer types is to provide safe access to user-space memory. The following diagram illustrates the flow of operations when accessing user memory:
sequenceDiagram participant KernelCode as "Kernel Code" participant UserPtrUserConstPtr as "UserPtr/UserConstPtr" participant AddrSpaceProvider as "AddrSpaceProvider" participant check_region as "check_region()" participant UserMemory as "User Memory" KernelCode ->> UserPtrUserConstPtr: Call get(), get_as_slice(), etc. UserPtrUserConstPtr ->> AddrSpaceProvider: Pass to AddrSpaceProvider AddrSpaceProvider ->> check_region: Call check_region() check_region ->> check_region: Verify alignment check_region ->> check_region: Check access permissions check_region ->> check_region: Populate page tables alt Memory access allowed check_region ->> UserPtrUserConstPtr: Return OK UserPtrUserConstPtr ->> UserMemory: Set ACCESSING_USER_MEM flag UserPtrUserConstPtr ->> UserMemory: Access memory safely UserPtrUserConstPtr ->> UserMemory: Clear ACCESSING_USER_MEM flag UserPtrUserConstPtr ->> KernelCode: Return memory reference else Memory access denied check_region ->> UserPtrUserConstPtr: Return EFAULT error UserPtrUserConstPtr ->> KernelCode: Propagate error end
Sources: src/lib.rs(L11 - L20) src/lib.rs(L22 - L29) src/lib.rs(L31 - L54) src/lib.rs(L175 - L183) src/lib.rs(L258 - L266)
Core Methods
Common Methods
Both UserPtr<T> and UserConstPtr<T> provide the following methods:
Method | Return Type | Description |
---|---|---|
address() | VirtAddr | Gets the virtual address of the pointer |
as_ptr() | *mut T/*const T | Unwraps the pointer into a raw pointer (unsafe) |
cast() | UserPtr<U>/UserConstPtr<U> | Casts the pointer to a different type |
is_null() | bool | Checks if the pointer is null |
nullable() | Option<Self> | Converts to Option<Self> |
Sources: src/lib.rs(L136 - L169) src/lib.rs(L227 - L254)
Memory Access Methods
For UserPtr<T>:
Method | Return Type | Description |
---|---|---|
get(aspace) | LinuxResult<&mut T> | Gets mutable access to the pointed value |
get_as_slice(aspace, length) | LinuxResult<&mut [T]> | Gets mutable access to a slice of values |
get_as_null_terminated(aspace) | LinuxResult<&mut [T]> | Gets mutable access to a null-terminated array |
Sources: src/lib.rs(L171 - L198) src/lib.rs(L201 - L217)
For UserConstPtr<T>:
Method | Return Type | Description |
---|---|---|
get(aspace) | LinuxResult<&T> | Gets read-only access to the pointed value |
get_as_slice(aspace, length) | LinuxResult<&[T]> | Gets read-only access to a slice of values |
get_as_null_terminated(aspace) | LinuxResult<&[T]> | Gets read-only access to a null-terminated array |
get_as_str() (only for UserConstPtr<c_char>) | LinuxResult<&'static str> | Gets read-only access as a UTF-8 string |
Sources: src/lib.rs(L256 - L278) src/lib.rs(L280 - L292) src/lib.rs(L294 - L303)
Safety Mechanisms
The main safety mechanisms implemented by these types include:
- Memory Region Validation: Before accessing user memory, the pointer types check if the memory region is accessible with the required permissions.
- Alignment Checks: Ensures the memory is properly aligned for the requested type.
- Page Table Population: Automatically populates page tables if necessary.
- Page Fault Handling: Using a flag to indicate when accessing user memory to properly handle page faults.
flowchart TD A["Access request via get()/get_as_slice()"] B["Check alignment"] C["Return EFAULT error"] D["Check region permissions"] E["Populate page tables"] F["Return error"] G["Set ACCESSING_USER_MEM flag"] H["Access memory"] I["Clear ACCESSING_USER_MEM flag"] J["Return reference to memory"] A --> B B --> C B --> D D --> C D --> E E --> F E --> G G --> H H --> I I --> J
Sources: src/lib.rs(L31 - L54) src/lib.rs(L11 - L12) src/lib.rs(L18 - L20) src/lib.rs(L22 - L29)
Null-Terminated Data Handling
A special feature of the pointer types is their ability to safely handle null-terminated data (such as C strings). The get_as_null_terminated() method performs a specialized check that scans the user memory page by page until it finds a null terminator.
flowchart TD A["get_as_null_terminated()"] B["check_null_terminated()"] C["Check alignment"] D["Set ACCESSING_USER_MEM flag"] E["Scan memory page by page"] F["Found null terminator?"] G["Clear ACCESSING_USER_MEM flag"] H["Check next page permissions"] I["Return EFAULT error"] J["Return reference to memory slice"] A --> B B --> C C --> D D --> E E --> F F --> G F --> H G --> J H --> E H --> I
Sources: src/lib.rs(L56 - L107) src/lib.rs(L201 - L217) src/lib.rs(L280 - L292)
Type-Specific Operations
The UserConstPtr<c_char> type provides additional functionality specifically for handling C strings:
flowchart TD A["UserConstPtr"] B["get_as_null_terminated()"] C["Obtain character slice"] D["Convert to u8 slice"] E["from_utf8()"] F["Return &str"] G["Return EILSEQ error"] A --> B B --> C C --> D D --> E E --> F E --> G
Sources: src/lib.rs(L294 - L303)
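The tail end of that flow — UTF-8 validation with EILSEQ on failure — can be sketched like this. Note the assumptions: the real code first obtains a validated null-terminated c_char slice and reinterprets it as bytes, while this toy `chars_to_str` helper (a hypothetical name) takes the byte slice directly.

```rust
// Map UTF-8 validation failure to the EILSEQ error the document describes.
fn chars_to_str(bytes: &[u8]) -> Result<&str, &'static str> {
    std::str::from_utf8(bytes).map_err(|_| "EILSEQ")
}

fn main() {
    assert_eq!(chars_to_str(b"hello"), Ok("hello"));
    assert!(chars_to_str(&[0xff, 0xfe]).is_err()); // invalid UTF-8
}
```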
Integration with Address Space Management
The user pointer types work with the AddrSpaceProvider trait to abstract address space operations. This allows them to work with different address space implementations as long as they implement this trait.
classDiagram class AddrSpaceProvider { <<trait>> +with_addr_space(f) R } class UserPtr~T~ { +get(aspace: impl AddrSpaceProvider) +get_as_slice(aspace: impl AddrSpaceProvider, length) +get_as_null_terminated(aspace: impl AddrSpaceProvider) } class UserConstPtr~T~ { +get(aspace: impl AddrSpaceProvider) +get_as_slice(aspace: impl AddrSpaceProvider, length) +get_as_null_terminated(aspace: impl AddrSpaceProvider) +get_as_str(aspace: impl AddrSpaceProvider) } UserPtr --> AddrSpaceProvider : "Requires" UserConstPtr --> AddrSpaceProvider : "Requires"
Sources: src/lib.rs(L119 - L126) src/lib.rs(L175 - L183) src/lib.rs(L258 - L266)
Common Usage Patterns
The typical usage pattern for user space pointers in kernel code involves:
- Receiving a user-space address as a usize
- Converting it to a UserPtr<T> or UserConstPtr<T>
- Using the appropriate get method to safely access the memory
- Handling potential errors (EFAULT, EILSEQ, etc.)
sequenceDiagram participant KernelFunction as "Kernel Function" participant UserPtrT as "UserPtr<T>" participant AddrSpace as "AddrSpace" participant UserMemory as "User Memory" KernelFunction ->> UserPtrT: Create from user address KernelFunction ->> UserPtrT: Call get(), get_as_slice(), etc. UserPtrT ->> AddrSpace: Request permission check AddrSpace ->> AddrSpace: Validate memory region AddrSpace ->> UserPtrT: Return result alt Access Granted UserPtrT ->> UserMemory: Safely access memory UserMemory ->> KernelFunction: Return data/reference else Access Denied UserPtrT ->> KernelFunction: Return error (EFAULT) end
Sources: src/lib.rs(L130 - L134) src/lib.rs(L221 - L225) src/lib.rs(L175 - L183) src/lib.rs(L258 - L266)
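The steps above can be modeled end to end with toy stand-ins. Everything here is hypothetical — `MockAspace` and this simplified `get` only check alignment and a fixed "mapped" range, where the real library consults page tables through an AddrSpace.

```rust
// MockAspace stands in for the real address space abstraction.
struct MockAspace {
    mapped: std::ops::Range<usize>,
}

struct UserPtr<T>(*mut T);

impl<T> From<usize> for UserPtr<T> {
    fn from(addr: usize) -> Self {
        UserPtr(addr as *mut T)
    }
}

impl<T> UserPtr<T> {
    // Returns the validated address instead of a &mut T so the sketch
    // stays in safe Rust; the real get() yields a reference on success.
    fn get(&self, aspace: &MockAspace) -> Result<usize, &'static str> {
        let addr = self.0 as usize;
        if addr & (std::mem::align_of::<T>() - 1) != 0 {
            return Err("EFAULT"); // misaligned for T
        }
        if !aspace.mapped.contains(&addr) {
            return Err("EFAULT"); // outside the accessible region
        }
        Ok(addr)
    }
}

fn main() {
    let aspace = MockAspace { mapped: 0x1000..0x2000 };
    assert_eq!(UserPtr::<u32>::from(0x1004).get(&aspace), Ok(0x1004));
    assert_eq!(UserPtr::<u32>::from(0x1002).get(&aspace), Err("EFAULT")); // misaligned
    assert_eq!(UserPtr::<u32>::from(0x3000).get(&aspace), Err("EFAULT")); // unmapped
}
```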
Address Space Management
Relevant source files
Purpose and Scope
This document covers the address space management components within the axptr library. The address space management layer provides an abstraction for safely interacting with user-space memory through virtual address spaces and page tables. For information about the user space pointers that utilize this abstraction, see User Space Pointers.
Overview
Address space management in axptr is built around the AddrSpaceProvider trait, which serves as a bridge between user pointers (UserPtr/UserConstPtr) and the underlying memory management system. This abstraction allows for flexible implementation of address space operations while maintaining memory safety guarantees.
flowchart TD subgraph subGraph2["Memory Management Layer"] check["Memory Region Checking"] populate["Page Table Population"] end subgraph subGraph1["Address Space Layer"] asp["AddrSpaceProvider trait"] aspimp["AddrSpace implementation"] end subgraph subGraph0["User Pointer Layer"] userptr["UserPtr"] userconstptr["UserConstPtr"] end asp --> aspimp aspimp --> check aspimp --> populate userconstptr --> asp userptr --> asp
Sources: src/lib.rs(L119 - L126) src/lib.rs(L31 - L54)
AddrSpaceProvider Trait
The AddrSpaceProvider trait defines a contract for accessing an address space. It contains a single method that allows temporary access to an AddrSpace object through a closure.
classDiagram class AddrSpaceProvider { <<trait>> +with_addr_space(f: impl FnOnce(&mut AddrSpace) -> R) -> R } class AddrSpace { +check_region_access(range: VirtAddrRange, flags: MappingFlags) -> bool +populate_area(start: VirtAddr, size: usize) -> LinuxResult~()~ } AddrSpace --> AddrSpaceProvider : provides access to
Sources: src/lib.rs(L119 - L121) src/lib.rs(L122 - L126)
Implementation and Usage
The library provides a default implementation of AddrSpaceProvider for &mut AddrSpace:

```rust
impl AddrSpaceProvider for &mut AddrSpace {
    fn with_addr_space<R>(&mut self, f: impl FnOnce(&mut AddrSpace) -> R) -> R {
        f(self)
    }
}
```
This simple implementation allows a mutable reference to an AddrSpace to be used as an AddrSpaceProvider. The implementation pattern ensures that the AddrSpace is only accessible within the provided closure, enforcing proper resource management.
Sources: src/lib.rs(L122 - L126)
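The closure-based access pattern can be exercised with a toy model. `MockAddrSpace` and `use_provider` are hypothetical stand-ins; only the trait shape and the blanket impl for a mutable reference mirror the library's design.

```rust
struct MockAddrSpace {
    name: &'static str,
}

// Same shape as the library's trait, but over the mock type.
trait AddrSpaceProvider {
    fn with_addr_space<R>(&mut self, f: impl FnOnce(&mut MockAddrSpace) -> R) -> R;
}

impl AddrSpaceProvider for &mut MockAddrSpace {
    fn with_addr_space<R>(&mut self, f: impl FnOnce(&mut MockAddrSpace) -> R) -> R {
        f(self) // &mut &mut T coerces to &mut T at the call site
    }
}

// Generic consumer: any provider works, the address space itself is
// only reachable inside the closure.
fn use_provider(mut p: impl AddrSpaceProvider) -> &'static str {
    p.with_addr_space(|a| a.name)
}

fn main() {
    let mut aspace = MockAddrSpace { name: "task-aspace" };
    assert_eq!(use_provider(&mut aspace), "task-aspace");
}
```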
Memory Region Management
The address space management layer is responsible for two primary operations:
- Checking if a memory region is accessible with specific permissions
- Populating page tables to ensure memory is mapped when accessed
These operations are encapsulated in the check_region function:
flowchart TD A["check_region(aspace, start, layout, access_flags)"] B["Check address alignment"] C["Return EFAULT"] D["Check region access permissions"] E["Calculate page boundaries"] F["Populate page tables"] G["Return error"] H["Return Ok(())"] A --> B B --> C B --> D D --> C D --> E E --> F F --> G F --> H
Sources: src/lib.rs(L31 - L54)
Region Checking Process
The check_region function performs several validation steps:
- Alignment Check: Verifies that the memory address is properly aligned for the requested data type
- Permission Check: Ensures the memory region has the appropriate access flags (read/write)
- Page Table Population: Maps the necessary pages in virtual memory
This function returns a LinuxResult<()>, which is Ok(()) if the region is valid and accessible, or Err(LinuxError::EFAULT) if the region cannot be accessed.
Sources: src/lib.rs(L31 - L54)
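The three validation steps can be sketched over a toy address space. The alignment test matches the document's description; `MockAspace`, its single accessible range, and the no-op `populate_area` are hypothetical stand-ins for the real AddrSpace operations.

```rust
use std::alloc::Layout;

struct MockAspace {
    accessible: std::ops::Range<usize>,
}

impl MockAspace {
    fn check_region_access(&self, start: usize, size: usize) -> bool {
        self.accessible.contains(&start) && start + size <= self.accessible.end
    }
    fn populate_area(&mut self, _page_start: usize, _size: usize) -> Result<(), &'static str> {
        Ok(()) // real code maps pages here and may fail
    }
}

fn check_region(aspace: &mut MockAspace, start: usize, layout: Layout) -> Result<(), &'static str> {
    // 1. Alignment check (align is always a power of two).
    if start & (layout.align() - 1) != 0 {
        return Err("EFAULT");
    }
    // 2. Access-permission check over the whole region.
    if !aspace.check_region_access(start, layout.size()) {
        return Err("EFAULT");
    }
    // 3. Widen to 4 KiB page boundaries and populate page tables.
    let page_start = start & !0xfff;
    let page_end = (start + layout.size() + 0xfff) & !0xfff;
    aspace.populate_area(page_start, page_end - page_start)
}

fn main() {
    let mut aspace = MockAspace { accessible: 0x1000..0x2000 };
    let layout = Layout::from_size_align(8, 8).unwrap();
    assert_eq!(check_region(&mut aspace, 0x1008, layout), Ok(()));
    assert_eq!(check_region(&mut aspace, 0x1004, layout), Err("EFAULT")); // misaligned
    assert_eq!(check_region(&mut aspace, 0x3000, layout), Err("EFAULT")); // inaccessible
}
```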
Integration with User Pointers
The address space management layer is primarily used by the user pointer types (UserPtr and UserConstPtr) to safely access user-space memory. These types call into the address space abstraction whenever they need to validate memory accesses.
sequenceDiagram participant UserPtrUserConstPtr as "UserPtr/UserConstPtr" participant AddrSpaceProvider as "AddrSpaceProvider" participant AddrSpace as "AddrSpace" participant check_region as "check_region" UserPtrUserConstPtr ->> AddrSpaceProvider: with_addr_space(closure) AddrSpaceProvider ->> AddrSpace: invoke closure with &mut AddrSpace AddrSpace ->> check_region: check_region(start, layout, flags) check_region -->> AddrSpace: Ok(()) or Err(LinuxError) AddrSpace -->> AddrSpaceProvider: return result AddrSpaceProvider -->> UserPtrUserConstPtr: return result UserPtrUserConstPtr ->> UserPtrUserConstPtr: access memory if Ok
Sources: src/lib.rs(L175 - L182) src/lib.rs(L258 - L266)
Helper Function: check_region_with
To simplify the interaction between user pointers and address space providers, the library includes a check_region_with helper function:

```rust
fn check_region_with(
    mut aspace: impl AddrSpaceProvider,
    start: VirtAddr,
    layout: Layout,
    access_flags: MappingFlags,
) -> LinuxResult<()> {
    aspace.with_addr_space(|aspace| check_region(aspace, start, layout, access_flags))
}
```
This function takes an AddrSpaceProvider and delegates to the check_region function, simplifying the code in the user pointer methods.
Sources: src/lib.rs(L110 - L117)
Null-Terminated Data Handling
Special handling is provided for null-terminated data (like C strings) through the check_null_terminated function. This function safely traverses memory until it finds a null terminator, validating each page as needed.
flowchart TD A["check_null_terminated(aspace, start, access_flags)"] B["Check address alignment"] C["Return EFAULT"] D["Initialize variables"] E["Begin memory traversal loop"] F["Check if current position crosses page boundary"] G["Validate next page access permissions"] H["Return EFAULT"] I["Move to next page"] J["Read memory at current position"] K["Check if value is null terminator"] L["Return pointer and length"] M["Increment length"] A --> B B --> C B --> D D --> E E --> F F --> G F --> J G --> H G --> I I --> J J --> K K --> L K --> M M --> E
Sources: src/lib.rs(L56 - L107)
The check_null_terminated function uses the access_user_memory helper to set a per-CPU flag indicating that user memory is being accessed, allowing the kernel to handle page faults correctly.
Memory Access Context Management
To safely handle page faults during user memory access, the address space management system uses a per-CPU flag:
flowchart TD A["access_user_memory(f)"] B["Set ACCESSING_USER_MEM = true"] C["Execute closure f"] D["Set ACCESSING_USER_MEM = false"] E["Return result of f"] A --> B B --> C C --> D D --> E
Sources: src/lib.rs(L22 - L29) src/lib.rs(L11 - L20)
The is_accessing_user_memory() function provides a way for the OS to check whether a page fault occurred during a legitimate user memory access, allowing it to handle these faults differently from other kernel faults.
Sources: src/lib.rs(L14 - L20)
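The set-run-clear pattern shown in the flowchart above can be sketched in userspace Rust. A global `AtomicBool` stands in for the per-CPU `ACCESSING_USER_MEM` flag, which is an assumption of this sketch rather than the library's actual mechanism.

```rust
use std::sync::atomic::{AtomicBool, Ordering};

// Stand-in for the per-CPU ACCESSING_USER_MEM flag.
static ACCESSING_USER_MEM: AtomicBool = AtomicBool::new(false);

fn is_accessing_user_memory() -> bool {
    ACCESSING_USER_MEM.load(Ordering::Relaxed)
}

// Set the flag around the closure so a fault handler can tell
// user-memory accesses apart from ordinary kernel execution.
fn access_user_memory<R>(f: impl FnOnce() -> R) -> R {
    ACCESSING_USER_MEM.store(true, Ordering::Relaxed);
    let ret = f();
    ACCESSING_USER_MEM.store(false, Ordering::Relaxed);
    ret
}

fn main() {
    assert!(!is_accessing_user_memory());
    // The flag is visible only while the closure runs.
    assert!(access_user_memory(|| is_accessing_user_memory()));
    assert!(!is_accessing_user_memory());
}
```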
Implementation Notes
- The address space management layer is designed to be minimal yet flexible, providing only the necessary abstractions for safe user memory access
- The AddrSpaceProvider trait follows the resource acquisition is initialization (RAII) pattern, ensuring proper resource management
- Page table population is done lazily, only when memory is actually accessed
Sources: src/lib.rs(L119 - L126) src/lib.rs(L31 - L54)
Safety Mechanisms
Relevant source files
This document details the safety mechanisms implemented in the axptr library to prevent kernel crashes when accessing user memory. These mechanisms form a critical layer of protection for kernel code that needs to interact with user-space memory securely and robustly. For information about the basic types used for user memory access, see User Space Pointers and for address space abstractions, see Address Space Management.
Overview of Safety Layers
The axptr library implements multiple safety layers that work together to ensure user memory access is handled safely from kernel code.
flowchart TD A["Kernel Code"] B["UserPtr / UserConstPtr"] C1["Null Pointer Check"] C2["Memory Alignment"] C3["Access Permissions"] C4["Page Table Population"] C5["Page Fault Handling"] D["Safe Memory Access"] A --> B B --> C1 B --> C2 B --> C3 B --> C4 B --> C5 C1 --> D C2 --> D C3 --> D C4 --> D C5 --> D
Sources: src/lib.rs(L31 - L54) src/lib.rs(L11 - L29) src/lib.rs(L175 - L216) src/lib.rs(L258 - L302)
Memory Region Checking
Before any user memory access is permitted, axptr performs thorough validation of the memory region to be accessed.
Alignment Verification
sequenceDiagram participant KernelCode as "Kernel Code" participant UserPtrget as "UserPtr::get()" participant check_region as "check_region()" KernelCode ->> UserPtrget: Request memory access UserPtrget ->> check_region: Validate region check_region ->> check_region: Check alignment Note over check_region: if start.as_usize() & (align - 1) != 0 check_region -->> UserPtrget: Return EFAULT if misaligned
Sources: src/lib.rs(L37 - L40)
Memory alignment verification ensures that pointers are properly aligned for the data type being accessed. Misaligned memory access can cause hardware exceptions on some architectures or inefficient memory operations on others.
Access Permission Validation
sequenceDiagram participant check_region as "check_region()" participant AddrSpace as "AddrSpace" check_region ->> AddrSpace: check_region_access(range, flags) AddrSpace -->> check_region: true/false Note over check_region: Return EFAULT if access not permitted
Sources: src/lib.rs(L42 - L47)
The function checks that the memory range is accessible with the requested permissions (read-only or read-write). This prevents the kernel from attempting to access protected user memory regions.
Page Table Population
sequenceDiagram participant check_region as "check_region()" participant AddrSpace as "AddrSpace" check_region ->> check_region: Calculate page_start and page_end check_region ->> AddrSpace: populate_area(page_start, size) AddrSpace -->> check_region: Result (success/error) Note over check_region: Propagate error or proceed
Sources: src/lib.rs(L49 - L52)
The system ensures that page tables are populated for the entire memory region being accessed. This helps prevent page faults during access by pre-populating the necessary page tables.
Context-Aware Page Fault Handling
A critical safety feature is the context-aware page fault handling mechanism, which allows the OS to distinguish between legitimate page faults while accessing user memory and actual kernel bugs.
flowchart TD subgraph subGraph0["User Memory Access Context"] C["Set ACCESSING_USER_MEM = true"] D["Execute memory access"] E["Set ACCESSING_USER_MEM = false"] end A["Kernel Code"] B["access_user_memory()"] F["Page Fault Handler"] G["is_accessing_user_memory()?"] H["Handle as legitimateuser memory fault"] I["Handle as kernel bug(panic/oops)"] A --> B B --> C C --> D D --> E F --> G G --> H G --> I
Sources: src/lib.rs(L11 - L29) src/lib.rs(L73 - L104)
The system uses a per-CPU flag, ACCESSING_USER_MEM, to track whether kernel code is actively accessing user memory. This information is crucial for the OS's page fault handler, allowing it to:
- Properly handle page faults occurring during legitimate user memory access
- Correctly identify true kernel bugs that would otherwise cause crashes
This approach enables the kernel to access user memory regions that may trigger page faults (e.g., due to swapped pages) without crashing.
Null-Terminated Data Handling
The axptr library provides special handling for null-terminated data from user space, such as C-style strings.
flowchart TD subgraph subGraph0["Protected by access_user_memory()"] F["Process memory page by page"] G["More pagesto check?"] H["Check page permissions"] I["Permissiongranted?"] J["Move to next page"] K["Search for null terminator"] end A["Kernel Code"] B["UserPtr::get_as_null_terminated()"] C["check_null_terminated()"] D["Alignment Check"] E["Return EFAULT"] L["Return validatedslice to caller"] A --> B B --> C C --> D D --> E D --> F F --> G G --> H G --> K H --> I I --> E I --> J J --> G K --> L
Sources: src/lib.rs(L56 - L107) src/lib.rs(L202 - L217) src/lib.rs(L280 - L292) src/lib.rs(L294 - L303)
This process handles the special case of null-terminated data structures (like C strings) where the length is not known in advance. The implementation:
- Validates memory alignment
- Checks permissions page by page as it traverses the data
- Executes within the access_user_memory() context to safely handle potential page faults
- Returns a safe slice reference once validation is complete
Code Structure and Implementation
The table below summarizes how the safety mechanisms are implemented across various functions:
Safety Mechanism | Implementation | Key Functions |
---|---|---|
Null Pointer Detection | Built into UserPtr/UserConstPtr | is_null(), nullable() |
Alignment Verification | Check in check_region() function | check_region() |
Permission Validation | Validation via AddrSpace | check_region_access() |
Page Table Population | Ensures pages are ready for access | populate_area() |
Page Fault Protection | Per-CPU flag tracks access context | access_user_memory(), is_accessing_user_memory() |
Null-Terminated Data Handling | Special validation routine | check_null_terminated() |
Sources: src/lib.rs(L31 - L54) src/lib.rs(L56 - L107) src/lib.rs(L11 - L29) src/lib.rs(L158 - L169) src/lib.rs(L245 - L253)
Integration with Memory Access Functions
The safety mechanisms are integrated into all user memory access methods. The diagram below illustrates how UserPtr and UserConstPtr utilize these mechanisms:
classDiagram class UserPtr~T~ { +get(aspace) +get_as_slice(aspace, length) +get_as_null_terminated(aspace) } class UserConstPtr~T~ { +get(aspace) +get_as_slice(aspace, length) +get_as_null_terminated(aspace) +get_as_str() } class SafetyMechanisms { +check_region() +check_null_terminated() +access_user_memory() +is_accessing_user_memory() } UserPtr --> SafetyMechanisms : uses UserConstPtr --> SafetyMechanisms : uses
Sources: src/lib.rs(L175 - L198) src/lib.rs(L202 - L216) src/lib.rs(L258 - L277) src/lib.rs(L280 - L302)
Each access method (get(), get_as_slice(), etc.) applies the appropriate safety mechanisms before permitting memory access, ensuring that all user memory operations are properly validated and protected.
Error Handling
When safety checks fail, the system returns appropriate error codes to the caller rather than crashing:
- Misaligned memory: EFAULT
- Inaccessible memory regions: EFAULT
- Page table population failures: various errors propagated from the underlying system
- Invalid UTF-8 in strings: EILSEQ
This approach allows kernel code to gracefully handle user memory access failures without compromising system stability.
Sources: src/lib.rs(L39 - L40) src/lib.rs(L46 - L47) src/lib.rs(L301 - L302)
Memory Region Checking
Relevant source files
Purpose and Scope
This document explains the memory region checking mechanisms in the axptr library that validate user-space memory regions before they are accessed by kernel code. These validation mechanisms ensure memory safety by verifying alignment, access permissions, and page table population before allowing actual memory access. This is a critical component of the safety mechanisms in axptr.
For information about how page faults are handled during memory access, see Context-Aware Page Fault Handling.
Overview
Memory region checking is a multi-step validation process that occurs before any user-space memory access. This process ensures that the kernel does not crash when accessing potentially invalid memory regions.
flowchart TD A["Memory Access Request"] B["Alignment Check"] C["Return EFAULT"] D["Access Permission Check"] E["Page Table Population"] F["Return Error"] G["Safe Memory Access"] A --> B B --> C B --> D D --> C D --> E E --> F E --> G
Sources: src/lib.rs(L31 - L54)
Memory Region Checking Process
The memory region checking process happens in three main stages:
- Alignment Verification: Ensures the memory address aligns with the required alignment for the data type
- Access Permission Checking: Verifies the process has appropriate permissions for the memory region
- Page Table Population: Ensures pages are mapped into memory before access
Core Implementation
The central function for memory region checking is check_region, which takes an address space, starting address, memory layout, and access flags as parameters:
Sources: src/lib.rs(L31 - L54) src/lib.rs(L110 - L117)
Alignment Verification
The first check performed is alignment verification, which ensures that the memory address is properly aligned for the data type being accessed.
flowchart TD A["Memory Address"] B["Extract Alignment Requirement"] C["Address & (align - 1) == 0?"] D["Proceed to Next Check"] E["Return EFAULT"] A --> B B --> C C --> D C --> E
For a memory address to be properly aligned, the memory address modulo the alignment requirement must be zero. This is checked using the bitwise AND operation:
```rust
if start.as_usize() & (align - 1) != 0 {
    return Err(LinuxError::EFAULT);
}
```
Sources: src/lib.rs(L37 - L40) src/lib.rs(L61 - L64)
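A quick worked example of this test: for any power-of-two alignment, an address passes exactly when its low bits are zero.

```rust
// An address is aligned to `align` (a power of two) when
// addr & (align - 1) == 0.
fn is_aligned(addr: usize, align: usize) -> bool {
    addr & (align - 1) == 0
}

fn main() {
    assert!(is_aligned(0x1000, 8));  // low three bits are zero
    assert!(!is_aligned(0x1004, 8)); // 0x1004 & 0b111 == 0b100
    assert!(is_aligned(0x1004, 4));  // but fine for 4-byte alignment
}
```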
Access Permission Checking
The second check verifies that the memory region has the appropriate access permissions:
flowchart TD A["Create VirtAddrRange"] B["Call check_region_access()"] C["Return EFAULT"] D["Proceed to Page Table Population"] A --> B B --> C B --> D
The `check_region_access` method on the `AddrSpace` object determines whether the current process has the necessary permissions to access the memory range with the specified access flags. The access flags differ between `UserPtr` (READ | WRITE) and `UserConstPtr` (READ only).
Sources: src/lib.rs(L42 - L47) src/lib.rs(L137) src/lib.rs(L228)
Page Table Population
The final step is to ensure that the pages containing the memory region are mapped into physical memory:
flowchart TD A["Calculate Page Boundaries"] B["page_start = start.align_down_4k()"] C["page_end = (start + size).align_up_4k()"] D["Call populate_area()"] E["Return Ok(())"] F["Return Error"] A --> B B --> C C --> D D --> E D --> F
This step aligns the address range to page boundaries and calls `populate_area` to ensure that all necessary pages are mapped and available for access.
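For illustration, the boundary arithmetic can be reproduced with plain `usize` addresses; `align_down_4k`/`align_up_4k` below are stand-ins for the `memory_addr` methods of the same names, assuming 4 KiB pages:

```rust
const PAGE_SIZE: usize = 0x1000; // assuming 4 KiB pages

fn align_down_4k(addr: usize) -> usize {
    addr & !(PAGE_SIZE - 1)
}

fn align_up_4k(addr: usize) -> usize {
    (addr + PAGE_SIZE - 1) & !(PAGE_SIZE - 1)
}

fn main() {
    // A 16-byte object starting near the end of a page spans two pages.
    let start = 0x2FF8;
    let size = 16;
    let page_start = align_down_4k(start);
    let page_end = align_up_4k(start + size);
    assert_eq!(page_start, 0x2000);
    assert_eq!(page_end, 0x4000);
    // populate_area would then be asked to map page_end - page_start bytes.
    assert_eq!(page_end - page_start, 2 * PAGE_SIZE);
}
```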
Sources: src/lib.rs(L49 - L53)
Null-Terminated Data Handling
A specialized checking mechanism exists for null-terminated data like C strings:
flowchart TD A["Check Alignment"] B["Return EFAULT"] C["Process Page by Page"] D["Check Page Access Permissions"] E["Move to Next Page if Needed"] F["Read Memory and Check for Null Terminator"] G["Return Pointer and Length"] H["Increment Length"] A --> B A --> C C --> D D --> B D --> E E --> F F --> G F --> H H --> C
The `check_null_terminated` function scans memory page by page, checking access permissions for each page, until it finds the null terminator. It is used by the `get_as_null_terminated` methods on both pointer types.
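Conceptually, the scan amounts to finding the first element equal to the type's default value. A minimal stand-alone illustration (the `null_terminated_len` helper is hypothetical, not part of the library):

```rust
/// Hypothetical helper mirroring the scan: length of the sequence before the
/// first "null" (Default) value, or None if no terminator is present.
fn null_terminated_len<T: Eq + Default>(data: &[T]) -> Option<usize> {
    let terminator = T::default();
    data.iter().position(|v| *v == terminator)
}

fn main() {
    let c_string: &[u8] = b"hello\0garbage";
    assert_eq!(null_terminated_len(c_string), Some(5)); // terminator excluded
    let ints = [3i32, 1, 4, 0, 9];
    assert_eq!(null_terminated_len(&ints), Some(3)); // any Eq + Default type works
}
```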
Sources: src/lib.rs(L56 - L107) src/lib.rs(L204 - L217) src/lib.rs(L282 - L291)
Integration With User Pointer Types
Memory region checking is integrated into the `UserPtr` and `UserConstPtr` types through their access methods:
Each access method performs the appropriate memory region checks before allowing access to the memory.
Access Method | Purpose | Checks Performed |
---|---|---|
get() | Access a single item | Alignment, permissions, page population |
get_as_slice() | Access an array of items | Alignment, permissions, page population |
get_as_null_terminated() | Access a null-terminated array | Alignment, permissions, page-by-page scanning |
get_as_str() | Access a C string (UserConstPtr only) | All checks from get_as_null_terminated() plus UTF-8 validation |
Sources: src/lib.rs(L175 - L183) src/lib.rs(L186 - L198) src/lib.rs(L204 - L217) src/lib.rs(L258 - L266) src/lib.rs(L269 - L277) src/lib.rs(L282 - L291) src/lib.rs(L296 - L302)
Error Handling
Memory region checking functions propagate errors using the `LinuxResult` type. The primary error returned is `LinuxError::EFAULT`, which indicates an invalid address or permission error:
Error Condition | Error Value |
---|---|
Misaligned address | EFAULT |
Access permission denied | EFAULT |
Page population failure | (Propagated from populate_area) |
Invalid UTF-8 in string (for get_as_str) | EILSEQ |
Memory region checking ensures that these errors are detected before any actual memory access occurs, preventing kernel crashes.
Sources: src/lib.rs(L39) src/lib.rs(L46) src/lib.rs(L51) src/lib.rs(L301)
Relationship with Address Space Management
Memory region checking relies on the address space management capabilities provided by the `AddrSpace` type:
sequenceDiagram participant UserPtrUserConstPtr as UserPtr/UserConstPtr participant check_region_with as check_region_with() participant AddrSpaceProvider as AddrSpaceProvider participant check_region as check_region() participant AddrSpace as AddrSpace UserPtrUserConstPtr ->> check_region_with: "Request memory check" check_region_with ->> AddrSpaceProvider: "with_addr_space()" AddrSpaceProvider ->> check_region: "Provide AddrSpace" check_region ->> AddrSpace: "check_region_access()" AddrSpace -->> check_region: "Return access status" check_region ->> AddrSpace: "populate_area()" AddrSpace -->> check_region: "Return population status" check_region -->> check_region_with: "Return check result" check_region_with -->> UserPtrUserConstPtr: "Return check result"
The `AddrSpaceProvider` trait abstracts the process of obtaining an `AddrSpace` object, which provides the necessary methods for checking access permissions and populating page tables.
For more information about address space management, see Address Space Management.
Sources: src/lib.rs(L110 - L126)
Performance Considerations
Memory region checking adds overhead to each user memory access, but this overhead is necessary to maintain memory safety. The implementation includes some optimizations:
- Alignment checks are performed first as they are the cheapest
- Permission checks are done before attempting to populate page tables
- Page population is done at page granularity to minimize the number of operations
For null-terminated data, the checking is more complex and potentially more expensive, as it must scan the data page by page until it finds the null terminator.
Sources: src/lib.rs(L31 - L54) src/lib.rs(L56 - L107)
Context-Aware Page Fault Handling
Relevant source files
Purpose and Scope
This document explains the mechanism used by axptr to safely handle page faults that may occur when the kernel accesses user space memory. The system uses a per-CPU flag to inform the operating system when user memory access is in progress, allowing the kernel to differentiate between legitimate page faults during user memory access and actual kernel bugs. For information about how memory regions are validated before access, see Memory Region Checking.
The Challenge of Accessing User Memory
When kernel code accesses user space memory, multiple issues can arise:
- The memory might not be currently mapped (page fault)
- The user process might have just freed the memory
- The user might have provided an invalid pointer
Without proper handling, these scenarios would cause a kernel panic, as page faults in kernel mode are typically considered fatal errors. Context-aware page fault handling provides a solution to this problem.
flowchart TD A["Kernel Code"] B["User Space Memory"] C["Challenge: Memory might not be mapped"] D["Challenge: Memory might be invalid"] E["Challenge: Page fault in kernel mode = crash"] F["Solution: Context-Aware Page Fault Handling"] A --> B B --> C B --> D B --> E C --> F D --> F E --> F
Sources: src/lib.rs(L11 - L20)
ACCESSING_USER_MEM Flag
The core of this mechanism is a per-CPU boolean flag named `ACCESSING_USER_MEM`. This flag indicates whether the kernel is currently accessing user memory, allowing the page fault handler to make an informed decision about how to respond to a page fault.
#[percpu::def_percpu]
static mut ACCESSING_USER_MEM: bool = false;
This flag is:
- Defined as a per-CPU variable, so each CPU core has its own instance
- Initially set to `false`
- Set to `true` immediately before accessing user memory
- Reset to `false` after the access is complete
The operating system checks this flag when a page fault occurs to determine whether to treat it as a legitimate page fault (allowing recovery) or as a kernel bug (triggering a panic).
Sources: src/lib.rs(L11 - L12)
How Context-Aware Handling Works
The context-aware page fault handling process follows these steps:
sequenceDiagram participant KernelCode as "Kernel Code" participant access_user_memory as "access_user_memory()" participant ACCESSING_USER_MEMFlag as "ACCESSING_USER_MEM Flag" participant PageFaultHandler as "Page Fault Handler" participant UserSpaceMemory as "User Space Memory" KernelCode ->> access_user_memory: Call with closure to access user memory access_user_memory ->> ACCESSING_USER_MEMFlag: Set flag to true access_user_memory ->> UserSpaceMemory: Access user memory alt Memory access causes page fault UserSpaceMemory ->> PageFaultHandler: Trigger page fault PageFaultHandler ->> ACCESSING_USER_MEMFlag: Check flag ACCESSING_USER_MEMFlag ->> PageFaultHandler: Flag = true (accessing user memory) PageFaultHandler ->> UserSpaceMemory: Handle fault (map page, etc.) UserSpaceMemory ->> access_user_memory: Continue execution end access_user_memory ->> ACCESSING_USER_MEMFlag: Set flag to false access_user_memory ->> KernelCode: Return result
Sources: src/lib.rs(L22 - L29)
Implementation Details
The is_accessing_user_memory Function
This function allows the operating system's page fault handler to check whether a page fault occurred during legitimate user memory access:
pub fn is_accessing_user_memory() -> bool {
    ACCESSING_USER_MEM.read_current()
}
As the documentation states: "OS implementation shall allow page faults from kernel when this function returns true."
Sources: src/lib.rs(L14 - L20)
The access_user_memory Function
This function manages the context flag around a closure that accesses user memory:
fn access_user_memory<R>(f: impl FnOnce() -> R) -> R {
ACCESSING_USER_MEM.with_current(|v| {
*v = true;
let result = f();
*v = false;
result
})
}
Key points:
- Takes a closure `f` that performs the actual user memory access
- Sets the flag before executing the closure
- Captures the result from the closure
- Clears the flag after execution
- Returns the result
Sources: src/lib.rs(L22 - L29)
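The same guard pattern can be demonstrated outside the kernel with a `thread_local!` `Cell<bool>` standing in for the per-CPU variable — a sketch of the mechanism, not the library's actual `percpu`-based implementation:

```rust
use std::cell::Cell;

thread_local! {
    // Stand-in for the per-CPU ACCESSING_USER_MEM flag.
    static ACCESSING_USER_MEM: Cell<bool> = Cell::new(false);
}

fn is_accessing_user_memory() -> bool {
    ACCESSING_USER_MEM.with(|v| v.get())
}

fn access_user_memory<R>(f: impl FnOnce() -> R) -> R {
    ACCESSING_USER_MEM.with(|v| {
        v.set(true);       // page faults are now "expected"
        let result = f();  // perform the user-memory access
        v.set(false);      // back to normal: kernel faults are bugs again
        result
    })
}

fn main() {
    assert!(!is_accessing_user_memory());
    let r = access_user_memory(|| {
        // A fault handler running here would observe the flag set.
        assert!(is_accessing_user_memory());
        42
    });
    assert_eq!(r, 42);
    assert!(!is_accessing_user_memory());
}
```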
Integration with Memory Access Functions
The context-aware page fault handling is primarily used when accessing potentially problematic user memory, such as when reading null-terminated arrays or strings.
Example: Null-Terminated Data Handling
The `check_null_terminated` function uses this mechanism to safely scan user memory for a null terminator:
flowchart TD subgraph subGraph0["Protected Region"] F["Scan memory looking for null terminator"] G["Check next page access permissions"] H["Continue scanning"] I["Return EFAULT"] J["Break loop"] end A["check_null_terminated()"] B["Validate alignment"] C["Prepare for scanning"] D["Call access_user_memory()"] E["Set ACCESSING_USER_MEM flag"] K["Clear ACCESSING_USER_MEM flag"] L["Return pointer and length"] A --> B B --> C C --> D D --> E E --> F F --> G G --> H G --> I H --> F H --> J I --> K J --> K K --> L
The function:
- Validates the initial alignment of the memory region
- Sets up scanning variables
- Crucially, wraps the scanning loop in `access_user_memory()`
- Within the protected region, handles page boundaries and potential faults
- Returns a pointer and length when successful
Sources: src/lib.rs(L56 - L107)
System Interactions
Here's how the context-aware page fault handling interacts with different system components:
Sources: src/lib.rs(L11 - L29) src/lib.rs(L56 - L107) src/lib.rs(L204 - L216) src/lib.rs(L282 - L291)
Example Use Case
Consider what happens when a kernel function tries to access a user-provided null-terminated string that spans multiple pages, where some pages might not be mapped yet:
Step | Description | Flag State | System Behavior |
---|---|---|---|
1 | User calls kernel with string pointer | false | Normal operation |
2 | Kernel callsUserConstPtr::get_as_str() | false | Normal operation |
3 | access_user_memory()is called | true | Prepared for potential page faults |
4 | Memory is accessed, causing page fault | true | OS handles fault instead of panicking |
5 | OS maps the page | true | Execution continues |
6 | String scan completes | false (reset) | Return to normal operation |
Without this mechanism, any unmapped page in the user string would crash the kernel, even if the user's access was legitimate.
Sources: src/lib.rs(L295 - L302)
Key Benefits
- Safety: Prevents kernel crashes from legitimate user memory accesses
- Transparency: Kernel code can access user memory without explicit fault handling
- Efficiency: No need for complex user/kernel copying mechanisms
- Robustness: Properly handles both valid and invalid memory access scenarios
Sources: src/lib.rs(L11 - L29)
Null-Terminated Data Handling
Relevant source files
Purpose and Scope
This document explains how the axptr library safely handles null-terminated data structures in user memory, such as C-style strings and arrays. These special data structures have variable length and are terminated by a sentinel "null" value rather than having an explicit length parameter. For information about general memory region checking, see Memory Region Checking.
Overview
Null-terminated data structures present unique challenges for safe memory access. Unlike fixed-size arrays, their length cannot be determined without scanning the memory until a null terminator is found. This requires special handling to ensure memory safety while efficiently accessing these structures.
flowchart TD subgraph subGraph0["Null-terminated Data Handling"] C["check_null_terminated()"] D["Alignment Verification"] E["Page-by-Page Scan"] F["Return validated pointer + length"] end A["Kernel Code"] B["User Memory Pointer"] G["Safe Access Methods"] H["Null-terminated arrays"] I["C-strings"] A --> B B --> C B --> G C --> D D --> E E --> F G --> H G --> I
Sources: src/lib.rs(L56 - L107) src/lib.rs(L204 - L217) src/lib.rs(L282 - L292) src/lib.rs(L294 - L303)
Core Mechanism
The axptr library implements a specialized mechanism for safely handling null-terminated data from user space. This is performed by the `check_null_terminated` function.
sequenceDiagram participant KernelCode as "Kernel Code" participant UserPtrUserConstPtr as "UserPtr/UserConstPtr" participant check_null_terminated as "check_null_terminated()" participant UserMemory as "User Memory" KernelCode ->> UserPtrUserConstPtr: get_as_null_terminated(aspace) UserPtrUserConstPtr ->> check_null_terminated: check address space & memory check_null_terminated ->> check_null_terminated: Check alignment check_null_terminated ->> check_null_terminated: Set up page tracking loop For each element until null terminator check_null_terminated ->> check_null_terminated: Check if current position crosses page boundary alt Crosses page boundary check_null_terminated ->> check_null_terminated: Check if new page is accessible check_null_terminated ->> check_null_terminated: Move to next page end check_null_terminated ->> UserMemory: Read memory (with fault handling) UserMemory -->> check_null_terminated: Return value alt Value equals terminator check_null_terminated ->> check_null_terminated: Stop scanning else Value not terminator check_null_terminated ->> check_null_terminated: Increment position & counter end end check_null_terminated ->> UserPtrUserConstPtr: Return pointer & length UserPtrUserConstPtr ->> KernelCode: Return safe slice reference
Sources: src/lib.rs(L56 - L107)
Memory Layout Processing
The function processes null-terminated data by checking memory one page at a time, efficiently handling arbitrarily long data structures without needing to know their size in advance.
- Alignment Check: Ensures the starting address is properly aligned for the specified type.
- Page-by-Page Processing: Handles memory in page-sized chunks, validating each page before access.
- Safe Memory Reading: Uses the `access_user_memory` function to safely read user memory with proper fault handling.
- Terminator Detection: Scans until it finds the terminator value (the default value of type T).
The function returns a raw pointer to the start of the data and its length (excluding the terminator).
Sources: src/lib.rs(L56 - L107)
Access Methods for Null-Terminated Data
The library provides specialized methods for both `UserPtr<T>` and `UserConstPtr<T>` to handle null-terminated data.
Methods for UserPtr
`UserPtr<T>` provides the `get_as_null_terminated` method for accessing mutable null-terminated arrays:
For types that implement `Eq + Default`, this method:
- Calls `check_null_terminated` with the appropriate access flags
- Converts the raw pointer and length into a safe mutable slice
- Returns the slice wrapped in a `LinuxResult`
Sources: src/lib.rs(L204 - L217)
Methods for UserConstPtr
Similarly, `UserConstPtr<T>` provides a read-only version of the same functionality:
Sources: src/lib.rs(L282 - L292)
C-String Handling
The library includes specialized handling for C-style strings through the `get_as_str` method on `UserConstPtr<c_char>`.
Processing Flow
flowchart TD A["UserConstPtr"] B["get_as_null_terminated()"] C["Memory transmute to &[u8]"] D["str::from_utf8()"] E["Return &str"] F["Return EILSEQ error"] A --> B B --> C C --> D D --> E D --> F
Sources: src/lib.rs(L294 - L303)
This method:
- Gets the null-terminated array of `c_char` characters
- Transmutes the slice from `&[c_char]` to `&[u8]` (safe since `c_char` is one byte)
- Attempts to parse the byte slice as a UTF-8 string
- Returns either a valid string slice or an error if the string is not valid UTF-8
Sources: src/lib.rs(L294 - L303)
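The validation step maps directly onto the standard library's `str::from_utf8`. A minimal illustration of the success and failure cases (the mapping of the failure to `EILSEQ` happens inside `get_as_str` itself):

```rust
fn main() {
    // Valid UTF-8: get_as_str would return Ok(&str).
    assert_eq!(std::str::from_utf8(b"hello"), Ok("hello"));
    // Invalid UTF-8: this failure is what axptr maps to EILSEQ.
    assert!(std::str::from_utf8(&[0xFF, 0xFE]).is_err());
}
```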
Technical Implementation Details
Accessing User Memory Safely
The `check_null_terminated` function uses the `access_user_memory` helper to safely access user memory while handling page faults properly. This ensures that:
- The `ACCESSING_USER_MEM` flag is set to true during memory access
- The flag is reset to false after the operation completes
Type Constraints
The null-terminated handling functions require that the type `T` implements both:
- `Eq` - to compare values for equality with the terminator
- `Default` - to create the terminator value (usually zero/null)
Memory Safety Guarantees
The null-terminated data handling system provides the following safety guarantees:
Aspect | Guarantee |
---|---|
Memory Alignment | Ensures the pointer is properly aligned for type T |
Access Permissions | Verifies each page has appropriate read/write permissions |
Page Faults | Handles page faults during user memory access |
Memory Boundaries | Safely traverses page boundaries |
Data Validation | Ensures data is properly terminated |
UTF-8 Validation | Validates UTF-8 encoding for strings |
Sources: src/lib.rs(L56 - L107) src/lib.rs(L204 - L217) src/lib.rs(L282 - L292) src/lib.rs(L294 - L303)
Practical Considerations
Performance Characteristics
Scanning for null terminators can potentially traverse many pages of memory, especially for long strings or arrays. The implementation optimizes this by:
- Checking page boundaries only when necessary
- Validating permissions at the page level, not for each element
- Using volatile reads for maximum safety with minimal overhead
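Volatile reads can be illustrated on an ordinary buffer. `read_volatile` forces an actual load on every iteration instead of letting the compiler cache or reorder the reads, which matters when the underlying memory can be changed by user code (illustrative only; the real scan operates on user-space addresses):

```rust
/// Count bytes before the first 0, using volatile reads.
/// `data` must contain a 0 byte, or the scan runs past the slice.
fn scan_len(data: &[u8]) -> usize {
    let mut len = 0;
    unsafe {
        let mut p = data.as_ptr();
        // read_volatile performs a fresh load each time through the loop.
        while std::ptr::read_volatile(p) != 0 {
            len += 1;
            p = p.add(1);
        }
    }
    len
}

fn main() {
    assert_eq!(scan_len(&[b'a', b'b', 0, b'z']), 2);
}
```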
Error Handling
The null-terminated data methods return `LinuxResult` values with appropriate error codes:
- `EFAULT` - if memory is inaccessible or improperly aligned
- `EILSEQ` - if string data is not valid UTF-8 (for `get_as_str`)
Sources: src/lib.rs(L56 - L107) src/lib.rs(L294 - L303)
Integration with Operating System
Relevant source files
This page documents how the `axptr` library integrates with the underlying operating system components to provide safe user memory access from kernel code. We cover the dependency architecture, memory management integration, error handling, and page fault coordination that enable `axptr` to function within a broader OS environment.
For information about the core pointer types and their usage, see User Space Pointers. For details on safety mechanisms, see Safety Mechanisms.
Dependency Architecture
The `axptr` library depends on several OS components to provide its functionality:
flowchart TD subgraph subGraph1["OS Integration Points"] axmm["axmm: Memory Management"] axerrno["axerrno: Error Handling"] page_table["page_table_multiarch: Page Tables"] memory_addr["memory_addr: Address Types"] percpu["percpu: Per-CPU Variables"] end subgraph subGraph0["axptr Components"] userptr["UserPtr/UserConstPtr"] addrspace_provider["AddrSpaceProvider trait"] fault_handler["Page Fault Coordination"] end kernel["Kernel Memory Subsystem"] kernel_errors["Kernel Error Handling"] arch_mm["Architecture-specific Memory Management"] kernel_smp["Kernel SMP Support"] addrspace_provider --> axmm axerrno --> kernel_errors axmm --> kernel fault_handler --> percpu page_table --> arch_mm percpu --> kernel_smp userptr --> axerrno userptr --> axmm userptr --> memory_addr userptr --> page_table
The diagram illustrates how `axptr` interfaces with various operating system components through its dependencies. These dependencies allow `axptr` to leverage the kernel's existing infrastructure for memory management, error handling, and multi-core support.
Sources: Cargo.toml(L7 - L12) src/lib.rs(L4 - L11)
Memory Management Integration
`axptr` integrates with the OS memory management system primarily through the `axmm` crate, which provides the `AddrSpace` abstraction. This integration enables `axptr` to:
- Check permissions for memory regions
- Populate page tables as needed
- Enforce proper memory alignment
- Handle page faults gracefully
Address Space Provider Mechanism
The `AddrSpaceProvider` trait serves as the primary integration point between `axptr` and the OS memory management subsystem:
The trait is designed to be simple enough that OS-specific implementations can easily provide access to the appropriate address space, while still allowing for thread-safety and context-specific behavior.
Sources: src/lib.rs(L119 - L126)
Memory Access Workflow
When kernel code attempts to access user memory through `UserPtr` or `UserConstPtr`, the following sequence occurs:
sequenceDiagram participant KernelCode as "Kernel Code" participant UserPtrmethods as "UserPtr methods" participant check_regionfunction as "check_region function" participant AddrSpaceProvider as "AddrSpaceProvider" participant AddrSpaceOS as "AddrSpace (OS)" participant PageFaultHandlerOS as "Page Fault Handler (OS)" KernelCode ->> UserPtrmethods: get(aspace) UserPtrmethods ->> check_regionfunction: check_region_with(aspace, addr, layout, flags) check_regionfunction ->> AddrSpaceProvider: with_addr_space(lambda) AddrSpaceProvider ->> AddrSpaceOS: lambda(aspace) AddrSpaceOS ->> AddrSpaceOS: check_region_access(range, flags) AddrSpaceOS ->> AddrSpaceOS: populate_area(page_start, page_end - page_start) AddrSpaceOS -->> check_regionfunction: Result check_regionfunction -->> UserPtrmethods: Result alt Success UserPtrmethods ->> UserPtrmethods: Set ACCESSING_USER_MEM flag UserPtrmethods ->> KernelCode: Return memory reference Note over KernelCode,UserPtrmethods: During access, page fault may occur KernelCode ->> PageFaultHandlerOS: (Page fault in kernel mode) PageFaultHandlerOS ->> PageFaultHandlerOS: Check is_accessing_user_memory() PageFaultHandlerOS ->> KernelCode: Handle fault appropriately UserPtrmethods ->> UserPtrmethods: Clear ACCESSING_USER_MEM flag else Failure UserPtrmethods ->> KernelCode: Return error (EFAULT, etc.) end
This workflow demonstrates how `axptr` coordinates with the OS memory management subsystem to safely access user memory, involving permission checks, page table population, and page fault handling.
Sources: src/lib.rs(L31 - L54) src/lib.rs(L175 - L198) src/lib.rs(L258 - L277)
Error Handling Integration
`axptr` uses the `axerrno` crate for Linux-compatible error codes. This integration ensures that errors from user memory access operations can be properly propagated to OS-specific error handling systems.
The primary error codes used by `axptr` include:
Error Code | Description | Usage in axptr |
---|---|---|
EFAULT | Bad address | Returned for misaligned or inaccessible memory regions |
EILSEQ | Illegal byte sequence | Returned when string conversion fails in get_as_str |
The error handling flow integrates with the OS through the `LinuxResult` type, which is a `Result<T, LinuxError>` that can be directly used by OS components or converted to OS-specific error types.
Sources: src/lib.rs(L4) src/lib.rs(L36 - L47) src/lib.rs(L301)
Page Fault Coordination
One of the most critical aspects of OS integration is the coordination between `axptr` and the OS page fault handler. This is achieved through the `ACCESSING_USER_MEM` per-CPU flag:
flowchart TD start["Kernel Code accesses user memory"] access_fn["access_user_memory() function"] set_flag["Set ACCESSING_USER_MEM = true"] memory_op["Perform memory operation"] page_fault["Page fault occurs"] os_handler["OS Page Fault Handler"] check_flag["Check is_accessing_user_memory()"] special_handling["Handle as user memory access"] kernel_crash["Handle as kernel bug"] clear_flag["Set ACCESSING_USER_MEM = false"] end_access["Return result"] access_fn --> set_flag check_flag --> kernel_crash check_flag --> special_handling clear_flag --> end_access memory_op --> clear_flag memory_op --> page_fault os_handler --> check_flag page_fault --> os_handler set_flag --> memory_op special_handling --> memory_op start --> access_fn
This mechanism requires the OS page fault handler to check `is_accessing_user_memory()` when a page fault occurs in kernel mode. If true, the fault should be treated as a normal user memory access that may require page table updates or signal delivery. If false, it should be treated as a bug in the kernel.
Sources: src/lib.rs(L11 - L29) src/lib.rs(L73 - L104)
Implementation of AddrSpaceProvider
Operating systems integrating with `axptr` must provide an implementation of the `AddrSpaceProvider` trait. The library provides a simple implementation for `&mut AddrSpace`, but OS-specific implementations might include:
- Process-specific address space providers
- Thread-specific address space providers
- Providers that switch to user address spaces temporarily
The implementation should ensure that:
- The correct address space is used for the current context
- Any necessary locking or synchronization is handled
- The address space remains valid throughout the operation
Example of the default implementation:
impl AddrSpaceProvider for &mut AddrSpace {
    fn with_addr_space<R>(&mut self, f: impl FnOnce(&mut AddrSpace) -> R) -> R {
        f(self)
    }
}
Sources: src/lib.rs(L119 - L126)
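A hypothetical process-owned provider might look like the following sketch, where `AddrSpace`, `Process`, and the mutex are mock stand-ins illustrating how an OS could add locking around the callback:

```rust
use std::sync::Mutex;

// Mock stand-ins for axmm's AddrSpace and axptr's AddrSpaceProvider trait.
struct AddrSpace;

trait AddrSpaceProvider {
    fn with_addr_space<R>(&mut self, f: impl FnOnce(&mut AddrSpace) -> R) -> R;
}

// Hypothetical process type that owns its address space behind a lock.
struct Process {
    aspace: Mutex<AddrSpace>,
}

impl AddrSpaceProvider for &Process {
    fn with_addr_space<R>(&mut self, f: impl FnOnce(&mut AddrSpace) -> R) -> R {
        // Synchronization is handled here, so callers never touch the lock.
        let mut guard = self.aspace.lock().unwrap();
        f(&mut guard)
    }
}

fn main() {
    let process = Process { aspace: Mutex::new(AddrSpace) };
    let mut provider = &process;
    let ok = provider.with_addr_space(|_aspace| true);
    assert!(ok);
}
```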
Dependency Requirements
The operating system must provide or accommodate the following components for proper integration with `axptr`:
Dependency | Required Features |
---|---|
axmm | AddrSpace implementation with check_region_access and populate_area methods |
page_table_multiarch | Support for mapping flags (READ, WRITE) |
memory_addr | Address types and manipulation (VirtAddr, VirtAddrRange) |
percpu | Per-CPU variable support for the ACCESSING_USER_MEM flag |
axerrno | Linux-compatible error codes |
Each dependency provides essential functionality that `axptr` relies on to safely access user memory. The operating system must ensure these dependencies are properly implemented and available.
Sources: Cargo.toml(L7 - L12) src/lib.rs(L4 - L11)
Integration Example Flow
The complete flow of integration between `axptr` and the operating system for a typical user memory access operation:
sequenceDiagram participant KernelCode as "Kernel Code" participant UserPtr as "UserPtr" participant OSAddrSpaceProvider as "OS AddrSpaceProvider" participant OSAddrSpace as "OS AddrSpace" participant OSPageTables as "OS Page Tables" participant OSPageFaultHandler as "OS Page Fault Handler" KernelCode ->> UserPtr: get(os_provider) UserPtr ->> OSAddrSpaceProvider: with_addr_space(lambda) OSAddrSpaceProvider ->> OSAddrSpace: Acquire address space OSAddrSpaceProvider ->> UserPtr: Execute lambda with address space UserPtr ->> OSAddrSpace: check_region_access(range, flags) OSAddrSpace ->> OSAddrSpace: Validate permissions UserPtr ->> OSAddrSpace: populate_area(page_start, size) OSAddrSpace ->> OSPageTables: Ensure pages are populated UserPtr ->> UserPtr: Set ACCESSING_USER_MEM = true UserPtr ->> KernelCode: Return reference to user memory KernelCode ->> KernelCode: Access user memory alt Page Fault Occurs KernelCode -->> OSPageFaultHandler: Page fault exception OSPageFaultHandler ->> OSPageFaultHandler: Call is_accessing_user_memory() OSPageFaultHandler ->> OSPageTables: Handle fault (map page, etc.) OSPageFaultHandler -->> KernelCode: Resume execution end KernelCode ->> UserPtr: Memory access complete UserPtr ->> UserPtr: Set ACCESSING_USER_MEM = false
This diagram illustrates the complete integration flow, showing how various OS components interact with `axptr` during a user memory access operation, including the handling of page faults.
Sources: src/lib.rs(L18 - L29) src/lib.rs(L31 - L54) src/lib.rs(L175 - L198)
API Reference
Relevant source files
This page provides a comprehensive reference for the axptr library, which offers safe abstractions for accessing user-space memory from kernel code. The API is designed to prevent memory-related security vulnerabilities and crashes that can occur when kernel code interacts with potentially unsafe user memory.
For architectural concepts and safety mechanisms, refer to Memory Safety Architecture and Safety Mechanisms.
API Components Overview
Sources: src/lib.rs(L119 - L126) src/lib.rs(L128 - L217) src/lib.rs(L219 - L303) src/lib.rs(L18 - L20)
Core Types
UserPtr
`UserPtr<T>` is a wrapper around a raw mutable pointer (`*mut T`) to user-space memory. It provides safe methods to access and manipulate user memory with validation checks.
flowchart TD A["Kernel Code"] B["UserPtr"] C["check_region()"] D["Access Permission Check"] E["Alignment Check"] F["Page Table Population"] G["User Memory"] A --> B B --> C B --> G C --> D C --> E C --> F
Sources: src/lib.rs(L128 - L217)
Constants
Constant | Type | Description |
---|---|---|
ACCESS_FLAGS | MappingFlags | Read and write access flags for the pointer (MappingFlags::READ.union(MappingFlags::WRITE)) |
Sources: src/lib.rs(L137)
Methods
Method | Signature | Description |
---|---|---|
address | fn address(&self) -> VirtAddr | Returns the virtual address of the pointer |
as_ptr | unsafe fn as_ptr(&self) -> *mut T | Unwraps the pointer into a raw pointer (unsafe) |
cast | fn cast(self) -> UserPtr | Casts the pointer to a different type |
is_null | fn is_null(&self) -> bool | Checks if the pointer is null |
nullable | fn nullable(self) -> Option | Converts the pointer to an Option, returning None if null |
get | fn get(&mut self, aspace: impl AddrSpaceProvider) -> LinuxResult<&mut T> | Safely accesses the value, validating the memory region |
get_as_slice | fn get_as_slice(&mut self, aspace: impl AddrSpaceProvider, length: usize) -> LinuxResult<&mut [T]> | Gets the value as a slice of specified length |
get_as_null_terminated | fn get_as_null_terminated(&mut self, aspace: impl AddrSpaceProvider) -> LinuxResult<&mut [T]> | Gets the value as a slice terminated by a null value |
Sources: src/lib.rs(L136 - L169) src/lib.rs(L171 - L198) src/lib.rs(L201 - L217)
UserConstPtr
`UserConstPtr<T>` is a wrapper around a raw constant pointer (`*const T`) to user-space memory. It provides similar functionality to `UserPtr<T>` but for read-only access.
flowchart TD A["Kernel Code"] B["UserConstPtr"] C["check_region()"] D["Access Permission Check"] E["Alignment Check"] F["Page Table Population"] G["User Memory (read-only)"] A --> B B --> C B --> G C --> D C --> E C --> F
Sources: src/lib.rs(L219 - L303)
Constants
Constant | Type | Description |
---|---|---|
ACCESS_FLAGS | MappingFlags | Read-only access flags for the pointer (MappingFlags::READ) |
Sources: src/lib.rs(L228)
Methods
Method | Signature | Description |
---|---|---|
address | fn address(&self) -> VirtAddr | Returns the virtual address of the pointer |
as_ptr | unsafe fn as_ptr(&self) -> *const T | Unwraps the pointer into a raw pointer (unsafe) |
cast | fn cast(self) -> UserConstPtr | Casts the pointer to a different type |
is_null | fn is_null(&self) -> bool | Checks if the pointer is null |
nullable | fn nullable(self) -> Option | Converts the pointer to an Option, returning None if null |
get | fn get(&self, aspace: impl AddrSpaceProvider) -> LinuxResult<&T> | Safely accesses the value, validating the memory region |
get_as_slice | fn get_as_slice(&self, aspace: impl AddrSpaceProvider, length: usize) -> LinuxResult<&[T]> | Gets the value as a slice of specified length |
get_as_null_terminated | fn get_as_null_terminated(&self, aspace: impl AddrSpaceProvider) -> LinuxResult<&[T]> | Gets the value as a slice terminated by a null value |
Sources: src/lib.rs(L227 - L254) src/lib.rs(L256 - L278) src/lib.rs(L280 - L292)
Special Methods for UserConstPtr<c_char>
UserConstPtr<c_char>
has an additional method for working with strings:
Method | Signature | Description |
---|---|---|
get_as_str | fn get_as_str(&self, aspace: impl AddrSpaceProvider) -> LinuxResult<&'static str> | Gets the pointer as a Rust string, validating UTF-8 encoding |
Sources: src/lib.rs(L294 - L303)
AddrSpaceProvider Trait
The AddrSpaceProvider
trait is used to abstract the address space operations used by both pointer types. It provides a way to access the underlying address space.
Sources: src/lib.rs(L119 - L126)
Methods
Method | Signature | Description |
---|---|---|
with_addr_space | fn with_addr_space | Provides a reference to the address space for use with a callback function |
Sources: src/lib.rs(L119 - L121)
Helper Functions
The axptr library provides utility functions for working with user-space memory:
Function | Signature | Description |
---|---|---|
is_accessing_user_memory | fn is_accessing_user_memory() -> bool | Checks if we are currently accessing user memory, used for page fault handling |
access_user_memory | fn access_user_memory<R>(f: impl FnOnce() -> R) -> R | Internal function that sets a flag while a user-memory access callback runs |
Sources: src/lib.rs(L11 - L29)
Memory Access Process
The diagram below illustrates the process that occurs when kernel code attempts to access user memory through the axptr API:
sequenceDiagram participant KernelCode as Kernel Code participant UserPtrUserConstPtr as UserPtr/UserConstPtr participant check_region as check_region() participant AddrSpace as AddrSpace participant UserMemory as User Memory KernelCode ->> UserPtrUserConstPtr: get(...)/get_as_slice(...)/etc. UserPtrUserConstPtr ->> check_region: check_region_with(...) check_region ->> AddrSpace: check_region_access check_region ->> AddrSpace: populate_area alt Region is valid AddrSpace -->> check_region: Ok(()) check_region -->> UserPtrUserConstPtr: Ok(()) UserPtrUserConstPtr ->> UserPtrUserConstPtr: access_user_memory(...) UserPtrUserConstPtr ->> UserMemory: Safe memory access UserMemory -->> UserPtrUserConstPtr: Data UserPtrUserConstPtr -->> KernelCode: Return reference/slice else Region is invalid or inaccessible AddrSpace -->> check_region: Err(EFAULT) check_region -->> UserPtrUserConstPtr: Err(EFAULT) UserPtrUserConstPtr -->> KernelCode: Return error end
Sources: src/lib.rs(L31 - L54) src/lib.rs(L109 - L117) src/lib.rs(L22 - L29)
Type Conversion and Construction
Both UserPtr<T>
and UserConstPtr<T>
implement From<usize>
for convenient construction from raw addresses:
flowchart TD A["usize (memory address)"] B["UserPtr"] C["UserConstPtr"] A --> B A --> C
Sources: src/lib.rs(L130 - L134) src/lib.rs(L221 - L225)
Null-Terminated Data Handling
The library provides special handling for null-terminated data structures like C strings:
flowchart TD A["UserPtr/UserConstPtr"] B["get_as_null_terminated()"] C["check_null_terminated()"] D["traverse memory safely"] E["find null terminator"] F["return slice up to terminator"] G["UserConstPtr"] H["get_as_str()"] I["get_as_null_terminated()"] J["validate UTF-8"] K["return &str"] A --> B B --> C C --> D D --> E E --> F G --> H H --> I I --> J J --> K
Sources: src/lib.rs(L56 - L107) src/lib.rs(L201 - L217) src/lib.rs(L280 - L292) src/lib.rs(L294 - L303)
UserPtr API
Relevant source files
Purpose and Overview
This document provides detailed information about the UserPtr<T>
type, which enables safe access to mutable user-space memory from kernel code. The API ensures memory safety through rigorous access validation, proper alignment checking, and context-aware page fault handling.
For information about the read-only equivalent, see UserConstPtr API.
UserPtr<T>
wraps a raw pointer (*mut T
) to user-space memory and provides methods to safely access it through the kernel, preventing common vulnerabilities like null pointer dereferences and buffer overflows.
Sources: src/lib.rs(L1 - L7) src/lib.rs(L129 - L130)
Type Definition and Core Properties
UserPtr<T>
is defined as a transparent wrapper around a *mut T
raw pointer:
#[repr(transparent)]
pub struct UserPtr<T>(*mut T);
Key properties:
- Transparent representation: Ensures the struct has the same memory layout as a raw pointer
- Generic over type
T
: Can point to any type - Access flags: Includes both READ and WRITE permissions
Sources: src/lib.rs(L129 - L130) src/lib.rs(L137 - L138)
Basic Methods
Construction and Conversion
UserPtr<T>
can be constructed from a raw usize
memory address:
flowchart TD A["usize address"] B["UserPtr<T>"] C["UserPtr<T>"] D["UserPtr<U>"] A --> B C --> D
Pointer Manipulation Methods
Method | Description | Return Type |
---|---|---|
address() | Gets the virtual address | VirtAddr |
as_ptr() | Unwraps to a raw pointer (unsafe) | *mut T |
cast() | Casts to a different type | UserPtr<U> |
is_null() | Checks if the pointer is null | bool |
nullable() | Converts to an Option (None if null) | Option<Self> |
Sources: src/lib.rs(L130 - L169)
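The pointer-manipulation methods above can be sketched in plain Rust. This is a user-space stand-in, not the real axptr type: `VirtAddr` is replaced by a bare `usize`, and no validation is performed.

```rust
// Minimal sketch of the UserPtr-style wrapper (assumed shape, not the real axptr code).
#[repr(transparent)]
struct UserPtr<T>(*mut T);

impl<T> From<usize> for UserPtr<T> {
    fn from(addr: usize) -> Self {
        UserPtr(addr as *mut T)
    }
}

impl<T> UserPtr<T> {
    // Returns the raw address (the real API returns a VirtAddr).
    fn address(&self) -> usize {
        self.0 as usize
    }
    fn is_null(&self) -> bool {
        self.0.is_null()
    }
    // None for a null pointer, Some(self) otherwise.
    fn nullable(self) -> Option<Self> {
        if self.is_null() { None } else { Some(self) }
    }
    // Reinterprets the pointee type, like the real cast().
    fn cast<U>(self) -> UserPtr<U> {
        UserPtr(self.0 as *mut U)
    }
}
```

The `#[repr(transparent)]` attribute means the wrapper costs nothing at runtime: it has exactly the layout of the raw pointer it wraps.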
Memory Access Methods
The UserPtr<T>
API provides three primary methods for safely accessing user-space memory:
flowchart TD UserPtr["UserPtr<T>"] A["&mut T(Single value)"] B["&mut [T](Fixed-length array)"] C["&mut [T](Null-terminated array)"] UserPtr --> A UserPtr --> B UserPtr --> C
get()
Retrieves a single value of type T
from user-space memory:
#![allow(unused)] fn main() { pub fn get(&mut self, aspace: impl AddrSpaceProvider) -> LinuxResult<&mut T> }
This method:
- Validates the memory region
- Checks alignment
- Verifies read/write permissions
- Populates the page tables if necessary
- Returns a mutable reference if successful, or an error (EFAULT) if access is invalid
Sources: src/lib.rs(L175 - L183)
get_as_slice()
Retrieves a fixed-length slice of elements from user-space memory:
#![allow(unused)] fn main() { pub fn get_as_slice( &mut self, aspace: impl AddrSpaceProvider, length: usize ) -> LinuxResult<&mut [T]> }
This method performs the same safety checks as get()
but for an array of specified length.
Sources: src/lib.rs(L186 - L199)
get_as_null_terminated()
Retrieves a null-terminated array from user-space memory:
#![allow(unused)] fn main() { pub fn get_as_null_terminated( &mut self, aspace: impl AddrSpaceProvider ) -> LinuxResult<&mut [T]> }
This specialized method:
- Searches for a null value (T::default()) to determine array length
- Validates each memory page during the search
- Returns a mutable slice containing all elements up to (but not including) the null terminator
This method requires that type T
implements Eq + Default
traits.
Sources: src/lib.rs(L204 - L217)
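The terminator search can be illustrated over a plain slice. This sketch keeps only the `Eq + Default` logic; the real method additionally validates each memory page it crosses during the scan.

```rust
// Sketch of the null-terminator search behind get_as_null_terminated,
// performed here on an in-memory slice instead of raw user memory.
fn null_terminated_len<T: Eq + Default>(data: &[T]) -> Option<usize> {
    let terminator = T::default();
    // Length up to (but not including) the first "null" (default) value.
    data.iter().position(|v| *v == terminator)
}
```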
Memory Safety Mechanism
The UserPtr<T>
API employs a multi-layered safety mechanism to prevent kernel crashes when accessing user-space memory:
sequenceDiagram participant KernelCode as "Kernel Code" participant UserPtr as "UserPtr" participant check_region as "check_region()" participant AddrSpace as "AddrSpace" participant access_user_memory as "access_user_memory()" KernelCode ->> UserPtr: get() UserPtr ->> check_region: check_region_with() check_region ->> check_region: Check alignment check_region ->> AddrSpace: check_region_access() AddrSpace -->> check_region: Access allowed/denied alt Access allowed check_region ->> AddrSpace: populate_area() AddrSpace -->> check_region: Pages populated check_region -->> UserPtr: OK UserPtr ->> access_user_memory: Set ACCESSING_USER_MEM flag access_user_memory ->> UserPtr: Access memory safely UserPtr -->> KernelCode: Return reference else Access denied check_region -->> UserPtr: EFAULT UserPtr -->> KernelCode: Return error end
Key safety components:
- Alignment Checking: Ensures the pointer is properly aligned for the target type
- Access Validation: Verifies memory region is accessible with appropriate permissions
- Page Table Population: Prepares memory pages before access
- Context-Aware Page Fault Handling: Uses the
ACCESSING_USER_MEM
flag to permit controlled page faults - Error Propagation: Returns
LinuxError::EFAULT
when access is denied
Sources: src/lib.rs(L11 - L54) src/lib.rs(L175 - L183)
Usage Pattern
The typical usage pattern for UserPtr<T>
involves:
flowchart TD A["Create UserPtrfrom usize address"] B["Obtain AddrSpaceProvider"] C["Call appropriate get()method"] D["Use returned referencesafely"] E["Handle error(EFAULT)"] A --> B B --> C C --> D C --> E
Example Usage Flow
- Obtain a user-space address (typically from a system call parameter)
- Convert it to a
UserPtr<T>
- Get an address space provider (typically from the current process)
- Call one of the
get*
methods to safely access the memory - Use the returned reference to read or modify user memory
- Handle any errors (typically EFAULT for invalid access)
Sources: src/lib.rs(L175 - L217)
Relationship with AddrSpaceProvider
The UserPtr<T>
API relies on the AddrSpaceProvider
trait to abstract away the details of the underlying memory management system:
This abstraction allows the same UserPtr<T>
implementation to work with different memory management systems, as long as they implement the AddrSpaceProvider
trait.
Sources: src/lib.rs(L119 - L126) src/lib.rs(L175 - L183)
Implementation Details
Memory Region Validation
When accessing user memory, UserPtr<T>
validates the memory region through the check_region
function, which:
- Verifies proper alignment for the target type
- Checks access permissions (READ+WRITE for
UserPtr<T>
) - Ensures the memory pages are populated
For null-terminated arrays, the specialized check_null_terminated
function:
- Validates memory page by page while searching for the null terminator
- Handles page faults that might occur during the search
- Determines the total length of the array up to the null terminator
Sources: src/lib.rs(L31 - L54) src/lib.rs(L56 - L107)
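The alignment and bounds checks can be sketched as follows. This is a simplified model under assumed semantics: the accessible region is reduced to a single end address, permissions and page-table population are omitted, and `EFAULT` is a plain error value.

```rust
// Simplified model of check_region's alignment and range validation.
const EFAULT: i32 = 14;

fn check_region<T>(start: usize, count: usize, region_end: usize) -> Result<(), i32> {
    // The pointer must be aligned for the target type.
    if start % core::mem::align_of::<T>() != 0 {
        return Err(EFAULT);
    }
    // The whole [start, start + size) range must fit in the accessible region.
    let size = core::mem::size_of::<T>().checked_mul(count).ok_or(EFAULT)?;
    let end = start.checked_add(size).ok_or(EFAULT)?;
    if end > region_end {
        return Err(EFAULT);
    }
    Ok(())
}
```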
UserConstPtr API
Relevant source files
This document provides a comprehensive reference for the UserConstPtr<T>
type, which enables safe read-only access to user-space memory from kernel code. For information about mutable access to user memory, see UserPtr API.
Overview
UserConstPtr<T>
is a generic wrapper around a raw const pointer (*const T
) that provides memory-safe operations for reading data from user space. It implements safety checks that prevent common issues like null pointer dereferences, buffer overflows, and illegal memory accesses.
classDiagram class UserConstPtr~T~ { +*const T pointer +const ACCESS_FLAGS: MappingFlags +address() VirtAddr +as_ptr() *const T +cast~U~() UserConstPtr~U~ +is_null() bool +nullable() Option~Self~ +get() LinuxResult~&T~ +get_as_slice() LinuxResult~&[T]~ +get_as_null_terminated() LinuxResult~&[T]~ } class UserConstPtr_cchar { +get_as_str() LinuxResult~&str~ } UserConstPtr_cchar --|> UserConstPtr : "Specializedimplementation"
Sources: src/lib.rs(L219 - L303)
Memory Safety Architecture
The UserConstPtr<T>
type is part of a comprehensive memory safety system that prevents the kernel from crashing when accessing potentially invalid user memory. Unlike raw pointers, UserConstPtr<T>
operations perform several safety checks before accessing user memory:
flowchart TD A["UserConstPtr.get()"] B["check_region_with()"] C["Is pointeraligned?"] D["Error: EFAULT"] E["Valid memoryaccess rights?"] F["Populate page tables"] G["Page tablespopulated?"] H["Error: ENOMEM"] I["Set ACCESSING_USER_MEM flag"] J["Access memory"] K["Clear ACCESSING_USER_MEM flag"] L["Return reference"] A --> B B --> C C --> D C --> E E --> D E --> F F --> G G --> H G --> I I --> J J --> K K --> L
Sources: src/lib.rs(L31 - L54) src/lib.rs(L258 - L266)
Type Definition
UserConstPtr<T>
is defined as a transparent wrapper around a raw const pointer:
#[repr(transparent)]
pub struct UserConstPtr<T>(*const T);
The #[repr(transparent)]
attribute ensures that UserConstPtr<T>
has the same memory layout as *const T
, making it efficient for passing across FFI boundaries.
Sources: src/lib.rs(L219 - L221)
Constants
Constant | Type | Description |
---|---|---|
ACCESS_FLAGS | MappingFlags | Specifies required memory access flags (READ) for user memory regions |
Sources: src/lib.rs(L227 - L228)
Basic Methods
Conversion and Type Manipulation
Method | Signature | Description |
---|---|---|
From | fn from(value: usize) -> Self | Creates a UserConstPtr<T> from a raw usize address |
address | fn address(&self) -> VirtAddr | Returns the virtual address of the pointer |
as_ptr | unsafe fn as_ptr(&self) -> *const T | Returns the underlying raw pointer (unsafe) |
cast | fn cast<U>(self) -> UserConstPtr<U> | Casts the pointer to a different type |
Sources: src/lib.rs(L221 - L243)
Null Checking
Method | Signature | Description |
---|---|---|
is_null | fn is_null(&self) -> bool | Checks if the pointer is null |
nullable | fn nullable(self) -> Option<Self> | Converts to None if null, or Some(self) otherwise |
Sources: src/lib.rs(L245 - L253)
Memory Access Methods
Single Value Access
#![allow(unused)] fn main() { fn get(&self, aspace: impl AddrSpaceProvider) -> LinuxResult<&T> }
Safely retrieves a reference to the value pointed to by UserConstPtr<T>
:
- Validates memory alignment
- Checks user memory access permissions
- Populates page tables if necessary
- Returns a reference or
EFAULT
error if access failed
Sources: src/lib.rs(L258 - L266)
Slice Access
#![allow(unused)] fn main() { fn get_as_slice(&self, aspace: impl AddrSpaceProvider, length: usize) -> LinuxResult<&[T]> }
Safely retrieves a slice of values:
- Validates memory region for the entire slice
- Verifies alignment and access permissions
- Returns a slice reference or error if access failed
Sources: src/lib.rs(L269 - L277)
Null-Terminated Data
#![allow(unused)] fn main() { fn get_as_null_terminated(&self, aspace: impl AddrSpaceProvider) -> LinuxResult<&[T]> where T: Eq + Default, }
Retrieves a slice of values terminated by a null value (default value of type T
):
- Scans memory until it finds a null value
- Checks access permissions page-by-page during scan
- Returns a slice that includes all values up to (but not including) the null terminator
Sources: src/lib.rs(L282 - L291)
String-Specific Operations
UserConstPtr<c_char>
has an additional method for safely retrieving strings from user space:
#![allow(unused)] fn main() { fn get_as_str(&self, aspace: impl AddrSpaceProvider) -> LinuxResult<&'static str> }
This method:
- Gets the null-terminated array using
get_as_null_terminated
- Transmutes the char array to bytes
- Validates that the bytes form valid UTF-8
- Returns a string reference or
EILSEQ
error if invalid UTF-8
Sources: src/lib.rs(L294 - L302)
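The terminator-then-UTF-8 pipeline can be sketched over an in-memory byte buffer. This models only the conversion steps; the real method performs the page-by-page access checks first, and `EILSEQ` is modeled as a plain error value here.

```rust
// Sketch of the get_as_str steps: take bytes up to the NUL terminator,
// then validate them as UTF-8, mapping failure to an EILSEQ-style error.
const EILSEQ: i32 = 84;

fn c_chars_to_str(buf: &[u8]) -> Result<&str, i32> {
    // Find the NUL terminator, as get_as_null_terminated would.
    let len = buf.iter().position(|&b| b == 0).unwrap_or(buf.len());
    core::str::from_utf8(&buf[..len]).map_err(|_| EILSEQ)
}
```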
Memory Access Flow
The following diagram illustrates the complete flow of operations when accessing user memory with UserConstPtr
:
sequenceDiagram participant KernelCode as "Kernel Code" participant UserConstPtrT as "UserConstPtr<T>" participant AddrSpaceProvider as "AddrSpaceProvider" participant check_region as "check_region" participant UserMemory as "User Memory" KernelCode ->> UserConstPtrT: Call get/get_as_slice/etc. UserConstPtrT ->> AddrSpaceProvider: with_addr_space() AddrSpaceProvider ->> check_region: check_region() check_region ->> check_region: Check alignment check_region ->> check_region: Check access permissions check_region ->> check_region: Populate page tables alt Access Allowed check_region ->> AddrSpaceProvider: Ok(()) AddrSpaceProvider ->> UserConstPtrT: Ok(()) UserConstPtrT ->> UserConstPtrT: Set ACCESSING_USER_MEM = true UserConstPtrT ->> UserMemory: Read memory UserConstPtrT ->> UserConstPtrT: Set ACCESSING_USER_MEM = false UserConstPtrT ->> KernelCode: Return reference else Access Denied check_region ->> AddrSpaceProvider: Err(EFAULT) AddrSpaceProvider ->> UserConstPtrT: Err(EFAULT) UserConstPtrT ->> KernelCode: Return error end
Sources: src/lib.rs(L22 - L29) src/lib.rs(L31 - L54) src/lib.rs(L258 - L266)
Usage Example
Here's a conceptual example of using UserConstPtr
:
- Receive a user address as a
usize
- Convert it to a
UserConstPtr<T>
- Check if it's null
- Access the user memory safely
- Handle any errors appropriately
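The steps above can be condensed into a small sketch. Everything here is a stand-in: a `HashMap` models the mapped user pages, and the null/alignment/mapping checks mirror the order in which the real API fails with `EFAULT`.

```rust
use std::collections::HashMap;

const EFAULT: i32 = 14;

// Hedged end-to-end sketch: receive a raw address, run the same checks
// the UserConstPtr path would, then read the value or fail with EFAULT.
fn read_user_u32(addr: usize, fake_user_mem: &HashMap<usize, u32>) -> Result<u32, i32> {
    if addr == 0 {
        return Err(EFAULT); // null pointer
    }
    if addr % core::mem::align_of::<u32>() != 0 {
        return Err(EFAULT); // misaligned for u32
    }
    fake_user_mem.get(&addr).copied().ok_or(EFAULT) // unmapped address
}
```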
Differences from UserPtr
While UserPtr<T>
provides mutable access with both read and write permissions, UserConstPtr<T>
is specifically designed for read-only access:
Feature | UserPtr | UserConstPtr |
---|---|---|
Underlying type | *mut T | *const T |
Access flags | READ + WRITE | READ |
Reference type | &mut T | &T |
Usage | Reading and writing | Reading only |
Sources: src/lib.rs(L137) src/lib.rs(L227 - L228)
Thread Safety
UserConstPtr<T>
operations use a thread-local variable ACCESSING_USER_MEM
that informs the page fault handler that a page fault during memory access should be handled rather than causing a kernel panic. This flag is automatically set and cleared during memory access operations.
Sources: src/lib.rs(L11 - L12) src/lib.rs(L22 - L29)
Helper Functions
Relevant source files
This document describes the utility functions in the axptr library that support safe user-space memory access in kernel code. These helper functions implement the core safety mechanisms behind the UserPtr
and UserConstPtr
types but are not typically used directly by client code. For information about the user pointer types themselves, see UserPtr API and UserConstPtr API.
Per-CPU Flag System
The foundation of axptr's safety system is a per-CPU boolean flag that tracks when the kernel is accessing user memory.
flowchart TD A["User Memory Access Request"] B["is_accessing_user_memory()"] C["ACCESSING_USER_MEM flag"] D["OS allows page faultsfrom kernel mode"] E["OS handles as regularkernel page fault"] F["access_user_memory()"] G["ACCESSING_USER_MEM = true"] H["Execute memory accesscallback function"] I["ACCESSING_USER_MEM = false"] J["Return result"] A --> B B --> C C --> D C --> E F --> G G --> H H --> I I --> J
Sources: src/lib.rs(L11 - L29)
The library provides two key functions for working with this system:
is_accessing_user_memory()
: A public function that returns the current state of theACCESSING_USER_MEM
flag. Operating system implementations should check this flag when handling page faults in kernel mode - if it returns true, page faults should be allowed to proceed (as they might be from legitimate user memory access attempts).access_user_memory<R>(f: impl FnOnce() -> R) -> R
: An internal function that executes a callback with the user memory access flag set to true. This function:
- Sets the
ACCESSING_USER_MEM
flag to true - Executes the provided callback function
- Restores the flag to false
- Returns the result of the callback
The ACCESSING_USER_MEM
flag is implemented as a per-CPU variable using the percpu
crate to ensure thread safety without locking overhead.
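The set/run/restore shape of `access_user_memory` can be sketched in user-space Rust. A `thread_local!` `Cell` stands in for the per-CPU variable (the kernel uses the `percpu` crate instead, which serves the same role without locking).

```rust
use std::cell::Cell;

// Thread-local stand-in for the per-CPU ACCESSING_USER_MEM flag.
thread_local! {
    static ACCESSING_USER_MEM: Cell<bool> = Cell::new(false);
}

// Mirrors is_accessing_user_memory(): queried by the page fault handler.
fn is_accessing_user_memory() -> bool {
    ACCESSING_USER_MEM.with(|f| f.get())
}

// Mirrors access_user_memory(): the flag is true only for the duration
// of the callback, then restored.
fn access_user_memory<R>(f: impl FnOnce() -> R) -> R {
    ACCESSING_USER_MEM.with(|flag| {
        flag.set(true);
        let result = f();
        flag.set(false);
        result
    })
}
```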
Memory Region Validation
Before accessing user memory, axptr performs thorough validation using the check_region
function.
flowchart TD A["check_region()"] B["Memory aligned?"] C["Return EFAULT"] D["Access permissionsgranted?"] E["Populate page tables"] F["Return error"] G["Return OK"] A --> B B --> C B --> D D --> C D --> E E --> F E --> G
Sources: src/lib.rs(L31 - L54)
The check_region
function performs several critical checks:
- Alignment Validation: Verifies that the start address has proper alignment for the requested data type. If misaligned, returns
EFAULT
. - Access Permission Check: Uses the
AddrSpace.check_region_access()
method to verify that the memory region has the appropriate access permissions (read/write). - Page Table Population: Calls
AddrSpace.populate_area()
to ensure that page tables are set up correctly for the memory region. This may involve mapping physical pages if they're not already mapped.
The library also provides a wrapper function check_region_with
that works with the AddrSpaceProvider
trait, simplifying its usage from the pointer types.
Null-Terminated Data Processing
A specialized helper function handles the common case of accessing null-terminated data (like C strings) from user space.
flowchart TD subgraph subGraph0["Page Boundary Handling"] E["Scan memory for null terminator"] F["Reached page boundary?"] G["Page has accesspermission?"] H["Return EFAULT"] I["Move to next page"] J["Found nullterminator?"] K["Advance to next element"] L["End scan"] end A["check_null_terminated()"] B["Memory aligned?"] C["Return EFAULT"] D["Set ACCESSING_USER_MEM = true"] M["Set ACCESSING_USER_MEM = false"] N["Return pointer and length"] A --> B B --> C B --> D D --> E E --> F F --> G F --> J G --> H G --> I I --> F J --> K J --> L K --> F L --> M M --> N
Sources: src/lib.rs(L56 - L107)
The check_null_terminated<T>
function provides a safe way to access variable-length, null-terminated data from user space:
- Initial Alignment Check: Verifies the start address has proper alignment for type T.
- Page-by-Page Scanning: Processes memory one page at a time, checking permissions at each page boundary. This approach allows handling of strings that span multiple pages.
- Safe Memory Access: Uses the
access_user_memory()
function to set theACCESSING_USER_MEM
flag during scanning, allowing proper handling of page faults that might occur. - Null Terminator Detection: Reads each element using
read_volatile()
and compares it to the default value (T::default()
) to find the null terminator.
This function supports the implementation of get_as_null_terminated()
in both UserPtr
and UserConstPtr
types, as well as get_as_str()
for UserConstPtr<c_char>
.
Integration With Address Space Provider
The helper functions integrate with the address space abstraction through the check_region_with
function.
Sources: src/lib.rs(L110 - L117) src/lib.rs(L119 - L126)
The check_region_with
function serves as a bridge between the high-level pointer types and the low-level memory region validation:
- It accepts an
AddrSpaceProvider
implementation (typically a reference to anAddrSpace
) - It calls
with_addr_space()
on the provider to get access to the actualAddrSpace
- It passes
check_region()
as a callback, forwarding the memory validation request - It returns the result of the validation
This design reduces code duplication and avoids excessive generic function instantiations, as noted in the source code comment.
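The callback-based bridge can be sketched as below. The `AddrSpace` struct and the validation body are assumed stand-ins; what the sketch shows is the shape of the trait: the provider hands the callback a mutable reference to its address space, so the validation function itself stays non-generic.

```rust
// Hypothetical stand-in for the real AddrSpace.
struct AddrSpace {
    accessible_end: usize,
}

// Sketch of the AddrSpaceProvider abstraction.
trait AddrSpaceProvider {
    fn with_addr_space<R>(&mut self, f: impl FnOnce(&mut AddrSpace) -> R) -> R;
}

impl AddrSpaceProvider for AddrSpace {
    fn with_addr_space<R>(&mut self, f: impl FnOnce(&mut AddrSpace) -> R) -> R {
        f(self) // trivially provide our own address space
    }
}

// Bridge function: forwards validation into the provided address space.
fn check_region_with(mut aspace: impl AddrSpaceProvider, start: usize, len: usize) -> Result<(), i32> {
    aspace.with_addr_space(|a| {
        if start + len <= a.accessible_end { Ok(()) } else { Err(14) } // EFAULT
    })
}
```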
Helper Function Usage Patterns
The following table summarizes how the helper functions are used by the public API:
Helper Function | Used By | Purpose |
---|---|---|
is_accessing_user_memory() | OS implementation | Determine if page faults in kernel mode should be allowed |
access_user_memory() | check_null_terminated() | Set flag during user memory scanning |
check_region() | check_region_with() | Validate memory region alignment and permissions |
check_null_terminated() | get_as_null_terminated() | Safely scan for null-terminated data |
check_region_with() | UserPtr::get(),UserConstPtr::get(), etc. | Bridge between pointer types and memory validation |
Sources: src/lib.rs(L175 - L182) src/lib.rs(L204 - L216) src/lib.rs(L258 - L266) src/lib.rs(L282 - L291)
These helper functions work together to create a comprehensive safety system that prevents the kernel from crashing when accessing user memory, while maintaining good performance and ergonomics.
Overview
Relevant source files
axprocess
is a process management crate designed for ArceOS that provides core abstractions and mechanisms for managing processes, threads, process groups, and sessions. This document introduces the high-level concepts, architecture, and components of the system.
For a deeper dive into the architecture, see Core Architecture.
Purpose and Scope
The axprocess
crate implements a hierarchical process management system inspired by Unix-like operating systems, providing the following capabilities:
- Process creation, management, and termination
- Thread management within processes
- Process grouping through process groups
- Session management for related process groups
- Parent-child process relationships
The crate manages the lifecycle of these entities while ensuring proper resource cleanup and memory safety using Rust's ownership model.
Sources: src/lib.rs(L1 - L19) Cargo.toml(L1 - L7) README.md(L1 - L5)
System Overview
axprocess
implements a hierarchical system with four primary abstractions:
flowchart TD subgraph subGraph0["Process Management Hierarchy"] S["Session"] PG["Process Group"] P["Process"] T["Thread"] end P --> T PG --> P S --> PG
- Session: A collection of process groups, typically associated with a user login
- Process Group: A collection of related processes, useful for signal handling
- Process: An execution environment with its own address space and resources
- Thread: An execution context within a process
Sources: src/lib.rs(L8 - L11) src/lib.rs(L16 - L19)
Core Components and Relationships
The system is organized in a hierarchical structure with well-defined relationships between components:
classDiagram class Session { sid: Pid process_groups: WeakMap +sid() Pid +process_groups() Vec~Arc~ProcessGroup~~ } class ProcessGroup { pgid: Pid session: Arc~Session~ processes: WeakMap +pgid() Pid +session() Arc~Session~ +processes() Vec~Arc~Process~~ } class Process { pid: Pid is_zombie: AtomicBool children: StrongMap parent: Weak~Process~ group: Arc~ProcessGroup~ +pid() Pid +exit() void +is_zombie() bool +fork(pid: Pid) ProcessBuilder } class Thread { tid: Pid process: Arc~Process~ +tid() Pid +process() &Arc~Process~ +exit(exit_code: i32) bool } Session "1" o-- "*" ProcessGroup : contains ProcessGroup "1" o-- "*" Process : contains Process "1" o-- "*" Thread : contains Process "1" --> "*" Process : parent-child
Key concepts in this relationship:
- Sessions contain multiple process groups
- Process groups contain multiple processes
- Processes contain threads
- Processes form parent-child relationships
Sources: src/lib.rs(L13 - L14) src/lib.rs(L16 - L19)
Reference Management Strategy
The system uses a carefully designed reference management strategy to prevent memory leaks and ensure proper cleanup:
- Strong references (
Arc
): Used for upward relationships to ensure parent objects remain alive as long as their children need them - Weak references (
Weak
): Used for downward and circular relationships to prevent reference cycles
This strategy ensures that resources are properly cleaned up when they're no longer needed, while maintaining the necessary relationships between components.
Sources: Cargo.toml(L8 - L11)
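The strong-up/weak-down pattern can be demonstrated with `std::sync::Arc`/`Weak` directly. The struct fields are minimal stand-ins for the real types; the point is that the downward `Weak` reference does not keep the thread alive, while the upward `Arc` keeps the process alive.

```rust
use std::sync::{Arc, Weak};

struct Process {
    pid: u32,
}

struct Thread {
    process: Arc<Process>, // upward: strong reference keeps the process alive
}

// Downward bookkeeping holds only Weak references, so it can observe
// when the referent has been dropped.
fn alive(w: &Weak<Thread>) -> bool {
    w.upgrade().is_some()
}
```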
Process Lifecycle
Processes in the system follow a lifecycle from creation to termination:
This lifecycle management ensures proper resource cleanup and allows parent processes to retrieve exit status from terminated child processes.
For detailed information about process lifecycle, see Process Lifecycle.
Sources: src/lib.rs(L16)
Thread Management
Threads are execution contexts within a process:
flowchart TD subgraph subGraph0["Thread Management"] p["Process"] tb["ThreadBuilder"] t["Thread"] end p --> tb t --> p tb --> t
Each process can have multiple threads, and the last thread's exit typically triggers the process to exit as well. Thread creation is handled through the ThreadBuilder
pattern, providing a flexible way to configure new threads.
For more information on thread management, see Thread Management.
Sources: src/lib.rs(L19)
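The builder pattern mentioned above can be sketched as follows. The field and method names here (`tid`, `process`, `build`) mirror the documented accessors, but the builder's exact API is an assumption for illustration.

```rust
use std::sync::Arc;

struct Process {
    pid: u32,
}

struct Thread {
    tid: u32,
    process: Arc<Process>, // each thread strongly references its process
}

// Hypothetical builder: configure fields fluently, then construct the thread.
struct ThreadBuilder {
    tid: u32,
    process: Arc<Process>,
}

impl ThreadBuilder {
    fn new(tid: u32, process: Arc<Process>) -> Self {
        Self { tid, process }
    }
    // Override the thread id before building.
    fn tid(mut self, tid: u32) -> Self {
        self.tid = tid;
        self
    }
    fn build(self) -> Arc<Thread> {
        Arc::new(Thread { tid: self.tid, process: self.process })
    }
}
```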
Integration with ArceOS
axprocess
serves as a foundational component in the ArceOS kernel, providing essential process management capabilities that other kernel subsystems build upon:
flowchart TD subgraph subGraph0["ArceOS Kernel Components"] axprocess["axprocess (Process Management)"] scheduler["Scheduler"] memory["Memory Management"] fs["File System"] end axprocess --> fs axprocess --> memory axprocess --> scheduler
The abstractions provided by axprocess
enable the development of higher-level operating system features and applications.
Sources: Cargo.toml(L6) README.md(L3)
Next Steps
For more detailed information about specific components and features of the axprocess
system, refer to these wiki pages:
- Process Management - Detailed explanation of processes and their management
- Process Groups and Sessions - Information about process grouping mechanisms
- Thread Management - Details about threads and their relationship to processes
- Memory Management - How memory and resources are managed across the system
Core Architecture
Relevant source files
This document explains the high-level architecture of the axprocess system, focusing on the core components and their relationships. It describes the hierarchical structure, component interactions, and memory management strategy used in the system. For specific details about process lifecycle management, see Process Lifecycle, and for thread management details, see Thread Management.
Component Overview
The axprocess system consists of four primary components that form a hierarchical structure:
- Session: A collection of process groups
- Process Group: A collection of processes
- Process: A basic unit of program execution that contains threads
- Thread: An execution unit within a process
Title: Core Component Hierarchy
Sources: src/process.rs src/process_group.rs src/session.rs src/thread.rs
Hierarchical Structure
The system follows a Unix-like hierarchical structure where components are organized in a containment hierarchy:
- Sessions contain multiple process groups and are identified by a session ID (sid)
- Process Groups contain multiple processes and are identified by a process group ID (pgid)
- Processes contain multiple threads and are identified by a process ID (pid)
- Threads are the execution units and are identified by a thread ID (tid)
Additionally, processes can have parent-child relationships with other processes, forming a separate process hierarchy.
flowchart TD subgraph subGraph2["Session (sid=100)"] subgraph subGraph0["ProcessGroup (pgid=100)"] P100["Process (pid=100)"] P101["Process (pid=101)"] P102["Process (pid=102)"] end subgraph subGraph1["ProcessGroup (pgid=200)"] P200["Process (pid=200)"] P201["Process (pid=201)"] end end T100["Thread (tid=100)"] T101["Thread (tid=101)"] T102["Thread (tid=102)"] P100 --> P101 P100 --> P102 P100 --> T100 P100 --> T101 P101 --> T102
Title: Hierarchical Container Relationships
Sources: src/process.rs(L34 - L164) src/process_group.rs(L12 - L17) src/session.rs(L12 - L16)
Component Relationships
Session and Process Group Relationship
Sessions contain process groups, and each process group belongs to exactly one session:
- A session is identified by a unique
sid
(Session ID) - Sessions maintain a weak map of process groups (
process_groups
) - Process groups hold a strong reference (
Arc
) to their session - New sessions are created using the
Session::new(sid)
method
Sources: src/session.rs(L12 - L27) src/process_group.rs(L14 - L30)
Process Group and Process Relationship
Process groups contain processes, and each process belongs to exactly one process group:
- A process group is identified by a unique
pgid
(Process Group ID) - Process groups maintain a weak map of processes (
processes
) - Processes hold a strong reference (
Arc
) to their process group - Processes can move between process groups using
Process::move_to_group()
Sources: src/process_group.rs(L12 - L47) src/process.rs(L84 - L164)
Process and Thread Relationship
Processes contain threads, and each thread belongs to exactly one process:
- A process contains a
ThreadGroup
which manages its threads - Threads hold a strong reference (
Arc
) to their process - Processes maintain weak references to their threads
- New threads are created using
Process::new_thread()
and built withThreadBuilder
Sources: src/process.rs(L18 - L31) src/process.rs(L167 - L192) src/thread.rs(L6 - L88)
Process Parent-Child Relationship
Processes form a hierarchy through parent-child relationships:
- Each process (except the init process) has a parent process
- Processes maintain strong references to their children
- Processes maintain weak references to their parents
- Child processes are created using
Process::fork()
- When a process exits, its children are inherited by the init process
Sources: src/process.rs(L70 - L81) src/process.rs(L195 - L237) src/process.rs(L261 - L282)
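The inheritance step on exit can be sketched with plain maps. The key and value types are placeholders (the real code moves entries of a `StrongMap` of child processes); the point is that the dying process's children are drained into the init process's collection.

```rust
use std::collections::BTreeMap;

// Sketch of child reparenting: move every child of the dying process
// into init's children map, leaving the dying map empty.
fn reparent_to_init(
    dying_children: &mut BTreeMap<u32, String>,
    init_children: &mut BTreeMap<u32, String>,
) {
    // BTreeMap::append drains the source map into the destination.
    init_children.append(dying_children);
}
```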
Reference Management Strategy
The system uses a carefully designed reference counting strategy to prevent memory leaks while ensuring proper cleanup:
flowchart TD subgraph subGraph1["Weak References (Weak)"] ProcessGroup2["ProcessGroup"] Process2["Process"] ParentProcess2["Parent Process"] Process3["Process"] Thread2["Thread"] end subgraph subGraph0["Strong References (Arc)"] Process["Process"] ProcessGroup["ProcessGroup"] Session["Session"] Thread["Thread"] ParentProcess["ParentProcess"] ChildProcess["Child Process"] end ParentProcess --> ChildProcess Process --> ProcessGroup Process2 --> ParentProcess2 Process3 --> Thread2 ProcessGroup --> Session ProcessGroup2 --> Process2 Session --> ProcessGroup2 Thread --> Process
Title: Reference Management Strategy
Key patterns in the reference management strategy:
- Upward References: Strong references (`Arc`) are used for upward relationships:
  - Threads strongly reference their process
  - Processes strongly reference their process group
  - Process groups strongly reference their session
  - Parent processes strongly reference their children
- Downward References: Weak references (`Weak`) are used for downward relationships:
  - Sessions weakly reference their process groups
  - Process groups weakly reference their processes
  - Processes weakly reference their threads
  - Processes weakly reference their parent
- Maps and Collections:
  - `WeakMap` is used for downward references
  - `StrongMap` is used for the children collection in a process
This strategy ensures that components are kept alive as long as they're needed while preventing reference cycles that would cause memory leaks.
Sources: src/process.rs(L36 - L46) src/process_group.rs(L14 - L16) src/session.rs(L14 - L15) src/thread.rs(L7 - L11)
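A minimal sketch of this strong-upward / weak-downward pattern using plain `Arc`/`Weak` from std (type names are illustrative, not the crate's API): the group holds its session alive, while the session's downward edge is weak and therefore never creates a cycle.

```rust
use std::sync::{Arc, Mutex, Weak};

// Illustrative types: the child level holds a strong Arc upward,
// the parent level holds only a Weak downward.
struct Session {
    id: u32,
    group: Mutex<Weak<GroupNode>>, // downward: weak
}

struct GroupNode {
    session: Arc<Session>, // upward: strong
}

fn demo() -> (u32, bool) {
    let session = Arc::new(Session { id: 7, group: Mutex::new(Weak::new()) });
    let group = Arc::new(GroupNode { session: session.clone() });
    *session.group.lock().unwrap() = Arc::downgrade(&group);

    let id_via_group = group.session.id; // upward traversal always works
    drop(group); // drop the last strong reference to the group
    let downward_alive = session.group.lock().unwrap().upgrade().is_some();
    (id_via_group, downward_alive)
}

fn main() {
    let (id, alive) = demo();
    assert_eq!(id, 7);
    // The weak downward edge did not keep the group alive: no cycle, no leak.
    assert!(!alive);
}
```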
Process and Thread Lifecycle
Process Lifecycle
Title: Process Lifecycle States
The process lifecycle consists of these key stages:
- Creation: A process is created using `ProcessBuilder::build()`
  - The init process is created using `Process::new_init()`
  - Child processes are created using `Process::fork()`
- Execution: The process is active and can create threads
- Termination: The process becomes a zombie when `Process::exit()` is called
  - Its children are inherited by the init process
  - It remains in the zombie state until freed
- Cleanup: The process resources are freed when `Process::free()` is called
Sources: src/process.rs(L195 - L237) src/process.rs(L261 - L331)
Thread Lifecycle
sequenceDiagram participant Process as Process participant ThreadBuilder as ThreadBuilder participant Thread as Thread participant ThreadGroup as ThreadGroup Process ->> ThreadBuilder: new_thread(tid) ThreadBuilder ->> ThreadBuilder: data(custom_data) ThreadBuilder ->> Thread: build() Thread ->> ThreadGroup: add to thread group Note over Thread: Thread execution Thread ->> ThreadGroup: exit(exit_code) ThreadGroup ->> ThreadGroup: remove thread ThreadGroup ->> Process: check if last thread alt Last thread Process ->> Process: may trigger process exit end
Title: Thread Lifecycle Flow
The thread lifecycle consists of these key stages:
- Creation: A thread is created using `ThreadBuilder::build()`
  - The process creates a new thread using `Process::new_thread()`
  - The thread is added to the process's thread group
- Execution: The thread executes its workload
- Termination: The thread exits using `Thread::exit()`
  - If it's the last thread, it may trigger process termination
  - The thread is removed from the thread group
Sources: src/thread.rs(L29 - L40) src/thread.rs(L51 - L88) src/process.rs(L167 - L177)
Builder Pattern Implementation
The system uses the Builder pattern for creating processes and threads, allowing for flexible configuration:
Process Builder
Title: Process Builder Pattern
- `Process::new_init()` creates a `ProcessBuilder` for the init process
- `Process::fork()` creates a `ProcessBuilder` for a child process
- `ProcessBuilder::data()` sets custom data for the process
- `ProcessBuilder::build()` creates and initializes the process
Sources: src/process.rs(L261 - L331)
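The builder flow above can be sketched as follows. This is a hypothetical mirror of the real `ProcessBuilder` (simplified fields, no parent or group handling), showing why the chainable configure-then-build shape is convenient:

```rust
use std::any::Any;
use std::sync::Arc;

// Illustrative process type: just a pid and a type-erased data slot.
struct Proc {
    pid: u32,
    data: Box<dyn Any + Send + Sync>,
}

// Illustrative builder: optional configuration steps, then build().
struct ProcBuilder {
    pid: u32,
    data: Box<dyn Any + Send + Sync>,
}

impl ProcBuilder {
    fn new(pid: u32) -> Self {
        Self { pid, data: Box::new(()) }
    }

    // Chainable configuration step, analogous to ProcessBuilder::data().
    fn data<T: Any + Send + Sync>(mut self, data: T) -> Self {
        self.data = Box::new(data);
        self
    }

    // Finalizes the process, analogous to ProcessBuilder::build().
    fn build(self) -> Arc<Proc> {
        Arc::new(Proc { pid: self.pid, data: self.data })
    }
}

fn main() {
    let p = ProcBuilder::new(1).data(42i32).build();
    assert_eq!(p.pid, 1);
    assert_eq!(p.data.downcast_ref::<i32>(), Some(&42));
}
```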
Thread Builder
Title: Thread Builder Pattern
Process::new_thread()
creates aThreadBuilder
ThreadBuilder::data()
sets custom data for the threadThreadBuilder::build()
creates and initializes the thread
Sources: src/thread.rs(L51 - L88) src/process.rs(L167 - L177)
System Integration
The axprocess crate is designed to provide process management capabilities for the ArceOS kernel:
flowchart TD subgraph subGraph1["External Systems"] Scheduler["OS Scheduler"] MemoryManagement["Memory Management"] FileSystem["File System"] end subgraph subGraph0["axprocess Crate"] Process["Process"] ProcessLifecycle["Process Lifecycle"] ThreadManagement["Thread Management"] ProcessGroups["Process Groups"] Sessions["Sessions"] ProcessBuilder["ProcessBuilder"] ParentChild["Parent-Child Relations"] ThreadBuilder["ThreadBuilder"] ThreadGroup["ThreadGroup"] end Process --> FileSystem Process --> MemoryManagement Process --> ProcessGroups Process --> ProcessLifecycle Process --> Scheduler Process --> Sessions Process --> ThreadManagement ProcessLifecycle --> ParentChild ProcessLifecycle --> ProcessBuilder ThreadManagement --> ThreadBuilder ThreadManagement --> ThreadGroup
Title: System Integration Overview
This process management system provides the foundation for:
- Creating and managing processes and threads
- Organizing processes into hierarchical structures
- Managing process lifecycle from creation to cleanup
- Supporting Unix-like process relationships
The design emphasizes:
- Memory safety through careful reference management
- Clear separation of concerns with distinct component types
- Flexibility through builder patterns
- Performance with minimal locking
Sources: src/lib.rs src/process.rs src/thread.rs
Process Management
Relevant source files
This document explains the process abstraction in the axprocess crate, detailing its internal structure, lifecycle, and key operations. The Process Management system provides the core functionality for creating, maintaining, and terminating processes within the ArceOS kernel.
For details on process creation, see Process Creation and Initialization. For information on parent-child relationships, see Parent-Child Relationships.
Process Structure
The `Process` struct is the central component of the process management system, encapsulating all resources and state information for a running process.
classDiagram class Process { pid: Pid is_zombie: AtomicBool tg: SpinNoIrq data: Box children: SpinNoIrq~~ parent: SpinNoIrq~ group: SpinNoIrq~ +pid() Pid +data() Option~&T~ +is_init() bool +parent() Option~ +children() Vec~ +exit() +free() +fork(pid: Pid) ProcessBuilder } class ThreadGroup { threads: WeakMap~ exit_code: i32 group_exited: bool } Process --> ThreadGroup : contains
Sources: src/process.rs(L35 - L47) src/process.rs(L18 - L31)
The `Process` struct maintains:
- A unique process ID (`pid`)
- Zombie state tracking (`is_zombie`)
- Thread management through `ThreadGroup`
- Custom data storage (`data`)
- Process hierarchy relationships (`children`, `parent`)
- Process group membership (`group`)
Process Creation
Processes are created using the Builder pattern, which provides a flexible way to initialize a new process with various configurations.
sequenceDiagram participant ParentProcess as "Parent Process" participant ProcessBuilder as "ProcessBuilder" participant NewProcess as "New Process" participant ProcessGroup as "ProcessGroup" Note over ParentProcess: Exists already ParentProcess ->> ProcessBuilder: "fork(new_pid)" ProcessBuilder ->> ProcessBuilder: "data(custom_data)" ProcessBuilder ->> NewProcess: "build()" NewProcess ->> ProcessGroup: Join group NewProcess ->> ParentProcess: Add as child Note over NewProcess: Ready to run
Sources: src/process.rs(L262 - L281) src/process.rs(L284 - L332)
There are two primary ways to create processes:
- Init Process Creation: The first process in the system is created using `Process::new_init()`, which returns a `ProcessBuilder` configured for the init process.
- Child Process Creation: Existing processes can create child processes using `Process::fork()`, which returns a `ProcessBuilder` with the parent relationship already established.
The `ProcessBuilder` allows setting custom data before finalizing process creation with `build()`, which:
- Creates the process object
- Establishes parent-child relationships
- Adds the process to its process group
- Initializes the thread group
Process Lifecycle
Processes in axprocess follow a defined lifecycle from creation to termination and cleanup.
Sources: src/process.rs(L196 - L236) tests/process.rs(L16 - L44)
The process lifecycle consists of these key stages:
- Active: After creation, a process is active and can create threads, spawn child processes, and perform operations.
- Zombie: When a process terminates via `Process::exit()`, it becomes a zombie: the process has terminated but its resources are not fully released. At this point:
  - The process is marked as a zombie (`is_zombie = true`)
  - Child processes are reassigned to the init process
  - The process remains in its parent's children list
- Freed: The parent process must call `Process::free()` on a zombie process to complete cleanup, which removes it from the parent's children list.
Note that the init process cannot exit, as enforced by a panic check in the `exit()` method.
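A sketch of the zombie-flag transitions described above, assuming an `AtomicBool` with acquire/release ordering as in the real struct (the rest of the type is illustrative, not the crate's API):

```rust
use std::sync::atomic::{AtomicBool, Ordering};

// Illustrative process: only the lifecycle flag is modeled.
struct Proc {
    is_zombie: AtomicBool,
}

impl Proc {
    // exit() flips the flag; the real method also reparents children.
    fn exit(&self) {
        self.is_zombie.store(true, Ordering::Release);
    }

    fn is_zombie(&self) -> bool {
        self.is_zombie.load(Ordering::Acquire)
    }

    // free() is only legal on a zombie, mirroring the documented panic.
    fn free(&self) {
        assert!(self.is_zombie(), "free() is only valid on a zombie process");
        // ...remove from the parent's children list...
    }
}

fn main() {
    let p = Proc { is_zombie: AtomicBool::new(false) };
    assert!(!p.is_zombie()); // active
    p.exit();
    assert!(p.is_zombie()); // zombie
    p.free(); // ok once zombie
}
```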
Process Hierarchy
Processes are organized in a hierarchical parent-child structure, similar to Unix-like systems.
flowchart TD subgraph subGraph0["Process Hierarchy"] Init["Init Process (PID 1)"] P1["Process (PID 2)"] P2["Process (PID 3)"] C1["Child Process (PID 4)"] C2["Child Process (PID 5)"] C3["Child Process (PID 6)"] end Init --> P1 Init --> P2 P1 --> C1 P1 --> C2 P2 --> C3
Sources: src/process.rs(L71 - L81) src/process.rs(L207 - L224) tests/process.rs(L47 - L55)
Key aspects of process hierarchy:
- Init Process: The root of the process hierarchy, created during system initialization. It cannot be terminated and serves as the fallback parent for orphaned processes.
- Parent-Child Relationships:
  - Each process except init has exactly one parent
  - A process can have multiple children
  - These relationships are maintained using Arc/Weak references to prevent reference cycles
- Orphan Handling: When a parent process exits, its children are reassigned to the init process (known as "reaping"). This ensures all processes always have a valid parent.
Thread Management
Each process can contain multiple threads, managed through a thread group.
Sources: src/process.rs(L18 - L31) src/process.rs(L167 - L191)
The thread management system includes:
- ThreadGroup: Each process contains a `ThreadGroup` that tracks:
  - All threads belonging to the process
  - Exit code information
  - Group exit status
- Thread Creation: New threads are created using `process.new_thread(tid) -> ThreadBuilder`
- Thread Listing: All threads in a process can be retrieved with `process.threads() -> Vec<Arc<Thread>>`
- Group Exit: A process can be marked as "group exited", which affects all its threads, via `process.group_exit()`
Custom Process Data
The Process structure allows associating arbitrary data with each process through a type-erased container.
Process
└── data: Box<dyn Any + Send + Sync>
Sources: src/process.rs(L40) src/process.rs(L55 - L58) src/process.rs(L293 - L297)
Custom data can be:
- Set during process creation via `ProcessBuilder::data<T>(data: T)`
- Retrieved with `process.data::<T>()`, which returns `Option<&T>`
This mechanism provides flexibility for higher-level subsystems to extend process functionality without modifying the core Process structure.
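The type-erased data slot can be illustrated with `std::any::Any`. This sketch assumes a simplified struct (not the real `Process`), but the downcast mechanics match the description above: any `Send + Sync` value goes in, and it comes back out only when asked for by its concrete type.

```rust
use std::any::Any;

// Illustrative process with only the type-erased data slot.
struct Proc {
    data: Box<dyn Any + Send + Sync>,
}

impl Proc {
    // Recover the stored value by downcasting; None if the type doesn't match.
    fn data<T: Any>(&self) -> Option<&T> {
        self.data.downcast_ref::<T>()
    }
}

struct MyData {
    value: i32,
}

fn main() {
    let p = Proc { data: Box::new(MyData { value: 42 }) };
    assert_eq!(p.data::<MyData>().map(|d| d.value), Some(42));
    assert!(p.data::<String>().is_none()); // wrong type -> None
}
```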
Process Management API Summary
Operation | Method | Description |
---|---|---|
Create init process | Process::new_init(pid) | Creates the first process in the system |
Create child process | parent.fork(pid) | Creates a new process with the specified parent |
Get process ID | process.pid() | Returns the process ID |
Get parent | process.parent() | Returns the parent process, if any |
Get children | process.children() | Returns all child processes |
Create thread | process.new_thread(tid) | Creates a new thread in the process |
Check zombie state | process.is_zombie() | Returns true if process is a zombie |
Terminate process | process.exit() | Terminates the process, making it a zombie |
Clean up zombie | process.free() | Frees resources for a zombie process |
Get custom data | process.data::<T>() | Returns custom data associated with process |
Sources: src/process.rs(L49 - L341)
Process Creation and Initialization
Relevant source files
This page documents how processes are created and initialized in the axprocess crate. We'll explore the creation of the init process, the ProcessBuilder pattern for process construction, and how child processes are created through forking. For information about the complete process lifecycle, including termination and cleanup, see Process Lifecycle.
Overview
In axprocess, all processes are created using a builder pattern that ensures proper initialization and establishment of hierarchical relationships. The system supports two primary creation paths:
- Creating the special "init process" (the first process in the system)
- Creating child processes by "forking" from existing parent processes
flowchart TD A["Process Creation"] B["Init Process Creation"] C["Child Process Creation"] D["ProcessBuilder::new_init()"] E["parent.fork()"] F["ProcessBuilder::build()"] G["New Process Instance"] A --> B A --> C B --> D C --> E D --> F E --> F F --> G
Sources: src/process.rs(L262 - L332)
The Init Process
The init process is the first process in the system and serves as the "root" of the process hierarchy. It has no parent and adopts orphaned processes when their parents exit.
Creating the Init Process
The init process is created using the `Process::new_init` method and stored in a static variable for system-wide access.
sequenceDiagram participant ClientCode as "Client Code" participant ProcessBuilder as "ProcessBuilder" participant INIT_PROCstatic as "INIT_PROC (static)" participant NewSession as "New Session" participant NewProcessGroup as "New Process Group" ClientCode ->> ProcessBuilder: "Process::new_init(pid)" ClientCode ->> ProcessBuilder: "build()" ProcessBuilder ->> NewSession: "Session::new(pid)" ProcessBuilder ->> NewProcessGroup: "ProcessGroup::new(pid, session)" ProcessBuilder ->> INIT_PROCstatic: "INIT_PROC.init_once(process)" Note over INIT_PROCstatic: "Init process stored for<br>system-wide access"
Sources: src/process.rs(L262 - L272) src/process.rs(L301 - L331) src/process.rs(L334 - L341)
The Init Process Responsibilities
The init process has special responsibilities in the system:
- Cannot be terminated (the system enforces this)
- Adopts orphaned processes when their parents exit
- Provides the foundation for the process hierarchy
The code explicitly prevents the init process from exiting:
pub fn exit(self: &Arc<Self>) {
    if self.is_init() {
        panic!("init process cannot exit");
    }
    // Exit code continues...
}
Sources: src/process.rs(L207 - L226) src/process.rs(L334 - L341)
The ProcessBuilder Pattern
The `ProcessBuilder` struct provides a flexible way to configure and create new processes. It follows the builder pattern, allowing optional configurations before building the actual process.
ProcessBuilder Fields
Field | Type | Description |
---|---|---|
pid | Pid | Process identifier |
parent | Option<Arc<Process>> | Parent process (None for init) |
data | Box<dyn Any + Send + Sync> | Custom data associated with the process |
Sources: src/process.rs(L285 - L289)
Process Construction Flow
sequenceDiagram participant ClientCode as "Client Code" participant ProcessBuilder as "ProcessBuilder" participant NewProcess as "New Process" participant ProcessGroup as "Process Group" ClientCode ->> ProcessBuilder: "new_init(pid) or fork(pid)" opt Configure Process ClientCode ->> ProcessBuilder: "data(custom_data)" end ClientCode ->> ProcessBuilder: "build()" ProcessBuilder ->> ProcessGroup: "Get parent's group or create new" ProcessBuilder ->> NewProcess: "Create process with necessary fields" ProcessBuilder ->> ProcessGroup: "Add process to group" alt Has Parent ProcessBuilder ->> NewProcess: "Add as child to parent" else No Parent (Init) ProcessBuilder ->> NewProcess: "Store as INIT_PROC" end ProcessBuilder -->> ClientCode: "Return Arc<Process>"
Sources: src/process.rs(L262 - L332)
Child Process Creation (Forking)
Child processes are created by "forking" from an existing parent process. This is done using the `fork` method on a parent process.
Fork Process
flowchart TD A["Parent Process"] B["ProcessBuilder"] C["Configured Builder"] D["Child Process"] E["Parent's Process Group"] F["Parent-Child Relationship"] A --> B B --> C C --> D D --> E D --> F
Sources: src/process.rs(L275 - L281) src/process.rs(L301 - L331)
Inheritance During Forking
When a process is forked, the child process inherits several properties from its parent:
- Process Group: The child joins the parent's process group by default
- Parent Reference: The child maintains a reference to its parent
- Children Collection: The parent adds the child to its children collection
The code establishes these relationships during the `build` method:
// Set parent-child relationship
if let Some(parent) = parent {
parent.children.lock().insert(pid, process.clone());
}
// Child inherits parent's group or creates new group for init
let group = parent.as_ref().map_or_else(
|| {
let session = Session::new(pid);
ProcessGroup::new(pid, &session)
},
|p| p.group(),
);
Sources: src/process.rs(L303 - L330)
Process Initialization Details
When a new process is created, several key initialization steps occur:
- Process Structure: A new `Process` structure is allocated with the provided PID
- Zombie State: Set to `false` initially
- Thread Group: Empty thread group is initialized
initially - Thread Group: Empty thread group is initialized
- Custom Data: Any provided custom data is stored
- Children Map: Empty children map is created
- Parent Reference: Weak reference to parent is stored (if any)
- Process Group: Process joins parent's group or creates a new group
- Registration: Process is registered with its group
Memory Management Strategy
Process creation uses a careful reference counting strategy to prevent memory leaks:
- Strong References (`Arc<Process>`):
  - From parent to children
  - From processes to their process group
  - From threads to their process
- Weak References (`Weak<Process>`):
  - From child to parent (prevents reference cycles)
  - From process groups to their processes (in weak maps)
flowchart TD subgraph subGraph0["Reference Relationship"] P1["Parent Process"] C1["Child Process"] PG["Process Group"] T1["Thread"] end C1 --> P1 C1 --> PG P1 --> C1 P1 --> PG PG --> C1 PG --> P1 T1 --> P1
Sources: src/process.rs(L301 - L331)
Practical Example
Here's how a process hierarchy might be created in code:
// Create init process with PID 0
let init = Process::new_init(0).build();
// Create a child process with PID 1
let child1 = init.fork(1).build();
// Create another child with PID 2 and custom data
let child2 = init.fork(2).data(MyCustomData { value: 42 }).build();
// Create a "grandchild" process with PID 3
let grandchild = child1.fork(3).build();
The resulting hierarchy would look like:
flowchart TD Init["Init Process (PID 0)"] Child1["Child Process (PID 1)"] Child2["Child Process (PID 2)with custom data"] Grandchild["Grandchild Process (PID 3)"] Child1 --> Grandchild Init --> Child1 Init --> Child2
Sources: src/process.rs(L262 - L332) tests/common/mod.rs(L15 - L28)
Implementation of ProcessBuilder::build
The `build` method is the core of process initialization. It takes the builder's configuration and constructs a fully initialized process with proper relationships.
The method performs these key steps:
- Determines the process group (from parent or creates new one)
- Constructs the Process struct with all fields
- Adds the process to its group
- Establishes parent-child relationship or marks as init process
- Returns the new process wrapped in an Arc
Sources: src/process.rs(L301 - L331)
System Integration
The process creation system integrates with other components of axprocess:
- Thread Creation: After a process is created, threads can be added using `process.new_thread(tid)`
- Process Group Management: Processes can create or join process groups after creation
- Session Management: Processes can create new sessions
- Parent-Child Relations: The system maintains a hierarchy for resource inheritance and cleanup
Sources: src/process.rs(L168 - L177) src/process.rs(L100 - L163)
Summary
The axprocess crate provides a robust system for process creation and initialization that:
- Uses the builder pattern for flexible configuration
- Establishes proper hierarchical relationships
- Manages memory safely with appropriate reference counting
- Supports special handling for the init process
- Maintains proper process group and session memberships
This foundation enables the subsequent lifecycle management and inter-process relationships that are essential to an operating system's process subsystem.
Process Lifecycle
Relevant source files
This document details the lifecycle of a process in the axprocess crate, from creation through execution to termination and cleanup. For information about process creation techniques and initialization, see Process Creation and Initialization. For details on parent-child relationships, see Parent-Child Relationships.
Overview
Processes in the axprocess crate follow a well-defined lifecycle that ensures proper resource management and cleanup. The lifecycle consists of three primary states:
Sources: src/process.rs(L207 - L236) src/process.rs(L285 - L332)
Process Creation
Processes are created using the `ProcessBuilder` pattern, which configures and then builds a new process instance.
Init Process Creation
The initialization of the first process (init process) is a special case:
// Create the init process
let init = Process::new_init(pid).build();
The init process is stored in a static `INIT_PROC` variable and serves as the ancestor of all other processes. It cannot be terminated and serves as the "reaper" for orphaned processes.
Child Process Creation
Regular processes are created as children of existing processes using the `fork` method:
// Creating a child process
let child = parent.fork(new_pid).build();
The `ProcessBuilder` allows customizing the process before creation, such as setting associated data:
let child = parent.fork(new_pid)
.data(custom_data)
.build();
Sources: src/process.rs(L262 - L282) src/process.rs(L285 - L332)
Process States and Transitions
Sources: src/process.rs(L179 - L186) src/process.rs(L196 - L236)
Active State
An active process is fully functioning and can:
- Create child processes
- Create or join sessions and process groups
- Create threads
- Access and modify its associated data
An active process can be marked as "group exited" using the `group_exit()` method, which sets an internal flag but doesn't terminate the process.
Sources: src/process.rs(L179 - L186)
Zombie State
When a process calls `exit()`, it enters the zombie state:
- It is marked as a zombie using an atomic boolean flag
- Its children are reassigned to the init process (or nearest subreaper)
- Resources are partially released, but the process structure remains in memory
- The process remains in its parent's child list
A zombie process retains minimal information needed for the parent to retrieve its exit status.
Sources: src/process.rs(L196 - L225)
Process Cleanup
The final state transition occurs when a zombie process is freed using the `free()` method:
- The process is removed from its parent's child list
- This allows for complete deallocation when all references are dropped
The `free()` method will panic if called on a non-zombie process.
Sources: src/process.rs(L227 - L236)
Process Exit Mechanism
sequenceDiagram participant ExitingProcess as "Exiting Process" participant ParentProcess as "Parent Process" participant InitProcess as "Init Process" participant ChildProcesses as "Child Processes" ExitingProcess ->> ExitingProcess: "is_zombie.store(true, Ordering::Release)" Note over ExitingProcess: Process is now a zombie ExitingProcess ->> InitProcess: "Get init process" ExitingProcess ->> ChildProcesses: "For each child" loop Transfer children ChildProcesses ->> InitProcess: "Set init as new parent" ExitingProcess ->> ExitingProcess: "Remove from children list" InitProcess ->> InitProcess: "Add to children list" end Note over ExitingProcess,ParentProcess: Parent must call free() later ParentProcess ->> ExitingProcess: "free()" ParentProcess ->> ParentProcess: "Remove child from children list"
Sources: src/process.rs(L207 - L236)
The exit mechanism includes several key aspects:
- Atomic state change: The process uses atomic operations to mark itself as a zombie
- Child inheritance: All children are transferred to the init process
- Parent notification: The parent is responsible for calling `free()` to complete cleanup
Note that attempting to exit the init process will cause a panic, as the init process must always exist in the system.
Sources: src/process.rs(L207 - L225) tests/process.rs(L32 - L35)
Process Exit and Cleanup Code Flow
flowchart TD A["Process::exit() called"] B["Is init process?"] C["Panic"] D["Get init process as reaper"] E["Mark self as zombie"] F["Get all children"] G["For each child"] H["Set child's parent to reaper"] I["Add child to reaper's children"] J["More children?"] K["Exit complete (zombie state)"] L["Process::free() called"] M["Is process zombie?"] N["Panic"] O["Get parent"] P["Remove self from parent's children"] Q["Cleanup complete"] A --> B B --> C B --> D D --> E E --> F F --> G G --> H H --> I I --> J J --> G J --> K L --> M M --> N M --> O O --> P P --> Q
Sources: src/process.rs(L207 - L236)
Special Considerations
Init Process
The init process has special properties in the lifecycle:
- It cannot exit (attempting to call `exit()` on it will panic)
- It acts as the "reaper" for orphaned processes
- It is created at system initialization and persists until system shutdown
Zombies and Resource Management
Zombie processes maintain minimal state while waiting for their parent to acknowledge their termination via `free()`. This approach:
- Allows parents to retrieve exit status information
- Prevents resource leaks by ensuring proper cleanup
- Maintains a clean process hierarchy in the system
Sources: src/process.rs(L196 - L236) tests/process.rs(L16 - L23) tests/process.rs(L37 - L44)
Testing Process Lifecycle
The process lifecycle is validated through several test cases:
Test | Description |
---|---|
exit() | Verifies that a process can exit and becomes a zombie |
free() | Ensures a zombie process can be freed and removed from parent |
free_not_zombie() | Confirms that freeing a non-zombie process causes a panic |
init_proc_exit() | Verifies that attempting to exit the init process causes a panic |
reap() | Tests that children of an exited process are reassigned to init |
Sources: tests/process.rs(L16 - L55)
Implementation Details
The process lifecycle is primarily implemented in the `Process` struct, with key lifecycle methods:
- `ProcessBuilder::build()`: Creates and initializes a new process
- `Process::exit()`: Terminates the process, making it a zombie
- `Process::free()`: Removes the zombie process from its parent's children list
- `Process::is_zombie()`: Checks if the process is in the zombie state
- `Process::group_exit()`: Marks the process as group exited
Internally, the zombie state is tracked using an atomic boolean, ensuring thread-safe state transitions:
pub fn is_zombie(&self) -> bool {
    self.is_zombie.load(Ordering::Acquire)
}
Sources: src/process.rs(L196 - L236) src/process.rs(L36 - L47)
Parent-Child Relationships
Relevant source files
This document details how parent-child relationships between processes are managed in the `axprocess` crate. It covers the implementation of process hierarchy, relationship establishment, orphan handling, and cleanup mechanisms. For information about the complete process lifecycle, see Process Lifecycle.
Relationship Structure
In the `axprocess` crate, processes are organized in a hierarchical structure similar to Unix-like operating systems. Each process maintains references to both its parent and its children.
classDiagram class Process { pid: Pid parent: SpinNoIrq~ children: SpinNoIrq~~ +parent() Option~ +children() Vec~ } Process --> Process Process "1" --> "*" Process : children Process --> Process
Process Hierarchy Implementation
A process stores:
- A weak reference to its parent process (to avoid reference cycles)
- Strong references to all its child processes (ensuring children don't get dropped prematurely)
This implementation allows for proper resource management while maintaining the process hierarchy.
Sources: src/process.rs(L43 - L44) src/process.rs(L73 - L80)
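This weak-parent / strong-children split can be sketched with std types. The struct below is illustrative (the real `Process` guards these fields with `SpinNoIrq` locks); it shows how a `parent()` accessor upgrades the stored `Weak` and yields `None` once the parent has been dropped.

```rust
use std::sync::{Arc, Mutex, Weak};

// Illustrative process holding only the upward weak edge.
struct Proc {
    parent: Mutex<Weak<Proc>>,
}

impl Proc {
    // Upgrade the weak reference on access, as described above.
    fn parent(&self) -> Option<Arc<Proc>> {
        self.parent.lock().unwrap().upgrade()
    }
}

fn main() {
    let parent = Arc::new(Proc { parent: Mutex::new(Weak::new()) });
    let child = Proc { parent: Mutex::new(Arc::downgrade(&parent)) };

    assert!(child.parent().is_some()); // parent alive: upgrade succeeds
    drop(parent);
    assert!(child.parent().is_none()); // parent gone: weak ref yields None
}
```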
Establishing Parent-Child Relationships
Parent-child relationships are established during process creation. A new process is created using the `fork` method on an existing process, which returns a `ProcessBuilder`.
sequenceDiagram participant ParentProcess as "Parent Process" participant ProcessBuilder as "ProcessBuilder" participant ChildProcess as "Child Process" ParentProcess ->> ProcessBuilder: "fork(pid)" ProcessBuilder ->> ProcessBuilder: "data(...)" ProcessBuilder ->> ChildProcess: "build()" ChildProcess -->> ParentProcess: "Add to children" ChildProcess -->> ChildProcess: "Store parent reference"
When `ProcessBuilder::build()` is called:
- The new process stores a weak reference to its parent
- The parent adds the new process to its children collection
Sources: src/process.rs(L275 - L281) src/process.rs(L301 - L331)
Code Examples
Here's how the parent-child relationship is established during process creation:
- Parent reference in the new process:
parent: SpinNoIrq::new(parent.as_ref().map(Arc::downgrade).unwrap_or_default())
- Adding the child to the parent's children:
if let Some(parent) = parent {
parent.children.lock().insert(pid, process.clone());
}
Sources: src/process.rs(L318 - L328)
Accessing Relationships
Processes provide methods to access their relationships:
flowchart TD P["Process"] PP["parent()"] C["children()"] OAP["Option>"] VAC["Vec>"] C --> VAC P --> C P --> PP PP --> OAP
- `parent()`: Returns the parent process if it exists
- `children()`: Returns a vector of all child processes
Sources: src/process.rs(L73 - L80)
Orphan Handling
When a process exits, its children become orphans. The `axprocess` system handles this by reparenting these orphan processes to the init process.
sequenceDiagram participant Process as "Process" participant ChildProcesses as "Child Processes" participant InitProcess as "Init Process" Process ->> Process: "exit()" Process ->> Process: "is_zombie = true" Process ->> InitProcess: "Get init_proc" Process ->> ChildProcesses: "For each child" loop For each child ChildProcesses ->> ChildProcesses: "Set parent to init" ChildProcesses ->> InitProcess: "Add to init's children" end
When a process calls `exit()`:
- It's marked as a zombie
- Its children are transferred to the init process:
  - Each child updates its parent reference to point to init
  - The child is added to init's children collection
This ensures no process becomes truly orphaned, maintaining the integrity of the process hierarchy.
Sources: src/process.rs(L207 - L224)
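The reparenting loop can be sketched as follows. The types are hypothetical (the real implementation uses `SpinNoIrq` locks and the crate's maps): the exiting process hands each child to init, rewriting the child's parent pointer along the way.

```rust
use std::collections::BTreeMap;
use std::mem;
use std::sync::{Arc, Mutex, Weak};

// Illustrative process: pid, weak upward edge, strong children map.
struct Proc {
    pid: u32,
    parent: Mutex<Weak<Proc>>,
    children: Mutex<BTreeMap<u32, Arc<Proc>>>,
}

fn new_proc(pid: u32) -> Arc<Proc> {
    Arc::new(Proc {
        pid,
        parent: Mutex::new(Weak::new()),
        children: Mutex::new(BTreeMap::new()),
    })
}

// Sketch of the reparenting step of exit(): every child is adopted by init.
fn exit(proc_: &Arc<Proc>, init: &Arc<Proc>) {
    let mut children = proc_.children.lock().unwrap();
    for (pid, child) in mem::take(&mut *children) {
        *child.parent.lock().unwrap() = Arc::downgrade(init); // point child at init
        init.children.lock().unwrap().insert(pid, child);     // init adopts it
    }
}

fn main() {
    let init = new_proc(0);
    let parent = new_proc(1);
    let child = new_proc(2);
    parent.children.lock().unwrap().insert(2, child.clone());
    *child.parent.lock().unwrap() = Arc::downgrade(&parent);

    exit(&parent, &init); // the orphan is adopted by init
    assert_eq!(child.parent.lock().unwrap().upgrade().unwrap().pid, 0);
    assert!(init.children.lock().unwrap().contains_key(&2));
}
```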
Process Cleanup
When a zombie process is freed, it's removed from its parent's children collection:
flowchart TD Z["Zombie Process"] P["Parent"] C["children collection"] P --> C Z --> P
This is performed by the `free()` method, which can only be called on zombie processes:
- It checks that the process is a zombie
- It removes itself from its parent's children collection
Sources: src/process.rs(L230 - L236)
Special Role of the Init Process
The init process serves as the root of the process hierarchy and has special characteristics:
flowchart TD subgraph subGraph0["Special Properties"] NP["No Parent"] CE["Cannot Exit"] OR["Orphan Reaper"] end I["Init Process"] C1["Child 1"] C2["Child 2"] O["Orphaned Processes"] I --> C1 I --> C2 I --> CE I --> NP I --> O I --> OR
The init process:
- Is created without a parent
- Cannot exit (attempting to call `exit()` on it will panic)
- Serves as the adoptive parent for all orphaned processes
- Is created at system initialization and accessible via the `init_proc()` function
Sources: src/process.rs(L208 - L210) src/process.rs(L262 - L272) src/process.rs(L333 - L341)
Testing Behavior
The process relationship behavior is verified through tests that demonstrate:
| Test Case | Description |
|---|---|
| child | Verifies child processes correctly reference their parent |
| exit | Checks that exited processes become zombies but remain in their parent's children list |
| free | Ensures freed zombie processes are removed from their parent's children list |
| reap | Confirms orphaned processes (children of an exited process) are reparented to the init process |
Sources: tests/process.rs(L9 - L55)
Complete Parent-Child Lifecycle
The parent-child lifecycle spans creation, exit, and cleanup: a process is created with a reference to its parent, becomes a zombie on exit while remaining in its parent's children list, and is finally removed from that list when freed. Appropriate parent-child relationships are maintained at every stage.
Sources: src/process.rs(L207 - L236) src/process.rs(L275 - L331)
Process Groups and Sessions
Relevant source files
Purpose and Scope
This document details the process group and session management subsystem in the axprocess crate. Process groups and sessions are hierarchical abstractions that organize processes into logical collections, similar to Unix-like operating systems. They play a crucial role in managing process relationships and controlling process behavior.
For information about specific process management and parent-child relationships, see Process Management and Parent-Child Relationships. For thread management within processes, see Thread Management.
Hierarchical Organization
Process groups and sessions form a three-level hierarchy in the process management system:
flowchart TD S["Session"] PG1["ProcessGroup"] PG2["ProcessGroup"] P1["Process"] P2["Process"] P3["Process"] P4["Process"] T1["Thread"] T2["Thread"] T3["Thread"] P1 --> T1 P1 --> T2 P2 --> T3 PG1 --> P1 PG1 --> P2 PG2 --> P3 PG2 --> P4 S --> PG1 S --> PG2
This hierarchical organization provides:
- Structured process management
- Logical grouping of related processes
- Potential for process control operations at different granularity levels
Sources: src/session.rs(L12 - L17) src/process_group.rs(L12 - L17)
Session Implementation
A session is a collection of process groups, represented by the `Session` struct:
classDiagram class Session { sid: Pid process_groups: SpinNoIrq~WeakMap~ +new(sid: Pid) Arc~Self~ +sid() Pid +process_groups() Vec~Arc~ProcessGroup~~ }
Key characteristics:
- Each session has a unique Session ID (`sid`)
- Sessions contain multiple process groups stored in a thread-safe weak reference map
- Process groups are referenced by their Process Group ID (`pgid`)
- The session implementation uses `SpinNoIrq` for synchronization and `WeakMap` for memory management
Sources: src/session.rs(L12 - L45)
Process Group Implementation
A process group is a collection of processes, represented by the `ProcessGroup` struct:
classDiagram class ProcessGroup { pgid: Pid session: Arc~Session~ processes: SpinNoIrq~WeakMap~ +new(pgid: Pid, session: &Arc~Session~) Arc~Self~ +pgid() Pid +session() Arc~Session~ +processes() Vec~Arc~Process~~ }
Key characteristics:
- Each process group has a unique Process Group ID (`pgid`)
- Process groups maintain a strong reference to their containing session
- Processes within a group are stored in a thread-safe weak reference map
- Processes are referenced by their Process ID (`pid`)
Sources: src/process_group.rs(L12 - L47)
Reference Management
The memory management strategy prevents memory leaks while ensuring objects remain alive as needed:
flowchart TD subgraph subGraph0["Reference Structure"] P["Process"] PG["ProcessGroup"] S["Session"] end P --> PG PG --> P PG --> S S --> PG
This approach:
- Uses strong references (`Arc`) for upward relationships (Process → Process Group → Session)
- Uses weak references for downward relationships (Session → Process Group → Process)
- Prevents reference cycles that could cause memory leaks
- Ensures objects persist when needed but can be garbage collected when no longer referenced
Sources: src/session.rs(L7 - L16) src/process_group.rs(L7 - L17)
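This reference discipline can be demonstrated with std's `Arc`/`Weak` in place of the crate's `WeakMap`; the struct names mirror the diagram, but the code is a stand-alone sketch, not the axprocess implementation:

```rust
use std::collections::BTreeMap;
use std::sync::{Arc, Mutex, Weak};

struct Session {
    sid: u32,
    // Downward: weak, so a dropped group disappears from the map's view.
    process_groups: Mutex<BTreeMap<u32, Weak<ProcessGroup>>>,
}

struct ProcessGroup {
    pgid: u32,
    // Upward: strong, keeping the session alive while the group exists.
    session: Arc<Session>,
}

fn main() {
    let session = Arc::new(Session {
        sid: 1,
        process_groups: Mutex::new(BTreeMap::new()),
    });
    let group = Arc::new(ProcessGroup { pgid: 10, session: session.clone() });
    session
        .process_groups
        .lock()
        .unwrap()
        .insert(group.pgid, Arc::downgrade(&group));

    // While the group is alive, the session can still reach it.
    assert!(session.process_groups.lock().unwrap()[&10].upgrade().is_some());

    // Dropping the last strong reference invalidates the weak entry:
    // no cycle keeps the group alive.
    drop(group);
    assert!(session.process_groups.lock().unwrap()[&10].upgrade().is_none());
}
```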
Creation and Relationship Management
The creation flow and relationship management between these entities follows a pattern:
sequenceDiagram participant Process as Process participant ProcessGroup as ProcessGroup participant Session as Session Note over Session: Session::new(sid) Note over ProcessGroup: ProcessGroup::new(pgid, &session) Session ->> ProcessGroup: Store weak reference to group Note over Process: Process joins a group Process ->> ProcessGroup: Store weak reference to process Process ->> ProcessGroup: Maintain strong reference to group ProcessGroup ->> Session: Maintain strong reference to session
Key operations:
- Sessions are created with a unique SID
- Process groups are created within a session with a unique PGID
- Processes join process groups, establishing the necessary reference relationships
- Reference counting manages the lifecycle of these objects
Sources: src/session.rs(L19 - L26) src/process_group.rs(L19 - L29)
Memory Safety and Synchronization
The implementation ensures thread safety and proper memory management:
- Thread Safety: `SpinNoIrq` locks protect shared data structures
  - Used for both the session's process groups and the process group's processes
- Memory Management: `WeakMap` collections store weak references to prevent reference cycles
  - Strong references (`Arc`) ensure objects persist as long as needed
  - Weak references allow objects to be garbage collected when no longer needed
- Collection Methods: Both `Session` and `ProcessGroup` provide methods to retrieve contained objects
  - Collection methods create strong references (`Arc`) from weak references
  - Only live objects are returned from collection methods
Sources: src/session.rs(L29 - L39) src/process_group.rs(L32 - L47)
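The upgrade-and-filter behavior of the collection methods can be sketched like this, using a plain `BTreeMap<_, Weak<_>>` as a stand-in for the crate's `WeakMap`:

```rust
use std::collections::BTreeMap;
use std::sync::{Arc, Mutex, Weak};

struct Process {
    pid: u32,
}

struct ProcessGroup {
    processes: Mutex<BTreeMap<u32, Weak<Process>>>,
}

impl ProcessGroup {
    /// Upgrade each weak entry; dead entries are silently skipped, so only
    /// live objects are returned.
    fn processes(&self) -> Vec<Arc<Process>> {
        self.processes
            .lock()
            .unwrap()
            .values()
            .filter_map(Weak::upgrade)
            .collect()
    }
}

fn main() {
    let group = ProcessGroup { processes: Mutex::new(BTreeMap::new()) };
    let alive = Arc::new(Process { pid: 1 });
    let doomed = Arc::new(Process { pid: 2 });
    group.processes.lock().unwrap().insert(1, Arc::downgrade(&alive));
    group.processes.lock().unwrap().insert(2, Arc::downgrade(&doomed));

    // After `doomed` is dropped, its entry no longer upgrades.
    drop(doomed);
    let live = group.processes();
    assert_eq!(live.len(), 1);
    assert_eq!(live[0].pid, 1);
}
```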
Future Extensions
The session implementation contains a TODO comment about shell job control, suggesting future functional extensions:
```rust
// TODO: shell job control
```
This indicates planned future support for Unix-like shell job control features, which typically include:
- Foreground/background process management
- Job suspension and resumption
- Terminal signal handling for process groups
Sources: src/session.rs(L16)
Relationship to Unix Process Management
The implementation mirrors Unix-like process management concepts:
| Concept | Unix-like Systems | axprocess Implementation |
|---|---|---|
| Process | Basic execution unit | `Process` struct |
| Process Group | Collection of related processes | `ProcessGroup` struct |
| Session | Collection of process groups | `Session` struct |
| Process Group Leader | First process in a group | Process with PID matching PGID |
| Session Leader | Process that creates a session | Process with PID matching SID |
This familiar design makes the system more intuitive for developers with Unix system programming experience while leveraging Rust's memory safety features.
Process Groups
Relevant source files
Purpose and Scope
This document explains process groups in the axprocess crate, their implementation, and how they fit into the process management hierarchy. Process groups are collections of related processes that enable group-based operations and organization. For information about sessions, which contain process groups, see Sessions. For parent-child process relationships, see Parent-Child Relationships.
Process Group Hierarchy
Process groups form a middle layer in the process management hierarchy of axprocess. They provide a way to organize related processes and enable group-based operations.
flowchart TD S["Session"] PG["ProcessGroup"] P1["Process"] P2["Process"] P3["Process"] PG --> P1 PG --> P2 PG --> P3 S --> PG
Sources: src/process_group.rs(L12 - L17)
Process Group Implementation
A process group is represented by the `ProcessGroup` struct, which maintains references to a collection of processes and belongs to a session.
classDiagram class ProcessGroup { pgid: Pid session: Arc~Session~ processes: SpinNoIrq~WeakMap~ +new(pgid: Pid, session: &Arc~Session~) Arc~Self~ +pgid() Pid +session() Arc~Session~ +processes() Vec~Arc~Process~~ } class Session { sid: Pid process_groups: WeakMap } class Process { pid: Pid group: Arc~ProcessGroup~ } Process "*" --> "1" ProcessGroup : belongs to ProcessGroup "*" --> "1" Session : belongs to
The key components of a process group are:
| Component | Type | Description |
|---|---|---|
| pgid | Pid | The unique identifier for the process group |
| session | Arc<Session> | The session this process group belongs to |
| processes | SpinNoIrq<WeakMap<Pid, Weak<Process>>> | A map of processes belonging to this group |
Sources: src/process_group.rs(L12 - L17) src/process_group.rs(L19 - L29)
Process Group Creation and Management
Creation
Process groups are created within an existing session. Typically, a process creates a new group and becomes the leader of that group.
sequenceDiagram participant Process as Process participant NewProcessGroup as New ProcessGroup participant Session as Session participant OldProcessGroup as Old ProcessGroup Process ->> NewProcessGroup: create with P.pid() as pgid NewProcessGroup ->> Session: register with session Process ->> OldProcessGroup: remove from old group Process ->> NewProcessGroup: add to new group Note over Process,NewProcessGroup: Process becomes group leader
The process group ID (pgid) is typically set to the process ID (pid) of the creating process, making that process the group leader.
Sources: src/process_group.rs(L19 - L29) tests/group.rs(L22 - L43)
Process Movement Between Groups
Processes can move between process groups, which involves removing them from their current group and adding them to a new one.
sequenceDiagram participant Process as Process participant OldProcessGroup as Old ProcessGroup participant NewProcessGroup as New ProcessGroup Process ->> OldProcessGroup: remove from processes map Process ->> NewProcessGroup: add to processes map Process ->> Process: update group reference Note over OldProcessGroup: If empty, may be cleaned up
When a process moves to a new group, it's removed from its old group's process map and added to the new group's map. If the old group becomes empty (no more processes), it may be cleaned up.
Sources: tests/group.rs(L77 - L113)
Memory Management
Process groups use a combination of strong (`Arc`) and weak (`Weak`) references to manage memory and prevent reference cycles:

- Processes hold strong references (`Arc`) to their process group
- Process groups hold weak references (`Weak`) to their processes
- Sessions hold weak references to process groups
flowchart TD subgraph subGraph0["Reference Direction"] P["Process"] PG["ProcessGroup"] S["Session"] end P --> PG PG --> P PG --> S S --> PG
This reference strategy ensures proper cleanup when processes or groups are no longer needed.
Sources: src/process_group.rs(L15 - L16) tests/group.rs(L54 - L65)
Process Group Inheritance
When a process is forked (a new child is created), the child typically inherits the parent's process group. This maintains the group relationship across process creation.
sequenceDiagram participant ParentProcess as Parent Process participant ChildProcess as Child Process participant ProcessGroup as Process Group ParentProcess ->> ChildProcess: fork() creates ChildProcess ->> ProcessGroup: inherits parent's group ProcessGroup ->> ProcessGroup: adds child to processes map Note over ParentProcess,ChildProcess: Both in same group
Sources: tests/group.rs(L67 - L75)
Process Group Cleanup
Process groups are automatically cleaned up when:
- All processes in the group have exited and been freed
- There are no more strong references to the process group
This automatic cleanup is handled through Rust's reference counting mechanism and the weak reference strategy used in axprocess.
When the last process in a group is removed (either by exiting or moving to another group), the process group becomes eligible for cleanup if there are no other strong references to it.
Sources: tests/group.rs(L54 - L65) tests/group.rs(L102 - L113)
API Summary
The `ProcessGroup` struct provides the following key methods:
| Method | Purpose |
|---|---|
| new(pgid, session) | Creates a new process group with the given ID in the specified session |
| pgid() | Returns the process group ID |
| session() | Returns a reference to the session this group belongs to |
| processes() | Returns a vector of all processes in this group |
Sources: src/process_group.rs(L19 - L46)
Use Cases
Process groups serve several important purposes in operating systems:
- Job Control: Allow signals to be sent to multiple related processes at once
- Organization: Group related processes together (e.g., a shell pipeline)
- Termination Control: Enable orderly shutdown of related processes
The implementation in axprocess follows patterns similar to Unix-like systems but with Rust's memory safety guarantees.
Sources: tests/group.rs(L9 - L141)
Sessions
Relevant source files
Purpose and Scope
This document explains the concept of Sessions in the axprocess codebase. A Session represents a collection of Process Groups and forms the top level of the process hierarchy. Each process belongs to exactly one process group, and each process group belongs to exactly one session.
The session abstraction is inspired by Unix-like operating systems, where sessions are typically used to manage groups of related processes, such as those associated with a terminal login session.
Sources: src/session.rs(L12 - L17)
Session Structure
A Session in axprocess is a simple structure that maintains a collection of process groups:
classDiagram note for Session "Each session has a unique session ID (sid)" class Session { sid: Pid process_groups: SpinNoIrq~WeakMap~ +sid() Pid +process_groups() Vec~Arc~ProcessGroup~~ +new(sid: Pid) Arc~Self~ } class ProcessGroup { pgid: Pid session: Arc~Session~ processes: WeakMap } Session "1" o-- "*" ProcessGroup : contains
The `Session` struct contains:

- `sid`: A unique session ID (of type `Pid`)
- `process_groups`: A thread-safe weak map that links process group IDs to weak references of process groups
Sessions use weak references to their process groups to avoid reference cycles, as process groups hold strong references to their sessions.
Sources: src/session.rs(L12 - L17) src/session.rs(L19 - L26)
Session Hierarchy
Sessions form the top level of the process management hierarchy in axprocess. Each element in the hierarchy has specific relationships with others:
flowchart TD S["Session"] PG1["Process Group 1"] PG2["Process Group 2"] PG3["Process Group 3"] P1["Process 1"] P2["Process 2"] P3["Process 3"] P4["Process 4"] P5["Process 5"] P6["Process 6"] T1["Thread 1.1"] T2["Thread 1.2"] T3["Thread 4.1"] P1 --> T1 P1 --> T2 P4 --> T3 PG1 --> P1 PG1 --> P2 PG2 --> P3 PG3 --> P4 PG3 --> P5 PG3 --> P6 S --> PG1 S --> PG2 S --> PG3
This hierarchical structure allows for logical grouping of related processes and simplifies operations that need to be performed on sets of processes.
Sources: src/session.rs(L12 - L17) tests/session.rs(L9 - L19)
Memory Management
The session implementation uses a careful reference counting approach to prevent memory leaks and ensure proper cleanup:
flowchart TD subgraph subGraph0["Reference Structure"] P["Process"] PG["Process Group"] S["Session"] PG_ref["Process Group (Reference)"] P_ref["Process (Reference)"] end note["Strong references point upwardWeak references point downward"] P --> PG PG --> P_ref PG --> S S --> PG_ref
Key aspects of memory management for sessions:
- Processes hold strong references to their process groups
- Process groups hold strong references to their sessions
- Sessions hold weak references to their process groups
- This prevents circular references while ensuring objects stay alive as needed
When all processes in a process group are freed, the process group is dropped, and when all process groups in a session are dropped, the session is freed.
Sources: src/session.rs(L15) tests/session.rs(L51 - L64)
Session Creation and Management
Creation
A session is created when a process calls `create_session()`. The process becomes the leader of both the new session and a new process group:
sequenceDiagram participant Process as Process participant ProcessGroup as Process Group participant OldProcessGroup as Old Process Group participant NewSession as New Session participant OldSession as Old Session Process ->> Process: create_session() Note over Process: Process must not be a group leader Process ->> NewSession: Session::new(pid) NewSession -->> Process: new session Process ->> ProcessGroup: ProcessGroup::new(pid, &session) ProcessGroup -->> Process: new process group Process ->> OldProcessGroup: leave old group OldProcessGroup -->> OldSession: update group membership Process ->> ProcessGroup: join new group Process -->> Process: return (session, group)
The process must not already be a group leader to create a new session. This is enforced at runtime.
Sources: src/session.rs(L19 - L26) tests/session.rs(L21 - L44) tests/session.rs(L47 - L49)
Session Management
Sessions provide methods to access their properties and process groups:
- `sid()`: Returns the session ID
- `process_groups()`: Returns all process groups that belong to this session
A process cannot move to a process group that belongs to a different session.
Sources: src/session.rs(L29 - L39) tests/session.rs(L86 - L96)
Cleanup
When all processes in a session exit and are freed, the session's process groups will be empty, and eventually, the session itself will be cleaned up through Rust's reference counting mechanism:
sequenceDiagram participant Process as Process participant ProcessGroup as Process Group participant Session as Session Process ->> Process: exit() Process ->> Process: free() Note over Process: Process no longer in group Note over ProcessGroup: When all processes are gone ProcessGroup ->> ProcessGroup: Drop Note over ProcessGroup: Process group removes itself from session Note over Session: When all process groups are gone Session ->> Session: Drop Note over Session: Session is freed
Sources: tests/session.rs(L51 - L64) tests/session.rs(L99 - L108)
Practical Examples
Basic Session Structure
The initial process (init) automatically creates a session and process group:
```rust
let init = init_proc();
let group = init.group();
let session = group.session();

// The group and session IDs match the init process ID
assert_eq!(group.pgid(), init.pid());
assert_eq!(session.sid(), init.pid());
```
Sources: tests/session.rs(L9 - L19)
Creating a New Session
A child process can create its own session:
```rust
let parent = init_proc();
let child = parent.new_child();
let (child_session, child_group) = child.create_session().unwrap();

// The child becomes the leader of both the new session and group
assert_eq!(child_group.pgid(), child.pid());
assert_eq!(child_session.sid(), child.pid());
```
Sources: tests/session.rs(L21 - L44)
Implementation Details
The `Session` struct is implemented in src/session.rs with these key methods:

- `fn new(sid: Pid) -> Arc<Self>`: Creates a new session with the given session ID
- `fn sid(&self) -> Pid`: Returns the session ID
- `fn process_groups(&self) -> Vec<Arc<ProcessGroup>>`: Returns all process groups in this session

The implementation uses `SpinNoIrq` locks for thread safety and concurrent access to session data.
Sources: src/session.rs(L19 - L39)
Related Topics
For more information about how process groups interact with sessions, see Process Groups.
For details on how processes move between groups and the hierarchical relationship between sessions, groups, and processes, see Hierarchy and Movement.
Hierarchy and Movement
Relevant source files
This page explains the hierarchical relationships between sessions, process groups, and processes in the `axprocess` crate, and details how processes can move between different groups and sessions. For information about parent-child relationships between processes, see Parent-Child Relationships. For details about thread management within processes, see Thread Management.
Hierarchical Structure Overview
The `axprocess` system implements a three-level hierarchical structure inspired by Unix-like operating systems:
- Sessions: The top-level container that groups related process groups
- Process Groups: The middle-level container that groups related processes
- Processes: The individual execution units that can contain multiple threads
This hierarchy is used for organizing processes and implementing features like job control. Each entity in the hierarchy is identified by a unique process identifier (Pid).
flowchart TD subgraph Session["Session"] S["Session (sid)"] PG1["ProcessGroup (pgid1)"] PG2["ProcessGroup (pgid2)"] P1["Process (pid1)"] P2["Process (pid2)"] P3["Process (pid3)"] P4["Process (pid4)"] end PG1 --> P1 PG1 --> P2 PG2 --> P3 PG2 --> P4 S --> PG1 S --> PG2
Sources: src/session.rs(L1 - L45) src/process_group.rs(L1 - L56) src/process.rs(L83 - L89)
Process Group and Session Relationships
In the `axprocess` crate, the relationships between sessions, process groups, and processes are implemented using strong and weak references:

- Each process stores a strong reference (`Arc`) to its process group
- Each process group stores a strong reference to its session
- Both process groups and sessions store weak references (`WeakMap`) to their contained entities to prevent reference cycles
classDiagram class Session { sid: Pid process_groups: WeakMap +sid() Pid +process_groups() Vec~Arc~ProcessGroup~~ } class ProcessGroup { pgid: Pid session: Arc~Session~ processes: WeakMap +pgid() Pid +session() Arc~Session~ +processes() Vec~Arc~Process~~ } class Process { pid: Pid group: SpinNoIrq~Arc~ProcessGroup~~ +group() Arc~ProcessGroup~ +set_group() void +create_session() Option~(Arc~Session~, Arc~ProcessGroup~)~ +create_group() Option~Arc~ProcessGroup~~ +move_to_group() bool } Process "*" --> "1" ProcessGroup : belongs to ProcessGroup "*" --> "1" Session : belongs to
Sources: src/session.rs(L12 - L18) src/process_group.rs(L12 - L17) src/process.rs(L83 - L164)
Creating New Sessions
A process can create a new session and become its session leader by calling the `create_session()` method. This operation also creates a new process group within the new session, with the process as the process group leader.
sequenceDiagram participant Process as Process participant NewSession as New Session participant NewProcessGroup as New Process Group participant OldProcessGroup as Old Process Group Process ->> Process: create_session() Note over Process: Check if already session leader Process ->> NewSession: Session::new(pid) Process ->> NewProcessGroup: ProcessGroup::new(pid, &new_session) Process ->> OldProcessGroup: Remove process from old group Process ->> NewProcessGroup: Add process to new group Process ->> Process: Update group reference Process -->> Process: Return (new_session, new_group)
Key features of session creation:
- A process cannot create a new session if it is already a session leader (when `process.group().session.sid() == process.pid()`)
- When a new session is created, a new process group is also created with the same ID
- The process is moved from its old process group to the new one
Sources: src/process.rs(L100 - L123) tests/session.rs(L21 - L44)
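The leader check can be reduced to a small pure function; the signature and return shape below are hypothetical, chosen only to make the rule testable, and stand in for the crate's `create_session()`:

```rust
// Returns the (sid, pgid) of the new session and group, or None if the
// caller is already a session leader. Pure-logic sketch, not the crate's API.
fn create_session(pid: u32, current_sid: u32) -> Option<(u32, u32)> {
    if pid == current_sid {
        // Already a session leader: the operation is refused.
        return None;
    }
    // Both the new session and its first group take the caller's pid as
    // their ID, making the caller leader of both.
    Some((pid, pid))
}

fn main() {
    assert_eq!(create_session(42, 1), Some((42, 42)));
    assert_eq!(create_session(5, 5), None); // leader cannot re-create
}
```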
Creating New Process Groups
A process can create a new process group within its current session and become its group leader by calling the `create_group()` method.
sequenceDiagram participant Process as Process participant Session as Session participant NewProcessGroup as New Process Group participant OldProcessGroup as Old Process Group Process ->> Process: create_group() Note over Process: Check if already group leader Process ->> NewProcessGroup: ProcessGroup::new(pid, ¤t_session) Process ->> OldProcessGroup: Remove process from old group Process ->> NewProcessGroup: Add process to new group Process ->> Process: Update group reference Process -->> Process: Return new_group
Key features of process group creation:
- A process cannot create a new process group if it is already a process group leader (when `process.group().pgid() == process.pid()`)
- The new process group is created within the process's current session
- The process is moved from its old process group to the new one
Sources: src/process.rs(L124 - L143) tests/group.rs(L22 - L43)
Moving Between Process Groups
A process can move to a different process group within the same session by calling the `move_to_group()` method.
sequenceDiagram participant Process as Process participant DestinationProcessGroup as Destination Process Group participant CurrentProcessGroup as Current Process Group Process ->> Process: move_to_group(destination_group) alt Already in the group Process -->> Process: Return true (no action needed) else Different session Process -->> Process: Return false (operation not allowed) else Move allowed Process ->> CurrentProcessGroup: Remove process from current group Process ->> DestinationProcessGroup: Add process to destination group Process ->> Process: Update group reference Process -->> Process: Return true (move successful) end
Key constraints on process movement:
- A process can only move to a process group within the same session
- If a process is already in the specified process group, no action is taken
- The process is removed from its old process group and added to the new one
Sources: src/process.rs(L145 - L163) tests/group.rs(L77 - L100) tests/session.rs(L86 - L96)
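These constraints can be condensed into a small decision function; `GroupRef` and the signature are hypothetical illustrations of the rules, not the crate's API:

```rust
#[derive(Clone, Copy, PartialEq, Eq, Debug)]
struct GroupRef {
    pgid: u32,
    sid: u32,
}

/// Mirrors move_to_group's result: true on success or no-op, false if the
/// destination lives in a different session.
fn move_to_group(current: GroupRef, dest: GroupRef) -> bool {
    if current == dest {
        return true; // already a member, nothing to do
    }
    if current.sid != dest.sid {
        return false; // cross-session moves are not allowed
    }
    // Real code would remove the process from `current` and insert it
    // into `dest` here.
    true
}

fn main() {
    let a = GroupRef { pgid: 1, sid: 1 };
    let b = GroupRef { pgid: 2, sid: 1 };
    let c = GroupRef { pgid: 3, sid: 9 };
    assert!(move_to_group(a, a));  // no-op
    assert!(move_to_group(a, b));  // same session: allowed
    assert!(!move_to_group(a, c)); // different session: refused
}
```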
Process Creation and Inheritance
When a new process is created using `Process::fork()` and then `ProcessBuilder::build()`, it inherits its parent's process group by default. This behavior ensures that related processes stay within the same group unless explicitly moved.
flowchart TD PP["Parent Process"] PB["ProcessBuilder"] CP["Child Process"] PPG["Parent Process Group"] PB --> CP PP --> PB PPG --> CP
Sources: src/process.rs(L285 - L332)
Resource Cleanup and Memory Management
The hierarchical structure is designed to ensure proper cleanup of resources:
- When a process moves to a different group, its reference to the old group is dropped
- If a process was the last one in a group, the group will be automatically cleaned up when all references to it are dropped
- Similarly, when a process group is removed from a session, the session will be cleaned up if it was the last group
This approach prevents memory leaks while maintaining the hierarchical relationships.
flowchart TD subgraph subGraph1["After Move"] GC["Garbage Collection"] subgraph Before["Before"] P2["Process"] PG3["Process Group"] S2["Session"] PG4["Process Group (empty)"] P1["Process"] PG1["Process Group"] S["Session"] PG2["Process Group"] end end P1 --> PG1 P2 --> PG3 PG1 --> S PG2 --> S PG3 --> S2 PG4 --> GC PG4 --> S2
Sources: tests/group.rs(L54 - L65) tests/group.rs(L102 - L113) tests/session.rs(L52 - L64)
Example: Moving Processes Between Groups
Here's a practical example of how process groups can be manipulated in code:
sequenceDiagram participant Parent as Parent participant Child1 as Child1 participant Child2 as Child2 participant Group1 as Group1 participant Group2 as Group2 Parent ->> Child1: new_child() Child1 ->> Group1: create_group() Parent ->> Child2: new_child() Child2 ->> Group2: create_group() Child2 ->> Child2: move_to_group(Group1) Note over Child2,Group1: Child2 now belongs to Group1 Note over Group2: Group2 is now empty
This diagram illustrates the flow from tests where:
- A parent process creates two child processes
- Each child creates its own process group
- The second child moves to the first child's group
- The second child's original group becomes empty
Sources: tests/group.rs(L77 - L100)
Constraints and Rules Summary
The following rules govern process movement in the hierarchy:
| Operation | Condition | Result |
|---|---|---|
| Create session | Process is already a session leader | Operation fails, returns `None` |
| Create session | Process is not a session leader | New session and group created, process moved |
| Create group | Process is already a group leader | Operation fails, returns `None` |
| Create group | Process is not a group leader | New group created, process moved |
| Move to group | Target group in different session | Operation fails, returns `false` |
| Move to group | Target group in same session | Process moved, returns `true` |
| Move to group | Already in target group | No action, returns `true` |
Sources: src/process.rs(L100 - L163) tests/group.rs(L44 - L52) tests/session.rs(L46 - L48)
Practical Implications
Understanding the hierarchical structure and movement capabilities in `axprocess` allows for effective process management:
- Related processes can be grouped together for collective management
- Session boundaries provide isolation between unrelated process groups
- Process movement enables dynamic reorganization of processes based on their relationships or roles
- The hierarchy forms the foundation for implementing job control and terminal management
By organizing processes into groups and sessions, the system can implement sophisticated process management features commonly found in Unix-like operating systems.
Sources: src/process.rs(L83 - L164) src/process_group.rs(L1 - L56) src/session.rs(L1 - L45)
Thread Management
Relevant source files
This document explains how threads are implemented and managed within the axprocess crate. It covers thread creation, lifecycle, and the relationship between threads and processes. For process-specific features, see Process Management, and for memory management aspects, see Memory Management.
Thread Structure and Components
The thread management system consists of several key components that work together to provide thread functionality:
classDiagram class Thread { tid: Pid process: Arc~Process~ data: Box +tid() Pid +process() &Arc~Process~ +exit(exit_code: i32) bool +data() Option~&T~ } class ThreadBuilder { tid: Pid process: Arc~Process~ data: Box +new(tid: Pid, process: Arc~Process~) ThreadBuilder +data(data: T) ThreadBuilder +build() Arc~Thread~ } class Process { pid: Pid tg: SpinNoIrq~ThreadGroup~ +new_thread(tid: Pid) ThreadBuilder +threads() Vec~Arc~Thread~~ +is_group_exited() bool +group_exit() void } class ThreadGroup { threads: WeakMap exit_code: i32 group_exited: bool } Thread --> Process : belongs to ThreadBuilder --> Thread : builds Process --> ThreadGroup : contains ThreadGroup --> Thread : tracks
Sources: src/thread.rs(L6 - L28) src/thread.rs(L51 - L88) src/process.rs(L18 - L32) src/process.rs(L167 - L192)
Thread Structure
The `Thread` struct represents an individual thread within a process:

- It has a unique thread ID (`tid`) of type `Pid`
- It maintains a strong reference to its parent process using `Arc<Process>`
- It can store arbitrary data via a type-erased `Box<dyn Any + Send + Sync>`
- It provides methods to access its properties and manage its lifecycle
Sources: src/thread.rs(L6 - L28)
Thread Group
Each process contains a `ThreadGroup` which manages all threads within that process:

- The `ThreadGroup` maintains a collection of weak references to threads using `WeakMap<Pid, Weak<Thread>>`
- It tracks the process exit code, which is set when threads exit
- It has a `group_exited` flag that can be set to indicate the entire thread group should exit
Sources: src/process.rs(L18 - L32)
Thread Creation Process
Threads are created through a multi-step process using the builder pattern:
sequenceDiagram participant Process as "Process" participant ThreadBuilder as "ThreadBuilder" participant Thread as "Thread" participant ThreadGroup as "ThreadGroup" Process ->> ThreadBuilder: new_thread(tid) Note over Thread,ThreadBuilder: Configure thread ThreadBuilder ->> ThreadBuilder: data(custom_data) ThreadBuilder ->> Thread: build() Thread ->> ThreadGroup: register thread Note over Process,ThreadGroup: Thread is now part of the process's thread group
Sources: src/process.rs(L168 - L171) src/thread.rs(L58 - L88)
1. Thread creation begins by calling `Process::new_thread(tid)`, which returns a `ThreadBuilder` instance
2. The builder can be configured with custom data using the `data()` method
3. Calling `build()` on the builder creates the actual thread
4. During building, the thread is registered in the process's thread group
5. The builder returns an `Arc<Thread>` as the final product
This builder pattern allows for optional configuration while ensuring proper registration of the thread with its process.
Sources: src/thread.rs(L51 - L88) src/process.rs(L168 - L171)
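The steps above can be sketched in plain Rust. This is a self-contained model of the builder flow, not the axprocess implementation: the names mirror the documented API (`ThreadBuilder::new`/`data`/`build`), but the registration step with a process's thread group is omitted for brevity.

```rust
use std::any::Any;
use std::sync::Arc;

type Pid = u32;

struct Thread {
    tid: Pid,
    data: Box<dyn Any + Send + Sync>,
}

impl Thread {
    fn tid(&self) -> Pid {
        self.tid
    }
    // Recover the type-erased data when the caller names the exact type.
    fn data<T: Any + Send + Sync>(&self) -> Option<&T> {
        self.data.downcast_ref()
    }
}

struct ThreadBuilder {
    tid: Pid,
    data: Box<dyn Any + Send + Sync>,
}

impl ThreadBuilder {
    fn new(tid: Pid) -> Self {
        Self { tid, data: Box::new(()) }
    }
    // Optional configuration step: replace the default data.
    fn data<T: Any + Send + Sync>(mut self, data: T) -> Self {
        self.data = Box::new(data);
        self
    }
    // Final step: produce the shared handle.
    fn build(self) -> Arc<Thread> {
        Arc::new(Thread { tid: self.tid, data: self.data })
    }
}

fn main() {
    let thread = ThreadBuilder::new(7).data("task state").build();
    assert_eq!(thread.tid(), 7);
    assert_eq!(*thread.data::<&str>().unwrap(), "task state");
}
```

The chained call in `main` mirrors the documented usage shape: configure, then `build()` to obtain an `Arc<Thread>`.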
Thread Lifecycle Management
Threads in axprocess go through several states during their lifetime:
Sources: src/thread.rs(L29 - L39) src/process.rs(L167 - L192)
Thread Exit
The thread exit process is a critical part of thread management:
- When a thread is ready to terminate, it calls `Thread::exit(exit_code)`
- This method:
  - Updates the thread group's exit code (if a group exit hasn't been set)
  - Removes the thread from the process's thread group
  - Returns a boolean indicating whether it was the last thread in the group
- If the thread was the last one to exit, the caller typically triggers process termination
Sources: src/thread.rs(L29 - L39)
Process-Thread Relationship
The relationship between processes and threads is fundamental to the system design:
flowchart TD subgraph subGraph2["Thread 2"] T2["Thread Methods:- tid()- process()- exit()"] end subgraph subGraph1["Thread 1"] T1["Thread Methods:- tid()- process()- exit()"] end subgraph Process["Process"] TG["ThreadGroup"] P["Process Methods:- new_thread()- threads()- is_group_exited()- group_exit()"] end P --> TG T1 --> P T2 --> P TG --> T1 TG --> T2
Sources: src/process.rs(L167 - L192) src/thread.rs(L6 - L39)
Process Thread Management Functions
A process provides several methods to manage its threads:
- `new_thread(tid)`: Creates a new thread with the given thread ID
- `threads()`: Returns a list of all threads in the process
- `is_group_exited()`: Checks whether the thread group has been marked for exit
- `group_exit()`: Marks the thread group as exited, signaling all threads to terminate
When a process's `group_exit()` method is called, its `group_exited` flag is set to `true`. This doesn't directly terminate threads; it serves as a signal that they should exit. Individual threads must check this flag and respond accordingly.
Sources: src/process.rs(L167 - L192)
Thread Exit and Process Status
When a thread exits, it may affect the process state:
- If the exiting thread is the last thread in the process, the process should typically be terminated
- The thread's exit code may become the process's exit code (unless `group_exited` is true)
- When all threads exit, resources associated with the thread group can be cleaned up
Sources: src/thread.rs(L29 - L39) src/process.rs(L167 - L192)
Data Storage in Threads
Both `Thread` and `Process` contain a `data` field of type `Box<dyn Any + Send + Sync>`, which allows storing arbitrary data that satisfies the `Send` and `Sync` traits:
- The `data<T: Any + Send + Sync>()` method on both types allows retrieving this data when its exact type is known
- The builder patterns for both types allow setting this data during creation
- This mechanism provides a flexible way to associate custom data with threads and processes
This type-erased data storage enables client code to store task-specific information without modifying the core thread and process implementations.
Sources: src/thread.rs(L24 - L27) src/thread.rs(L67 - L73)
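The type-erased storage technique described above can be demonstrated with the standard library alone: a `Box<dyn Any + Send + Sync>` holds any such value, and `downcast_ref` recovers it only when the caller names the exact stored type. The `stored` helper below is illustrative, not part of axprocess.

```rust
use std::any::Any;

// Produce a type-erased value, as a thread's `data` field would hold it.
fn stored() -> Box<dyn Any + Send + Sync> {
    Box::new(42u32)
}

fn main() {
    let data = stored();
    // Correct type: downcast succeeds.
    assert_eq!(data.downcast_ref::<u32>(), Some(&42));
    // Wrong type: downcast returns None instead of misinterpreting memory.
    assert!(data.downcast_ref::<String>().is_none());
}
```

This is why callers of `data<T>()` must know the concrete type they stored: the `Any` machinery checks the type at runtime and fails safely on a mismatch.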
Thread Management Best Practices
When working with the thread management system in axprocess, consider these guidelines:
- Always check the return value of `Thread::exit()` to determine whether process termination is needed
- Use the builder pattern properly by chaining configuration methods and ending with `build()`
- Manage thread references carefully to prevent memory leaks
- Be aware of the process lifecycle and how thread termination affects it
The thread management system in axprocess provides a flexible foundation for multithreaded applications while maintaining proper resource management and cleanup.
Thread Creation and Builder
Relevant source files
Purpose and Scope
This document explains the thread creation process and the ThreadBuilder pattern in axprocess. It covers how new threads are instantiated, configured, and registered with their parent processes. For information about the thread lifecycle and exit procedures, see Thread Lifecycle and Exit.
Thread Structure
In axprocess, a thread is represented by the `Thread` struct, which contains the following key components:
| Field | Type | Description |
|---|---|---|
| `tid` | `Pid` | Unique thread identifier |
| `process` | `Arc<Process>` | Reference to the process that owns the thread |
| `data` | `Box<dyn Any + Send + Sync>` | Custom data associated with the thread |
The `Thread` struct provides methods to access its properties and manage its lifecycle:
classDiagram class Thread { tid: Pid process: Arc data: Box +tid() Pid +process() &Arc +data() Option~&T~ +exit(exit_code: i32) bool }
Sources: src/thread.rs(L6 - L40)
ThreadBuilder Pattern
Thread creation follows the builder pattern through the `ThreadBuilder` struct. This pattern allows for flexible configuration before final construction.
classDiagram class ThreadBuilder { tid: Pid process: Arc data: Box +new(tid: Pid, process: Arc) Self +data(data: T) Self +build() Arc } class Thread { } Thread --> ThreadBuilder : "builds"
The ThreadBuilder provides a clean interface for configuring a thread before its construction:
- Instantiate the builder with a thread ID and process reference
- Optionally set custom data
- Build the thread, which registers it with the process's thread group
Sources: src/thread.rs(L51 - L88)
Thread Creation Flow
The thread creation process involves several steps from initialization to registration with the thread group:
sequenceDiagram participant Caller as "Caller" participant ThreadBuilder as "ThreadBuilder" participant Thread as "Thread" participant Process as "Process" participant ThreadGroup as "ThreadGroup" Caller ->> ThreadBuilder: new(tid, process) Note over ThreadBuilder: Initialize builder with<br>thread ID and process opt Configure thread Caller ->> ThreadBuilder: data(custom_data) Note over ThreadBuilder: Set custom data end Caller ->> ThreadBuilder: build() ThreadBuilder ->> Thread: Create Thread Note over Thread: Initialize with<br>tid, process, data ThreadBuilder ->> Process: process.tg.lock() Process ->> ThreadGroup: Return locked ThreadGroup ThreadBuilder ->> ThreadGroup: threads.insert(tid, &thread) Note over ThreadGroup: Register thread in the<br>thread group ThreadBuilder -->> Caller: Return Arc<Thread>
Sources: src/thread.rs(L59 - L87)
ThreadBuilder API
The ThreadBuilder API provides the following methods:
- `new(tid: Pid, process: Arc<Process>)` - Creates a new ThreadBuilder with the specified thread ID and process
- `data<T: Any + Send + Sync>(data: T)` - Associates custom data with the thread
- `build()` - Constructs the Thread and registers it with the process's thread group
Thread Construction
When `build()` is called, the ThreadBuilder performs these steps:
- Creates a new `Thread` instance with the configured parameters
- Wraps the thread in an `Arc` for shared ownership
- Registers the thread with the process's thread group
- Returns the `Arc<Thread>` to the caller
Sources: src/thread.rs(L76 - L87)
Thread Data Management
The `data` field in both `Thread` and `ThreadBuilder` uses Rust's `Any` trait to allow storing any type that is `Send` and `Sync`. This enables the thread to associate arbitrary data types with itself.
flowchart TD subgraph subGraph1["Data Retrieval"] E["Get Typed Data"] F["Access Custom Data"] end subgraph subGraph0["Thread Creation"] B["ThreadBuilder"] C["Configure"] D["Thread"] end B --> C C --> D D --> E E --> F
The `data<T>()` method on `Thread` allows retrieving this custom data as a specific type by using the `downcast_ref` method provided by the `Any` trait.
Sources: src/thread.rs(L25 - L27) src/thread.rs(L68 - L73)
Integration with Process and Thread Group
Threads are managed within a Process through a ThreadGroup:
flowchart TD P["Process"] TG["ThreadGroup"] T1["Thread 1"] T2["Thread 2"] TN["Thread N"] TB["ThreadBuilder"] TNP["New Thread"] P --> TB P --> TG TB --> TG TB --> TNP TG --> T1 TG --> T2 TG --> TN
When a thread is created using `ThreadBuilder::build()`, it is automatically registered with its process's thread group (stored in the `tg` field of the `Process` struct). This registration inserts the thread's ID and a weak reference to the thread into the thread group's `threads` collection.
Sources: src/thread.rs(L84)
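The effect of that weak registration can be modeled with a plain `BTreeMap` of `Weak` pointers standing in for the crate's `WeakMap` (an illustrative stand-in, not the actual type): the group's map never keeps a thread alive, so a dropped thread simply fails to upgrade.

```rust
use std::collections::BTreeMap;
use std::sync::{Arc, Weak};

type Pid = u32;

// Register a thread in the group's map by storing a downgraded handle,
// mirroring the insert performed during build().
fn register(map: &mut BTreeMap<Pid, Weak<String>>, tid: Pid, thread: &Arc<String>) {
    map.insert(tid, Arc::downgrade(thread));
}

fn main() {
    let mut threads = BTreeMap::new();
    let t = Arc::new(String::from("thread-1"));
    register(&mut threads, 1, &t);
    assert!(threads[&1].upgrade().is_some()); // alive while the Arc exists
    drop(t);
    assert!(threads[&1].upgrade().is_none()); // entry is stale after drop
}
```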
Thread Identity and Ownership
Each thread has a unique thread ID (tid
), which is a Pid
type. The thread maintains a strong reference (Arc
) to its parent process, establishing a clear ownership relationship:
flowchart TD T["Thread"] P["Process"] TG["ThreadGroup"] TR["Thread References"] P --> T T --> P TG --> TR TR --> T
This reference pattern ensures:
- A thread cannot outlive its process
- Processes can track their threads without creating reference cycles
- Thread cleanup can occur properly when a thread exits
Sources: src/thread.rs(L7 - L11) src/thread.rs(L19 - L22)
Thread Lifecycle and Exit
Relevant source files
Purpose and Scope
This document explains the lifecycle of threads within the axprocess system, focusing on thread creation, execution, and termination processes. It details how threads are managed within processes and the impact of thread exit on the overall process lifecycle. For information about thread creation specifically, see Thread Creation and Builder.
Thread Structure and Components
In the axprocess system, a thread represents an execution context within a process. Each thread has its own identity and data but operates within the context of its parent process.
classDiagram class Thread { tid: Pid process: Arc data: Box +tid() Pid +process() &Arc +data() Option~&T~ +exit(exit_code: i32) bool } class Process { pid: Pid tg: SpinNoIrq +threads() Vec~ +new_thread(tid: Pid) ThreadBuilder } class ThreadGroup { threads: WeakMap~ exit_code: i32 group_exited: bool } Thread --> Process : belongs to Process --> ThreadGroup : contains ThreadGroup --> Thread : tracks
Thread-to-Process Relationship Diagram
Sources: src/thread.rs(L7 - L27) src/process.rs(L18 - L31) src/process.rs(L167 - L191)
Key Components
- Thread: A single execution unit with its own thread ID (`tid`), a reference to its parent process, and associated data.
- ThreadGroup: Manages all threads within a process, tracking:
  - Active threads
  - Exit code
  - Group exit status
- Process: Contains the thread group and provides methods for thread management.
Sources: src/thread.rs(L7 - L27) src/process.rs(L18 - L31) src/process.rs(L34 - L47)
Thread Lifecycle States
Threads in axprocess move through several distinct states throughout their existence:
Thread Lifecycle States Diagram
Sources: src/thread.rs(L76 - L87) src/thread.rs(L29 - L39)
State Transitions
- Creation: A thread is created using `ThreadBuilder::build()`, which:
  - Creates a new `Thread` object with the specified parameters
  - Adds the thread to the process's thread group
  - Returns an `Arc<Thread>` for subsequent operations
- Running: After creation, a thread is considered to be in the running state (though actual scheduling is handled outside axprocess)
- Exit: When `Thread::exit(exit_code)` is called:
  - The thread is removed from the thread group
  - If this was not a group exit, the exit code is stored
  - The method returns a boolean indicating whether this was the last thread in the group
Sources: src/thread.rs(L76 - L87) src/thread.rs(L29 - L39)
Thread Exit Process
When a thread exits, a specific sequence of operations occurs to handle cleanup and potential process termination:
sequenceDiagram participant Thread as "Thread" participant ThreadGroup as "ThreadGroup" participant Process as "Process" Thread ->> ThreadGroup: exit(exit_code) ThreadGroup ->> ThreadGroup: Lock thread group alt group_exited is false ThreadGroup ->> ThreadGroup: Set exit_code end ThreadGroup ->> ThreadGroup: Remove thread from threads map ThreadGroup -->> Thread: Return if threads is empty alt Last thread exited Thread ->> Process: May trigger process exit end
Thread Exit Process Sequence Diagram
Sources: src/thread.rs(L29 - L39) src/process.rs(L195 - L225)
Exit Process Details
- Acquire Lock: The thread acquires a lock on the process's thread group.
- Update Exit Code: If the thread group hasn't already been marked as exited (through `group_exit()`), the exit code is updated with the provided value.
- Remove Thread: The thread is removed from the thread group's thread map.
- Check Last Thread: The method returns `true` if this thread was the last one in the group, which may trigger further actions:
pub fn exit(&self, exit_code: i32) -> bool {
let mut tg = self.process.tg.lock();
if !tg.group_exited {
tg.exit_code = exit_code;
}
tg.threads.remove(&self.tid);
tg.threads.is_empty()
}
- Process Termination: If the last thread exits, the caller is responsible for handling process termination if needed.
Sources: src/thread.rs(L29 - L39)
Group Exit Mechanism
The thread group can be marked for group exit, which affects how individual thread exits are handled:
flowchart TD A["Process::group_exit()"] B["Set group_exited = true"] C["Thread::exit(exit_code)"] D["Check group_exited"] E["group_exited?"] F["Keep existing exit_code"] G["Set exit_code = new value"] H["Remove thread from group"] I["Last thread?"] J["Return true"] K["Return false"] A --> B C --> D D --> E E --> F E --> G F --> H G --> H H --> I I --> J I --> K
Group Exit Mechanism Diagram
Sources: src/process.rs(L179 - L186) src/thread.rs(L29 - L39)
Group Exit Details
- Initiation: A process can be marked for group exit by calling `Process::group_exit()`:
pub fn group_exit(&self) {
    self.tg.lock().group_exited = true;
}
- Effect on Threads: When threads exit after group exit is set:
  - The exit code from individual threads is ignored
  - The previously set exit code (before group exit) is preserved
- Exit Status Preservation: This mechanism allows the exit status to be fixed at a specific value regardless of how individual threads exit.
Sources: src/process.rs(L179 - L186) src/thread.rs(L32 - L34)
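These rules can be condensed into a small self-contained model. The struct below mirrors the documented logic (record the exit code unless `group_exited` is set; report whether the group is now empty) but is an illustrative sketch, not the axprocess types, which also involve locking and a weak-reference map.

```rust
struct ThreadGroup {
    exit_code: i32,
    group_exited: bool,
    live_threads: usize,
}

impl ThreadGroup {
    /// Models Thread::exit: returns true if this was the last live thread.
    fn thread_exit(&mut self, exit_code: i32) -> bool {
        if !self.group_exited {
            self.exit_code = exit_code; // individual exits record their code
        }
        self.live_threads -= 1;
        self.live_threads == 0
    }
}

fn main() {
    let mut tg = ThreadGroup { exit_code: 0, group_exited: false, live_threads: 2 };
    assert!(!tg.thread_exit(1)); // first thread exits: code 1 recorded, not last
    tg.group_exited = true;      // group exit requested: status is now fixed
    assert!(tg.thread_exit(99)); // last thread exits: 99 is ignored
    assert_eq!(tg.exit_code, 1); // the pre-group-exit code is preserved
}
```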
Impact on Process Lifecycle
Thread exits play a critical role in the process lifecycle:
flowchart TD A["Thread::exit(exit_code)"] B["Last thread?"] C["May trigger Process::exit()"] D["Mark process as zombie"] E["Reparent children to init process"] F["Process becomes zombie"] G["Process continues running"] H["Process::group_exit()"] I["All threads exit with same status"] J["Eventually leads to Process::exit()"] A --> B B --> C B --> G C --> D D --> E E --> F H --> I I --> J
Thread Exit Impact on Process Diagram
Sources: src/thread.rs(L29 - L39) src/process.rs(L195 - L225)
Key Considerations
- Last Thread Exit: When the last thread exits, the process itself may need to exit, which is typically handled by the scheduler or executor.
- Zombie Process: When a process exits, it becomes a zombie until its parent collects its exit status and frees it:
pub fn exit(self: &Arc<Self>) {
    // Check not init process
    // Mark as zombie
    self.is_zombie.store(true, Ordering::Release);
    // Reparent children to init process
    // Additional cleanup
}
- Resource Cleanup:
  - Thread resources are cleaned up when the thread is removed from the thread group
  - Process resources are only fully cleaned up when the zombie process is freed
Sources: src/process.rs(L195 - L236)
Memory Management and Reference Counting
The axprocess system employs careful memory management to ensure proper resource cleanup:
flowchart TD subgraph subGraph1["Weak References"] ThreadGroup["ThreadGroup"] end subgraph subGraph0["Strong References"] Thread["Thread"] Process["Process"] end Process --> ThreadGroup Thread --> Process ThreadGroup --> Thread
Reference Relationship Diagram
Sources: src/thread.rs(L7 - L11) src/process.rs(L18 - L22) src/thread.rs(L76 - L87)
Key Memory Management Patterns
- Thread to Process: Threads maintain strong references (`Arc`) to their parent processes to ensure the process remains alive as long as any thread is running.
- ThreadGroup to Thread: The thread group uses weak references to threads, allowing threads to be dropped when they exit.
- Creation: When a thread is created, it's added to the process's thread group using a weak reference.
- Cleanup: When a thread exits, it's removed from the thread group, allowing its memory to be reclaimed if there are no other references.
Sources: src/thread.rs(L76 - L87) src/thread.rs(L29 - L39) src/process.rs(L18 - L22)
Thread-Process Interaction Summary
The lifecycle of threads is tightly coupled with the lifecycle of their parent process:
| Thread Action | Process Effect |
|---|---|
| Thread creation | Added to process's thread group |
| Normal thread exit | Removed from thread group; exit code recorded if first exit |
| Last thread exit | May trigger process termination |
| Process group exit | All subsequent thread exits preserve the initial exit code |
| Process exit | Resources partially released; process becomes a zombie |
| Process free | All resources fully released |
Sources: src/thread.rs(L29 - L39) src/process.rs(L167 - L191) src/process.rs(L195 - L236)
Conclusion
Understanding the thread lifecycle and exit process is crucial for effective process management in the axprocess system. Threads are the execution units of processes, and their creation and termination directly impact the process lifecycle. The system provides mechanisms for individual thread exit as well as coordinated group exit, with careful resource management through Rust's ownership model.
Memory Management
Relevant source files
This document explains how memory is managed in the axprocess crate, focusing on reference counting patterns, hierarchical object management, and cleanup mechanisms. For information about zombie processes and cleanup specifically, see Zombie Processes and Cleanup. For details about reference counting and ownership patterns, see Reference Counting and Ownership.
Overview of Memory Management Strategy
The axprocess crate implements a hierarchical process management system that uses Rust's ownership model and reference counting patterns to ensure memory safety while maintaining proper object relationships. The system employs both strong references (Arc) and weak references (Weak) strategically to prevent memory leaks and reference cycles.
flowchart TD subgraph subGraph0["Memory Management Strategy"] A["Process Management Objects"] B["Strong Reference (Arc)"] C["Weak Reference (Weak)"] D["Upward References(Child→Parent Type)"] E["Downward References(Parent→Child Type)"] F["Circular References"] end A --> B A --> C B --> D C --> E C --> F
Sources: src/process.rs(L1 - L10) src/process_group.rs(L1 - L9) src/session.rs(L1 - L9)
Reference Hierarchy and Ownership Model
The axprocess crate implements a hierarchical memory management model with carefully designed ownership relationships between different components:
Sources: src/process.rs(L35 - L47) src/process_group.rs(L12 - L17) src/session.rs(L12 - L17) src/thread.rs(L7 - L11)
Strong vs Weak References
The system carefully balances the use of strong references (Arc) and weak references (Weak) to maintain object relationships while preventing memory leaks:
| Component | Field | Reference Type | Purpose |
|---|---|---|---|
| Process | children | Strong | Keep child processes alive while parent exists |
| Process | parent | Weak | Prevent reference cycles between parent and child |
| Process | group | Strong | Keep process group alive while process exists |
| ProcessGroup | session | Strong | Keep session alive while process group exists |
| ProcessGroup | processes | Weak | Allow processes to be cleaned up independently |
| Session | process_groups | Weak | Allow process groups to be cleaned up independently |
| Thread | process | Strong | Keep process alive while thread exists |
| Process | tg.threads | Weak | Allow threads to be cleaned up independently |
Sources: src/process.rs(L35 - L47) src/process_group.rs(L12 - L17) src/session.rs(L12 - L17) src/thread.rs(L7 - L11)
Core Data Structures
The memory management system relies on specialized data structures for managing references:
classDiagram class StrongMap { +insert(key, value) +remove(key) +values() Vec~Arc~T~~ } class WeakMap { +insert(key, &Arc~T~) +remove(key) +values() Vec~Arc~T~~ +upgrade(key) Option~Arc~T~~ } class SpinNoIrq { +lock() MutexGuard +new(T) SpinNoIrq~T~ } class Process { } class ProcessGroup { } class Session { } Process --> StrongMap : "children" Process --> WeakMap : "tg.threads" ProcessGroup --> WeakMap : "processes" Session --> WeakMap : "process_groups" Process --> SpinNoIrq : "contains (thread-safe)"
Sources: src/process.rs(L14) src/process.rs(L18 - L22) src/process_group.rs(L16) src/session.rs(L15)
Process Creation and Memory Allocation
The `ProcessBuilder` pattern manages memory allocation during process creation, ensuring proper initialization of reference relationships:
sequenceDiagram participant Client as Client participant ProcessBuilder as "ProcessBuilder" participant Process as "Process" participant ProcessGroup as "ProcessGroup" participant Session as "Session" participant INIT_PROC as INIT_PROC Client ->> ProcessBuilder: fork(pid) or new_init(pid) Client ->> ProcessBuilder: data(custom_data) Client ->> ProcessBuilder: build() ProcessBuilder ->> Process: create Process object alt Init Process Process ->> Session: new(pid) Session ->> ProcessGroup: new(pid, session) else Child Process Process -->> Process: inherit parent's group end Process ->> ProcessGroup: add self (weak ref) alt Init Process Process ->> INIT_PROC: initialize lazy static else Child Process Process ->> Process: add to parent's children (strong ref) end ProcessBuilder -->> Client: return Arc<Process>
Sources: src/process.rs(L260 - L341) src/process_group.rs(L19 - L29) src/session.rs(L19 - L27)
Zombie Process Management
When a process exits, it becomes a zombie, and its memory management changes:
During the zombie state:
- The process marks itself as a zombie using an atomic boolean
- Child processes are reparented to the init process
- Process resources are partially released
- The parent must call `free()` to complete cleanup
Sources: src/process.rs(L195 - L237) src/thread.rs(L29 - L40)
Parent-Child Memory Management
The parent-child relationship memory management is particularly important:
The parent keeps strong references to children in a `StrongMap`, while children hold weak references to their parent. This prevents reference cycles while maintaining the parent-child relationship.
Sources: src/process.rs(L70 - L81) src/process.rs(L195 - L237)
Thread Memory Management
Threads are managed within a process using a thread group:
Threads maintain strong references to their parent process, ensuring the process stays alive as long as any thread is running. The process maintains weak references to its threads, preventing reference cycles.
Sources: src/process.rs(L18 - L31) src/thread.rs(L7 - L40)
Session and Process Group Memory Management
Sessions and process groups form the higher levels of the hierarchy:
Process groups maintain strong references to their session, while processes maintain strong references to their process group. This upward ownership pattern ensures that higher-level objects remain alive as long as any lower-level object needs them.
Sources: src/process.rs(L83 - L164) src/process_group.rs(L12 - L47) src/session.rs(L12 - L45)
Memory Safety Mechanisms
The axprocess crate employs several mechanisms to ensure memory safety:
- Thread-safe access: Using `SpinNoIrq` locks for shared mutable state
- Atomic operations: Using `AtomicBool` for zombie state tracking
- Builder pattern: Ensuring proper initialization with `ProcessBuilder` and `ThreadBuilder`
- Reference counting: Using `Arc` and `Weak` for managing object lifetimes
- Explicit cleanup: Using `exit()` and `free()` methods for proper resource cleanup
Sources: src/process.rs(L35 - L47) src/process.rs(L195 - L237) src/thread.rs(L29 - L40)
Summary
The memory management system in axprocess creates a hierarchical model where:
- Objects lower in the hierarchy (threads, processes) hold strong references to objects higher up (process groups, sessions)
- Objects higher in the hierarchy hold weak references to objects lower down
- Special cases like parent-child process relationships use weak references for parents to avoid reference cycles
- Thread-safe access is ensured through spinlocks and atomic operations
- Zombie state management prevents premature cleanup while allowing proper resource release
This design ensures memory safety while maintaining the flexibility required for process management in an operating system.
Sources: src/process.rs src/process_group.rs src/session.rs src/thread.rs
Reference Counting and Ownership
Relevant source files
This document explains how the axprocess crate implements memory management through Rust's reference counting mechanisms. It details the ownership patterns between system components (Sessions, ProcessGroups, Processes, and Threads) and how they prevent memory leaks while maintaining proper object lifetimes.
For information about cleanup of terminated processes, see Zombie Processes and Cleanup.
Reference Counting Fundamentals
The axprocess crate relies on Rust's smart pointers to manage memory and object lifetimes:
- Arc (Atomic Reference Counting): Provides shared ownership of a value with thread-safe reference counting
- Weak: A non-owning reference that doesn't prevent deallocation when all Arc references are dropped
This approach avoids both manual memory management and garbage collection, guaranteeing memory safety while maintaining predictable resource cleanup.
flowchart TD A["Arc"] O["Object"] W["Weak"] RC["Reference Count"] D["Deallocate"] A --> O A --> RC RC --> D W --> O W --> RC
Diagram: Reference Counting Basics
Sources: src/process.rs(L1 - L4)
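The diagram above can be demonstrated directly with the standard library: an `Arc` owns its value, while a `Weak` observes it without keeping it alive. The `observe` helper below is illustrative, not part of axprocess.

```rust
use std::sync::{Arc, Weak};

// Returns (strong count before drop, weak upgradable before, after).
fn observe() -> (usize, bool, bool) {
    let strong = Arc::new(5);
    let weak: Weak<i32> = Arc::downgrade(&strong);
    let count = Arc::strong_count(&strong);      // one owner
    let alive_before = weak.upgrade().is_some(); // upgrade succeeds
    drop(strong);                                // last owner gone: deallocate
    let alive_after = weak.upgrade().is_some();  // upgrade now fails
    (count, alive_before, alive_after)
}

fn main() {
    assert_eq!(observe(), (1, true, false));
}
```

The `upgrade()` call is the key operation the crate's weak maps rely on: it atomically checks whether the value still exists and, if so, hands back a fresh `Arc`.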
Ownership Hierarchy in axprocess
The axprocess system employs a careful hierarchy of strong and weak references to maintain proper component ownership while preventing reference cycles.
classDiagram class Process { pid: Pid children: StrongMap~ parent: Weak group: Arc } class ProcessGroup { pgid: Pid session: Arc processes: WeakMap~ } class Session { sid: Pid process_groups: WeakMap~ } class Thread { tid: Pid process: Arc } Thread --> Process : "strong reference" Process --> ProcessGroup : "strong reference" ProcessGroup --> Session : "strong reference" Process --> Process : "strong references to children" Process ..> Process : "weak reference to parent" ProcessGroup ..> Process : "weak references" Session ..> ProcessGroup : "weak references"
Diagram: Reference Relationships Between Components
Sources: src/process.rs(L35 - L47) src/process_group.rs(L13 - L17) src/session.rs(L13 - L17)
Reference Direction Strategy
The system uses a deliberate pattern for determining which direction uses strong vs. weak references:
Upward Strong References
Components hold strong references (`Arc`) to their "container" components:
- Processes strongly reference their ProcessGroup
- ProcessGroups strongly reference their Session
This ensures container components remain alive as long as any child component needs them.
Downward Weak References
Container components hold weak references to their "members":
- Sessions weakly reference their ProcessGroups
- ProcessGroups weakly reference their Processes
- ThreadGroups weakly reference their Threads
This prevents reference cycles while allowing containers to access their members.
Hierarchical Strong References
Processes hold strong references to their children, ensuring child processes remain valid while the parent exists. This reflects the parent-child ownership model where parents are responsible for their children's lifecycle.
flowchart TD subgraph subGraph2["Reference Direction Strategy"] Parent["Parent Process"] Child["Child Process"] S2["Session"] PG2["ProcessGroup"] P["Process"] PG["ProcessGroup"] S["Session"] P2["Process"] end Child --> Parent P --> PG PG --> S PG2 --> P2 Parent --> Child S2 --> PG2
Diagram: Reference Direction Strategy
Sources: src/process.rs(L43 - L46) src/process_group.rs(L14 - L16) src/session.rs(L14 - L15)
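The parent-child case can be sketched with standard-library types: the parent holds strong `Arc`s to its children, while each child holds only a `Weak` back to the parent, so no cycle forms. The `Parent`/`Child` structs below are illustrative stand-ins, not the axprocess types.

```rust
use std::sync::{Arc, Mutex, Weak};

struct Parent {
    children: Mutex<Vec<Arc<Child>>>,
}

struct Child {
    parent: Weak<Parent>,
}

fn main() {
    let parent = Arc::new(Parent { children: Mutex::new(Vec::new()) });
    let child = Arc::new(Child { parent: Arc::downgrade(&parent) });
    parent.children.lock().unwrap().push(child.clone());

    // While the parent lives, the child can reach it.
    assert!(child.parent.upgrade().is_some());

    // Dropping the last Arc<Parent> frees the parent (and its strong refs
    // to children); the child's weak back-pointer does not prevent this.
    drop(parent);
    assert!(child.parent.upgrade().is_none());
}
```

Had the child held an `Arc<Parent>` instead, parent and child would keep each other alive forever; the `Weak` back-pointer is what breaks the cycle.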
Implementation Details
Process Ownership
The `Process` struct maintains:
- Strong references to children in a `StrongMap`
- A weak reference to its parent
- A strong reference to its ProcessGroup
Process {
    children: SpinNoIrq<StrongMap<Pid, Arc<Process>>>,
    parent: SpinNoIrq<Weak<Process>>,
    group: SpinNoIrq<Arc<ProcessGroup>>,
}
Sources: src/process.rs(L43 - L46)
ProcessGroup Ownership
The `ProcessGroup` struct maintains:
- A strong reference to its Session
- Weak references to its member Processes
ProcessGroup {
    session: Arc<Session>,
    processes: SpinNoIrq<WeakMap<Pid, Weak<Process>>>,
}
Sources: src/process_group.rs(L14 - L16)
Session Ownership
The `Session` struct maintains:
- Weak references to its member ProcessGroups
Session {
    process_groups: SpinNoIrq<WeakMap<Pid, Weak<ProcessGroup>>>,
}
Sources: src/session.rs(L14 - L15)
Reference Management During Object Creation
When creating objects, the system carefully establishes the appropriate references:
- When a `Process` is created:
  - It acquires a strong reference to its ProcessGroup
  - The ProcessGroup stores a weak reference back to the Process
  - If it has a parent, the parent stores a strong reference to it
  - It stores a weak reference to its parent
- When a `ProcessGroup` is created:
  - It acquires a strong reference to its Session
  - The Session stores a weak reference back to the ProcessGroup
sequenceDiagram participant ProcessBuilder as ProcessBuilder participant Process as Process participant ProcessGroup as ProcessGroup participant Session as Session Note over ProcessBuilder: ProcessBuilder::build() alt No parent (init process) ProcessBuilder ->> Session: Session::new(pid) ProcessBuilder ->> ProcessGroup: ProcessGroup::new(pid, &session) else Has parent ProcessBuilder ->> ProcessGroup: parent.group() end ProcessBuilder ->> Process: Create new Process Process ->> ProcessGroup: group.processes.insert(pid, weak_ref) alt Has parent Process ->> Process: parent.children.insert(pid, strong_ref) else No parent (init process) Process ->> Process: INIT_PROC.init_once(process) end
Diagram: Reference Setup During Process Creation
Sources: src/process.rs(L302 - L331) src/process_group.rs(L21 - L28) src/session.rs(L20 - L26)
Reference Management During Process Termination
When a Process exits:
- It is marked as a zombie
- Its children are re-parented to the init process
- The children update their weak parent reference to point to the init process
- The init process takes strong ownership of the children
sequenceDiagram participant Process as Process participant InitProcess as Init Process participant ChildProcesses as Child Processes Process ->> Process: is_zombie.store(true) Process ->> InitProcess: Get init_proc() Process ->> Process: Take children loop For each child Process ->> ChildProcesses: Update weak parent reference to init Process ->> InitProcess: Add child to init's children end
Diagram: Reference Management During Process Exit
Sources: src/process.rs(L207 - L225)
Memory Safety Considerations
The reference counting design in axprocess provides several safety guarantees:
| Safety Feature | Implementation | Benefit |
|---|---|---|
| No reference cycles | Strategic use of weak references | Prevents memory leaks |
| Component lifetime guarantees | Upward strong references | Components can't be deallocated while in use |
| Clean resource release | Weak references in containers | Enables efficient cleanup without dangling pointers |
| Automatic cleanup | Arc drop semantics | Resources are freed when no longer needed |
| Thread safety | Arc's atomic reference counting | Safe to use across threads |
Sources: src/process.rs(L35 - L47) src/process_group.rs(L13 - L17) src/session.rs(L13 - L17)
Practical Example: Process Lifecycle References
Let's trace the reference management during a process's lifecycle:
- Process creation:
  - The parent process creates a child using `fork()` and `ProcessBuilder::build()`
  - The child gets a strong reference to the parent's process group
  - The parent stores a strong reference to the child
  - The child stores a weak reference to the parent
- Process execution:
  - The process maintains its references throughout execution
- Process termination:
  - The process calls `exit()` and is marked as a zombie
  - Child processes are re-parented to init
  - The parent process eventually calls `free()` to remove its strong reference
  - When all strong references are gone, the process is deallocated
Sources: src/process.rs(L207 - L236) src/process.rs(L275 - L331)
Utility Functions for Reference Management
The codebase provides several methods to manage references between components:
Method | Purpose | Reference Type |
---|---|---|
Process::parent() | Get parent process | Weak → Strong conversion |
Process::children() | Get child processes | Strong references |
Process::group() | Get process group | Strong reference |
ProcessGroup::session() | Get session | Strong reference |
ProcessGroup::processes() | Get member processes | Weak → Strong conversion |
Session::process_groups() | Get process groups | Weak → Strong conversion |
Sources: src/process.rs(L73 - L80) src/process.rs(L86 - L88) src/process_group.rs(L33 - L46) src/session.rs(L30 - L38)
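The "Weak → Strong conversion" rows boil down to `Weak::upgrade`. A minimal sketch, using a plain slice of weak references in place of the crate's real weak-map container: upgrading yields `Some(Arc)` only for entries whose value is still alive, so dropped processes are silently skipped.

```rust
use std::sync::{Arc, Weak};

// Collect strong references to the entries that are still alive,
// skipping any whose value has already been dropped.
fn upgrade_live<T>(weak_refs: &[Weak<T>]) -> Vec<Arc<T>> {
    weak_refs.iter().filter_map(Weak::upgrade).collect()
}
```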
Conclusion
The reference counting and ownership model in axprocess provides a robust foundation for memory management by:
- Using strong references strategically to ensure components remain alive as needed
- Using weak references to prevent reference cycles
- Following a consistent pattern of upward strong references and downward weak references
- Maintaining proper parent-child relationships through appropriate reference types
This approach leverages Rust's ownership model to provide memory safety without garbage collection, ensuring efficient and predictable resource management.
Zombie Processes and Cleanup
Relevant source files
This document explains how axprocess manages terminated processes (zombies) and their eventual cleanup. It covers the zombie state, resource management, process inheritance, and the cleanup mechanisms that ensure proper resource deallocation. For related information about the overall process lifecycle, see Process Lifecycle.
Zombie Process Concept
In axprocess, a zombie process is a process that has terminated execution but still exists in the system's process table. When a process exits, it doesn't immediately disappear - it enters a zombie state where some minimal information is retained until its parent process acknowledges the termination.
Sources: src/process.rs(L196 - L236)
Zombie State Implementation
When a process terminates, it's marked as a zombie through the `Process::exit()` method, which sets the `is_zombie` atomic flag to true. In this state:
- The process is no longer executing but still exists in the process table
- Resources are partially released
- Exit status is preserved for the parent process to retrieve
- Child processes are reassigned to the init process
The zombie state allows the parent process to retrieve exit information from its children before they're completely deallocated.
classDiagram class Process { pid: Pid is_zombie: AtomicBool tg: SpinNoIrq data: Box children: StrongMap~ parent: Weak group: Arc +is_zombie() bool +exit() void +free() void } class ThreadGroup { threads: WeakMap~ exit_code: i32 group_exited: bool } Process --> ThreadGroup : contains
Sources: src/process.rs(L35 - L47) src/process.rs(L196 - L225)
Zombie Process Cleanup
Cleanup of zombie processes is a two-step process:
- A process terminates by calling `Process::exit()`, which marks it as a zombie
- The parent process calls `Process::free()` to complete the cleanup

The `free()` method removes the zombie process from its parent's children list. If a process is freed before it's marked as a zombie, the system will panic to prevent incorrect resource management.
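This two-step cleanup can be sketched with a minimal stand-in type. The method names (`exit`, `is_zombie`, `free`) follow the docs, but the body is an assumption-laden sketch, not the crate's actual implementation.

```rust
use std::sync::atomic::{AtomicBool, Ordering};

// Minimal sketch of the zombie flag and two-step cleanup (not real axprocess code).
struct Process {
    is_zombie: AtomicBool,
}

impl Process {
    fn new() -> Self {
        Process { is_zombie: AtomicBool::new(false) }
    }

    // Step 1: the process marks itself as a zombie on termination.
    fn exit(&self) {
        self.is_zombie.store(true, Ordering::Release);
    }

    fn is_zombie(&self) -> bool {
        self.is_zombie.load(Ordering::Acquire)
    }

    // Step 2: the parent frees the zombie; freeing a live process panics,
    // matching the documented behavior.
    fn free(&self) {
        assert!(self.is_zombie(), "tried to free a process that is not a zombie");
        // ... here the real implementation removes self from the parent's children map
    }
}
```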
sequenceDiagram participant ChildProcess as "Child Process" participant ParentProcess as "Parent Process" participant InitProcess as "Init Process" ChildProcess ->> ChildProcess: exit() Note over ChildProcess: Sets is_zombie = true alt Parent still alive ParentProcess ->> ChildProcess: free() Note over ChildProcess,ParentProcess: Remove from parent's children else Parent already exited InitProcess ->> ChildProcess: free() Note over ChildProcess,InitProcess: Remove from init's children end
Sources: src/process.rs(L227 - L236) tests/process.rs(L25 - L44)
Resource Management During Exit
The `exit()` implementation handles several key cleanup tasks:
- Marks the process as a zombie using atomic operations
- Reassigns child processes to a reaper (currently always the init process)
- Updates parent references in all child processes
- Maintains the process in the parent's children list for later cleanup
Table: Key Resources in Zombie Processes
Resource | Status in Zombie Process | Cleaned Up By |
---|---|---|
Memory for Process structure | Still allocated | `free()` method |
Child process references | Transferred to init | `exit()` method |
Parent reference | Maintained | Parent's `children` map |
Process Group membership | Maintained | Not removed until `free()` |
Exit code | Preserved | Stored in ThreadGroup |
Sources: src/process.rs(L196 - L225)
Orphan Process Handling
When a parent process exits before its children, the children become "orphaned" and are inherited by the init process. This prevents zombie processes from becoming permanent if their parents exit without cleaning them up.
flowchart TD subgraph subGraph1["After Parent Exit"] ParentZ["Parent Process (Zombie)"] subgraph subGraph0["Before Parent Exit"] Init["Init Process"] Child1a["Child Process 1"] Child2a["Child Process 2"] Parent["Parent Process"] Child1["Child Process 1"] Child2["Child Process 2"] end end Init --> Child1a Init --> Child2a Init --> ParentZ Parent --> Child1 Parent --> Child2
Implementation details:
- When a process exits, it transfers all its children to the init process (or designated subreaper)
- Each child's parent reference is updated to point to the new parent
- These processes now appear in the init process's children collection
- The init process becomes responsible for cleaning them up when they exit
Sources: src/process.rs(L207 - L224) tests/process.rs(L47 - L55)
Cleanup Implementation Details
The zombie cleanup is implemented through reference management. Let's examine how this is done:
flowchart TD subgraph subGraph0["Reference Management"] Parent["Parent Process"] Child["Child Process"] Zombie["Zombie State"] Freed["Removed from parent"] end Child --> Parent Child --> Zombie Parent --> Child Zombie --> Freed
Key implementation points:
- The parent holds strong references (`Arc<Process>`) to its children in a `StrongMap`
- Children hold weak references (`Weak<Process>`) to their parent
- When `free()` is called, the zombie process is removed from its parent's `children` map
- This removes the strong reference, allowing memory deallocation when all references are gone

The `Process::free()` method also checks that a process is actually a zombie before freeing it, to prevent accidental cleanup of active processes.
Sources: src/process.rs(L227 - L236) tests/process.rs(L25 - L29)
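The deallocation rule behind these points can be demonstrated directly with `std::sync::Arc` and `Weak`: once the last strong reference (the one held in the parent's children map) is dropped, `Weak::upgrade` can no longer revive the value.

```rust
use std::sync::{Arc, Weak};

// Returns (alive_before_drop, alive_after_drop) to show that dropping
// the last Arc makes Weak::upgrade return None.
fn drop_last_strong_ref() -> (bool, bool) {
    let process = Arc::new(String::from("child"));
    let parent_link: Weak<String> = Arc::downgrade(&process);
    let before = parent_link.upgrade().is_some(); // still reachable
    drop(process);                                // last strong reference gone
    let after = parent_link.upgrade().is_some();  // now deallocated
    (before, after)
}
```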
Special Case: Init Process
The init process requires special handling in the context of zombies:
- The init process cannot exit (calling `exit()` on it will panic)
- It's responsible for cleaning up orphaned processes
- It must properly handle zombie processes inherited from terminated parents
This special status ensures that there's always a process available to clean up orphaned zombies, preventing resource leaks.
Sources: src/process.rs(L207 - L209) tests/process.rs(L31 - L35)
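The init guard can be sketched as a simple PID check; the `INIT_PID` constant and the exact condition here are assumptions for illustration, and the real check in axprocess may differ.

```rust
// Sketch of the documented rule: calling exit() on init panics.
struct Process {
    pid: u32,
}

const INIT_PID: u32 = 1; // assumed init PID for this sketch

impl Process {
    fn exit(&self) {
        if self.pid == INIT_PID {
            panic!("init process cannot exit");
        }
        // ... normal zombie transition for every other process
    }
}
```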
Resource Management Considerations
Proper zombie process management is essential for preventing resource leaks:
- Memory leaks: Zombie processes that are never freed can accumulate and waste memory
- Process ID exhaustion: Each zombie still occupies a process ID
- Parent responsibility: Parents must clean up their zombie children
Users of this API must ensure they properly handle the cleanup of zombie processes by calling `free()` after retrieving any needed exit information.
Sources: src/process.rs(L227 - L236)
Development and Testing
Relevant source files
This document outlines the development practices, testing methodologies, and CI/CD pipeline for the axprocess crate. It provides information for developers who want to contribute to or modify the codebase, explaining how to set up a development environment, run tests, and understand the automated workflows in place.
For information about specific process management functionality, see Process Management or Thread Management.
Development Environment
The axprocess crate is built using Rust's standard development tools and follows modern Rust development practices. The codebase uses the nightly Rust toolchain for development and testing.
Code Style and Formatting
Code formatting is strictly defined through the project's `rustfmt.toml` configuration file. All code contributions should adhere to these formatting guidelines.
# Key rustfmt settings
unstable_features = true
style_edition = "2024"
group_imports = "StdExternalCrate"
imports_granularity = "Crate"
normalize_comments = true
wrap_comments = true
reorder_impl_items = true
format_strings = true
format_code_in_doc_comments = true
To ensure consistent formatting, run `rustfmt` with the project's configuration before submitting any code changes:
cargo +nightly fmt
Sources: rustfmt.toml(L1 - L19)
Development Workflow
Typical Development Workflow
flowchart TD A["Clone Repository"] B["Setup Nightly Toolchain"] C["Implement Feature/Bug Fix"] D["Run Tests Locally"] E["Format Code with rustfmt"] F["Check with Clippy"] G["Create Pull Request"] H["CI Checks Run"] I["Code Review"] J["Address Feedback"] K["Merge to main"] A --> B B --> C C --> D D --> E E --> F F --> G G --> H H --> I I --> J I --> K J --> H
Sources: .github/workflows/ci.yml(L1 - L62) rustfmt.toml(L1 - L19)
Testing Methodology
The axprocess crate employs several testing approaches to ensure code quality and correctness. The codebase follows Rust's standard testing conventions, with tests organized within the source files themselves.
Types of Tests
Testing Structure in axprocess
flowchart TD subgraph subGraph0["Test Execution"] E["cargo test --all-features"] end A["axprocess Tests"] B["Unit Tests"] C["Integration Tests"] D["Documentation Tests"] B1["Process Component Tests"] B2["Thread Management Tests"] B3["Session/Group Tests"] C1["Component Interaction Tests"] D1["API Example Tests"] A --> B A --> C A --> D B --> B1 B --> B2 B --> B3 B1 --> E B2 --> E B3 --> E C --> C1 C1 --> E D --> D1 D1 --> E
Running Tests Locally
To run the full test suite locally:
cargo test --all-features
For running specific tests:
cargo test <test_name> --all-features
For verbose test output:
cargo test -- --nocapture
Sources: .github/workflows/ci.yml(L29 - L30)
CI/CD Pipeline
The project uses GitHub Actions for continuous integration and deployment, ensuring that all code changes are automatically tested and documented.
CI Workflow
CI/CD Pipeline Architecture
flowchart TD A["Push to main/PR"] B["GitHub Actions CI Workflow"] C["check job"] D["doc job"] C1["Setup nightly toolchain"] C2["Run clippy linter"] C3["Run cargo test"] D1["Setup nightly toolchain"] D2["Build documentation"] D3["Prepare doc artifact"] E["deploy job"] E1["Deploy to GitHub Pages"] A --> B B --> C B --> D C --> C1 C1 --> C2 C2 --> C3 D --> D1 D1 --> D2 D2 --> D3 D3 --> E E --> E1
Sources: .github/workflows/ci.yml(L1 - L62)
CI Jobs and Tasks
The CI pipeline consists of three main jobs:
Job | Purpose | Key Tasks |
---|---|---|
check | Code quality & testing | Run clippy linter, execute test suite |
doc | Documentation | Build API documentation, prepare artifact |
deploy | Publication | Deploy documentation to GitHub Pages |
The CI workflow is triggered on:
- Push events to the main branch
- Pull requests targeting the main branch
Each job in the workflow runs on the latest Ubuntu environment.
Environment Variables
The CI environment sets the following variables:
RUST_BACKTRACE: 1
This ensures that any test failures provide detailed backtraces to help identify the source of problems.
Sources: .github/workflows/ci.yml(L15 - L16)
Documentation Generation
The documentation job automatically generates API documentation using `cargo doc` and deploys it to GitHub Pages. This ensures that the latest documentation is always available online.
sequenceDiagram participant GitHubActions as "GitHub Actions" participant DocJob as "Doc Job" participant GitHubPages as "GitHub Pages" GitHubActions ->> DocJob: Trigger on main branch changes DocJob ->> DocJob: Setup nightly toolchain DocJob ->> DocJob: Run "cargo doc --all-features --no-deps" DocJob ->> DocJob: Create index.html with redirect DocJob ->> GitHubActions: Upload artifact GitHubActions ->> GitHubPages: Deploy artifact GitHubPages ->> GitHubPages: Publish documentation
Sources: .github/workflows/ci.yml(L32 - L61)
Best Practices for Contributors
When contributing to the axprocess crate:
- Always use the nightly Rust toolchain as specified in the CI configuration
- Ensure code passes clippy linting with `cargo clippy --all-features --all-targets`
- Add appropriate tests for new functionality
- Format code according to the project's rustfmt configuration
- Add documentation comments for public APIs
- Verify that all tests pass before submitting a pull request
Following these practices ensures that contributions integrate smoothly with the existing codebase and pass the automated CI checks.
Sources: .github/workflows/ci.yml(L1 - L62) rustfmt.toml(L1 - L19)
Testing Approach
Relevant source files
This document outlines the testing methodology used for the axprocess crate, focusing on how process management components are tested within the system. The axprocess crate leverages Rust's built-in testing framework to ensure proper functionality of process, process group, and session abstractions.
Test Organization
The test suite is organized into multiple files, each focused on testing specific subsystems:
flowchart TD A["Tests Structure"] B["process.rs"] C["group.rs"] D["session.rs"] E["common/mod.rs"] B1["Process lifecycle tests"] B2["Parent-child relationship tests"] C1["Process group functionality tests"] C2["Group membership tests"] D1["Session management tests"] D2["Session-group relationship tests"] E1["Common test utilities"] E2["ProcessExt trait"] A --> B A --> C A --> D A --> E B --> B1 B --> B2 C --> C1 C --> C2 D --> D1 D --> D2 E --> E1 E --> E2
Sources: tests/process.rs tests/group.rs tests/session.rs tests/common/mod.rs
Testing Infrastructure
Test Initialization
The axprocess tests utilize a common initialization mechanism that runs before any tests:
sequenceDiagram participant TestFramework as "Test Framework" participant Commoninitfunction as "Common init function" participant Processnew_init as "Process::new_init" Note over Commoninitfunction: Runs before any tests using TestFramework ->> Commoninitfunction: Load test module Commoninitfunction ->> Commoninitfunction: alloc_pid() Commoninitfunction ->> Processnew_init: new_init(pid) Processnew_init ->> Processnew_init: build() Note over Processnew_init: init process created
The `ctor` crate is used to automatically initialize the test environment by creating an initial process before any tests run.
Sources: tests/common/mod.rs(L15 - L18)
PID Allocation
Tests use a simple atomic counter to allocate unique process IDs:
static PID: AtomicU32 = AtomicU32::new(0);

fn alloc_pid() -> u32 {
    PID.fetch_add(1, Ordering::SeqCst)
}
This ensures that each test process receives a unique PID without conflicts.
Sources: tests/common/mod.rs(L9 - L13)
ProcessExt Trait
To simplify test code, a `ProcessExt` trait provides helper methods for common operations:
pub trait ProcessExt {
    fn new_child(&self) -> Self;
}

impl ProcessExt for Arc<Process> {
    fn new_child(&self) -> Self {
        self.fork(alloc_pid()).build()
    }
}
This extension trait makes test code more concise by providing shortcuts for creating child processes.
Sources: tests/common/mod.rs(L20 - L28)
Test Categories
Process Lifecycle Tests
These tests verify the fundamental process management capabilities:
Test Name | Purpose |
---|---|
child | Verifies parent-child relationship creation |
exit | Tests process termination and zombie state transition |
free_not_zombie | Verifies that freeing non-zombie processes causes panic |
init_proc_exit | Ensures init process cannot be terminated |
free | Tests resource cleanup after process termination |
reap | Verifies orphan handling when parent processes exit |
Example test verifying process exit:
#[test]
fn exit() {
let parent = init_proc();
let child = parent.new_child();
child.exit();
assert!(child.is_zombie());
assert!(parent.children().iter().any(|c| Arc::ptr_eq(c, &child)));
}
Sources: tests/process.rs(L8 - L55)
Process Group Tests
Tests in this category verify process group functionality:
Test Name | Purpose |
---|---|
basic | Tests basic process group properties |
create | Verifies process group creation |
create_leader | Tests group leader constraints |
cleanup | Verifies resource cleanup |
inherit | Tests group inheritance by child processes |
move_to | Tests moving processes between groups |
move_cleanup | Verifies empty group cleanup |
move_back | Tests moving processes back to previous groups |
cleanup_processes | Tests group cleanup after processes exit |
Sources: tests/group.rs(L8 - L141)
Session Tests
These tests verify session functionality:
Test Name | Purpose |
---|---|
basic | Tests basic session properties |
create | Verifies session creation |
create_leader | Tests session leader constraints |
cleanup | Verifies resource cleanup |
create_group | Tests group creation within a session |
move_to_different_session | Verifies cross-session move constraints |
cleanup_groups | Tests session cleanup after groups disappear |
Sources: tests/session.rs(L8 - L108)
Test Method Patterns
The axprocess test suite follows three recurring patterns: standard tests, error tests, and resource cleanup tests.
Sources: tests/process.rs tests/group.rs tests/session.rs
Standard Tests
Most tests follow a structure of:
- Initialize the test environment (create necessary processes)
- Perform the operation being tested
- Assert the expected outcomes using
assert!
or similar functions
Example from group.rs:
#[test]
fn basic() {
let init = init_proc();
let group = init.group();
assert_eq!(group.pgid(), init.pid());
let child = init.new_child();
assert!(Arc::ptr_eq(&group, &child.group()));
let processes = group.processes();
assert!(processes.iter().any(|p| Arc::ptr_eq(p, &init)));
assert!(processes.iter().any(|p| Arc::ptr_eq(p, &child)));
}
Sources: tests/group.rs(L8 - L20)
Error Tests
Tests that verify error handling use the `#[should_panic]` attribute:
#[test]
#[should_panic]
fn free_not_zombie() {
    init_proc().new_child().free();
}
This verifies that attempting to free a non-zombie process triggers a panic as expected.
Sources: tests/process.rs(L25 - L29)
Resource Cleanup Tests
Tests that verify proper resource cleanup often use weak references to ensure resources are properly deallocated:
#[test]
fn cleanup() {
let child = init_proc().new_child();
let group = Arc::downgrade(&child.create_group().unwrap());
assert!(group.upgrade().is_some());
child.exit();
child.free();
drop(child);
assert!(group.upgrade().is_none());
}
Sources: tests/group.rs(L54 - L65)
Running the Tests
Tests can be run using the standard Cargo test command:
cargo test
For more specific subsets of tests:
cargo test --test process # Run only process tests
cargo test --test group # Run only group tests
cargo test --test session # Run only session tests
Relationship to System Architecture
The testing approach directly mirrors the core architecture of the axprocess system:
flowchart TD subgraph subGraph1["Test Modules"] TP["process.rs"] TG["group.rs"] TS["session.rs"] end subgraph subGraph0["System Architecture"] PA["Process Abstraction"] PG["Process Groups"] S["Sessions"] PR["Parent-Child Relationships"] end PA --> TP PG --> TG PR --> TP S --> TS
This one-to-one mapping between system components and test modules ensures comprehensive test coverage.
Sources: tests/process.rs tests/group.rs tests/session.rs
Best Practices for Adding Tests
Based on the existing test patterns, here are the best practices for adding new tests to the axprocess crate:
- Use the common utilities: Leverage the `ProcessExt` trait and other utilities in `common/mod.rs`
- Follow the established patterns: Maintain consistency with existing test structure
- Test one behavior per test: Each test should focus on a specific functionality
- Test both success and failure paths: Add `#[should_panic]` tests for error conditions
- Verify resource cleanup: Use weak references to verify proper resource deallocation
- Maintain independence: Tests should not depend on each other's state
By following these practices, new tests will integrate well with the existing test suite and maintain test quality.
CI/CD Pipeline
Relevant source files
This document details the Continuous Integration and Continuous Deployment (CI/CD) pipeline configured for the `axprocess` repository. It explains how automated testing, linting, and documentation generation are set up to ensure code quality and maintain up-to-date documentation.
For information about testing approaches and how to run tests manually, see Testing Approach.
Pipeline Overview
The `axprocess` repository uses GitHub Actions for its CI/CD pipeline, which automatically runs on code changes to verify quality and deploy documentation. The pipeline ensures that:
- Code follows style guidelines and passes static analysis
- All tests pass successfully
- Documentation is automatically generated and deployed
flowchart TD subgraph subGraph1["CI/CD Pipeline"] check["check job:Linting & Testing"] doc["doc job:Documentation Generation"] deploy["deploy job:Deploy to GitHub Pages"] end subgraph subGraph0["Trigger Events"] push["Push to main branch"] pr["Pull Request to main branch"] end check --> doc doc --> deploy pr --> check push --> check
Sources: .github/workflows/ci.yml(L1 - L62)
Pipeline Trigger Events
The CI/CD pipeline is configured to run automatically in response to specific Git events:
Event Type | Branch | Action |
---|---|---|
Push | main | Run full pipeline |
Pull Request | main | Run full pipeline |
The pipeline uses GitHub's concurrency controls to avoid redundant runs:
concurrency:
group: ${{ github.workflow }}-${{ github.ref }}-${{ github.event_name }}
cancel-in-progress: true
This means that if multiple commits are pushed in quick succession, earlier workflow runs will be canceled in favor of the most recent one, saving CI resources.
Sources: .github/workflows/ci.yml(L3 - L13)
CI Jobs and Steps
The pipeline consists of three main jobs:
flowchart TD subgraph deploy["deploy"] dp1["Deploy to GitHub Pages"] d1["Checkout Code"] c1["Checkout Code"] end subgraph doc["doc"] subgraph check["check"] dp1["Deploy to GitHub Pages"] d1["Checkout Code"] d2["Setup Rust toolchain"] d3["Build Documentation"] d4["Upload Artifact"] c1["Checkout Code"] c2["Setup Rust toolchain"] c3["Run Clippy"] c4["Run Tests"] end end c1 --> c2 c2 --> c3 c3 --> c4 d1 --> d2 d2 --> d3 d3 --> d4
Sources: .github/workflows/ci.yml(L18 - L61)
Check Job
The `check` job runs on Ubuntu and performs the following steps:
- Checks out the repository code
- Sets up the Rust nightly toolchain with the Clippy component
- Runs Clippy for static analysis with warnings treated as errors
- Runs all tests with all features enabled
check:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- name: Setup Rust toolchain
run: |
rustup default nightly
rustup component add clippy
- name: Clippy
run: cargo clippy --all-features --all-targets -- -Dwarnings
- name: Test
run: cargo test --all-features
Sources: .github/workflows/ci.yml(L19 - L30)
Documentation Job
The `doc` job is responsible for generating the Rust documentation:
- Checks out the repository code
- Sets up the Rust nightly toolchain
- Builds the documentation with all features enabled
- Creates an index.html redirect page
- Uploads the generated documentation as an artifact
doc:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- name: Setup Rust toolchain
run: |
rustup default nightly
- name: Build docs
run: |
cargo doc --all-features --no-deps
printf '<meta http-equiv="refresh" content="0;url=%s/index.html">' $(cargo tree | head -1 | cut -d' ' -f1 | tr '-' '_') > target/doc/index.html
- name: Upload artifact
uses: actions/upload-pages-artifact@v3
with:
path: target/doc
Sources: .github/workflows/ci.yml(L32 - L46)
Deploy Job
The `deploy` job takes the documentation artifact and deploys it to GitHub Pages:
- Uses GitHub's deploy-pages action to publish the documentation
- Requires appropriate GitHub permissions configured in the workflow
deploy:
runs-on: ubuntu-latest
needs: doc
permissions:
contents: read
pages: write
id-token: write
environment:
name: github-pages
url: ${{ steps.deployment.outputs.page_url }}
steps:
- name: Deploy to GitHub Pages
id: deployment
uses: actions/deploy-pages@v4
Sources: .github/workflows/ci.yml(L48 - L61)
Relationship to Code Structure
The CI/CD pipeline interacts with different parts of the axprocess codebase:
flowchart TD subgraph subGraph1["axprocess Codebase"] src["src/ Directory"] tests_dir["tests/ Directory"] cargo["Cargo.toml"] end subgraph subGraph0["CI/CD Pipeline Components"] clippy["Clippy Static Analysis"] tests["Unit & Integration Tests"] docs["Documentation Generator"] end clippy --> cargo clippy --> src docs --> cargo docs --> src tests --> tests_dir
Sources: .github/workflows/ci.yml(L1 - L62) Cargo.toml(L1 - L16)
Environment Configuration
The CI/CD pipeline uses specific environment configurations:
- Uses Rust nightly toolchain for all steps
- Sets `RUST_BACKTRACE=1` for better error reporting
- Runs on Ubuntu Linux
flowchart TD subgraph subGraph0["Environment Setup"] env["Environment Variables:RUST_BACKTRACE=1"] rust["Rust Setup:- Channel: nightly- Components: clippy"] platform["Platform:Ubuntu Latest"] end CI["CI/CD Jobs"] env --> CI platform --> CI rust --> CI
Sources: .github/workflows/ci.yml(L15 - L16) .github/workflows/ci.yml(L23 - L26)
Documentation Deployment Flow
The documentation deployment process follows these steps:
sequenceDiagram participant GitHubActions as GitHub Actions participant CargoDoc as Cargo Doc participant GitHubPages as GitHub Pages GitHubActions ->> CargoDoc: Generate Documentation Note over CargoDoc: Processes all source files<br>with rustdoc CargoDoc ->> GitHubActions: Create doc artifacts Note over GitHubActions: Generates index.html redirect GitHubActions ->> GitHubPages: Upload documentation Note over GitHubPages: Documentation published to<br>Github Pages URL
Sources: .github/workflows/ci.yml(L32 - L61)
Best Practices for Developers
When working with the axprocess repository, developers should be aware of the CI/CD pipeline requirements:
- Clippy Compliance: All code must pass Clippy checks with no warnings (the `-Dwarnings` flag is enabled)
- Test Coverage: New features should include tests, which will be automatically run by the pipeline
- Documentation: Code should be properly documented as it will be automatically published
- Build Requirements: The pipeline uses the nightly Rust toolchain, so code should be compatible with it
Conclusion
The CI/CD pipeline for axprocess provides automated quality checks and documentation deployment, ensuring that:
- Code meets style and quality standards through static analysis
- All tests pass on each change
- Documentation is automatically built and deployed to GitHub Pages
- Developers receive quick feedback on their code changes
This automation helps maintain a high-quality codebase and up-to-date documentation with minimal manual intervention.
Sources: .github/workflows/ci.yml(L1 - L62)