Overview

Relevant source files

This document introduces the weak-map repository, a Rust library that implements a specialized B-Tree map for storing weak references to values. Entries are automatically removed when the referenced values are dropped, preventing memory leaks and dangling references.

For detailed information about specific components, see Core Components, WeakMap and StrongMap, and Reference Traits. For practical applications, refer to Usage Guide.

Sources: README.md(L1 - L7)  src/lib.rs(L1 - L3) 

What is weak-map?

The weak-map library offers a Rust implementation of WeakMap, which wraps the standard BTreeMap to store weak references to values rather than the values themselves. This approach enables memory-efficient collections where entries are automatically removed when the referenced values are dropped elsewhere in the program.

Key characteristics:

  • No standard library dependency (works in no_std environments)
  • Support for both single-threaded (Rc) and thread-safe (Arc) reference counting
  • Similar to the weak-table library, but uses BTreeMap as its underlying implementation
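The weak-reference semantics the library builds on can be demonstrated with the standard library alone. This sketch uses only `std::rc`, not weak-map itself:

```rust
use std::rc::Rc;

// The mechanism WeakMap builds on: a weak reference does not keep
// its target alive, and upgrading it fails once the target is dropped.
fn main() {
    let strong = Rc::new(String::from("value"));
    let weak = Rc::downgrade(&strong);

    // While a strong reference exists, upgrade succeeds.
    assert!(weak.upgrade().is_some());

    drop(strong);

    // After the last strong reference is dropped, upgrade returns None.
    assert!(weak.upgrade().is_none());
}
```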

Sources: README.md(L1 - L7)  src/lib.rs(L1 - L6)  Cargo.toml(L2 - L11) 

Core Components

The library consists of four main components defined in the src/map.rs and src/traits.rs files and exposed through src/lib.rs:

  1. WeakMap: A map that stores weak references to values, automatically cleaning up entries when values are dropped
  2. StrongMap: A simpler wrapper around BTreeMap for storing strong references
  3. WeakRef: Trait defining the interface for weak reference types
  4. StrongRef: Trait defining the interface for strong reference types

Component Architecture

flowchart TD
A["WeakMap"]
B["BTreeMap"]
C["WeakRef Trait"]
D["StrongMap"]
E["StrongRef Trait"]

A --> B
A --> C
D --> B
E --> C

Sources: src/lib.rs(L9 - L13) 

Working Mechanism

The WeakMap data structure operates through a reference conversion process:

  1. Insertion: When a value is inserted, it's first converted to a weak reference using the downgrade method from the StrongRef trait
  2. Storage: The weak reference is stored in the underlying BTreeMap
  3. Retrieval: When retrieving a value, the weak reference is obtained from the BTreeMap
  4. Upgrade Attempt: The system attempts to upgrade the weak reference to a strong reference using the upgrade method from the WeakRef trait
  5. Result: If the original value has been dropped, the upgrade fails and returns None
  6. Cleanup: Periodically, after a certain number of operations (defined by OPS_THRESHOLD), the WeakMap removes expired references
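The insert/retrieve/upgrade cycle above can be sketched by hand with a plain `BTreeMap` in place of the library's `WeakMap`:

```rust
use std::collections::BTreeMap;
use std::rc::{Rc, Weak};

// Manual sketch of the working mechanism: downgrade on insertion,
// upgrade on retrieval, None once the original value is dropped.
fn main() {
    let mut map: BTreeMap<u32, Weak<String>> = BTreeMap::new();

    let value = Rc::new(String::from("example"));

    // Insertion: convert the strong reference to a weak one before storing.
    map.insert(1, Rc::downgrade(&value));

    // Retrieval: upgrade the stored weak reference back to a strong one.
    let got = map.get(&1).and_then(|w| w.upgrade());
    assert_eq!(got.as_deref().map(String::as_str), Some("example"));

    // Once every strong reference is gone, the upgrade attempt fails.
    drop(value);
    drop(got);
    assert!(map.get(&1).and_then(|w| w.upgrade()).is_none());
}
```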

Operation Flow

flowchart TD
Client["Client"]
WeakMap["WeakMap"]
WeakRef["WeakRef"]
BTreeMap["BTreeMap"]
Cleanup["Cleanup Process"]

Cleanup --> BTreeMap
Client --> WeakMap
WeakMap --> BTreeMap
WeakMap --> Cleanup
WeakMap --> Client
WeakMap --> WeakRef

Sources: src/map.rs

Reference Management System

The library defines two core traits that abstract over reference-counted types:

  1. StrongRef: Implemented for reference-counted types like Rc and Arc
  • Provides downgrade() to convert a strong reference to a weak reference
  • Provides ptr_eq() to check if two references point to the same value
  2. WeakRef: Implemented for weak reference types like Weak<T> from both Rc and Arc
  • Provides upgrade() to attempt converting a weak reference to a strong reference
  • Provides is_expired() to check if the referenced value has been dropped

This trait-based design allows WeakMap to work with different reference-counted types flexibly.

Reference Traits Implementation

classDiagram
class StrongRef {
    <<trait>>
    type Weak
    downgrade() Self::Weak
    ptr_eq(other: &Self) bool
}

class WeakRef {
    <<trait>>
    type Strong
    upgrade() Option~Self::Strong~
    is_expired() bool
}

class Rc {
    downgrade() Weak
}

class Arc {
    downgrade() Weak
}

class RcWeak {
    upgrade() Option~Rc~
}

class ArcWeak {
    upgrade() Option~Arc~
}

Rc  ..|>  StrongRef : implements
Arc  ..|>  StrongRef : implements
RcWeak  ..|>  WeakRef : implements
ArcWeak  ..|>  WeakRef : implements

Sources: src/traits.rs

Common Use Cases

The WeakMap is particularly useful in scenarios where:

  • Caching: Storing objects that may be dropped elsewhere without creating memory leaks
  • Observer Pattern: Implementing observers without creating reference cycles
  • Object Registry: Maintaining a registry of objects without preventing them from being dropped
  • Graph Data Structures: Working with graphs while avoiding circular reference memory leaks
  • Resource Management: Tracking resources without extending their lifetime

Sources: README.md src/lib.rs(L1 - L3) 

Project Structure

The weak-map library is organized into the following key files:

| File | Purpose |
| --- | --- |
| `src/lib.rs` | Entry point of the library, re-exporting the main components |
| `src/map.rs` | Contains the implementations of `WeakMap` and `StrongMap` |
| `src/traits.rs` | Defines the `StrongRef` and `WeakRef` traits |

Project Structure Diagram

flowchart TD
lib["src/lib.rs"]
map["src/map.rs"]
traits["src/traits.rs"]
WeakMap["WeakMap implementation"]
StrongMap["StrongMap implementation"]
StrongRef["StrongRef trait"]
WeakRef["WeakRef trait"]

lib --> map
lib --> traits
map --> StrongMap
map --> WeakMap
traits --> StrongRef
traits --> WeakRef

Sources: src/lib.rs(L7 - L13) 


Sources: README.md

Core Components


This document provides an overview of the main components that make up the weak-map library and their relationships. It explains the architectural structure and key mechanisms that enable the library's functionality of maintaining maps with weak references.

For detailed implementation details of each component, see WeakMap and StrongMap and Reference Traits. For usage examples, refer to the Usage Guide.

System Architecture

The weak-map library is built around several core components that work together to provide a map data structure that automatically removes entries when referenced values are dropped.

flowchart TD
subgraph subGraph2["Internal Management"]
    OC["OpsCounter"]
    CU["Cleanup mechanism"]
end
subgraph subGraph1["Reference Abstraction Layer"]
    WR["WeakRef trait"]
    SR["StrongRef trait"]
end
subgraph subGraph0["Core Data Structures"]
    WM["WeakMap<K,V>"]
    SM["StrongMap<K,V>(alias for BTreeMap)"]
end
BT["BTreeMap<K,V>"]

CU --> BT
OC --> CU
SM --> BT
WM --> BT
WM --> OC
WM --> WR
WR --> SR

Sources: src/map.rs(L57 - L65)  src/traits.rs(L3 - L40) 

Key Components

1. WeakMap

WeakMap<K, V> is the primary data structure provided by this library. It wraps a BTreeMap and stores weak references to values, automatically cleaning up entries when the referenced values are dropped.

Key characteristics:

  • Generic over key type K and weak reference type V
  • V must implement the WeakRef trait
  • Contains an operations counter to trigger periodic cleanup
  • Provides methods to insert, retrieve, and remove entries that handle weak reference conversion
classDiagram
class WeakMap~K,V~ {
    -BTreeMap~K,V~ inner
    -OpsCounter ops
    +new() WeakMap
    +get(key) Option~V::Strong~
    +insert(key, value) Option~V::Strong~
    +remove(key) Option~V::Strong~
    +cleanup()
    +len() usize
    +is_empty() bool
}

class OpsCounter {
    -AtomicUsize counter
    +bump()
    +reset()
    +reach_threshold() bool
}

WeakMap  o--  OpsCounter

Sources: src/map.rs(L62 - L307)  src/map.rs(L13 - L55) 

2. StrongMap

StrongMap<K, V> is a simple type alias for the standard BTreeMap<K, V>. It serves as a counterpart to WeakMap for situations where strong references are needed.

pub type StrongMap<K, V> = btree_map::BTreeMap<K, V>;

Sources: src/map.rs(L57 - L58) 

3. Reference Traits

The library defines two key traits that abstract over reference types:

WeakRef Trait

Defines the interface for weak references:

classDiagram
class WeakRef {
    <<trait>>
    type Strong
    upgrade() Option~Self::Strong~
    is_expired() bool
}

class RcWeak~T~ {
    
    upgrade() Option~Rc~T~~
    is_expired() bool
}

class ArcWeak~T~ {
    
    upgrade() Option~Arc~T~~
    is_expired() bool
}

RcWeak  ..|>  WeakRef : implements
ArcWeak  ..|>  WeakRef : implements

StrongRef Trait

Defines the interface for strong references:

classDiagram
class StrongRef {
    <<trait>>
    type Weak
    downgrade() Self::Weak
    ptr_eq(other) bool
}

class Rc~T~ {
    
    downgrade() Weak~T~
    ptr_eq(other) bool
}

class Arc~T~ {
    
    downgrade() Weak~T~
    ptr_eq(other) bool
}

Rc  ..|>  StrongRef : implements
Arc  ..|>  StrongRef : implements

Sources: src/traits.rs(L3 - L19)  src/traits.rs(L21 - L40)  src/traits.rs(L42 - L88) 

4. Operations Counter and Cleanup Mechanism

The OpsCounter is an internal component that:

  • Tracks the number of operations performed on a WeakMap
  • Triggers cleanup after a threshold (1000 operations)
  • Uses atomic operations for thread safety

The cleanup mechanism removes expired weak references from the map, preventing memory leaks and maintaining map efficiency.
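A minimal sketch of such a counter, assuming the threshold and method names described in the text (the actual implementation in src/map.rs may differ in detail):

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

// Hypothetical sketch of an operations counter: an atomic count that is
// bumped on each map operation and reset after a cleanup pass.
const OPS_THRESHOLD: usize = 1000;

struct OpsCounter(AtomicUsize);

impl OpsCounter {
    fn new() -> Self {
        OpsCounter(AtomicUsize::new(0))
    }

    // Record one operation.
    fn bump(&self) {
        self.0.fetch_add(1, Ordering::Relaxed);
    }

    // Has enough happened to justify a cleanup pass?
    fn reach_threshold(&self) -> bool {
        self.0.load(Ordering::Relaxed) >= OPS_THRESHOLD
    }

    // Start counting again after a cleanup.
    fn reset(&self) {
        self.0.store(0, Ordering::Relaxed);
    }
}

fn main() {
    let ops = OpsCounter::new();
    for _ in 0..OPS_THRESHOLD {
        ops.bump();
    }
    assert!(ops.reach_threshold());
    ops.reset();
    assert!(!ops.reach_threshold());
}
```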

Sources: src/map.rs(L13 - L48)  src/map.rs(L158 - L169) 

Component Interactions

The core components interact to provide the WeakMap functionality:

sequenceDiagram
    participant Client as Client
    participant WeakMap as WeakMap
    participant BTreeMap as BTreeMap
    participant WeakRefinstance as "WeakRef instance"
    participant OpsCounter as OpsCounter

    Client ->> WeakMap: insert(key, value)
    WeakMap ->> OpsCounter: bump()
    OpsCounter -->> WeakMap: reach_threshold()?
    alt threshold reached
        WeakMap ->> WeakMap: cleanup()
    loop for each entry
        WeakMap ->> WeakRefinstance: is_expired()?
    alt expired
        WeakMap ->> BTreeMap: remove(key)
    end
    end
    WeakMap ->> OpsCounter: reset()
    end
    WeakMap ->> WeakRefinstance: downgrade(value)
    WeakMap ->> BTreeMap: insert(key, weak_ref)
    BTreeMap -->> WeakMap: old_weak_ref?
    alt had old value
        WeakMap ->> WeakRefinstance: upgrade(old_weak_ref)
        WeakRefinstance -->> WeakMap: upgraded_value?
    end
    WeakMap -->> Client: result
    Note over WeakMap,WeakRefinstance: Later when retrieving...
    Client ->> WeakMap: get(key)
    WeakMap ->> OpsCounter: bump()
    WeakMap ->> BTreeMap: get(key)
    BTreeMap -->> WeakMap: weak_ref?
    alt has weak_ref
        WeakMap ->> WeakRefinstance: upgrade(weak_ref)
        WeakRefinstance -->> WeakMap: strong_ref?
        WeakMap -->> Client: strong_ref?
    else no entry
        WeakMap -->> Client: None
    end

Sources: src/map.rs(L152 - L277)  src/traits.rs(L3 - L40) 

Implementation Details

WeakMap Implementation

The WeakMap is implemented using a BTreeMap with the following key mechanisms:

| Component | Purpose | Implementation |
| --- | --- | --- |
| `inner` field | Stores the actual map data | `BTreeMap<K, V>` |
| `ops` field | Tracks operations for cleanup | `OpsCounter` |
| `cleanup` method | Removes expired references | Calls `is_expired()` on each value |
| `get` method | Retrieves and upgrades references | Uses `upgrade()` from `WeakRef` |
| `insert` method | Stores new weak references | Uses `downgrade()` from `StrongRef` |

Sources: src/map.rs(L62 - L65)  src/map.rs(L158 - L169)  src/map.rs(L207 - L214)  src/map.rs(L258 - L263) 

Reference Trait Implementations

The library implements the reference traits for both Rc/Weak (for single-threaded use) and Arc/Weak (for multi-threaded use):

flowchart TD
subgraph Multi-threaded["Multi-threaded"]
    Arc["Arc<T>"]
    ArcWeak["Weak<T>"]
end
subgraph Single-threaded["Single-threaded"]
    Rc["Rc<T>"]
    SR["StrongRef"]
    RcWeak["Weak<T>"]
    WR["WeakRef"]
end

Arc --> ArcWeak
Arc --> SR
ArcWeak --> Arc
ArcWeak --> WR
Rc --> RcWeak
Rc --> SR
RcWeak --> Rc
RcWeak --> WR

Sources: src/traits.rs(L42 - L88) 

Operations Counter

The operations counter uses atomic operations to ensure thread safety when tracking operations:

  • Increments a counter with each operation
  • Triggers cleanup when the threshold of 1000 operations is reached
  • Resets after cleanup

Sources: src/map.rs(L13 - L48)  src/map.rs(L16) 

Iterator Support

WeakMap provides several iterator types to access its contents:

| Iterator | Description | Returned By |
| --- | --- | --- |
| `Iter` | References to keys and upgraded values | `iter()` |
| `Keys` | References to just the keys | `keys()` |
| `Values` | Just the upgraded values | `values()` |
| `IntoIter` | Owned keys and upgraded values | `into_iter()` |
| `IntoKeys` | Owned keys | `into_keys()` |
| `IntoValues` | Just the upgraded values | `into_values()` |

Each iterator handles weak reference upgrading automatically, skipping expired values.
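The skipping behavior can be sketched with a plain `BTreeMap` and `filter_map`, which drops every weak reference whose upgrade fails; WeakMap's iterators behave analogously:

```rust
use std::collections::BTreeMap;
use std::rc::{Rc, Weak};

// Collect only the values that are still alive, silently skipping
// expired weak references.
fn live_values(map: &BTreeMap<u32, Weak<String>>) -> Vec<Rc<String>> {
    map.values().filter_map(Weak::upgrade).collect()
}

fn main() {
    let a = Rc::new(String::from("a"));
    let b = Rc::new(String::from("b"));

    let mut map = BTreeMap::new();
    map.insert(1, Rc::downgrade(&a));
    map.insert(2, Rc::downgrade(&b));

    drop(b); // entry 2 is now expired

    let live = live_values(&map);
    assert_eq!(live.len(), 1);
    assert_eq!(*live[0], "a");
}
```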

Sources: src/map.rs(L382 - L623) 

WeakMap and StrongMap


This document provides a detailed explanation of the WeakMap and StrongMap data structures, their implementation, and usage within the weak-map library. These structures are core components that enable efficient memory management through the use of weak references. For information about the reference traits that power these structures, see Reference Traits.

Overview

WeakMap is a specialized B-Tree map that stores weak references to values, automatically removing entries when the referenced values are dropped. StrongMap is a simple alias for the standard BTreeMap. Together they cover mappings with both weak and strong reference semantics.

classDiagram
class WeakMap {
    inner: BTreeMap
    ops: OpsCounter
    +new()
    +get(key)
    +insert(key, value)
    +remove(key)
    +cleanup()
}

class StrongMap {
    "Alias for BTreeMap"
    
}

class OpsCounter {
    count: AtomicUsize
    +bump()
    +reset()
    +reach_threshold()
}

class BTreeMap {
    "Standard Rust BTreeMap"
    
}

WeakMap  *--  OpsCounter : contains
WeakMap  *--  BTreeMap : stores data in
StrongMap  -->  BTreeMap : type alias for

Sources: src/map.rs(L57 - L65)  src/lib.rs(L9 - L10) 

Internal Structure

WeakMap

WeakMap<K, V> is implemented as a wrapper around a BTreeMap<K, V> with an additional OpsCounter to track operations for cleanup purposes.

flowchart TD
subgraph subGraph0["WeakMap"]
    A["inner: BTreeMap"]
    B["ops: OpsCounter"]
end
C["BTreeMap"]
D["AtomicUsize"]
E["Weak Reference"]
F["Strong Reference"]
G["OPS_THRESHOLD (1000)"]

C --> A
D --> B
E --> A
E --> F
F --> E
G --> B

Sources: src/map.rs(L62 - L65)  src/map.rs(L13 - L16) 

The key features of WeakMap:

  1. Inner Storage: Uses a standard BTreeMap to store key-value pairs
  2. Operations Counter: Tracks the number of operations to trigger periodic cleanup
  3. Weak References: Values are stored as weak references, allowing them to be automatically collected when no strong references remain
  4. Cleanup Mechanism: Periodically removes expired weak references after OPS_THRESHOLD operations

StrongMap

StrongMap is a simple type alias for the standard Rust BTreeMap:

pub type StrongMap<K, V> = btree_map::BTreeMap<K, V>;

It provides a counterpart to WeakMap for cases where strong references are needed.

Sources: src/map.rs(L57 - L58) 

Operations Counter

The OpsCounter structure manages automatic cleanup of expired references:

sequenceDiagram
    participant Client as Client
    participant WeakMap as WeakMap
    participant OpsCounter as OpsCounter
    participant BTreeMap as BTreeMap

    Client ->> WeakMap: insert/get/remove operation
    WeakMap ->> OpsCounter: bump()
    OpsCounter ->> OpsCounter: increment counter
    OpsCounter ->> WeakMap: check if reach_threshold()
    alt Threshold reached (1000 operations)
        WeakMap ->> WeakMap: cleanup()
    loop for all entries
        WeakMap ->> BTreeMap: get entry
        WeakMap ->> WeakMap: check if is_expired()
    alt is expired
        WeakMap ->> BTreeMap: remove entry
    end
    end
    WeakMap ->> OpsCounter: reset()
    end

Sources: src/map.rs(L13 - L48)  src/map.rs(L152 - L169) 

The cleanup mechanism has these characteristics:

  • Operations are counted using an atomic counter
  • After OPS_THRESHOLD (1000) operations, cleanup is triggered
  • During cleanup, all expired weak references are removed
  • The operations counter is reset after cleanup
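Stripped of the counter, the cleanup pass itself amounts to a `retain` over the entries, using a plain `BTreeMap` here as a stand-in for the library's inner map (`Weak::strong_count() == 0` is the expiry test described in the Reference Traits section):

```rust
use std::collections::BTreeMap;
use std::rc::{Rc, Weak};

// Sketch of a cleanup pass: keep only entries whose target is still alive.
fn cleanup(map: &mut BTreeMap<u32, Weak<String>>) {
    map.retain(|_, weak| weak.strong_count() > 0);
}

fn main() {
    let keep = Rc::new(String::from("keep"));
    let gone = Rc::new(String::from("gone"));

    let mut map = BTreeMap::new();
    map.insert(1, Rc::downgrade(&keep));
    map.insert(2, Rc::downgrade(&gone));

    drop(gone);
    assert_eq!(map.len(), 2); // expired entry is still physically present

    cleanup(&mut map);
    assert_eq!(map.len(), 1); // expired entry removed
    assert!(map.contains_key(&1));
}
```

This also illustrates why `raw_len()` (total entries) and `len()` (valid entries) can disagree between cleanup passes.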

Core API

Creation and Basic Operations

WeakMap provides the following core methods:

| Method | Description | Source |
| --- | --- | --- |
| `new()` | Creates a new, empty `WeakMap` | src/map.rs 72-77 |
| `insert(key, value)` | Inserts a key-value pair, storing a weak reference to the value | src/map.rs 258-263 |
| `get(key)` | Returns the value corresponding to the key, if it exists and hasn't been dropped | src/map.rs 207-214 |
| `remove(key)` | Removes a key from the map, returning the value if present | src/map.rs 270-277 |
| `clear()` | Removes all entries from the map | src/map.rs 106-109 |
| `len()` | Returns the number of valid (non-expired) elements | src/map.rs 177-179 |
| `raw_len()` | Returns the total number of elements, including expired references | src/map.rs 113-115 |

Sources: src/map.rs(L103 - L307) 

Iteration

WeakMap provides various iterators, all of which automatically filter out expired references:

flowchart TD
A["WeakMap"]
B["iter()"]
C["Iter"]
D["keys()"]
E["Keys"]
F["values()"]
G["Values"]
H["into_iter()"]
I["IntoIter"]
J["into_keys()"]
K["IntoKeys"]
L["into_values()"]
M["IntoValues"]

A --> B
A --> D
A --> F
A --> H
A --> J
A --> L
B --> C
D --> E
F --> G
H --> I
J --> K
L --> M

Sources: src/map.rs(L119 - L149)  src/map.rs(L382 - L623) 

Key aspects of iterators:

  • All iterators automatically filter out expired weak references
  • Both borrowing and consuming iterators are provided
  • Iterators for keys, values, and key-value pairs

Weak Reference Management

The core functionality of WeakMap revolves around its handling of weak references:

sequenceDiagram
    participant Client as Client
    participant WeakMap as WeakMap
    participant BTreeMap as BTreeMap
    participant WeakRef as WeakRef
    participant StrongRef as StrongRef

    Client ->> WeakMap: insert(key, &strong_ref)
    WeakMap ->> StrongRef: downgrade()
    StrongRef -->> WeakMap: weak_ref
    WeakMap ->> BTreeMap: store(key, weak_ref)
    Note over Client,BTreeMap: Later...
    Client ->> WeakMap: get(key)
    WeakMap ->> BTreeMap: retrieve weak_ref
    BTreeMap -->> WeakMap: weak_ref
    WeakMap ->> WeakRef: upgrade()
    alt Reference still valid
        WeakRef -->> WeakMap: Some(strong_ref)
        WeakMap -->> Client: strong_ref
    else Reference expired
        WeakRef -->> WeakMap: None
        WeakMap -->> Client: None
    end

Sources: src/map.rs(L207 - L214)  src/map.rs(L258 - L263) 

When the original strong reference is dropped (when no more strong references exist):

  1. The weak reference in the map becomes expired
  2. Future calls to get() will return None
  3. The entry will be removed during the next cleanup cycle

Conversion Operations

WeakMap provides several conversion methods and implementations:

| From | To | Method/Trait |
| --- | --- | --- |
| `BTreeMap<K, V>` | `WeakMap<K, V>` | `From` trait |
| `WeakMap<K, V>` | `BTreeMap<K, V>` | `From` trait |
| `WeakMap<K, V>` | `StrongMap<K, V::Strong>` | `upgrade()` |
| `&StrongMap<K, V::Strong>` | `WeakMap<K, V>` | `From` trait |
| Iterator of `(K, &V::Strong)` | `WeakMap<K, V>` | `FromIterator` trait |
| Array `[(K, &V::Strong); N]` | `WeakMap<K, V>` | `From` trait |
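The weak-to-strong direction of these conversions can be sketched with std types alone: upgrade every weak reference and keep only the entries that are still alive (a plain `BTreeMap` stands in for both map types here):

```rust
use std::collections::BTreeMap;
use std::rc::{Rc, Weak};

// Sketch of an upgrade() conversion: expired entries are dropped,
// live entries come back as strong references.
fn upgrade_all(map: &BTreeMap<u32, Weak<String>>) -> BTreeMap<u32, Rc<String>> {
    map.iter()
        .filter_map(|(k, w)| w.upgrade().map(|s| (*k, s)))
        .collect()
}

fn main() {
    let alive = Rc::new(String::from("alive"));
    let dead = Rc::new(String::from("dead"));

    let mut weak_side = BTreeMap::new();
    weak_side.insert(1, Rc::downgrade(&alive));
    weak_side.insert(2, Rc::downgrade(&dead));
    drop(dead);

    let strong_side = upgrade_all(&weak_side);
    assert_eq!(strong_side.len(), 1);
    assert_eq!(*strong_side[&1], "alive");
}
```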

Sources: src/map.rs(L86 - L101)  src/map.rs(L296 - L307)  src/map.rs(L341 - L380) 

Example Use Cases

  1. Cache with Automatic Cleanup:
  • Store computation results keyed by input parameters
  • Results are automatically removed when no longer referenced elsewhere
  2. Observer Pattern:
  • Track observers without preventing them from being dropped
  • Automatically clean up references to observers that have been dropped
  3. Resource Pooling:
  • Maintain a pool of resources without keeping them alive indefinitely
  • Resources are automatically removed from the pool when no longer in use

Memory Management Considerations

  1. Memory Leaks: WeakMap helps prevent memory leaks by not keeping values alive when they're no longer needed elsewhere
  2. Cleanup Overhead: The periodic cleanup process introduces some overhead, but it's amortized over many operations
  3. Reference Counting Overhead: Using weak references incurs the overhead of reference counting, which is generally acceptable for most applications

Sources: src/map.rs(L152 - L169)  src/map.rs(L625 - L660) 

Performance Characteristics

For more detailed information about performance considerations, see Performance Considerations.

| Operation | Time Complexity | Notes |
| --- | --- | --- |
| `insert()` | O(log n) | Plus potential O(n) cleanup once per `OPS_THRESHOLD` operations |
| `get()` | O(log n) | |
| `remove()` | O(log n) | |
| `len()` | O(n) | Linear, as it must check each entry for expiration |
| `raw_len()` | O(1) | |
| `cleanup()` | O(n) | |
Sources: src/map.rs(L176 - L179)  src/map.rs(L158 - L161) 

Reference Traits


This document explains the reference trait system that forms the foundation of the weak-map library. The reference traits provide a flexible abstraction layer for working with different types of references (both strong and weak) in a generic way, enabling the core functionality of WeakMap. For details about the map implementations themselves, see WeakMap and StrongMap.

Overview of Reference Traits

The weak-map library defines two fundamental traits that abstract over reference types:

  1. StrongRef - Represents a strong reference that keeps a value alive
  2. WeakRef - Represents a weak reference that doesn't prevent a value from being dropped

These traits allow the WeakMap to work with different types of reference-counted values without being tied to specific implementations.


Sources: src/traits.rs(L3 - L40) 

The StrongRef Trait

The StrongRef trait defines an interface for types that represent strong references to heap-allocated values. A strong reference keeps the referenced value alive for as long as the reference exists.

pub trait StrongRef {
    type Weak: WeakRef<Strong = Self>;

    fn downgrade(&self) -> Self::Weak;

    fn ptr_eq(&self, other: &Self) -> bool;
}

The trait requires:

| Member | Type | Purpose |
| --- | --- | --- |
| `Weak` | Associated type | Specifies the corresponding weak reference type |
| `downgrade()` | Method | Converts a strong reference to a weak reference |
| `ptr_eq()` | Method | Compares two strong references for pointer equality |

The Weak associated type establishes a relationship with the WeakRef trait, ensuring that both traits are implemented in a compatible way.
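This bidirectional link can be seen in a self-contained sketch of the two traits and an `Rc` implementation, following the signatures quoted in this section (the library's real definitions live in src/traits.rs and may differ in detail):

```rust
use std::rc;
use std::rc::Rc;

// Sketch of the trait pair: the associated types point at each other,
// so a strong type and its weak counterpart are bound together.
pub trait StrongRef {
    type Weak: WeakRef<Strong = Self>;
    fn downgrade(&self) -> Self::Weak;
    fn ptr_eq(&self, other: &Self) -> bool;
}

pub trait WeakRef {
    type Strong: StrongRef<Weak = Self>;
    fn upgrade(&self) -> Option<Self::Strong>;
    // Default implementation: expired means the upgrade fails.
    fn is_expired(&self) -> bool {
        self.upgrade().is_none()
    }
}

impl<T> StrongRef for Rc<T> {
    type Weak = rc::Weak<T>;
    fn downgrade(&self) -> rc::Weak<T> {
        Rc::downgrade(self)
    }
    fn ptr_eq(&self, other: &Self) -> bool {
        Rc::ptr_eq(self, other)
    }
}

impl<T> WeakRef for rc::Weak<T> {
    type Strong = Rc<T>;
    fn upgrade(&self) -> Option<Rc<T>> {
        rc::Weak::upgrade(self)
    }
}

fn main() {
    let strong = Rc::new(5u32);
    let weak = StrongRef::downgrade(&strong);
    assert!(!weak.is_expired());
    drop(strong);
    assert!(weak.is_expired());
}
```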

Sources: src/traits.rs(L3 - L19) 

The WeakRef Trait

The WeakRef trait defines an interface for types that represent weak references to heap-allocated values. A weak reference does not keep the referenced value alive; it can be used to access the value only if it's still alive due to strong references elsewhere.

pub trait WeakRef {
    type Strong: StrongRef<Weak = Self>;

    fn upgrade(&self) -> Option<Self::Strong>;

    fn is_expired(&self) -> bool {
        self.upgrade().is_none()
    }
}

The trait requires:

| Member | Type | Purpose |
| --- | --- | --- |
| `Strong` | Associated type | Specifies the corresponding strong reference type |
| `upgrade()` | Method | Attempts to convert a weak reference to a strong reference |
| `is_expired()` | Method | Checks if the weak reference is expired (has a default implementation) |

The Strong associated type complements the Weak type in StrongRef, creating a bidirectional relationship between the two traits.

Sources: src/traits.rs(L21 - L40) 

Trait Implementations

The library provides implementations of these traits for standard Rust reference-counting types. This allows WeakMap to work with both single-threaded and thread-safe reference types.

flowchart TD
subgraph subGraph2["Thread-safe RC"]
    Arc["std::sync::Arc<T>"]
    ArcWeak["std::sync::Weak<T>"]
end
subgraph subGraph1["Single-threaded RC"]
    Rc["std::rc::Rc<T>"]
    RcWeak["std::rc::Weak<T>"]
end
subgraph subGraph0["Reference Traits"]
    S["StrongRef Trait"]
    W["WeakRef Trait"]
end

Arc --> ArcWeak
ArcWeak --> Arc
Rc --> RcWeak
RcWeak --> Rc
S --> Arc
S --> Rc
W --> ArcWeak
W --> RcWeak

Sources: src/traits.rs(L42 - L88) 

Implementation for std::rc

For single-threaded reference counting, the traits are implemented for std::rc::Rc<T> and std::rc::Weak<T>:

| Type | Trait | Implementation Details |
| --- | --- | --- |
| `Rc<T>` | `StrongRef` | Uses `Rc::downgrade` and `Rc::ptr_eq` |
| `Weak<T>` | `WeakRef` | Uses `Weak::upgrade` and a custom `is_expired` that checks `strong_count` |

The is_expired implementation for Weak<T> optimizes the check by directly using strong_count() == 0 instead of trying to upgrade the reference.
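The optimization amounts to the following, shown here as a free function over `std::rc::Weak` rather than the library's trait method:

```rust
use std::rc::Rc;

// Checking strong_count() avoids constructing (and immediately dropping)
// a temporary Rc the way upgrade().is_none() would.
fn is_expired<T>(weak: &std::rc::Weak<T>) -> bool {
    weak.strong_count() == 0
}

fn main() {
    let value = Rc::new(1u8);
    let weak = Rc::downgrade(&value);
    assert!(!is_expired(&weak));
    drop(value);
    assert!(is_expired(&weak));
}
```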

Sources: src/traits.rs(L42 - L64) 

Implementation for std::sync

For thread-safe reference counting, the traits are implemented for std::sync::Arc<T> and std::sync::Weak<T>:

| Type | Trait | Implementation Details |
| --- | --- | --- |
| `Arc<T>` | `StrongRef` | Uses `Arc::downgrade` and `Arc::ptr_eq` |
| `Weak<T>` | `WeakRef` | Uses `Weak::upgrade` and a custom `is_expired` that checks `strong_count` |

Similar to the std::rc implementation, the is_expired implementation for std::sync::Weak<T> directly checks the strong count for efficiency.

Sources: src/traits.rs(L66 - L88) 

Reference Traits in Action

The reference traits enable WeakMap to work with different reference types in a generic way. Here's how these traits are used in the workflow of WeakMap:

sequenceDiagram
    participant Client as Client
    participant WeakMap as WeakMap
    participant StrongRef as StrongRef
    participant WeakRef as WeakRef

    Client ->> WeakMap: insert(key, strong_ref)
    WeakMap ->> StrongRef: downgrade()
    StrongRef -->> WeakMap: weak_ref
    WeakMap ->> WeakMap: store(key, weak_ref)
    Client ->> WeakMap: get(key)
    WeakMap ->> WeakMap: retrieve weak_ref
    WeakMap ->> WeakRef: upgrade()
    alt Value still alive
        WeakRef -->> WeakMap: Some(strong_ref)
        WeakMap -->> Client: Some(strong_ref)
    else Value dropped
        WeakRef -->> WeakMap: None
        WeakMap -->> Client: None
    end
    Client ->> WeakMap: cleanup()
    loop for each entry
        WeakMap ->> WeakRef: is_expired()
    alt Expired
        WeakRef -->> WeakMap: true
        WeakMap ->> WeakMap: remove entry
    else Not expired
        WeakRef -->> WeakMap: false
    end
    end

The use of traits allows WeakMap to be generic over the specific reference type, supporting both Rc and Arc with the same implementation.

Sources: src/traits.rs(L3 - L40) 

Type Relationships

The following diagram illustrates the relationships between the trait types and concrete implementations:

classDiagram
class StrongRef {
    <<trait>>
    type Weak
    downgrade() Self::Weak
    ptr_eq(other: &Self) bool
}

class WeakRef {
    <<trait>>
    type Strong
    upgrade() Option~Self::Strong~
    is_expired() bool
}

class Rc~T~ {
    downgrade() Weak~T~
    ptr_eq(other: &Rc) bool
}

class RcWeak~T~ {
    upgrade() Option~Rc~T~~
    strong_count() usize
}

class Arc~T~ {
    downgrade() Weak~T~
    ptr_eq(other: &Arc) bool
}

class ArcWeak~T~ {
    upgrade() Option~Arc~T~~
    strong_count() usize
}

Rc  ..|>  StrongRef : implements
Arc  ..|>  StrongRef : implements
RcWeak  ..|>  WeakRef : implements
ArcWeak  ..|>  WeakRef : implements

This abstraction allows the WeakMap implementation to remain independent of the specific reference type, supporting both Rc for single-threaded use cases and Arc for multi-threaded scenarios.

Sources: src/traits.rs(L42 - L88) 

Summary

The reference traits (StrongRef and WeakRef) provide a flexible abstraction for working with different types of references in the weak-map library. By implementing these traits for the standard Rust reference types (Rc/Weak and Arc/Weak), the library allows references to be handled generically while still using weak references that don't keep values alive.

These traits are fundamental to the operation of WeakMap, enabling it to store weak references to values and automatically clean up entries when the referenced values are dropped.

Usage Guide


This guide provides comprehensive instructions on how to effectively use the weak-map library. The library offers a specialized WeakMap implementation - a B-Tree map that stores weak references to values, automatically removing entries when referenced values are dropped.

For detailed information about the core components, see Core Components and for implementation details, see Implementation Details.

Basic Concepts

The weak-map library centers around the WeakMap data structure, which combines the ordered key-value storage of a B-Tree map with automatic memory management through weak references.

flowchart TD
subgraph subGraph1["Reference Types"]
    E["StrongRef Trait"]
    F["Upgrade to Strong"]
    G["WeakRef Trait"]
    H["Downgrade to Weak"]
end
subgraph subGraph0["Key Concepts"]
    A["WeakMap<K, V>"]
    B["Weak References"]
    C["Automatic Cleanup"]
    D["B-Tree Structure"]
end

A --> B
A --> C
A --> D
B --> G
E --> F
F --> H
G --> H
H --> F

Key Benefits:

  • Prevents memory leaks in cyclic reference scenarios
  • Automatically cleans up entries when referenced values are dropped
  • Provides a familiar map interface with weak reference handling

Sources: src/map.rs(L60 - L65)  README.md(L1 - L6) 

Creating a WeakMap

There are several ways to create a WeakMap instance:


Basic Creation

The simplest way to create a WeakMap is using the new() method:

use std::sync::Weak;
use weak_map::WeakMap;

let map = WeakMap::<u32, Weak<String>>::new();

From Existing Collections

You can create a WeakMap from various sources:

  1. From a BTreeMap:

  let btree_map = BTreeMap::<u32, Weak<String>>::new();
  let weak_map = WeakMap::from(btree_map);

  2. From an iterator:

  let items = [(1, &arc_value1), (2, &arc_value2)];
  let weak_map = WeakMap::from_iter(items);

  3. From an array:

  let weak_map = WeakMap::from([(1, &arc_value1), (2, &arc_value2)]);

  4. From a StrongMap:

  let strong_map = StrongMap::<u32, Arc<String>>::new();
  let weak_map = WeakMap::from(&strong_map);

Sources: src/map.rs(L68 - L77)  src/map.rs(L86 - L101)  src/map.rs(L341 - L380) 

Basic Operations

WeakMap provides standard map operations with weak reference handling:

Inserting Elements

To insert elements into a WeakMap, use the insert method:

use std::sync::{Arc, Weak};
let mut map = WeakMap::<u32, Weak<String>>::new();

// Create a strong reference
let value = Arc::new(String::from("example"));

// Insert into map (automatically creates weak reference)
map.insert(1, &value);

Note that insert takes a strong reference (&V::Strong) but stores it as a weak reference internally.

Retrieving Elements

To get an element from the map:

// Returns Option<Arc<String>> (or None if expired or not found)
if let Some(strong_ref) = map.get(&1) {
    println!("Value: {}", strong_ref);
}

The get method returns:

  • Some(value) if the key exists and the weak reference can be upgraded
  • None if the key doesn't exist or the reference has expired

Removing Elements

To remove elements:

// Remove and return the value if it exists and hasn't expired
let removed_value = map.remove(&1);

// Remove and return both key and value
let removed_entry = map.remove_entry(&1);

Sources: src/map.rs(L203 - L293) 

Working with WeakMap Iterators

WeakMap provides various iterators that automatically filter out expired references:

flowchart TD
subgraph subGraph0["WeakMap Iterator Methods"]
    A["iter()"]
    D["Iter<K, V>"]
    B["keys()"]
    E["Keys<K, V>"]
    C["values()"]
    F["Values<K, V>"]
    G["into_iter()"]
    H["IntoIter<K, V>"]
    I["into_keys()"]
    J["IntoKeys<K, V>"]
    K["into_values()"]
    L["IntoValues<K, V>"]
end
M["(&K, V::Strong)"]
N["&K"]
O["V::Strong"]
P["(K, V::Strong)"]
Q["K"]
R["V::Strong"]

A --> D
B --> E
C --> F
D --> M
E --> N
F --> O
G --> H
H --> P
I --> J
J --> Q
K --> L
L --> R

Non-consuming Iterators

// Iterate over key-value pairs
for (key, value) in map.iter() {
    // value is a strong reference (expired references are skipped)
}

// Iterate over keys only
for key in map.keys() {
    // ...
}

// Iterate over values only
for value in map.values() {
    // value is a strong reference
}

Consuming Iterators

// Convert map into iterator and consume it
for (key, value) in map.into_iter() {
    // value is a strong reference
}

// Or just get keys
for key in map.into_keys() {
    // ...
}

// Or just get values
for value in map.into_values() {
    // value is a strong reference
}

Sources: src/map.rs(L118 - L149)  src/map.rs(L382 - L622) 

Checking Map State

WeakMap provides methods to check its state:

// Number of valid entries (excludes expired references)
let valid_count = map.len();

// Total number of entries (including expired references)
let total_count = map.raw_len();

// Check if map is empty (contains no valid entries)
let is_empty = map.is_empty();

// Check if map contains a specific key
let has_key = map.contains_key(&1);

Note that len() is an O(n) operation as it needs to check if each reference is valid.

Sources: src/map.rs(L112 - L185)  src/map.rs(L235 - L246) 

Converting Between Map Types

You can convert between WeakMap and StrongMap:

// Convert WeakMap to StrongMap (includes only valid references)
let strong_map: StrongMap<K, V::Strong> = weak_map.upgrade();

// Convert StrongMap to WeakMap
let weak_map = WeakMap::from(&strong_map);

// Convert WeakMap to standard BTreeMap
let btree_map: BTreeMap<K, V> = weak_map.into();

// Convert BTreeMap to WeakMap
let weak_map = WeakMap::from(btree_map);

Sources: src/map.rs(L86 - L101)  src/map.rs(L296 - L306)  src/map.rs(L368 - L380) 

Understanding Automatic Cleanup

The WeakMap implements an automatic cleanup mechanism to remove expired weak references:

sequenceDiagram
    participant Client as Client
    participant WeakMap as WeakMap
    participant OpsCounter as OpsCounter
    participant BTreeMap as BTreeMap

    Note over WeakMap: OPS_THRESHOLD = 1000
    Client ->> WeakMap: insert/get/remove operation
    WeakMap ->> OpsCounter: increment counter
    OpsCounter ->> WeakMap: check if threshold reached
    alt Threshold reached
        WeakMap ->> BTreeMap: retain only non-expired entries
        WeakMap ->> OpsCounter: reset counter
    end
    Note over Client,BTreeMap: When referenced value is dropped elsewhere
    Note over WeakMap: During next operation that reaches threshold...
    WeakMap ->> BTreeMap: expired entry is automatically removed

The cleanup process works as follows:

  1. Each operation (get, insert, remove) increments an internal operations counter
  2. When the operations counter reaches OPS_THRESHOLD (1000), cleanup is triggered
  3. During cleanup, all expired references are removed from the map
  4. The operations counter is reset to zero

This amortizes the cost of cleanup across operations, preventing performance spikes.

Sources: src/map.rs(L14 - L48)  src/map.rs(L158 - L169) 

Practical Example

Here's a complete example demonstrating typical WeakMap usage:

use std::sync::{Arc, Weak};
use weak_map::WeakMap;

// Create a new WeakMap
let mut map = WeakMap::<String, Weak<i32>>::new();

// Create some values with separate lifetimes
let value1 = Arc::new(42);
let value2 = Arc::new(100);

// Insert values (automatically creates weak references)
map.insert("first".to_string(), &value1);
map.insert("second".to_string(), &value2);

// Verify both values are accessible
assert_eq!(map.get(&"first".to_string()), Some(value1.clone()));
assert_eq!(map.get(&"second".to_string()), Some(value2.clone()));
assert_eq!(map.len(), 2);

// Drop one of the strong references
drop(value2);

// The weak reference is now expired
assert_eq!(map.get(&"second".to_string()), None);
assert_eq!(map.len(), 1); // Only one valid entry remains

// After enough operations, expired entries are automatically removed
// (This happens after OPS_THRESHOLD operations)

This example demonstrates how entries are automatically managed based on the lifecycle of the referenced values.

Sources: src/map.rs(L625 - L646) 

Performance Considerations

When using WeakMap, keep these performance aspects in mind:

  1. Cleanup Frequency: Cleanup occurs after every 1000 operations (OPS_THRESHOLD), which balances overhead with memory efficiency
  2. Length Operations: The len() method is O(n) since it must check each reference's validity, while raw_len() is O(1)
  3. Iterator Performance: All iterators filter out expired references, so iteration complexity is affected by the number of expired items
  4. Memory Usage: WeakMap holds weak references, which don't keep the referenced values alive, but the expired entries themselves occupy space in the map until the next cleanup

These characteristics make WeakMap particularly suitable for caching scenarios where you want to avoid memory leaks.

Sources: src/map.rs(L14 - L16)  src/map.rs(L158 - L169)  src/map.rs(L171 - L179) 

When to Use WeakMap

WeakMap is especially useful in these scenarios:

  1. Caching Systems: When you need to cache values but don't want to keep them alive once they're no longer used elsewhere
  2. Observer Patterns: When tracking objects that may be destroyed independently from the tracking system
  3. Breaking Reference Cycles: When you need to break reference cycles that could cause memory leaks
  4. Resource Management: When associating metadata with resources without extending their lifetime

If you don't need weak reference semantics, consider using StrongMap (a thin wrapper around BTreeMap) for better performance.

Sources: src/map.rs(L57 - L65) 

Summary

The weak-map library provides an elegant solution for scenarios requiring weak references with map semantics. The WeakMap implementation automatically handles reference lifecycle management while providing a familiar map interface.

Key takeaways:

  • Use WeakMap when you need map functionality with weak reference semantics
  • Weak references are automatically upgraded when accessed and expired entries are cleaned up periodically
  • The API closely mirrors standard map interfaces but handles weak reference conversion internally
  • Performance considerations include automatic cleanup and O(n) len() operation

For more advanced usage patterns, see Advanced Usage Patterns.

Basic Usage Examples

Relevant source files

This page provides practical examples demonstrating how to use the WeakMap and StrongMap data structures in the weak-map library. For advanced usage patterns and optimization techniques, see Advanced Usage Patterns.

Introduction to WeakMap

WeakMap is a B-Tree map that stores weak references to values, automatically removing entries when the referenced values are dropped. This makes it ideal for caching scenarios where you want to access objects as long as they're in use elsewhere, without extending their lifetime yourself.

flowchart TD
subgraph subGraph1["Object Lifecycle"]
    A["Arc"]
    AW["Weak"]
    AD["Arc dropped"]
    CE["Cleanup on next operation"]
end
subgraph subGraph0["WeakMap Data Structure"]
    WM["WeakMap"]
    BT["BTreeMap"]
    API["Automatic Cleanup API"]
end

A --> AW
AD --> CE
AW --> A
AW --> WM
WM --> API
WM --> BT

Sources: src/map.rs(L60 - L65)  src/map.rs(L158 - L169) 

Creating a WeakMap

Creating a new WeakMap is straightforward:

sequenceDiagram
    participant ClientCode as "Client Code"
    participant WeakMap as "WeakMap"

    ClientCode ->> WeakMap: "new()"
    Note over WeakMap: "Creates empty map"
    ClientCode ->> WeakMap: "default()"
    Note over WeakMap: "Creates empty map (alternative)"
    ClientCode ->> WeakMap: "from(btree_map)"
    Note over WeakMap: "Creates from existing BTreeMap"
    ClientCode ->> WeakMap: "from_iter()"
    Note over WeakMap: "Creates from key-value pairs"

Basic Creation Examples:

  1. Creating an empty WeakMap:
let map = WeakMap::<String, Weak<u32>>::new();
  2. Using the Default trait:
let map: WeakMap<String, Weak<u32>> = WeakMap::default();
  3. Creating from an iterator of key-value pairs (the strong references must outlive the statement, so bind them first):
let one = Arc::new("one");
let two = Arc::new("two");
let map = WeakMap::from_iter([(1, &one), (2, &two)]);

Sources: src/map.rs(L68 - L77)  src/map.rs(L80 - L84)  src/map.rs(L341 - L355) 

Basic Operations

Inserting Values

To insert values into a WeakMap, you need to provide a key and a strong reference to the value. The map will store a weak reference to the value internally.

let mut map = WeakMap::<u32, Weak<String>>::new();
let value = Arc::new(String::from("example"));
map.insert(1, &value);

The insert method returns an Option<V::Strong> containing the previous strong reference associated with the key, if one existed and hasn't been dropped.

Sources: src/map.rs(L258 - L263) 

Retrieving Values

To retrieve values, use the get method with a reference to the key:

if let Some(strong_ref) = map.get(&1) {
    // Use strong_ref
}

The get method returns Option<V::Strong>, containing the strong reference if the key exists and the weak reference could be upgraded.

Sources: src/map.rs(L207 - L214) 

Checking for Keys

To check if a key exists in the map without retrieving the value:

if map.contains_key(&1) {
    // Key exists and reference is not expired
}

Sources: src/map.rs(L239 - L246) 

Removing Entries

To remove an entry from the map:

if let Some(strong_ref) = map.remove(&1) {
    // Entry was removed and reference was still valid
}

Sources: src/map.rs(L270 - L277) 

Handling of Dropped References

The key feature of WeakMap is its ability to automatically clean up entries whose values have been dropped:

flowchart TD
subgraph subGraph1["Value Lifecycle"]
    AC["Arc Created"]
    ST["Strong References Exist"]
    WK["Weak Reference in Map"]
    NR["No Strong References"]
    EX["Expired Weak Reference"]
end
subgraph subGraph0["Automatic Cleanup Process"]
    OP["Operations Counter"]
    TH["Threshold Check"]
    CL["Cleanup"]
    EXP["Check expired"]
    REM["Remove Entry"]
    KEEP["Keep Entry"]
    RST["Reset Counter"]
end

AC --> ST
CL --> EXP
CL --> RST
EX --> CL
EXP --> KEEP
EXP --> REM
NR --> EX
OP --> TH
ST --> NR
ST --> WK
TH --> CL

Sources: src/map.rs(L14 - L48)  src/map.rs(L158 - L169) 

Example of Automatic Cleanup

When a referenced value is dropped, it doesn't immediately get removed from the WeakMap. Instead, the map detects and removes expired references during subsequent operations:

let mut map = WeakMap::<u32, Weak<String>>::new();

// Create a value in a nested scope so it gets dropped
{
    let value = Arc::new(String::from("temporary"));
    map.insert(1, &value);
} // value is dropped here

// The entry still exists in the underlying BTreeMap
assert_eq!(map.raw_len(), 1);

// But it won't be returned when getting or counting valid entries
assert_eq!(map.len(), 0);
assert_eq!(map.get(&1), None);

After a certain number of operations (defined by OPS_THRESHOLD, which is 1000), the map will automatically perform a cleanup to remove all expired references:

// After many operations, the expired references are cleaned up
assert_eq!(map.raw_len(), 0);

Sources: src/map.rs(L16)  src/map.rs(L158 - L161)  src/map.rs(L625 - L660) 

Converting Between WeakMap and StrongMap

Upgrading to StrongMap

You can convert a WeakMap to a StrongMap (which contains only strong references) using the upgrade method:

let strong_map: StrongMap<K, Arc<V>> = weak_map.upgrade();

This creates a new StrongMap containing only the keys that have valid references in the WeakMap.

Sources: src/map.rs(L296 - L306) 

Creating WeakMap from StrongMap

Conversely, you can create a WeakMap from a StrongMap:

let weak_map = WeakMap::from(&strong_map);

Sources: src/map.rs(L368 - L380) 

Iterating Over WeakMap Contents

WeakMap provides various iteration methods that only yield entries with valid references:

flowchart TD
subgraph subGraph1["Iterator Behaviors"]
    KV["(&K, V::Strong)"]
    K["&K"]
    V["V::Strong"]
    OKV["(K, V::Strong)"]
    OK["K"]
    OV["V::Strong"]
end
subgraph subGraph0["Iteration Methods"]
    WM["WeakMap"]
    IT["Iter"]
    KS["Keys"]
    VS["Values"]
    II["IntoIter"]
    IK["IntoKeys"]
    IV["IntoValues"]
end

II --> OKV
IK --> OK
IT --> KV
IV --> OV
KS --> K
VS --> V
WM --> II
WM --> IK
WM --> IT
WM --> IV
WM --> KS
WM --> VS

Sources: src/map.rs(L119 - L149)  src/map.rs(L383 - L623) 

Iterating Over Key-Value Pairs

for (key, value) in map.iter() {
    // key is a reference to the key, value is a strong reference
}

Iterating Over Keys or Values Only

// Iterating over keys
for key in map.keys() {
    // Process key
}

// Iterating over values
for value in map.values() {
    // Process value
}

Sources: src/map.rs(L383 - L528) 

Complete Usage Example

Here's a more complete example demonstrating the key features of WeakMap:

use std::sync::{Arc, Weak};
use weak_map::WeakMap;

// Create a new WeakMap
let mut cache = WeakMap::<String, Weak<Vec<u8>>>::new();

// Create some data and insert it into the map
let data1 = Arc::new(vec![1, 2, 3]);
let data2 = Arc::new(vec![4, 5, 6]);

cache.insert("data1".to_string(), &data1);
cache.insert("data2".to_string(), &data2);

// Data can be retrieved as long as strong references exist
assert_eq!(cache.get(&"data1".to_string()).unwrap(), data1);

// When all strong references to a value are dropped, it can't be retrieved
drop(data1);
assert_eq!(cache.get(&"data1".to_string()), None);

// But data2 is still accessible
assert_eq!(cache.get(&"data2".to_string()).unwrap(), data2);

// The map can tell us how many valid entries it has
assert_eq!(cache.len(), 1);

Sources: src/map.rs(L625 - L646) 

Practical Use Cases

WeakMap is ideal for several common scenarios:

| Use Case | Description |
|---|---|
| Caches | Store cached data without preventing garbage collection |
| Object Registries | Track objects by ID without affecting their lifecycle |
| Observers | Maintain a list of observers without creating reference cycles |
| Resource Pools | Track resources that can be released when no longer needed |

Example: Simple Cache Implementation

use std::sync::{Arc, Weak};
use weak_map::WeakMap;

struct Cache {
    data: WeakMap<String, Weak<Vec<u8>>>,
}

impl Cache {
    fn new() -> Self {
        Self { data: WeakMap::new() }
    }

    fn store(&mut self, key: String, value: &Arc<Vec<u8>>) {
        self.data.insert(key, value);
    }

    fn retrieve(&self, key: &str) -> Option<Arc<Vec<u8>>> {
        self.data.get(key)
    }
}

Sources: src/map.rs(L60 - L65)  src/map.rs(L207 - L214)  src/map.rs(L258 - L263) 

Advanced Usage Patterns

Relevant source files

This page explores complex usage patterns, optimization techniques, and best practices for the WeakMap implementation. While Basic Usage Examples covers fundamental operations, this section focuses on advanced scenarios that leverage the full potential of weak references in map data structures.

Understanding the Cleanup Mechanism

The WeakMap implementation includes an automatic cleanup mechanism that purges expired weak references. Understanding this mechanism is crucial for optimizing performance.

flowchart TD
OP["Operation on WeakMap"]
BUMP["Counter.bump()"]
CHECK["reach_threshold() check"]
CLEAN["cleanup()"]
RETAIN["retain(!is_expired())"]
RESET["Counter.reset()"]
EXIT["Continue"]

BUMP --> CHECK
CHECK --> CLEAN
CHECK --> EXIT
CLEAN --> RESET
CLEAN --> RETAIN
OP --> BUMP

By default, cleanup occurs after OPS_THRESHOLD (1000) operations on the map. This threshold is defined as a constant in src/map.rs(L16).

Sources: src/map.rs(L14 - L47)  src/map.rs(L158 - L169) 

Examining Cleanup Performance

Understanding the performance characteristics of the cleanup process is important for applications with stringent timing requirements:

  1. Cleanup Complexity: The cleanup operation is O(n) as it iterates through all entries in the map.
  2. Lazy Cleanup: Entries are only removed during cleanup operations, not immediately when they expire.
  3. Actual vs. Raw Length: The len() method reports only valid entries, while raw_len() includes expired entries.
sequenceDiagram
    participant Client as Client
    participant WeakMap as WeakMap
    participant OpsCounter as OpsCounter
    participant BTreeMap as BTreeMap

    Client ->> WeakMap: Multiple operations
    loop Each operation
        WeakMap ->> OpsCounter: bump()
    end
    WeakMap ->> OpsCounter: reach_threshold()?
    OpsCounter -->> WeakMap: true
    WeakMap ->> WeakMap: cleanup()
    WeakMap ->> BTreeMap: retain(!is_expired())
    WeakMap ->> OpsCounter: reset()
    Note over Client,BTreeMap: After cleanup
    Client ->> WeakMap: raw_len()
    WeakMap ->> BTreeMap: len()
    BTreeMap -->> Client: Count of all entries
    Client ->> WeakMap: len()
    WeakMap ->> WeakMap: iter().count()
    WeakMap -->> Client: Count of valid entries only

Sources: src/map.rs(L158 - L169)  src/map.rs(L113 - L115)  src/map.rs(L171 - L185) 

Custom Reference Types

The WeakMap can work with any reference type that implements the WeakRef trait, while the values must be from types implementing the StrongRef trait.

classDiagram
class StrongRef {
    <<trait>>
    type Weak
    downgrade() -~ Self::Weak
    ptr_eq(other: &Self) -~ bool
}

class WeakRef {
    <<trait>>
    type Strong
    upgrade() -~ Option
    is_expired() -~ bool
}

class CustomStrongRef {
    data: T
    downgrade() -~ CustomWeakRef
    ptr_eq(other: &Self) -~ bool
}

class CustomWeakRef {
    reference: WeakInner~T~
    upgrade() -~ Option~CustomStrongRef~
    is_expired() -~ bool
}

CustomStrongRef  ..|>  StrongRef : implements
CustomWeakRef  ..|>  WeakRef : implements
CustomStrongRef  -->  CustomWeakRef : creates
CustomWeakRef  -->  CustomStrongRef : may upgrade to

Implementing these traits for custom reference types allows you to integrate them with WeakMap:

flowchart TD
A["Custom Type Conversion"]
B["Implement StrongRef for CustomStrong"]
C["Implement WeakRef for CustomWeak"]
D["Use in WeakMap with custom reference types"]

A --> B
A --> C
B --> D
C --> D

Sources: src/traits.rs(L3 - L19)  src/traits.rs(L21 - L40) 

Advanced Conversion Operations

Converting Between WeakMap and StrongMap

The WeakMap implementation provides methods for converting between weak and strong maps:

flowchart TD
A["WeakMap"]
B["StrongMap"]

A --> B
B --> A

The upgrade() method creates a new StrongMap containing only the valid entries.

Sources: src/map.rs(L296 - L306)  src/map.rs(L368 - L380) 

Working with Iterators

WeakMap provides various iterator types to work with different aspects of the map:

| Iterator Type | Description | Returns | Implementation |
|---|---|---|---|
| Iter | References to entries | (&'a K, V::Strong) | src/map.rs 382-430 |
| Keys | References to keys | &'a K | src/map.rs 444-485 |
| Values | Valid values | V::Strong | src/map.rs 487-528 |
| IntoIter | Owned entries | (K, V::Strong) | src/map.rs 530-571 |
| IntoKeys | Owned keys | K | src/map.rs 573-597 |
| IntoValues | Owned values | V::Strong | src/map.rs 599-623 |

Note that all iterators automatically filter out expired references, so you only get valid entries.

flowchart TD
WM["WeakMap"]
Iter["Iter"]
Keys["Keys"]
Values["Values"]
IntoIter["IntoIter"]
IntoKeys["IntoKeys"]
IntoValues["IntoValues"]
EntryRef["(&'a K, V::Strong)"]
KeyRef["&'a K"]
Value["V::Strong"]
Entry["(K, V::Strong)"]
Key["K"]
OwnedValue["V::Strong"]

IntoIter --> Entry
IntoKeys --> Key
IntoValues --> OwnedValue
Iter --> EntryRef
Keys --> KeyRef
Values --> Value
WM --> IntoIter
WM --> IntoKeys
WM --> IntoValues
WM --> Iter
WM --> Keys
WM --> Values

Sources: src/map.rs(L119 - L149)  src/map.rs(L382 - L623) 

Memory Management Strategies

Minimizing Memory Overhead

When working with WeakMap, consider these strategies to minimize memory overhead:

  1. Preemptive Cleanup: For large maps, consider manually triggering cleanup before critical operations.
  2. Monitoring Raw Size: Use raw_len() to monitor the total size including expired entries.
  3. Strategic Insert/Remove: Batch insertions and removals to minimize cleanup frequency.
flowchart TD
A["Memory Optimization"]
B["Preemptive Cleanup"]
C["Size Monitoring"]
D["Batch Operations"]
B1["Call retain() manually"]
C1["raw_len() vs len()"]
D1["Insert/remove in batches"]

A --> B
A --> C
A --> D
B --> B1
C --> C1
D --> D1

Sources: src/map.rs(L113 - L115)  src/map.rs(L158 - L169)  src/map.rs(L187 - L201) 

Thread Safety Considerations

The WeakMap can be used with both single-threaded (Rc/Weak) and thread-safe (Arc/Weak) reference types.

flowchart TD
A["Reference Type Selection"]
B["Single-Threaded"]
C["Multi-Threaded"]
B1["Rc / Weak (std::rc)"]
B2["Implements StrongRef/WeakRef"]
B3["Use in WeakMap"]
C1["Arc / Weak (std::sync)"]
C2["Implements StrongRef/WeakRef"]
C3["Use in WeakMap"]

A --> B
A --> C
B --> B1
B --> B2
B --> B3
C --> C1
C --> C2
C --> C3

Selection depends on your concurrency requirements:

| Reference Type | Thread-Safe | Use Case |
|---|---|---|
| Rc/Weak | No | Single-threaded applications, better performance |
| Arc/Weak | Yes | Multi-threaded applications, safe concurrent access |

Sources: src/traits.rs(L42 - L64)  src/traits.rs(L66 - L88) 

Advanced Usage Patterns

Caching with Automatic Cleanup

WeakMap is particularly well-suited for implementing caches that automatically evict entries when they are no longer used elsewhere:

flowchart TD
Client["Client"]
Cache["Cache System"]
WeakMap["WeakMap"]
Compute["Compute Value"]
StoreRef["Store in Application"]
AppData["Application Data"]
Expire["Entry Expires"]
NextCleanup["Next Cleanup"]

AppData --> Expire
Cache --> WeakMap
Client --> Cache
Compute --> StoreRef
Compute --> WeakMap
Expire --> NextCleanup
NextCleanup --> WeakMap
StoreRef --> AppData
WeakMap --> Client
WeakMap --> Compute

Sources: src/map.rs(L203 - L214)  src/map.rs(L258 - L263) 

Observer Pattern Implementation

WeakMap can be used to implement observer patterns without memory leaks:

flowchart TD
Subject["Observable Subject"]
Observer1["Observer 1"]
Observer2["Observer 2"]
ObserverN["Observer N"]
WeakMap["WeakMap of weak Observer refs"]
Expired["Reference Expired"]
Removed["Entry Removed"]

Expired --> Removed
Observer2 --> Expired
Subject --> WeakMap
WeakMap --> Observer1
WeakMap --> Observer2
WeakMap --> ObserverN

Sources: src/map.rs(L203 - L214)  src/map.rs(L258 - L263) 

Breaking Reference Cycles

WeakMap is ideal for breaking reference cycles in complex data structures:

flowchart TD
ParentNode["Parent Node (Strong)"]
ChildNodes["Child Nodes (Strong)"]
ParentRefs["Parent References (Weak)"]

ChildNodes --> ParentRefs
ParentNode --> ChildNodes
ParentRefs --> ParentNode

This pattern avoids memory leaks while maintaining bidirectional relationships.

Sources: src/traits.rs(L3 - L19)  src/traits.rs(L21 - L40) 

Performance Optimizations

Choosing the Right Cleanup Strategy

The default cleanup strategy may not be optimal for all use cases:

| Usage Pattern | Recommended Approach |
|---|---|
| High churn (many entries added/removed) | Lower OPS_THRESHOLD or manual cleanup |
| Mostly static data with few expirations | Default cleanup is adequate |
| Memory-constrained environments | Preemptive cleanup after critical operations |
| Performance-critical code paths | Consider manual cleanup during idle periods |

Optimizing Map Operations

For performance-critical applications, consider these strategies:

  1. Pre-sizing: If approximate size is known, create with appropriate capacity
  2. Batch Processing: Group insertions and retrievals to minimize cleanup overhead
  3. Strategic Cleanup: Trigger cleanup during low-activity periods
  4. Monitoring: Track raw_len() vs. len() to gauge cleanup effectiveness

Sources: src/map.rs(L158 - L169)  src/map.rs(L113 - L115)  src/map.rs(L171 - L185) 

Conclusion

Advanced usage of WeakMap requires understanding its internal cleanup mechanism, reference type interactions, and memory management characteristics. By applying the patterns and strategies outlined in this document, you can leverage WeakMap effectively in complex applications while maintaining optimal performance.

Implementation Details

Relevant source files

This document provides a deep dive into the internal implementation of the weak-map library. It covers the core mechanisms that enable automatic cleanup of expired references, the internal data structures, and how different components interact to provide efficient weak reference management. For information about usage patterns and API, see Usage Guide.

Internal Structure

The WeakMap is implemented as a wrapper around Rust's standard BTreeMap with additional logic to handle weak references and their lifecycle management.

classDiagram
class WeakMap {
    inner: BTreeMap
    ops: OpsCounter
    new()
    cleanup()
    try_bump()
    insert(key, value)
    get(key)
    len()
    raw_len()
}

class OpsCounter {
    0: AtomicUsize
    new()
    add(ops)
    bump()
    reset()
    get()
    reach_threshold()
}

class BTreeMap {
    <<standard library>>
    
    
}

WeakMap  -->  OpsCounter : contains
WeakMap  -->  BTreeMap : wraps

Sources: src/map.rs(L11 - L65)  src/map.rs(L13 - L55) 

The WeakMap structure has two main components:

  • inner: A standard BTreeMap<K, V> that stores the actual key-value pairs
  • ops: An OpsCounter that tracks operations to trigger cleanup at appropriate intervals

The OpsCounter is a simple wrapper around an atomic counter that helps determine when to perform cleanup operations.

Cleanup Mechanism

One of the most important aspects of the WeakMap implementation is its automatic cleanup mechanism, which ensures that expired references are removed.

flowchart TD
A["Operation on WeakMap"]
B["ops.bump()"]
C["ops.reach_threshold()?"]
D["cleanup()"]
E["Continue operation"]
F["ops.reset()"]
G["Remove expired entries"]

A --> B
B --> C
C --> D
C --> E
D --> F
D --> G
G --> E

Sources: src/map.rs(L157 - L169)  src/map.rs(L13 - L47) 

The cleanup process:

  1. Each operation increments the operation counter
  2. When the counter reaches the threshold (1000 operations, defined as OPS_THRESHOLD), cleanup is triggered
  3. The cleanup process resets the counter and removes all expired references from the map

This approach balances performance with memory usage:

  • The map doesn't need to check every entry on every operation
  • Cleanup is amortized over multiple operations
  • Expired entries will eventually be removed without manual intervention

Reference Management

The weak-map library relies on two core traits for reference management:

classDiagram
class StrongRef {
    <<trait>>
    type Weak
    downgrade() -~ Self::Weak
    ptr_eq(other: &Self) -~ bool
}

class WeakRef {
    <<trait>>
    type Strong
    upgrade() -~ Option
    is_expired() -~ bool
}

class Rc {
    
    downgrade() -~ Weak
    ptr_eq(other: &Self) -~ bool
}

class Weak {
    
upgrade() -~ Option
    strong_count() -~ usize
}

class Arc {
    
    downgrade() -~ Weak
    ptr_eq(other: &Self) -~ bool
}

class ArcWeak {
    
upgrade() -~ Option
    strong_count() -~ usize
}

Rc  ..|>  StrongRef : implements
Weak  ..|>  WeakRef : implements
Arc  ..|>  StrongRef : implements
ArcWeak  ..|>  WeakRef : implements
StrongRef  -->  WeakRef : associated types

Sources: src/traits.rs(L3 - L40)  src/traits.rs(L42 - L88) 

The trait implementations enable the WeakMap to work with different types of weak references:

  • StrongRef is implemented for Rc<T> and Arc<T>
  • WeakRef is implemented for std::rc::Weak<T> and std::sync::Weak<T>

This abstraction allows the WeakMap to be agnostic about the specific reference type being used, as long as it conforms to the trait requirements.

Operation Flow

When performing operations on a WeakMap, there's a specific flow that handles the weak references correctly:

sequenceDiagram
    participant Client as Client
    participant WeakMap as WeakMap
    participant BTreeMap as BTreeMap
    participant WeakRef as WeakRef
    participant StrongRef as StrongRef

    Client ->> WeakMap: insert(key, strong_ref)
    WeakMap ->> WeakMap: try_bump()
    WeakMap ->> StrongRef: downgrade(strong_ref)
    StrongRef -->> WeakMap: weak_ref
    WeakMap ->> BTreeMap: insert(key, weak_ref)
    BTreeMap -->> WeakMap: optional old_weak_ref
    WeakMap ->> WeakRef: upgrade(old_weak_ref)
    WeakRef -->> WeakMap: optional old_strong_ref
    WeakMap -->> Client: optional old_strong_ref
    Note over WeakMap,BTreeMap: Later...
    Client ->> WeakMap: get(key)
    WeakMap ->> WeakMap: ops.bump()
    WeakMap ->> BTreeMap: get(key)
    BTreeMap -->> WeakMap: optional weak_ref
    WeakMap ->> WeakRef: upgrade(weak_ref)
    WeakRef -->> WeakMap: optional strong_ref
    WeakMap -->> Client: optional strong_ref

Sources: src/map.rs(L203 - L214)  src/map.rs(L258 - L263) 

Key points in the operation flow:

  1. For insertion (insert):
  • The strong reference is downgraded to a weak reference
  • The weak reference is stored in the map
  • If an existing reference is replaced, it's upgraded before being returned
  2. For retrieval (get):
  • The weak reference is retrieved from the map
  • The weak reference is upgraded to a strong reference if still valid
  • If the reference has expired, None is returned

Iterator Implementation

Iterators in WeakMap are designed to filter out expired references automatically:

flowchart TD
A["WeakMap Iter"]
B["BTreeMap Iter"]
C["For each entry"]
D["Is referenceexpired?"]
E["Yield (key, value)"]
F["Skip entry"]

A --> B
B --> C
C --> D
D --> E
D --> F
E --> C
F --> C

Sources: src/map.rs(L382 - L430)  src/map.rs(L445 - L485)  src/map.rs(L488 - L528) 

The library provides several iterator types:

| Iterator Type | Description | Returns |
|---|---|---|
| Iter | References to entries | (&'a K, V::Strong) |
| Keys | References to keys | &'a K |
| Values | Values as strong references | V::Strong |
| IntoIter | Owned entries | (K, V::Strong) |
| IntoKeys | Owned keys | K |
| IntoValues | Owned values as strong references | V::Strong |

Each iterator automatically filters out entries with expired references by attempting to upgrade the weak reference. If the upgrade fails, the entry is skipped.

Performance Considerations

The performance of WeakMap is influenced by several implementation choices:

  1. Cleanup threshold: The cleanup process only runs after a certain number of operations (OPS_THRESHOLD = 1000), which amortizes the cost of cleanup.
  2. BTreeMap as the underlying data structure: The choice of BTreeMap provides O(log n) complexity for most operations.
  3. Lazy iteration: Iterators only yield valid entries, but they must attempt to upgrade each weak reference, which can be expensive.
flowchart TD
subgraph subGraph0["Performance Trade-offs"]
    A["Immediate Cleanup"]
    B["High Operation Cost"]
    C["Minimal Memory Usage"]
    D["Lazy Cleanup"]
    E["Fast Operations"]
    F["Temporary Memory Overhead"]
    G["Current Approach(Threshold-based)"]
    H["Amortized Cost"]
    I["Bounded Memory Overhead"]
end

A --> B
A --> C
D --> E
D --> F
G --> H
G --> I

Sources: src/map.rs(L15 - L16)  src/map.rs(L157 - L169)  src/map.rs(L625 - L660) 

The current implementation strikes a balance between operation speed and memory usage:

  • Operations are fast most of the time (no cleanup)
  • Memory overhead is bounded (cleanup happens periodically)
  • The cost of cleanup is amortized over multiple operations

The test cases in the codebase demonstrate this behavior, showing that after many operations the map will clean up expired references automatically.

Memory Management

The core memory management feature of WeakMap is its ability to automatically handle expired references.

flowchart TD
A["Object Creation"]
B["Strong Reference(Rc/Arc)"]
C["WeakMap storesWeak Reference"]
D["Object Drop"]
E["Strong Count = 0"]
F["Weak ReferenceExpires"]
G["Operation on WeakMap"]
H["Cleanup Triggered?"]
I["Remove ExpiredReferences"]
J["Continue Operation"]

A --> B
B --> C
C --> F
D --> E
E --> F
F --> I
G --> H
H --> I
H --> J

Sources: src/map.rs(L157 - L161)  src/traits.rs(L33 - L39)  src/map.rs(L632 - L660) 

When an object is dropped elsewhere in the program:

  1. Its strong count reaches zero
  2. Any weak references to it become expired
  3. The next time the cleanup mechanism runs in WeakMap, these expired references will be removed

This ensures that WeakMap doesn't hold onto memory for objects that are no longer needed elsewhere in the program.

Memory Management

Relevant source files

Purpose and Scope

This document explains how memory is managed in the weak-map library, focusing on weak references and the automatic cleanup process. It details how the WeakMap data structure prevents memory leaks while allowing values to be deallocated when they're no longer needed elsewhere.

For information about the core components and API of the WeakMap and StrongMap implementations, see WeakMap and StrongMap. For details about the reference traits, see Reference Traits.

Weak vs Strong References

The weak-map library is built around the concept of weak references, which are references that don't prevent the referenced object from being deallocated.

flowchart TD
subgraph subGraph0["Reference Types"]
    A["Strong Reference (Rc/Arc)"]
    B["Referenced Object"]
    C["Weak Reference (Weak)"]
end

A --> B
A --> C
C --> A
C --> B

Sources: src/traits.rs(L3 - L40) 

Key characteristics:

  • Strong references (Rc<T>, Arc<T>) increase the reference count and keep the object alive
  • Weak references (Weak<T>) don't affect the reference count used for deallocation decisions
  • When all strong references are dropped, the object is deallocated, even if weak references still exist
  • Weak references can be "upgraded" to strong references, but this will fail if the object has been deallocated

Reference Management Architecture

The library provides traits that abstract over different reference types:

classDiagram
class StrongRef {
    <<trait>>
    type Weak
    downgrade() Self::Weak
    ptr_eq(other: &Self) bool
}

class WeakRef {
    <<trait>>
    type Strong
    upgrade() Option~Self::Strong~
    is_expired() bool
}

class Rc {
    downgrade() Weak~T~
    ptr_eq(other: &Rc) bool
}

class RcWeak {
    upgrade() Option~Rc~T~~
    is_expired() bool
}

class Arc {
    downgrade() Weak~T~
    ptr_eq(other: &Arc) bool
}

class ArcWeak {
    upgrade() Option~Arc~T~~
    is_expired() bool
}

StrongRef  -->  WeakRef : associated types
Rc  ..|>  StrongRef : implements
Arc  ..|>  StrongRef : implements
RcWeak  ..|>  WeakRef : implements
ArcWeak  ..|>  WeakRef : implements

Sources: src/traits.rs(L3 - L88) 

This architecture allows the WeakMap to work with different types of references, making it flexible for various use cases:

  • Single-threaded applications can use Rc/Weak references
  • Multi-threaded applications can use Arc/Weak references

Automatic Cleanup Mechanism

A key feature of WeakMap is its automatic cleanup of expired weak references:

flowchart TD
subgraph subGraph0["Cleanup Process"]
    C["OpsCounter"]
    D["cleanup()"]
end
A["WeakMap"]
B["BTreeMap"]

A --> B
A --> C
C --> D
D --> B

Sources: src/map.rs(L13 - L47)  src/map.rs(L158 - L169) 

Operations Counter

The operations counter tracks the number of operations performed on the map:

| Component | Purpose | Implementation |
| --- | --- | --- |
| OpsCounter | Tracks operations on the map | Uses an atomic counter (AtomicUsize) |
| OPS_THRESHOLD | Determines when cleanup occurs | Constant set to 1000 operations |
| try_bump() | Increments counter and checks threshold | Called on mutations like insert/remove |
| cleanup() | Removes expired entries | Retains only non-expired entries |

Sources: src/map.rs(L13 - L47) 

The operations counter uses atomic operations to ensure thread safety. Each mutation operation (insert, remove) increments the counter, and when it reaches the threshold (1000 operations), the cleanup process is triggered.

sequenceDiagram
    participant Client as Client
    participant WeakMap as WeakMap
    participant OpsCounter as OpsCounter
    participant BTreeMap as BTreeMap

    Client ->> WeakMap: insert(key, value)
    WeakMap ->> OpsCounter: bump()
    OpsCounter ->> OpsCounter: increment count
    OpsCounter ->> WeakMap: check if count >= OPS_THRESHOLD
    alt Threshold reached
        WeakMap ->> OpsCounter: reset()
        WeakMap ->> BTreeMap: retain(!v.is_expired())
    end
    WeakMap ->> BTreeMap: insert(key, downgraded_value)

Sources: src/map.rs(L164 - L169)  src/map.rs(L258 - L263) 

Reference Lifecycle in WeakMap

The lifecycle of references in the WeakMap follows this pattern:

flowchart TD
A["Client code"]
B["WeakMap"]
C["Weak reference"]
D["BTreeMap"]
E["Client code"]
F["Strong reference (if still valid)"]
G["Original object"]
H["Weak reference becomes expired"]
I["Any operation"]
J["Remove expired references"]

A --> B
B --> C
B --> D
B --> E
B --> F
E --> B
G --> H
I --> J

Sources: src/map.rs(L207 - L214)  src/map.rs(L258 - L263) 

Key Stages:

  1. Storage Phase:
  • The client provides a key and a strong reference (&V::Strong)
  • WeakMap downgrades the strong reference to a weak reference
  • The key and weak reference are stored in the underlying BTreeMap
  2. Retrieval Phase:
  • The client requests a value by key
  • WeakMap retrieves the weak reference from the BTreeMap
  • It attempts to upgrade the weak reference to a strong reference
  • If successful (object still exists), it returns the strong reference
  • If unsuccessful (object has been deallocated), it returns None
  3. Cleanup Phase:
  • After a threshold number of operations, cleanup is triggered
  • All expired weak references are removed from the map
  • The operations counter is reset

Sources: src/map.rs(L158 - L161) 

Memory Management in Practice

The following test demonstrates how memory is automatically managed:

sequenceDiagram
    participant Test as Test
    participant WeakMap as WeakMap
    participant InnerScope as "Inner Scope"

    Test ->> WeakMap: create WeakMap<u32, Weak<&str>>
    Test ->> Test: create elem1 = Arc::new("1")
    Test ->> WeakMap: insert(1, &elem1)
    Test ->> InnerScope: enter inner scope
    InnerScope ->> InnerScope: create elem2 = Arc::new("2")
    InnerScope ->> WeakMap: insert(2, &elem2)
    InnerScope ->> Test: exit scope (elem2 is dropped)
    Test ->> WeakMap: get(1)
    WeakMap ->> Test: return Some(elem1)
    Test ->> WeakMap: get(2)
    WeakMap ->> Test: return None (elem2 was dropped)
    Test ->> WeakMap: len()
    WeakMap ->> Test: return 1 (only elem1 is still valid)

Sources: src/map.rs(L632 - L646) 

In this test:

  1. Two values are inserted into the WeakMap
  2. The second value (elem2) goes out of scope and is dropped
  3. When retrieving the values, only the first one is still available
  4. The len() method accurately reports only one valid element, even though there are two entries in the underlying map

Performance Considerations

The automatic cleanup mechanism balances memory usage with performance:

| Consideration | Implementation | Trade-off |
| --- | --- | --- |
| Delayed cleanup | Cleanup occurs after OPS_THRESHOLD operations | Amortizes cleanup cost across operations |
| Lazy iteration | Iterators skip expired references | Avoids unnecessary memory allocations |
| On-demand length | len() counts only valid references | More expensive than raw_len() but accurate |
| Targeted cleanup | retain() function used to filter | More efficient than rebuilding the map |

Sources: src/map.rs(L158 - L161)  src/map.rs(L172 - L179) 

For high-performance scenarios where cleanup frequency needs to be tuned, the OPS_THRESHOLD constant (set to 1000) determines how often the cleanup process runs. This value represents a balance between memory usage (keeping expired references around) and CPU usage (cleaning up frequently).

For more in-depth performance considerations, see Performance Considerations.

Performance Considerations

Relevant source files

This document covers the performance characteristics of the WeakMap implementation, including its automatic cleanup mechanism, operation complexity, and memory management considerations. For information about memory management internals, see Memory Management.

Automatic Cleanup Mechanism

WeakMap implements an automatic garbage collection mechanism that periodically removes expired weak references from its internal storage. This prevents the map from accumulating dead entries indefinitely.

flowchart TD
subgraph subGraph0["Cleanup Process"]
    D["Run cleanup()"]
    F["Reset counter"]
    G["Remove expired references"]
end
A["Operation on WeakMap"]
B["Increment OpsCounter"]
C["Counter ≥ OPS_THRESHOLD?"]
E["Continue"]

A --> B
B --> C
C --> D
C --> E
D --> F
D --> G
G --> E

The cleanup process is controlled by an operations counter that triggers garbage collection after a set threshold:

  • Each map operation increments a counter
  • When counter reaches OPS_THRESHOLD (1000 operations), cleanup runs
  • Cleanup removes all entries with expired weak references
  • Counter resets after cleanup

Sources: src/map.rs(L13 - L47)  src/map.rs(L158 - L169) 

Operation Complexity

WeakMap is built on top of BTreeMap and inherits its characteristics, with additional overhead for weak reference handling.

| Operation | Time Complexity | Notes |
| --- | --- | --- |
| get(key) | O(log n) | Plus weak reference upgrade cost |
| insert(key, value) | O(log n) | Plus weak reference downgrade cost |
| remove(key) | O(log n) | Plus weak reference upgrade cost |
| len() | O(n) | Must iterate all entries to filter expired refs |
| is_empty() | O(n) | Calls len() under the hood |
| contains_key(key) | O(log n) | Must check if reference is expired |
| cleanup() | O(n) | Full scan removing expired references |
| upgrade() to StrongMap | O(n) | Must attempt to upgrade all references |
| Iteration | O(n) | Filters out expired references during iteration |

Sources: src/map.rs(L158 - L161)  src/map.rs(L176 - L179)  src/map.rs(L383 - L430) 

Memory Management Model

The key performance advantage of WeakMap is its ability to avoid memory leaks by not keeping values alive when they're no longer needed elsewhere in the program.

sequenceDiagram
    participant Client as Client
    participant WeakMapKV as "WeakMap<K, V>"
    participant BTreeMapKV as "BTreeMap<K, V>"
    participant OpsCounter as OpsCounter

    Client ->> WeakMapKV: insert(key, strong_ref)
    WeakMapKV ->> WeakMapKV: downgrade(strong_ref)
    WeakMapKV ->> BTreeMapKV: store(key, weak_ref)
    WeakMapKV ->> OpsCounter: bump()
    Note over Client,BTreeMapKV: Later, value dropped elsewhere
    Client ->> WeakMapKV: get(key)
    WeakMapKV ->> BTreeMapKV: get(key)
    BTreeMapKV ->> WeakMapKV: return weak_ref
    WeakMapKV ->> WeakMapKV: attempt upgrade()
    WeakMapKV ->> Client: return None (reference expired)
    Note over WeakMapKV,OpsCounter: After OPS_THRESHOLD operations
    WeakMapKV ->> OpsCounter: check threshold
    OpsCounter ->> WeakMapKV: threshold reached
    WeakMapKV ->> WeakMapKV: cleanup()
    WeakMapKV ->> BTreeMapKV: remove expired entries

This diagram illustrates how WeakMap interacts with its references and automatic cleanup mechanism throughout the lifecycle of operations.

Sources: src/map.rs(L158 - L169)  src/map.rs(L207 - L214)  src/map.rs(L258 - L263) 

Performance Implications

Cleanup Overhead

While the automatic cleanup provides memory safety, it comes with performance costs:

  • Periodic O(n) cleanup operations that scan all entries
  • Unpredictable timing of these operations can cause occasional latency spikes
  • Cleanup frequency depends on operation patterns (thrashing can occur with certain workloads)

In the worst case, if a WeakMap contains many expired references and few valid ones, a significant portion of operations can be spent on cleanup rather than useful work.

Sources: src/map.rs(L158 - L161)  src/map.rs(L16) 

Iterator Performance

Iterators in WeakMap must filter out expired references during iteration, adding overhead compared to regular collection iterators:

flowchart TD
subgraph subGraph0["Regular BTreeMap iteration"]
    G["BTreeMap::iter()"]
    H["Iterator.next() called"]
    I["Return (key, value) directly"]
end
A["WeakMap::iter()"]
B["BTreeMap::iter()"]
C["Wrapped in WeakMap::Iter"]
D["Iterator.next() called"]
E["Reference expired?"]
F["Return (key, upgraded_value)"]

A --> B
B --> C
C --> D
D --> E
E --> D
E --> F
G --> H
H --> I

This filtering during iteration means:

  • Size hints are less accurate (0 to n rather than exact counts)
  • Iteration may be slower than with regular collections
  • Memory usage during iteration remains efficient due to lazy evaluation

Sources: src/map.rs(L383 - L405)  src/map.rs(L390 - L399) 

Memory Usage vs Regular Collections

The weak reference approach offers significant memory advantages in certain scenarios:

| Collection Type | Memory Behavior | Reference Behavior |
| --- | --- | --- |
| BTreeMap<K, V> | Stores full values | Values kept alive even when unused elsewhere |
| WeakMap<K, V> | Stores weak references | Values collected when no strong refs exist |

When storing large objects that may be dropped elsewhere in the program, WeakMap allows for automatic reclamation of memory without manual bookkeeping.

Sources: src/map.rs(L60 - L65) 

Performance Tuning

The main tunable parameter for WeakMap performance is the OPS_THRESHOLD constant:

flowchart TD
A["OPS_THRESHOLD"]
B["Cleanup Frequency"]
C["More frequent cleanup"]
D["Less frequent cleanup"]
E["(+) Less memory overhead"]
F["(–) More CPU overhead"]
G["(–) More latency spikes"]
H["(+) Less CPU overhead"]
I["(+) Fewer latency spikes"]
J["(–) More memory overhead"]

A --> B
B --> C
B --> D
C --> E
C --> F
C --> G
D --> H
D --> I
D --> J

The default threshold (1000 operations) aims to balance:

  • Memory usage (keeping expired references consumes memory)
  • CPU overhead (running cleanup too frequently is expensive)
  • Latency consistency (avoiding frequent pauses for cleanup)

Sources: src/map.rs(L16) 

Real-World Performance Behavior

The test cases demonstrate key performance characteristics:

  1. Basic Functionality Test: Shows how expired references are automatically excluded from operations like len() and get().
  2. Cleanup Trigger Test: Shows how the cleanup mechanism is automatically triggered after OPS_THRESHOLD operations, removing expired references from the map's internal storage.

Testing shows that after many operations, the map correctly maintains its state:

  • Only counts valid references in its logical length (len())
  • Still tracks the total entries in its raw length (raw_len())
  • Automatically cleans up entries when the threshold is reached

Sources: src/map.rs(L625 - L660) 

Optimization Recommendations

When using WeakMap in performance-sensitive code, consider these guidelines:

  1. Avoid frequent len() calls: Since this operation is O(n), cache the length if needed repeatedly.
  2. Be aware of operation count: Operations that might trigger cleanup can cause occasional performance spikes.
  3. Use raw_len() for debugging: This gives you the total entries including expired ones without the O(n) scan.
  4. Consider selective cleanup: For very large maps, consider manually cleaning up at strategic times rather than relying solely on the automatic threshold.
  5. Use appropriate data structures: If you don't need the weak reference behavior, consider using StrongMap which avoids the overhead of reference handling.

Sources: src/map.rs(L113 - L115)  src/map.rs(L176 - L179) 

Project Information

Relevant source files

This document provides essential information about the weak-map project structure, development workflow, contribution guidelines, and licensing. For detailed technical information about the implementation, please refer to Core Components and Implementation Details.

Project Overview

The weak-map repository provides a Rust implementation of WeakMap - a B-Tree map data structure that stores weak references to values, automatically removing entries when referenced values are dropped. It is hosted on GitHub at https://github.com/Starry-OS/weak-map and published as a crate on crates.io.

flowchart TD
A["weak-map Repository"]
B["Source Code"]
C["Project Metadata"]
D["CI Configuration"]
E["lib.rs"]
F["map.rs"]
G["traits.rs"]
H["README.md"]
I["Cargo.toml"]
J["License Files"]
K[".github/workflows/ci.yml"]

A --> B
A --> C
A --> D
B --> E
B --> F
B --> G
C --> H
C --> I
C --> J
D --> K

Sources: README.md, .github/workflows/ci.yml

Repository Structure

The weak-map project follows a standard Rust crate organization with a clean separation between the core implementations and trait definitions.

classDiagram
class SourceCode {
    src/lib.rs
    src/map.rs
    src/traits.rs
    
}

class Implementation {
    WeakMap
    StrongMap
    
}

class Traits {
    StrongRef
    WeakRef
    
}

class ProjectFiles {
    README.md
    Cargo.toml
    LICENSE-MIT
    LICENSE-APACHE-2.0
    
}

class CIConfig {
    .github/workflows/ci.yml
    
}

SourceCode  -->  Implementation : contains
SourceCode  -->  Traits : contains
Implementation  ..>  Traits : uses

Sources: README.md

Development Workflow

The weak-map project employs GitHub Actions for continuous integration to ensure code quality and test coverage.

CI Process

The CI workflow runs automatically on:

  • Push to the main branch
  • Pull requests targeting the main branch
flowchart TD
A["Push/PR to main branch"]
B["CI Workflow Triggered"]
C["Matrix Setup"]
D["Rust Toolchain Versions"]
E["stable"]
F["nightly"]
G["nightly-2025-01-18"]
H["Run Clippy"]
I["Run Tests"]
J["Build Success/Failure Report"]

A --> B
B --> C
C --> D
D --> E
D --> F
D --> G
E --> H
F --> H
G --> H
H --> I
I --> J

Sources: .github/workflows/ci.yml(L3 - L10) 

CI Actions

The CI performs these specific checks:

  1. Clippy Linting: Runs with all features and targets, with warnings treated as errors:
cargo clippy --all-features --all-targets -- -Dwarnings
  2. Comprehensive Testing: Runs all tests with all features enabled:
cargo test --all-features

Sources: .github/workflows/ci.yml(L28 - L31) 

Contributing Guidelines

Contributions to the weak-map project are welcome. Based on the repository structure and CI configuration, here are the recommended steps for contributing:

  1. Fork the repository on GitHub
  2. Create a feature branch for your changes
  3. Make your changes following the code style of the project
  4. Add tests for your changes to ensure they work correctly
  5. Run the checks locally that will be performed by CI:
cargo clippy --all-features --all-targets -- -Dwarnings
cargo test --all-features
  6. Submit a pull request to the main branch
sequenceDiagram
    participant Developer as Developer
    participant Repository as Repository
    participant CI as CI

    Developer ->> Repository: Fork repository
    Developer ->> Developer: Create feature branch
    Developer ->> Developer: Make changes
    Developer ->> Developer: Add tests
    Developer ->> Developer: Run local checks
    Developer ->> Repository: Submit pull request
    Repository ->> CI: Trigger CI checks
    CI ->> Repository: Report results
    alt Tests Pass
        Repository ->> Developer: Approve and merge
    else Tests Fail
        Repository ->> Developer: Request changes
    end

Sources: .github/workflows/ci.yml(L14 - L31) 

License Information

The weak-map project is dual-licensed under both the MIT License and the Apache License 2.0, allowing users to choose the license that best suits their needs.

Dual License Approach

flowchart TD
A["weak-map Project"]
B["MIT License"]
C["Apache License 2.0"]
D["Simple, permissive license"]
E["Includes explicit patent grants"]
F["Users can choose either license"]

A --> B
A --> C
B --> D
C --> E
D --> F
E --> F

License Usage

  • MIT License: A permissive license that allows users to do almost anything with the code, including using it in proprietary software, as long as they provide attribution.
  • Apache License 2.0: Also permissive, but includes explicit patent grants and more detailed terms around trademark usage.

The license files (LICENSE-MIT and LICENSE-APACHE-2.0) are included in the repository root directory, as indicated by project structure diagrams.

Sources: README.md

Package Information

The weak-map package is published on crates.io and documentation is available on docs.rs.

flowchart TD
A["weak-map Package"]
B["crates.io"]
C["docs.rs"]
D["Rust dependency management"]
E["API documentation"]
F["Your Project"]
G["Add dependency in Cargo.toml"]

A --> B
A --> C
B --> D
C --> E
F --> G
G --> B

Project Origins

As noted in the README, weak-map is "similar to and inspired by weak-table but using BTreeMap as underlying implementation."

Sources: README.md(L6) 

Contributing Guide

Relevant source files

This document provides guidelines and instructions for contributing to the weak-map library. It covers the development workflow, code standards, CI process, and pull request procedures. For information on how to use the library, see the Usage Guide or Core Components for implementation details.

Development Environment Setup

Prerequisites

To contribute to weak-map, you'll need:

  • Rust toolchain (stable, though the project is tested on nightly as well)
  • Cargo (Rust's package manager)
  • Git

Getting Started

flowchart TD
A["Fork Repository on GitHub"]
B["Clone Repository: git clone (your fork URL)"]
C["Set Upstream: git remote add upstream https://github.com/Starry-OS/weak-map"]
D["Create Branch: git checkout -b feature/your-feature"]
E["Make Changes"]
F["Run Tests: cargo test --all-features"]
G["Run Clippy: cargo clippy --all-features --all-targets"]
H["Commit Changes: git commit -m 'Add feature X'"]
I["Push Changes: git push origin feature/your-feature"]
J["Create Pull Request on GitHub"]

A --> B
B --> C
C --> D
D --> E
E --> F
F --> G
G --> H
H --> I
I --> J

Sources: .github/workflows/ci.yml(L1 - L32) 

Code Standards and Guidelines

The weak-map codebase follows standard Rust coding conventions. All contributions should:

  1. Pass Clippy checks with no warnings (cargo clippy --all-features --all-targets -- -Dwarnings)
  2. Include appropriate tests
  3. Maintain or improve test coverage
  4. Include documentation for public API items
  5. Follow the existing code style

Project Structure

When making changes, it's important to understand the project's structure:

flowchart TD
A["src/lib.rs - Main Entry Point"]
B["src/map.rs - WeakMap & StrongMap Implementations"]
C["src/traits.rs - WeakRef & StrongRef Traits"]
D["WeakMap Struct"]
E["StrongMap Struct"]
F["WeakRef Trait"]
G["StrongRef Trait"]

A --> B
A --> C
B --> D
B --> E
C --> F
C --> G

Sources: README.md(L1 - L7) 

Testing Requirements

All contributions must include appropriate tests:

  • Unit Tests: Test individual functions and methods
  • Integration Tests: Test interactions between components
  • Edge Cases: Include tests for boundary conditions

Run tests locally before submitting a PR:

cargo test --all-features

CI Process

The weak-map repository uses GitHub Actions for continuous integration:

flowchart TD
A["Pull Request Created/Updated"]
B["GitHub Actions CI Workflow Triggered"]
C["Check Job"]
D["Rust Stable: Clippy + Tests"]
E["Rust Nightly: Clippy + Tests"]
F["Rust nightly-2025-01-18: Clippy + Tests"]
G["All Checks Pass?"]
H["Ready for Review"]
I["Fix Issues"]
J["Push Updates"]

A --> B
B --> C
C --> D
C --> E
C --> F
D --> G
E --> G
F --> G
G --> H
G --> I
I --> J
J --> B

The CI process checks:

  1. Clippy static analysis with warnings treated as errors
  2. All tests passing across multiple Rust versions
  3. All features enabled during testing

Sources: .github/workflows/ci.yml(L1 - L32) 

Pull Request Guidelines

PR Submission

When submitting a pull request:

  1. Provide a clear, descriptive title
  2. Include a detailed description of changes
  3. Reference any related issues
  4. Explain your testing approach
  5. Highlight any breaking changes

Review Process

The review process typically involves:

  1. CI checks passing
  2. Code review by maintainers
  3. Addressing feedback
  4. Final approval and merge
sequenceDiagram
    participant Contributor as Contributor
    participant CISystem as CI System
    participant Maintainer as Maintainer

    Contributor ->> CISystem: Submit PR
    CISystem ->> CISystem: Run tests & checks
    CISystem ->> Maintainer: Report results
    Maintainer ->> Contributor: Provide feedback
    Contributor ->> CISystem: Address feedback
    CISystem ->> CISystem: Re-run tests
    CISystem ->> Maintainer: Report updated results
    Maintainer ->> Maintainer: Final review
    Maintainer ->> Contributor: Approve/request changes
    Contributor ->> Maintainer: Address final requests
    Maintainer ->> Contributor: Merge PR

Sources: .github/workflows/ci.yml(L1 - L32) 

Documentation

Documentation is a crucial part of the weak-map project:

  • Code Documentation: All public APIs should have rustdoc comments
  • Examples: Include examples for non-trivial functionality
  • Wiki Contributions: Update relevant wiki pages when changing functionality

Documentation Style

/// A map containing weak references to values.
/// 
/// Values are automatically removed when the original reference is dropped.
/// 
/// # Examples
/// 
/// ```
/// use weak_map::WeakMap;
/// use std::rc::Rc;
/// 
/// let mut map = WeakMap::new();
/// let value = Rc::new("value");
/// 
/// map.insert("key", &value);
/// assert!(map.contains_key("key"));
/// 
/// drop(value);  // Drop the strong reference
/// assert!(map.get("key").is_none());
/// ```

Licensing

The weak-map project is dual-licensed under MIT and Apache 2.0 licenses. By contributing to this project, you agree that your contributions will be licensed under both licenses.

For details about the project's licenses, see the License Information page.

Technical Requirements Checklist

Before submitting your PR, ensure you've completed the following:

| Requirement | Description | Status |
| --- | --- | --- |
| Clippy Checks | cargo clippy --all-features --all-targets passes with no warnings | |
| Tests | All existing tests pass and new functionality has tests | |
| Documentation | Public APIs are documented with rustdoc comments | |
| CI Passing | All CI checks pass on GitHub | |
| Code Style | Code follows existing style and conventions | |
| Breaking Changes | Breaking changes are clearly documented | |

Sources: .github/workflows/ci.yml(L1 - L32) 

License Information

Relevant source files

This document details the licensing structure of the weak-map library, explaining the dual-licensing approach that allows users to choose between two open-source licenses when using or modifying the codebase.

Dual-Licensing Model

The weak-map library is dual-licensed, allowing users to choose between the MIT License and the Apache License 2.0. This is specified in the Cargo.toml configuration file:

license = "MIT OR Apache-2.0"

The "OR" operator indicates that users may select either license according to their preferences and requirements, without needing to comply with both.

Sources: Cargo.toml(L7) 

License Comparison

The following table compares key aspects of both licenses:

| Feature | MIT License | Apache License 2.0 |
| --- | --- | --- |
| License length | Brief (22 lines) | Comprehensive (200+ lines) |
| Patent protection | No explicit patent grant | Explicit patent grant (Section 3) |
| Trademark provisions | None | Explicit restrictions (Section 6) |
| Modification notices | Requires copyright notice preservation | Requires indicating significant changes (Section 4) |
| Contribution terms | Not specified | Explicitly addressed (Section 5) |
| Warranty disclaimer | Simple disclaimer | Detailed disclaimer (Section 7) |
| Liability limitation | Simple limitation | Detailed limitation (Section 8) |

Sources: LICENSE-MIT LICENSE-APACHE-2.0

License Files

The repository contains two license files:

  1. LICENSE-MIT: Contains the full text of the MIT License, dated 2025 with copyright attributed to Asakura Mizu.
  2. LICENSE-APACHE-2.0: Contains the full text of the Apache License 2.0, with copyright attributed to Asakura Mizu.

These files serve as the authoritative license texts for the project.

Sources: LICENSE-MIT LICENSE-APACHE-2.0

Dual-Licensing Structure

flowchart TD
A["weak-map library"]
B["Dual License Structure"]
C["MIT License"]
D["Apache License 2.0"]
E["LICENSE-MIT file"]
F["LICENSE-APACHE-2.0 file"]
G["Cargo.toml license declaration"]

A --> B
B --> C
B --> D
C --> E
D --> F
G --> B

Sources: Cargo.toml(L7)  LICENSE-MIT LICENSE-APACHE-2.0

Compliance Requirements

MIT License Compliance

To comply with the MIT License when using weak-map:

  1. Include the following copyright notice: "Copyright (c) 2025 Asakura Mizu"
  2. Include the complete MIT license text from the LICENSE-MIT file
  3. Include both items in all copies or substantial portions of the software

Apache License 2.0 Compliance

To comply with the Apache License 2.0 when using weak-map:

  1. Include the copyright notice: "Copyright 2025 Asakura Mizu"
  2. Include a complete copy of the Apache License 2.0
  3. For modified files, add notices stating that you changed the files
  4. Retain all copyright, patent, trademark, and attribution notices
  5. If the original contains a NOTICE file, include readable copy of attribution notices

Sources: LICENSE-MIT(L3 - L21)  LICENSE-APACHE-2.0(L89 - L201) 

License Selection Decision Flow

The following diagram provides guidance on selecting the appropriate license for your use case:

flowchart TD
A["License selection factors"]
B["Need patent protection?"]
C["Apache 2.0 preferred"]
D["Simple project needs?"]
E["MIT preferred"]
F["Need explicit contribution terms?"]
G["Either license acceptable"]
H["Use under Apache License 2.0"]
I["Use under MIT License"]
J["Choose based on ecosystem or preference"]

A --> B
B --> C
B --> D
C --> H
D --> E
D --> F
E --> I
F --> C
F --> G
G --> J

Sources: LICENSE-MIT LICENSE-APACHE-2.0 Cargo.toml(L7) 

Implications for Contributors

Contributors to the weak-map project should understand:

  1. Their contributions will be available under both licenses
  2. By submitting a contribution, they agree their work may be distributed under either license
  3. The project maintainers can relicense their contributions as needed within these two options
  4. Any separate agreements with the project maintainers take precedence

This follows standard practice for dual-licensed Rust projects, ensuring maximum flexibility for users of the library.

Sources: LICENSE-MIT LICENSE-APACHE-2.0 Cargo.toml(L7) 

License Coverage

The dual-licensing covers all components of the weak-map library, including:

  1. The core implementation files in src/
  2. Documentation and examples
  3. Build configurations and metadata

All these components can be used, modified, and distributed according to either license at the user's discretion.

Sources: Cargo.toml(L7) 

Project Metadata License Information

The licensing information is also reflected in the project metadata, which is important for users who install the library via Cargo. The specification in Cargo.toml ensures that the licensing information is properly included in the package registry (crates.io) and documentation (docs.rs).

[package]
name = "weak-map"
version = "0.1.0"
edition = "2024"
authors = ["Asakura Mizu <asakuramizu111@gmail.com>"]
description = "BTreeMap with weak references"
license = "MIT OR Apache-2.0"
repository = "https://github.com/Starry-OS/weak-map"
documentation = "https://docs.rs/weak-map"

This metadata ensures transparency about the licensing terms and helps users make informed decisions about incorporating the library into their projects.

Sources: Cargo.toml(L1 - L11) 

Overview

Relevant source files

AXNS (Resource Namespace System) is a Rust library providing a unified interface for managing and controlling access to system resources across different deployment scenarios. It enables configurable resource sharing and isolation between processes and threads in various operating system environments, from unikernels with shared resources to monolithic kernels or containerized environments requiring isolation.

For more detailed information about specific components, see Core Concepts and Thread-Local Features.

Purpose and Scope

AXNS addresses several key requirements for flexible resource management:

  • Unified Resource Access: Providing a consistent interface to system resources
  • Configurable Isolation: Supporting varying degrees of resource sharing between threads
  • Deployment Flexibility: Working effectively in different system architectures
  • Memory Safety: Ensuring proper resource initialization and cleanup
  • Type Safety: Providing strongly-typed access to resources

The system manages resources such as virtual address spaces, working directories, file descriptors, and other system facilities that might need to be shared or isolated.
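The sharing-versus-isolation distinction can be sketched with plain reference counting, independent of the AXNS API. All names below (ToyNamespace, shares_cwd, the cwd field) are illustrative, not part of the crate:

```rust
use std::sync::Arc;

/// A toy "namespace": a table of reference-counted resources.
#[derive(Clone)]
pub struct ToyNamespace {
    pub cwd: Arc<String>, // e.g. a working-directory resource
}

/// Two namespaces share a resource iff their pointers alias the same allocation.
pub fn shares_cwd(a: &ToyNamespace, b: &ToyNamespace) -> bool {
    Arc::ptr_eq(&a.cwd, &b.cwd)
}

fn main() {
    let global = ToyNamespace { cwd: Arc::new("/".into()) };

    // Shared: cloning the namespace clones the Arc, so both see one resource.
    let shared = global.clone();
    assert!(shares_cwd(&global, &shared));

    // Isolated: a fresh allocation gives an independent resource.
    let isolated = ToyNamespace { cwd: Arc::new("/".into()) };
    assert!(!shares_cwd(&global, &isolated));
}
```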

Sources: README.md(L5 - L14) 

Core Architecture

AXNS follows a modular design with several key architectural patterns:

flowchart TD
subgraph subGraph2["Access Patterns"]
    F["Global Namespace"]
    G["Shared resources"]
    H["Thread-local Namespaces"]
    I["Isolated resources"]
end
subgraph subGraph1["Namespace Management Layer"]
    C["Namespace"]
    D["ResArc references"]
    E["Resource instances"]
end
subgraph subGraph0["Resource Definition Layer"]
    A["def_resource! macro"]
    B["Static Resource"]
end
J["Unikernel Mode"]
K["Process/Container Mode"]

A --> B
B --> E
C --> D
D --> E
F --> G
G --> J
H --> I
I --> K

The architecture consists of these primary components:

| Component | Description | Role |
| --- | --- | --- |
| Namespace | Container for resources | Stores and provides access to system resources |
| Resource | Resource type metadata | Defines memory layout and lifecycle functions |
| ResWrapper | Static resource handle | Provides the public API for resource access |
| ResArc | Reference-counted pointer | Manages resource lifecycle and memory |
| def_resource! | Resource definition macro | Simplifies creation of new resource types |

Sources: src/lib.rs(L10 - L14) 

Component Relationships

classDiagram
class Namespace {
    +ptr: NonNull~ResArc~
    +new() Namespace
    +get(Resource) &ResArc
    +get_mut(Resource) &mut ResArc
}

class Resource {
    +layout: Layout
    +init: fn pointer
    +drop: fn pointer
    +index() usize
}

class ResWrapper~T~ {
    +res: &'static Resource
    +current() ResCurrent~T~
    +get(Namespace) &T
    +get_mut(Namespace) &mut T
    +share_from(dst, src)
    +reset(Namespace)
}

class ResArc~T~ {
    +ptr: NonNull~ResInner~
    +as_ref() &T
    +get_mut() Option~&mut T~
}

class CurrentNs {
    <<trait>>
    
    +new() Self
    +as_ref() &Namespace
}

Namespace "1" --> "*" ResArc : contains
ResArc "*" -->  Resource : references
ResWrapper "1" -->  Resource : describes
ResWrapper  ..> "1" Namespace : accesses through
CurrentNs  ..> "1" Namespace : provides context

Sources: src/lib.rs(L10 - L14)  src/lib.rs(L32 - L59) 

Resource Access Flow

Accessing resources in AXNS follows this pattern:

sequenceDiagram
    participant ApplicationCode as "Application Code"
    participant ResWrapperT as "ResWrapper<T>"
    participant Namespace as "Namespace"
    participant ResArcT as "ResArc<T>"

    ApplicationCode ->> ResWrapperT: Define with def_resource!
    ApplicationCode ->> Namespace: Create namespace
    ApplicationCode ->> ResWrapperT: resource.get(&namespace)
    ResWrapperT ->> Namespace: namespace.get(resource)
    Namespace ->> ResArcT: Get ResArc
    ResArcT -->> ApplicationCode: Return reference to T
    ApplicationCode ->> ResWrapperT: resource.get_mut(&mut namespace)
    ResWrapperT ->> Namespace: namespace.get_mut(resource)
    Namespace ->> ResArcT: Get mutable ResArc
    ResArcT -->> ApplicationCode: Return mutable reference to T
    ApplicationCode ->> ResWrapperT: resource.current()
    ResWrapperT ->> Namespace: Get current_ns()
    Note over Namespace: Uses thread-local or global NS
    Namespace -->> ApplicationCode: Access through current namespace

Sources: src/lib.rs(L16 - L59) 

Thread-Local Feature

AXNS provides an optional thread-local feature for fine-grained resource isolation:

stateDiagram-v2
state UseThreadLocalNS {
    [*] --> CheckTLS
    CheckTLS --> InitializeNew : First access
    CheckTLS --> UseExisting : Subsequent access
}
state AccessResources {
    [*] --> GetResource
    GetResource --> ModifyResource : get_mut()
    GetResource --> ShareResource : share_from()
    GetResource --> ResetResource : reset()
}
[*] --> FeatureCheck
FeatureCheck --> UseGlobalNS : thread-local OFF
FeatureCheck --> UseThreadLocalNS : thread-local ON
UseGlobalNS --> AccessResources
UseThreadLocalNS --> AccessResources

This feature is controlled by the thread-local feature flag in Cargo.toml:

[features]
thread-local = ["dep:extern-trait"]

When enabled, AXNS uses the CurrentNs trait to provide thread-local namespaces. When disabled, all access goes through the global namespace.
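This resolution order can be sketched with the standard library alone. The sketch below uses a thread_local! slot with a global fallback; AXNS itself routes this through the CurrentNs trait, and all names here are illustrative:

```rust
use std::cell::RefCell;

// Stand-in for the global namespace's value.
static GLOBAL_VALUE: i32 = 0;

// Stand-in for an optional per-thread namespace.
thread_local! {
    static LOCAL_VALUE: RefCell<Option<i32>> = RefCell::new(None);
}

/// Resolve a value the way current_ns() is described to:
/// prefer a thread-local namespace, fall back to the global one.
pub fn current_value() -> i32 {
    LOCAL_VALUE.with(|l| l.borrow().unwrap_or(GLOBAL_VALUE))
}

fn main() {
    assert_eq!(current_value(), 0); // nothing thread-local set: global wins
    LOCAL_VALUE.with(|l| *l.borrow_mut() = Some(42));
    assert_eq!(current_value(), 42); // thread-local override wins
    // Other threads are unaffected by this thread's override.
    let other = std::thread::spawn(current_value).join().unwrap();
    assert_eq!(other, 0);
}
```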

Sources: src/lib.rs(L32 - L59)  Cargo.toml(L14 - L15) 

Deployment Scenarios

AXNS supports various deployment models by adjusting namespace isolation:

flowchart TD
subgraph subGraph0["Deployment Models"]
    A["Unikernel"]
    D["Single Global Namespace"]
    B["Monolithic Kernel"]
    E["Per-Process Namespaces"]
    C["Container Environment"]
    F["Grouped Namespaces"]
end
G["Shared Resources"]
H["Process-Isolated Resources"]
I["Container-Isolated Resources"]

A --> D
B --> E
C --> F
D --> G
E --> H
F --> I
  1. Unikernel Mode: A single global namespace shared by all threads (default)
  2. Monolithic Kernel Mode: Each process has its own namespace, with threads in the same process sharing resources
  3. Container Mode: System resources grouped into namespaces that are shared between specific processes

Sources: README.md(L5 - L14) 

Summary

AXNS provides a flexible, efficient system for managing resource namespaces across different operating system environments. Its architecture balances the need for shared resources with isolation requirements, providing a consistent API regardless of the deployment scenario. The system's design ensures proper resource lifecycle management through reference counting, while the optional thread-local feature provides additional isolation when needed.

For practical guidance on using AXNS, see Usage Guide.

Sources: src/lib.rs(L1 - L59)  README.md(L1 - L14) 

Core Concepts

Relevant source files

This page explains the fundamental concepts of the AXNS namespace system, providing an overview of key components and their relationships. For detailed implementation information, see Namespaces, Resources and ResWrapper, and The def_resource! Macro.

System Purpose

AXNS (Axiomatic Namespace System) provides a unified interface for managing system resources in a structured, namespace-based approach. Its core purpose is to enable:

  1. Resource isolation or sharing between different parts of a system
  2. Consistent access to resources through well-defined interfaces
  3. Proper resource lifecycle management including initialization and cleanup
  4. Flexibility across different deployment scenarios from shared unikernel environments to containerized systems

Sources: src/lib.rs(L1 - L15) 

Key Components Overview

The following diagram illustrates the primary components of the AXNS system and their relationships:


Sources: src/res.rs(L11 - L15)  src/res.rs(L53 - L56)  src/res.rs(L107 - L119)  src/ns.rs(L7 - L10) 

Component Descriptions

Namespace

A Namespace is a collection of resources that can be managed as a unit. It serves as a container that holds references to various system resources and provides controlled access to them.

Key characteristics:

  • Contains an array of ResArc pointers (one for each defined resource)
  • Provides methods to access resources both immutably and mutably
  • Can be created explicitly or accessed implicitly via the current namespace
  • Manages the lifecycle of its contained resources

Sources: src/ns.rs(L7 - L10)  src/ns.rs(L22 - L36) 

Resource

A Resource represents system resource metadata including memory layout and lifecycle functions. Resources are defined statically and stored in a special program section called "axns_resources".

Key characteristics:

  • Contains memory layout information
  • Provides initialization and cleanup functions
  • Stored in a special section of the compiled program
  • Referenced by index in a namespace

Sources: src/res.rs(L11 - L15)  src/res.rs(L36 - L44) 

ResWrapper

ResWrapper<T> provides a type-safe interface to a specific resource. It acts as the primary API for interacting with resources across namespaces.

Key characteristics:

  • References a static Resource instance
  • Provides methods to access the resource in different namespaces
  • Enables resource sharing between namespaces
  • Allows resetting resources to their default values

Sources: src/res.rs(L53 - L56)  src/res.rs(L58 - L105) 

ResCurrent

ResCurrent<T> provides access to a resource in the "current" namespace, which might be a global namespace or a thread-local namespace depending on configuration.

Key characteristics:

  • References a static Resource instance
  • Contains a reference to the current namespace
  • Implements Deref for convenient access to the resource

Sources: src/res.rs(L107 - L119)  src/res.rs(L121 - L128) 

Resource Access Flow

The following diagram shows how resources are accessed within the AXNS system:

flowchart TD
subgraph subGraph1["Resource Management"]
    J["resource.reset(&mut ns)"]
    L["Reset to initial value"]
    K["resource.share_from(&mut dst, &src)"]
    M["Clone ResArc pointer"]
end
subgraph subGraph0["Resource Access Methods"]
    D["resource.get(&ns)"]
    G["Read resource data"]
    E["resource.get_mut(&mut ns)"]
    H["Modify resource data"]
    F["resource.current()"]
    I["Get resource from current namespace"]
end
A["Client Code"]
B["Define Static Resource"]
C["Create Namespace"]
N["Feature enabled?"]
O["Thread-local namespace"]
P["Global namespace"]

A --> B
A --> C
B --> D
B --> E
B --> F
B --> J
B --> K
C --> D
C --> E
D --> G
E --> H
F --> I
I --> N
J --> L
K --> M
N --> O
N --> P

Sources: src/res.rs(L70 - L76)  src/res.rs(L80 - L82)  src/res.rs(L90 - L92)  src/res.rs(L96 - L98)  src/res.rs(L102 - L104)  src/lib.rs(L54 - L59) 

Resource Definition

Resources are defined using the def_resource! macro, which creates both a static Resource instance and a corresponding ResWrapper<T> accessor:

flowchart TD
A["def_resource! macro"]
B["Static Resource"]
C["ResWrapper instance"]
D["Memory layout"]
E["Init function"]
F["Drop function"]
G["API to access resource in namespaces"]
H["Client code"]
I["Namespace"]
J["ResArc to resource data"]

A --> B
A --> C
B --> D
B --> E
B --> F
C --> B
C --> G
C --> I
H --> C
I --> J

Sources: src/res.rs(L144 - L168) 

Thread-Local vs. Global Namespace Behavior

AXNS supports both global and thread-local namespaces through a feature flag:

sequenceDiagram
    participant ClientCode as Client Code
    participant ResWrapper as ResWrapper
    participant ResCurrent as ResCurrent
    participant current_ns as current_ns()
    participant ThreadLocal as Thread-Local
    participant Global as Global

    ClientCode ->> ResWrapper: resource.current()
    ResWrapper ->> ResCurrent: create ResCurrent
    ResCurrent ->> current_ns: crate::current_ns()
    alt thread-local feature enabled
        current_ns ->> ThreadLocal: CurrentNsImpl::new()
        ThreadLocal -->> current_ns: Thread-local namespace
    else thread-local feature disabled
        current_ns ->> Global: global_ns()
        Global -->> current_ns: Global namespace
    end
    current_ns -->> ResCurrent: Return CurrentNsImpl
    ResCurrent -->> ResWrapper: Return ResCurrent<T>
    ResWrapper -->> ClientCode: Resource access via deref

Sources: src/lib.rs(L16 - L59)  src/res.rs(L70 - L76) 

Key Operational Patterns

The AXNS system revolves around several key operational patterns:

  1. Resource Definition: Static resources are defined using the def_resource! macro, which creates metadata and accessor objects.
  2. Namespace Creation: Namespaces can be created explicitly (Namespace::new()) or accessed implicitly via the current namespace.
  3. Resource Access: Resources can be accessed in four main ways:
  • resource.get(&ns): Immutable access in a specific namespace
  • resource.get_mut(&mut ns): Mutable access in a specific namespace (if not shared)
  • resource.current(): Access in the current namespace
  • Direct namespace access: ns.get(res) and ns.get_mut(res)
  4. Resource Sharing: Resources can be shared between namespaces using resource.share_from(&mut dst, &src).
  5. Resource Reset: Resources can be reset to their default values using resource.reset(&mut ns).

Sources: src/res.rs(L58 - L105)  src/ns.rs(L22 - L46) 

Memory Management Model

AXNS implements careful memory management to ensure resources are properly initialized and cleaned up:

flowchart TD
A["Namespace Creation"]
B["Array of ResArc pointers"]
C["Initialize ResArc"]
D["Resource Access"]
E["Get ResArc"]
F["Share safely"]
G["Resource Mutation"]
H["Shared?"]
I["Return None"]
J["Return mutable reference"]
K["Namespace Destruction"]
L["Decrement ref count"]
M["Call resource drop fn"]
N["Deallocate memory"]

A --> B
B --> C
D --> E
E --> F
G --> H
H --> I
H --> J
K --> L
L --> M
M --> N

Sources: src/ns.rs(L22 - L36)  src/ns.rs(L55 - L62) 

Summary

The core concepts of AXNS revolve around:

  1. Resources - System objects with defined memory layouts and lifecycle functions
  2. Namespaces - Collections of resources that can be managed as units
  3. Wrappers - Type-safe interfaces for accessing resources in namespaces
  4. Current Namespace - A concept that provides easy access to resources without explicit namespace references

These components work together to provide a flexible, memory-safe system for managing resources in various deployment environments, from shared global resources to fully isolated per-thread resources.

Sources: src/lib.rs(L1 - L15)  src/res.rs(L11 - L15)  src/res.rs(L53 - L56)  src/ns.rs(L7 - L10) 

Namespaces

Relevant source files

Purpose and Scope

This document provides a detailed explanation of the Namespace struct in the AXNS system, which serves as a container for resources. It covers the internal structure, creation, resource access methods, and memory management of namespaces. For information about the resources themselves and how they're wrapped, see Resources and ResWrapper. For details on thread-local namespace features, see Thread-Local Features.

Namespace Structure

In AXNS, a Namespace is a collection of resources, each accessed through a reference-counted pointer (ResArc). The Namespace struct is defined in src/ns.rs and consists of a single pointer field that points to an array of ResArc instances.

flowchart TD
subgraph subGraph0["Namespace Structure"]
    Namespace["Namespace {ptr: NonNull}"]
    ResArcArray["Array of ResArcs (size = Resources.len())"]
    ResArc1["ResArc[0]"]
    ResArc2["ResArc[1]"]
    ResArcN["ResArc[n-1]"]
    Resource1["Resource Data 1"]
    Resource2["Resource Data 2"]
    ResourceN["Resource Data n"]
end

Namespace --> ResArcArray
ResArc1 --> Resource1
ResArc2 --> Resource2
ResArcArray --> ResArc1
ResArcArray --> ResArc2
ResArcArray --> ResArcN
ResArcN --> ResourceN

Sources: src/ns.rs(L6 - L13) 

Namespace Creation and Initialization

When a new Namespace is created using Namespace::new(), it:

  1. Allocates memory for an array of ResArc instances (one for each resource in the system)
  2. Initializes each ResArc with its corresponding resource's default value
  3. Returns the constructed Namespace
sequenceDiagram
    participant CodecreatingNamespace as "Code creating Namespace"
    participant Namespacenew as "Namespace::new()"
    participant MemoryAllocator as "Memory Allocator"
    participant ResourcesCollection as "Resources Collection"

    CodecreatingNamespace ->> Namespacenew: Call new()
    Namespacenew ->> ResourcesCollection: Get resources count (Resources.len())
    ResourcesCollection -->> Namespacenew: Return count
    Namespacenew ->> MemoryAllocator: Allocate array of ResArc (size)
    MemoryAllocator -->> Namespacenew: Return allocated memory
    loop For each resource in Resources
        Namespacenew ->> ResourcesCollection: Get resource
        ResourcesCollection -->> Namespacenew: Return resource
        Namespacenew ->> Namespacenew: Initialize ResArc for resource
    end
    Namespacenew -->> CodecreatingNamespace: Return new Namespace

Sources: src/ns.rs(L16 - L36) 

Resource Access

The Namespace provides two primary methods for accessing resources:

  1. get(&self, res: &'static Resource) -> &ResArc: Returns a reference to the ResArc for a given resource.
  2. get_mut(&mut self, res: &'static Resource) -> &mut ResArc: Returns a mutable reference to the ResArc for a given resource.

Both methods use the resource's index (obtained via res.index()) to locate the corresponding ResArc in the array.

| Method | Description | Implementation |
| --- | --- | --- |
| get | Returns a reference to a resource's ResArc | Uses the resource's index to find the corresponding ResArc in the array |
| get_mut | Returns a mutable reference to a resource's ResArc | Uses the resource's index to find the corresponding ResArc in the array |
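The index-based lookup can be sketched as a namespace owning one slot per statically known resource. This is illustrative only (SlotTable is not an AXNS type; the real slots are ResArcs and the indices come from a link-time section):

```rust
/// Toy analogue of the described layout: a fixed array of slots,
/// addressed by a resource's index.
pub struct SlotTable {
    slots: Box<[u64]>,
}

impl SlotTable {
    /// One slot per defined resource, default-initialized.
    pub fn new(resource_count: usize) -> Self {
        Self { slots: vec![0; resource_count].into_boxed_slice() }
    }
    pub fn get(&self, index: usize) -> &u64 {
        &self.slots[index]
    }
    pub fn get_mut(&mut self, index: usize) -> &mut u64 {
        &mut self.slots[index]
    }
}

fn main() {
    let mut ns = SlotTable::new(3);
    *ns.get_mut(1) = 7;
    assert_eq!(*ns.get(1), 7);
    assert_eq!(*ns.get(0), 0); // untouched slots keep their default
}
```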

Sources: src/ns.rs(L38 - L46) 

Global and Thread-Local Namespaces

AXNS supports two namespace access patterns:

  1. Global Namespace: A singleton namespace accessible from anywhere via the global_ns() function
  2. Thread-Local Namespaces: When the "thread-local" feature is enabled, each thread can have its own namespace
flowchart TD
subgraph subGraph0["Namespace Resolution"]
    A["current_ns()"]
    B["thread-local feature?"]
    C["Thread-Local CurrentNsImpl"]
    D["Global CurrentNsImpl"]
    E["Thread's Namespace"]
    F["Global Namespace (from global_ns())"]
    G["Access Resources"]
end

A --> B
B --> C
B --> D
C --> E
D --> F
E --> G
F --> G

Sources: src/lib.rs(L16 - L59) 

Memory Management

The Namespace struct carefully manages memory for all its resources. When a Namespace is dropped:

  1. It calls drop_in_place() on the array of ResArc instances, which decrements the reference count for each resource
  2. It deallocates the memory used for the array itself

This ensures that resources are properly cleaned up when they're no longer needed.

Sources: src/ns.rs(L55 - L63) 

Namespace in the AXNS Architecture

The Namespace is a central component in the AXNS system, working closely with other components:

flowchart TD
subgraph subGraph0["AXNS Component Relationships"]
    Namespace["Namespace(Container for resources)"]
    ResArc["ResArc(Reference-counted resource)"]
    Resource["Resource(Resource metadata)"]
    ResWrapper["ResWrapper(Type-safe resource access)"]
    GlobalNs["global_ns()(Returns static Namespace)"]
    CurrentNs["current_ns()(Thread-local or global)"]
end

CurrentNs --> Namespace
GlobalNs --> Namespace
Namespace --> ResArc
ResArc --> Resource
ResWrapper --> Namespace
ResWrapper --> Resource

Sources: src/lib.rs(L10 - L14)  src/ns.rs(L1 - L4) 

Implementation Details

The Namespace implementation includes several important features:

  • Memory Efficiency: Uses a single pointer to an array rather than a standard Rust collection to minimize overhead
  • Safety Markers: Implements Send and Sync traits to indicate thread safety
  • Default Implementation: Provides a Default implementation that calls new()
  • Manual Memory Management: Performs explicit allocation and deallocation to maintain control over memory layout

API Summary

| Method | Description | Example Usage |
| --- | --- | --- |
| Namespace::new() | Creates a new Namespace with default values | let ns = Namespace::new(); |
| ns.get(resource) | Gets a reference to a resource | let r = ns.get(&MY_RESOURCE); |
| ns.get_mut(resource) | Gets a mutable reference to a resource | let r = ns.get_mut(&MY_RESOURCE); |
| global_ns() | Gets the global namespace | let ns = global_ns(); |
| current_ns() | Gets the current namespace (global or thread-local) | let ns = current_ns(); |

Sources: src/ns.rs(L15 - L63)  src/lib.rs(L16 - L59) 

Resources and ResWrapper

Relevant source files

This page documents the resource system in AXNS, focusing on the Resource struct and ResWrapper<T> container that provide typed access to resources in namespaces. For information about how resources are defined using macros, see The def_resource! Macro.

Resource System Overview

Resources in AXNS are statically defined objects that can be accessed through namespaces. The resource system provides:

  1. Type-safe access to resources
  2. Reference counting for memory management
  3. Sharing capabilities between namespaces
  4. Thread-local or global contexts depending on configuration

Sources: src/res.rs(L11 - L43)  src/res.rs(L53 - L105)  src/res.rs(L115 - L128)  src/arc.rs(L17 - L47)  src/arc.rs(L49 - L120) 

The Resource Struct

The Resource struct is the foundational building block of the resource system:

flowchart TD
subgraph subGraph1["Storage Mechanism"]
    E["Resources Collection"]
    F["Link-Time Section"]
    G["Resources::deref()"]
end
subgraph subGraph0["Resource Definition"]
    A["Resource Struct"]
    B["Memory Layout"]
    C["Init Function"]
    D["Drop Function"]
end

A --> B
A --> C
A --> D
A --> E
E --> F
F --> G

Sources: src/res.rs(L11 - L15)  src/res.rs(L17 - L44) 

The Resource struct is defined in src/res.rs and contains three key components:

  • layout: Specifies the memory layout for the resource type
  • init: A function pointer to initialize the resource
  • drop: A function pointer to clean up the resource when it's dropped

Resources are stored in a special link-time section named "axns_resources", which allows them to be accessed as a collection through the Resources struct. This implementation mimics the behavior of the linkme crate.

ResWrapper: Type-Safe Resource Access

The ResWrapper<T> struct provides a typed interface to access resources:

flowchart TD
A["Client Code"]
B["ResWrapper::get()"]
C["Namespace"]
D["ResArc"]
E["Resource Instance"]
F["ResWrapper::get_mut()"]
G["Can be mutated?"]
H["&mut Resource Instance"]
I["None"]
J["ResWrapper::current()"]
K["current_ns()"]
L["ResCurrent"]

A --> B
A --> F
A --> J
B --> C
C --> D
D --> E
F --> C
F --> G
G --> H
G --> I
J --> K
J --> L

Sources: src/res.rs(L53 - L105) 

ResWrapper<T> is a wrapper around a static Resource that provides type-safe access methods:

  • get: Obtains a reference to the resource in a given namespace
  • get_mut: Attempts to get a mutable reference (only if the resource isn't shared)
  • current: Creates a ResCurrent<T> that references the resource in the current namespace
  • share_from: Shares a resource from one namespace to another
  • reset: Resets a resource to its default value in a namespace
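The "mutable only if the resource isn't shared" rule that get_mut enforces is the same discipline std's Arc::get_mut implements, which makes for a minimal self-contained sketch:

```rust
use std::sync::Arc;

fn main() {
    let mut slot = Arc::new(5);
    // Sole owner: mutation is permitted.
    assert!(Arc::get_mut(&mut slot).is_some());

    // Shared with another handle: mutation is refused.
    let alias = Arc::clone(&slot);
    assert!(Arc::get_mut(&mut slot).is_none());

    // Unique again once the alias is gone.
    drop(alias);
    *Arc::get_mut(&mut slot).unwrap() = 6;
    assert_eq!(*slot, 6);
}
```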

ResCurrent: Accessing the Current Namespace

ResCurrent<T> provides a convenient way to access resources in the current namespace:

flowchart TD
A["ResCurrent<T>"]
B["Deref<Target=T>"]
C["res: &'static Resource"]
D["ns: CurrentNsImpl"]
E["Client Code"]
F["creates ResCurrent"]
G["auto-derefs to T"]
H["current_ns()"]
I["Thread-local or global namespace"]

A --> B
A --> C
A --> D
E --> F
E --> G
F --> H
H --> I

Sources: src/res.rs(L115 - L128) 

ResCurrent<T> implements Deref<Target = T>, which allows for transparent access to the underlying resource value. When you dereference a ResCurrent<T> (using * or accessing fields/methods), it automatically retrieves the resource from the current namespace.
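The Deref pattern described here can be shown with a stripped-down guard type (CurrentGuard is illustrative; ResCurrent additionally holds the resolved namespace):

```rust
use std::ops::Deref;

/// A guard that derefs to a value it resolved from some context.
struct CurrentGuard<'a, T> {
    value: &'a T,
}

impl<'a, T> Deref for CurrentGuard<'a, T> {
    type Target = T;
    fn deref(&self) -> &T {
        self.value
    }
}

fn main() {
    let backing = String::from("resource");
    let guard = CurrentGuard { value: &backing };
    // Method and field access passes through to the underlying value.
    assert_eq!(guard.len(), 8);
    assert_eq!(&*guard, "resource");
}
```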

Memory Management with ResArc

Resources are managed using a reference-counted smart pointer called ResArc:

flowchart TD
subgraph subGraph0["ResArc Structure"]
    A["ResArc"]
    B["NonNull<ResInner>"]
    C["ResInner"]
    D["res: &'static Resource"]
    E["strong: AtomicUsize"]
    F["Resource Data"]
end
subgraph subGraph1["Memory Layout"]
    G["Memory Block"]
    H["ResInner Header"]
    I["Resource Value"]
end
J["clone()"]
K["drop()"]
L["deallocate memory"]

A --> B
B --> C
C --> D
C --> E
C --> F
G --> H
H --> I
J --> E
K --> E
K --> L

Sources: src/arc.rs(L17 - L47)  src/arc.rs(L49 - L120) 

ResArc uses a pattern similar to Rust's standard Arc<T> to implement reference counting:

  1. When a new ResArc is created, it allocates memory for both the ResInner header and the resource data
  2. The ResInner contains a reference to the static Resource definition and an atomic reference counter
  3. Cloning an ResArc increments the reference counter
  4. Dropping an ResArc decrements the counter; when it reaches zero, the memory is deallocated

This system ensures that resources are properly cleaned up when they're no longer needed, while allowing efficient sharing between namespaces.
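The clone/drop lifecycle above can be demonstrated with std's Arc plus a drop probe (ResArc itself is a hand-rolled pointer; this sketch only shows the counting discipline, and Probe is an illustrative name):

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;

/// Records in a shared counter when its value is destroyed.
struct Probe(Arc<AtomicUsize>);

impl Drop for Probe {
    fn drop(&mut self) {
        self.0.fetch_add(1, Ordering::SeqCst);
    }
}

fn main() {
    let drops = Arc::new(AtomicUsize::new(0));
    let a = Arc::new(Probe(drops.clone()));
    let b = a.clone(); // clone increments the reference count
    assert_eq!(Arc::strong_count(&a), 2);

    drop(b); // count goes to 1; the value stays alive
    assert_eq!(drops.load(Ordering::SeqCst), 0);

    drop(a); // count reaches 0: the value is dropped
    assert_eq!(drops.load(Ordering::SeqCst), 1);
}
```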

Resource Sharing and Reset

ResWrapper<T> provides methods to share resources between namespaces and reset them to their default values:

sequenceDiagram
    participant ClientCode as "Client Code"
    participant SourceNamespace as "Source Namespace"
    participant DestNamespace as "Dest Namespace"
    participant ResWrapper as "ResWrapper"

    ClientCode ->> ResWrapper: share_from(dst, src)
    ResWrapper ->> SourceNamespace: get(res)
    SourceNamespace -->> ResWrapper: Return ResArc
    ResWrapper ->> ResWrapper: clone ResArc
    ResWrapper ->> DestNamespace: get_mut(res)
    ResWrapper ->> DestNamespace: Replace with cloned ResArc
    ClientCode ->> ResWrapper: reset(ns)
    ResWrapper ->> ResWrapper: ResArc::new(res)
    ResWrapper ->> DestNamespace: get_mut(res)
    ResWrapper ->> DestNamespace: Replace with new ResArc

Sources: src/res.rs(L94 - L104) 

  • share_from: This method clones the ResArc from the source namespace and assigns it to the destination namespace. This creates a shared reference to the same resource value.
  • reset: This method creates a new ResArc with the default value for the resource and assigns it to the namespace, effectively resetting it to its initial state.
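The two operations can be modelled with a namespace slot as an Arc (free functions and types here are illustrative, not the AXNS signatures):

```rust
use std::sync::Arc;

/// Like the described share_from: the destination slot aliases the source's value.
fn share_from(dst: &mut Arc<i32>, src: &Arc<i32>) {
    *dst = Arc::clone(src);
}

/// Like the described reset: the slot gets a fresh allocation of the default.
fn reset(slot: &mut Arc<i32>, default: i32) {
    *slot = Arc::new(default);
}

fn main() {
    let src = Arc::new(10);
    let mut dst = Arc::new(0);

    share_from(&mut dst, &src);
    assert!(Arc::ptr_eq(&dst, &src)); // shared: one allocation, two handles
    assert_eq!(*dst, 10);

    reset(&mut dst, 0);
    assert!(!Arc::ptr_eq(&dst, &src)); // independent again after reset
    assert_eq!(*dst, 0);
}
```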

Resource Collection System

Resources are stored in a special link-time section and accessed through the Resources collection:

flowchart TD
subgraph subGraph0["Link-Time Storage"]
    A["__start_axns_resources"]
    B["Resource Array"]
    C["__stop_axns_resources"]
end
D["Resources::deref()"]
E["Array Length"]
F["&[Resource]"]
G["Resource::index()"]
H["Resource Index"]
I["Namespace"]

A --> B
B --> C
D --> E
D --> F
G --> H
H --> I

Sources: src/res.rs(L17 - L44) 

The resource collection system uses link-time sections to create an array of Resource objects:

  1. Resources::deref() calculates the length of the resource array by finding the difference between the start and end addresses
  2. The Resource::index() method calculates the index of a resource in this array
  3. The index is used by the Namespace to store and retrieve resource instances

This approach allows for efficient storage and lookup of resources with minimal runtime overhead.
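The index calculation is plain pointer arithmetic over a contiguous array. A sketch using an ordinary static array instead of the link-time section (Resource and index_of here are toy stand-ins):

```rust
struct Resource {
    _id: u32,
}

// Stand-in for the contiguous "axns_resources" section.
static RESOURCES: [Resource; 3] =
    [Resource { _id: 0 }, Resource { _id: 1 }, Resource { _id: 2 }];

/// Index = byte offset from the array start, divided by the element size.
fn index_of(res: &'static Resource) -> usize {
    let start = RESOURCES.as_ptr() as usize;
    let this = res as *const Resource as usize;
    (this - start) / std::mem::size_of::<Resource>()
}

fn main() {
    assert_eq!(index_of(&RESOURCES[0]), 0);
    assert_eq!(index_of(&RESOURCES[2]), 2);
}
```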

Complete Resource Flow

The complete flow of defining and accessing resources involves several components:

flowchart TD
A["def_resource! macro"]
B["static Resource"]
C["static ResWrapper"]
D["Client Code"]
E["ns.get(res)"]
F["ResArc"]
G["&T"]
H["ns.get_mut(res)"]
I["&mut ResArc"]
J["Option<&mut T>"]
K["ResCurrent"]
L["&T"]

A --> B
A --> C
C --> E
C --> H
C --> K
D --> C
D --> K
E --> F
F --> G
H --> I
I --> J
K --> L

Sources: src/res.rs(L53 - L128)  src/res.rs(L144 - L168) 

This diagram illustrates the complete flow from resource definition to access:

  1. Resources are defined using the def_resource! macro
  2. The macro generates a static Resource object and a ResWrapper<T> to access it
  3. Client code can access the resource in multiple ways:
  • Using get() to get a reference in a specific namespace
  • Using get_mut() to get a mutable reference if possible
  • Using current() to access the resource in the current namespace

Summary

The Resource system in AXNS provides a flexible and type-safe way to define and access resources in different namespaces. Key components include:

| Component | Purpose |
| --- | --- |
| Resource | Defines the layout and lifecycle functions for a resource |
| ResWrapper | Provides type-safe access to resources in namespaces |
| ResCurrent | Enables access to resources in the current namespace |
| ResArc | Implements reference counting for resource management |

This system enables efficient sharing of resources between namespaces while maintaining memory safety and proper cleanup when resources are no longer needed.

The def_resource! Macro

Relevant source files

The def_resource! macro is a core component of the AXNS system that provides a declarative syntax for defining static resources that can be managed within namespaces. This page explains how the macro works, its syntax, and how it integrates with the rest of the AXNS resource namespace system.

For information about the general resource system and ResWrapper structure, see Resources and ResWrapper.

Purpose and Function

The def_resource! macro serves as the primary entry point for users to define resources that can be managed by the AXNS namespace system. It:

  1. Creates static resources with type safety and initialization logic
  2. Registers these resources in a global resource registry
  3. Generates wrapper objects that provide a consistent interface for resource access
  4. Handles proper memory layout and lifecycle management for resources

Sources: src/res.rs(L144 - L168) 

Macro Syntax and Usage

The def_resource! macro follows this syntax pattern:

def_resource! {
    /// Optional documentation
    [visibility] static RESOURCE_NAME: ResourceType = default_value;
    
    // Multiple resources can be defined in a single macro invocation
    [visibility] static ANOTHER_RESOURCE: AnotherType = another_default_value;
}

Key components:

  • Visibility modifier (pub, pub(crate), etc.) - controls access to the resource
  • Resource name - a static identifier for accessing the resource
  • Resource type - any valid Rust type
  • Default value - the initial value assigned to the resource in each namespace
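A much-simplified macro in the spirit of this syntax shows how `name: Type = default` can expand into a static with a typed accessor. This is a toy (def_toy_resource! and ToyWrapper are invented names); the real macro additionally emits Resource metadata into the "axns_resources" link section:

```rust
/// Wraps a default-value constructor for a resource type.
pub struct ToyWrapper<T: 'static> {
    default: fn() -> T,
}

impl<T> ToyWrapper<T> {
    /// Build a fresh instance with the declared default value.
    pub fn fresh(&self) -> T {
        (self.default)()
    }
}

macro_rules! def_toy_resource {
    ($(#[$meta:meta])* $vis:vis static $name:ident: $ty:ty = $default:expr;) => {
        $(#[$meta])*
        $vis static $name: ToyWrapper<$ty> = ToyWrapper { default: || $default };
    };
}

def_toy_resource! {
    /// A counter resource.
    pub static COUNTER: u32 = 7;
}

fn main() {
    assert_eq!(COUNTER.fresh(), 7);
}
```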

Sources: src/res.rs(L144 - L168)  tests/all.rs(L11 - L13)  tests/all.rs(L31 - L33) 

How the Macro Works

flowchart TD
subgraph subGraph0["For each resource definition"]
    B["Generate static Resource struct"]
    C["Place in 'axns_resources' section"]
    D["Create ResWrapper<T> static variable"]
    E["Expose resource access methods"]
end
A["def_resource! macro invocation"]
F["Resources collected in global registry"]
G["Resource accessible via namespaces"]

A --> B
A --> D
B --> C
C --> F
D --> E
E --> G

Sources: src/res.rs(L144 - L168)  src/res.rs(L17 - L34) 

When you use the def_resource! macro, it expands to create two key components for each defined resource:

  1. A static Resource structure that contains:
  • Memory layout information for the resource type
  • Initialization function that creates the default value
  • Drop function that handles cleanup when resources are deallocated
  2. A static ResWrapper<T> that provides methods to access the resource in different namespaces

The macro places each Resource structure in a special ELF section named "axns_resources", which allows the system to discover all resources at runtime without requiring explicit registration.

Sources: src/res.rs(L144 - L168)  src/res.rs(L10 - L15) 

Generated Code Structure


The expanded macro code generates:

  1. A static Resource instance with fixed layout and lifecycle functions
  2. A static ResWrapper<T> instance that wraps the resource
  3. Assertions to ensure the resource type isn't zero-sized

Sources: src/res.rs(L144 - L168)  src/res.rs(L53 - L56) 

Generated Code Example

When you write:

def_resource! {
    pub static COUNTER: AtomicUsize = AtomicUsize::new(0);
}

The macro expands to something equivalent to:

pub static COUNTER: ResWrapper<AtomicUsize> = {
    #[unsafe(link_section = "axns_resources")]
    static RES: Resource = Resource {
        layout: Layout::new::<AtomicUsize>(),
        init: |ptr| {
            let val: AtomicUsize = AtomicUsize::new(0);
            unsafe { ptr.cast().write(val) }
        },
        drop: |ptr| unsafe {
            ptr.cast::<AtomicUsize>().drop_in_place();
        },
    };

    assert!(RES.layout.size() != 0, "Resource has zero size");

    ResWrapper::new(&RES)
};

This generated code creates a static Resource structure with the proper memory layout and lifecycle functions for an AtomicUsize, then wraps it in a ResWrapper<AtomicUsize> that provides type-safe access methods.

Sources: src/res.rs(L144 - L168) 

Memory Layout and Resource Registry

flowchart TD
subgraph subGraph0["axns_resources section"]
    A["Resource #1"]
    B["Resource #2"]
    C["Resource #3"]
    D["..."]
end
E["__start_axns_resources"]
F["__stop_axns_resources"]
G["Resources struct"]
H["Slice of all resources"]
I["User code"]
J["Resources.as_ptr()"]

A --> B
B --> C
C --> D
D --> F
E --> A
G --> H
H --> E
H --> F
I --> J
J --> H

The def_resource! macro places all resources in a dedicated ELF section called "axns_resources". The AXNS system uses special linker symbols (__start_axns_resources and __stop_axns_resources) to locate the beginning and end of this section, allowing it to create a slice of all defined resources.

The Resources struct (implementing Deref<Target=[Resource]>) provides access to all registered resources as a contiguous array, which is used for resource indexing and lookup.

Sources: src/res.rs(L17 - L34)  src/res.rs(L36 - L44) 

Resource Access Methods

The ResWrapper<T> generated by the macro provides several methods to access the wrapped resource:

| Method | Purpose | Return Type |
| --- | --- | --- |
| get(ns) | Get immutable reference from namespace | &T |
| get_mut(ns) | Get mutable reference if not shared | Option<&mut T> |
| current() | Access resource in current namespace | ResCurrent |
| share_from(dst, src) | Share resource between namespaces | () |
| reset(ns) | Reset resource to default value | () |

These methods provide a complete interface for interacting with resources in both specific and current namespaces.

Sources: src/res.rs(L58 - L105) 

Usage Examples

Basic Resource Definition and Access

// Define resources
def_resource! {
    pub static CONFIG: Config = Config::default();
    pub static COUNTER: AtomicUsize = AtomicUsize::new(0);
}

// Create a namespace
let mut ns = Namespace::new();

// Access resources
let config = CONFIG.get(&ns);
let counter_val = COUNTER.get(&ns).load(Relaxed);

// Modify resources (if not shared)
if let Some(counter) = COUNTER.get_mut(&mut ns) {
    counter.store(42, Relaxed);
}

Sources: tests/all.rs(L11 - L24)  tests/all.rs(L31 - L38) 

Using the Current Namespace

The current() method provides convenient access to resources in the current namespace, which is determined by the CurrentNs implementation (globally shared or thread-local depending on feature flags):

// Access and modify resource in current namespace
let counter = COUNTER.current();
counter.fetch_add(1, Relaxed);

Sources: src/res.rs(L69 - L76)  tests/all.rs(L35 - L37) 

Resource Sharing and Resetting

// Create two namespaces
let mut ns1 = Namespace::new();
let mut ns2 = Namespace::new();

// Modify resource in ns1
if let Some(counter) = COUNTER.get_mut(&mut ns1) {
    counter.store(42, Relaxed);
}

// Share the resource from ns1 to ns2
COUNTER.share_from(&mut ns2, &ns1);

// Later, reset the resource in ns2 to its default value
COUNTER.reset(&mut ns2);

Sources: src/res.rs(L94 - L104)  tests/all.rs(L96 - L123) 

Thread-Local Considerations

When using the thread-local feature flag, resources defined with def_resource! can be accessed in a thread-local context, providing isolation between threads.

sequenceDiagram
    participant Thread1 as Thread 1
    participant Thread2 as Thread 2
    participant RESOURCE as RESOURCE

    Note over Thread1,Thread2: With thread-local feature enabled
    Thread1 ->> RESOURCE: RESOURCE.current()
    RESOURCE -->> Thread1: Access Thread 1's instance
    Thread2 ->> RESOURCE: RESOURCE.current()
    RESOURCE -->> Thread2: Access Thread 2's instance
    Note over Thread1,Thread2: Without thread-local feature
    Thread1 ->> RESOURCE: RESOURCE.current()
    RESOURCE -->> Thread1: Access global instance
    Thread2 ->> RESOURCE: RESOURCE.current()
    RESOURCE -->> Thread2: Access same global instance

The current() method on resources defined with def_resource! respects the thread-local configuration of the AXNS system, making it easier to write code that works in both shared and isolated resource modes.

Sources: src/res.rs(L69 - L76)  tests/all.rs(L40 - L159) 

Implementation Details

Resource Indexing

Each Resource defined by the macro has an index() method that computes its position in the global resource registry. This index is used by the Namespace to efficiently store and retrieve resources without requiring hash lookups.

Sources: src/res.rs(L36 - L44) 
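The pointer-offset idea can be sketched with plain std code. The Resource struct, RESOURCES array, and index_of function below are illustrative stand-ins, not the AXNS API:

```rust
use std::mem::size_of;

// Hypothetical stand-in for AXNS's `Resource` descriptor.
struct Resource {
    size: usize,
}

// A static array simulating the contiguous "axns_resources" section.
static RESOURCES: [Resource; 3] = [
    Resource { size: 8 },
    Resource { size: 16 },
    Resource { size: 4 },
];

// Compute a resource's index from its pointer offset into the registry,
// avoiding any hash lookup -- the same idea as an index() method.
fn index_of(res: &'static Resource) -> usize {
    let base = RESOURCES.as_ptr() as usize;
    (res as *const Resource as usize - base) / size_of::<Resource>()
}

fn main() {
    assert_eq!(index_of(&RESOURCES[0]), 0);
    assert_eq!(index_of(&RESOURCES[2]), 2);
    let total: usize = RESOURCES.iter().map(|r| r.size).sum();
    assert_eq!(total, 28);
}
```

Because all descriptors live in one contiguous slice, the subtraction and division are constant-time, which is why the namespace can store its per-resource state in a plain array indexed by this value.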

Zero-Sized Types

The macro includes an assertion to ensure the resource type isn't zero-sized, as this would cause issues with the memory management system:

assert!(RES.layout.size() != 0, "Resource has zero size");

This prevents potential errors when defining resources with types like () or empty structs.

Sources: src/res.rs(L162) 
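The condition the assertion guards against can be checked directly with std's Layout; this snippet only illustrates why a type like () would trip the check:

```rust
use std::alloc::Layout;

fn main() {
    // A zero-sized type such as `()` produces a zero-size layout, which is
    // exactly what the macro's assertion rejects.
    assert_eq!(Layout::new::<()>().size(), 0);
    struct Empty;
    assert_eq!(Layout::new::<Empty>().size(), 0);
    // Ordinary types pass the check.
    assert!(Layout::new::<u64>().size() != 0);
}
```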

Resource Initialization

The init function generated by the macro creates the default value and writes it to memory when a namespace is initialized. This ensures that every namespace starts with properly initialized resources.

Sources: src/res.rs(L153 - L156) 

Summary

The def_resource! macro is the foundation of the AXNS resource system, providing a clean, type-safe interface for defining resources that can be managed within namespaces. It generates the necessary code to handle resource lifecycle management, access control, and integration with the namespace system.

By placing all resources in a special ELF section and providing wrapper types with access methods, it enables efficient resource lookup and manipulation while maintaining a simple user interface.

Thread-Local Features

Relevant source files

Purpose and Scope

This page documents the thread-local namespace functionality in AXNS, which provides isolation of resources between threads. When enabled, this feature allows each thread to maintain its own separate namespace, as opposed to sharing a global namespace. For information about namespaces in general, see Namespaces; for information about resource lifecycle management, see Resource Lifecycle.

Sources: src/lib.rs(L20 - L59)  Cargo.toml(L14 - L15) 

Feature Flag Overview

Thread-local functionality is controlled by the thread-local feature flag in AXNS. When this feature is enabled, it adds the ability for each thread to have its own isolated namespace for resources.

flowchart TD
subgraph subGraph0["Implementation Details"]
    B["Thread-local Namespaces"]
    C["Global Namespace Only"]
    F["CurrentNs trait with extern_trait"]
    G["Simple CurrentNsImpl struct"]
end
A["Feature Check: thread-local"]
D["Each thread has isolated resources"]
E["All threads share the same resources"]

A --> B
A --> C
B --> D
B --> F
C --> E
C --> G

Thread-local Feature Control Flow

Sources: Cargo.toml(L14 - L15)  src/lib.rs(L35 - L59) 

Implementation Architecture

The thread-local feature implementation centers around the CurrentNs trait, which is only defined when the feature flag is enabled. This trait abstracts the retrieval of the current namespace for a thread.

With Thread-Local Feature Disabled

When the thread-local feature is disabled, all resource accesses use the global namespace. This is implemented through a simple CurrentNsImpl struct that returns the global namespace when asked for the current namespace.

classDiagram
class CurrentNsImpl {
    
    +as_ref() &Namespace
}

class current_ns {
    
    +() CurrentNsImpl
}

class global_ns {
    
    +() &'static Namespace
}

class Namespace {
    
    +new() Namespace
}

current_ns  -->  CurrentNsImpl : returns
CurrentNsImpl  -->  global_ns : calls
global_ns  -->  Namespace : returns static ref

Implementation without Thread-Local Feature

Sources: src/lib.rs(L44 - L59)  src/lib.rs(L17 - L25) 

With Thread-Local Feature Enabled

When the thread-local feature is enabled, the system uses the CurrentNs trait which can be implemented by user code to provide thread-local namespaces. The trait is marked as unsafe because implementations must ensure proper thread safety.

classDiagram
class CurrentNs {
    <<trait>>
    
    +new() Self
    +as_ref() &Namespace
}

class CurrentNsImpl {
    +impl CurrentNs
    
}

class current_ns {
    
    +() CurrentNsImpl
}

class external_implementation {
    user provided
    
}

current_ns  -->  CurrentNs : returns
CurrentNsImpl  --|>  CurrentNs : implements
external_implementation  --|>  CurrentNs : implements

Implementation with Thread-Local Feature

Sources: src/lib.rs(L27 - L42)  src/lib.rs(L54 - L59) 

Accessing Resources with Thread-Local Namespaces

When a resource is accessed using its current() method, the behavior differs based on whether the thread-local feature is enabled:

sequenceDiagram
    participant ClientCode as "Client Code"
    participant ResCurrentT as "ResCurrent<T>"
    participant CurrentNsImpl as "CurrentNsImpl"
    participant ThreadLocalStorage as "Thread Local Storage"
    participant GlobalNamespace as "Global Namespace"

    ClientCode ->> ResCurrentT: current()
    ResCurrentT ->> CurrentNsImpl: current_ns()
    alt thread-local feature enabled
        CurrentNsImpl ->> ThreadLocalStorage: Check if thread has namespace
    alt First access
        ThreadLocalStorage ->> ThreadLocalStorage: Initialize thread-local namespace
    end
    ThreadLocalStorage -->> CurrentNsImpl: Return thread's namespace
    else thread-local feature disabled
        CurrentNsImpl ->> GlobalNamespace: global_ns()
        GlobalNamespace -->> CurrentNsImpl: Return global static namespace
    end
    CurrentNsImpl -->> ResCurrentT: as_ref() -> &Namespace
    ResCurrentT -->> ClientCode: Return resource in current namespace

Resource Access Flow with Thread-Local Feature

Sources: src/lib.rs(L17 - L59)  tests/all.rs(L40 - L159) 

Implementation Example

The tests in the codebase provide a practical example of implementing thread-local namespaces. This implementation uses a thread-local Once value to store an Arc<RwLock<Namespace>>.

Example Thread-Local Implementation

classDiagram
class ThreadLocal {
static NS: Once~Arc~RwLock~Namespace~~~
    
}

class CurrentNsImpl {
    -Option~ guard
    +new() Self
    +as_ref() &Namespace
}

class CurrentNs {
    <<trait>>
    
    +new() Self
    +as_ref() &Namespace
}

CurrentNsImpl  --|>  CurrentNs : implements
CurrentNsImpl  -->  ThreadLocal : accesses

Thread-Local Implementation Example

Sources: tests/all.rs(L49 - L70) 

Resource Isolation Between Threads

When the thread-local feature is enabled, resources can be isolated between threads. Each thread can create its own namespace and modify resources independently of other threads.

flowchart TD
subgraph subGraph1["Thread 2"]
    D["Namespace 2"]
    E["Resource A: value=100"]
    F["Resource B: value=456"]
end
subgraph subGraph0["Thread 1"]
    A["Namespace 1"]
    B["Resource A: value=42"]
    C["Resource B: value=123"]
end
G["Thread-Local Storage"]
H["Resource Definitions"]
I["static DATA_A: Type = initial_value"]
J["static DATA_B: Type = initial_value"]

A --> B
A --> C
D --> E
D --> F
G --> A
G --> D
H --> I
H --> J
I --> B
I --> E
J --> C
J --> F

Thread Isolation with Thread-Local Namespaces

Sources: tests/all.rs(L40 - L159) 
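The isolation shown above can be mimicked with std's thread_local! macro alone; COUNTER here is a hypothetical per-thread slot standing in for a thread-local namespace, not an AXNS resource:

```rust
use std::cell::Cell;
use std::thread;

// Each thread sees its own COUNTER, initialized independently -- the same
// behavior thread-local namespaces give AXNS resources.
thread_local! {
    static COUNTER: Cell<u64> = Cell::new(0);
}

fn main() {
    COUNTER.with(|c| c.set(42));
    // A spawned thread gets a fresh, isolated instance.
    let seen = thread::spawn(|| COUNTER.with(|c| c.get()))
        .join()
        .unwrap();
    assert_eq!(seen, 0);
    // The main thread's value is untouched by the other thread.
    COUNTER.with(|c| assert_eq!(c.get(), 42));
}
```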

Resource Sharing Between Thread-Local Namespaces

Even with thread isolation, AXNS allows for controlled sharing of resources between namespaces using the share_from method. This creates a shared reference to the same resource instance across different namespaces.

sequenceDiagram
    participant Thread1 as "Thread 1"
    participant Namespace1 as "Namespace 1"
    participant Thread2 as "Thread 2"
    participant Namespace2 as "Namespace 2"
    participant Resource as "Resource"

    Thread1 ->> Namespace1: Create namespace
    Thread1 ->> Namespace1: Modify resource
    Thread2 ->> Namespace2: Create namespace
    Thread2 ->> Resource: share_from(&mut NS2, &NS1)
    Note over Namespace1,Namespace2: Both namespaces now reference the same resource instance
    Thread1 ->> Namespace1: Access resource
    Thread2 ->> Namespace2: Access resource (sees same value)

Resource Sharing Between Thread-Local Namespaces

Sources: tests/all.rs(L125 - L159) 

Thread-Local Features in Action

The following table summarizes key operations and their behavior with thread-local features enabled:

| Operation | With Thread-Local Enabled | With Thread-Local Disabled |
| --- | --- | --- |
| resource.current() | Returns resource from thread's namespace | Returns resource from global namespace |
| current_ns() | Returns thread-specific CurrentNsImpl | Returns global-referencing CurrentNsImpl |
| Resource Creation | Created in thread-local namespace | Created in global namespace |
| Resource Sharing | Must explicitly use share_from() | Automatically shared (single namespace) |
| Resource Reset | Only affects thread's namespace | Affects all threads |

Sources: src/lib.rs(L17 - L59)  tests/all.rs(L40 - L159) 

Best Practices for Thread-Local Features

When implementing thread-local namespaces, consider the following best practices:

  1. Initialization: Initialize thread-local namespaces lazily (on first access)
  2. Thread Safety: Ensure proper locking or synchronization when accessing the namespace
  3. Resource Management: Be mindful of resource lifecycle with thread-local namespaces
  4. Custom Implementation: Implement CurrentNs trait for your specific thread-local storage needs

flowchart TD
A["Define Resources"]
B["Implement CurrentNs trait"]
C["Thread-Local Storage Setup"]
D["Lazy Initialization"]
E["Resource Access Patterns"]
F["Thread-Local Access"]
G["Cross-Thread Sharing"]
H["Resource Cleanup"]
I["Reset Resources"]
J["Drop Thread-Local Storage"]

A --> B
B --> C
C --> D
E --> F
E --> G
H --> I
H --> J

Thread-Local Feature Usage Guide

Sources: tests/all.rs(L40 - L159)  src/lib.rs(L27 - L42) 

Implementation Considerations

When designing an implementation of thread-local namespaces, you should address:

  1. Storage Strategy: How thread-local namespaces are stored and retrieved
  2. Initialization Logic: When and how thread-local namespaces are created
  3. Default Behavior: What happens when a thread doesn't have a namespace
  4. Thread Cleanup: Ensuring resources are properly cleaned up when threads exit

The test implementation provides one approach using thread-local storage with Once and Arc<RwLock<Namespace>>, but other approaches may be more suitable depending on your specific requirements.

Sources: tests/all.rs(L49 - L70)  src/lib.rs(L27 - L42) 

Resource Lifecycle

Relevant source files

Purpose and Scope

This document explains in detail how resources are created, accessed, shared, and cleaned up throughout their lifecycle in the AXNS system. Understanding the resource lifecycle is crucial for effectively utilizing the namespace system and ensuring proper resource management.

For information about the specific implementation of reference counting used for resource management, see Resource Reference Counting.

Resource Creation Process

Definition and Initialization

Resources in AXNS begin their lifecycle when they are defined using the def_resource! macro. This macro generates static resource definitions along with their accessor wrappers.

flowchart TD
A["def_resource! macro"]
B["Static Resource object"]
C["Layout information"]
D["init function"]
E["drop function"]
F["def_resource! macro"]
G["ResWrapper instance"]

A --> B
B --> C
B --> D
B --> E
F --> G
G --> B

Sources: src/res.rs(L144 - L168) 

When a resource is defined using the def_resource! macro, several key components are created:

  1. A static Resource object with:
  • Memory layout information for the resource type via Layout::new::<T>()
  • An initialization function that creates the default value
  • A drop function that properly cleans up the resource when no longer needed
  2. A static ResWrapper<T> that provides the API for accessing this resource across namespaces

Namespace Resource Initialization

When a namespace is created, it initializes all resources with their default values.

flowchart TD
A["Namespace::new()"]
B["Memory block for all ResArcs"]
C["ResArc::new(res)"]
D["ResInner + Resource data"]
E["res.init function"]
F["strong count = 1"]

A --> B
A --> C
C --> D
C --> E
D --> F

Sources: src/ns.rs(L22 - L36)  src/arc.rs(L57 - L72) 

The initialization flow works as follows:

  1. Namespace::new() allocates a single contiguous memory block to hold all ResArc instances for every defined resource
  2. For each resource in the resources list:
  • It creates a new ResArc using ResArc::new(res)
  • ResArc::new allocates memory for both the ResInner structure and the resource data
  • The resource is initialized by calling its init function with the allocated memory
  • The reference count (strong count) is set to 1
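The init-function pattern can be sketched with a std-only stand-in for the Resource descriptor; the struct, its field names, and COUNTER_RES are illustrative, not the AXNS definitions:

```rust
use std::alloc::{alloc, dealloc, Layout};

// Simplified stand-in for AXNS's `Resource`: a layout plus init and drop
// function pointers.
struct Resource {
    layout: Layout,
    init: fn(*mut u8),
    drop_fn: fn(*mut u8),
}

static COUNTER_RES: Resource = Resource {
    layout: Layout::new::<u64>(),
    init: |ptr| unsafe { ptr.cast::<u64>().write(7) },
    drop_fn: |ptr| unsafe { ptr.cast::<u64>().drop_in_place() },
};

fn main() {
    unsafe {
        // Allocate storage for the resource and run its init function,
        // as a namespace does for each resource it holds.
        let ptr = alloc(COUNTER_RES.layout);
        (COUNTER_RES.init)(ptr);
        assert_eq!(*ptr.cast::<u64>(), 7);
        // On cleanup, run the drop function and free the block.
        (COUNTER_RES.drop_fn)(ptr);
        dealloc(ptr, COUNTER_RES.layout);
    }
}
```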

Resource Access Patterns

AXNS provides several ways to access resources, each designed for different use cases.

sequenceDiagram
    participant ClientCode as "Client Code"
    participant ResWrapperT as "ResWrapper<T>"
    participant Namespace as "Namespace"
    participant ResArc as "ResArc"
    participant ResourceData as "Resource Data"

    ClientCode ->> ResWrapperT: get(namespace)
    ResWrapperT ->> Namespace: ns.get(resource)
    Namespace ->> ResArc: Get ResArc reference
    ResArc -->> ClientCode: Immutable &T reference
    ClientCode ->> ResWrapperT: get_mut(namespace)
    ResWrapperT ->> Namespace: ns.get_mut(resource)
    Namespace ->> ResArc: Get mutable ResArc
    ResArc ->> ResArc: Check if strong count == 1
    alt Strong count == 1
        ResArc -->> ClientCode: Mutable &mut T reference
    else Resource is shared
        ResArc -->> ClientCode: None (cannot mutate shared resource)
    end
    ClientCode ->> ResWrapperT: current()
    ResWrapperT ->> ResWrapperT: Gets current namespace
    ResWrapperT -->> ClientCode: ResCurrent<T> for current namespace access

Sources: src/res.rs(L69 - L128)  src/arc.rs(L79 - L85) 

Access Methods

AXNS provides three primary methods for accessing resources:

  1. Immutable Access - ResWrapper::get(&Namespace) -> &T:
  • Always succeeds, providing read-only access to the resource
  • Safe to use regardless of whether the resource is shared
  2. Mutable Access - ResWrapper::get_mut(&mut Namespace) -> Option<&mut T>:
  • Only succeeds if the resource has a reference count of 1 (not shared)
  • Returns None if the resource is shared with other namespaces
  • Ensures memory safety by preventing concurrent mutation
  3. Current Namespace Access - ResWrapper::current() -> ResCurrent<T>:
  • Provides access to the resource in the current namespace
  • The current namespace is determined by the thread-local feature status
  • Returns a ResCurrent<T> that implements Deref for transparent access
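The mutable-access rule mirrors std's Arc::get_mut, which likewise refuses mutation while the value is shared. A quick std-only illustration:

```rust
use std::sync::Arc;

fn main() {
    let mut a = Arc::new(10u32);
    // Count is 1: exclusive, so mutable access succeeds.
    assert!(Arc::get_mut(&mut a).is_some());

    let b = Arc::clone(&a);
    // Shared: mutable access is refused, mirroring get_mut returning None.
    assert!(Arc::get_mut(&mut a).is_none());

    drop(b);
    // Exclusive again: mutation is allowed once more.
    *Arc::get_mut(&mut a).unwrap() = 42;
    assert_eq!(*a, 42);
}
```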

Resource Sharing Mechanism

AXNS allows resources to be shared between namespaces, which is useful for both memory efficiency and communication.

flowchart TD
A["Namespace A"]
B["ResArc A"]
C["Resource instance X"]
D["Not shared"]
E["Namespace B"]
F["ResArc B"]
G["Resource instance Y"]
H["Not shared"]
I["share_from(B, A)"]
J["ResArc A'"]
K["A.strong count = 2"]
L["B now points to X"]
M["If count=0, Y is freed"]

A --> B
B --> C
B --> D
E --> F
F --> G
F --> H
I --> J
I --> L
J --> K
L --> M

Sources: src/res.rs(L96 - L98)  src/arc.rs(L95 - L102) 

The sharing process works as follows:

  1. Initially, each namespace has its own independent resource instances
  2. When ResWrapper::share_from(dst, src) is called:
  • It gets the ResArc from the source namespace
  • Clones it, which increments the reference count
  • Replaces the destination namespace's existing ResArc with this clone
  • The destination's original resource may be freed if no other references exist

This creates a situation where multiple namespaces point to the same underlying resource data, with reference counting ensuring it remains alive until all namespaces are done with it.
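The effect on reference counts can be observed with std's Arc, used here as a stand-in for ResArc; the ns1/ns2 variables model the two namespaces' slots:

```rust
use std::sync::Arc;

fn main() {
    // Two "namespaces" each start with an independent instance.
    let ns1 = Arc::new(42u32);
    let mut ns2 = Arc::new(0u32);
    assert_eq!(Arc::strong_count(&ns1), 1);

    // share_from: overwrite ns2's handle with a clone of ns1's. The clone
    // bumps the count; ns2's original instance is dropped on assignment.
    ns2 = Arc::clone(&ns1);
    assert_eq!(Arc::strong_count(&ns1), 2);
    assert_eq!(*ns2, 42);
}
```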

Resource Cleanup Process

Resources are automatically cleaned up when they are no longer needed, which happens when their reference count reaches zero.

flowchart TD
A["Namespace dropped"]
B["ResArc::drop() called"]
C["ResArc::drop()"]
D["strong.fetch_sub(1)"]
E["Last reference gone"]
F["Memory barrier"]
G["Clean up resource data"]
H["Free memory"]
I["Other references exist"]
J["Resource stays alive"]

A --> B
C --> D
D --> E
D --> I
E --> F
F --> G
G --> H
I --> J

Sources: src/arc.rs(L104 - L120)  src/ns.rs(L55 - L63) 

The cleanup process has several stages:

  1. When a Namespace is dropped:
  • It drops all its ResArc instances in its destructor
  • The memory for the namespace's ResArc array is deallocated
  2. When a ResArc is dropped:
  • Its reference count is atomically decremented
  • If the result is 1 (meaning this was the last reference):
  • A memory fence is executed for proper synchronization
  • The resource's drop function is called to clean up the resource data
  • The memory for the ResInner and resource data is deallocated
  • If the count remains positive, the resource stays alive for other references

Resource Reinitialization

AXNS allows resources to be reset to their default values using the reset method.

flowchart TD
A["ResWrapper::reset(&mut ns)"]
B["New ResArc"]
C["Fresh resource instance"]
D["*ns.get_mut(res) = new ResArc"]
E["Decrement reference count"]
F["Clean up old resource"]

A --> B
A --> D
B --> C
D --> E
E --> F

Sources: src/res.rs(L100 - L104) 

The reset process works as follows:

  1. A new ResArc is created with a fresh instance of the resource
  2. The namespace's existing ResArc is replaced with this new one
  3. The reference count of the old ResArc is decremented
  4. If the old reference count reaches 0, that resource instance is cleaned up

This provides a way to return resources to their initial state without affecting other namespaces that might be sharing the previous instance.
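The count bookkeeping during a reset can again be modeled with std's Arc as a stand-in for ResArc:

```rust
use std::sync::Arc;

fn main() {
    let ns1 = Arc::new(42u32);
    let mut ns2 = Arc::clone(&ns1); // ns2 shares ns1's instance
    assert_eq!(Arc::strong_count(&ns1), 2);

    // reset: give ns2 a fresh default instance. The old handle is dropped,
    // decrementing the count; ns1's instance survives because ns1 still
    // references it.
    ns2 = Arc::new(0u32);
    assert_eq!(Arc::strong_count(&ns1), 1);
    assert_eq!(*ns2, 0);
    assert_eq!(*ns1, 42);
}
```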

Memory Layout and Management

Understanding the memory layout of resources is important for comprehending the complete lifecycle.

flowchart TD
A["Memory Layout"]
B["ResInner structure"]
C["res: &'static Resource"]
D["strong: AtomicUsize"]
E["Resource Data"]
F["Memory Management"]
G["Allocation"]
H["Single contiguous block"]
I["ResInner at start"]
J["Resource data after offset"]
K["Access"]
L["Calculate offset"]
M["body() returns pointer to data"]
N["Deallocation"]
O["One operation frees both"]
P["ResInner and resource data"]

A --> B
A --> E
B --> C
B --> D
F --> G
F --> K
F --> N
G --> H
H --> I
H --> J
K --> L
L --> M
N --> O
O --> P

Sources: src/arc.rs(L17 - L47)  src/arc.rs(L23 - L27) 

The resource memory system works as follows:

  1. Memory Layout: Each resource allocation consists of:
  • A ResInner structure containing the metadata and reference count
  • The actual resource data, placed after the ResInner at a calculated offset
  2. Memory Allocation:
  • A single contiguous memory block is allocated for both the ResInner and resource data
  • The layout is calculated using Layout::new::<ResInner>().extend(body)
  • This approach minimizes allocations and improves memory locality
  3. Memory Access:
  • The body() method calculates the offset to the resource data
  • This provides direct access to the data portion without extra indirection
  4. Memory Deallocation:
  • When the reference count reaches 0, the entire memory block is deallocated
  • The resource's drop function is called first to clean up any internal resources

This approach minimizes allocations while providing safe, efficient memory management for resources.
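The offset calculation can be reproduced with std's Layout::extend; the ResInner struct below is a simplified stand-in with a pointer-sized placeholder in place of the &'static Resource field:

```rust
use std::alloc::Layout;
use std::mem::size_of;
use std::sync::atomic::AtomicUsize;

// Simplified stand-in for the metadata header.
struct ResInner {
    _res: usize,
    _strong: AtomicUsize,
}

fn main() {
    // Header layout extended by the resource body, mirroring
    // Layout::new::<ResInner>().extend(body): the returned offset is where
    // the resource data starts inside the single block.
    let body = Layout::new::<u64>();
    let (combined, offset) = Layout::new::<ResInner>().extend(body).unwrap();
    // The data starts at or after the end of the header...
    assert!(offset >= size_of::<ResInner>());
    // ...and the combined block covers header plus data.
    assert_eq!(combined.size(), offset + body.size());
}
```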

Reference Counting Safeguards

AXNS implements several safeguards to ensure reference counting works correctly:

  1. Overflow Prevention: The reference counter is checked against MAX_REFCOUNT to prevent overflow
  2. Safe Mutation: Mutable access is only allowed when a resource has a reference count of 1
  3. Atomic Operations: All reference count operations use atomic operations with appropriate ordering
  4. Memory Fences: Proper memory barriers ensure visibility across threads

These safeguards ensure that the resource lifecycle is managed correctly and safely, preventing memory leaks and use-after-free errors.

Resource Reference Counting

Relevant source files

Purpose and Scope

This document details how AXNS implements reference counting for resources through the ResArc type, which provides memory management and safe resource sharing between namespaces. This page covers the internal memory layout of resources, the reference counting mechanism, and the resource lifecycle from allocation to deallocation.

For information about the broader resource lifecycle, see Resource Lifecycle. For details on how resources are defined, see Resources and ResWrapper.

Memory Layout

The AXNS resource system uses a custom memory layout that combines metadata with the actual resource data in a contiguous memory block.

Resource Memory Structure

flowchart TD
subgraph subGraph1["ResInner Structure"]
    C["res: &'static Resource"]
    D["strong: AtomicUsize"]
end
subgraph subGraph0["Memory Block"]
    A["ResInner (Metadata)"]
    B["Resource Data"]
end

A --> B
A --> C
A --> D

The memory layout consists of two key sections:

  1. Metadata Section (ResInner): Contains:
  • A reference to the static resource descriptor
  • An atomic counter for tracking references
  2. Resource Data Section: Contains the actual data of the resource, with its layout defined during resource creation

Sources: src/arc.rs(L17 - L21)  src/arc.rs(L23 - L27) 

Memory Allocation Process

sequenceDiagram
    participant ClientCode as "Client Code"
    participant ResArcnew as "ResArc::new"
    participant MemoryAllocator as "Memory Allocator"
    participant ResInner as "ResInner"

    ClientCode ->> ResArcnew: "new(&'static Resource)"
    ResArcnew ->> ResArcnew: "Calculate layout requirements"
    ResArcnew ->> MemoryAllocator: "alloc(layout)"
    MemoryAllocator -->> ResArcnew: "raw memory pointer"
    ResArcnew ->> ResInner: "Initialize metadata"
    ResArcnew ->> ResInner: "Initialize resource data (res.init)"
    ResArcnew -->> ClientCode: "Return ResArc instance"

When a resource is created, ResArc::new:

  1. Calculates the combined layout of the metadata and resource data
  2. Allocates a single memory block
  3. Initializes the metadata section with a reference count of 1
  4. Calls the resource's init function to initialize the data section

Sources: src/arc.rs(L57 - L72) 

Reference Counting Implementation

The ResArc type implements a custom atomic reference counting mechanism to track resource usage.

ResArc Architecture

classDiagram
class ResArc {
    +NonNull~ResInner~ ptr
    +new(Resource) Self
    +get_mut() Option~&mut T~
    +as_ref() &T
    +clone() Self
}

class ResInner {
    +&'static Resource res
    +AtomicUsize strong
    +body() NonNull~() ~
}

ResArc  -->  ResInner : references

The reference counting is implemented through the AtomicUsize strong field in ResInner, which ensures thread-safe operations on the reference count.

Sources: src/arc.rs(L49 - L51)  src/arc.rs(L18 - L21) 

Reference Counter Operations

flowchart TD
A["Start"]
B["ResArc::clone()"]
C["Atomically increment strong count"]
D["Count > MAX_REFCOUNT?"]
E["Panic: Counter overflow"]
F["Return new ResArc with same ptr"]
G["ResArc::drop()"]
H["Atomically decrement strong count"]
I["Count == 0?"]
J["Return - Resource still in use"]
K["Call resource drop function"]
L["Deallocate memory"]

A --> B
A --> G
B --> C
C --> D
D --> E
D --> F
G --> H
H --> I
I --> J
I --> K
K --> L

Key aspects of the reference counting implementation:

  1. Incrementing: When clone() is called, the strong count is atomically incremented with a relaxed ordering
  2. Overflow Protection: Checks ensure the counter doesn't exceed MAX_REFCOUNT (isize::MAX)
  3. Decrementing: When drop() is called, the strong count is atomically decremented
  4. Cleanup: When the count reaches zero, the resource's drop function is called and memory is deallocated

Sources: src/arc.rs(L95 - L102)  src/arc.rs(L104 - L120) 

Thread Safety

ResArc implements both Send and Sync traits, allowing it to be safely shared between threads. The atomic operations ensure that reference counting works correctly in a multi-threaded environment.

flowchart TD
subgraph subGraph1["Ordering Used"]
    E["Relaxed: For increment"]
    F["Release: For decrement"]
    G["Acquire: For get_mut and drop fence"]
end
subgraph subGraph0["Thread Safety Mechanisms"]
    A["impl Send for ResArc"]
    B["impl Sync for ResArc"]
    C["AtomicUsize for reference counting"]
    D["Memory ordering guarantees"]
end

C --> E
C --> F
C --> G

The implementation uses specific memory orderings to balance performance and correctness:

  • Relaxed ordering for incrementing the counter (lighter weight)
  • Release ordering when decrementing to ensure visibility of all previous operations
  • Acquire fence after the last reference is dropped to ensure all operations complete before deallocation

Sources: src/arc.rs(L53 - L54)  src/arc.rs(L5 - L9)  src/arc.rs(L95 - L102)  src/arc.rs(L104 - L120) 
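These orderings can be sketched in isolation with a bare AtomicUsize; clone_ref and drop_ref are illustrative helpers following the discipline described above, not the actual ResArc methods:

```rust
use std::sync::atomic::{fence, AtomicUsize, Ordering::{Acquire, Relaxed, Release}};

const MAX_REFCOUNT: usize = isize::MAX as usize;

// Relaxed increment: no synchronization is needed to take a new reference.
fn clone_ref(strong: &AtomicUsize) {
    let old = strong.fetch_add(1, Relaxed);
    // Overflow guard comparable to the MAX_REFCOUNT check.
    assert!(old <= MAX_REFCOUNT, "reference counter overflow");
}

// Release decrement, Acquire fence before cleanup.
fn drop_ref(strong: &AtomicUsize) -> bool {
    if strong.fetch_sub(1, Release) == 1 {
        // Last reference: synchronize with every earlier decrement before
        // the caller drops the resource data and frees the block.
        fence(Acquire);
        return true; // caller should deallocate now
    }
    false
}

fn main() {
    let strong = AtomicUsize::new(1);
    clone_ref(&strong);          // count: 2
    assert!(!drop_ref(&strong)); // count: 1, resource stays alive
    assert!(drop_ref(&strong));  // count: 0, time to clean up
}
```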

Resource Access Patterns

Accessing and Mutating Resources

flowchart TD
A["Start"]
B["ResArc::as_ref()"]
C["Access resource data immutably"]
D["ResArc::get_mut()"]
E["strong count == 1?"]
F["Return mutable reference to resource data"]
G["Return None - Resource is shared"]

A --> B
A --> D
B --> C
D --> E
E --> F
E --> G

ResArc provides two primary access patterns:

  1. Immutable access (as_ref()): Always available, returns a reference to the resource data
  2. Mutable access (get_mut()): Only available when the reference count is exactly 1, ensuring exclusive access

Sources: src/arc.rs(L79 - L85)  src/arc.rs(L88 - L93) 

Integration with Namespace System

The reference counting mechanism integrates with the namespace system to enable safe resource sharing between namespaces.

sequenceDiagram
    participant ClientCode as "Client Code"
    participant ResWrapper as "ResWrapper"
    participant SourceNamespace as "Source Namespace"
    participant DestinationNamespace as "Destination Namespace"
    participant ResArc as "ResArc"

    ClientCode ->> ResWrapper: "share_from(dst, src)"
    ResWrapper ->> SourceNamespace: "get(resource)"
    SourceNamespace -->> ResWrapper: "ResArc reference"
    ResWrapper ->> ResArc: "clone()"
    ResArc -->> ResWrapper: "New ResArc with incremented count"
    ResWrapper ->> DestinationNamespace: "Update with cloned ResArc"
    Note over SourceNamespace,DestinationNamespace: Both namespaces now share the same resource instance

When a resource is shared between namespaces:

  1. The share_from method obtains a reference to the resource in the source namespace
  2. This reference is cloned, incrementing the strong count
  3. The destination namespace's reference is replaced with the cloned reference
  4. Both namespaces now point to the same underlying resource data

Sources: src/res.rs(L94 - L98) 

Resource Reset and Memory Management

flowchart TD
A["Start"]
B["ResWrapper::reset()"]
C["Get mutable reference to ResArc in namespace"]
D["Create new ResArc instance"]
E["Replace old ResArc with new one"]
F["Old ResArc is dropped"]
G["Was this the last reference?"]
H["Resource memory is deallocated"]
I["Resource continues to exist for other namespaces"]

A --> B
B --> C
C --> D
D --> E
E --> F
F --> G
G --> H
G --> I

When a resource is reset in a namespace:

  1. A new ResArc instance is created with a fresh copy of the resource
  2. The old ResArc in the namespace is replaced and dropped
  3. If this was the last reference to the old resource, its memory is deallocated
  4. Otherwise, the resource continues to exist for other namespaces that share it

Sources: src/res.rs(L100 - L104) 

Technical Limitations and Safeguards

The ResArc implementation includes several important safeguards:

  1. Reference Count Maximum: Limited to isize::MAX to prevent overflow
  2. Mutable Access Safety: Mutable access is only granted when the reference count is exactly 1
  3. Memory Layout Handling: Carefully manages the layout and offset calculations for resource data
  4. Drop Sequence: Ensures proper ordering of deallocation operations

Sources: src/arc.rs(L14 - L15)  src/arc.rs(L79 - L85) 

Summary

The resource reference counting system in AXNS provides:

  1. Memory Safety: Ensuring resources are only deallocated when all references are dropped
  2. Thread Safety: Allowing resources to be safely shared between threads
  3. Efficient Sharing: Enabling namespaces to share resources without unnecessary duplication
  4. Controlled Mutability: Preventing data races by restricting mutable access to unshared resources

This system forms the foundation for the resource lifecycle management in AXNS, balancing safety, performance, and flexibility.

Usage Guide

Relevant source files

This guide provides practical instructions for using the AXNS library to manage resources across namespaces. It covers how to define resources, create namespaces, access and modify resources, and advanced operations like sharing and resetting resources. For conceptual understanding of the AXNS architecture, see Core Concepts, and for deeper details on resource lifecycle, see Resource Lifecycle.

Defining Resources

The first step to using AXNS is defining your resources using the def_resource! macro. This macro creates static resource instances with proper initialization and cleanup.

def_resource! {
    /// A simple integer resource
    pub static COUNTER: i32 = 0;
    
    /// A more complex resource with custom type
    pub static CONFIG: Configuration = Configuration::default();
}

The macro creates a static ResWrapper<T> for each resource, providing methods to access and manipulate the resource within namespaces.

Sources: src/res.rs(L144 - L168) 

Creating and Managing Namespaces

A namespace is a container for resource instances. Create a new namespace using the Namespace::new() constructor:

let mut my_namespace = Namespace::new();

When using the thread-local feature, you can also access the current namespace through resource wrappers:

let counter_ref = COUNTER.current(); // Gets the resource in the current namespace

Sources: src/lib.rs(L16 - L59) 

flowchart TD
A["Application Code"]
B["Create Namespace"]
C["Namespace::new()"]
D["Define Resources"]
E["def_resource! macro"]
F["Access Resources"]
G["resource.get(&ns)"]
H["resource.get_mut(&mut ns)"]
I["resource.current()"]

A --> B
A --> D
A --> F
B --> C
D --> E
F --> G
F --> H
F --> I

Basic Resource Access

AXNS provides several methods to access resources within namespaces:

Read-only Access

To get a reference to a resource in a namespace:

// Get reference to the resource in the specific namespace
let counter = COUNTER.get(&my_namespace);

Mutable Access

To modify a resource in a namespace:

// Get mutable reference if the resource isn't shared
if let Some(counter) = COUNTER.get_mut(&mut my_namespace) {
    *counter += 1;
}

Note that get_mut() returns Option<&mut T> because it will return None if the resource is shared with other namespaces.

Current Namespace Access

When the thread-local feature is enabled, you can access resources in the current namespace:

// Access resource in current namespace
let current_value = COUNTER.current();

Sources: src/res.rs(L69 - L92)  tests/all.rs(L15 - L25) 

sequenceDiagram
    participant Application as "Application"
    participant ResWrapperT as "ResWrapper<T>"
    participant Namespace as "Namespace"
    participant ResArcT as "ResArc<T>"

    Application ->> ResWrapperT: get(&namespace)
    ResWrapperT ->> Namespace: ns.get(self.res)
    Namespace ->> ResArcT: Returns ResArc<T>
    ResArcT ->> Application: Returns &T
    Application ->> ResWrapperT: get_mut(&mut namespace)
    ResWrapperT ->> Namespace: ns.get_mut(self.res)
    Namespace ->> ResArcT: Returns ResArc<T>
    ResArcT -->> Application: Returns Option<&mut T>

Advanced Operations

Sharing Resources Between Namespaces

To share a resource from one namespace to another:

let src_namespace = Namespace::new();
let mut dst_namespace = Namespace::new();

// Share the COUNTER resource from src to dst
COUNTER.share_from(&mut dst_namespace, &src_namespace);

This creates a shared reference to the same resource instance in both namespaces. Note that once shared, you won't be able to get mutable access to the resource in either namespace via get_mut().

Resetting Resources

To reset a resource in a namespace to its default value:

// Reset the COUNTER resource to its default value
COUNTER.reset(&mut my_namespace);

This discards the current resource instance and creates a new one with the default value specified in the def_resource! macro.

Sources: src/res.rs(L94 - L104)  tests/all.rs(L96 - L123) 

Using Thread-Local Namespaces

When the thread-local feature is enabled, AXNS provides thread-local namespaces for resource isolation:

// With thread-local feature enabled:
let counter = COUNTER.current(); // Gets from thread-local namespace

// You can implement the CurrentNs trait to define how thread-local
// namespaces are managed

The thread-local feature is particularly useful in multi-threaded applications where you want to isolate resources between threads.

Sources: src/lib.rs(L35 - L42)  tests/all.rs(L40 - L159) 
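The per-thread isolation that the thread-local feature builds on can be sketched with the standard library's `thread_local!` macro (an illustrative analogue, not the AXNS implementation):

```rust
use std::cell::Cell;
use std::thread;

// Each thread observes its own copy of a thread-local value,
// analogous to per-thread namespaces.
thread_local! {
    static COUNTER: Cell<i32> = Cell::new(0);
}

fn read_counter() -> i32 {
    COUNTER.with(|c| c.get())
}

fn main() {
    COUNTER.with(|c| c.set(10));
    // A spawned thread starts from the default, isolated from main.
    let other = thread::spawn(read_counter).join().unwrap();
    assert_eq!(other, 0);
    assert_eq!(read_counter(), 10);
}
```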

flowchart TD
subgraph subGraph2["Resource Access Patterns"]
    A["Application Code"]
    F["Feature Check"]
    G["Current Thread?"]
    subgraph subGraph1["Feature: thread-local ON"]
        C["Thread 1 Namespace"]
        D["Thread 2 Namespace"]
        E["Thread 3 Namespace"]
    end
    subgraph subGraph0["Feature: thread-local OFF"]
        B["Global Namespace"]
    end
end

A --> F
F --> B
F --> G
G --> C
G --> D
G --> E

Working with Custom Resource Types

AXNS can work with any type as a resource, including custom structs, atomics, and reference-counted types:

#![allow(unused)]
fn main() {
use std::sync::atomic::{AtomicUsize, Ordering};

struct Configuration {
    name: String,
    enabled: bool,
}

impl Configuration {
    fn default() -> Self {
        Self {
            name: "default".to_string(),
            enabled: false,
        }
    }
}

def_resource! {
    // Integer resource
    static COUNTER: i32 = 0;
    
    // Atomic for thread-safe access
    static ATOMIC_COUNTER: AtomicUsize = AtomicUsize::new(0);
    
    // Custom struct
    static CONFIG: Configuration = Configuration::default();
}

// Working with atomic resources
ATOMIC_COUNTER.current().fetch_add(1, Ordering::SeqCst);
}

Sources: tests/all.rs(L1 - L38) 

Best Practices

  1. Use appropriate resource types: For shared resources that need concurrent access, consider atomic types or other thread-safe structures.
  2. Minimize resource sharing: While AXNS makes it easy to share resources between namespaces, excessive sharing can reduce the benefits of namespace isolation.
  3. Reset when done: Reset resources when they're no longer needed to free up memory.
  4. Implement CurrentNs carefully: If using the thread-local feature, ensure your CurrentNs implementation correctly handles namespace lifecycle.
  5. Check return value of get_mut(): Always check if get_mut() returns Some before attempting to modify a resource, as it may be shared.
flowchart TD
A["Resource Definition"]
B["Namespace Creation"]
C["Resource Access"]
D["Modify Resource?"]
E["Is Shared?"]
F["Use get_mut()"]
G["Can't modify directly"]
H["Options"]
I["Use atomic / interior-mutable types"]
J["reset() to break sharing"]
K["Accept read-only access"]
L["Use get()"]

A --> B
B --> C
C --> D
D --> E
D --> L
E --> F
E --> G
G --> H
H --> I
H --> J
H --> K

For more detailed information on specific aspects of AXNS, refer to Basic Resource Access and Sharing and Resetting Resources.

Basic Resource Access

Relevant source files

This page provides practical instructions for the fundamental operations in AXNS: defining resources, accessing them from namespaces, and modifying their values. For more advanced operations such as sharing resources between namespaces or resetting them to initial values, see Sharing and Resetting Resources.

Defining Resources

Resources in AXNS are defined using the def_resource! macro, which creates statically allocated resources with their default values.

Syntax

def_resource! {
    /// Documentation for the resource
    pub static RESOURCE_NAME: ResourceType = default_value;
    
    // Multiple resources can be defined in a single macro call
    pub static ANOTHER_RESOURCE: AnotherType = another_default_value;
}

Behind the scenes, this macro creates a static ResWrapper<T> instance that provides methods for accessing the resource in different namespaces.

flowchart TD
A["def_resource! macro"]
B["Define Static Resource"]
C["Create Resource Struct"]
D["Define layout, init and drop functions"]
E["Create ResWrapper instance"]
F["Resource accessible via RESOURCE_NAME"]

A --> B
B --> C
B --> D
B --> E
E --> F

Sources: src/res.rs(L144 - L168) 

Examples

Simple value types:

def_resource! {
    /// A static integer resource
    pub static MY_NUMBER: i32 = 42;
}

Complex types:

def_resource! {
    /// A custom data structure
    pub static MY_DATA: MyStruct = MyStruct { 
        field1: "default",
        field2: 100 
    };
}

Atomic types for thread-safe access:

def_resource! {
    /// An atomic counter
    pub static COUNTER: AtomicUsize = AtomicUsize::new(0);
}

Sources: src/res.rs(L130 - L168)  tests/all.rs(L11 - L13)  tests/all.rs(L31 - L33) 

Creating Namespaces

Before accessing resources, you need to create a namespace:

let mut ns = Namespace::new();

Resources will be automatically initialized with their default values when first accessed in a namespace.
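This lazy, initialize-on-first-access behavior can be illustrated with the standard library's `OnceLock` (a std-only analogue, not how AXNS is implemented; `get_resource` is hypothetical):

```rust
use std::sync::OnceLock;

// The value is created with its default on first access, then reused.
static RESOURCE: OnceLock<i32> = OnceLock::new();

fn get_resource() -> &'static i32 {
    RESOURCE.get_or_init(|| 42) // default value from the definition
}

fn main() {
    assert_eq!(*get_resource(), 42); // first access initializes
    assert_eq!(*get_resource(), 42); // later accesses reuse the instance
}
```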

sequenceDiagram
    participant ClientCode as "Client Code"
    participant Namespace as "Namespace"
    participant ResArc as "ResArc"
    participant Resource as "Resource"

    ClientCode ->> Namespace: "Namespace::new()"
    Note over Namespace: Creates empty namespace
    ClientCode ->> Namespace: "resource.get(&ns)"
    Namespace ->> ResArc: "Look up resource"
    alt First access
        Namespace ->> ResArc: "Create new ResArc"
        ResArc ->> Resource: "Initialize with default value"
    else Subsequent access
        Namespace ->> ResArc: "Return existing ResArc"
    end
    ResArc -->> ClientCode: "Return resource reference"

Sources: tests/all.rs(L15) 

Accessing Resources

AXNS provides two main methods to access resources:

Direct Access with Namespace Reference

// Get a reference to the resource in the given namespace
let value = RESOURCE_NAME.get(&namespace);

This method requires explicitly passing the namespace reference.

Current Namespace Access

// Access the resource in the current namespace
let current_value = RESOURCE_NAME.current();

The current() method uses the current namespace, which depends on the thread-local feature:

  • When enabled: uses a thread-local namespace
  • When disabled: uses a global namespace
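This kind of compile-time branching can be expressed with Rust's `cfg!` macro; the sketch below is illustrative only (`current_namespace_kind` is hypothetical, and the feature name is taken from the docs):

```rust
// Behavior branches on whether the `thread-local` feature is compiled in.
fn current_namespace_kind() -> &'static str {
    if cfg!(feature = "thread-local") {
        "thread-local namespace"
    } else {
        "global namespace"
    }
}

fn main() {
    println!("current() resolves to the {}", current_namespace_kind());
}
```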
flowchart TD
A["Client Code"]
B["Access Method"]
C["Direct Access with Namespace"]
D["Current Namespace Access"]
E["thread-local feature"]
F["Thread-local Namespace"]
G["Global Namespace"]

A --> B
B --> C
B --> D
D --> E
E --> F
E --> G

Sources: src/res.rs(L78 - L82)  src/res.rs(L69 - L76)  tests/all.rs(L22 - L24)  tests/all.rs(L35 - L37) 

Modifying Resources

To modify a resource, you need a mutable reference to the namespace and the resource must not be shared with other namespaces:

// Try to get a mutable reference
if let Some(mut_value) = RESOURCE_NAME.get_mut(&mut namespace) {
    // Modify the resource
    *mut_value = new_value;
} else {
    // Resource is shared, cannot modify
}

For atomic types, you can modify them without a mutable reference:

// No mutable reference needed for atomic operations
COUNTER.current().fetch_add(1, Ordering::Relaxed);
flowchart TD
A["Client Code"]
B["Resource Type"]
C["resource.get_mut(&mut ns)"]
D["Is resource shared?"]
E["Returns Some(&mut T)"]
F["Returns None"]
G["Modify resource value"]
H["resource.current()"]
I["Call atomic methods"]
J["Resource modified safely"]

A --> B
B --> C
B --> H
C --> D
D --> E
D --> F
E --> G
H --> I
I --> J

Sources: src/res.rs(L89 - L92)  tests/all.rs(L17 - L20)  tests/all.rs(L36 - L37) 

Complete Example

Here's a complete example demonstrating resource definition, access, and modification:

use axns::{Namespace, def_resource};

// Define resources
def_resource! {
    /// A custom data structure
    static DATA: MyStruct = MyStruct { 
        value: 100, 
        name: "hello".to_string() 
    };
}

// Create a namespace
let mut ns = Namespace::new();

// Access the resource (will have default value)
let data = DATA.get(&ns);
assert_eq!(data.value, 100);
assert_eq!(data.name, "hello");

// Modify the resource
if let Some(mut_data) = DATA.get_mut(&mut ns) {
    mut_data.value = 42;
    mut_data.name = "world".to_string();
}

// Verify changes
let modified_data = DATA.get(&ns);
assert_eq!(modified_data.value, 42);
assert_eq!(modified_data.name, "world");

// Access via current() method
let current_data = DATA.current();
// Note: If using the default global namespace, this would still
// have the default values, not the modified ones from our local namespace

Sources: tests/all.rs(L4 - L25) 

Key Considerations

  1. Resource Initialization: Resources are initialized with their default values when first accessed.
  2. Thread Safety:
  • Use atomic types for thread-safe modifications.
  • Regular types require a mutable reference and cannot be modified if shared.
  3. Current Namespace:
  • The behavior of current() depends on the thread-local feature.
  • Be aware of which namespace you're accessing when using current().
  4. Resource Sharing: Shared resources cannot be mutated through get_mut(); see Sharing and Resetting Resources for details.

Sources: src/res.rs(L53 - L105) 

Sharing and Resetting Resources

Relevant source files

This page explains how to share resources between namespaces and how to reset resources to their default values in the AXNS system. These operations are essential for effective resource management in multi-namespace environments. For information about basic resource access and modification, see Basic Resource Access.

Resource Sharing

AXNS allows resources to be shared between namespaces using the share_from method. When a resource is shared, both namespaces reference the same underlying data, though through separate ResArc instances. This means changes to the resource will be visible across all namespaces that share it.

How Resource Sharing Works

Resource sharing transfers the reference from one namespace to another by cloning the ResArc that wraps the resource:

flowchart TD
subgraph subGraph0["After Sharing"]
    A2["Namespace A"]
    ResA2["ResArc (ref_count=2)"]
    B2["Namespace B"]
    ResB2["Cloned ResArc"]
    RD["Shared Resource Data X"]
end
A["Namespace A"]
ResA["ResArc"]
B["Namespace B"]
ResB["Different ResArc"]
RD1["Resource Data X"]
RD2["Resource Data Y"]
S["ResWrapper.share_from(&mut B, &A)"]
C["Clone operation"]

A --> ResA
A2 --> ResA2
B --> ResB
B2 --> ResB2
ResA --> RD1
ResA2 --> RD
ResB --> RD2
ResB2 --> RD
S --> C

The share_from method implementation is straightforward:

#![allow(unused)]
fn main() {
pub fn share_from<'ns>(&self, dst: &'ns mut Namespace, src: &'ns Namespace) {
    *dst.get_mut(self.res) = src.get(self.res).clone();
}
}

This method clones the ResArc from the source namespace and replaces the existing ResArc in the destination namespace. The reference count for the resource increases, ensuring it won't be deallocated until all namespaces release their references.

Sources: src/res.rs(L96 - L98) 

Example of Resource Sharing

Here's a practical example of sharing a resource between namespaces:

// Define a resource
def_resource! {
    static SHARED_DATA: Arc<()> = Arc::new(());
}

// Set up source namespace with custom data
// (`MY_CUSTOM_DATA` is a placeholder for an existing `Arc<()>` value)
let mut src_ns = Namespace::new();
SHARED_DATA.get_mut(&mut src_ns).unwrap().clone_from(&MY_CUSTOM_DATA);

// Create destination namespace
let mut dst_ns = Namespace::new();

// Share the resource from source to destination
SHARED_DATA.share_from(&mut dst_ns, &src_ns);

// Now both namespaces reference the same underlying data
// Changes in one namespace will be visible in the other

Sources: tests/all.rs(L126 - L158) 

Resource Resetting

The reset method allows you to discard the current state of a resource in a namespace and reinitialize it to the default value defined in the def_resource! macro.

How Resource Resetting Works

Resetting creates a new ResArc instance with freshly initialized resource data:

flowchart TD
subgraph subGraph1["After Reset"]
    OLDRA["Old ResArc (ref_count=n-1)"]
    OLDRD["Original Resource Data"]
    subgraph subGraph0["Before Reset"]
        NS2["Namespace"]
        RA3["New ResArc"]
        RD3["Default Resource Data"]
        NS1["Namespace"]
        RA1["ResArc (ref_count=n)"]
        RD1["Modified Resource Data"]
    end
end
RST["ResWrapper.reset(&mut Namespace)"]
RA2["New ResArc (ref_count=1)"]
RD2["Default Resource Data"]

NS1 --> RA1
NS2 --> RA3
OLDRA --> OLDRD
RA1 --> RD1
RA2 --> RD2
RA3 --> RD3
RST --> RA2

The implementation of reset is equally straightforward:

#![allow(unused)]
fn main() {
pub fn reset(&self, ns: &mut Namespace) {
    *ns.get_mut(self.res) = ResArc::new(self.res);
}
}

This method creates a new ResArc for the resource with default initialization and replaces the existing one in the namespace. The reference count of the old ResArc decreases, potentially triggering deallocation if no other namespaces reference it.

Sources: src/res.rs(L102 - L104) 

Example of Resource Resetting

Here's how to reset a resource to its default value:

// Define a resource
def_resource! {
    static DATA: Arc<()> = Arc::new(());
}

// Get a namespace and potentially modify the resource
let mut ns = Namespace::new();
// ...modify resource...

// Reset the resource to its default value
DATA.reset(&mut ns);

// Now the resource has been reinitialized with the default value
// Any previous sharing relationships are broken for this namespace

Sources: tests/all.rs(L97 - L123) 

Use Cases and Considerations

When to Use Sharing vs. Resetting

  • Use sharing when:

  • Multiple namespaces need to access the same resource instance

  • You want to propagate changes from one namespace to others

  • You need to conserve memory by avoiding duplication of large resources

  • Use resetting when:

  • You need to return a resource to its initial state

  • You want to break sharing relationships with other namespaces

  • You're reinitializing a namespace for reuse

Memory Management Implications

Both operations affect resource reference counting:

| Operation | Effect on Reference Count | Memory Impact |
| --- | --- | --- |
| `share_from` | Increases reference count for the source resource | Prevents deallocation as long as any namespace references it |
| `reset` | Decreases reference count of the old resource | May trigger deallocation if the count reaches zero |
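The deallocation behavior can be observed with a `Weak` probe in plain std Rust (an illustrative sketch of the reference-count effects above, not the AXNS API):

```rust
use std::sync::{Arc, Weak};

/// True while the resource's memory is still allocated.
fn is_alive(probe: &Weak<String>) -> bool {
    probe.upgrade().is_some()
}

fn main() {
    let res = Arc::new(String::from("resource"));
    let probe = Arc::downgrade(&res);

    let shared = Arc::clone(&res); // share_from: strong count becomes 2
    drop(res);                     // one namespace resets: count drops to 1
    assert!(is_alive(&probe));     // still alive for the sharing namespace

    drop(shared);                  // last reference released
    assert!(!is_alive(&probe));    // resource memory deallocated
}
```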

Thread Safety Considerations

When using these operations in a multi-threaded environment:

  • Ensure proper synchronization when accessing shared resources
  • For thread-local namespaces (with the thread-local feature enabled), be aware that resources are isolated by default
  • Consider using thread-safe types (atomics, mutexes) for resources that may be shared across threads

Sources: src/res.rs(L53 - L105) 

Example: Resource Lifecycle with Sharing and Resetting

The following sequence diagram illustrates a typical resource lifecycle involving sharing and resetting:

sequenceDiagram
    participant NamespaceA as "Namespace A"
    participant NamespaceB as "Namespace B"
    participant Resource as "Resource"

    Note over NamespaceA,NamespaceB: Initial state: each namespace has its own resource instance
    NamespaceA ->> Resource: get_mut() - Modify resource
    NamespaceA ->> NamespaceB: share_from(&mut B, &A)
    Note over NamespaceA,NamespaceB: Both namespaces now reference the same resource instance
    NamespaceB ->> Resource: get() - Access shared resource
    Note over NamespaceB: Changes made in Namespace A are visible in Namespace B
    NamespaceB ->> NamespaceB: reset()
    Note over NamespaceB: Namespace B now has a fresh resource instance
    Note over NamespaceA,NamespaceB: Sharing relationship is broken
    NamespaceA ->> Resource: get() - Still has modified resource
    NamespaceB ->> Resource: get() - Has default resource

Sources: src/res.rs(L96 - L104) 

Best Practices

  1. Be mindful of sharing: Shared resources cannot be safely mutated through get_mut(), so plan your resource usage accordingly.
  2. Consider reference counting: When sharing or resetting resources that hold external allocations (like Arc), be aware of the reference counting implications.
  3. Use reset for cleanup: Reset resources in namespaces before reusing them to prevent resource leaks and ensure consistent initial states.
  4. Share judiciously: While sharing can save memory, it can make resource management more complex by creating implicit dependencies between namespaces.

Sources: src/res.rs(L90 - L92)  tests/all.rs(L97 - L158) 

Development and Testing

Relevant source files

This page provides comprehensive information for developers working on the AXNS resource namespace library itself. It covers the development environment, testing methodology, CI/CD workflow, and guidelines for contributing to the project. For information about using the AXNS library in your applications, please see Usage Guide.

Development Environment

AXNS is developed as a Rust library with minimal dependencies. To work on AXNS, you need:

  1. Rust Toolchain: The project uses the nightly Rust toolchain for development to leverage advanced features and documentation tools.
  2. Cargo: For building, testing, and package management.
  3. Git: For version control.

The repository is organized in a standard Rust project structure:

flowchart TD
A["src/"]
B["lib.rs (Core library code)"]
C[".github/workflows/"]
D["ci.yml (CI pipeline configuration)"]
E["tests/"]
F["all.rs (Test suite)"]
G["Cargo.toml (Dependencies and metadata)"]

A --> B
C --> D
E --> F

Sources: .github/workflows/ci.yml tests/all.rs

Testing Methodology

AXNS employs a comprehensive testing methodology to ensure correctness and reliability of the namespace system. The test suite in tests/all.rs validates the core functionality through various test cases.

Test Categories

The test suite includes several categories of tests, illustrated by the examples in the following section.


Sources: tests/all.rs

Test Case Examples

The test suite validates several key aspects of the AXNS system:

  1. Basic namespace operations tests/all.rs(L4 - L25) 
  • Creating a namespace
  • Defining resources with def_resource!
  • Getting and modifying resources
  1. Current resource access tests/all.rs(L27 - L38) 
  • Accessing the current value of a resource
  • Modifying resources in the current namespace
  1. Thread-local feature tests tests/all.rs(L40 - L159) 
  • Resource cleanup and recycling
  • Resetting resources
  • Sharing resources between namespaces

Sources: tests/all.rs

Feature Flag Testing

AXNS uses feature flags to enable optional functionality. The most significant feature is thread-local, which enables thread-local namespace support.

Thread-Local Feature Testing

The thread-local feature is tested in a dedicated module that is only compiled when the feature is enabled:

flowchart TD
subgraph subGraph0["Thread-Local Tests"]
    D["recycle() - Tests resource cleanup in thread-local context"]
    E["reset() - Tests resetting resources in thread-local context"]
    F["clone_from() - Tests sharing resources between thread-local namespaces"]
end
A["Test Suite"]
B["Basic Tests"]
C["Basic Tests + Thread-Local Tests"]

A --> B
A --> C
C --> D
C --> E
C --> F

The thread-local tests validate several important aspects:

  1. Resource lifecycle - Ensuring resources are properly cleaned up when threads terminate
  2. Resource sharing - Verifying resources can be shared between namespaces
  3. Resource resetting - Testing the reset functionality in thread-local contexts

Sources: tests/all.rs(L40 - L159) 

CI/CD Pipeline

AXNS employs a comprehensive CI/CD pipeline implemented with GitHub Actions to ensure code quality and consistency.

CI Workflow

flowchart TD
A["Push/PR to main branch"]
B["CI Workflow Trigger"]
C["check job"]
C1["Run cargo fmt"]
C2["Run cargo clippy"]
D["test job"]
D1["Run cargo test"]
D2["Run cargo test with thread-local"]
E["doc job"]
E1["Build documentation"]
F["deploy job"]
F1["Deploy to GitHub Pages"]
G["Successful Check"]
H["Successful Tests"]

A --> B
B --> C
B --> D
B --> E
C --> C1
C --> C2
C1 --> G
C2 --> G
D --> D1
D --> D2
D1 --> H
D2 --> H
E --> E1
E1 --> F
F --> F1

The CI pipeline consists of the following jobs:

| Job | Description | Commands |
| --- | --- | --- |
| `check` | Validates code formatting and checks for linting issues | `cargo fmt --all --check`, `cargo clippy --all-targets --all-features -- -D warnings` |
| `test` | Runs the test suite in both standard and thread-local modes | `cargo test --verbose`, `cargo test --verbose -F thread-local` |
| `doc` | Builds the documentation with all features | `cargo doc --all-features --no-deps` |
| `deploy` | Deploys the documentation to GitHub Pages | GitHub Actions deployment task |

Sources: .github/workflows/ci.yml

Test-Driven Development

The AXNS development process follows test-driven development principles:

  1. Write Tests First: New features should be accompanied by tests that validate their behavior
  2. Validate Core Functionality: Tests should cover the full range of functionality
  3. Feature Flag Testing: Both standard and feature-enabled configurations must be tested

Test to Code Relationship

flowchart TD
subgraph subGraph1["Core Components"]
    B["Namespace Struct"]
    C["Resource and ResWrapper"]
    D["def_resource! Macro"]
    E["Thread-Local Features"]
    F["Resource Lifecycle"]
end
subgraph subGraph0["Test Files"]
    A["tests/all.rs"]
end

A --> B
A --> C
A --> D
A --> E
A --> F

Sources: tests/all.rs

Testing ResArc Reference Counting

A critical aspect of AXNS is its reference counting mechanism implemented through ResArc. The test suite verifies that reference counting works correctly to prevent memory leaks.

sequenceDiagram
    participant Test as Test
    participant Namespace as Namespace
    participant Resource as Resource
    participant ResArc as ResArc

    Test ->> Namespace: Create Namespace
    Test ->> Resource: Define Resource (def_resource!)
    Test ->> Namespace: Get resource (DATA.get_mut())
    Namespace ->> ResArc: Clone ResArc
    Test ->> Namespace: Modify resource
    Note over Test,ResArc: Thread-local tests
    Test ->> ResArc: Share resource (share_from)
    ResArc ->> ResArc: Increment reference count
    Test ->> ResArc: Reset resource (reset)
    ResArc ->> ResArc: Decrement reference count
    Note over Test,ResArc: Verify reference counts
    Test ->> ResArc: Check strong_count matches expectations

The thread-local tests specifically validate reference counting by tracking the strong_count of Arc instances and ensuring they are properly incremented and decremented.

Sources: tests/all.rs(L40 - L159) 

Development Guidelines

When developing AXNS, follow these guidelines:

  1. Write Tests: All new functionality should be accompanied by appropriate tests.
  2. Feature Flags: When adding features that can be optional, use feature flags and add tests for both configurations.
  3. Documentation: Document all public APIs with doc comments.
  4. Code Quality: Ensure code passes cargo fmt and cargo clippy checks.
  5. Compatibility: Maintain backward compatibility when possible.

Adding New Resources

When adding new resource types to the system:

  1. Define the resource using the def_resource! macro
  2. Implement tests that validate the resource behavior in various scenarios
  3. Ensure proper cleanup and reference counting

Testing Thread-Local Features

When working with the thread-local feature:

  1. Place thread-local specific tests in the #[cfg(feature = "thread-local")] module
  2. Verify resources are properly cleaned up when threads terminate
  3. Test interactions between thread-local and global namespaces

Sources: tests/all.rs(L40 - L159) 

Test Code Structure

The test structure in AXNS follows a modular pattern where base functionality is tested first, followed by feature-specific tests:

flowchart TD
A["tests/all.rs"]
B["Base Tests"]
C["#[cfg(feature = thread-local)]"]
B1["ns()"]
B2["current()"]
C1["thread_local! { static NS }"]
C2["CurrentNsImpl struct"]
C3["Feature-specific tests"]
D1["recycle()"]
D2["reset()"]
D3["clone_from()"]

A --> B
A --> C
B --> B1
B --> B2
C --> C1
C --> C2
C --> C3
C3 --> D1
C3 --> D2
C3 --> D3

Sources: tests/all.rs

Conclusion

The development and testing infrastructure of AXNS is designed to ensure the correctness and reliability of the resource namespace system. By following the guidelines and leveraging the existing test framework, developers can contribute to AXNS while maintaining its quality standards.

For information on using AXNS in your applications, please refer to the Usage Guide section.

Overview

Relevant source files

Purpose and Scope

The axsignal crate implements a Unix-like signal handling system for ArceOS. It provides a comprehensive framework for managing signals at both process and thread levels, supporting standard operations such as sending, blocking, and handling signals. This crate enables applications running on ArceOS to use familiar signal handling patterns similar to those found in POSIX systems.

For detailed explanations of specific components, see Signal Management System, Signal Types and Structures, and Architecture Support.

Sources: src/lib.rs(L1 - L16)  Cargo.toml(L1 - L31) 

High-Level Architecture

The axsignal crate is organized into several interconnected modules that together form a complete signal handling system.

flowchart TD
subgraph subGraph2["axsignal Crate"]
    lib["lib.rs"]
    subgraph subGraph1["Architecture Support"]
        x86_64["arch/x86_64.rs"]
        aarch64["arch/aarch64.rs"]
        riscv["arch/riscv.rs"]
        loongarch64["arch/loongarch64.rs"]
    end
    subgraph subGraph0["Core Modules"]
        action["action.rs"]
        types["types.rs"]
        pending["pending.rs"]
        api["api.rs"]
        arch["arch/mod.rs"]
    end
end

arch --> aarch64
arch --> loongarch64
arch --> riscv
arch --> x86_64
lib --> action
lib --> api
lib --> arch
lib --> pending
lib --> types

High-Level Architecture of axsignal

Sources: src/lib.rs(L7 - L15) 

Key Components

The axsignal crate consists of several key components that work together to provide signal handling functionality:

Signal Types

classDiagram
class Signo {
    +value: u8
    +const SIGHUP, SIGINT, SIGQUIT, etc.
    +is_standard()
    +is_realtime()
}

class SignalSet {
    +bits: u64
    +new()
    +add_signal(Signo)
    +del_signal(Signo)
    +contains(Signo)
    +is_empty()
}

class SignalInfo {
    +signo: Signo
    +code: i32
    +errno: i32
    +fields: SignalFields
    
}

class SignalStack {
    +ss_sp: usize
    +ss_flags: i32
    +ss_size: usize
    
}

SignalInfo  -->  Signo

Signal Type Components

Signal Managers

classDiagram
class ProcessSignalManager {
    +pending: Mutex
    +actions: Arc
    +send_signal(sig: SignalInfo)
    +dequeue_signal(mask: SignalSet)
    +pending()
    +wait_signal()
}

class ThreadSignalManager {
    -proc: Arc
    -pending: Mutex
    -blocked: Mutex
    -stack: Mutex
    +send_signal(sig: SignalInfo)
    +dequeue_signal(mask: SignalSet)
    +check_signals(tf, restore_blocked)
    +handle_signal(tf, restore_blocked, sig, action)
}

class PendingSignals {
    +set: SignalSet
    +info_std: Option[32]
    +info_rt: VecDeque
    +put_signal(sig: SignalInfo)
    +dequeue_signal(mask: SignalSet)
}

ThreadSignalManager  -->  ProcessSignalManager
ProcessSignalManager  -->  PendingSignals
ThreadSignalManager  -->  PendingSignals

Signal Management Components

Sources: src/lib.rs(L7 - L15) 

Signal Processing Flow

The signal handling process in axsignal follows a well-defined flow from generation to handling:

flowchart TD
subgraph subGraph2["Signal Handling"]
    CheckAction["Check SignalAction"]
    Default["Execute Default Action"]
    Ignore["Ignore Signal"]
    Handler["Execute Custom Handler"]
    SaveContext["Save CPU Context"]
    ExecuteHandler["Run Handler Function"]
    RestoreContext["Restore CPU Context"]
end
subgraph subGraph1["Signal Queuing"]
    CheckBlocked["Is Signal Blocked?"]
    Queue["Add to PendingSignals"]
    Deliver["Deliver Immediately"]
end
subgraph subGraph0["Signal Generation"]
    Start["Signal Generated"]
    SendToProcess["ProcessSignalManager.send_signal()"]
    SendToThread["ThreadSignalManager.send_signal()"]
end
CheckPending["check_signals()"]

CheckAction --> Default
CheckAction --> Handler
CheckAction --> Ignore
CheckBlocked --> Deliver
CheckBlocked --> Queue
CheckPending --> Deliver
Deliver --> CheckAction
ExecuteHandler --> RestoreContext
Handler --> SaveContext
Queue --> CheckPending
SaveContext --> ExecuteHandler
SendToProcess --> CheckBlocked
SendToThread --> CheckBlocked
Start --> SendToProcess
Start --> SendToThread

Signal Processing Flow

Sources: src/lib.rs(L7 - L15) 

System Dependencies and Integration

The axsignal crate integrates with several core components of ArceOS to provide its functionality:

flowchart TD
subgraph subGraph1["External Dependencies"]
    axerrno["axerrno - Error Codes"]
    bitflags["bitflags - Bit Flag Manipulation"]
    log["log - Logging Infrastructure"]
end
subgraph subGraph0["ArceOS Core Components"]
    axconfig["axconfig - System Configuration"]
    axhal["axhal - Hardware Abstraction Layer"]
    axtask["axtask - Task Management"]
end
axsignal["axsignal"]

axsignal --> axconfig
axsignal --> axerrno
axsignal --> axhal
axsignal --> axtask
axsignal --> bitflags
axsignal --> log

Dependencies and Integration

The crate relies on:

  • axconfig: For system configuration parameters
  • axhal: For hardware abstraction with userspace support
  • axtask: For multitasking integration
  • axerrno: For error handling
  • bitflags: For efficient signal set implementation
  • Additional utilities for logging and synchronization

Sources: Cargo.toml(L6 - L26) 

Architecture Support

The axsignal crate provides platform-specific implementations for different CPU architectures, ensuring proper signal context management:

flowchart TD
subgraph subGraph2["Architecture-Specific Features"]
    context["Machine Context Management"]
    trampoline["Signal Trampoline Code"]
    setup["Signal Handler Setup"]
    restore["Context Restoration"]
end
subgraph subGraph1["Architecture Support"]
    arch_mod["arch/mod.rs - Common Interface"]
    subgraph subGraph0["Platform-Specific Implementations"]
        x86_64["arch/x86_64.rs"]
        aarch64["arch/aarch64.rs"]
        riscv["arch/riscv.rs"]
        loongarch64["arch/loongarch64.rs"]
    end
end

aarch64 --> context
aarch64 --> restore
aarch64 --> setup
aarch64 --> trampoline
arch_mod --> aarch64
arch_mod --> loongarch64
arch_mod --> riscv
arch_mod --> x86_64
loongarch64 --> context
loongarch64 --> restore
loongarch64 --> setup
loongarch64 --> trampoline
riscv --> context
riscv --> restore
riscv --> setup
riscv --> trampoline

Architecture Support System

Each architecture implementation provides specialized functionality for:

  • Saving and restoring CPU registers during signal handling
  • Setting up signal trampolines (code that transfers control to user-defined handlers)
  • Managing signal stacks
  • Handling architecture-specific signal delivery requirements

Sources: src/lib.rs(L8 - L9) 

Summary

The axsignal crate provides a comprehensive signal handling system for ArceOS, implementing familiar Unix-like functionality across multiple processor architectures. It manages signals at both process and thread levels, supports standard and real-time signals, and offers a flexible framework for defining custom signal actions.

Key features include:

  • Process and thread level signal management
  • Support for multiple architectures (x86_64, AArch64, RISC-V, LoongArch64)
  • Signal masking and prioritization
  • Custom signal handlers with context management
  • Integration with ArceOS task management

For detailed information about specific components, refer to the dedicated pages on Signal Management System and Signal Types and Structures.

Sources: src/lib.rs(L1 - L16)  Cargo.toml(L1 - L31) 

Signal Management System

Relevant source files

Purpose and Scope

This page documents the signal management architecture in the axsignal crate, focusing on the core components that handle signal delivery, queuing, and processing at both process and thread levels. The system implements a Unix-like signal handling framework that coordinates signal delivery across multiple threads within a process.

For information about specific signal types and structures, see Signal Types and Structures. For details on architecture-specific implementations, see Architecture Support.

Signal Management Architecture

The signal management system in axsignal adopts a two-level architecture, consisting of:

  1. Process Signal Manager: Handles signals at the process level, maintaining process-wide pending signals and signal actions
  2. Thread Signal Manager: Manages signals at the thread level, with per-thread signal masks, stacks, and pending signals

This design allows signals to be directed either to a specific thread or to the process as a whole, following the standard Unix signal model.

classDiagram
class ThreadSignalManager {
    -Arc~ProcessSignalManager~ proc
    -Mutex~PendingSignals~ pending
    -Mutex~SignalSet~ blocked
    -Mutex~SignalStack~ stack
    +new(proc)
    +dequeue_signal(mask)
    +handle_signal(tf, restore_blocked, sig, action)
    +check_signals(tf, restore_blocked)
    +restore(tf)
    +send_signal(sig)
}

class ProcessSignalManager {
    +Mutex~PendingSignals~ pending
    +Arc~Mutex~SignalActions~~ actions
    +WaitQueue wq
    +usize default_restorer
    +new(actions, default_restorer)
    +dequeue_signal(mask)
    +send_signal(sig)
    +pending()
    +wait_signal()
}

class PendingSignals {
    +SignalSet set
    +Option~SignalInfo~[32] info_std
    +VecDeque~SignalInfo~[33] info_rt
    +put_signal(SignalInfo)
    +dequeue_signal(SignalSet)
}

class SignalActions {
    +[SignalAction; 64] actions
    
}

class WaitQueue {
    <<trait>>
    
    +wait_timeout(timeout)
    +wait()
    +notify_one()
    +notify_all()
}

ThreadSignalManager  -->  ProcessSignalManager : references
ProcessSignalManager  -->  PendingSignals : contains
ThreadSignalManager  -->  PendingSignals : contains
ProcessSignalManager  -->  SignalActions : contains
ProcessSignalManager  -->  WaitQueue : uses

Sources: src/api/thread.rs(L20 - L240)  src/api/process.rs(L32 - L82)  src/api/mod.rs(L9 - L30) 

Process Signal Manager

The ProcessSignalManager is responsible for managing signals at the process level. It's a shared resource accessible by all threads within a process.

Structure and Components

flowchart TD
subgraph ProcessSignalManager["ProcessSignalManager"]
    A["pending: Mutex<PendingSignals>"]
    B["Tracks process-wide pending signals"]
    C["actions: Arc<Mutex<SignalActions>>"]
    D["Defines how each signal is handled"]
    E["wq: WaitQueue"]
    F["Synchronization primitive for signal waiting"]
    G["default_restorer: usize"]
    H["Address of default signal return handler"]
end

A --> B
C --> D
E --> F
G --> H

Sources: src/api/process.rs(L32 - L48) 

Key Methods

  • new: Creates a new process signal manager with the given actions and default restorer
  • dequeue_signal: Removes and returns a pending signal that matches the given mask
  • send_signal: Sends a signal to the process and notifies waiting threads
  • pending: Returns the set of pending signals for the process
  • wait_signal: Suspends the current thread until a signal is delivered

Sources: src/api/process.rs(L49 - L82) 

Thread Signal Manager

The ThreadSignalManager handles signals targeted at specific threads, maintaining thread-specific signal state while coordinating with the process-level manager.

Structure and Components

flowchart TD
subgraph ThreadSignalManager["ThreadSignalManager"]
    A["proc: Arc<ProcessSignalManager>"]
    B["Reference to the process-level manager"]
    C["pending: Mutex<PendingSignals>"]
    D["Thread-specific pending signals"]
    E["blocked: Mutex<SignalSet>"]
    F["Signals currently blocked for this thread"]
    G["stack: Mutex<SignalStack>"]
    H["Stack used for signal handlers"]
end

A --> B
C --> D
E --> F
G --> H

Sources: src/api/thread.rs(L21 - L31) 

Key Methods

  • new: Creates a new thread signal manager with a reference to the process manager
  • dequeue_signal: Dequeues a signal from the thread or process pending queues
  • handle_signal: Processes a signal based on its action (default, ignore, handler)
  • check_signals: Checks and handles pending signals for the thread
  • restore: Restores the execution context after a signal handler returns
  • send_signal: Sends a signal to the thread
  • wait_timeout: Waits for a signal with an optional timeout

Sources: src/api/thread.rs(L33 - L240) 

Signal Processing Flow

The signal handling flow involves coordination between the process and thread signal managers, checking signal masks, and executing the appropriate actions based on signal dispositions.

flowchart TD
subgraph subGraph0["Signal Delivery (check_signals)"]
    CheckSignals["ThreadSignalManager::check_signals()"]
    GetBlocked["Get thread's blocked signals"]
    Mask["Create mask of unblocked signals"]
    DequeueLoop["Start dequeue loop"]
    TryDequeue["Try dequeue signal from thread, then process"]
    SignalFound["Signal found?"]
    Done["No signal to handle"]
    GetAction["Get SignalAction for this signal"]
    HandleSignal["handle_signal()"]
    CheckDisposition["Check disposition"]
    DefaultAction["Execute default action"]
    NextSignal["Continue to next signal"]
    SetupHandler["Set up signal handler"]
    CreateFrame["Create SignalFrame"]
    SetupTrapFrame["Modify trap frame"]
    UpdateBlocked["Update blocked signals"]
    Result["Return signal and action"]
end
Start["Signal Generated"]
SendDecision["Send to Thread or Process?"]
ThreadSend["ThreadSignalManager::send_signal()"]
ProcessSend["ProcessSignalManager::send_signal()"]
ThreadPending["Add to Thread's pending signals"]
ProcessPending["Add to Process's pending signals"]
NotifyWQ["Notify process wait queue"]

CheckDisposition --> DefaultAction
CheckDisposition --> NextSignal
CheckDisposition --> SetupHandler
CheckSignals --> GetBlocked
CreateFrame --> SetupTrapFrame
DequeueLoop --> TryDequeue
GetAction --> HandleSignal
GetBlocked --> Mask
HandleSignal --> CheckDisposition
Mask --> DequeueLoop
NextSignal --> DequeueLoop
ProcessPending --> NotifyWQ
ProcessSend --> ProcessPending
SendDecision --> ProcessSend
SendDecision --> ThreadSend
SetupHandler --> CreateFrame
SetupTrapFrame --> UpdateBlocked
SignalFound --> Done
SignalFound --> GetAction
Start --> SendDecision
ThreadPending --> NotifyWQ
ThreadSend --> ThreadPending
TryDequeue --> SignalFound
UpdateBlocked --> Result

Sources: src/api/thread.rs(L119 - L143)  src/api/thread.rs(L43 - L48)  src/api/thread.rs(L50 - L117)  src/api/thread.rs(L157 - L163)  src/api/process.rs(L64 - L70) 

Signal Handler Execution

When a signal with a custom handler is processed, the system prepares a special execution environment for the handler:

sequenceDiagram
    participant KernelThread as "Kernel/Thread"
    participant ThreadSignalManager as "ThreadSignalManager"
    participant SignalHandler as "Signal Handler"
    participant SignalRestorer as "Signal Restorer"

    KernelThread ->> ThreadSignalManager: check_signals()
    ThreadSignalManager ->> ThreadSignalManager: dequeue_signal()
    ThreadSignalManager ->> ThreadSignalManager: handle_signal()
    Note over ThreadSignalManager: Signal has Handler disposition
    ThreadSignalManager ->> ThreadSignalManager: Create SignalFrame on stack
    ThreadSignalManager ->> ThreadSignalManager: Save current context
    ThreadSignalManager ->> ThreadSignalManager: Set up handler arguments
    ThreadSignalManager ->> SignalHandler: Jump to handler (modify trap frame)
    SignalHandler ->> SignalRestorer: Return from handler
    SignalRestorer ->> ThreadSignalManager: restore()
    ThreadSignalManager ->> ThreadSignalManager: Restore original trap frame
    ThreadSignalManager ->> ThreadSignalManager: Restore original signal mask
    ThreadSignalManager ->> KernelThread: Return to normal execution

Sources: src/api/thread.rs(L50 - L117)  src/api/thread.rs(L145 - L155) 

SignalFrame Structure

When preparing to execute a signal handler, the system creates a special SignalFrame structure on the stack:

flowchart TD
subgraph SignalFrame["SignalFrame"]
    A["ucontext: UContext"]
    B["Contains saved machine context"]
    C["siginfo: SignalInfo"]
    D["Information about the signal"]
    E["tf: TrapFrame"]
    F["Saved trap frame"]
end

A --> B
C --> D
E --> F

Sources: src/api/thread.rs(L14 - L18) 

Wait Queue Interface

The WaitQueue trait provides a synchronization mechanism for threads waiting on signals. It defines methods for waiting with an optional timeout and for notifying waiting threads.

  • wait_timeout: Waits for a notification with an optional timeout; returns whether a notification came
  • wait: Waits indefinitely for a notification
  • notify_one: Notifies a single waiting thread; returns whether a thread was notified
  • notify_all: Notifies all waiting threads

This interface is used by both the process and thread signal managers to coordinate waiting for and receiving signals.

Sources: src/api/mod.rs(L9 - L30) 

Signal Handling Process

The signal handling process from generation to execution follows this sequence:

  1. A signal is generated and sent via send_signal() to either a thread or process
  2. The signal is added to the appropriate pending queue
  3. Waiting threads are notified via the wait queue
  4. When a thread checks for signals, it:
  • Determines which signals are not blocked
  • Dequeues pending signals from thread and process queues
  • For each signal, checks its action (disposition)
  • Executes the appropriate handler or default action
  5. For custom handlers, the system:
  • Creates a SignalFrame to save the current execution context
  • Sets up the stack and arguments for the handler
  • Modifies the trap frame to transfer control to the handler
  • When the handler returns, restores the original context

This comprehensive system allows for Unix-like signal handling with support for default actions, custom handlers, and signal masking at both process and thread levels.

Sources: src/api/thread.rs(L50 - L117)  src/api/thread.rs(L119 - L143)  src/api/thread.rs(L157 - L163)  src/api/process.rs(L64 - L70) 
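The per-signal action check in the steps above can be modeled as a match over three cases. The types below are illustrative stand-ins, not the crate's actual SignalAction and SignalOSAction:

```rust
// Illustrative model of disposition dispatch. `Disposition` and `Outcome`
// are simplified stand-ins; the crate's types carry more detail.
#[derive(Clone, Copy, Debug)]
pub enum Disposition {
    Default,
    Ignore,
    Handler(fn(u8)),
}

#[derive(Clone, Copy, PartialEq, Debug)]
pub enum Outcome {
    Terminated,
    Ignored,
    Handled,
}

pub fn dispatch(signo: u8, action: Disposition) -> Outcome {
    match action {
        // The default action depends on the signal; termination is the
        // common case modeled here.
        Disposition::Default => Outcome::Terminated,
        Disposition::Ignore => Outcome::Ignored,
        Disposition::Handler(handler) => {
            // The real system saves a SignalFrame and redirects the trap
            // frame; this sketch just calls the handler directly.
            handler(signo);
            Outcome::Handled
        }
    }
}
```

The key structural point is that Ignore returns without touching the execution context, while Handler requires the context save/restore machinery described above.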

Thread Signal Manager

Relevant source files

The Thread Signal Manager is a component of the axsignal crate that provides thread-level signal handling capabilities in a Unix-like signal handling system. It manages signal delivery, blocking, and handling for individual threads, working in coordination with the Process Signal Manager. For process-level signal handling, see Process Signal Manager.

Overview

The Thread Signal Manager implements thread-specific signal handling functionality including:

  • Managing thread-specific pending signals
  • Controlling which signals are blocked for a thread
  • Setting up and managing signal handler stacks
  • Handling signal delivery to user-defined handlers
  • Coordination with the process-level signal manager
classDiagram
class ThreadSignalManager {
    -Arc~ProcessSignalManager~ proc
    -Mutex~PendingSignals~ pending
    -Mutex~SignalSet~ blocked
    -Mutex~SignalStack~ stack
    +new(proc)
    +dequeue_signal(mask)
    +handle_signal(tf, restore_blocked, sig, action)
    +check_signals(tf, restore_blocked)
    +restore(tf)
    +send_signal(sig)
    +blocked()
    +with_blocked_mut(f)
    +stack()
    +with_stack_mut(f)
    +pending()
    +wait_timeout(set, timeout)
}

class ProcessSignalManager {
    +Mutex~PendingSignals~ pending
    +Arc~Mutex~SignalActions~~ actions
    +WaitQueue wq
    +usize default_restorer
    
}

class PendingSignals {
    +SignalSet set
    +Option~SignalInfo~[32] info_std
    +VecDeque~SignalInfo~[33] info_rt
    
}

class SignalSet {
    +u64 bits
    
}

class SignalStack {
    +usize sp
    +usize size
    +u32 flags
    
}

ThreadSignalManager  -->  ProcessSignalManager : references
ThreadSignalManager  *--  PendingSignals : contains
ThreadSignalManager  *--  SignalSet : contains
ThreadSignalManager  *--  SignalStack : contains

Sources: src/api/thread.rs(L21 - L31) 

Core Components

SignalFrame

Before a signal handler is invoked, the current execution context is saved in a SignalFrame structure on the stack:

classDiagram
class SignalFrame {
    +UContext ucontext
    +SignalInfo siginfo
    +TrapFrame tf
    
}

class UContext {
    +MContext mcontext
    +SignalSet sigmask
    
}

class SignalInfo {
    +Signo signo
    +i32 errno
    +i32 code
    
}

class TrapFrame {
    +architecture-specific registers
    
}

SignalFrame  *--  UContext
SignalFrame  *--  SignalInfo
SignalFrame  *--  TrapFrame

Sources: src/api/thread.rs(L14 - L18) 

Signal Handling Flow

The ThreadSignalManager follows a specific flow when handling signals:

flowchart TD
ThreadReceive["Thread receives signal"]
CheckBlocked["Is signal blocked?"]
AddPending["Add to thread's pending signals"]
CheckDisposition["Check signal disposition"]
Later["Later when unblocked"]
DefaultAction["Execute default action (Terminate/CoreDump/Stop/Ignore/Continue)"]
DoNothing["Do nothing"]
SetupHandler["Setup handler execution"]
SaveContext["Save current context in SignalFrame"]
ModifyTF["Modify trap frame for handler execution"]
UpdateBlocked["Update blocked signals"]
ExecuteHandler["Execute signal handler"]
RestoreContext["Restore original context when handler returns"]

AddPending --> Later
CheckBlocked --> AddPending
CheckBlocked --> CheckDisposition
CheckDisposition --> DefaultAction
CheckDisposition --> DoNothing
CheckDisposition --> SetupHandler
ExecuteHandler --> RestoreContext
Later --> CheckDisposition
ModifyTF --> UpdateBlocked
SaveContext --> ModifyTF
SetupHandler --> SaveContext
ThreadReceive --> CheckBlocked
UpdateBlocked --> ExecuteHandler

Sources: src/api/thread.rs(L50 - L117)  src/api/thread.rs(L119 - L143) 

Key Methods

Constructor

The ThreadSignalManager is initialized with a reference to a ProcessSignalManager:

pub fn new(proc: Arc<ProcessSignalManager<M, WQ>>) -> Self {
    Self {
        proc,
        pending: Mutex::new(PendingSignals::new()),
        blocked: Mutex::new(SignalSet::default()),
        stack: Mutex::new(SignalStack::default()),
    }
}

Sources: src/api/thread.rs(L34 - L41) 

Signal Dequeuing

The dequeue_signal method attempts to retrieve a signal from the thread's pending signals. If none are found, it falls back to the process-level signal manager:

fn dequeue_signal(&self, mask: &SignalSet) -> Option<SignalInfo> {
    self.pending
        .lock()
        .dequeue_signal(mask)
        .or_else(|| self.proc.dequeue_signal(mask))
}

Sources: src/api/thread.rs(L43 - L48) 

Signal Handling

The handle_signal method is responsible for processing a signal based on its disposition:

flowchart TD
HandleSignal["handle_signal()"]
CheckDisposition["Check disposition"]
DefaultAction["Get default action for signal number"]
ReturnOSAction["Return appropriate SignalOSAction"]
ReturnNone["Return None"]
SetupStack["Setup stack for signal handler"]
CreateFrame["Create SignalFrame on stack"]
ModifyTF["Modify trap frame to call handler"]
AddArguments["Set arguments (signo, siginfo, ucontext)"]
SetRestorer["Set signal restorer"]
UpdateBlockedSignals["Update blocked signals"]
CheckResethand["Has RESETHAND flag?"]
ResetAction["Reset signal action to default"]
Skip["Skip reset"]
ReturnHandler["Return SignalOSAction::Handler"]

AddArguments --> SetRestorer
CheckDisposition --> DefaultAction
CheckDisposition --> ReturnNone
CheckDisposition --> SetupStack
CheckResethand --> ResetAction
CheckResethand --> Skip
CreateFrame --> ModifyTF
DefaultAction --> ReturnOSAction
HandleSignal --> CheckDisposition
ModifyTF --> AddArguments
ResetAction --> ReturnHandler
SetRestorer --> UpdateBlockedSignals
SetupStack --> CreateFrame
Skip --> ReturnHandler
UpdateBlockedSignals --> CheckResethand

Sources: src/api/thread.rs(L50 - L117) 

Check and Handle Signals

The check_signals method checks for pending signals and handles them:

pub fn check_signals(
    &self,
    tf: &mut TrapFrame,
    restore_blocked: Option<SignalSet>,
) -> Option<(SignalInfo, SignalOSAction)> {
    let actions = self.proc.actions.lock();
    
    let blocked = self.blocked.lock();
    let mask = !*blocked;
    let restore_blocked = restore_blocked.unwrap_or_else(|| *blocked);
    drop(blocked);
    
    loop {
        let Some(sig) = self.dequeue_signal(&mask) else {
            return None;
        };
        let action = &actions[sig.signo()];
        if let Some(os_action) = self.handle_signal(tf, restore_blocked, &sig, action) {
            break Some((sig, os_action));
        }
    }
}

Sources: src/api/thread.rs(L119 - L143) 

Signal Frame Restoration

The restore method restores the original context from a signal frame:

pub fn restore(&self, tf: &mut TrapFrame) {
    let frame_ptr = tf.sp() as *const SignalFrame;
    // SAFETY: pointer is valid
    let frame = unsafe { &*frame_ptr };
    
    *tf = frame.tf;
    frame.ucontext.mcontext.restore(tf);
    
    *self.blocked.lock() = frame.ucontext.sigmask;
}

Sources: src/api/thread.rs(L145 - L155) 

Signal Waiting

The wait_timeout method allows a thread to wait for a specific set of signals:

pub fn wait_timeout(
    &self,
    mut set: SignalSet,
    timeout: Option<Duration>,
) -> Option<SignalInfo> {
    // Non-blocked signals cannot be waited
    set &= self.blocked();
    
    if let Some(sig) = self.dequeue_signal(&set) {
        return Some(sig);
    }
    
    let wq = &self.proc.wq;
    let deadline = timeout.map(|dur| axhal::time::wall_time() + dur);
    
    // There might be false wakeups, so we need a loop
    loop {
        match &deadline {
            Some(deadline) => {
                match deadline.checked_sub(axhal::time::wall_time()) {
                    Some(dur) => {
                        if wq.wait_timeout(Some(dur)) {
                            // timed out
                            break;
                        }
                    }
                    None => {
                        // deadline passed
                        break;
                    }
                }
            }
            _ => wq.wait(),
        }
        
        if let Some(sig) = self.dequeue_signal(&set) {
            return Some(sig);
        }
    }
    
    // TODO: EINTR
    None
}

Sources: src/api/thread.rs(L196 - L239) 

Integration with the Signal Handling System

The ThreadSignalManager is a key component in the overall signal handling architecture, linking thread-level signal state to the shared process-level manager.


Sources: src/api/thread.rs(L21 - L31)  src/api/thread.rs(L43 - L48) 

Performance and Synchronization

The ThreadSignalManager uses mutexes to protect its internal state:

  • pending: Protects the thread's pending signals queue
  • blocked: Protects the set of signals blocked from delivery
  • stack: Protects the signal handler stack configuration

The manager also interacts with the process-level wait queue for signal notification across threads.

Sources: src/api/thread.rs(L21 - L31)  src/api/thread.rs(L196 - L239) 

Summary

The Thread Signal Manager is a crucial component of the axsignal crate that handles thread-specific signal management. It works in coordination with the Process Signal Manager to provide a complete signal handling solution, supporting both standard Unix-like signals and real-time signals across multiple architectures.

Key responsibilities include:

  • Managing thread-specific pending signals
  • Handling signal blocking and unblocking
  • Setting up and executing signal handlers
  • Managing signal stacks
  • Coordinating with the process-level signal manager
  • Supporting signal waiting operations with timeouts

The thread-specific nature of the ThreadSignalManager allows for fine-grained control over signal handling within multi-threaded applications, while still maintaining compatibility with process-level signal delivery mechanisms.

Process Signal Manager

Relevant source files

Purpose and Scope

The Process Signal Manager is a core component of the axsignal crate that handles signal management at the process level. It provides mechanisms for managing, queuing, and delivering signals to processes in a Unix-like manner. This component serves as the foundation for process-wide signal operations while working alongside thread-specific signal handling. For thread-level signal management, see Thread Signal Manager.

Sources: src/api/process.rs(L32 - L82) 

Structure and Components

The Process Signal Manager consists of several key components that work together to manage signals at the process level.

classDiagram
class ProcessSignalManager {
    +Mutex~PendingSignals~ pending
    +Arc~Mutex~SignalActions~~ actions
    +WaitQueue wq
    +usize default_restorer
    +new(actions, default_restorer)
    +dequeue_signal(mask)
    +send_signal(sig)
    +pending()
    +wait_signal()
}

class SignalActions {
    +[SignalAction; 64] 0
    +index(Signo)
    +index_mut(Signo)
}

class PendingSignals {
    +SignalSet set
    +Option~SignalInfo~[32] info_std
    +VecDeque~SignalInfo~[33] info_rt
    +new()
    +put_signal(SignalInfo)
    +dequeue_signal(SignalSet)
}

class WaitQueue {
    
    +wait()
    +notify_one()
}

ProcessSignalManager  -->  SignalActions : contains Arc>
ProcessSignalManager  -->  PendingSignals : contains Mutex<>
ProcessSignalManager  -->  WaitQueue : contains

Diagram: Process Signal Manager Structure

Sources: src/api/process.rs(L32 - L48)  src/api/process.rs(L13 - L30) 

Core Components

  1. Pending Signals: A mutex-protected PendingSignals instance that stores signals queued for the process.
  2. Signal Actions: An atomic reference counted, mutex-protected SignalActions that defines how the process responds to different signals.
  3. Wait Queue: Provides synchronization for tasks waiting on signals, used in operations like rt_sigtimedwait.
  4. Default Restorer: A function pointer (as usize) that serves as the default signal handler restorer.

Sources: src/api/process.rs(L32 - L48) 

Signal Actions

The SignalActions structure maintains an array of 64 signal actions, providing indexed access to actions for each signal number.

SignalActions[1] -> Action for SIGHUP
SignalActions[2] -> Action for SIGINT
...
SignalActions[64] -> Action for signal 64

The structure implements Index and IndexMut traits to allow convenient access to signal actions by their signal number (Signo).

Sources: src/api/process.rs(L13 - L30) 
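Such indexed access can be expressed with the standard Index and IndexMut traits. A minimal sketch, assuming a simplified Signo whose discriminants start at 1 and a placeholder SignalAction type (both are stand-ins, not the crate's definitions):

```rust
use std::ops::{Index, IndexMut};

// Simplified stand-ins for the crate's Signo and SignalAction types.
#[derive(Clone, Copy)]
pub enum Signo {
    SIGHUP = 1,
    SIGINT = 2,
}

#[derive(Clone, Copy, PartialEq, Debug)]
pub enum SignalAction {
    Default,
    Ignore,
}

// 64 slots, one per possible signal number.
pub struct SignalActions([SignalAction; 64]);

impl Default for SignalActions {
    fn default() -> Self {
        Self([SignalAction::Default; 64])
    }
}

impl Index<Signo> for SignalActions {
    type Output = SignalAction;
    fn index(&self, signo: Signo) -> &SignalAction {
        // Signal numbers start at 1; the backing array starts at 0.
        &self.0[signo as usize - 1]
    }
}

impl IndexMut<Signo> for SignalActions {
    fn index_mut(&mut self, signo: Signo) -> &mut SignalAction {
        &mut self.0[signo as usize - 1]
    }
}
```

With these impls, callers write `actions[sig.signo()]` directly, as seen in the check_signals code elsewhere in this document.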

Signal Flow and Management

The Process Signal Manager plays a central role in the signal handling flow, serving as an intermediary between signal sources and handler execution.

flowchart TD
subgraph subGraph0["Signal Flow"]
    SendSignal["send_signal(SignalInfo)"]
    AddPending["Add to pending signals"]
    NotifyWaiters["Notify waiting tasks"]
    CheckPending["pending()"]
    GetPendingSet["Return SignalSet of pending signals"]
    DequeueSignal["dequeue_signal(mask)"]
    CheckMask["Signal in mask?"]
    RemoveSignal["Remove from pending"]
    ReturnNone["Return None"]
    ReturnSignal["Return SignalInfo"]
    WaitSignal["wait_signal()"]
    SuspendTask["Suspend current task"]
    WakeupTask["Wake up waiting tasks"]
end

AddPending --> NotifyWaiters
CheckMask --> RemoveSignal
CheckMask --> ReturnNone
CheckPending --> GetPendingSet
DequeueSignal --> CheckMask
NotifyWaiters --> WakeupTask
RemoveSignal --> ReturnSignal
SendSignal --> AddPending
WaitSignal --> SuspendTask
WakeupTask --> SuspendTask

Diagram: Process Signal Manager Operations

Sources: src/api/process.rs(L60 - L81) 

Key Operations

  1. Signal Queueing: When a signal is sent to a process using send_signal(), it's added to the pending signals queue and triggers a notification on the wait queue.
  2. Signal Retrieval: The dequeue_signal() method retrieves a pending signal that is included in the provided mask of deliverable (unblocked) signals.
  3. Pending Signal Management: The pending() method returns the set of signals currently pending for the process.
  4. Signal Waiting: The wait_signal() method suspends the current task until a signal is delivered to the process.

Sources: src/api/process.rs(L60 - L81) 
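The queue-then-notify pattern behind send_signal() and wait_signal() can be modeled with std::sync primitives standing in for the crate's lock and WaitQueue types. This is a behavioral sketch, not the crate's implementation:

```rust
use std::collections::VecDeque;
use std::sync::{Condvar, Mutex};

// Model of the queue-then-notify pattern: signals land in a pending queue
// under a lock, and a wait queue (here a Condvar) wakes waiting tasks.
#[derive(Default)]
pub struct ProcessSignals {
    pending: Mutex<VecDeque<u8>>,
    wq: Condvar,
}

impl ProcessSignals {
    pub fn send_signal(&self, signo: u8) {
        self.pending.lock().unwrap().push_back(signo);
        self.wq.notify_one();
    }

    pub fn wait_signal(&self) -> u8 {
        let mut pending = self.pending.lock().unwrap();
        loop {
            if let Some(sig) = pending.pop_front() {
                return sig;
            }
            // Condvar waits can wake spuriously, hence the re-check loop.
            pending = self.wq.wait(pending).unwrap();
        }
    }
}
```

The re-check loop mirrors the document's note that the shared wait queue may produce false wakeups: a woken thread must always re-test for an actual pending signal.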

Integration with Signal Management System

The Process Signal Manager integrates with the broader signal management system, particularly with the Thread Signal Manager and other signal handling components.

flowchart TD
subgraph subGraph0["Signal Management System"]
    PSM["ProcessSignalManager"]
    ProcessPending["Process Pending Signals"]
    SignalActions["Signal Actions"]
    TSM["ThreadSignalManager"]
    ThreadPending["Thread Pending Signals"]
    SignalSource["Signal Source"]
    SignalInfo["SignalInfo"]
    Action["SignalAction"]
    Pending["PendingSignals"]
end

PSM --> ProcessPending
PSM --> SignalActions
ProcessPending --> SignalInfo
SignalActions --> SignalInfo
SignalSource --> PSM
SignalSource --> TSM
TSM --> PSM
TSM --> ThreadPending
ThreadPending --> SignalInfo

Diagram: Process Signal Manager in the Signal System

Sources: src/api/process.rs(L32 - L82) 

Key Relationships

  1. Thread Signal Manager: Thread Signal Managers reference a Process Signal Manager, allowing them to check for process-level signals when no thread-specific signals are pending.
  2. Signal Actions: The Process Signal Manager maintains the signal actions table that defines how signals are handled. This table is shared across all threads in the process.
  3. Wait Queue: The Process Signal Manager provides a wait queue that allows tasks to wait for signals, with potential false wakeups due to its shared nature.
  4. Signal Delivery: When signals are sent to a process, they're queued in the Process Signal Manager's pending signals queue. Threads can then dequeue these signals based on their signal masks.

Sources: src/api/process.rs(L32 - L82) 

Implementation Details

The Process Signal Manager is a generic structure parameterized by two types:

  • M: A type that implements the RawMutex trait, used for synchronization
  • WQ: A type that implements the WaitQueue trait, used for signal waiting

This allows flexibility in the underlying synchronization mechanisms while maintaining a consistent API.

Constructor

pub fn new(actions: Arc<Mutex<M, SignalActions>>, default_restorer: usize) -> Self

Creates a new Process Signal Manager with the given signal actions and default restorer function.

Signal Handling

The send_signal method adds a signal to the pending queue and notifies waiting tasks:

pub fn send_signal(&self, sig: SignalInfo) {
    self.pending.lock().put_signal(sig);
    self.wq.notify_one();
}

This simple mechanism ensures that signals are properly queued and waiting tasks are notified, allowing them to check for and potentially handle the new signal.

Sources: src/api/process.rs(L49 - L82) 

Usage Considerations

When using the Process Signal Manager, consider these important points:

  1. Shared Access: The Process Signal Manager is shared across all threads in a process, requiring proper synchronization (provided by the mutex implementations).
  2. Wait Queue Behavior: The wait queue may cause false wakeups since it's shared by all threads in the process. Applications should be designed to handle this case.
  3. Default Restorer: The default restorer function is architecture-specific and is used when a signal handler doesn't provide its own restorer.
  4. Signal Actions: Signal actions define the behavior for each signal and are shared across the process, ensuring consistent handling regardless of which thread receives a signal.

Sources: src/api/process.rs(L32 - L82) 

Wait Queue Interface

Relevant source files

Purpose and Scope

The Wait Queue Interface is a synchronization mechanism used within the axsignal crate to enable threads to efficiently wait for signals. It provides the fundamental building blocks for implementing signal suspension operations like sigsuspend() and sigtimedwait(). This document covers the Wait Queue trait definition, its implementation requirements, and how it's used within the signal management system.

For information about the overall signal management architecture, see Signal Management System, and for process-level and thread-level signal management, see Process Signal Manager and Thread Signal Manager respectively.

Wait Queue Trait Definition

The WaitQueue trait defines an abstract interface for a thread waiting mechanism that can be used across different parts of the signal handling system.

classDiagram
note for WaitQueue "Implemented by concrete wait queuecomponents in the OS"
class WaitQueue {
    <<trait>>
    
    +wait_timeout(timeout: Option~Duration~) bool
    +wait()
    +notify_one() bool
    +notify_all()
}

Sources: src/api/mod.rs(L9 - L30) 

The trait provides four essential methods:

| Method | Description | Return Value |
|---|---|---|
| `wait_timeout` | Blocks the current thread until notified or the timeout expires | `true` if a notification came, `false` if the timeout expired |
| `wait` | Blocks the current thread indefinitely until notified | None (calls `wait_timeout` with `None`) |
| `notify_one` | Wakes up a single waiting thread, if any | `true` if a thread was notified |
| `notify_all` | Wakes up all waiting threads | None (repeatedly calls `notify_one`) |

Sources: src/api/mod.rs(L9 - L30) 
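Based on the method table above, the trait can be sketched as follows. This is a minimal reconstruction, not the axsignal source; the exact bounds and signatures in `src/api/mod.rs` may differ, but the default bodies for `wait` and `notify_all` mirror the documented behavior.

```rust
use core::time::Duration;

/// Abstract wait-queue interface, following the method table above.
/// Concrete wait queues only need to supply `wait_timeout` and `notify_one`.
pub trait WaitQueue {
    /// Block until notified or until `timeout` expires.
    /// Returns `true` if a notification arrived, `false` on timeout.
    fn wait_timeout(&self, timeout: Option<Duration>) -> bool;

    /// Block indefinitely until notified (default: `wait_timeout` with `None`).
    fn wait(&self) {
        self.wait_timeout(None);
    }

    /// Wake a single waiting thread; returns `true` if one was woken.
    fn notify_one(&self) -> bool;

    /// Wake all waiting threads by repeatedly calling `notify_one`.
    fn notify_all(&self) {
        while self.notify_one() {}
    }
}
```

Because `wait` and `notify_all` have default implementations, an implementor can start with just the two required methods and override the defaults later for performance.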

Integration with Signal Management System

The Wait Queue is a critical component in the signal management architecture, enabling signal-based thread suspension and notification.

flowchart TD
subgraph subGraph1["Signal Operations"]
    SendSignal["send_signal()"]
    WaitSignal["wait_timeout()"]
    Notify["notify_all()"]
    WaitMethod["wait_timeout()"]
end
subgraph subGraph0["Signal Management System"]
    ProcessSigMgr["ProcessSignalManager"]
    ThreadSigMgr["ThreadSignalManager"]
    WaitQ["WaitQueue"]
end

Notify --> WaitMethod
ProcessSigMgr --> WaitQ
SendSignal --> Notify
ThreadSigMgr --> SendSignal
ThreadSigMgr --> WaitQ
ThreadSigMgr --> WaitSignal
WaitQ --> Notify
WaitQ --> WaitMethod
WaitSignal --> WaitMethod

Sources: src/api/thread.rs(L22 - L24)  src/api/thread.rs(L197 - L239)  src/api/thread.rs(L157 - L163) 

Wait Queue Usage in Signal Waiting

The Wait Queue is primarily used to implement signal waiting functionality in the ThreadSignalManager:

sequenceDiagram
    participant Thread as Thread
    participant ThreadSignalManager as ThreadSignalManager
    participant ProcessSignalManager as ProcessSignalManager
    participant WaitQueue as WaitQueue

    Thread ->> ThreadSignalManager: wait_timeout(set, timeout)
    ThreadSignalManager ->> ThreadSignalManager: Check if signal already pending
    alt Signal already pending
    ThreadSignalManager -->> Thread: Return signal
    else No signal pending
    ThreadSignalManager ->> ProcessSignalManager: Access wait queue
    ThreadSignalManager ->> WaitQueue: wait_timeout(timeout)
    WaitQueue -->> ThreadSignalManager: Return (may be false wakeup)
    loop Until signal or timeout
        ThreadSignalManager ->> ThreadSignalManager: Check for pending signals
    alt Signal received
        ThreadSignalManager -->> Thread: Return signal
    else False wakeup or timeout
    alt Timeout expired
        ThreadSignalManager -->> Thread: Return None
    else Timeout not expired
        ThreadSignalManager ->> WaitQueue: wait_timeout(remaining)
    end
    end
    end
    end

Sources: src/api/thread.rs(L197 - L239) 

Implementation Details

Signal Waiting with Timeout

The wait_timeout method in ThreadSignalManager demonstrates how the Wait Queue is used to implement signal waiting functionality:

  1. First checks if a relevant signal is already pending
  2. If not, calculates a deadline based on the timeout
  3. Enters a loop that:
  • Waits on the process's wait queue with a timeout
  • Checks if a relevant signal is now pending after each wakeup
  • Handles cases of false wakeups by continuing to wait
  • Manages the remaining timeout duration

Sources: src/api/thread.rs(L197 - L239) 
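The steps above can be sketched as the following loop. This is a simplified stand-in, not the axsignal source: `check_pending` and the `wq` closure are hypothetical placeholders for the real pending-signal check and `WaitQueue::wait_timeout`, and the returned `u32` stands in for a dequeued signal.

```rust
use std::time::{Duration, Instant};

/// Wait for a pending signal, handling false wakeups and a shrinking timeout.
fn wait_for_signal<F: FnMut() -> Option<u32>>(
    mut check_pending: F,
    wq: &impl Fn(Option<Duration>) -> bool, // stand-in for WaitQueue::wait_timeout
    timeout: Option<Duration>,
) -> Option<u32> {
    // 1. Fast path: a relevant signal may already be pending.
    if let Some(sig) = check_pending() {
        return Some(sig);
    }
    // 2. Convert the relative timeout into an absolute deadline.
    let deadline = timeout.map(|t| Instant::now() + t);
    // 3. Wait, re-checking for signals after every (possibly spurious) wakeup.
    loop {
        let remaining = match deadline {
            Some(d) => {
                let now = Instant::now();
                if now >= d {
                    return None; // timeout expired without a signal
                }
                Some(d - now) // manage the remaining duration
            }
            None => None, // wait indefinitely
        };
        wq(remaining);
        if let Some(sig) = check_pending() {
            return Some(sig);
        }
        // False wakeup: loop and wait again with the updated remaining time.
    }
}
```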

Signal Notification

When a signal is sent to a thread, the wait queue is notified:

send_signal() → put_signal() → wq.notify_all()

This ensures that any threads waiting for signals are woken up to check if one of their waited-for signals is now pending.

Sources: src/api/thread.rs(L157 - L163) 

Key Considerations for Wait Queue Implementations

The WaitQueue trait is defined as a generic interface, allowing different concrete implementations to be used. Implementations must consider:

  1. Timeout handling: Must support both indefinite waiting and time-limited waiting
  2. False wakeup handling: The signal management code is designed to handle spurious wakeups by rechecking conditions
  3. Efficiency: Should efficiently wake only necessary threads when possible
  4. Fairness: Ideally should wake threads in a fair manner (e.g., FIFO order)

The default implementations of wait() and notify_all() are provided for convenience, but concrete implementations may override them for better performance.

Sources: src/api/mod.rs(L16 - L29) 

Wait Queue in the Signal Processing Flow

The Wait Queue plays a crucial role in the overall signal processing flow:

flowchart TD
subgraph subGraph2["Signal Waiting"]
    Wait["wait_timeout(set, timeout)"]
    WaitMethod["WaitQueue.wait_timeout()"]
    CheckSig["Check for pending signals"]
    DeqSig["Dequeue signal"]
    Return["Return None"]
end
subgraph subGraph1["Signal Queuing"]
    PendQ["PendingSignals"]
    Notify["WaitQueue.notify_all()"]
end
subgraph subGraph0["Signal Generation"]
    SigSend["send_signal()"]
end
SigInfo["SignalInfo"]

CheckSig --> DeqSig
CheckSig --> Return
DeqSig --> SigInfo
Notify --> WaitMethod
PendQ --> CheckSig
SigSend --> Notify
SigSend --> PendQ
Wait --> WaitMethod
WaitMethod --> CheckSig

Sources: src/api/thread.rs(L197 - L239)  src/api/thread.rs(L157 - L163) 

Summary

The Wait Queue Interface provides a critical synchronization mechanism for the axsignal crate, enabling efficient signal waiting and notification. By abstracting the waiting and notification operations through a trait, the system allows for flexible implementation while maintaining a consistent interface. The ThreadSignalManager leverages this interface to implement signal waiting functionality, with proper handling of timeouts and false wakeups.

Signal Types and Structures

Relevant source files

Purpose and Scope

This page documents the core data structures used to represent and manage signals in the axsignal crate. These structures form the foundation of the signal handling system in ArceOS, providing a Unix-like signal framework that's compatible with Linux signal interfaces. For information on how signals are managed at the process and thread levels, see Signal Management System.

Core Signal Types

The axsignal crate defines several fundamental types that represent different aspects of signals in the system.

classDiagram
class Signo {
    
    +enum values(SIGHUP=1 to SIGRT32=64)
    +is_realtime() bool
    +default_action() DefaultSignalAction
}

class SignalSet {
    +u64 value
    +add(Signo) bool
    +remove(Signo) bool
    +has(Signo) bool
    +dequeue(SignalSet) Option~Signo~
    +to_ctype(kernel_sigset_t)
}

class SignalInfo {
    +siginfo_t raw_value
    +new(Signo, i32) SignalInfo
    +signo() Signo
    +set_signo(Signo)
    +code() i32
    +set_code(i32)
}

class SignalStack {
    +usize sp
    +u32 flags
    +usize size
    +disabled() bool
}

SignalInfo  -->  Signo : contains
SignalSet  -->  Signo : operates on

Sources: src/types.rs(L12 - L77)  src/types.rs(L123 - L182)  src/types.rs(L185 - L215)  src/types.rs(L218 - L240) 

Signal Numbers (Signo)

The Signo enum represents signal numbers compatible with Unix-like systems. It defines constants for standard signals (1-31) and real-time signals (32-64).

flowchart TD
subgraph subGraph2["Real-time Signal Examples"]
    SIGRTMIN["SIGRTMIN (32)"]
    SIGRT1["SIGRT1 (33)"]
    SIGRT32["SIGRT32 (64)"]
end
subgraph subGraph1["Standard Signal Examples"]
    SIGHUP["SIGHUP (1)"]
    SIGINT["SIGINT (2)"]
    SIGTERM["SIGTERM (15)"]
    SIGKILL["SIGKILL (9)"]
end
subgraph subGraph0["Signal Categories"]
    StandardSignals["Standard Signals (1-31)"]
    RealTimeSignals["Real-time Signals (32-64)"]
end

RealTimeSignals --> SIGRT1
RealTimeSignals --> SIGRT32
RealTimeSignals --> SIGRTMIN
StandardSignals --> SIGHUP
StandardSignals --> SIGINT
StandardSignals --> SIGKILL
StandardSignals --> SIGTERM

Key features of the Signo enum:

  • Represents 64 different signal types (1-64)
  • Distinguishes between standard signals (1-31) and real-time signals (32-64)
  • Provides the is_realtime() method to identify signal categories
  • Associates default actions with each signal through the default_action() method

The default actions for signals include:

  • Terminate: End the process
  • CoreDump: End the process and generate a core dump
  • Ignore: Do nothing
  • Stop: Pause the process
  • Continue: Resume a stopped process

Sources: src/types.rs(L12 - L77)  src/types.rs(L80 - L119) 

Signal Sets (SignalSet)

The SignalSet structure represents a set of signals, compatible with the Linux sigset_t type. It uses a 64-bit integer internally, where each bit corresponds to a signal number.

Key operations on SignalSet:

  • add(signal): Adds a signal to the set
  • remove(signal): Removes a signal from the set
  • has(signal): Checks if a signal is in the set
  • dequeue(mask): Removes and returns a signal from the set that is also in the provided mask

The structure provides conversion to and from the Linux kernel_sigset_t type, ensuring compatibility with Linux syscalls and ABI.

Sources: src/types.rs(L123 - L182) 

Signal Information (SignalInfo)

The SignalInfo structure encapsulates detailed information about a signal, compatible with the Linux siginfo_t type. It provides a transparent wrapper around the raw Linux type with convenient methods for accessing and modifying signal properties.

Key features:

  • Retrieves and sets the signal number (signo)
  • Retrieves and sets the signal code (code)
  • Preserves compatibility with the Linux ABI for signal handlers that expect a siginfo_t parameter

Sources: src/types.rs(L185 - L215) 

Signal Stack (SignalStack)

The SignalStack structure defines an alternate stack for signal handlers, compatible with the Linux sigaltstack structure. Signal stacks provide a dedicated memory area for signal handlers to execute, which is useful for handling stack overflow situations.

Fields:

  • sp: Stack pointer (address)
  • flags: Stack flags (e.g., SS_DISABLE to disable the alternate stack)
  • size: Size of the stack in bytes

The disabled() method checks if the alternate stack is disabled.

Sources: src/types.rs(L218 - L240) 
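A minimal sketch of this structure, assuming the Linux value `SS_DISABLE = 2` for the disable flag (the constant name and value are taken from Linux, not from the axsignal source):

```rust
const SS_DISABLE: u32 = 2; // Linux flag value for a disabled alternate stack

/// Alternate signal-handler stack descriptor, mirroring Linux `sigaltstack`.
#[repr(C)]
struct SignalStack {
    sp: usize,   // stack pointer (base address)
    flags: u32,  // stack flags, e.g. SS_DISABLE
    size: usize, // stack size in bytes
}

impl SignalStack {
    /// True if the alternate stack is disabled.
    fn disabled(&self) -> bool {
        self.flags & SS_DISABLE != 0
    }
}
```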

Signal Action Components

The signal action subsystem defines how signals are handled when they are delivered.

classDiagram
class SignalAction {
    +SignalActionFlags flags
    +SignalSet mask
    +SignalDisposition disposition
    +__sigrestore_t restorer
    +to_ctype(kernel_sigaction)
}

class SignalActionFlags {
    +SIGINFO
    +NODEFER
    +RESETHAND
    +RESTART
    +ONSTACK
    +RESTORER
    
}

class SignalDisposition {
    <<enum>>
    +Default
    +Ignore
    +Handler(fn(i32))
}

class DefaultSignalAction {
    <<enum>>
    +Terminate
    +Ignore
    +CoreDump
    +Stop
    +Continue
    
}

class SignalOSAction {
    <<enum>>
    +Terminate
    +CoreDump
    +Stop
    +Continue
    +Handler
    
}

class SignalSet {
    
    
}

SignalAction  -->  SignalActionFlags : contains
SignalAction  -->  SignalSet : contains
SignalAction  -->  SignalDisposition : contains
SignalAction  ..>  DefaultSignalAction : uses when disposition is Default
SignalDisposition  ..>  SignalOSAction : converted to

Sources: src/action.rs(L16 - L156) 

Default Signal Actions

The DefaultSignalAction enum defines the possible default behaviors for signals:

| Action | Description |
|---|---|
| Terminate | End the process |
| Ignore | Do nothing when the signal is received |
| CoreDump | End the process and generate a core dump |
| Stop | Pause the process |
| Continue | Resume a stopped process |

Each signal has a predefined default action as specified by the default_action() method in the Signo enum.

Sources: src/action.rs(L16 - L31)  src/types.rs(L84 - L119) 

Signal Action Flags

The SignalActionFlags bitflags define modifiers for signal handling behavior:

| Flag | Description |
|---|---|
| SIGINFO | Handler expects additional signal information |
| NODEFER | Signal is not blocked during handler execution |
| RESETHAND | Reset handler to default after execution |
| RESTART | Automatically restart interrupted system calls |
| ONSTACK | Use alternate signal stack |
| RESTORER | Custom signal restorer function is provided |

These flags match the Linux SA_* constants and modify how signals are handled and processed.

Sources: src/action.rs(L50 - L59) 

Signal Disposition

The SignalDisposition enum defines how a specific signal should be handled:

  • Default: Use the default action for the signal
  • Ignore: Ignore the signal
  • Handler(fn): Execute a custom handler function

This is part of the SignalAction structure and determines the action taken when a signal is delivered.

Sources: src/action.rs(L73 - L83) 

Signal Action Structure

The SignalAction structure combines all aspects of signal handling configuration:

  • flags: Bitflags that modify signal handling behavior
  • mask: Set of signals blocked during handler execution
  • disposition: How the signal should be handled
  • restorer: Function to restore context after handler execution

This structure is compatible with the Linux sigaction structure and provides conversion methods for Linux ABI compatibility.

Sources: src/action.rs(L85 - L156) 

Pending Signal Management

The pending signal subsystem manages signals that have been generated but not yet delivered or handled.

flowchart TD
subgraph subGraph1["Signal Flow"]
    PutSignal["put_signal(SignalInfo)"]
    DequeueSignal["dequeue_signal(SignalSet)"]
    StandardSig["Standard Signal"]
    RTSig["Real-time Signal"]
    StandardSet["Set bit in SignalSet"]
    RTSet["Set bit in SignalSet"]
    StandardStore["Store in info_std array"]
    RTQueue["Push to info_rt queue"]
end
subgraph PendingSignals["PendingSignals"]
    SignalSet["SignalSet (all pending signals)"]
    StandardQueue["Standard Signal Queue (info_std)"]
    RealTimeQueue["Real-time Signal Queue (info_rt)"]
end

DequeueSignal --> SignalSet
PutSignal --> RTSig
PutSignal --> StandardSig
RTSig --> RTQueue
RTSig --> RTSet
SignalSet --> RealTimeQueue
SignalSet --> StandardQueue
StandardSig --> StandardSet
StandardSig --> StandardStore

Sources: src/pending.rs(L8 - L66) 

PendingSignals Structure

The PendingSignals structure maintains a queue of signals that are waiting to be delivered and processed:

  • set: A SignalSet indicating which signals are pending
  • info_std: An array of Option<SignalInfo> for standard signals (1-31)
  • info_rt: An array of queues for real-time signals (32-64)

Key differences in handling standard vs. real-time signals:

  • Standard signals are not queued (at most one instance of each signal can be pending)
  • Real-time signals are fully queued (multiple instances of the same signal can be pending)

Sources: src/pending.rs(L8 - L29) 

Signal Queueing Mechanisms

The PendingSignals structure implements two primary operations:

  1. put_signal(sig): Adds a signal to the pending queue
  • For standard signals, if the signal is already pending, the new instance is ignored
  • For real-time signals, each signal is queued regardless of existing pending signals of the same type
  2. dequeue_signal(mask): Removes and returns a signal from the pending queue
  • Only returns signals that are included in the provided mask
  • For standard signals, it clears the corresponding bit in the signal set
  • For real-time signals, it removes one instance from the queue and only clears the bit if the queue becomes empty

This two-tier design provides different quality-of-service levels for standard and real-time signals, matching the behavior of Unix-like systems.

Sources: src/pending.rs(L30 - L66) 
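The two-tier behavior can be sketched as follows. The field names follow the text (`set`, `info_std`, `info_rt`), but the types are simplified stand-ins, not the axsignal source: a plain `u64` replaces `SignalSet` and a `u8` replaces `SignalInfo`.

```rust
use std::collections::VecDeque;

const SIGRTMIN: u8 = 32;

/// Simplified pending-signal queue: standard signals keep at most one
/// pending instance; real-time signals are fully queued.
struct PendingSignals {
    set: u64,                   // one bit per pending signal number
    info_std: [Option<u8>; 32], // standard signals (1-31), indexed by number
    info_rt: Vec<VecDeque<u8>>, // real-time signals (32-64), one queue each
}

impl PendingSignals {
    fn new() -> Self {
        Self { set: 0, info_std: [None; 32], info_rt: vec![VecDeque::new(); 33] }
    }

    /// Add a signal; a standard signal already pending is ignored.
    fn put_signal(&mut self, signo: u8) -> bool {
        let bit = 1u64 << (signo - 1);
        if signo < SIGRTMIN {
            if self.set & bit != 0 {
                return false; // standard signal: at most one pending instance
            }
            self.info_std[signo as usize] = Some(signo);
        } else {
            self.info_rt[(signo - SIGRTMIN) as usize].push_back(signo);
        }
        self.set |= bit;
        true
    }

    /// Remove and return the lowest pending signal that is also in `mask`.
    fn dequeue_signal(&mut self, mask: u64) -> Option<u8> {
        let candidates = self.set & mask;
        if candidates == 0 {
            return None;
        }
        let signo = candidates.trailing_zeros() as u8 + 1;
        let bit = 1u64 << (signo - 1);
        if signo < SIGRTMIN {
            self.set &= !bit; // standard signal: always clear the bit
            self.info_std[signo as usize].take()
        } else {
            let q = &mut self.info_rt[(signo - SIGRTMIN) as usize];
            let sig = q.pop_front();
            if q.is_empty() {
                self.set &= !bit; // clear the bit only when the queue drains
            }
            sig
        }
    }
}
```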

Linux Compatibility Model

The signal types and structures in axsignal are designed to be binary-compatible with their Linux counterparts.

flowchart TD
subgraph subGraph1["Linux Types"]
    LSigSet["kernel_sigset_t"]
    LSigInfo["siginfo_t"]
    LSigStack["sigaltstack"]
    LSigAction["kernel_sigaction"]
end
subgraph subGraph0["axsignal Types"]
    ASigSet["SignalSet"]
    ASigInfo["SignalInfo"]
    ASigStack["SignalStack"]
    ASigAction["SignalAction"]
end

ASigAction --> LSigAction
ASigInfo --> LSigInfo
ASigSet --> LSigSet
ASigStack --> LSigStack

Key compatibility features:

  • #[repr(transparent)] ensures binary compatibility for SignalSet and SignalInfo
  • #[repr(C)] ensures memory layout compatibility for SignalStack
  • Conversion methods (to_ctype, TryFrom) provide interoperability with the Linux ABI

This compatibility layer enables the axsignal crate to interact seamlessly with Linux syscalls and application code that expects Linux-compatible signal structures.

Sources: src/types.rs(L123 - L182)  src/types.rs(L185 - L215)  src/types.rs(L218 - L240)  src/action.rs(L85 - L156) 

Signal Numbers and Sets

Relevant source files

This document details the signal numbers and signal sets implementation in the axsignal crate, which provides the foundation for signal handling in ArceOS. For information about signal actions and handling, see Signal Actions and Dispositions. For information about pending signal management, see Pending Signals.

Overview

The signal system in axsignal implements Unix-compatible signal numbers and sets that are used throughout the signal handling framework. Signal numbers (represented by the Signo enum) identify specific signals, while signal sets (represented by the SignalSet struct) provide an efficient way to manage collections of signals.

flowchart TD
subgraph subGraph1["Usage in System"]
    TSM["ThreadSignalManager"]
    PSM["ProcessSignalManager"]
    PS["PendingSignals"]
    SA["SignalAction"]
end
subgraph subGraph0["Signal Numbers and Sets"]
    Signo["Signo Enum"]
    SignalSet["SignalSet Struct"]
end

SignalSet --> PS
SignalSet --> PSM
SignalSet --> TSM
Signo --> PS
Signo --> SA

Sources: src/types.rs(L9 - L182) 

Signal Numbers (Signo)

The Signo enum defines all standard Unix signals and real-time signals. It is implemented as a u8 enum with explicit numeric values that correspond to standard Unix signal numbers.

Signal Categories

Signal numbers in axsignal are divided into two main categories:

  1. Standard Signals (1-31): Traditional Unix signals with predefined behaviors
  2. Real-time Signals (32-64): Extended signals for application-defined purposes
classDiagram
class Signo {
    <<enum>>
    SIGHUP = 1
    SIGINT = 2
    ...
    SIGSYS = 31
    SIGRTMIN = 32
    ...
    SIGRT32 = 64
    is_realtime() bool
    default_action() DefaultSignalAction
}

class DefaultSignalAction {
    <<enum>>
    Terminate
    CoreDump
    Ignore
    Stop
    Continue
    
}

Signo  -->  DefaultSignalAction : returns

Sources: src/types.rs(L9 - L77)  src/types.rs(L79 - L120) 

Standard Signals

Standard signals (1-31) represent traditional Unix signals, each with a specific purpose and default behavior:

| Signal Number | Name | Default Action | Description |
|---|---|---|---|
| 1 | SIGHUP | Terminate | Hangup detected on controlling terminal |
| 2 | SIGINT | Terminate | Interrupt from keyboard (Ctrl+C) |
| 3 | SIGQUIT | CoreDump | Quit from keyboard (Ctrl+\) |
| 4 | SIGILL | CoreDump | Illegal instruction |
| 5 | SIGTRAP | CoreDump | Trace/breakpoint trap |
| 6 | SIGABRT | CoreDump | Abort signal |
| 7 | SIGBUS | CoreDump | Bus error |
| 8 | SIGFPE | CoreDump | Floating-point exception |
| 9 | SIGKILL | Terminate | Kill signal (cannot be caught or ignored) |
| 10 | SIGUSR1 | Terminate | User-defined signal 1 |
| 11 | SIGSEGV | CoreDump | Invalid memory reference |
| 12 | SIGUSR2 | Terminate | User-defined signal 2 |
| 13 | SIGPIPE | Terminate | Broken pipe |
| 14 | SIGALRM | Terminate | Timer signal |
| 15 | SIGTERM | Terminate | Termination signal |
| 16 | SIGSTKFLT | Terminate | Stack fault |
| 17 | SIGCHLD | Ignore | Child stopped or terminated |
| 18 | SIGCONT | Continue | Continue if stopped |
| 19 | SIGSTOP | Stop | Stop process (cannot be caught or ignored) |
| 20 | SIGTSTP | Stop | Stop typed at terminal (Ctrl+Z) |
| 21 | SIGTTIN | Stop | Terminal input for background process |
| 22 | SIGTTOU | Stop | Terminal output for background process |
| 23 | SIGURG | Ignore | Urgent condition on socket |
| 24 | SIGXCPU | CoreDump | CPU time limit exceeded |
| 25 | SIGXFSZ | CoreDump | File size limit exceeded |
| 26 | SIGVTALRM | Terminate | Virtual alarm clock |
| 27 | SIGPROF | Terminate | Profiling timer expired |
| 28 | SIGWINCH | Ignore | Window resize signal |
| 29 | SIGIO | Terminate | I/O now possible |
| 30 | SIGPWR | Terminate | Power failure |
| 31 | SIGSYS | CoreDump | Bad system call |

Sources: src/types.rs(L12 - L43)  src/types.rs(L84 - L118) 

Real-time Signals

Real-time signals (32-64) are numbered from SIGRTMIN (32) to SIGRT32 (64) and are primarily for application-defined purposes. Unlike standard signals, real-time signals:

  • Have no predefined meanings
  • Default to the Ignore action
  • Are queued (multiple instances of the same signal can be pending)
flowchart TD
subgraph subGraph1["Default Actions"]
    Terminate["Terminate Process"]
    CoreDump["Terminate with Core Dump"]
    Ignore["Ignore Signal"]
    Stop["Stop Process"]
    Continue["Continue Process"]
end
subgraph subGraph0["Signal Number Range"]
    StandardSignals["Standard Signals (1-31)"]
    RealTimeSignals["Real-time Signals (32-64)"]
end

RealTimeSignals --> Ignore
StandardSignals --> Continue
StandardSignals --> CoreDump
StandardSignals --> Ignore
StandardSignals --> Stop
StandardSignals --> Terminate

Sources: src/types.rs(L44 - L76)  src/types.rs(L80 - L82)  src/types.rs(L117 - L118) 

Signo Implementation

The Signo enum provides two key methods:

  1. is_realtime(): Determines if a signal is a real-time signal by checking whether its value is greater than or equal to SIGRTMIN (32).

pub fn is_realtime(&self) -> bool {
    *self >= Signo::SIGRTMIN
}

  2. default_action(): Returns the default action for a signal (as a DefaultSignalAction enum).

pub fn default_action(&self) -> DefaultSignalAction {
    match self {
        Signo::SIGHUP => DefaultSignalAction::Terminate,
        // ... other cases ...
        _ => DefaultSignalAction::Ignore, // For real-time signals
    }
}

Sources: src/types.rs(L79 - L120) 

Signal Sets (SignalSet)

A SignalSet is a bit vector representation of a set of signals, compatible with the C sigset_t type. It provides an efficient way to represent and manipulate collections of signals.

Representation

The SignalSet is implemented as a transparent wrapper around a u64, where:

  • Each bit position corresponds to a signal number minus 1
  • Bit is set (1) if the signal is in the set
  • Bit is clear (0) if the signal is not in the set
flowchart TD
subgraph subGraph0["SignalSet Representation"]
    B1["Bit 0"]
    S1["SIGHUP (1)"]
    B2["Bit 1"]
    S2["SIGINT (2)"]
    B3["Bit 2"]
    S3["SIGQUIT (3)"]
    D["..."]
    DS["..."]
    B30["Bit 30"]
    S31["SIGSYS (31)"]
    B31["Bit 31"]
    S32["SIGRTMIN (32)"]
    B63["Bit 63"]
    S64["SIGRT32 (64)"]
end

B1 --> S1
B2 --> S2
B3 --> S3
B30 --> S31
B31 --> S32
B63 --> S64
D --> DS

Sources: src/types.rs(L122 - L126) 

Operations

The SignalSet struct provides several operations for manipulating signal sets:

  1. Adding a signal: add(&mut self, signal: Signo) -> bool
  • Sets the bit corresponding to the signal
  • Returns true if the signal was not already in the set
  2. Removing a signal: remove(&mut self, signal: Signo) -> bool
  • Clears the bit corresponding to the signal
  • Returns true if the signal was in the set
  3. Checking for a signal: has(&self, signal: Signo) -> bool
  • Returns true if the bit corresponding to the signal is set
  4. Dequeueing a signal: dequeue(&mut self, mask: &SignalSet) -> Option<Signo>
  • Finds and removes the lowest-numbered signal that is both in the set and in the mask
  • Returns the removed signal, or None if no matching signal exists
  5. Bitwise operations: The struct implements Not, BitOr, BitOrAssign, BitAnd, and BitAndAssign
  • Allows combining and modifying signal sets using standard bit operations
flowchart TD
subgraph subGraph1["SignalSet Operations"]
    A["Original Set"]
    B["Modified Set"]
    Result["Boolean Result"]
    Result2["Option"]
    subgraph Operations["Operations"]
        Add["add(SIGINT)"]
        Remove["remove(SIGHUP)"]
        Has["has(SIGTERM)"]
        Dequeue["dequeue(mask)"]
        BitwiseAnd["set1 & set2"]
        BitwiseOr["set1 | set2"]
        BitwiseNot["!set"]
    end
end

A --> Add
A --> BitwiseAnd
A --> BitwiseNot
A --> BitwiseOr
A --> Dequeue
A --> Has
A --> Remove
Add --> B
BitwiseAnd --> B
BitwiseNot --> B
BitwiseOr --> B
Dequeue --> Result2
Has --> Result
Remove --> B

Sources: src/types.rs(L126 - L166) 
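These operations can be sketched directly from the bit-vector representation above. This is a simplified stand-in, not the axsignal source: a plain `u8` signal number replaces the `Signo` enum.

```rust
/// Bit-vector signal set: bit (signo - 1) is set when the signal is present.
#[derive(Clone, Copy, Default, PartialEq)]
struct SignalSet(u64);

impl SignalSet {
    /// Add a signal; returns true if it was not already in the set.
    fn add(&mut self, signo: u8) -> bool {
        let bit = 1u64 << (signo - 1);
        let was_absent = self.0 & bit == 0;
        self.0 |= bit;
        was_absent
    }

    /// Remove a signal; returns true if it was in the set.
    fn remove(&mut self, signo: u8) -> bool {
        let bit = 1u64 << (signo - 1);
        let was_present = self.0 & bit != 0;
        self.0 &= !bit;
        was_present
    }

    /// Check whether a signal is in the set.
    fn has(&self, signo: u8) -> bool {
        self.0 & (1u64 << (signo - 1)) != 0
    }

    /// Remove and return the lowest-numbered signal also present in `mask`.
    fn dequeue(&mut self, mask: &SignalSet) -> Option<u8> {
        let candidates = self.0 & mask.0;
        if candidates == 0 {
            return None;
        }
        let signo = candidates.trailing_zeros() as u8 + 1;
        self.remove(signo);
        Some(signo)
    }
}
```

Using `trailing_zeros` makes `dequeue` a constant-time operation and naturally prefers lower-numbered (higher-priority) signals.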

C API Compatibility

The SignalSet includes methods for conversion to and from the C kernel_sigset_t type, ensuring compatibility with system calls and C libraries:

  • to_ctype(&self, dest: &mut kernel_sigset_t): Converts the SignalSet to a C kernel_sigset_t
  • From<kernel_sigset_t> for SignalSet: Converts a C kernel_sigset_t to a SignalSet
flowchart TD
subgraph subGraph1["C API"]
    kernel_sigset_t["kernel_sigset_t"]
end
subgraph subGraph0["Rust Code"]
    SignalSet["SignalSet (u64)"]
end

SignalSet --> kernel_sigset_t
kernel_sigset_t --> SignalSet

Sources: src/types.rs(L169 - L181) 

Usage in the Signal System

Signal numbers and sets form the foundation of the signal handling system in axsignal:

  1. Signal identification: Signo enumerates all possible signals that can be sent and received.
  2. Signal masking: SignalSet is used to represent blocked signals in ThreadSignalManager.
  3. Pending signals: SignalSet tracks which signals are pending in PendingSignals.
  4. Signal delivery control: SignalSet determines which signals can be dequeued during signal delivery.
flowchart TD
subgraph subGraph1["Signal Management"]
    TSM["ThreadSignalManager"]
    PSM["ProcessSignalManager"]
    Pending["PendingSignals"]
    Action["SignalAction"]
end
subgraph subGraph0["Signal Numbers & Sets"]
    Signo["Signo"]
    SignalSet["SignalSet"]
end
SendSignal["send_signal(sig)"]
TSM_Blocked["ThreadSignalManager::blocked"]
Pending_Set["PendingSignals::set"]
Dequeue["dequeue_signal(mask)"]

Dequeue --> PSM
Dequeue --> TSM
Pending_Set --> Pending
SendSignal --> PSM
SendSignal --> TSM
SignalSet --> Dequeue
SignalSet --> Pending_Set
SignalSet --> TSM_Blocked
Signo --> Action
Signo --> SendSignal
TSM_Blocked --> TSM

Sources: src/types.rs(L9 - L182) 

Summary

Signal numbers and sets are fundamental components of the axsignal crate:

  • Signo provides a type-safe enumeration of all signal numbers, with additional functionality to determine signal characteristics and default actions.
  • SignalSet provides an efficient, bit-based representation of signal collections with operations for adding, removing, checking, and dequeueing signals.
  • Together, they form the foundation for signal identification, blocking, and delivery throughout the signal handling system.

These components follow Unix/POSIX signal conventions while providing Rust-specific advantages like type safety and clear semantics.

Signal Actions and Dispositions

Relevant source files

This document describes the signal action and disposition system in the axsignal crate, which determines how signals are handled when they are delivered to processes or threads. It covers the core data structures that represent signal handling behaviors and how they interact with the signal processing flow.

For information about the signal numbers and signal sets, see Signal Numbers and Sets. For details about how pending signals are queued, see Pending Signals.

Signal Disposition Types

The SignalDisposition enum defines what happens when a signal is received:

classDiagram
class SignalDisposition {
    <<enum>>
    Default
    Ignore
    Handler(unsafe extern "C" fn(i32))
}

class DefaultSignalAction {
    <<enum>>
    Terminate
    Ignore
    CoreDump
    Stop
    Continue
    
}

SignalDisposition "Default" -->  DefaultSignalAction : maps to
  • Default: Uses the predefined action for the signal (terminate, ignore, etc.)
  • Ignore: The signal is completely ignored
  • Handler: A custom function is called when the signal is delivered

When Default is selected, the actual behavior depends on the signal's default action as defined by the DefaultSignalAction enum.

Sources: src/action.rs(L15 - L31)  src/action.rs(L73 - L82) 

Signal Action Structure

The SignalAction structure represents the complete configuration for how a signal should be handled:

classDiagram
class SignalAction {
    +SignalActionFlags flags
    +SignalSet mask
    +SignalDisposition disposition
    +__sigrestore_t restorer
    +to_ctype(kernel_sigaction) void
}

class SignalActionFlags {
    <<bitflags>>
    +SIGINFO
    +NODEFER
    +RESETHAND
    +RESTART
    +ONSTACK
    +RESTORER
    +from_bits(value)
}

class SignalDisposition {
    <<enum>>
    Default
    Ignore
    Handler(extern "C" fn)
}

class SignalSet {
    
    
}

SignalAction  -->  SignalActionFlags : contains
SignalAction  -->  SignalDisposition : contains
SignalAction  -->  SignalSet : contains
  • flags: Bitflags that modify the behavior of signal handlers
  • mask: Set of signals to block while the handler is running
  • disposition: What to do with the signal (default, ignore, or handle)
  • restorer: Function to restore context after signal handler returns

Sources: src/action.rs(L84 - L112) 

Signal Action Flags

The SignalActionFlags bitflags control aspects of signal handling behavior:

| Flag | Description |
|---|---|
| SIGINFO | Handler uses the SA_SIGINFO interface (3 arguments instead of 1) |
| NODEFER | Don't block the signal when handling it |
| RESETHAND | Reset to default action after handling the signal once |
| RESTART | Automatically restart certain system calls interrupted by the signal |
| ONSTACK | Use the alternate signal stack for the handler |
| RESTORER | The `restorer` field in `SignalAction` is valid |

Sources: src/action.rs(L50 - L60) 

OS Actions for Signal Handling

When a signal is delivered, the system must take one of several actions based on the signal disposition:

flowchart TD
Signal["Signal Delivered"]
Disposition["Check Signal Disposition"]
DefaultAction["Check Default Action"]
NoAction["No Action"]
SetupHandler["Set Up Signal Handler"]
TerminateProcess["OS: Terminate Process"]
CoreDump["OS: Generate Core Dump"]
StopProcess["OS: Stop Process"]
ContinueProcess["OS: Continue Process"]
ExecuteHandler["Execute Handler"]
RestoreContext["Restore Context"]

DefaultAction --> ContinueProcess
DefaultAction --> CoreDump
DefaultAction --> NoAction
DefaultAction --> StopProcess
DefaultAction --> TerminateProcess
Disposition --> DefaultAction
Disposition --> NoAction
Disposition --> SetupHandler
ExecuteHandler --> RestoreContext
SetupHandler --> ExecuteHandler
Signal --> Disposition

The SignalOSAction enum represents the actions that the OS should take after signal disposition is determined:

  • Terminate: End the process
  • CoreDump: Generate a core dump and terminate the process
  • Stop: Suspend the process execution
  • Continue: Resume a stopped process
  • Handler: A handler function has been set up (no OS action needed)

Sources: src/action.rs(L36 - L48) 
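The flowchart above can be condensed into a small resolution function. This is a sketch following the enums in the text, not the axsignal source; `resolve` is a hypothetical helper name, and `None` stands for "no OS action needed".

```rust
#[derive(Debug, PartialEq)]
enum DefaultSignalAction { Terminate, Ignore, CoreDump, Stop, Continue }

#[derive(Debug, PartialEq)]
enum SignalOSAction { Terminate, CoreDump, Stop, Continue, Handler }

enum SignalDisposition {
    Default,
    Ignore,
    Handler(fn(i32)),
}

/// Map a signal's disposition (and its default action) to the OS action.
fn resolve(disp: &SignalDisposition, default: DefaultSignalAction) -> Option<SignalOSAction> {
    match disp {
        SignalDisposition::Ignore => None, // signal is dropped, no action
        SignalDisposition::Handler(_) => Some(SignalOSAction::Handler),
        SignalDisposition::Default => match default {
            DefaultSignalAction::Terminate => Some(SignalOSAction::Terminate),
            DefaultSignalAction::CoreDump => Some(SignalOSAction::CoreDump),
            DefaultSignalAction::Stop => Some(SignalOSAction::Stop),
            DefaultSignalAction::Continue => Some(SignalOSAction::Continue),
            DefaultSignalAction::Ignore => None,
        },
    }
}
```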

Signal Handler Execution Flow

When a signal with a custom handler is delivered, the system performs these steps:

sequenceDiagram
    participant ThreadSignalManager as "ThreadSignalManager"
    participant SignalStack as "Signal Stack"
    participant SignalFrame as "Signal Frame"
    participant SignalHandler as "Signal Handler"
    participant TrapFrame as "Trap Frame"

    ThreadSignalManager ->> ThreadSignalManager: handle_signal()
    ThreadSignalManager ->> SignalStack: Check if stack.disabled() || !ONSTACK flag
    alt Use current stack
        SignalStack -->> ThreadSignalManager: Use tf.sp()
    else Use alternate stack
        SignalStack -->> ThreadSignalManager: Use stack.sp
    end
    ThreadSignalManager ->> SignalFrame: Create new SignalFrame
    ThreadSignalManager ->> SignalFrame: Store UContext (saved state)
    ThreadSignalManager ->> SignalFrame: Store SignalInfo
    ThreadSignalManager ->> SignalFrame: Store original TrapFrame
    ThreadSignalManager ->> TrapFrame: Set IP to handler
    ThreadSignalManager ->> TrapFrame: Set SP to frame location
    ThreadSignalManager ->> TrapFrame: Set arguments (signo, siginfo, ucontext)
    ThreadSignalManager ->> TrapFrame: Set return address to restorer
    alt If RESETHAND flag set
        ThreadSignalManager ->> ThreadSignalManager: Reset signal action to default
    end
    alt If !NODEFER flag
        ThreadSignalManager ->> ThreadSignalManager: Add signal to blocked set
    end
    ThreadSignalManager -->> SignalHandler: Return (signal handler will execute)
    SignalHandler -->> ThreadSignalManager: Handler returns to restorer
    ThreadSignalManager ->> ThreadSignalManager: restore()
    ThreadSignalManager ->> SignalFrame: Get original TrapFrame
    ThreadSignalManager ->> TrapFrame: Restore original context
    ThreadSignalManager ->> ThreadSignalManager: Restore original signal mask
    ThreadSignalManager -->> ThreadSignalManager: Continue execution

This diagram shows the complete lifecycle of signal handling, from determining the disposition to executing the handler and restoring the original context.

Sources: src/api/thread.rs(L50 - L117)  src/api/thread.rs(L145 - L155) 

Converting Between C and Rust Types

The SignalAction structure provides methods to convert to and from the Linux kernel's kernel_sigaction structure:

From Rust to C Type

The to_ctype method converts a SignalAction to a kernel_sigaction:

  • Copies flags
  • Converts the signal mask
  • Sets the handler based on disposition
  • Sets the restorer function if supported

From C to Rust Type

The TryFrom<kernel_sigaction> implementation converts a kernel_sigaction to a SignalAction:

  • Validates flags
  • Interprets the handler value (None for Default, 1 for Ignore, others as Handler)
  • Extracts the signal mask
  • Extracts the restorer function if supported
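The handler-value interpretation can be sketched as a small match. The constants follow the Linux convention described above (a null handler means default, the value 1 means ignore); `DispositionSketch` is a stand-in name, not the crate's real type:

```rust
/// Stand-in for the crate's disposition type (hypothetical name).
#[derive(Debug, PartialEq, Eq)]
pub enum DispositionSketch {
    Default,        // raw handler value 0 (SIG_DFL)
    Ignore,         // raw handler value 1 (SIG_IGN)
    Handler(usize), // any other value: address of a user handler
}

/// Interpret the raw handler field of a kernel_sigaction-like struct.
pub fn interpret_handler(raw: usize) -> DispositionSketch {
    match raw {
        0 => DispositionSketch::Default,
        1 => DispositionSketch::Ignore,
        addr => DispositionSketch::Handler(addr),
    }
}
```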

Sources: src/action.rs(L93 - L112)  src/action.rs(L115 - L156) 

Signal Handler Function Execution Context

When a signal handler executes, it receives:

  1. Signal number (signo) as the first argument
  2. Pointer to a SignalInfo structure as the second argument (if SIGINFO flag is set)
  3. Pointer to a UContext structure as the third argument (if SIGINFO flag is set)

The UContext contains:

  • The machine context (MContext) with saved CPU registers
  • The signal mask that was in effect before the handler was called
  • Information about the signal stack

Sources: src/api/thread.rs(L14 - L18)  src/api/thread.rs(L85 - L95) 

Pending Signals

Relevant source files

Overview

This document describes the pending signals system in the axsignal crate, which manages signals that have been delivered but not yet processed by their handlers. The pending signals system is responsible for queuing signals, maintaining their associated information, and dequeuing them when they are ready to be handled.

For information about signal types and representations, see Signal Numbers and Sets. For details on the actions taken when signals are handled, see Signal Actions and Dispositions.

Sources: src/pending.rs(L1 - L66) 

PendingSignals Structure

The core of the pending signals system is the PendingSignals structure, which manages two types of signals:

  1. Standard signals (1-31): At most one instance of each standard signal can be pending at any time.
  2. Real-time signals (32-64): Multiple instances of each real-time signal can be queued.

Data Structure Components

classDiagram
class PendingSignals {
    +SignalSet set
    +Option~SignalInfo~[32] info_std
    +VecDeque~SignalInfo~[33] info_rt
    +new()
    +put_signal(SignalInfo) bool
    +dequeue_signal(SignalSet) Option~SignalInfo~
}

class SignalSet {
    +u64 bits
    +add(Signo) bool
    +dequeue(SignalSet) Option~Signo~
}

class SignalInfo {
    +Signo signo
    +int32_t si_code
    +union sigval si_value
    +pid_t si_pid
    +uid_t si_uid
    +...
    
}

PendingSignals  -->  SignalSet : contains
PendingSignals  -->  SignalInfo : stores

The PendingSignals structure consists of:

  • set: A bit field representing which signals are currently pending
  • info_std: An array storing information for standard signals (indices 1-31)
  • info_rt: An array of queues storing information for real-time signals (indices 32-64)

Sources: src/pending.rs(L8 - L21) 

Signal Queuing Process

Adding Signals to the Queue

When a signal is sent to a process or thread, it's added to the pending queue using the put_signal method:

flowchart TD
Start["put_signal(sig)"]
GetSigno["Get signal number"]
AddToSet["Add to SignalSet"]
IsRT["Is real-time signal?"]
QueueRT["Add to info_rt queue"]
ReturnTrue["Return true"]
AlreadyPending["Was signal already pending?"]
ReturnFalse["Return false"]
SetSTD["Store in info_std array"]

AddToSet --> IsRT
AlreadyPending --> ReturnFalse
AlreadyPending --> SetSTD
GetSigno --> AddToSet
IsRT --> AlreadyPending
IsRT --> QueueRT
QueueRT --> ReturnTrue
SetSTD --> ReturnTrue
Start --> GetSigno

Key points about signal queuing:

  • Standard signals (1-31) will only be queued once, with repeated signals being ignored
  • Real-time signals (32-64) are queued in order of arrival, with multiple instances allowed
  • The put_signal method returns a boolean indicating whether the signal was added to the queue

Sources: src/pending.rs(L31 - L49)  src/api/thread.rs(L157 - L163)  src/api/process.rs(L64 - L70) 

Signal Dequeuing Process

Retrieving Signals from the Queue

Signals are dequeued when they are ready to be handled, using the dequeue_signal method:

flowchart TD
Start["dequeue_signal(mask)"]
DequeueSet["Dequeue a signal number from set"]
SignalFound["Signal found?"]
ReturnNone["Return None"]
IsRT["Is real-time signal?"]
PopQueue["Pop from info_rt queue"]
QueueEmpty["Queue empty?"]
ResetBit["Reset bit in set"]
Skip[""]
ReturnRT["Return signal info"]
TakeSTD["Take from info_std array"]
ReturnSTD["Return signal info"]

DequeueSet --> SignalFound
IsRT --> PopQueue
IsRT --> TakeSTD
PopQueue --> QueueEmpty
QueueEmpty --> ResetBit
QueueEmpty --> Skip
ResetBit --> ReturnRT
SignalFound --> IsRT
SignalFound --> ReturnNone
Skip --> ReturnRT
Start --> DequeueSet
TakeSTD --> ReturnSTD

Key points about signal dequeuing:

  • Signals are dequeued according to priority (lower signal numbers first)
  • Only signals that match the provided mask are considered
  • For real-time signals, the queue maintains signal delivery order
  • After dequeuing, the signal is removed from the pending set unless more instances exist
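The queuing and dequeuing flows above can be condensed into a simplified model. Signals are reduced to bare numbers (1-64) instead of full SignalInfo values, and the pending set is a plain u64 with bit n-1 representing signal n, so this is a sketch of the logic rather than the crate's actual types:

```rust
use std::collections::VecDeque;

/// Simplified model of PendingSignals: bare signal numbers stand in
/// for SignalInfo, and a u64 bit field stands in for SignalSet.
pub struct PendingSketch {
    set: u64,                   // bit n-1 set => signal n pending
    info_std: [Option<u8>; 32], // one slot per standard signal (1-31)
    info_rt: Vec<VecDeque<u8>>, // one queue per real-time signal (32-64)
}

impl PendingSketch {
    pub fn new() -> Self {
        Self {
            set: 0,
            info_std: [None; 32],
            info_rt: vec![VecDeque::new(); 33],
        }
    }

    /// put_signal: a standard signal is pending at most once; real-time
    /// signals are queued in arrival order.
    pub fn put_signal(&mut self, signo: u8) -> bool {
        let bit = 1u64 << (signo - 1);
        let was_pending = self.set & bit != 0;
        self.set |= bit;
        if signo >= 32 {
            self.info_rt[(signo - 32) as usize].push_back(signo);
            true
        } else if was_pending {
            false // repeated standard signal: ignored
        } else {
            self.info_std[signo as usize] = Some(signo);
            true
        }
    }

    /// dequeue_signal: lowest-numbered pending signal in `mask` first;
    /// the bit is cleared unless more real-time instances remain queued.
    pub fn dequeue_signal(&mut self, mask: u64) -> Option<u8> {
        let candidates = self.set & mask;
        if candidates == 0 {
            return None;
        }
        let signo = candidates.trailing_zeros() as u8 + 1;
        let bit = 1u64 << (signo - 1);
        if signo >= 32 {
            let queue = &mut self.info_rt[(signo - 32) as usize];
            let sig = queue.pop_front();
            if queue.is_empty() {
                self.set &= !bit; // no more instances queued
            }
            sig
        } else {
            self.set &= !bit;
            self.info_std[signo as usize].take()
        }
    }
}
```

The test below exercises the two behaviors the diagrams describe: a repeated standard signal is ignored, while a repeated real-time signal queues a second instance.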

Sources: src/pending.rs(L50 - L65) 

Hierarchy of Pending Signal Management

The pending signals system operates at two levels:

flowchart TD
subgraph subGraph1["Thread Level"]
    ThreadManager["ThreadSignalManager"]
    ThreadPending["Thread PendingSignals"]
    ThreadBlocked["Thread Blocked SignalSet"]
end
subgraph subGraph0["Process Level"]
    ProcessManager["ProcessSignalManager"]
    ProcessPending["Process PendingSignals"]
    ProcessWaitQueue["WaitQueue"]
end
DequeueSignal["dequeue_signal()"]
SendSignal["send_signal()"]

DequeueSignal --> ThreadPending
ProcessManager --> ProcessPending
ProcessManager --> ProcessWaitQueue
SendSignal --> ProcessPending
SendSignal --> ThreadPending
ThreadManager --> ProcessManager
ThreadManager --> ThreadBlocked
ThreadManager --> ThreadPending
ThreadPending --> ProcessPending

Process-Level Pending Signals

The ProcessSignalManager maintains a process-wide pending signals queue that is shared among all threads in the process. Signals sent to the process are queued here.

Sources: src/api/process.rs(L33 - L35)  src/api/process.rs(L60 - L62)  src/api/process.rs(L64 - L70) 

Thread-Level Pending Signals

Each ThreadSignalManager maintains its own pending signals queue for thread-specific signals. When checking for signals to handle, a thread will:

  1. First check its own pending queue
  2. Then check the process-level pending queue if no signals are found

This hierarchical approach allows for both process-wide and thread-specific signal delivery.

Sources: src/api/thread.rs(L22 - L26)  src/api/thread.rs(L43 - L48)  src/api/thread.rs(L157 - L163)  src/api/thread.rs(L185 - L188) 

Signal Handling Process

When the system checks for signals to handle, it combines the pending signals system with the blocked signals mask:

flowchart TD
CheckSignals["check_signals()"]
GetBlocked["Get blocked signals"]
CreateMask["Create mask of unblocked signals"]
Loop["Loop until no signals or handler found"]
DequeueSignal["Dequeue signal from thread or process queue"]
SignalFound["Signal found?"]
ReturnNone["Return None"]
GetAction["Get signal action"]
HandleSignal["Handle signal based on action"]
SignalHandled["Signal handled?"]
ReturnAction["Return signal info and action"]

CheckSignals --> GetBlocked
CreateMask --> Loop
DequeueSignal --> SignalFound
GetAction --> HandleSignal
GetBlocked --> CreateMask
HandleSignal --> SignalHandled
Loop --> DequeueSignal
SignalFound --> GetAction
SignalFound --> ReturnNone
SignalHandled --> Loop
SignalHandled --> ReturnAction

Key points about signal handling:

  • Only unblocked signals are considered for handling
  • Signals are handled in priority order (lower signal numbers first)
  • Standard signals are processed before real-time signals
  • The action taken depends on the signal's disposition (default, ignore, or handler)
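The loop in the diagram can be sketched with simplified stand-in types (a sorted Vec in place of the real pending queues, a u64 bit mask for the blocked set):

```rust
/// Stand-in for the resolved disposition of a signal.
#[derive(Clone, Copy, PartialEq, Debug)]
pub enum Action {
    Default,
    Ignore,
    Handler,
}

/// Sketch of the check_signals loop: dequeue unblocked signals in
/// priority order until one actually requires action.
/// `pending` is assumed sorted by signal number; bit n-1 of `blocked`
/// means signal n is blocked.
pub fn check_signals(
    pending: &mut Vec<u8>,
    blocked: u64,
    action_of: impl Fn(u8) -> Action,
) -> Option<(u8, Action)> {
    let mask = !blocked; // only unblocked signals qualify
    loop {
        // Dequeue the lowest-numbered unblocked pending signal.
        let pos = pending
            .iter()
            .position(|&s| mask & (1u64 << (s - 1)) != 0)?;
        let sig = pending.remove(pos);
        match action_of(sig) {
            Action::Ignore => continue, // nothing to do; keep looping
            act => return Some((sig, act)),
        }
    }
}
```

Blocked signals stay in the queue; ignored signals are consumed silently; the first signal with a default action or handler terminates the loop.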

Sources: src/api/thread.rs(L119 - L143) 

Waiting for Signals

The signal system provides mechanisms to wait for signals, implemented through wait queues:

flowchart TD
WaitTimeout["wait_timeout(set, timeout)"]
CheckDequeue["Check if signal already pending"]
Found["Signal found?"]
ReturnSignal["Return signal"]
SetupWait["Setup wait with timeout"]
WaitLoop["Wait on process wait queue"]
WakeUp["Woken up"]
Timeout["Timed out?"]
ReturnNone["Return None"]
CheckAgain["Check for pending signal"]
SignalFound["Signal found?"]
ReturnFound["Return signal"]

CheckAgain --> SignalFound
CheckDequeue --> Found
Found --> ReturnSignal
Found --> SetupWait
SetupWait --> WaitLoop
SignalFound --> ReturnFound
SignalFound --> WaitLoop
Timeout --> CheckAgain
Timeout --> ReturnNone
WaitLoop --> WakeUp
WaitTimeout --> CheckDequeue
WakeUp --> Timeout

When waiting for signals:

  1. The thread first checks if any of the requested signals are already pending
  2. If not, it waits on the process wait queue
  3. When a signal arrives, the queue is notified and the thread wakes up
  4. The thread checks again for the requested signals
  5. If found, it returns; otherwise, it continues waiting until timeout
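The check-wait-recheck loop above can be modeled with standard-library primitives. The crate uses its own WaitQueue; here a Mutex-protected queue plus a Condvar stand in for it, and for simplicity each wakeup restarts the full timeout rather than tracking remaining time:

```rust
use std::collections::VecDeque;
use std::sync::{Condvar, Mutex};
use std::time::Duration;

/// Stand-in for the process wait queue used when waiting for signals.
pub struct SignalWaiter {
    pending: Mutex<VecDeque<u8>>,
    cond: Condvar,
}

impl SignalWaiter {
    pub fn new() -> Self {
        Self {
            pending: Mutex::new(VecDeque::new()),
            cond: Condvar::new(),
        }
    }

    /// Deliver a signal and wake any waiter (step 3 in the list above).
    pub fn send(&self, signo: u8) {
        self.pending.lock().unwrap().push_back(signo);
        self.cond.notify_all();
    }

    /// Wait until a requested signal is pending or the timeout elapses
    /// (steps 1, 2, 4, and 5 in the list above).
    pub fn wait_timeout(
        &self,
        wanted: impl Fn(u8) -> bool,
        timeout: Duration,
    ) -> Option<u8> {
        let mut guard = self.pending.lock().unwrap();
        loop {
            // First check whether a requested signal is already pending.
            if let Some(pos) = guard.iter().position(|&s| wanted(s)) {
                return guard.remove(pos);
            }
            // Otherwise block until notified or timed out, then recheck.
            let (g, res) = self.cond.wait_timeout(guard, timeout).unwrap();
            guard = g;
            if res.timed_out() {
                let pos = guard.iter().position(|&s| wanted(s));
                return pos.and_then(|p| guard.remove(p));
            }
        }
    }
}
```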

Sources: src/api/thread.rs(L190 - L239)  src/api/process.rs(L76 - L81) 

Standard vs. Real-Time Signals Comparison

| Feature | Standard Signals (1-31) | Real-Time Signals (32-64) |
| --- | --- | --- |
| Storage | Single slot per signal number | Queue for each signal number |
| Queuing | At most one instance pending | Multiple instances can be queued |
| Repeats | Repeated signals are ignored while pending | Signals queued in arrival order |
| Information | Minimal signal info stored | Full signal info preserved for each instance |
| Typical Use | Common system signals (SIGINT, SIGTERM, etc.) | Application-specific signals with data |

Sources: src/pending.rs(L8 - L21)  src/pending.rs(L31 - L49)  src/pending.rs(L50 - L65) 

Architecture Support

Relevant source files

This document covers the architecture-specific implementation layer of the axsignal crate, which enables signal handling across multiple CPU architectures. The architecture support subsystem provides platform-specific code for handling CPU context during signal delivery and processing, allowing the signal handling system to work consistently across different hardware platforms.

For information about specific architecture implementations, see x86_64 Implementation, ARM64 Implementation, RISC-V Implementation, and LoongArch64 Implementation.

Architecture Abstraction Layer

The architecture support subsystem employs conditional compilation to select the appropriate implementation based on the target architecture. It provides a consistent interface to the rest of the signal handling system while handling architecture-specific details internally.

flowchart TD
subgraph subGraph1["arch Module"]
    arch_mod["arch/mod.rs"]
    signal_trampoline["signal_trampoline()"]
    signal_addr["signal_trampoline_address()"]
    subgraph subGraph0["Architecture-Specific Implementations"]
        x86_64["x86_64.rs"]
        riscv["riscv.rs"]
        aarch64["aarch64.rs"]
        loongarch64["loongarch64.rs"]
    end
end
trampoline_extern["Arch-specific assembly implementation"]

arch_mod --> aarch64
arch_mod --> loongarch64
arch_mod --> riscv
arch_mod --> signal_addr
arch_mod --> signal_trampoline
arch_mod --> x86_64
signal_addr --> signal_trampoline
signal_trampoline --> trampoline_extern

Diagram: Architecture Module Structure

Sources: src/arch/mod.rs(L1 - L25)  src/lib.rs(L8 - L9) 

The architecture abstraction layer is implemented using Rust's conditional compilation feature through the cfg_if macro. Each supported architecture has its own implementation file that is selected at compile time based on the target architecture.
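Conceptually, the cfg_if dispatch behaves like the cfg! checks below. This is a runnable stand-in for illustration only, since the real arch/mod.rs pulls in and re-exports a per-architecture source file rather than returning a name:

```rust
/// Report which architecture backend the cfg_if dispatch would select.
/// File names follow the module layout described above.
pub fn selected_backend() -> &'static str {
    if cfg!(target_arch = "x86_64") {
        "x86_64.rs"
    } else if cfg!(any(target_arch = "riscv32", target_arch = "riscv64")) {
        "riscv.rs"
    } else if cfg!(target_arch = "aarch64") {
        "aarch64.rs"
    } else if cfg!(target_arch = "loongarch64") {
        "loongarch64.rs"
    } else {
        "unsupported"
    }
}
```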

Common Architecture Interface

Every architecture-specific implementation must provide the following key components:

| Component | Purpose |
| --- | --- |
| MContext | Machine context - architecture-specific CPU state |
| UContext | User context - complete execution context including signal mask |
| signal_trampoline | Assembly routine for calling signal handlers |
| Context manipulation functions | Save/restore CPU state during signal handling |
Sources: src/arch/mod.rs(L19 - L25) 

Signal Context Management

One of the crucial aspects of signal handling is saving and restoring the execution context. The architecture support layer defines two main structures for this purpose:

classDiagram
class UContext {
    +MContext mcontext
    +SignalStack stack
    +SignalSet mask
    +usize flags
    
}

class MContext {
    +registers
    +program_counter
    +stack_pointer
    +other arch-specific state
    
}

class TrapFrame {
    +architecture-specific
    +register state
    
}

UContext "1" *-- "1" MContext : contains
MContext "1" -->  TrapFrame : converts to/from

Diagram: Signal Context Data Structures

When a signal is delivered to a process or thread, the current execution context must be saved to allow the signal handler to run. After the signal handler completes, the original context is restored. The architecture-specific implementation handles how CPU registers and other hardware state are saved and restored.

Signal Trampoline Mechanism

A critical component provided by the architecture layer is the signal trampoline:

sequenceDiagram
    participant KernelMode as "Kernel Mode"
    participant ThreadSignalManager as "ThreadSignalManager"
    participant signal_trampoline as "signal_trampoline"
    participant UserSignalHandler as "User Signal Handler"

    KernelMode ->> ThreadSignalManager: Trap/Exception
    ThreadSignalManager ->> ThreadSignalManager: check_signals()
    ThreadSignalManager ->> ThreadSignalManager: handle_signal()
    ThreadSignalManager ->> KernelMode: Save current context
    ThreadSignalManager ->> KernelMode: Set up stack for handler
    ThreadSignalManager ->> signal_trampoline: Jump to signal_trampoline
    signal_trampoline ->> UserSignalHandler: Call user handler
    UserSignalHandler ->> signal_trampoline: Return
    signal_trampoline ->> KernelMode: Call sigreturn syscall
    KernelMode ->> ThreadSignalManager: restore()
    ThreadSignalManager ->> KernelMode: Restore original context

Diagram: Signal Trampoline Flow

Sources: src/arch/mod.rs(L19 - L25) 

The signal_trampoline function is a small assembly routine that:

  1. Calls the user's signal handler with appropriate arguments
  2. After the handler returns, performs a sigreturn syscall to restore the original execution context

This function is critical because it bridges between the kernel's signal delivery mechanism and the user-space signal handler, ensuring proper setup and cleanup.

Build System Integration

The architecture support layer also interacts with the build system to enable or disable certain features based on the target architecture:

flowchart TD
subgraph build.rs["build.rs"]
    target_detection["Detect Target Architecture"]
    sa_restorer_check["Check sa_restorer Support"]
    cfg_alias["Set Cargo Configuration"]
end
architecture_code["Architecture-specific Code"]

cfg_alias --> architecture_code
sa_restorer_check --> cfg_alias
target_detection --> sa_restorer_check

Diagram: Build System Integration

Sources: build.rs(L1 - L25) 

The build script (build.rs) checks whether the target architecture supports the sa_restorer feature, which is needed for proper signal handler return in some architectures. This configuration is used by the architecture-specific code to adapt its implementation.

Architecture-Specific Features

While all architectures implement the common interface, they differ in several important ways:

| Feature | Variations Across Architectures |
| --- | --- |
| Register Set | Number and types of registers vary by architecture |
| Context Size | x86_64 and ARM64 typically have more registers than RISC-V |
| Signal Frame | Different memory layout for saved context |
| Return Mechanism | Some use sa_restorer, others use direct jumps |
| Stack Alignment | Requirements differ (e.g., 16-byte for x86_64) |
Sources: src/arch/mod.rs(L1 - L17)  build.rs(L3 - L15) 

Integration with Signal Managers

The architecture support layer integrates with the signal management system as follows:

flowchart TD
subgraph subGraph1["Architecture Support Layer"]
    signal_trampoline["signal_trampoline"]
    save_context["Save Context Functions"]
    restore_context["Restore Context Functions"]
end
subgraph subGraph0["Thread Signal Manager"]
    check_signals["check_signals()"]
    handle_signal["handle_signal()"]
    restore["restore()"]
end

check_signals --> handle_signal
handle_signal --> save_context
handle_signal --> signal_trampoline
restore --> restore_context

Diagram: Integration with Signal Management

When the ThreadSignalManager needs to deliver a signal, it uses the architecture-specific functions to:

  1. Save the current execution context
  2. Set up the stack frame for the signal handler
  3. Jump to the architecture-specific signal_trampoline
  4. Upon return from the signal handler, restore the original context

This design allows the higher-level signal management logic to remain architecture-independent while delegating platform-specific operations to the architecture support layer.

Summary

The architecture support subsystem provides a critical abstraction layer that enables the signal handling system to work consistently across different CPU architectures. By encapsulating architecture-specific details and providing a uniform interface, it allows the rest of the system to operate in an architecture-agnostic manner while still benefiting from hardware-specific optimizations.

Each architecture implementation provides specialized routines for:

  • Context saving and restoration
  • Signal trampoline implementation
  • Conversion between trap frames and user contexts
  • Stack management for signal handlers

This modular design makes it easier to add support for new architectures while maintaining compatibility with existing code.

Sources: src/arch/mod.rs(L1 - L25)  src/lib.rs(L8 - L9)  build.rs(L1 - L25) 

x86_64 Implementation

Relevant source files

This page documents the x86_64-specific implementation of the signal handling mechanism in the AxSignal crate. It covers the architecture-specific data structures, context management, and assembly code used for handling signals on the x86_64 architecture. For information about other architectures, see ARM64 Implementation, RISC-V Implementation, or LoongArch64 Implementation.

Overview

The x86_64 implementation provides architecture-specific components required for signal handling, including:

  1. The signal trampoline assembly code
  2. Machine context (MContext) for saving/restoring CPU registers
  3. User context (UContext) structure for the complete signal handling context

These components work together to allow saving the current execution state when a signal occurs, executing a signal handler, and then restoring the original state to resume normal execution.

Sources: src/arch/mod.rs(L1 - L26)  src/arch/x86_64.rs(L1 - L4) 

Signal Trampoline

The signal trampoline is a small assembly function that serves as the return mechanism after a signal handler completes execution. It's designed to be a fixed-address function that can be reliably used by the signal handling system.

flowchart TD
A["Signal Handler"]
B["signal_trampoline"]
C["syscall(15)"]
D["Return to Original Execution"]

A --> B
B --> C
C --> D

Implementation Details

The signal trampoline is implemented in assembly and placed in its own 4KB-aligned section:

  1. It executes syscall 15 (0xF, rt_sigreturn on x86_64 Linux), which is designated for signal return
  2. The assembly code is padded to occupy a full 4KB page

The trampoline's address is exposed through the signal_trampoline_address() function, allowing the signal handling system to set up the return address for signal handlers.

Sources: src/arch/mod.rs(L19 - L25)  src/arch/x86_64.rs(L5 - L17) 

Machine Context (MContext)

The MContext structure represents the complete CPU register state for x86_64 architecture. This structure is crucial for:

  1. Saving the processor state when a signal is delivered
  2. Restoring the processor state when returning from a signal handler

Structure Layout

The MContext structure contains all general-purpose registers, instruction pointer, stack pointer, flags, segment registers, and other CPU state information:

classDiagram
class MContext {
    +usize r8
    +usize r9
    +usize r10
    +usize r11
    +usize r12
    +usize r13
    +usize r14
    +usize r15
    +usize rdi
    +usize rsi
    +usize rbp
    +usize rbx
    +usize rdx
    +usize rax
    +usize rcx
    +usize rsp
    +usize rip
    +usize eflags
    +u16 cs
    +u16 gs
    +u16 fs
    +u16 _pad
    +usize err
    +usize trapno
    +usize oldmask
    +usize cr2
    +usize fpstate
    +[usize; 8] _reserved1
    +new(tf: &TrapFrame)
    +restore(&self, tf: &mut TrapFrame)
}

Conversion Methods

The MContext structure provides methods to convert between the trap frame format and the machine context format:

  1. new(): Creates a new MContext by copying register values from a TrapFrame
  2. restore(): Updates a TrapFrame with register values from the MContext

These methods enable seamless conversion between the kernel's internal representation of CPU state (TrapFrame) and the architecture-specific representation used for signal handling (MContext).
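The conversion pair can be sketched with a reduced three-register model; the real TrapFrame and MContext carry the full x86_64 register set listed in the table below, but the copy-in/copy-out pattern is the same:

```rust
/// Reduced three-register model of the kernel's trap frame
/// (the real structure holds the full x86_64 register set).
#[derive(Clone, Copy, PartialEq, Debug)]
pub struct TrapFrameSketch {
    pub rip: usize,
    pub rsp: usize,
    pub rax: usize,
}

/// Reduced model of MContext with the matching conversion methods.
pub struct MContextSketch {
    rip: usize,
    rsp: usize,
    rax: usize,
}

impl MContextSketch {
    /// Mirror of MContext::new: copy register values out of the frame.
    pub fn new(tf: &TrapFrameSketch) -> Self {
        Self { rip: tf.rip, rsp: tf.rsp, rax: tf.rax }
    }

    /// Mirror of MContext::restore: write saved values back into the frame.
    pub fn restore(&self, tf: &mut TrapFrameSketch) {
        tf.rip = self.rip;
        tf.rsp = self.rsp;
        tf.rax = self.rax;
    }
}
```

Saving at signal delivery and restoring at sigreturn is thus a lossless round trip over the captured registers.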

Sources: src/arch/x86_64.rs(L19 - L109) 

User Context (UContext)

The UContext structure combines the machine context with additional information needed for signal handling, providing a complete context for signal handlers.

Structure Layout

classDiagram
class UContext {
    +usize flags
    +usize link
    +SignalStack stack
    +MContext mcontext
    +SignalSet sigmask
    +new(tf: &TrapFrame, sigmask: SignalSet)
}

class SignalStack {
    // Signal stack information
    
}

class SignalSet {
    // Signal mask information
    
}

class MContext {
    
    // Machine context (register state)
}

UContext  -->  MContext : contains
UContext  -->  SignalStack : contains
UContext  -->  SignalSet : contains

The UContext structure includes:

  1. flags: Used for various control flags
  2. link: Pointer to linked context (for nested signals)
  3. stack: Information about the signal stack
  4. mcontext: The machine context (CPU registers)
  5. sigmask: The signal mask to be applied during handler execution

Context Creation

The UContext::new() method creates a new user context from a trap frame and signal mask:

  1. It initializes the flags and link fields to zero
  2. Sets up a default signal stack
  3. Creates a new machine context from the provided trap frame
  4. Stores the provided signal mask

This combined context provides all the information a signal handler needs to execute properly and allows for correct state restoration afterward.

Sources: src/arch/x86_64.rs(L111 - L131) 

Signal Handling Flow on x86_64

The following diagram illustrates the complete flow of signal handling on x86_64, from signal delivery to handler execution and context restoration.

sequenceDiagram
    participant KernelExecution as "Kernel Execution"
    participant ThreadSignalManager as "ThreadSignalManager"
    participant MContext as "MContext"
    participant UContext as "UContext"
    participant SignalHandler as "Signal Handler"
    participant signal_trampoline as "signal_trampoline"

    KernelExecution ->> ThreadSignalManager: Trap occurs (signal generated)
    ThreadSignalManager ->> MContext: Save current context
    ThreadSignalManager ->> MContext: MContext::new(trap_frame)
    MContext ->> UContext: Create user context
    MContext ->> UContext: UContext::new(trap_frame, sigmask)
    ThreadSignalManager ->> KernelExecution: Modify trap frame to point to handler
    KernelExecution ->> SignalHandler: Resume execution (now in handler)
    Note over SignalHandler: Handler executes
    SignalHandler ->> signal_trampoline: Return when complete
    signal_trampoline ->> KernelExecution: syscall(15) - Signal return
    KernelExecution ->> ThreadSignalManager: Handle signal return
    ThreadSignalManager ->> UContext: Retrieve saved context
    UContext ->> MContext: Extract machine context
    MContext ->> KernelExecution: Restore trap frame
    MContext ->> KernelExecution: mcontext.restore(trap_frame)
    KernelExecution ->> KernelExecution: Resume original execution

The key steps in this process are:

  1. When a signal is delivered, the current CPU state is saved into an MContext
  2. A full UContext is created, including the machine context, signal mask, and stack info
  3. The trap frame is modified to point to the signal handler
  4. When the signal handler returns, it goes to signal_trampoline
  5. The trampoline executes syscall 15 to return to the kernel
  6. The saved context is restored, and normal execution resumes

This architecture-specific implementation ensures that signals can be properly handled on x86_64 systems without corrupting the execution state of the process.

Sources: src/arch/x86_64.rs(L1 - L131) 

Register State Mapping

The following table shows how register state is mapped between the TrapFrame and MContext structures:

| TrapFrame Field | MContext Field | Description |
| --- | --- | --- |
| r8 | r8 | General purpose register R8 |
| r9 | r9 | General purpose register R9 |
| r10 | r10 | General purpose register R10 |
| r11 | r11 | General purpose register R11 |
| r12 | r12 | General purpose register R12 |
| r13 | r13 | General purpose register R13 |
| r14 | r14 | General purpose register R14 |
| r15 | r15 | General purpose register R15 |
| rdi | rdi | First function argument register |
| rsi | rsi | Second function argument register |
| rbp | rbp | Base pointer register |
| rbx | rbx | General purpose register (callee saved) |
| rdx | rdx | Third function argument register |
| rax | rax | Return value register |
| rcx | rcx | Fourth function argument register |
| rsp | rsp | Stack pointer register |
| rip | rip | Instruction pointer register |
| rflags | eflags | CPU flags register |
| cs | cs | Code segment register |
| error_code | err | Error code from exception |
| vector | trapno | Interrupt/exception vector number |

This mapping ensures that all necessary register state is preserved during signal handling.

Sources: src/arch/x86_64.rs(L53 - L108) 

Integration with Signal Management System

The x86_64 implementation integrates with the broader signal management system through the following mechanisms:

flowchart TD
subgraph subGraph2["Architecture Interface"]
    AM["arch/mod.rs"]
    STA["signal_trampoline_address()"]
end
subgraph subGraph1["x86_64 Implementation"]
    MCT["MContext"]
    UCT["UContext"]
    ST["signal_trampoline"]
end
subgraph subGraph0["Signal Management System"]
    TSM["ThreadSignalManager"]
    PSM["ProcessSignalManager"]
end

AM --> MCT
AM --> STA
AM --> UCT
MCT --> UCT
ST --> STA
STA --> TSM
TSM --> MCT
UCT --> MCT

Key integration points:

  1. The signal_trampoline_address() function exposes the address of the architecture-specific trampoline
  2. The MContext and UContext structures are used by the ThreadSignalManager to save and restore execution context
  3. The architecture module (arch/mod.rs) selects and exports the appropriate implementation based on the target architecture

This modular design allows the signal management system to work consistently across different architectures while handling the architecture-specific details appropriately.

Sources: src/arch/mod.rs(L1 - L26)  src/arch/x86_64.rs(L1 - L131) 

ARM64 Implementation

Relevant source files

This document describes the ARM64 (AArch64) architecture-specific implementation of the signal handling system in ArceOS. It details how signal context management, trampolines, and architecture-specific data structures are implemented for ARM64 processors. For information about other architecture implementations, see x86_64 Implementation, RISC-V Implementation, or LoongArch64 Implementation.

Overview

The ARM64 implementation provides the architecture-specific components needed for signal handling, including:

  1. A signal trampoline for transferring control to user signal handlers
  2. Context management structures for saving and restoring CPU state
  3. Context conversion utilities between trap frames and signal contexts
flowchart TD
subgraph subGraph1["Key Functions"]
    save["Context Saving"]
    restore["Context Restoration"]
    syscall["rt_sigreturn Syscall"]
end
subgraph subGraph0["ARM64 Signal Implementation"]
    trampoline["signal_trampoline()"]
    mcontext["MContext"]
    ucontext["UContext"]
end

mcontext --> restore
mcontext --> save
restore --> mcontext
save --> ucontext
trampoline --> syscall
ucontext --> mcontext

Sources: src/arch/aarch64.rs src/arch/mod.rs

Signal Trampoline

The signal trampoline is a small piece of assembly code that serves as the return path from signal handlers. When a signal handler completes execution, the trampoline is called to restore the original execution context and return to the interrupted code.


The ARM64 signal trampoline is implemented as:

  1. A page-aligned assembly function that makes syscall 139 (rt_sigreturn on AArch64 Linux)
  2. The function is padded to fill an entire 4096-byte page

The implementation in assembly is:

signal_trampoline:
    mov x8, #139   // Load syscall number 139 into x8 register
    svc #0         // Trigger supervisor call (system call)

This trampoline is accessed via the signal_trampoline_address() function, which returns its memory address for use during signal handler setup.

Sources: src/arch/aarch64.rs(L5 - L16)  src/arch/mod.rs(L19 - L25) 

Machine Context (MContext)

The MContext structure is responsible for storing the complete CPU state necessary to restore execution after signal handling. It captures all registers and processor state flags.

classDiagram
class MContext {
    +u64 fault_address
    +u64[31] regs
    +u64 sp
    +u64 pc
    +u64 pstate
    +MContextPadding __reserved
    +new(TrapFrame) MContext
    +restore(TrapFrame)
}

class MContextPadding {
    +u8[4096] 0
    
}

MContext  -->  MContextPadding

The MContext structure:

  • Is 16-byte aligned for optimal performance on ARM64
  • Contains all 31 general-purpose registers (x0-x30)
  • Stores critical CPU state including stack pointer, program counter, and processor state
  • Includes a large reserved padding area
  • Provides methods to create from and restore to a trap frame

This structure effectively captures the entire execution state that must be preserved during signal handling.

Sources: src/arch/aarch64.rs(L18 - L51) 

User Context (UContext)

The UContext structure provides a higher-level abstraction that combines the machine context with additional signal-related information. This matches the structure expected by user-level signal handlers.

classDiagram
class UContext {
    +usize flags
    +usize link
    +SignalStack stack
    +SignalSet sigmask
    +u8[] __unused
    +MContext mcontext
    +new(TrapFrame, SignalSet) UContext
}

class MContext {
    +u64 fault_address
    +u64[31] regs
    +u64 sp
    +u64 pc
    +u64 pstate
    +padding
    
}

class SignalStack {
    +stack attributes
    
}

class SignalSet {
    +signal mask bits
    
}

UContext  -->  MContext
UContext  -->  SignalStack
UContext  -->  SignalSet

The UContext structure includes:

  • Flags for context management
  • A link field that can point to another context
  • A SignalStack for defining the stack used during signal handling
  • A SignalSet representing the signal mask during handler execution
  • Reserved space to ensure proper sizing and alignment
  • The MContext containing all CPU registers and state

During signal handling, this structure is used to:

  1. Save the current execution context before calling the handler
  2. Configure the signal environment for the handler execution
  3. Restore the original context when the handler completes

Sources: src/arch/aarch64.rs(L53 - L75) 

Context Management Flow

The following diagram illustrates how the ARM64 implementation manages context during signal handling:

sequenceDiagram
    participant UserProcess as "User Process"
    participant KernelArceOS as "Kernel/ArceOS"
    participant SignalHandler as "Signal Handler"
    participant signal_trampoline as "signal_trampoline"

    UserProcess ->> KernelArceOS: Normal Execution
    KernelArceOS ->> KernelArceOS: Signal Received
    KernelArceOS ->> KernelArceOS: Create MContext from TrapFrame
    KernelArceOS ->> KernelArceOS: Create UContext with MContext
    KernelArceOS ->> SignalHandler: Set up and jump to handler with UContext
    SignalHandler ->> SignalHandler: Handle signal
    SignalHandler ->> signal_trampoline: Return via trampoline
    signal_trampoline ->> KernelArceOS: syscall rt_sigreturn (139)
    KernelArceOS ->> KernelArceOS: Extract MContext from UContext
    KernelArceOS ->> KernelArceOS: Restore TrapFrame from MContext
    KernelArceOS ->> UserProcess: Resume original execution

When a signal is delivered:

  1. The current CPU state is captured in a TrapFrame
  2. This state is converted to an MContext
  3. A UContext is built including the MContext and signal information
  4. The signal handler is called with this context
  5. When the handler returns, the signal trampoline is executed
  6. The syscall in the trampoline triggers the kernel to restore the original context
  7. Regular execution continues from where it was interrupted

Sources: src/arch/aarch64.rs(L34 - L45)  src/arch/aarch64.rs(L45 - L50)  src/arch/aarch64.rs(L65 - L74) 

Context Conversion Process

The ARM64 implementation provides efficient methods for converting between trap frames and contexts:


Creation Process

When creating an MContext from a TrapFrame, the following fields are mapped:

  • General registers (x0-x30) are copied directly
  • The user stack pointer (usp) becomes the stack pointer (sp)
  • The exception link register (elr) becomes the program counter (pc)
  • The saved program status register (spsr) becomes the processor state (pstate)

Restoration Process

When restoring a TrapFrame from an MContext, the reverse mappings occur:

  • General registers are copied back
  • The stack pointer is restored to usp
  • The program counter is restored to elr
  • The processor state is restored to spsr

This bidirectional conversion ensures that execution context is properly preserved during signal handling.
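A minimal sketch of this mapping, assuming TrapFrame fields named r, usp, elr, and spsr (the actual field names in axhal may differ):

```rust
// Hypothetical field names; illustrates the bidirectional mapping only.
#[derive(Default, Clone, Copy, PartialEq, Debug)]
pub struct TrapFrame {
    pub r: [u64; 31], // x0-x30
    pub usp: u64,     // user stack pointer
    pub elr: u64,     // exception link register
    pub spsr: u64,    // saved program status register
}

pub struct MContext {
    pub regs: [u64; 31],
    pub sp: u64,
    pub pc: u64,
    pub pstate: u64,
}

impl MContext {
    // Creation: capture a trap frame into a machine context.
    pub fn new(tf: &TrapFrame) -> Self {
        Self { regs: tf.r, sp: tf.usp, pc: tf.elr, pstate: tf.spsr }
    }

    // Restoration: write the saved state back into the trap frame.
    pub fn restore(&self, tf: &mut TrapFrame) {
        tf.r = self.regs;
        tf.usp = self.sp;
        tf.elr = self.pc;
        tf.spsr = self.pstate;
    }
}
```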

Sources: src/arch/aarch64.rs(L34 - L45)  src/arch/aarch64.rs(L45 - L50) 

Integration with Signal Handling System

The ARM64 implementation integrates with the overall signal handling system through the architecture abstraction layer defined in arch/mod.rs. This layer provides a unified interface for all supported architectures while allowing architecture-specific implementations of critical components.

flowchart TD
subgraph subGraph2["Signal Handling System"]
    thread_manager["ThreadSignalManager"]
    process_manager["ProcessSignalManager"]
end
subgraph subGraph1["ARM64 Implementation"]
    aarch64["aarch64.rs"]
    mcontext["MContext"]
    ucontext["UContext"]
    trampoline["signal_trampoline"]
end
subgraph subGraph0["Architecture Module"]
    arch_mod["arch/mod.rs"]
    trampoline_addr["signal_trampoline_address()"]
end

aarch64 --> mcontext
aarch64 --> trampoline
aarch64 --> ucontext
arch_mod --> aarch64
process_manager --> arch_mod
thread_manager --> arch_mod
trampoline_addr --> trampoline

The key integration points are:

  1. The architecture module exposes the signal_trampoline_address() function
  2. The signal handling system uses this function to set up signal handlers
  3. The MContext and UContext structures are used to manage execution state
  4. The architecture-specific context conversion methods are used during signal delivery and return

This abstraction allows the core signal handling logic to remain architecture-agnostic while leveraging the ARM64-specific implementation for context management.

Sources: src/arch/mod.rs(L1 - L17)  src/arch/mod.rs(L19 - L25) 

Summary

The ARM64 implementation provides the architecture-specific components required for signal handling on AArch64 processors:

  1. Signal Trampoline: A carefully positioned assembly function that makes the rt_sigreturn syscall
  2. Machine Context (MContext): A structure capturing all ARM64 CPU registers and state
  3. User Context (UContext): A higher-level structure combining machine context with signal information
  4. Context Management Methods: Functions to convert between trap frames and contexts

These components work together to ensure that signal handling can properly save and restore execution state on ARM64 platforms.

Sources: src/arch/aarch64.rs src/arch/mod.rs

RISC-V Implementation

Relevant source files

This document details the RISC-V architecture-specific implementation of signal handling in the axsignal crate. It covers the signal trampoline mechanism, context saving/restoring operations, and the data structures specific to RISC-V processors. For information about other architectures, see the corresponding implementation pages: x86_64 Implementation, ARM64 Implementation, and LoongArch64 Implementation.

RISC-V Signal Handling Architecture

The RISC-V signal handling implementation provides the architecture-specific components needed to save CPU state before executing a signal handler and to restore that state afterward. It consists of two main components:

  1. A signal trampoline implementation in assembly language
  2. Data structures for storing CPU context

The implementation supports both 32-bit (riscv32) and 64-bit (riscv64) RISC-V architectures through a unified module.

flowchart TD
subgraph subGraph1["Integration Points"]
    TSM["ThreadSignalManager"]
    TF["TrapFrame"]
end
subgraph subGraph0["Signal Handling System"]
    ARCH["arch/mod.rs"]
    RISCV["arch/riscv.rs"]
    TRAMP["signal_trampoline"]
    MCTX["MContext"]
    UCTX["UContext"]
end

ARCH --> RISCV
MCTX --> TF
MCTX --> UCTX
RISCV --> MCTX
RISCV --> TRAMP
RISCV --> UCTX
TF --> MCTX
TSM --> TRAMP
TSM --> UCTX

Sources: src/arch/mod.rs(L1 - L26)  src/arch/riscv.rs(L1 - L64) 

Signal Trampoline Implementation

The signal trampoline is a small piece of assembly code that serves as the bridge between signal handler execution and returning to normal execution. In RISC-V, it's implemented as a simple syscall wrapper that invokes syscall number 139 (sigreturn).

flowchart TD
subgraph subGraph0["Signal Trampoline Flow"]
    SH["Signal Handler"]
    ST["signal_trampoline"]
    SC["Syscall 139 (sigreturn)"]
    KR["Kernel Return Processing"]
    RT["Return to Normal Execution"]
end

KR --> RT
SC --> KR
SH --> ST
ST --> SC

The trampoline is defined in assembly and aligned to a 4096-byte page boundary:

flowchart TD
ASM["Assembly Code"]
TRAM["signal_trampoline"]
ECALL["syscall 139 (sigreturn)"]

ASM --> TRAM
TRAM --> ECALL

Sources: src/arch/riscv.rs(L5 - L16)  src/arch/mod.rs(L19 - L25) 

Context Data Structures

The RISC-V implementation defines two key structures for context management:

MContext Structure

MContext stores the essential machine context that needs to be saved and restored during signal handling.

classDiagram
class MContext {
    +usize pc
    -GeneralRegisters regs
    -usize[66] fpstate
    +new(TrapFrame) MContext
    +restore(TrapFrame) void
}

class TrapFrame {
    +usize sepc
    +GeneralRegisters regs
    
}

MContext  -->  TrapFrame : converts from/to

The structure contains:

  • pc: Program counter (stored as sepc in the trap frame)
  • regs: General-purpose registers from the GeneralRegisters structure
  • fpstate: Floating-point state (66 words of storage)

UContext Structure

UContext is a higher-level structure that encapsulates MContext along with additional signal-related information.


The structure contains:

  • flags: Context flags (not currently used, set to 0)
  • link: Link to another context (not currently used, set to 0)
  • stack: Signal stack information (type SignalStack)
  • sigmask: Signal mask (type SignalSet)
  • __unused: Padding to ensure proper structure alignment
  • mcontext: The machine context described above

Sources: src/arch/riscv.rs(L18 - L63) 

Context Operations

The RISC-V implementation provides two primary operations on context:

  1. Context Creation: Converting from a trap frame to an MContext/UContext
  2. Context Restoration: Restoring a trap frame from an MContext
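These two operations can be sketched in Rust as follows; the GeneralRegisters shape is a simplified stand-in for axhal's real structure:

```rust
// Simplified stand-in for axhal's GeneralRegisters.
#[derive(Default, Clone, Copy, PartialEq, Debug)]
pub struct GeneralRegisters {
    pub ra: usize,
    pub sp: usize,
    pub a0: usize,
    // ... remaining RISC-V general registers elided for brevity
}

#[derive(Default, Clone, Copy, PartialEq, Debug)]
pub struct TrapFrame {
    pub sepc: usize,
    pub regs: GeneralRegisters,
}

#[repr(C)]
pub struct MContext {
    pub pc: usize,
    pub regs: GeneralRegisters,
    pub fpstate: [usize; 66],
}

impl MContext {
    // Context creation: save the trap frame into a machine context.
    pub fn new(tf: &TrapFrame) -> Self {
        Self { pc: tf.sepc, regs: tf.regs, fpstate: [0; 66] }
    }

    // Context restoration: write the saved state back into the trap frame.
    pub fn restore(&self, tf: &mut TrapFrame) {
        tf.sepc = self.pc;
        tf.regs = self.regs;
    }
}
```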

Context Creation

When a signal is delivered, the current CPU state (represented by a TrapFrame) is saved into an MContext and then into a UContext.

sequenceDiagram
    participant ThreadSignalManager as ThreadSignalManager
    participant TrapFrame as TrapFrame
    participant MContext as MContext
    participant UContext as UContext

    ThreadSignalManager ->> UContext: new(tf, sigmask)
    UContext ->> MContext: new(tf)
    MContext ->> TrapFrame: read sepc to pc
    MContext ->> TrapFrame: copy regs
    MContext ->> MContext: initialize fpstate to zeros
    UContext ->> UContext: initialize other fields

Context Restoration

After signal handler execution, the saved context is restored to continue normal execution.

sequenceDiagram
    participant ThreadSignalManager as ThreadSignalManager
    participant TrapFrame as TrapFrame
    participant MContext as MContext

    ThreadSignalManager ->> MContext: restore(tf)
    MContext ->> TrapFrame: write pc to sepc
    MContext ->> TrapFrame: copy regs

Sources: src/arch/riscv.rs(L27 - L38)  src/arch/riscv.rs(L53 - L62) 

Integration with Signal Handling System

The RISC-V implementation integrates with the rest of the signal handling system through the architecture abstraction layer defined in arch/mod.rs. This layer selects the appropriate architecture-specific implementation at compile time based on the target architecture.

flowchart TD
subgraph subGraph1["Signal Processing"]
    TSM["ThreadSignalManager"]
    TADDR["signal_trampoline_address()"]
    TRAMP["signal_trampoline"]
    SH["Signal Handler"]
end
subgraph subGraph0["Architecture Selection"]
    ARCH["arch/mod.rs"]
    X86["x86_64.rs"]
    RISCV["riscv.rs"]
    ARM["aarch64.rs"]
    LOONG["loongarch64.rs"]
end

ARCH --> ARM
ARCH --> LOONG
ARCH --> RISCV
ARCH --> X86
SH --> TRAMP
TADDR --> TRAMP
TSM --> SH
TSM --> TADDR

Key integration points:

  1. The signal_trampoline_address() function provides the address of the architecture-specific trampoline implementation
  2. ThreadSignalManager uses the context structures to save and restore CPU state

Sources: src/arch/mod.rs(L1 - L26) 

Technical Details

Signal Trampoline Memory Layout

The signal trampoline is carefully aligned to a 4096-byte page boundary and padded to fill an entire page. This is important for security and memory protection:

.section .text
.balign 4096
.global signal_trampoline
signal_trampoline:
    li a7, 139     # Load syscall number 139 (sigreturn) into a7
    ecall          # Execute syscall
.fill 4096 - (. - signal_trampoline), 1, 0  # Fill remainder of page with zeros

The trampoline simply loads the sigreturn syscall number (139) into register a7 and executes the syscall instruction.

RISC-V Register Handling

The MContext structure saves the program counter (PC) separately from the general registers. During restoration:

  • The program counter is restored to the sepc (Supervisor Exception Program Counter) field of the trap frame
  • The general registers are copied directly between the trap frame and MContext

The floating-point state (fpstate) is currently initialized to zeros but provides space for future implementations to save floating-point registers.

Sources: src/arch/riscv.rs(L5 - L16)  src/arch/riscv.rs(L27 - L38) 

Summary

The RISC-V implementation in the axsignal crate provides the architecture-specific components needed for signal handling on RISC-V processors. It defines the data structures for saving and restoring CPU context (MContext and UContext) and implements the signal trampoline needed to return from signal handlers. The implementation supports both 32-bit and 64-bit RISC-V architectures through a single module.

The architecture-specific implementation is selected at compile time based on the target architecture, ensuring that the appropriate code is used without runtime overhead.

Sources: src/arch/mod.rs(L1 - L26)  src/arch/riscv.rs(L1 - L64) 

LoongArch64 Implementation

Relevant source files

Purpose and Scope

This document describes the LoongArch64-specific implementation of signal handling in the axsignal crate. It details the architecture-specific data structures, register context management, and signal trampoline mechanism that enable Unix-like signal handling on the LoongArch64 architecture. For a general overview of the architecture support system, see Architecture Support.

Signal Context Management

The LoongArch64 implementation provides specialized structures for managing CPU context during signal handling operations. These structures are critical for preserving and restoring the CPU state when a signal handler is invoked and when it returns.

Context Structures

classDiagram
class TrapFrame {
    +regs: [u64; 32]
    +era: usize
    +Other architecture-specific registers
    
}

class MContext {
    +sc_pc: u64
    +sc_regs: [u64; 32]
    +sc_flags: u32
    +new(tf: &TrapFrame)
    +restore(&self, tf: &mut TrapFrame)
}

class UContext {
    +flags: usize
    +link: usize
    +stack: SignalStack
    +sigmask: SignalSet
    +__unused: [u8; ...]
    +mcontext: MContext
    +new(tf: &TrapFrame, sigmask: SignalSet)
}

class SignalSet {
    
    
}

class SignalStack {
    
    
}

TrapFrame  -->  MContext : converted to
MContext  -->  UContext : contained in
SignalSet  -->  UContext : contained in
SignalStack  -->  UContext : contained in

The LoongArch64 implementation defines two main context structures:

  1. MContext (Machine Context): Stores the CPU register state for LoongArch64
  • sc_pc: Program counter (instruction pointer)
  • sc_regs: Array of 32 general-purpose registers
  • sc_flags: Context flags
  2. UContext (User Context): Encapsulates the complete execution context
  • flags: Context flags
  • link: Pointer to linked context
  • stack: Signal stack information
  • sigmask: Signal mask in effect
  • mcontext: Machine context (CPU registers)

Sources: src/arch/loongarch64.rs(L20 - L67) 

Signal Trampoline

The signal trampoline is a critical piece of assembly code that provides a reliable mechanism for returning from signal handlers. It executes a system call (rt_sigreturn) to restore the original execution context.

flowchart TD
SignalDelivery["Signal Delivery"]
SetupStack["Set up Handler Stack"]
SaveContext["Save Current Context"]
InvokeHandler["Invoke Signal Handler"]
SignalTrampoline["Signal Trampoline"]
SyscallRtSigreturn["Syscall rt_sigreturn (139)"]
RestoreContext["Restore Original Context"]
ResumeExecution["Resume Original Execution"]

InvokeHandler --> SignalTrampoline
RestoreContext --> ResumeExecution
SaveContext --> InvokeHandler
SetupStack --> SaveContext
SignalDelivery --> SetupStack
SignalTrampoline --> SyscallRtSigreturn
SyscallRtSigreturn --> RestoreContext

The LoongArch64 signal trampoline is implemented in assembly:

signal_trampoline:
    li.w    $a7, 139    # Load syscall number 139 (rt_sigreturn)
    syscall 0           # Make syscall

The trampoline is aligned on a 4096-byte boundary and padded to fill a full page, ensuring it has a predictable memory layout. When the signal handler completes, execution flows to this trampoline, which performs syscall 139 (rt_sigreturn) to restore the original execution context.

Sources: src/arch/loongarch64.rs(L7 - L18)  src/arch/mod.rs(L19 - L25) 

Context Conversion and Restoration

The LoongArch64 implementation provides methods to convert between the TrapFrame structure (used by the kernel) and the MContext structure (used for signal handling).

sequenceDiagram
    participant Kernel as "Kernel"
    participant SignalManager as "Signal Manager"
    participant SignalHandler as "Signal Handler"
    participant SignalTrampoline as SignalTrampoline

    Kernel ->> SignalManager: Deliver signal with TrapFrame
    SignalManager ->> SignalManager: Create MContext from TrapFrame
    SignalManager ->> SignalManager: Create UContext with MContext
    SignalManager ->> SignalHandler: Invoke with UContext pointer
    SignalHandler -->> SignalTrampoline: Return
    SignalTrampoline ->> Kernel: rt_sigreturn syscall
    Kernel ->> SignalManager: Restore TrapFrame from UContext
    SignalManager ->> SignalManager: MContext.restore(TrapFrame)
    SignalManager ->> Kernel: Resume execution

Context Creation

When a signal is delivered, the system creates an MContext from the current TrapFrame:

  1. The MContext::new method creates a new machine context from a trap frame
  2. It copies the program counter (era) and all 32 general-purpose registers

Context Restoration

When a signal handler returns, the system restores the original TrapFrame from the saved MContext:

  1. The MContext::restore method updates the trap frame with saved values
  2. It restores the program counter (era) and all 32 general-purpose registers

Sources: src/arch/loongarch64.rs(L28 - L43) 

Memory Layout for Signal Handling

When a signal is delivered, the system sets up a specific memory layout on the user stack to facilitate signal handling.

flowchart TD
subgraph subGraph0["Signal Handler Stack Layout"]
    signalHandler["Signal Handler Function"]
    signoArg["Signal Number (Argument 1)"]
    siginfoArg["SignalInfo Pointer (Argument 2)"]
    ucontextArg["UContext Pointer (Argument 3)"]
    returnAddress["Return Address (signal_trampoline)"]
    savedRegisters["Saved Registers (MContext)"]
end
stackGrowth["Stack Growth Direction ↓"]

returnAddress --> savedRegisters
siginfoArg --> ucontextArg
signalHandler --> signoArg
signoArg --> siginfoArg
stackGrowth --> signalHandler
ucontextArg --> returnAddress

The key components of this memory layout are:

  1. Signal Handler Function: The entry point for the signal handler
  2. Arguments: Three arguments are passed to the handler:
  • Signal number
  • Pointer to signal information
  • Pointer to user context (UContext)
  3. Return Address: Set to the signal_trampoline function
  4. Saved Context: The complete user context (UContext) including:
  • Signal mask
  • Signal stack information
  • Machine context (registers)

This layout ensures that when the signal handler returns, it will jump to the signal trampoline, which will restore the original execution context through the rt_sigreturn syscall.

Sources: src/arch/loongarch64.rs(L7 - L18)  src/arch/loongarch64.rs(L45 - L67) 

Comparison with Other Architectures

The LoongArch64 implementation shares many similarities with other RISC architectures in the axsignal crate, particularly with RISC-V. However, there are architecture-specific differences in register naming and context structure.

Feature           | LoongArch64         | RISC-V              | x86_64          | AArch64
------------------|---------------------|---------------------|-----------------|----------------
PC Register       | era                 | sepc                | rip             | elr_el1
Register Count    | 32 GP registers     | 32 GP registers     | 16 GP registers | 31 GP registers
Context Flags     | Simple 32-bit flags | Simple 32-bit flags | EFLAGS/XSAVE    | PSTATE flags
Signal Trampoline | Syscall 139         | Syscall 139         | Syscall 15      | Syscall 139

The main architecture-specific aspects of the LoongArch64 implementation include:

  1. Register Set: LoongArch64 has 32 general-purpose registers (like RISC-V)
  2. Program Counter: Called era (Exception Return Address)
  3. Assembly Instructions: Uses LoongArch64-specific instructions like li.w and syscall

Sources: src/arch/loongarch64.rs(L20 - L43) 

Integration with Signal Handling System

The LoongArch64 implementation integrates with the broader signal handling system through the architecture abstraction layer in src/arch/mod.rs.


The integration points include:

  1. Architecture Selection: Conditional compilation selects the LoongArch64 implementation based on the target architecture
  2. Signal Trampoline Address: Exposed through a common function to get the address of the architecture-specific signal trampoline
  3. Context Management: The architecture-specific UContext and MContext structures are used by the signal manager to save and restore execution context

Sources: src/arch/mod.rs(L1 - L25) 

Summary

The LoongArch64 implementation in the axsignal crate provides the architecture-specific components needed for Unix-like signal handling on LoongArch64 processors. It includes:

  1. A signal trampoline mechanism for returning from signal handlers
  2. Machine context (MContext) and user context (UContext) structures for saving and restoring CPU state
  3. Methods for converting between trap frames and machine contexts
  4. Integration with the architecture-independent signal handling system

These components enable the axsignal crate to provide a consistent signal handling API across different architectures, including LoongArch64.

Build Configuration and Dependencies

Relevant source files

Purpose and Scope

This document details the build configuration and dependency management aspects of the axsignal crate. It explains how the crate is configured for different target architectures, its external dependencies, build-time configuration mechanisms, and integration with the ArceOS ecosystem. For information about the signal handling implementation details, refer to Signal Management System and Architecture Support.

Dependency Structure

The axsignal crate is designed with carefully selected dependencies to provide Unix-like signal handling functionality within the ArceOS framework.

flowchart TD
subgraph subGraph2["Patched Dependencies"]
    page_table_multiarch["page_table_multiarch"]
    page_table_entry["page_table_entry"]
end
subgraph subGraph1["ArceOS Dependencies"]
    axconfig["axconfig"]
    axhal["axhal (with uspace feature)"]
    axtask["axtask (with multitask feature)"]
end
subgraph subGraph0["Core Dependencies"]
    axerrno["axerrno (0.1.0)"]
    bitflags["bitflags (2.6)"]
    cfg_if["cfg-if (1.0.0)"]
    linux_raw_sys["linux-raw-sys (0.9.3)"]
    log["log (0.4)"]
    strum_macros["strum_macros (0.27.1)"]
    lock_api["lock_api (0.4.12)"]
    derive_more["derive_more (2.0.1)"]
end
axsignal["axsignal Crate"]

axhal --> page_table_entry
axhal --> page_table_multiarch
axsignal --> axconfig
axsignal --> axerrno
axsignal --> axhal
axsignal --> axtask
axsignal --> bitflags
axsignal --> cfg_if
axsignal --> derive_more
axsignal --> linux_raw_sys
axsignal --> lock_api
axsignal --> log
axsignal --> strum_macros

Diagram: Dependency Structure of axsignal Crate

Sources: Cargo.toml(L6 - L26) 

Standard Dependencies

The axsignal crate relies on several standard Rust crates:

Dependency    | Version | Purpose
--------------|---------|----------------------------------------------------------
axerrno       | 0.1.0   | Provides error code definitions for system calls
bitflags      | 2.6     | Used for creating type-safe bit flags (e.g., signal sets)
cfg-if        | 1.0.0   | Simplifies conditional compilation
linux-raw-sys | 0.9.3   | Provides low-level Linux system call definitions
log           | 0.4     | Logging functionality
strum_macros  | 0.27.1  | Used for enum string conversions
lock_api      | 0.4.12  | Abstractions for synchronization primitives
derive_more   | 2.0.1   | Additional derive macros for common traits

The linux-raw-sys dependency is configured with default-features = false and explicitly enables the general and no_std features, ensuring compatibility with the no_std environment of ArceOS.
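Expressed in Cargo.toml terms, this corresponds to a declaration like the following (a sketch consistent with the description; the exact entry lives in the crate's manifest):

```toml
linux-raw-sys = { version = "0.9.3", default-features = false, features = [
    "general",
    "no_std",
] }
```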

Sources: Cargo.toml(L6 - L26) 

ArceOS Dependencies

The crate integrates with ArceOS through the following dependencies:

Dependency | Features  | Purpose
-----------|-----------|----------------------------------------------------
axconfig   | (none)    | Configuration constants and parameters from ArceOS
axhal      | uspace    | Hardware abstraction layer with userspace support
axtask     | multitask | Task/thread management system

These dependencies are sourced directly from the ArceOS GitHub repository:

axconfig = { git = "https://github.com/oscomp/arceos.git" }
axhal = { git = "https://github.com/oscomp/arceos.git", features = ["uspace"] }
axtask = { git = "https://github.com/oscomp/arceos.git", features = ["multitask"] }

Sources: Cargo.toml(L10 - L14) 

Dependency Patches

The axsignal crate applies patches to two dependencies:

[patch.crates-io]
page_table_multiarch = { git = "https://github.com/Mivik/page_table_multiarch.git", rev = "19ededd" }
page_table_entry = { git = "https://github.com/Mivik/page_table_multiarch.git", rev = "19ededd" }

These patches ensure compatibility with the specific memory management requirements of ArceOS by using patched versions of the page table libraries.

Sources: Cargo.toml(L28 - L30) 

Architecture-Specific Build Configuration

The axsignal crate is designed to support multiple CPU architectures, with different implementation details for each. The build system automatically configures the appropriate architecture-specific code based on the target platform.

flowchart TD
subgraph subGraph2["Implementation Files"]
    arch_mod["arch/mod.rs"]
    x86_64_impl["arch/x86_64.rs"]
    aarch64_impl["arch/aarch64.rs"]
    riscv_impl["arch/riscv.rs"]
    loongarch64_impl["arch/loongarch64.rs"]
end
subgraph subGraph1["Supported Architectures"]
    x86_64["x86_64"]
    x86["x86"]
    powerpc["powerpc"]
    powerpc64["powerpc64"]
    s390x["s390x"]
    arm["arm"]
    aarch64["aarch64"]
    other["Other Architectures"]
end
subgraph subGraph0["Architecture-Specific Config"]
    sa_restorer_cfg["sa_restorer cfg flag"]
end
build_rs["build.rs"]
target["CARGO_CFG_TARGET_ARCH"]

aarch64 --> sa_restorer_cfg
arch_mod --> aarch64_impl
arch_mod --> loongarch64_impl
arch_mod --> riscv_impl
arch_mod --> x86_64_impl
arm --> sa_restorer_cfg
build_rs --> target
other --> sa_restorer_cfg
powerpc --> sa_restorer_cfg
powerpc64 --> sa_restorer_cfg
s390x --> sa_restorer_cfg
sa_restorer_cfg --> arch_mod
target --> sa_restorer_cfg
x86 --> sa_restorer_cfg

Diagram: Architecture-Specific Build Configuration

Sources: build.rs(L1 - L25) 

The sa_restorer Configuration

The build.rs script creates a configuration flag called sa_restorer that is enabled only for specific architectures. This flag is used to conditionally compile code that handles the signal return trampoline mechanism:

fn main() {
    let target_arch = std::env::var("CARGO_CFG_TARGET_ARCH").unwrap();
    alias(
        "sa_restorer",
        [
            "x86_64",
            "x86",
            "powerpc",
            "powerpc64",
            "s390x",
            "arm",
            "aarch64",
        ]
        .contains(&target_arch.as_str()),
    );
}

The sa_restorer feature is architecture-dependent because only certain architectures support or require a dedicated signal return trampoline. This configuration flag allows the signal handling implementation to adapt to the specifics of each architecture.

Sources: build.rs(L1 - L15) 

Build Script Helper Function

The build script uses a helper function called alias to create the configuration flag:

fn alias(alias: &str, has_feature: bool) {
    println!("cargo:rustc-check-cfg=cfg({alias})");
    if has_feature {
        println!("cargo:rustc-cfg={alias}");
    }
}

This function:

  1. Declares the existence of the configuration option via cargo:rustc-check-cfg
  2. Conditionally enables the configuration via cargo:rustc-cfg

Sources: build.rs(L18 - L25) 

Conditional Compilation Structure

The axsignal crate makes extensive use of Rust's conditional compilation features to adapt to different environments and architectures. This approach allows the code to maintain compatibility with multiple platforms while minimizing redundancy.

flowchart TD
subgraph subGraph2["Conditional Code Paths"]
    sa_restorer_code["Signal Restorer Logic"]
    arch_specific["Architecture-Specific Signal Context"]
    common_code["Common Signal Code"]
end
subgraph subGraph1["Architecture Implementations"]
    x86_64_impl["x86_64 Implementation"]
    aarch64_impl["aarch64 Implementation"]
    riscv_impl["RISC-V Implementation"]
    loongarch_impl["LoongArch64 Implementation"]
end
subgraph subGraph0["Compilation Conditions"]
    target_arch["Target Architecture"]
    sa_restorer["sa_restorer Feature"]
    feature_flags["Feature Flags"]
end
crate["axsignal Crate"]
build["build.rs"]

aarch64_impl --> arch_specific
arch_specific --> common_code
build --> sa_restorer
crate --> target_arch
loongarch_impl --> arch_specific
riscv_impl --> arch_specific
sa_restorer --> sa_restorer_code
target_arch --> aarch64_impl
target_arch --> loongarch_impl
target_arch --> riscv_impl
target_arch --> x86_64_impl
x86_64_impl --> arch_specific

Diagram: Conditional Compilation Structure

Sources: build.rs(L1 - L25)  Cargo.toml(L6 - L26) 

Target Architecture Selection

The cfg-if crate is used throughout the codebase to selectively include architecture-specific implementations based on the target architecture. For example, in the arch/mod.rs file, different architecture-specific modules would be conditionally included:

cfg_if::cfg_if! {
    if #[cfg(target_arch = "x86_64")] {
        mod x86_64;
        pub use self::x86_64::*;
    } else if #[cfg(target_arch = "aarch64")] {
        mod aarch64;
        pub use self::aarch64::*;
    } else if #[cfg(any(target_arch = "riscv32", target_arch = "riscv64"))] {
        mod riscv;
        pub use self::riscv::*;
    } else if #[cfg(target_arch = "loongarch64")] {
        mod loongarch64;
        pub use self::loongarch64::*;
    } else {
        compile_error!("Unsupported target architecture");
    }
}

This pattern ensures that only the appropriate architecture-specific code is compiled into the final binary.

Sources: Cargo.toml(L16) 

The sa_restorer Configuration

The sa_restorer configuration flag created by the build script enables conditional compilation of code related to the signal restoration mechanism. In architectures that support sa_restorer, the signal action structure will include an additional field for the restorer function pointer.

For example, code using this flag might look like:

pub struct SignalOSAction {
    pub handler: usize,
    pub flags: SaFlags,
    pub mask: SignalSet,
    #[cfg(sa_restorer)]
    pub restorer: usize,
}

This conditional field ensures that the signal action structure is correctly defined for each supported architecture.

Sources: build.rs(L1 - L15) 

Integration with ArceOS

The axsignal crate is designed to integrate seamlessly with the ArceOS operating system kernel. This integration is facilitated by the dependency specifications in the Cargo.toml file and the design of the signal handling interfaces.

flowchart TD
subgraph subGraph1["axsignal Integration"]
    axsignal["axsignal Crate"]
    managers["Signal Managers"]
    arch_support["Architecture Support"]
    signal_types["Signal Types"]
end
subgraph subGraph0["ArceOS Ecosystem"]
    arceos["ArceOS Kernel"]
    axconfig["axconfig"]
    axhal["axhal"]
    axtask["axtask"]
    other_modules["Other ArceOS Modules"]
end

arceos --> axconfig
arceos --> axhal
arceos --> axsignal
arceos --> axtask
arceos --> other_modules
axhal --> arch_support
axsignal --> axconfig
axsignal --> axhal
axsignal --> axtask
axtask --> managers

Diagram: Integration with ArceOS Ecosystem

Sources: Cargo.toml(L10 - L14) 

Dependency on axhal

The axhal dependency is included with the uspace feature enabled:

axhal = { git = "https://github.com/oscomp/arceos.git", features = ["uspace"] }

This dependency provides the hardware abstraction layer functionalities required for signal handling, such as:

  1. Access to trap frames and CPU context management
  2. Architecture-specific operations for signal handling
  3. Userspace support for delivering signals to user applications

The uspace feature specifically enables the userspace support components in axhal that are necessary for implementing signal handling in a user/kernel separated environment.

Sources: Cargo.toml(L11) 

Dependency on axtask

The axtask dependency is included with the multitask feature enabled:

axtask = { git = "https://github.com/oscomp/arceos.git", features = ["multitask"] }

This dependency provides the task/thread management system that axsignal uses to:

  1. Associate signal handlers with specific threads
  2. Manage signal delivery to the appropriate targets
  3. Coordinate the execution and scheduling of signal handlers

The multitask feature ensures that proper thread management capabilities are available, which is essential for implementing per-thread signal handling.

Sources: Cargo.toml(L12 - L14) 

Dependency on axconfig

The axconfig dependency contains ArceOS configuration constants and parameters:

axconfig = { git = "https://github.com/oscomp/arceos.git" }

This dependency provides configuration settings that affect signal handling behavior, such as:

  1. Maximum number of concurrent signals
  2. Sizes of signal-related buffers
  3. System-wide constants affecting signal delivery

Sources: Cargo.toml(L10) 

Build-Time Configuration Summary

The following table summarizes the key build-time configuration aspects of the axsignal crate:

| Configuration Aspect | Mechanism | Purpose |
| --- | --- | --- |
| Architecture Support | Target architecture detection | Select appropriate architecture-specific implementation |
| sa_restorer Feature | build.rs script | Enable/disable restorer functionality based on architecture |
| ArceOS Integration | Git dependencies | Connect with other ArceOS components |
| Feature Flags | Cargo features | Enable specific functionality (e.g., uspace, multitask) |
| Dependency Patching | Cargo [patch] section | Ensure compatibility with specific dependency versions |

Sources: build.rs(L1 - L25)  Cargo.toml(L1 - L31) 

Conclusion

The build configuration and dependency management of the axsignal crate are designed to support a flexible, cross-architecture signal handling implementation that integrates seamlessly with the ArceOS ecosystem. The crate uses conditional compilation extensively to adapt to different target architectures while maintaining a clean and maintainable codebase.

The build script provides architecture-specific configurations, while carefully selected dependencies enable the crate to leverage existing ArceOS components for tasks such as thread management and hardware abstraction. This approach allows axsignal to provide Unix-like signal handling capabilities across multiple architectures with minimal redundancy and maximum compatibility.

Overview

Relevant source files

The axptr library provides a safe abstraction for kernel code to access user-space memory. It prevents the kernel from crashing when accessing potentially invalid user memory while providing a convenient API for common user memory operations. This page introduces the main components of axptr and provides a high-level understanding of its architecture.

For detailed information about the specific pointer types used for memory safety, see User Space Pointers. For comprehensive information about the safety mechanisms, see Safety Mechanisms.

Purpose and Scope

axptr addresses a common challenge in operating system development: safely accessing memory that belongs to user processes. User-provided pointers can't be trusted directly because they might:

  • Point to invalid memory addresses
  • Have insufficient access permissions
  • Be improperly aligned
  • Cause page faults that could crash the kernel

This library provides a robust solution by wrapping raw pointers with safety checks and contextual page fault handling.

Sources: src/lib.rs(L1) 

Key Components

flowchart TD
subgraph subGraph0["axptr Library"]
    A["UserPtr"]
    D["User-space Memory"]
    B["UserConstPtr"]
    C["AddrSpaceProvider"]
    E["Address Space Management"]
    F["Safety Mechanisms"]
    G["Kernel Crash Prevention"]
end
H["Kernel Code"]

A --> D
B --> D
C --> E
F --> G
H --> A
H --> B

The library consists of several key components that work together to provide safe access to user-space memory:

  1. User pointers:
  • UserPtr<T>: A wrapper around *mut T for safe mutable access to user memory
  • UserConstPtr<T>: A wrapper around *const T for safe read-only access to user memory
  2. Address space abstraction:
  • AddrSpaceProvider: A trait that abstracts operations for working with address spaces
  3. Safety mechanisms:
  • Alignment checking
  • Access permission validation
  • Page table population
  • Context-aware page fault handling

Sources: src/lib.rs(L128 - L170)  src/lib.rs(L219 - L254)  src/lib.rs(L119 - L126)  src/lib.rs(L31 - L54) 

Memory Access Flow

The following diagram illustrates the typical flow when kernel code accesses user-space memory through axptr:

sequenceDiagram
    participant KernelCode as "Kernel Code"
    participant UserPtrUserConstPtr as "UserPtr/UserConstPtr"
    participant AddrSpaceProvider as "AddrSpaceProvider"
    participant check_region as "check_region()"
    participant UserMemory as "User Memory"

    KernelCode ->> UserPtrUserConstPtr: Request access (get/get_as_slice)
    UserPtrUserConstPtr ->> check_region: check_region_with()
    check_region ->> AddrSpaceProvider: with_addr_space()
    AddrSpaceProvider ->> check_region: Provide AddrSpace
    check_region ->> check_region: Check alignment
    check_region ->> check_region: Verify access permissions
    check_region ->> check_region: Populate page tables
    alt Memory checks pass
        check_region -->> UserPtrUserConstPtr: Return Ok(())
        UserPtrUserConstPtr ->> UserMemory: Set ACCESSING_USER_MEM = true
        UserPtrUserConstPtr ->> UserMemory: Access memory safely
        UserPtrUserConstPtr ->> UserMemory: Set ACCESSING_USER_MEM = false
        UserPtrUserConstPtr -->> KernelCode: Return reference to user memory
    else Memory checks fail
        check_region -->> UserPtrUserConstPtr: Return Err(EFAULT)
        UserPtrUserConstPtr -->> KernelCode: Propagate error
    end

When a kernel function wants to access user memory:

  1. It calls a method like get() or get_as_slice() on a user pointer
  2. The user pointer performs safety checks through check_region()
  3. If checks pass, the pointer accesses memory with special handling for page faults
  4. A reference to the memory is returned or an error if access is invalid

Sources: src/lib.rs(L171 - L198)  src/lib.rs(L256 - L277)  src/lib.rs(L31 - L54)  src/lib.rs(L22 - L29) 

Code Architecture

The following diagram shows the relationship between the main types and their important methods:

classDiagram
class UserPtr~T~ {
    +*mut T pointer
    +const ACCESS_FLAGS: MappingFlags
    +address() VirtAddr
    +as_ptr() *mut T
    +cast~U~() UserPtr~U~
    +is_null() bool
    +nullable() Option~Self~
    +get() LinuxResult~&mut T~
    +get_as_slice() LinuxResult~&mut [T]~
    +get_as_null_terminated() LinuxResult~&mut [T]~
}

class UserConstPtr~T~ {
    +*const T pointer
    +const ACCESS_FLAGS: MappingFlags
    +address() VirtAddr
    +as_ptr() *const T
    +cast~U~() UserConstPtr~U~
    +is_null() bool
    +nullable() Option~Self~
    +get() LinuxResult~&T~
    +get_as_slice() LinuxResult~&[T]~
    +get_as_null_terminated() LinuxResult~&[T]~
    +get_as_str() LinuxResult~&str~
}

class AddrSpaceProvider {
    <<trait>>
    
    +with_addr_space(f) R
}

class SafetyFunctions {
    <<functions>>
    
    +check_region()
    +check_null_terminated()
    +is_accessing_user_memory()
    +access_user_memory()
}

UserPtr  -->  SafetyFunctions : uses
UserConstPtr  -->  SafetyFunctions : uses
UserPtr  -->  AddrSpaceProvider : requires
UserConstPtr  -->  AddrSpaceProvider : requires

The architecture follows these principles:

  • Separate types for mutable (UserPtr) and read-only (UserConstPtr) access
  • A trait (AddrSpaceProvider) to abstract address space operations
  • Helper functions to manage safety checks and context-aware page fault handling
  • Methods on user pointer types for common operations like getting a single value, a slice, or a null-terminated array

Sources: src/lib.rs(L128 - L217)  src/lib.rs(L219 - L303)  src/lib.rs(L119 - L126)  src/lib.rs(L18 - L107) 

Context-Aware Page Fault Handling

One of the key safety features of axptr is context-aware page fault handling:

flowchart TD
A["Kernel attempts to access user memory"]
B["axptr sets ACCESSING_USER_MEM = true"]
C["Memory access occurs"]
D["Page fault?"]
E["OS page fault handler checks is_accessing_user_memory()"]
F["is_accessing_user_memory() == true?"]
G["Handle as user memory fault (non-fatal)"]
H["Handle as kernel fault (potentially fatal)"]
I["Memory access completes normally"]
J["axptr sets ACCESSING_USER_MEM = false"]

A --> B
B --> C
C --> D
D --> E
D --> I
E --> F
F --> G
F --> H
G --> J
I --> J

This mechanism allows the OS to distinguish between:

  • Page faults that occur when intentionally accessing user memory (expected and should be handled gracefully)
  • Page faults in kernel code (may indicate a kernel bug and could be treated more severely)

The is_accessing_user_memory() function is provided for OS page fault handlers to check this context.
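
The interplay between the flag and a fault handler can be modeled with a small self-contained sketch. A thread-local Cell stands in for the real per-CPU variable, and the handler logic is hypothetical:

```rust
use std::cell::Cell;

thread_local! {
    // Stands in for axptr's per-CPU ACCESSING_USER_MEM variable.
    static ACCESSING_USER_MEM: Cell<bool> = Cell::new(false);
}

fn is_accessing_user_memory() -> bool {
    ACCESSING_USER_MEM.with(|f| f.get())
}

// Runs `f` with the flag raised, mirroring how axptr brackets
// intentional user-memory accesses.
fn access_user_memory<R>(f: impl FnOnce() -> R) -> R {
    ACCESSING_USER_MEM.with(|flag| {
        flag.set(true);
        let result = f();
        flag.set(false);
        result
    })
}

// A hypothetical page fault handler deciding how severe a fault is.
fn classify_page_fault() -> &'static str {
    if is_accessing_user_memory() {
        "user memory fault (recoverable)"
    } else {
        "kernel fault (potentially fatal)"
    }
}

fn main() {
    assert_eq!(classify_page_fault(), "kernel fault (potentially fatal)");
    assert_eq!(
        access_user_memory(classify_page_fault),
        "user memory fault (recoverable)"
    );
}
```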

Sources: src/lib.rs(L11 - L20)  src/lib.rs(L22 - L29) 

Dependencies

axptr has the following key dependencies:

| Dependency | Purpose |
| --- | --- |
| axerrno | Provides error codes and result types (LinuxError, LinuxResult) |
| axmm | Memory management; provides AddrSpace |
| memory_addr | Virtual address manipulation |
| page_table_multiarch | Page table and memory mapping flags |
| percpu | Per-CPU variable support |

Sources: src/lib.rs(L4 - L7)  Cargo.toml(L7 - L12) 

Conclusion

The axptr library provides a comprehensive solution for safely accessing user-space memory from kernel code. By wrapping raw pointers in smart container types that perform necessary safety checks and implement context-aware page fault handling, it helps prevent kernel crashes while providing a convenient API.

For more detailed information about specific components, refer to the following pages:

  • Memory Safety Architecture
  • User Space Pointers
  • Address Space Management
  • Safety Mechanisms

Memory Safety Architecture

Relevant source files

This document explains the core architecture and design principles of the memory safety system in the axptr library. It focuses on how the system provides a safe interface for kernel code to access user-space memory while preventing potential security vulnerabilities or system crashes. For details about specific pointer types, see User Space Pointers, and for information about safety mechanisms, see Safety Mechanisms.

Overview of Memory Safety Architecture

The axptr library implements a robust architecture to ensure memory operations across privilege boundaries (kernel accessing user memory) remain safe. The architecture is built around three key principles:

  1. Type-safe access - Using strongly-typed pointer wrappers
  2. Memory region validation - Ensuring pointers reference valid user memory regions
  3. Context-aware fault handling - Managing page faults during user memory access

Sources: src/lib.rs(L129 - L183)  src/lib.rs(L219 - L254) 

Core Components

The memory safety architecture consists of these fundamental components:

| Component | Description | Role |
| --- | --- | --- |
| UserPtr | Typed wrapper for mutable user pointers | Provides safe access to user memory with read/write permissions |
| UserConstPtr | Typed wrapper for immutable user pointers | Provides safe access to user memory with read-only permissions |
| AddrSpaceProvider | Trait for address space operations | Abstracts address space lookup and access control |
| Memory checking functions | Safety validation utilities | Verify memory region alignment, permissions, and availability |
| Context tracking | Page fault handling mechanism | Manages page faults during user memory access |

classDiagram
class UserPtr~T~ {
    +*mut T pointer
    +const ACCESS_FLAGS: MappingFlags
    +get()
    +get_as_slice()
    +get_as_null_terminated()
}

class UserConstPtr~T~ {
    +*const T pointer
    +const ACCESS_FLAGS: MappingFlags
    +get()
    +get_as_slice()
    +get_as_null_terminated()
}

class AddrSpaceProvider {
    <<trait>>
    
    +with_addr_space()
}

UserPtr  -->  AddrSpaceProvider : uses
UserConstPtr  -->  AddrSpaceProvider : uses

Sources: src/lib.rs(L119 - L126)  src/lib.rs(L129 - L134)  src/lib.rs(L219 - L225) 

Memory Access Workflow

The core workflow for safely accessing user memory follows these steps:

sequenceDiagram
    participant KernelCode as "Kernel Code"
    participant UserPtrUserConstPtr as "UserPtr/UserConstPtr"
    participant check_region as "check_region()"
    participant ACCESSING_USER_MEMflag as "ACCESSING_USER_MEM flag"
    participant UserMemory as "User Memory"

    KernelCode ->> UserPtrUserConstPtr: Request user memory access
    UserPtrUserConstPtr ->> check_region: Validate memory region
    check_region ->> check_region: Check alignment
    check_region ->> check_region: Verify access permissions
    check_region ->> check_region: Populate page tables
    alt Memory region valid
        check_region ->> UserPtrUserConstPtr: Access permitted
        UserPtrUserConstPtr ->> ACCESSING_USER_MEMflag: Set to true
        UserPtrUserConstPtr ->> UserMemory: Access memory
        UserPtrUserConstPtr ->> ACCESSING_USER_MEMflag: Set to false
        UserPtrUserConstPtr ->> KernelCode: Return memory reference
    else Memory region invalid
        check_region ->> UserPtrUserConstPtr: Return EFAULT
        UserPtrUserConstPtr ->> KernelCode: Propagate error
    end

Sources: src/lib.rs(L31 - L54)  src/lib.rs(L11 - L29)  src/lib.rs(L175 - L183) 

Memory Region Validation

Before any user memory access, a series of validation steps ensure memory safety:

  1. Alignment Checking: Ensures the pointer is properly aligned for the requested type
  2. Access Permission Verification: Checks that the memory region has appropriate read/write permissions
  3. Page Table Population: Ensures that all required pages are mapped in the address space

flowchart TD
start["Memory Access Request"]
align["Check Alignment"]
error["Return EFAULT"]
perms["Check Access Permissions"]
populate["Populate Page Tables"]
access["Set ACCESSING_USER_MEM flag"]
read["Access Memory"]
clear["Clear ACCESSING_USER_MEM flag"]
finish["Return Result"]

access --> read
align --> error
align --> perms
clear --> finish
perms --> error
perms --> populate
populate --> access
populate --> error
read --> clear
start --> align

Sources: src/lib.rs(L31 - L54)  src/lib.rs(L110 - L117) 
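
The alignment and page-boundary steps above can be sketched with plain address arithmetic. A 4 KiB page size is assumed here; the real code uses memory_addr helpers rather than these hand-rolled functions:

```rust
const PAGE_SIZE: usize = 4096; // assumed page size for this sketch

/// Step 1: the pointer must be aligned for the requested type.
fn check_alignment(addr: usize, align: usize) -> Result<(), &'static str> {
    if addr % align != 0 {
        return Err("EFAULT: misaligned user pointer");
    }
    Ok(())
}

/// Step 3 operates on whole pages: compute the page-aligned
/// [start, end) range covering `size` bytes at `addr`.
fn page_range(addr: usize, size: usize) -> (usize, usize) {
    let start = addr & !(PAGE_SIZE - 1);
    let end = (addr + size + PAGE_SIZE - 1) & !(PAGE_SIZE - 1);
    (start, end)
}

fn main() {
    assert!(check_alignment(0x1000, 8).is_ok());
    assert!(check_alignment(0x1001, 8).is_err());
    // 0x20 bytes at 0x1FF0 straddle a page boundary: two pages needed.
    assert_eq!(page_range(0x1FF0, 0x20), (0x1000, 0x3000));
}
```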

Context-Aware Page Fault Handling

A key aspect of the memory safety architecture is handling page faults during user memory access. This is accomplished through the ACCESSING_USER_MEM flag, which indicates when the kernel is accessing user memory.

stateDiagram-v2
state AccessingUser {
    [*] --> Reading
    Reading --> PageFault : Page not present
    PageFault --> Reading : Handle fault safely
}
[*] --> Normal
Normal --> AccessingUser : set ACCESSING_USER_MEM = true
AccessingUser --> Normal : set ACCESSING_USER_MEM = false

The architecture uses a per-CPU variable to track this state:

#[percpu::def_percpu]
static mut ACCESSING_USER_MEM: bool = false;

When set to true, the OS knows that any page faults occurring should be handled differently than regular kernel page faults, preventing kernel crashes from invalid user memory accesses.

Sources: src/lib.rs(L11 - L29)  src/lib.rs(L22 - L29) 

Null-Terminated Data Handling

The architecture includes specialized handling for null-terminated data like C strings, which is particularly important for OS interfaces:

flowchart TD
request["Request null-terminated data"]
check["check_null_terminated()"]
page["Process page by page"]
scan["Scan for null terminator"]
return["Return validated slice"]

check --> page
page --> scan
request --> check
scan --> return

This process efficiently handles null-terminated structures while maintaining safety guarantees by:

  1. Validating pages incrementally as needed
  2. Handling page faults appropriately during traversal
  3. Returning the correctly sized slice or string when the null terminator is found

Sources: src/lib.rs(L56 - L107)  src/lib.rs(L202 - L217)  src/lib.rs(L280 - L303) 
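
Once a page has been validated, the terminator scan itself reduces to the following sketch. This operates on an already-accessible buffer; the real code walks raw user memory page by page:

```rust
/// Returns the bytes before the first NUL, or an error if the
/// accessible region ends without a terminator.
fn null_terminated(buf: &[u8]) -> Result<&[u8], &'static str> {
    match buf.iter().position(|&b| b == 0) {
        Some(len) => Ok(&buf[..len]),
        None => Err("EFAULT: no NUL terminator in accessible region"),
    }
}

fn main() {
    assert_eq!(null_terminated(b"hello\0world").unwrap(), b"hello");
    assert!(null_terminated(b"no terminator").is_err());
}
```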

Security Implications

The memory safety architecture provides critical security guarantees:

  1. Protection against invalid memory access: Prevents kernel crashes from accessing invalid user memory
  2. Defense against privilege escalation: Ensures kernel code can only access user memory with proper permissions
  3. Safety from malicious user input: Validates user-provided pointers before use

By combining strong typing, rigorous validation, and context-aware fault handling, the architecture creates a comprehensive barrier against memory-related security vulnerabilities when crossing privilege boundaries.

Sources: src/lib.rs(L31 - L54)  src/lib.rs(L11 - L29) 

Integration with Operating System

The architecture is designed to integrate with operating systems through:

flowchart TD
subgraph subGraph1["Process Management"]
    errnoSys["axerrno"]
    percpu["percpu"]
end
subgraph subGraph0["Memory Subsystem"]
    mmSys["axmm"]
    memAddr["memory_addr"]
    pageTable["page_table_multiarch"]
end
axptr["axptr"]
memorySubsystem["Memory Subsystem"]
os["Operating System Kernel"]
processManagement["Process Management"]

axptr --> errnoSys
axptr --> memAddr
axptr --> mmSys
axptr --> pageTable
axptr --> percpu
memorySubsystem --> os
processManagement --> os

The architecture's dependencies enable it to work with the underlying memory management system while providing a consistent error handling mechanism through Linux-compatible error codes.

Sources: Cargo.toml(L7 - L12) 

User Space Pointers

Relevant source files

This page details the UserPtr and UserConstPtr types provided by the axptr library, which serve as safe abstractions for accessing user-space memory from kernel code. These pointer types prevent common errors that could lead to kernel crashes when handling user memory, such as invalid pointers, improper alignment, and unauthorized memory access.

For information about address space management that these pointers rely on, see Address Space Management. For details on the safety mechanisms they implement, see Safety Mechanisms.

Core Pointer Types

The axptr library provides two primary pointer types for accessing user-space memory:

  1. UserPtr<T>: For read-write access to user-space memory
  2. UserConstPtr<T>: For read-only access to user-space memory

Both types are represented as transparent wrappers around raw pointers (*mut T and *const T respectively), providing a safe interface for kernel code to access user-space memory.

classDiagram
class UserPtr~T~ {
    +*mut T pointer
    +const ACCESS_FLAGS: MappingFlags
    +address() VirtAddr
    +as_ptr() *mut T
    +cast~U~() UserPtr~U~
    +is_null() bool
    +nullable() Option~Self~
    +get() LinuxResult~&mut T~
    +get_as_slice() LinuxResult~&mut [T]~
    +get_as_null_terminated() LinuxResult~&mut [T]~
}

class UserConstPtr~T~ {
    +*const T pointer
    +const ACCESS_FLAGS: MappingFlags
    +address() VirtAddr
    +as_ptr() *const T
    +cast~U~() UserConstPtr~U~
    +is_null() bool
    +nullable() Option~Self~
    +get() LinuxResult~&T~
    +get_as_slice() LinuxResult~&[T]~
    +get_as_null_terminated() LinuxResult~&[T]~
    +get_as_str() LinuxResult~&str~
}

class RawPointer {
    <<Rust raw pointer>>
    
    
}

UserPtr  --|>  RawPointer : "Wraps *mut T"
UserConstPtr  --|>  RawPointer : "Wraps *const T"

Sources: src/lib.rs(L128 - L130)  src/lib.rs(L219 - L221) 

Creating User Space Pointers

Both pointer types can be created from a user-space address represented as a usize:

flowchart TD
A["User-space address (usize)"]
B["From::from()"]
C["UserPtr or UserConstPtr"]

A --> B
B --> C

Sources: src/lib.rs(L130 - L134)  src/lib.rs(L221 - L225) 
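
A self-contained model of this conversion is shown below. The method names mirror the documented API, but the type is a simplified stand-in (address() returns usize here rather than VirtAddr), not the real axptr definition:

```rust
#[repr(transparent)]
struct UserPtr<T>(*mut T);

// A user-space address arrives as a plain usize and is wrapped.
impl<T> From<usize> for UserPtr<T> {
    fn from(addr: usize) -> Self {
        UserPtr(addr as *mut T)
    }
}

impl<T> UserPtr<T> {
    fn address(&self) -> usize {
        self.0 as usize
    }
    fn is_null(&self) -> bool {
        self.0.is_null()
    }
    fn nullable(self) -> Option<Self> {
        if self.is_null() { None } else { Some(self) }
    }
}

fn main() {
    let p: UserPtr<u64> = UserPtr::from(0x4000_0000usize);
    assert_eq!(p.address(), 0x4000_0000);
    assert!(UserPtr::<u64>::from(0usize).nullable().is_none());
}
```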

Memory Access Flow

The main purpose of these pointer types is to provide safe access to user-space memory. The following diagram illustrates the flow of operations when accessing user memory:

sequenceDiagram
    participant KernelCode as "Kernel Code"
    participant UserPtrUserConstPtr as "UserPtr/UserConstPtr"
    participant AddrSpaceProvider as "AddrSpaceProvider"
    participant check_region as "check_region()"
    participant UserMemory as "User Memory"

    KernelCode ->> UserPtrUserConstPtr: Call get(), get_as_slice(), etc.
    UserPtrUserConstPtr ->> AddrSpaceProvider: Pass to AddrSpaceProvider
    AddrSpaceProvider ->> check_region: Call check_region()
    check_region ->> check_region: Verify alignment
    check_region ->> check_region: Check access permissions
    check_region ->> check_region: Populate page tables
    alt Memory access allowed
        check_region ->> UserPtrUserConstPtr: Return OK
        UserPtrUserConstPtr ->> UserMemory: Set ACCESSING_USER_MEM flag
        UserPtrUserConstPtr ->> UserMemory: Access memory safely
        UserPtrUserConstPtr ->> UserMemory: Clear ACCESSING_USER_MEM flag
        UserPtrUserConstPtr ->> KernelCode: Return memory reference
    else Memory access denied
        check_region ->> UserPtrUserConstPtr: Return EFAULT error
        UserPtrUserConstPtr ->> KernelCode: Propagate error
    end

Sources: src/lib.rs(L11 - L20)  src/lib.rs(L22 - L29)  src/lib.rs(L31 - L54)  src/lib.rs(L175 - L183)  src/lib.rs(L258 - L266) 

Core Methods

Common Methods

Both UserPtr<T> and UserConstPtr<T> provide the following methods:

| Method | Return Type | Description |
| --- | --- | --- |
| address() | VirtAddr | Gets the virtual address of the pointer |
| as_ptr() | *mut T / *const T | Unwraps the pointer into a raw pointer (unsafe) |
| cast<U>() | UserPtr<U> / UserConstPtr<U> | Casts the pointer to a different type |
| is_null() | bool | Checks if the pointer is null |
| nullable() | Option<Self> | Converts to Option<Self>, returning None if null |

Sources: src/lib.rs(L136 - L169)  src/lib.rs(L227 - L254) 

Memory Access Methods

For UserPtr<T>:

| Method | Return Type | Description |
| --- | --- | --- |
| get(aspace) | LinuxResult<&mut T> | Gets mutable access to the pointed value |
| get_as_slice(aspace, length) | LinuxResult<&mut [T]> | Gets mutable access to a slice of values |
| get_as_null_terminated(aspace) | LinuxResult<&mut [T]> | Gets mutable access to a null-terminated array |

Sources: src/lib.rs(L171 - L198)  src/lib.rs(L201 - L217) 

For UserConstPtr<T>:

| Method | Return Type | Description |
| --- | --- | --- |
| get(aspace) | LinuxResult<&T> | Gets read-only access to the pointed value |
| get_as_slice(aspace, length) | LinuxResult<&[T]> | Gets read-only access to a slice of values |
| get_as_null_terminated(aspace) | LinuxResult<&[T]> | Gets read-only access to a null-terminated array |
| get_as_str() (only for UserConstPtr<c_char>) | LinuxResult<&'static str> | Gets read-only access as a UTF-8 string |

Sources: src/lib.rs(L256 - L278)  src/lib.rs(L280 - L292)  src/lib.rs(L294 - L303) 

Safety Mechanisms

The main safety mechanisms implemented by these types include:

  1. Memory Region Validation: Before accessing user memory, the pointer types check if the memory region is accessible with the required permissions.
  2. Alignment Checks: Ensures the memory is properly aligned for the requested type.
  3. Page Table Population: Automatically populates page tables if necessary.
  4. Page Fault Handling: Sets a flag while user memory is being accessed so that page faults occurring during the access can be handled appropriately.

flowchart TD
A["Access request via get()/get_as_slice()"]
B["Check alignment"]
C["Return EFAULT error"]
D["Check region permissions"]
E["Populate page tables"]
F["Return error"]
G["Set ACCESSING_USER_MEM flag"]
H["Access memory"]
I["Clear ACCESSING_USER_MEM flag"]
J["Return reference to memory"]

A --> B
B --> C
B --> D
D --> C
D --> E
E --> F
E --> G
G --> H
H --> I
I --> J

Sources: src/lib.rs(L31 - L54)  src/lib.rs(L11 - L12)  src/lib.rs(L18 - L20)  src/lib.rs(L22 - L29) 

Null-Terminated Data Handling

A special feature of the pointer types is their ability to safely handle null-terminated data (such as C strings). The get_as_null_terminated() method performs a specialized check that scans the user memory page by page until it finds a null terminator.

flowchart TD
A["get_as_null_terminated()"]
B["check_null_terminated()"]
C["Check alignment"]
D["Set ACCESSING_USER_MEM flag"]
E["Scan memory page by page"]
F["Found null terminator?"]
G["Clear ACCESSING_USER_MEM flag"]
H["Check next page permissions"]
I["Return EFAULT error"]
J["Return reference to memory slice"]

A --> B
B --> C
C --> D
D --> E
E --> F
F --> G
F --> H
G --> J
H --> E
H --> I

Sources: src/lib.rs(L56 - L107)  src/lib.rs(L201 - L217)  src/lib.rs(L280 - L292) 

Type-Specific Operations

The UserConstPtr<c_char> type provides additional functionality specifically for handling C strings:

flowchart TD
A["UserConstPtr"]
B["get_as_null_terminated()"]
C["Obtain character slice"]
D["Convert to u8 slice"]
E["from_utf8()"]
F["Return &str"]
G["Return EILSEQ error"]

A --> B
B --> C
C --> D
D --> E
E --> F
E --> G

Sources: src/lib.rs(L294 - L303) 
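
The final conversion step can be sketched as follows. The error type is simplified to a string here; the real method returns LinuxError::EILSEQ:

```rust
/// Validates the bytes collected before the NUL terminator as UTF-8.
fn bytes_to_str(bytes: &[u8]) -> Result<&str, &'static str> {
    std::str::from_utf8(bytes).map_err(|_| "EILSEQ")
}

fn main() {
    assert_eq!(bytes_to_str(b"hello").unwrap(), "hello");
    // 0xC0 0x80 is an invalid (overlong) UTF-8 sequence.
    assert!(bytes_to_str(&[0xC0, 0x80]).is_err());
}
```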

Integration with Address Space Management

The user pointer types work with the AddrSpaceProvider trait to abstract address space operations. This allows them to work with different address space implementations as long as they implement this trait.

classDiagram
class AddrSpaceProvider {
    <<trait>>
    
    +with_addr_space(f) R
}

class UserPtr~T~ {
    
    +get(aspace: impl AddrSpaceProvider)
    +get_as_slice(aspace: impl AddrSpaceProvider, length)
    +get_as_null_terminated(aspace: impl AddrSpaceProvider)
}

class UserConstPtr~T~ {
    
    +get(aspace: impl AddrSpaceProvider)
    +get_as_slice(aspace: impl AddrSpaceProvider, length)
    +get_as_null_terminated(aspace: impl AddrSpaceProvider)
    +get_as_str(aspace: impl AddrSpaceProvider)
}

UserPtr  -->  AddrSpaceProvider : "Requires"
UserConstPtr  -->  AddrSpaceProvider : "Requires"

Sources: src/lib.rs(L119 - L126)  src/lib.rs(L175 - L183)  src/lib.rs(L258 - L266) 

Common Usage Patterns

The typical usage pattern for user space pointers in kernel code involves:

  1. Receiving a user-space address as a usize
  2. Converting it to a UserPtr<T> or UserConstPtr<T>
  3. Using the appropriate get method to safely access the memory
  4. Handling potential errors (EFAULT, EILSEQ, etc.)
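
The four steps can be condensed into a self-contained sketch. The validation in get() below is a stub standing in for the real address-space checks, and the error enum is trimmed to the one variant used:

```rust
#[derive(Debug, PartialEq)]
enum LinuxError {
    EFAULT,
}

struct UserConstPtr<T>(*const T);

impl<T> From<usize> for UserConstPtr<T> {
    fn from(addr: usize) -> Self {
        Self(addr as *const T)
    }
}

impl<T> UserConstPtr<T> {
    /// Stub validation: the real get() checks the region against the
    /// task's address space; here only null and alignment are checked.
    fn get(&self) -> Result<*const T, LinuxError> {
        let addr = self.0 as usize;
        if self.0.is_null() || addr % std::mem::align_of::<T>() != 0 {
            return Err(LinuxError::EFAULT);
        }
        Ok(self.0)
    }
}

fn main() {
    // Steps 1-2: a user-supplied usize becomes a typed wrapper.
    let value: u32 = 42;
    let ptr = UserConstPtr::<u32>::from(&value as *const u32 as usize);
    // Steps 3-4: access through get(), handling EFAULT on failure.
    assert!(ptr.get().is_ok());
    assert_eq!(UserConstPtr::<u32>::from(0usize).get(), Err(LinuxError::EFAULT));
}
```
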
sequenceDiagram
    participant KernelFunction as "Kernel Function"
    participant UserPtrT as "UserPtr<T>"
    participant AddrSpace as "AddrSpace"
    participant UserMemory as "User Memory"

    KernelFunction ->> UserPtrT: Create from user address
    KernelFunction ->> UserPtrT: Call get(), get_as_slice(), etc.
    UserPtrT ->> AddrSpace: Request permission check
    AddrSpace ->> AddrSpace: Validate memory region
    AddrSpace ->> UserPtrT: Return result
    alt Access Granted
        UserPtrT ->> UserMemory: Safely access memory
        UserMemory ->> KernelFunction: Return data/reference
    else Access Denied
        UserPtrT ->> KernelFunction: Return error (EFAULT)
    end

Sources: src/lib.rs(L130 - L134)  src/lib.rs(L221 - L225)  src/lib.rs(L175 - L183)  src/lib.rs(L258 - L266) 

Address Space Management

Relevant source files

Purpose and Scope

This document covers the address space management components within the axptr library. The address space management layer provides an abstraction for safely interacting with user-space memory through virtual address spaces and page tables. For information about the user space pointers that utilize this abstraction, see User Space Pointers.

Overview

Address space management in axptr is built around the AddrSpaceProvider trait, which serves as a bridge between user pointers (UserPtr/UserConstPtr) and the underlying memory management system. This abstraction allows for flexible implementation of address space operations while maintaining memory safety guarantees.

flowchart TD
subgraph subGraph2["Memory Management Layer"]
    check["Memory Region Checking"]
    populate["Page Table Population"]
end
subgraph subGraph1["Address Space Layer"]
    asp["AddrSpaceProvider trait"]
    aspimp["AddrSpace implementation"]
end
subgraph subGraph0["User Pointer Layer"]
    userptr["UserPtr"]
    userconstptr["UserConstPtr"]
end

asp --> aspimp
aspimp --> check
aspimp --> populate
userconstptr --> asp
userptr --> asp

Sources: src/lib.rs(L119 - L126)  src/lib.rs(L31 - L54) 

AddrSpaceProvider Trait

The AddrSpaceProvider trait defines a contract for accessing an address space. It contains a single method that allows temporary access to an AddrSpace object through a closure.

classDiagram
class AddrSpaceProvider {
    <<trait>>
    
    +with_addr_space(f: impl FnOnce(&mut AddrSpace) -~ R) -~ R
}

class AddrSpace {
    
    +check_region_access(range: VirtAddrRange, flags: MappingFlags) -~ bool
    +populate_area(start: VirtAddr, size: usize) -~ LinuxResult~() ~
}

AddrSpace  -->  AddrSpaceProvider : provides access to

Sources: src/lib.rs(L119 - L121)  src/lib.rs(L122 - L126) 

Implementation and Usage

The library provides a default implementation of AddrSpaceProvider for &mut AddrSpace:

impl AddrSpaceProvider for &mut AddrSpace {
    fn with_addr_space<R>(&mut self, f: impl FnOnce(&mut AddrSpace) -> R) -> R {
        f(self)
    }
}

This simple implementation allows a mutable reference to an AddrSpace to be used as an AddrSpaceProvider. The implementation pattern ensures that the AddrSpace is only accessible within the provided closure, enforcing proper resource management.

Sources: src/lib.rs(L122 - L126) 
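
A runnable model of this pattern, with AddrSpace reduced to a stub so that the closure-scoped access is visible in isolation:

```rust
// Stub standing in for axmm's AddrSpace.
struct AddrSpace {
    name: &'static str,
}

trait AddrSpaceProvider {
    fn with_addr_space<R>(&mut self, f: impl FnOnce(&mut AddrSpace) -> R) -> R;
}

impl AddrSpaceProvider for &mut AddrSpace {
    fn with_addr_space<R>(&mut self, f: impl FnOnce(&mut AddrSpace) -> R) -> R {
        f(self)
    }
}

fn main() {
    let mut aspace = AddrSpace { name: "task0" };
    let mut provider = &mut aspace;
    // The AddrSpace is only reachable inside the closure.
    let got = provider.with_addr_space(|a| a.name);
    assert_eq!(got, "task0");
}
```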

Memory Region Management

The address space management layer is responsible for two primary operations:

  1. Checking if a memory region is accessible with specific permissions
  2. Populating page tables to ensure memory is mapped when accessed

These operations are encapsulated in the check_region function:

flowchart TD
A["check_region(aspace, start, layout, access_flags)"]
B["Check address alignment"]
C["Return EFAULT"]
D["Check region access permissions"]
E["Calculate page boundaries"]
F["Populate page tables"]
G["Return error"]
H["Return Ok(())"]

A --> B
B --> C
B --> D
D --> C
D --> E
E --> F
F --> G
F --> H

Sources: src/lib.rs(L31 - L54) 

Region Checking Process

The check_region function performs several validation steps:

  1. Alignment Check: Verifies that the memory address is properly aligned for the requested data type
  2. Permission Check: Ensures the memory region has the appropriate access flags (read/write)
  3. Page Table Population: Maps the necessary pages in virtual memory

This function returns a LinuxResult<()> which is Ok(()) if the region is valid and accessible, or Err(LinuxError::EFAULT) if the region cannot be accessed.

Sources: src/lib.rs(L31 - L54) 

Integration with User Pointers

The address space management layer is primarily used by the user pointer types (UserPtr and UserConstPtr) to safely access user-space memory. These types call into the address space abstraction whenever they need to validate memory accesses.

sequenceDiagram
    participant UserPtrUserConstPtr as "UserPtr/UserConstPtr"
    participant AddrSpaceProvider as "AddrSpaceProvider"
    participant AddrSpace as "AddrSpace"
    participant check_region as "check_region"

    UserPtrUserConstPtr ->> AddrSpaceProvider: with_addr_space(closure)
    AddrSpaceProvider ->> AddrSpace: invoke closure with &mut AddrSpace
    AddrSpace ->> check_region: check_region(start, layout, flags)
    check_region -->> AddrSpace: Ok(()) or Err(LinuxError)
    AddrSpace -->> AddrSpaceProvider: return result
    AddrSpaceProvider -->> UserPtrUserConstPtr: return result
    UserPtrUserConstPtr ->> UserPtrUserConstPtr: access memory if Ok

Sources: src/lib.rs(L175 - L182)  src/lib.rs(L258 - L266) 

Helper Function: check_region_with

To simplify the interaction between user pointers and address space providers, the library includes a check_region_with helper function:

fn check_region_with(
    mut aspace: impl AddrSpaceProvider,
    start: VirtAddr,
    layout: Layout,
    access_flags: MappingFlags,
) -> LinuxResult<()> {
    aspace.with_addr_space(|aspace| check_region(aspace, start, layout, access_flags))
}

This function takes an AddrSpaceProvider and delegates to the check_region function, simplifying the code in the user pointer methods.

Sources: src/lib.rs(L110 - L117) 

Null-Terminated Data Handling

Special handling is provided for null-terminated data (like C strings) through the check_null_terminated function. This function safely traverses memory until it finds a null terminator, validating each page as needed.

flowchart TD
A["check_null_terminated(aspace, start, access_flags)"]
B["Check address alignment"]
C["Return EFAULT"]
D["Initialize variables"]
E["Begin memory traversal loop"]
F["Check if current position crosses page boundary"]
G["Validate next page access permissions"]
H["Return EFAULT"]
I["Move to next page"]
J["Read memory at current position"]
K["Check if value is null terminator"]
L["Return pointer and length"]
M["Increment length"]

A --> B
B --> C
B --> D
D --> E
E --> F
F --> G
F --> J
G --> H
G --> I
I --> J
J --> K
K --> L
K --> M
M --> E

Sources: src/lib.rs(L56 - L107) 

The check_null_terminated function uses the access_user_memory helper to set a thread-local flag that indicates user memory is being accessed, allowing the kernel to handle page faults correctly.

Memory Access Context Management

To safely handle page faults during user memory access, the address space management system uses a thread-local flag:

flowchart TD
A["access_user_memory(f)"]
B["Set ACCESSING_USER_MEM = true"]
C["Execute closure f"]
D["Set ACCESSING_USER_MEM = false"]
E["Return result of f"]

A --> B
B --> C
C --> D
D --> E

Sources: src/lib.rs(L22 - L29)  src/lib.rs(L11 - L20) 

The is_accessing_user_memory() function provides a way for the OS to check if a page fault occurred during a legitimate user memory access, allowing it to handle these faults differently from other kernel faults.

Sources: src/lib.rs(L14 - L20) 

Implementation Notes

  • The address space management layer is designed to be minimal yet flexible, providing only the necessary abstractions for safe user memory access
  • The AddrSpaceProvider trait uses a closure-based, scoped-access pattern, ensuring the AddrSpace is only reachable for the duration of the call
  • All memory checks are performed before memory is accessed, preventing undefined behavior
  • Page table population is done lazily, only when memory is actually accessed

Sources: src/lib.rs(L119 - L126)  src/lib.rs(L31 - L54) 

Safety Mechanisms

Relevant source files

This document details the safety mechanisms implemented in the axptr library to prevent kernel crashes when accessing user memory. These mechanisms form a critical layer of protection for kernel code that needs to interact with user-space memory securely and robustly. For information about the basic types used for user memory access, see User Space Pointers and for address space abstractions, see Address Space Management.

Overview of Safety Layers

The axptr library implements multiple safety layers that work together to ensure user memory access is handled safely from kernel code.

flowchart TD
A["Kernel Code"]
B["UserPtr / UserConstPtr"]
C1["Null Pointer Check"]
C2["Memory Alignment"]
C3["Access Permissions"]
C4["Page Table Population"]
C5["Page Fault Handling"]
D["Safe Memory Access"]

A --> B
B --> C1
B --> C2
B --> C3
B --> C4
B --> C5
C1 --> D
C2 --> D
C3 --> D
C4 --> D
C5 --> D

Sources: src/lib.rs(L31 - L54)  src/lib.rs(L11 - L29)  src/lib.rs(L175 - L216)  src/lib.rs(L258 - L302) 

Memory Region Checking

Before any user memory access is permitted, axptr performs thorough validation of the memory region to be accessed.

Alignment Verification

sequenceDiagram
    participant KernelCode as "Kernel Code"
    participant UserPtrget as "UserPtr::get()"
    participant check_region as "check_region()"

    KernelCode ->> UserPtrget: Request memory access
    UserPtrget ->> check_region: Validate region
    check_region ->> check_region: Check alignment
    Note over check_region: if start.as_usize() & (align - 1) != 0
    check_region -->> UserPtrget: Return EFAULT if misaligned

Sources: src/lib.rs(L37 - L40) 

Memory alignment verification ensures that pointers are properly aligned for the data type being accessed. Misaligned memory access can cause hardware exceptions on some architectures or inefficient memory operations on others.

Access Permission Validation

sequenceDiagram
    participant check_region as "check_region()"
    participant AddrSpace as "AddrSpace"

    check_region ->> AddrSpace: check_region_access(range, flags)
    AddrSpace -->> check_region: true/false
    Note over check_region: Return EFAULT if access not permitted

Sources: src/lib.rs(L42 - L47) 

The function checks that the memory range is accessible with the requested permissions (read-only or read-write). This prevents the kernel from attempting to access protected user memory regions.

Page Table Population

sequenceDiagram
    participant check_region as "check_region()"
    participant AddrSpace as "AddrSpace"

    check_region ->> check_region: Calculate page_start and page_end
    check_region ->> AddrSpace: populate_area(page_start, size)
    AddrSpace -->> check_region: Result (success/error)
    Note over check_region: Propagate error or proceed

Sources: src/lib.rs(L49 - L52) 

The system ensures that page tables are populated for the entire memory region being accessed. This helps prevent page faults during access by pre-populating the necessary page tables.

Context-Aware Page Fault Handling

A critical safety feature is the context-aware page fault handling mechanism, which allows the OS to distinguish between legitimate page faults while accessing user memory and actual kernel bugs.

flowchart TD
subgraph subGraph0["User Memory Access Context"]
    C["Set ACCESSING_USER_MEM = true"]
    D["Execute memory access"]
    E["Set ACCESSING_USER_MEM = false"]
end
A["Kernel Code"]
B["access_user_memory()"]
F["Page Fault Handler"]
G["is_accessing_user_memory()?"]
H["Handle as legitimateuser memory fault"]
I["Handle as kernel bug(panic/oops)"]

A --> B
B --> C
C --> D
D --> E
F --> G
G --> H
G --> I

Sources: src/lib.rs(L11 - L29)  src/lib.rs(L73 - L104) 

The system uses a per-CPU flag ACCESSING_USER_MEM to track whether kernel code is actively accessing user memory. This information is crucial for the OS's page fault handler, allowing it to:

  1. Properly handle page faults occurring during legitimate user memory access
  2. Correctly identify true kernel bugs that would otherwise cause crashes

This approach enables the kernel to access user memory regions that may trigger page faults (e.g., due to swapped pages) without crashing.

Null-Terminated Data Handling

The axptr library provides special handling for null-terminated data from user space, such as C-style strings.

flowchart TD
subgraph subGraph0["Protected by access_user_memory()"]
    F["Process memory page by page"]
    G["More pagesto check?"]
    H["Check page permissions"]
    I["Permissiongranted?"]
    J["Move to next page"]
    K["Search for null terminator"]
end
A["Kernel Code"]
B["UserPtr::get_as_null_terminated()"]
C["check_null_terminated()"]
D["Alignment Check"]
E["Return EFAULT"]
L["Return validatedslice to caller"]

A --> B
B --> C
C --> D
D --> E
D --> F
F --> G
G --> H
G --> K
H --> I
I --> E
I --> J
J --> G
K --> L

Sources: src/lib.rs(L56 - L107)  src/lib.rs(L202 - L217)  src/lib.rs(L280 - L292)  src/lib.rs(L294 - L303) 

This process handles the special case of null-terminated data structures (like C strings) where the length is not known in advance. The implementation:

  1. Validates memory alignment
  2. Checks permissions page by page as it traverses the data
  3. Executes within the access_user_memory() context to safely handle potential page faults
  4. Efficiently searches for the null terminator
  5. Returns a safe slice reference once validation is complete

Code Structure and Implementation

The table below summarizes how the safety mechanisms are implemented across various functions:

| Safety Mechanism | Implementation | Key Functions |
| --- | --- | --- |
| Null Pointer Detection | Built into UserPtr/UserConstPtr | is_null(), nullable() |
| Alignment Verification | Check in check_region() function | check_region() |
| Permission Validation | Validation via AddrSpace | check_region_access() |
| Page Table Population | Ensures pages are ready for access | populate_area() |
| Page Fault Protection | Per-CPU flag tracks access context | access_user_memory(), is_accessing_user_memory() |
| Null-Terminated Data Handling | Special validation routine | check_null_terminated() |

Sources: src/lib.rs(L31 - L54)  src/lib.rs(L56 - L107)  src/lib.rs(L11 - L29)  src/lib.rs(L158 - L169)  src/lib.rs(L245 - L253) 

Integration with Memory Access Functions

The safety mechanisms are integrated into all user memory access methods. The diagram below illustrates how UserPtr and UserConstPtr utilize these mechanisms:

classDiagram
class UserPtr~T~ {
    
    +get(aspace)
    +get_as_slice(aspace, length)
    +get_as_null_terminated(aspace)
}

class UserConstPtr~T~ {
    
    +get(aspace)
    +get_as_slice(aspace, length)
    +get_as_null_terminated(aspace)
    +get_as_str()
}

class SafetyMechanisms {
    
    +check_region()
    +check_null_terminated()
    +access_user_memory()
    +is_accessing_user_memory()
}

UserPtr  -->  SafetyMechanisms : uses
UserConstPtr  -->  SafetyMechanisms : uses

Sources: src/lib.rs(L175 - L198)  src/lib.rs(L202 - L216)  src/lib.rs(L258 - L277)  src/lib.rs(L280 - L302) 

Each access method (get(), get_as_slice(), etc.) applies the appropriate safety mechanisms before permitting memory access, ensuring that all user memory operations are properly validated and protected.

Error Handling

When safety checks fail, the system returns appropriate error codes to the caller rather than crashing:

  • Misaligned memory: EFAULT
  • Inaccessible memory regions: EFAULT
  • Page table population failures: Various errors propagated from the underlying system
  • Invalid UTF-8 in strings: EILSEQ

This approach allows kernel code to gracefully handle user memory access failures without compromising system stability.

Sources: src/lib.rs(L39 - L40)  src/lib.rs(L46 - L47)  src/lib.rs(L301 - L302) 

Memory Region Checking

Relevant source files

Purpose and Scope

This document explains the memory region checking mechanisms in the axptr library that validate user-space memory regions before they are accessed by kernel code. These validation mechanisms ensure memory safety by verifying alignment, access permissions, and page table population before allowing actual memory access. This is a critical component of the safety mechanisms in axptr.

For information about how page faults are handled during memory access, see Context-Aware Page Fault Handling.

Overview

Memory region checking is a multi-step validation process that occurs before any user-space memory access. This process ensures that the kernel does not crash when accessing potentially invalid memory regions.

flowchart TD
A["Memory Access Request"]
B["Alignment Check"]
C["Return EFAULT"]
D["Access Permission Check"]
E["Page Table Population"]
F["Return Error"]
G["Safe Memory Access"]

A --> B
B --> C
B --> D
D --> C
D --> E
E --> F
E --> G

Sources: src/lib.rs(L31 - L54) 

Memory Region Checking Process

The memory region checking process happens in three main stages:

  1. Alignment Verification: Ensures the memory address aligns with the required alignment for the data type
  2. Access Permission Checking: Verifies the process has appropriate permissions for the memory region
  3. Page Table Population: Ensures pages are mapped into memory before access

Core Implementation

The central function for memory region checking is check_region, which takes an address space, starting address, memory layout, and access flags as parameters:


Sources: src/lib.rs(L31 - L54)  src/lib.rs(L110 - L117) 

Alignment Verification

The first check performed is alignment verification, which ensures that the memory address is properly aligned for the data type being accessed.

flowchart TD
A["Memory Address"]
B["Extract Alignment Requirement"]
C["Address & (align - 1) == 0?"]
D["Proceed to Next Check"]
E["Return EFAULT"]

A --> B
B --> C
C --> D
C --> E

For a memory address to be properly aligned, the memory address modulo the alignment requirement must be zero. This is checked using the bitwise AND operation:

if start.as_usize() & (align - 1) != 0 {
    return Err(LinuxError::EFAULT);
}

Sources: src/lib.rs(L37 - L40)  src/lib.rs(L61 - L64) 

Access Permission Checking

The second check verifies that the memory region has the appropriate access permissions:

flowchart TD
A["Create VirtAddrRange"]
B["Call check_region_access()"]
C["Return EFAULT"]
D["Proceed to Page Table Population"]

A --> B
B --> C
B --> D

The check_region_access method on the AddrSpace object determines if the current process has the necessary permissions to access the memory range with the specified access flags. The access flags are different for UserPtr (READ|WRITE) and UserConstPtr (READ only).

Sources: src/lib.rs(L42 - L47)  src/lib.rs(L137)  src/lib.rs(L228) 

Page Table Population

The final step is to ensure that the pages containing the memory region are mapped into physical memory:

flowchart TD
A["Calculate Page Boundaries"]
B["page_start = start.align_down_4k()"]
C["page_end = (start + size).align_up_4k()"]
D["Call populate_area()"]
E["Return Ok(())"]
F["Return Error"]

A --> B
B --> C
C --> D
D --> E
D --> F

This step aligns the address range to page boundaries and calls populate_area to ensure that all necessary pages are mapped and available for access.

Sources: src/lib.rs(L49 - L53) 

Null-Terminated Data Handling

A specialized checking mechanism exists for null-terminated data like C strings:

flowchart TD
A["Check Alignment"]
B["Return EFAULT"]
C["Process Page by Page"]
D["Check Page Access Permissions"]
E["Move to Next Page if Needed"]
F["Read Memory and Check for Null Terminator"]
G["Return Pointer and Length"]
H["Increment Length"]

A --> B
A --> C
C --> D
D --> B
D --> E
E --> F
F --> G
F --> H
H --> C

The check_null_terminated function scans memory page by page, checking access permissions for each page, until it finds the null terminator. This is used by the get_as_null_terminated methods on both pointer types.

Sources: src/lib.rs(L56 - L107)  src/lib.rs(L204 - L217)  src/lib.rs(L282 - L291) 

Integration With User Pointer Types

Memory region checking is integrated into the UserPtr and UserConstPtr types through their access methods:


Each access method performs the appropriate memory region checks before allowing access to the memory.

| Access Method | Purpose | Checks Performed |
| --- | --- | --- |
| get() | Access a single item | Alignment, permissions, page population |
| get_as_slice() | Access an array of items | Alignment, permissions, page population |
| get_as_null_terminated() | Access a null-terminated array | Alignment, permissions, page-by-page scanning |
| get_as_str() | Access a C string (UserConstPtr only) | All checks from get_as_null_terminated() plus UTF-8 validation |

Sources: src/lib.rs(L175 - L183)  src/lib.rs(L186 - L198)  src/lib.rs(L204 - L217)  src/lib.rs(L258 - L266)  src/lib.rs(L269 - L277)  src/lib.rs(L282 - L291)  src/lib.rs(L296 - L302) 

Error Handling

Memory region checking functions propagate errors using the LinuxResult type. The primary error returned is LinuxError::EFAULT, which indicates an invalid address or permission error:

| Error Condition | Error Value |
| --- | --- |
| Misaligned address | EFAULT |
| Access permission denied | EFAULT |
| Page population failure | (Propagated from populate_area) |
| Invalid UTF-8 in string (for get_as_str) | EILSEQ |

Memory region checking ensures that these errors are detected before any actual memory access occurs, preventing kernel crashes.

Sources: src/lib.rs(L39)  src/lib.rs(L46)  src/lib.rs(L51)  src/lib.rs(L301) 

Relationship with Address Space Management

Memory region checking relies on the address space management capabilities provided by the AddrSpace type:

sequenceDiagram
    participant UserPtrUserConstPtr as UserPtr/UserConstPtr
    participant check_region_with as check_region_with()
    participant AddrSpaceProvider as AddrSpaceProvider
    participant check_region as check_region()
    participant AddrSpace as AddrSpace

    UserPtrUserConstPtr ->> check_region_with: "Request memory check"
    check_region_with ->> AddrSpaceProvider: "with_addr_space()"
    AddrSpaceProvider ->> check_region: "Provide AddrSpace"
    check_region ->> AddrSpace: "check_region_access()"
    AddrSpace -->> check_region: "Return access status"
    check_region ->> AddrSpace: "populate_area()"
    AddrSpace -->> check_region: "Return population status"
    check_region -->> check_region_with: "Return check result"
    check_region_with -->> UserPtrUserConstPtr: "Return check result"

The AddrSpaceProvider trait abstracts the process of obtaining an AddrSpace object, which provides the necessary methods for checking access permissions and populating page tables.

For more information about address space management, see Address Space Management.

Sources: src/lib.rs(L110 - L126) 

Performance Considerations

Memory region checking adds overhead to each user memory access, but this overhead is necessary to maintain memory safety. The implementation includes some optimizations:

  1. Alignment checks are performed first as they are the cheapest
  2. Permission checks are done before attempting to populate page tables
  3. Page population is done at page granularity to minimize the number of operations

For null-terminated data, the checking is more complex and potentially more expensive, as it must scan the data page by page until it finds the null terminator.

Sources: src/lib.rs(L31 - L54)  src/lib.rs(L56 - L107) 

Context-Aware Page Fault Handling

Relevant source files

Purpose and Scope

This document explains the mechanism used by axptr to safely handle page faults that may occur when the kernel accesses user space memory. The system uses a per-CPU flag to inform the operating system when user memory access is in progress, allowing the kernel to differentiate between legitimate page faults during user memory access and actual kernel bugs. For information about how memory regions are validated before access, see Memory Region Checking.

The Challenge of Accessing User Memory

When kernel code accesses user space memory, multiple issues can arise:

  1. The memory might not be currently mapped (page fault)
  2. The user process might have just freed the memory
  3. The user might have provided an invalid pointer

Without proper handling, these scenarios would cause a kernel panic, as page faults in kernel mode are typically considered fatal errors. Context-aware page fault handling provides a solution to this problem.

flowchart TD
A["Kernel Code"]
B["User Space Memory"]
C["Challenge: Memory might not be mapped"]
D["Challenge: Memory might be invalid"]
E["Challenge: Page fault in kernel mode = crash"]
F["Solution: Context-Aware Page Fault Handling"]

A --> B
B --> C
B --> D
B --> E
C --> F
D --> F
E --> F

Sources: src/lib.rs(L11 - L20) 

ACCESSING_USER_MEM Flag

The core of this mechanism is a per-CPU boolean flag named ACCESSING_USER_MEM. This flag indicates whether the kernel is currently accessing user memory, allowing the page fault handler to make an informed decision about how to respond to a page fault.

#[percpu::def_percpu]
static mut ACCESSING_USER_MEM: bool = false;

This flag is:

  • Defined as a per-CPU variable, so each CPU core has its own instance
  • Initially set to false
  • Set to true immediately before accessing user memory
  • Reset to false after the access is complete

The operating system checks this flag when a page fault occurs to determine whether to treat it as a legitimate page fault (allowing recovery) or as a kernel bug (triggering a panic).

Sources: src/lib.rs(L11 - L12) 

How Context-Aware Handling Works

The context-aware page fault handling process follows these steps:

sequenceDiagram
    participant KernelCode as "Kernel Code"
    participant access_user_memory as "access_user_memory()"
    participant ACCESSING_USER_MEMFlag as "ACCESSING_USER_MEM Flag"
    participant PageFaultHandler as "Page Fault Handler"
    participant UserSpaceMemory as "User Space Memory"

    KernelCode ->> access_user_memory: Call with closure to access user memory
    access_user_memory ->> ACCESSING_USER_MEMFlag: Set flag to true
    access_user_memory ->> UserSpaceMemory: Access user memory
    alt Memory access causes page fault
        UserSpaceMemory ->> PageFaultHandler: Trigger page fault
        PageFaultHandler ->> ACCESSING_USER_MEMFlag: Check flag
        ACCESSING_USER_MEMFlag ->> PageFaultHandler: Flag = true (accessing user memory)
        PageFaultHandler ->> UserSpaceMemory: Handle fault (map page, etc.)
        UserSpaceMemory ->> access_user_memory: Continue execution
    end
    access_user_memory ->> ACCESSING_USER_MEMFlag: Set flag to false
    access_user_memory ->> KernelCode: Return result

Sources: src/lib.rs(L22 - L29) 

Implementation Details

The is_accessing_user_memory Function

This function allows the operating system's page fault handler to check whether a page fault occurred during legitimate user memory access:

pub fn is_accessing_user_memory() -> bool {
    ACCESSING_USER_MEM.read_current()
}

As the documentation states: "OS implementation shall allow page faults from kernel when this function returns true."

Sources: src/lib.rs(L14 - L20) 

The access_user_memory Function

This function manages the context flag around a closure that accesses user memory:

fn access_user_memory<R>(f: impl FnOnce() -> R) -> R {
    ACCESSING_USER_MEM.with_current(|v| {
        *v = true;
        let result = f();
        *v = false;
        result
    })
}

Key points:

  • Takes a closure f that performs the actual user memory access
  • Sets the flag before executing the closure
  • Captures the result from the closure
  • Clears the flag after execution
  • Returns the result

Sources: src/lib.rs(L22 - L29) 

Integration with Memory Access Functions

The context-aware page fault handling is primarily used when accessing potentially problematic user memory, such as when reading null-terminated arrays or strings.

Example: Null-Terminated Data Handling

The check_null_terminated function uses this mechanism to safely scan user memory for a null terminator:

flowchart TD
subgraph subGraph0["Protected Region"]
    F["Scan memory looking for null terminator"]
    G["Check next page access permissions"]
    H["Continue scanning"]
    I["Return EFAULT"]
    J["Break loop"]
end
A["check_null_terminated()"]
B["Validate alignment"]
C["Prepare for scanning"]
D["Call access_user_memory()"]
E["Set ACCESSING_USER_MEM flag"]
K["Clear ACCESSING_USER_MEM flag"]
L["Return pointer and length"]

A --> B
B --> C
C --> D
D --> E
E --> F
F --> G
G --> H
G --> I
H --> F
H --> J
I --> K
J --> K
K --> L

The function:

  1. Validates the initial alignment of the memory region
  2. Sets up scanning variables
  3. Crucially wraps the scanning loop in access_user_memory()
  4. Within the protected region, handles page boundaries and potential faults
  5. Returns a pointer and length when successful

Sources: src/lib.rs(L56 - L107) 

System Interactions

Here's how the context-aware page fault handling interacts with different system components:


Sources: src/lib.rs(L11 - L29)  src/lib.rs(L56 - L107)  src/lib.rs(L204 - L216)  src/lib.rs(L282 - L291) 

Example Use Case

Consider what happens when a kernel function tries to access a user-provided null-terminated string that spans multiple pages, where some pages might not be mapped yet:

| Step | Description | Flag State | System Behavior |
| --- | --- | --- | --- |
| 1 | User calls kernel with string pointer | false | Normal operation |
| 2 | Kernel calls UserConstPtr::get_as_str() | false | Normal operation |
| 3 | access_user_memory() is called | true | Prepared for potential page faults |
| 4 | Memory is accessed, causing page fault | true | OS handles fault instead of panicking |
| 5 | OS maps the page | true | Execution continues |
| 6 | String scan completes | false (reset) | Return to normal operation |

Without this mechanism, any unmapped page in the user string would crash the kernel, even if the user's access was legitimate.

Sources: src/lib.rs(L295 - L302) 

Key Benefits

  1. Safety: Prevents kernel crashes from legitimate user memory accesses
  2. Transparency: Kernel code can access user memory without explicit fault handling
  3. Efficiency: No need for complex user/kernel copying mechanisms
  4. Robustness: Properly handles both valid and invalid memory access scenarios

Sources: src/lib.rs(L11 - L29) 

Null-Terminated Data Handling

Relevant source files

Purpose and Scope

This document explains how the axptr library safely handles null-terminated data structures in user memory, such as C-style strings and arrays. These special data structures have variable length and are terminated by a sentinel "null" value rather than having an explicit length parameter. For information about general memory region checking, see Memory Region Checking.

Overview

Null-terminated data structures present unique challenges for safe memory access. Unlike fixed-size arrays, their length cannot be determined without scanning the memory until a null terminator is found. This requires special handling to ensure memory safety while efficiently accessing these structures.

flowchart TD
subgraph subGraph0["Null-terminated Data Handling"]
    C["check_null_terminated()"]
    D["Alignment Verification"]
    E["Page-by-Page Scan"]
    F["Return validated pointer + length"]
end
A["Kernel Code"]
B["User Memory Pointer"]
G["Safe Access Methods"]
H["Null-terminated arrays"]
I["C-strings"]

A --> B
B --> C
B --> G
C --> D
D --> E
E --> F
G --> H
G --> I

Sources: src/lib.rs(L56 - L107)  src/lib.rs(L204 - L217)  src/lib.rs(L282 - L292)  src/lib.rs(L294 - L303) 

Core Mechanism

The axptr library implements a specialized mechanism for safely handling null-terminated data from user space. This is performed by the check_null_terminated function.

sequenceDiagram
    participant KernelCode as "Kernel Code"
    participant UserPtrUserConstPtr as "UserPtr/UserConstPtr"
    participant check_null_terminated as "check_null_terminated()"
    participant UserMemory as "User Memory"

    KernelCode ->> UserPtrUserConstPtr: get_as_null_terminated(aspace)
    UserPtrUserConstPtr ->> check_null_terminated: check address space & memory
    check_null_terminated ->> check_null_terminated: Check alignment
    check_null_terminated ->> check_null_terminated: Set up page tracking
    loop For each byte until null terminator
        check_null_terminated ->> check_null_terminated: Check if current position crosses page boundary
    alt Crosses page boundary
        check_null_terminated ->> check_null_terminated: Check if new page is accessible
        check_null_terminated ->> check_null_terminated: Move to next page
    end
    check_null_terminated ->> UserMemory: Read memory (with fault handling)
    UserMemory -->> check_null_terminated: Return value
    alt Value equals
    alt terminator
        check_null_terminated ->> check_null_terminated: Stop scanning
    else Value not terminator
    else Value not terminator
        check_null_terminated ->> check_null_terminated: Increment position & counter
    end
    end
    end
    check_null_terminated ->> UserPtrUserConstPtr: Return pointer & length
    UserPtrUserConstPtr ->> KernelCode: Return safe slice reference

Sources: src/lib.rs(L56 - L107) 

Memory Layout Processing

The function processes null-terminated data by checking memory one page at a time, efficiently handling arbitrarily long data structures without needing to know their size in advance.

  1. Alignment Check: Ensures the starting address is properly aligned for the specified type.
  2. Page-by-Page Processing: Handles memory in page-sized chunks, validating each page before access.
  3. Safe Memory Reading: Uses the access_user_memory function to safely read user memory with proper fault handling.
  4. Terminator Detection: Scans until it finds the terminator value (default value of type T).

The function returns a raw pointer to the start of the data and its length (excluding the terminator).

Sources: src/lib.rs(L56 - L107) 

Access Methods for Null-Terminated Data

The library provides specialized methods for both UserPtr<T> and UserConstPtr<T> to handle null-terminated data.

Methods for UserPtr

UserPtr<T> provides the get_as_null_terminated method for accessing mutable null-terminated arrays:


For types that implement Eq + Default, this method:

  1. Calls check_null_terminated with the appropriate access flags
  2. Converts the raw pointer and length into a safe mutable slice
  3. Returns the slice wrapped in a LinuxResult

Sources: src/lib.rs(L204 - L217) 

Methods for UserConstPtr

Similarly, UserConstPtr<T> provides a read-only version of the same functionality:


Sources: src/lib.rs(L282 - L292) 

C-String Handling

The library includes specialized handling for C-style strings through the get_as_str method on UserConstPtr<c_char>.

Processing Flow

flowchart TD
A["UserConstPtr"]
B["get_as_null_terminated()"]
C["Memory transmute to &[u8]"]
D["str::from_utf8()"]
E["Return &str"]
F["Return EILSEQ error"]

A --> B
B --> C
C --> D
D --> E
D --> F

Sources: src/lib.rs(L294 - L303) 

This method:

  1. Gets the null-terminated array of c_char characters
  2. Transmutes the slice from &[c_char] to &[u8] (sound because c_char is a single-byte type with the same size and layout as u8)
  3. Attempts to parse the byte slice as a UTF-8 string
  4. Returns either a valid string slice or an error if the string is not valid UTF-8

Sources: src/lib.rs(L294 - L303) 

Technical Implementation Details

Accessing User Memory Safely

The check_null_terminated function uses the access_user_memory helper to safely access user memory while handling page faults properly. This ensures that:

  1. The ACCESSING_USER_MEM flag is set to true during memory access
  2. Any page faults occurring during the operation are handled correctly
  3. The flag is reset to false after the operation completes

Type Constraints

The null-terminated handling functions require that the type T implements both:

  • Eq - To compare values for equality with the terminator
  • Default - To create the terminator value (usually zero/null)

This allows the system to work with different types of null-terminated data beyond just strings.

Memory Safety Guarantees

The null-terminated data handling system provides the following safety guarantees:

| Aspect | Guarantee |
| --- | --- |
| Memory Alignment | Ensures the pointer is properly aligned for type T |
| Access Permissions | Verifies each page has appropriate read/write permissions |
| Page Faults | Handles page faults during user memory access |
| Memory Boundaries | Safely traverses page boundaries |
| Data Validation | Ensures data is properly terminated |
| UTF-8 Validation | Validates UTF-8 encoding for strings |

Sources: src/lib.rs(L56 - L107)  src/lib.rs(L204 - L217)  src/lib.rs(L282 - L292)  src/lib.rs(L294 - L303) 

Practical Considerations

Performance Characteristics

Scanning for null terminators can potentially traverse many pages of memory, especially for long strings or arrays. The implementation optimizes this by:

  1. Checking page boundaries only when necessary
  2. Validating permissions at the page level, not for each element
  3. Using volatile reads for maximum safety with minimal overhead

Error Handling

The null-terminated data methods return LinuxResult values with appropriate error codes:

  • EFAULT - If memory is inaccessible or improperly aligned
  • EILSEQ - If string data is not valid UTF-8 (for get_as_str)

Sources: src/lib.rs(L56 - L107)  src/lib.rs(L294 - L303) 

Integration with Operating System

Relevant source files

This page documents how the axptr library integrates with the underlying operating system components to provide safe user memory access from kernel code. We cover the dependency architecture, memory management integration, error handling, and page fault coordination that enable axptr to function within a broader OS environment.

For information about the core pointer types and their usage, see User Space Pointers. For details on safety mechanisms, see Safety Mechanisms.

Dependency Architecture

The axptr library depends on several OS components to provide its functionality:

flowchart TD
subgraph subGraph1["OS Integration Points"]
    axmm["axmm: Memory Management"]
    axerrno["axerrno: Error Handling"]
    page_table["page_table_multiarch: Page Tables"]
    memory_addr["memory_addr: Address Types"]
    percpu["percpu: Per-CPU Variables"]
end
subgraph subGraph0["axptr Components"]
    userptr["UserPtr/UserConstPtr"]
    addrspace_provider["AddrSpaceProvider trait"]
    fault_handler["Page Fault Coordination"]
end
kernel["Kernel Memory Subsystem"]
kernel_errors["Kernel Error Handling"]
arch_mm["Architecture-specific Memory Management"]
kernel_smp["Kernel SMP Support"]

addrspace_provider --> axmm
axerrno --> kernel_errors
axmm --> kernel
fault_handler --> percpu
page_table --> arch_mm
percpu --> kernel_smp
userptr --> axerrno
userptr --> axmm
userptr --> memory_addr
userptr --> page_table

The diagram illustrates how axptr interfaces with various operating system components through its dependencies. These dependencies allow axptr to leverage the kernel's existing infrastructure for memory management, error handling, and multi-core support.

Sources: Cargo.toml(L7 - L12)  src/lib.rs(L4 - L11) 

Memory Management Integration

axptr integrates with the OS memory management system primarily through the axmm crate, which provides the AddrSpace abstraction. This integration enables axptr to:

  1. Check permissions for memory regions
  2. Populate page tables as needed
  3. Enforce proper memory alignment
  4. Handle page faults gracefully

Address Space Provider Mechanism

The AddrSpaceProvider trait serves as the primary integration point between axptr and the OS memory management subsystem:


The trait is designed to be simple enough that OS-specific implementations can easily provide access to the appropriate address space, while still allowing for thread-safety and context-specific behavior.

Sources: src/lib.rs(L119 - L126) 

Memory Access Workflow

When kernel code attempts to access user memory through UserPtr or UserConstPtr, the following sequence occurs:

sequenceDiagram
    participant KernelCode as "Kernel Code"
    participant UserPtrmethods as "UserPtr methods"
    participant check_regionfunction as "check_region function"
    participant AddrSpaceProvider as "AddrSpaceProvider"
    participant AddrSpaceOS as "AddrSpace (OS)"
    participant PageFaultHandlerOS as "Page Fault Handler (OS)"

    KernelCode ->> UserPtrmethods: get(aspace)
    UserPtrmethods ->> check_regionfunction: check_region_with(aspace, addr, layout, flags)
    check_regionfunction ->> AddrSpaceProvider: with_addr_space(lambda)
    AddrSpaceProvider ->> AddrSpaceOS: lambda(aspace)
    AddrSpaceOS ->> AddrSpaceOS: check_region_access(range, flags)
    AddrSpaceOS ->> AddrSpaceOS: populate_area(page_start, page_end - page_start)
    AddrSpaceOS -->> check_regionfunction: Result
    check_regionfunction -->> UserPtrmethods: Result
    alt Success
        UserPtrmethods ->> UserPtrmethods: Set ACCESSING_USER_MEM flag
        UserPtrmethods ->> KernelCode: Return memory reference
        Note over KernelCode,UserPtrmethods: During access, page fault may occur
        KernelCode ->> PageFaultHandlerOS: (Page fault in kernel mode)
        PageFaultHandlerOS ->> PageFaultHandlerOS: Check is_accessing_user_memory()
        PageFaultHandlerOS ->> KernelCode: Handle fault appropriately
        UserPtrmethods ->> UserPtrmethods: Clear ACCESSING_USER_MEM flag
    else Failure
        UserPtrmethods ->> KernelCode: Return error (EFAULT, etc.)
    end

This workflow demonstrates how axptr coordinates with the OS memory management subsystem to safely access user memory, involving permission checks, page table population, and page fault handling.

Sources: src/lib.rs(L31 - L54)  src/lib.rs(L175 - L198)  src/lib.rs(L258 - L277) 

Error Handling Integration

axptr uses the axerrno crate for Linux-compatible error codes. This integration ensures that errors from user memory access operations can be properly propagated to OS-specific error handling systems.

The primary error codes used by axptr include:

| Error Code | Description | Usage in axptr |
| --- | --- | --- |
| EFAULT | Bad address | Returned for misaligned or inaccessible memory regions |
| EILSEQ | Illegal byte sequence | Returned when string conversion fails in get_as_str |

The error handling flow integrates with the OS through the LinuxResult type, which is a Result<T, LinuxError> that can be directly used by OS components or converted to OS-specific error types.

Sources: src/lib.rs(L4)  src/lib.rs(L36 - L47)  src/lib.rs(L301) 
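The error flow can be sketched as follows. This is a hedged, self-contained illustration: `LinuxError` and `LinuxResult` are modeled as local types (the real ones come from the axerrno crate), and `read_user_str` is a hypothetical syscall-style helper.

```rust
// Local stand-ins for axerrno's types so the example compiles on its own.
#[derive(Debug, PartialEq)]
enum LinuxError {
    EFAULT,
    EILSEQ,
}
type LinuxResult<T> = Result<T, LinuxError>;

// Hypothetical helper: an out-of-bounds region maps to EFAULT, invalid UTF-8
// maps to EILSEQ, and both propagate to the caller via `?` / Result.
fn read_user_str(mem: &[u8], addr: usize, len: usize) -> LinuxResult<&str> {
    let bytes = mem.get(addr..addr + len).ok_or(LinuxError::EFAULT)?;
    core::str::from_utf8(bytes).map_err(|_| LinuxError::EILSEQ)
}

fn main() {
    let mem = b"hello\xff";
    assert_eq!(read_user_str(mem, 0, 5), Ok("hello"));
    assert_eq!(read_user_str(mem, 0, 6), Err(LinuxError::EILSEQ));
    assert_eq!(read_user_str(mem, 0, 99), Err(LinuxError::EFAULT));
    println!("ok");
}
```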

Page Fault Coordination

One of the most critical aspects of OS integration is the coordination between axptr and the OS page fault handler. This is achieved through the ACCESSING_USER_MEM per-CPU flag:

flowchart TD
start["Kernel Code accesses user memory"]
access_fn["access_user_memory() function"]
set_flag["Set ACCESSING_USER_MEM = true"]
memory_op["Perform memory operation"]
page_fault["Page fault occurs"]
os_handler["OS Page Fault Handler"]
check_flag["Check is_accessing_user_memory()"]
special_handling["Handle as user memory access"]
kernel_crash["Handle as kernel bug"]
clear_flag["Set ACCESSING_USER_MEM = false"]
end_access["Return result"]

access_fn --> set_flag
check_flag --> kernel_crash
check_flag --> special_handling
clear_flag --> end_access
memory_op --> clear_flag
memory_op --> page_fault
os_handler --> check_flag
page_fault --> os_handler
set_flag --> memory_op
special_handling --> memory_op
start --> access_fn

This mechanism requires the OS page fault handler to check is_accessing_user_memory() when a page fault occurs in kernel mode. If true, the fault should be treated as a normal user memory access that may require page table updates or signal delivery. If false, it should be treated as a bug in the kernel.

Sources: src/lib.rs(L11 - L29)  src/lib.rs(L73 - L104) 
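The handler's decision can be reduced to a small sketch. Everything here is a local mock so the example runs stand-alone; a real handler would consult the actual `is_accessing_user_memory()` flag and architecture-specific fault state.

```rust
// Sketch of the decision an OS page-fault handler makes for a fault taken in
// kernel mode; the flag value is passed in directly to keep this runnable.
#[derive(Debug, PartialEq)]
enum FaultAction {
    UserMemoryAccess, // fix up the mapping or fail the access with EFAULT
    KernelBug,        // unexpected kernel-mode fault: panic/oops
}

fn classify_kernel_fault(accessing_user_mem: bool) -> FaultAction {
    if accessing_user_mem {
        // Fault came from a guarded user-memory access path.
        FaultAction::UserMemoryAccess
    } else {
        // A stray kernel-mode fault indicates a kernel bug.
        FaultAction::KernelBug
    }
}

fn main() {
    assert_eq!(classify_kernel_fault(true), FaultAction::UserMemoryAccess);
    assert_eq!(classify_kernel_fault(false), FaultAction::KernelBug);
    println!("ok");
}
```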

Implementation of AddrSpaceProvider

Operating systems integrating with axptr must provide an implementation of the AddrSpaceProvider trait. The library provides a simple implementation for &mut AddrSpace, but OS-specific implementations might include:

  1. Process-specific address space providers
  2. Thread-specific address space providers
  3. Providers that switch to user address spaces temporarily

The implementation should ensure that:

  • The correct address space is used for the current context
  • Any necessary locking or synchronization is handled
  • The address space remains valid throughout the operation

Example of the default implementation:

impl AddrSpaceProvider for &mut AddrSpace {
    fn with_addr_space<R>(&mut self, f: impl FnOnce(&mut AddrSpace) -> R) -> R {
        f(self)
    }
}

Sources: src/lib.rs(L119 - L126) 
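A process-specific provider might look like the sketch below. All types here are local mocks (`AddrSpace` and `Process` are hypothetical stand-ins, not the axmm types): the point is that the lock is held only for the duration of the callback, satisfying the synchronization requirement above.

```rust
use std::sync::Mutex;

// Mocked stand-in for axmm's AddrSpace so this sketch compiles on its own.
struct AddrSpace {
    name: &'static str,
}

trait AddrSpaceProvider {
    fn with_addr_space<R>(&mut self, f: impl FnOnce(&mut AddrSpace) -> R) -> R;
}

// Hypothetical process wrapper: its address space is behind a mutex, and the
// provider acquires the lock only while the callback runs.
struct Process {
    aspace: Mutex<AddrSpace>,
}

impl AddrSpaceProvider for &Process {
    fn with_addr_space<R>(&mut self, f: impl FnOnce(&mut AddrSpace) -> R) -> R {
        let mut guard = self.aspace.lock().unwrap();
        f(&mut guard)
    }
}

fn main() {
    let process = Process {
        aspace: Mutex::new(AddrSpace { name: "init" }),
    };
    let mut provider = &process;
    assert_eq!(provider.with_addr_space(|a| a.name), "init");
    println!("ok");
}
```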

Dependency Requirements

The operating system must provide or accommodate the following components for proper integration with axptr:

| Dependency | Required Features |
| --- | --- |
| axmm | AddrSpace implementation with check_region_access and populate_area methods |
| page_table_multiarch | Support for mapping flags (READ, WRITE) |
| memory_addr | Address types and manipulation (VirtAddr, VirtAddrRange) |
| percpu | Per-CPU variable support for the ACCESSING_USER_MEM flag |
| axerrno | Linux-compatible error codes |

Each dependency provides essential functionality that axptr relies on to safely access user memory. The operating system must ensure these dependencies are properly implemented and available.

Sources: Cargo.toml(L7 - L12)  src/lib.rs(L4 - L11) 

Integration Example Flow

The complete flow of integration between axptr and the operating system for a typical user memory access operation:

sequenceDiagram
    participant KernelCode as "Kernel Code"
    participant UserPtr as "UserPtr"
    participant OSAddrSpaceProvider as "OS AddrSpaceProvider"
    participant OSAddrSpace as "OS AddrSpace"
    participant OSPageTables as "OS Page Tables"
    participant OSPageFaultHandler as "OS Page Fault Handler"

    KernelCode ->> UserPtr: get(os_provider)
    UserPtr ->> OSAddrSpaceProvider: with_addr_space(lambda)
    OSAddrSpaceProvider ->> OSAddrSpace: Acquire address space
    OSAddrSpaceProvider ->> UserPtr: Execute lambda with address space
    UserPtr ->> OSAddrSpace: check_region_access(range, flags)
    OSAddrSpace ->> OSAddrSpace: Validate permissions
    UserPtr ->> OSAddrSpace: populate_area(page_start, size)
    OSAddrSpace ->> OSPageTables: Ensure pages are populated
    UserPtr ->> UserPtr: Set ACCESSING_USER_MEM = true
    UserPtr ->> KernelCode: Return reference to user memory
    KernelCode ->> KernelCode: Access user memory
    alt Page Fault Occurs
        KernelCode -->> OSPageFaultHandler: Page fault exception
        OSPageFaultHandler ->> OSPageFaultHandler: Call is_accessing_user_memory()
        OSPageFaultHandler ->> OSPageTables: Handle fault (map page, etc.)
        OSPageFaultHandler -->> KernelCode: Resume execution
    end
    KernelCode ->> UserPtr: Memory access complete
    UserPtr ->> UserPtr: Set ACCESSING_USER_MEM = false

This diagram illustrates the complete integration flow, showing how various OS components interact with axptr during a user memory access operation, including the handling of page faults.

Sources: src/lib.rs(L18 - L29)  src/lib.rs(L31 - L54)  src/lib.rs(L175 - L198) 

API Reference

Relevant source files

This page provides a comprehensive reference for the axptr library, which offers safe abstractions for accessing user-space memory from kernel code. The API is designed to prevent memory-related security vulnerabilities and crashes that can occur when kernel code interacts with potentially unsafe user memory.

For architectural concepts and safety mechanisms, refer to Memory Safety Architecture and Safety Mechanisms.

API Components Overview


Sources: src/lib.rs(L119 - L126)  src/lib.rs(L128 - L217)  src/lib.rs(L219 - L303)  src/lib.rs(L18 - L20) 

Core Types

UserPtr

UserPtr<T> is a wrapper around a raw mutable pointer (*mut T) to user-space memory. It provides safe methods to access and manipulate user memory with validation checks.

flowchart TD
A["Kernel Code"]
B["UserPtr"]
C["check_region()"]
D["Access Permission Check"]
E["Alignment Check"]
F["Page Table Population"]
G["User Memory"]

A --> B
B --> C
B --> G
C --> D
C --> E
C --> F

Sources: src/lib.rs(L128 - L217) 

Constants

| Constant | Type | Description |
| --- | --- | --- |
| `ACCESS_FLAGS` | `MappingFlags` | Read and write access flags for the pointer (`MappingFlags::READ.union(MappingFlags::WRITE)`) |

Sources: src/lib.rs(L137) 

Methods

| Method | Signature | Description |
| --- | --- | --- |
| `address` | `fn address(&self) -> VirtAddr` | Returns the virtual address of the pointer |
| `as_ptr` | `unsafe fn as_ptr(&self) -> *mut T` | Unwraps the pointer into a raw pointer (unsafe) |
| `cast` | `fn cast<U>(self) -> UserPtr<U>` | Casts the pointer to a different type |
| `is_null` | `fn is_null(&self) -> bool` | Checks if the pointer is null |
| `nullable` | `fn nullable(self) -> Option<Self>` | Converts the pointer to an `Option`, returning `None` if null |
| `get` | `fn get(&mut self, aspace: impl AddrSpaceProvider) -> LinuxResult<&mut T>` | Safely accesses the value, validating the memory region |
| `get_as_slice` | `fn get_as_slice(&mut self, aspace: impl AddrSpaceProvider, length: usize) -> LinuxResult<&mut [T]>` | Gets the value as a slice of specified length |
| `get_as_null_terminated` | `fn get_as_null_terminated(&mut self, aspace: impl AddrSpaceProvider) -> LinuxResult<&mut [T]>` | Gets the value as a slice terminated by a null value |

Sources: src/lib.rs(L136 - L169)  src/lib.rs(L171 - L198)  src/lib.rs(L201 - L217) 

UserConstPtr

UserConstPtr<T> is a wrapper around a raw constant pointer (*const T) to user-space memory. It provides similar functionality to UserPtr<T> but for read-only access.

flowchart TD
A["Kernel Code"]
B["UserConstPtr"]
C["check_region()"]
D["Access Permission Check"]
E["Alignment Check"]
F["Page Table Population"]
G["User Memory (read-only)"]

A --> B
B --> C
B --> G
C --> D
C --> E
C --> F

Sources: src/lib.rs(L219 - L303) 

Constants

| Constant | Type | Description |
| --- | --- | --- |
| `ACCESS_FLAGS` | `MappingFlags` | Read-only access flags for the pointer (`MappingFlags::READ`) |

Sources: src/lib.rs(L228) 

Methods

| Method | Signature | Description |
| --- | --- | --- |
| `address` | `fn address(&self) -> VirtAddr` | Returns the virtual address of the pointer |
| `as_ptr` | `unsafe fn as_ptr(&self) -> *const T` | Unwraps the pointer into a raw pointer (unsafe) |
| `cast` | `fn cast<U>(self) -> UserConstPtr<U>` | Casts the pointer to a different type |
| `is_null` | `fn is_null(&self) -> bool` | Checks if the pointer is null |
| `nullable` | `fn nullable(self) -> Option<Self>` | Converts the pointer to an `Option`, returning `None` if null |
| `get` | `fn get(&self, aspace: impl AddrSpaceProvider) -> LinuxResult<&T>` | Safely accesses the value, validating the memory region |
| `get_as_slice` | `fn get_as_slice(&self, aspace: impl AddrSpaceProvider, length: usize) -> LinuxResult<&[T]>` | Gets the value as a slice of specified length |
| `get_as_null_terminated` | `fn get_as_null_terminated(&self, aspace: impl AddrSpaceProvider) -> LinuxResult<&[T]>` | Gets the value as a slice terminated by a null value |

Sources: src/lib.rs(L227 - L254)  src/lib.rs(L256 - L278)  src/lib.rs(L280 - L292) 

Special Methods for UserConstPtr<c_char>

UserConstPtr<c_char> has an additional method for working with strings:

| Method | Signature | Description |
| --- | --- | --- |
| `get_as_str` | `fn get_as_str(&self, aspace: impl AddrSpaceProvider) -> LinuxResult<&'static str>` | Gets the pointer as a Rust string, validating UTF-8 encoding |

Sources: src/lib.rs(L294 - L303) 

AddrSpaceProvider Trait

The AddrSpaceProvider trait is used to abstract the address space operations used by both pointer types. It provides a way to access the underlying address space.


Sources: src/lib.rs(L119 - L126) 

Methods

| Method | Signature | Description |
| --- | --- | --- |
| `with_addr_space` | `fn with_addr_space<R>(&mut self, f: impl FnOnce(&mut AddrSpace) -> R) -> R` | Provides a reference to the address space for use with a callback function |

Sources: src/lib.rs(L119 - L121) 

Helper Functions

The axptr library provides utility functions for working with user-space memory:

| Function | Signature | Description |
| --- | --- | --- |
| `is_accessing_user_memory` | `fn is_accessing_user_memory() -> bool` | Checks whether user memory is currently being accessed; used for page fault handling |
| `access_user_memory` | `fn access_user_memory<R>(f: impl FnOnce() -> R) -> R` | Internal function that sets a flag during user memory access |

Sources: src/lib.rs(L11 - L29) 

Memory Access Process

The diagram below illustrates the process that occurs when kernel code attempts to access user memory through the axptr API:

sequenceDiagram
    participant KernelCode as Kernel Code
    participant UserPtrUserConstPtr as UserPtr/UserConstPtr
    participant check_region as check_region()
    participant AddrSpace as AddrSpace
    participant UserMemory as User Memory

    KernelCode ->> UserPtrUserConstPtr: get(...)/get_as_slice(...)/etc.
    UserPtrUserConstPtr ->> check_region: check_region_with(...)
    check_region ->> AddrSpace: check_region_access
    check_region ->> AddrSpace: populate_area
    alt Region is valid
        AddrSpace -->> check_region: Ok(())
        check_region -->> UserPtrUserConstPtr: Ok(())
        UserPtrUserConstPtr ->> UserPtrUserConstPtr: access_user_memory(...)
        UserPtrUserConstPtr ->> UserMemory: Safe memory access
        UserMemory -->> UserPtrUserConstPtr: Data
        UserPtrUserConstPtr -->> KernelCode: Return reference/slice
    else Region is invalid or inaccessible
        AddrSpace -->> check_region: Err(EFAULT)
        check_region -->> UserPtrUserConstPtr: Err(EFAULT)
        UserPtrUserConstPtr -->> KernelCode: Return error
    end

Sources: src/lib.rs(L31 - L54)  src/lib.rs(L109 - L117)  src/lib.rs(L22 - L29) 

Type Conversion and Construction

Both UserPtr<T> and UserConstPtr<T> implement From<usize> for convenient construction from raw addresses:

flowchart TD
A["usize (memory address)"]
B["UserPtr"]
C["UserConstPtr"]

A --> B
A --> C

Sources: src/lib.rs(L130 - L134)  src/lib.rs(L221 - L225) 

Null-Terminated Data Handling

The library provides special handling for null-terminated data structures like C strings:

flowchart TD
A["UserPtr/UserConstPtr"]
B["get_as_null_terminated()"]
C["check_null_terminated()"]
D["traverse memory safely"]
E["find null terminator"]
F["return slice up to terminator"]
G["UserConstPtr"]
H["get_as_str()"]
I["get_as_null_terminated()"]
J["validate UTF-8"]
K["return &str"]

A --> B
B --> C
C --> D
D --> E
E --> F
G --> H
H --> I
I --> J
J --> K

Sources: src/lib.rs(L56 - L107)  src/lib.rs(L201 - L217)  src/lib.rs(L280 - L292)  src/lib.rs(L294 - L303) 

UserPtr API

Relevant source files

Purpose and Overview

This document provides detailed information about the UserPtr<T> type, which enables safe access to mutable user-space memory from kernel code. The API ensures memory safety through rigorous access validation, proper alignment checking, and context-aware page fault handling.

For information about the read-only equivalent, see UserConstPtr API.

UserPtr<T> wraps a raw pointer (*mut T) to user-space memory and provides methods to safely access it through the kernel, preventing common vulnerabilities like null pointer dereferences and buffer overflows.

Sources: src/lib.rs(L1 - L7)  src/lib.rs(L129 - L130) 

Type Definition and Core Properties

UserPtr<T> is defined as a transparent wrapper around a *mut T raw pointer:

#[repr(transparent)]
pub struct UserPtr<T>(*mut T);

Key properties:

  • Transparent representation: Ensures the struct has the same memory layout as a raw pointer
  • Generic over type T: Can point to any type
  • Access flags: Includes both READ and WRITE permissions

Sources: src/lib.rs(L129 - L130)  src/lib.rs(L137 - L138) 

Basic Methods

Construction and Conversion

UserPtr<T> can be constructed from a raw usize memory address:

flowchart TD
A["usize address"]
B["UserPtr<T>"]
C["UserPtr<T>"]
D["UserPtr<U>"]

A --> B
C --> D

Pointer Manipulation Methods

| Method | Description | Return Type |
| --- | --- | --- |
| `address()` | Gets the virtual address | `VirtAddr` |
| `as_ptr()` | Unwraps to a raw pointer (unsafe) | `*mut T` |
| `cast()` | Casts to a different type | `UserPtr<U>` |
| `is_null()` | Checks if the pointer is null | `bool` |
| `nullable()` | Converts to an `Option` (`None` if null) | `Option<Self>` |

Sources: src/lib.rs(L130 - L169) 
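The null-check helpers compose into a common guard pattern. The block below is a toy stand-in for `UserPtr<T>` (not the real type) that reproduces just the construction and null-check behavior, so it can run without any kernel infrastructure.

```rust
// Toy stand-in for UserPtr<T>, illustrating construction from a usize
// address plus the is_null/nullable helpers described above.
#[repr(transparent)]
struct UserPtr<T>(*mut T);

impl<T> From<usize> for UserPtr<T> {
    fn from(addr: usize) -> Self {
        UserPtr(addr as *mut T)
    }
}

impl<T> UserPtr<T> {
    fn is_null(&self) -> bool {
        self.0.is_null()
    }

    // Turning a possibly-null pointer into an Option forces callers to
    // handle the null case explicitly.
    fn nullable(self) -> Option<Self> {
        if self.is_null() { None } else { Some(self) }
    }
}

fn main() {
    let null_ptr = UserPtr::<u32>::from(0);
    assert!(null_ptr.is_null());
    assert!(null_ptr.nullable().is_none());

    let ptr = UserPtr::<u32>::from(0x1000);
    assert!(ptr.nullable().is_some());
    println!("ok");
}
```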

Memory Access Methods

The UserPtr<T> API provides three primary methods for safely accessing user-space memory:

flowchart TD
UserPtr["UserPtr<T>"]
A["&mut T(Single value)"]
B["&mut [T](Fixed-length array)"]
C["&mut [T](Null-terminated array)"]

UserPtr --> A
UserPtr --> B
UserPtr --> C

get()

Retrieves a single value of type T from user-space memory:

pub fn get(&mut self, aspace: impl AddrSpaceProvider) -> LinuxResult<&mut T>

This method:

  1. Validates the memory region
  2. Checks alignment
  3. Verifies read/write permissions
  4. Populates the page tables if necessary
  5. Returns a mutable reference if successful, or an error (EFAULT) if access is invalid

Sources: src/lib.rs(L175 - L183) 

get_as_slice()

Retrieves a fixed-length slice of elements from user-space memory:

pub fn get_as_slice(
    &mut self,
    aspace: impl AddrSpaceProvider,
    length: usize
) -> LinuxResult<&mut [T]>

This method performs the same safety checks as get() but for an array of specified length.

Sources: src/lib.rs(L186 - L199) 

get_as_null_terminated()

Retrieves a null-terminated array from user-space memory:

pub fn get_as_null_terminated(
    &mut self,
    aspace: impl AddrSpaceProvider
) -> LinuxResult<&mut [T]>

This specialized method:

  1. Searches for a null value (T::default()) to determine array length
  2. Validates each memory page during the search
  3. Returns a mutable slice containing all elements up to (but not including) the null terminator

This method requires that type T implements Eq + Default traits.

Sources: src/lib.rs(L204 - L217) 

Memory Safety Mechanism

The UserPtr<T> API employs a multi-layered safety mechanism to prevent kernel crashes when accessing user-space memory:

sequenceDiagram
    participant KernelCode as "Kernel Code"
    participant UserPtr as "UserPtr"
    participant check_region as "check_region()"
    participant AddrSpace as "AddrSpace"
    participant access_user_memory as "access_user_memory()"

    KernelCode ->> UserPtr: get()
    UserPtr ->> check_region: check_region_with()
    check_region ->> check_region: Check alignment
    check_region ->> AddrSpace: check_region_access()
    AddrSpace -->> check_region: Access allowed/denied
    alt Access allowed
        check_region ->> AddrSpace: populate_area()
        AddrSpace -->> check_region: Pages populated
        check_region -->> UserPtr: OK
        UserPtr ->> access_user_memory: Set ACCESSING_USER_MEM flag
        access_user_memory ->> UserPtr: Access memory safely
        UserPtr -->> KernelCode: Return reference
    else Access denied
        check_region -->> UserPtr: EFAULT
        UserPtr -->> KernelCode: Return error
    end

Key safety components:

  1. Alignment Checking: Ensures the pointer is properly aligned for the target type
  2. Access Validation: Verifies memory region is accessible with appropriate permissions
  3. Page Table Population: Prepares memory pages before access
  4. Context-Aware Page Fault Handling: Uses the ACCESSING_USER_MEM flag to permit controlled page faults
  5. Error Propagation: Returns LinuxError::EFAULT when access is denied

Sources: src/lib.rs(L11 - L54)  src/lib.rs(L175 - L183) 

Usage Pattern

The typical usage pattern for UserPtr<T> involves:

flowchart TD
A["Create UserPtrfrom usize address"]
B["Obtain AddrSpaceProvider"]
C["Call appropriate get()method"]
D["Use returned referencesafely"]
E["Handle error(EFAULT)"]

A --> B
B --> C
C --> D
C --> E

Example Usage Flow

  1. Obtain a user-space address (typically from a system call parameter)
  2. Convert it to a UserPtr<T>
  3. Get an address space provider (typically from the current process)
  4. Call one of the get* methods to safely access the memory
  5. Use the returned reference to read or modify user memory
  6. Handle any errors (typically EFAULT for invalid access)

Sources: src/lib.rs(L175 - L217) 
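The flow above can be simulated end to end with everything mocked locally. In this hedged sketch, "user memory" is a plain byte buffer, addresses are offsets into it, and `AddrSpace`/`get_u32` are hypothetical stand-ins; the real `get()` additionally checks alignment and populates page tables.

```rust
// "User memory" simulated as a byte buffer; addresses are offsets into it.
struct AddrSpace {
    mem: Vec<u8>,
}

impl AddrSpace {
    // Mock of region validation: the requested range must lie inside the
    // mapped buffer, otherwise the access fails with EFAULT.
    fn check(&self, addr: usize, len: usize) -> Result<(), &'static str> {
        if addr.checked_add(len).map_or(false, |end| end <= self.mem.len()) {
            Ok(())
        } else {
            Err("EFAULT")
        }
    }
}

// Mock of the get() pattern: validate first, then read through the region.
fn get_u32(aspace: &AddrSpace, addr: usize) -> Result<u32, &'static str> {
    aspace.check(addr, 4)?;
    let b = &aspace.mem[addr..addr + 4];
    Ok(u32::from_le_bytes([b[0], b[1], b[2], b[3]]))
}

fn main() {
    let aspace = AddrSpace { mem: vec![1, 0, 0, 0, 0xff] };
    assert_eq!(get_u32(&aspace, 0), Ok(1));
    // Reading past the mapped region fails instead of crashing.
    assert_eq!(get_u32(&aspace, 3), Err("EFAULT"));
    println!("ok");
}
```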

Relationship with AddrSpaceProvider

The UserPtr<T> API relies on the AddrSpaceProvider trait to abstract away the details of the underlying memory management system:


This abstraction allows the same UserPtr<T> implementation to work with different memory management systems, as long as they implement the AddrSpaceProvider trait.

Sources: src/lib.rs(L119 - L126)  src/lib.rs(L175 - L183) 

Implementation Details

Memory Region Validation

When accessing user memory, UserPtr<T> validates the memory region through the check_region function, which:

  1. Verifies proper alignment for the target type
  2. Checks access permissions (READ+WRITE for UserPtr<T>)
  3. Ensures the memory pages are populated

For null-terminated arrays, the specialized check_null_terminated function:

  1. Validates memory page by page while searching for the null terminator
  2. Handles page faults that might occur during the search
  3. Determines the total length of the array up to the null terminator

Sources: src/lib.rs(L31 - L54)  src/lib.rs(L56 - L107) 

UserConstPtr API

Relevant source files

This document provides a comprehensive reference for the UserConstPtr<T> type, which enables safe read-only access to user-space memory from kernel code. For information about mutable access to user memory, see UserPtr API.

Overview

UserConstPtr<T> is a generic wrapper around a raw const pointer (*const T) that provides memory-safe operations for reading data from user space. It implements safety checks that prevent common issues like null pointer dereferences, buffer overflows, and illegal memory accesses.

classDiagram
class UserConstPtr~T~ {
    +*const T pointer
    +const ACCESS_FLAGS: MappingFlags
    +address() VirtAddr
    +as_ptr() *const T
    +cast~U~() UserConstPtr~U~
    +is_null() bool
    +nullable() Option~Self~
    +get() LinuxResult~&T~
    +get_as_slice() LinuxResult~&[T]~
    +get_as_null_terminated() LinuxResult~&[T]~
}

class UserConstPtr_cchar {
    
    +get_as_str() LinuxResult~&str~
}

UserConstPtr_cchar  --|>  UserConstPtr : "Specializedimplementation"

Sources: src/lib.rs(L219 - L303) 

Memory Safety Architecture

The UserConstPtr<T> type is part of a comprehensive memory safety system that prevents the kernel from crashing when accessing potentially invalid user memory. Unlike raw pointers, UserConstPtr<T> operations perform several safety checks before accessing user memory:

flowchart TD
A["UserConstPtr.get()"]
B["check_region_with()"]
C["Is pointeraligned?"]
D["Error: EFAULT"]
E["Valid memoryaccess rights?"]
F["Populate page tables"]
G["Page tablespopulated?"]
H["Error: ENOMEM"]
I["Set ACCESSING_USER_MEM flag"]
J["Access memory"]
K["Clear ACCESSING_USER_MEM flag"]
L["Return reference"]

A --> B
B --> C
C --> D
C --> E
E --> D
E --> F
F --> G
G --> H
G --> I
I --> J
J --> K
K --> L

Sources: src/lib.rs(L31 - L54)  src/lib.rs(L258 - L266) 

Type Definition

UserConstPtr<T> is defined as a transparent wrapper around a raw const pointer:

#[repr(transparent)]
pub struct UserConstPtr<T>(*const T);

The #[repr(transparent)] attribute ensures that UserConstPtr<T> has the same memory layout as *const T, making it efficient for passing across FFI boundaries.

Sources: src/lib.rs(L219 - L221) 

Constants

| Constant | Type | Description |
| --- | --- | --- |
| `ACCESS_FLAGS` | `MappingFlags` | Specifies required memory access flags (READ) for user memory regions |

Sources: src/lib.rs(L227 - L228) 

Basic Methods

Conversion and Type Manipulation

| Method | Signature | Description |
| --- | --- | --- |
| `From<usize>` | `fn from(value: usize) -> Self` | Creates a `UserConstPtr` from a raw address |
| `address` | `fn address(&self) -> VirtAddr` | Returns the virtual address of the pointer |
| `as_ptr` | `unsafe fn as_ptr(&self) -> *const T` | Returns the underlying raw pointer (unsafe) |
| `cast` | `fn cast<U>(self) -> UserConstPtr<U>` | Casts the pointer to a different type |

Sources: src/lib.rs(L221 - L243) 

Null Checking

| Method | Signature | Description |
| --- | --- | --- |
| `is_null` | `fn is_null(&self) -> bool` | Checks if the pointer is null |
| `nullable` | `fn nullable(self) -> Option<Self>` | Converts to `None` if null, or `Some(self)` otherwise |

Sources: src/lib.rs(L245 - L253) 

Memory Access Methods

Single Value Access

fn get(&self, aspace: impl AddrSpaceProvider) -> LinuxResult<&T>

Safely retrieves a reference to the value pointed to by UserConstPtr<T>:

  • Validates memory alignment
  • Checks user memory access permissions
  • Populates page tables if necessary
  • Returns a reference or EFAULT error if access failed

Sources: src/lib.rs(L258 - L266) 

Slice Access

fn get_as_slice(&self, aspace: impl AddrSpaceProvider, length: usize) -> LinuxResult<&[T]>

Safely retrieves a slice of values:

  • Validates memory region for the entire slice
  • Verifies alignment and access permissions
  • Returns a slice reference or error if access failed

Sources: src/lib.rs(L269 - L277) 

Null-Terminated Data

fn get_as_null_terminated(&self, aspace: impl AddrSpaceProvider) -> LinuxResult<&[T]>
where
    T: Eq + Default,

Retrieves a slice of values terminated by a null value (default value of type T):

  • Scans memory until it finds a null value
  • Checks access permissions page-by-page during scan
  • Returns a slice that includes all values up to (but not including) the null terminator

Sources: src/lib.rs(L282 - L291) 

String-Specific Operations

UserConstPtr<c_char> has an additional method for safely retrieving strings from user space:

fn get_as_str(&self, aspace: impl AddrSpaceProvider) -> LinuxResult<&'static str>

This method:

  1. Gets the null-terminated array using get_as_null_terminated
  2. Transmutes the char array to bytes
  3. Validates that the bytes form valid UTF-8
  4. Returns a string reference or EILSEQ error if invalid UTF-8

Sources: src/lib.rs(L294 - L302) 

Memory Access Flow

The following diagram illustrates the complete flow of operations when accessing user memory with UserConstPtr:

sequenceDiagram
    participant KernelCode as "Kernel Code"
    participant UserConstPtrT as "UserConstPtr<T>"
    participant AddrSpaceProvider as "AddrSpaceProvider"
    participant check_region as "check_region"
    participant UserMemory as "User Memory"

    KernelCode ->> UserConstPtrT: Call get/get_as_slice/etc.
    UserConstPtrT ->> AddrSpaceProvider: with_addr_space()
    AddrSpaceProvider ->> check_region: check_region()
    check_region ->> check_region: Check alignment
    check_region ->> check_region: Check access permissions
    check_region ->> check_region: Populate page tables
    alt Access Allowed
        check_region ->> AddrSpaceProvider: Ok(())
        AddrSpaceProvider ->> UserConstPtrT: Ok(())
        UserConstPtrT ->> UserConstPtrT: Set ACCESSING_USER_MEM = true
        UserConstPtrT ->> UserMemory: Read memory
        UserConstPtrT ->> UserConstPtrT: Set ACCESSING_USER_MEM = false
        UserConstPtrT ->> KernelCode: Return reference
    else Access Denied
        check_region ->> AddrSpaceProvider: Err(EFAULT)
        AddrSpaceProvider ->> UserConstPtrT: Err(EFAULT)
        UserConstPtrT ->> KernelCode: Return error
    end

Sources: src/lib.rs(L22 - L29)  src/lib.rs(L31 - L54)  src/lib.rs(L258 - L266) 

Usage Example

Here's a conceptual example of using UserConstPtr:

  1. Receive a user address as a usize
  2. Convert it to a UserConstPtr<T>
  3. Check if it's null
  4. Access the user memory safely
  5. Handle any errors appropriately

Differences from UserPtr

While UserPtr<T> provides mutable access with both read and write permissions, UserConstPtr<T> is specifically designed for read-only access:

| Feature | UserPtr<T> | UserConstPtr<T> |
| --- | --- | --- |
| Underlying type | `*mut T` | `*const T` |
| Access flags | `READ` and `WRITE` | `READ` |
| Reference type | `&mut T` | `&T` |
| Usage | Reading and writing | Reading only |

Sources: src/lib.rs(L137)  src/lib.rs(L227 - L228) 

Thread Safety

UserConstPtr<T> operations use a per-CPU flag, ACCESSING_USER_MEM, that informs the page fault handler that a page fault during memory access should be handled rather than causing a kernel panic. This flag is automatically set and cleared around each memory access operation.

Sources: src/lib.rs(L11 - L12)  src/lib.rs(L22 - L29) 

Helper Functions

Relevant source files

This document describes the utility functions in the axptr library that support safe user-space memory access in kernel code. These helper functions implement the core safety mechanisms behind the UserPtr and UserConstPtr types but are not typically used directly by client code. For information about the user pointer types themselves, see UserPtr API and UserConstPtr API.

Per-CPU Flag System

The foundation of axptr's safety system is a per-CPU boolean flag that tracks when the kernel is accessing user memory.

flowchart TD
A["User Memory Access Request"]
B["is_accessing_user_memory()"]
C["ACCESSING_USER_MEM flag"]
D["OS allows page faultsfrom kernel mode"]
E["OS handles as regularkernel page fault"]
F["access_user_memory()"]
G["ACCESSING_USER_MEM = true"]
H["Execute memory accesscallback function"]
I["ACCESSING_USER_MEM = false"]
J["Return result"]

A --> B
B --> C
C --> D
C --> E
F --> G
G --> H
H --> I
I --> J

Sources: src/lib.rs(L11 - L29) 

The library provides two key functions for working with this system:

  1. is_accessing_user_memory(): A public function that returns the current state of the ACCESSING_USER_MEM flag. Operating system implementations should check this flag when handling page faults in kernel mode - if it returns true, page faults should be allowed to proceed (as they might be from legitimate user memory access attempts).
  2. access_user_memory<R>(f: impl FnOnce() -> R) -> R: An internal function that executes a callback with the user memory access flag set to true. This function:
  • Sets the ACCESSING_USER_MEM flag to true
  • Executes the provided callback function
  • Restores the flag to false
  • Returns the result of the callback

The ACCESSING_USER_MEM flag is implemented as a per-CPU variable using the percpu crate to ensure thread safety without locking overhead.
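A minimal std-only model of this flag protocol follows. Note the assumption: the real crate stores the flag as a per-CPU variable via the `percpu` crate; this sketch substitutes a thread-local purely for illustration.

```rust
use std::cell::Cell;

thread_local! {
    // Stand-in for the per-CPU ACCESSING_USER_MEM variable; the real
    // crate uses the `percpu` crate, not a thread-local.
    static ACCESSING_USER_MEM: Cell<bool> = Cell::new(false);
}

fn is_accessing_user_memory() -> bool {
    ACCESSING_USER_MEM.with(|f| f.get())
}

fn access_user_memory<R>(f: impl FnOnce() -> R) -> R {
    ACCESSING_USER_MEM.with(|flag| flag.set(true)); // allow user-memory faults
    let result = f();                               // run the access callback
    ACCESSING_USER_MEM.with(|flag| flag.set(false)); // restore the flag
    result
}
```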

Memory Region Validation

Before accessing user memory, axptr performs thorough validation using the check_region function.

flowchart TD
A["check_region()"]
B["Memory aligned?"]
C["Return EFAULT"]
D["Access permissionsgranted?"]
E["Populate page tables"]
F["Return error"]
G["Return OK"]

A --> B
B --> C
B --> D
D --> C
D --> E
E --> F
E --> G

Sources: src/lib.rs(L31 - L54) 

The check_region function performs several critical checks:

  1. Alignment Validation: Verifies that the start address has proper alignment for the requested data type. If misaligned, returns EFAULT.
  2. Access Permission Check: Uses the AddrSpace.check_region_access() method to verify that the memory region has the appropriate access permissions (read/write).
  3. Page Table Population: Calls AddrSpace.populate_area() to ensure that page tables are set up correctly for the memory region. This may involve mapping physical pages if they're not already mapped.

The library also provides a wrapper function check_region_with that works with the AddrSpaceProvider trait, simplifying its usage from the pointer types.

Null-Terminated Data Processing

A specialized helper function handles the common case of accessing null-terminated data (like C strings) from user space.

flowchart TD
subgraph subGraph0["Page Boundary Handling"]
    E["Scan memory for null terminator"]
    F["Reached page boundary?"]
    G["Page has accesspermission?"]
    H["Return EFAULT"]
    I["Move to next page"]
    J["Found nullterminator?"]
    K["Advance to next element"]
    L["End scan"]
end
A["check_null_terminated()"]
B["Memory aligned?"]
C["Return EFAULT"]
D["Set ACCESSING_USER_MEM = true"]
M["Set ACCESSING_USER_MEM = false"]
N["Return pointer and length"]

A --> B
B --> C
B --> D
D --> E
E --> F
F --> G
F --> J
G --> H
G --> I
I --> F
J --> K
J --> L
K --> F
L --> M
M --> N

Sources: src/lib.rs(L56 - L107) 

The check_null_terminated<T> function provides a safe way to access variable-length, null-terminated data from user space:

  1. Initial Alignment Check: Verifies the start address has proper alignment for type T.
  2. Page-by-Page Scanning: Processes memory one page at a time, checking permissions at each page boundary. This approach allows handling of strings that span multiple pages.
  3. Safe Memory Access: Uses the access_user_memory() function to set the ACCESSING_USER_MEM flag during scanning, allowing proper handling of page faults that might occur.
  4. Null Terminator Detection: Reads each element using read_volatile() and compares it to the default value (T::default()) to find the null terminator.

This function supports the implementation of get_as_null_terminated() in both UserPtr and UserConstPtr types, as well as get_as_str() for UserConstPtr<c_char>.
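The terminator scan at the heart of this function can be modeled in a few lines. This is a sketch only: the real `check_null_terminated` also validates alignment up front and re-checks permissions at every page boundary, both omitted here.

```rust
/// Count elements until one equals T::default(), mirroring the
/// null-terminator detection described above (page handling omitted).
unsafe fn scan_null_terminated<T: Default + PartialEq>(start: *const T) -> usize {
    let mut len = 0;
    loop {
        // read_volatile mirrors the crate's element reads
        if start.add(len).read_volatile() == T::default() {
            return len; // length excludes the terminator
        }
        len += 1;
    }
}
```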

Integration With Address Space Provider

The helper functions integrate with the address space abstraction through the check_region_with function.


Sources: src/lib.rs(L110 - L117)  src/lib.rs(L119 - L126) 

The check_region_with function serves as a bridge between the high-level pointer types and the low-level memory region validation:

  1. It accepts an AddrSpaceProvider implementation (typically a reference to an AddrSpace)
  2. It calls with_addr_space() on the provider to get access to the actual AddrSpace
  3. It passes check_region() as a callback, forwarding the memory validation request
  4. It returns the result of the validation

This design reduces code duplication and avoids excessive generic function instantiations, as noted in the source code comment.
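The bridging shape can be sketched as follows. All names here are simplified stand-ins for the real axptr types, and the region check is reduced to a bounds test.

```rust
struct AddrSpace { start: usize, len: usize }

impl AddrSpace {
    // Reduced stand-in for check_region: the real function also checks
    // alignment and permissions and populates page tables.
    fn check_region(&self, addr: usize, size: usize) -> Result<(), i32> {
        if addr >= self.start && addr.saturating_add(size) <= self.start + self.len {
            Ok(())
        } else {
            Err(14) // EFAULT
        }
    }
}

trait AddrSpaceProvider {
    fn with_addr_space<R>(self, f: impl FnOnce(&AddrSpace) -> R) -> R;
}

impl AddrSpaceProvider for &AddrSpace {
    fn with_addr_space<R>(self, f: impl FnOnce(&AddrSpace) -> R) -> R {
        f(self)
    }
}

// The bridge: forward the validation request through the provider.
fn check_region_with(aspace: impl AddrSpaceProvider, addr: usize, size: usize) -> Result<(), i32> {
    aspace.with_addr_space(|a| a.check_region(addr, size))
}
```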

Helper Function Usage Patterns

The following table summarizes how the helper functions are used by the public API:

| Helper Function | Used By | Purpose |
| --- | --- | --- |
| `is_accessing_user_memory()` | OS implementation | Determine if page faults in kernel mode should be allowed |
| `access_user_memory()` | `check_null_terminated()` | Set flag during user memory scanning |
| `check_region()` | `check_region_with()` | Validate memory region alignment and permissions |
| `check_null_terminated()` | `get_as_null_terminated()` | Safely scan for null-terminated data |
| `check_region_with()` | `UserPtr::get()`, `UserConstPtr::get()`, etc. | Bridge between pointer types and memory validation |

Sources: src/lib.rs(L175 - L182)  src/lib.rs(L204 - L216)  src/lib.rs(L258 - L266)  src/lib.rs(L282 - L291) 

These helper functions work together to create a comprehensive safety system that prevents the kernel from crashing when accessing user memory, while maintaining good performance and ergonomics.

Overview

Relevant source files

axprocess is a process management crate designed for ArceOS that provides core abstractions and mechanisms for managing processes, threads, process groups, and sessions. This document introduces the high-level concepts, architecture, and components of the system.

For a deeper dive into the architecture, see Core Architecture.

Purpose and Scope

The axprocess crate implements a hierarchical process management system inspired by Unix-like operating systems, providing the following capabilities:

  • Process creation, management, and termination
  • Thread management within processes
  • Process grouping through process groups
  • Session management for related process groups
  • Parent-child process relationships

The crate manages the lifecycle of these entities while ensuring proper resource cleanup and memory safety using Rust's ownership model.

Sources: src/lib.rs(L1 - L19)  Cargo.toml(L1 - L7)  README.md(L1 - L5) 

System Overview

axprocess implements a hierarchical system with four primary abstractions:

flowchart TD
subgraph subGraph0["Process Management Hierarchy"]
    S["Session"]
    PG["Process Group"]
    P["Process"]
    T["Thread"]
end

P --> T
PG --> P
S --> PG
  • Session: A collection of process groups, typically associated with a user login
  • Process Group: A collection of related processes, useful for signal handling
  • Process: An execution environment with its own address space and resources
  • Thread: An execution context within a process

Sources: src/lib.rs(L8 - L11)  src/lib.rs(L16 - L19) 

Core Components and Relationships

The system is organized in a hierarchical structure with well-defined relationships between components:

classDiagram
class Session {
    sid: Pid
    process_groups: WeakMap~Pid, Weak~ProcessGroup~~
    +sid() Pid
    +process_groups() Vec~Arc~ProcessGroup~~
}

class ProcessGroup {
    pgid: Pid
    session: Arc~Session~
    processes: WeakMap~Pid, Weak~Process~~
    +pgid() Pid
    +session() Arc~Session~
    +processes() Vec~Arc~Process~~
}

class Process {
    pid: Pid
    is_zombie: AtomicBool
    children: StrongMap~Pid, Arc~Process~~
    parent: Weak~Process~
    group: Arc~ProcessGroup~
    +pid() Pid
    +exit() void
    +is_zombie() bool
    +fork(pid: Pid) ProcessBuilder
}

class Thread {
    tid: Pid
    process: Arc~Process~
    +tid() Pid
    +process() &Arc~Process~
    +exit(exit_code: i32) bool
}

Session "1" o-- "*" ProcessGroup : contains
ProcessGroup "1" o-- "*" Process : contains
Process "1" o-- "*" Thread : contains
Process "1" --> "*" Process : parent-child

Key concepts in this relationship:

  • Sessions contain multiple process groups
  • Process groups contain multiple processes
  • Processes contain threads
  • Processes form parent-child relationships

Sources: src/lib.rs(L13 - L14)  src/lib.rs(L16 - L19) 

Reference Management Strategy

The system uses a carefully designed reference management strategy to prevent memory leaks and ensure proper cleanup:


  • Strong references (Arc): Used for upward relationships to ensure parent objects remain alive as long as their children need them
  • Weak references (Weak): Used for downward and circular relationships to prevent reference cycles

This strategy ensures that resources are properly cleaned up when they're no longer needed, while maintaining the necessary relationships between components.

Sources: Cargo.toml(L8 - L11) 
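The pattern can be demonstrated with std's `Arc`/`Weak` (the names here are illustrative, not the axprocess types): a member keeps its container alive with a strong reference, while the container tracks members only weakly, so dropped members simply stop upgrading.

```rust
use std::sync::{Arc, Mutex, Weak};

struct Group {
    members: Mutex<Vec<Weak<Member>>>, // downward: weak references
}

struct Member {
    _group: Arc<Group>, // upward: strong reference keeps the group alive
}

fn new_member(group: &Arc<Group>) -> Arc<Member> {
    let m = Arc::new(Member { _group: group.clone() });
    group.members.lock().unwrap().push(Arc::downgrade(&m));
    m
}

fn live_members(group: &Group) -> usize {
    // entries whose member was dropped fail to upgrade
    group.members.lock().unwrap().iter().filter(|w| w.upgrade().is_some()).count()
}
```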

Process Lifecycle

Processes in the system follow a lifecycle from creation to termination:


This lifecycle management ensures proper resource cleanup and allows parent processes to retrieve exit status from terminated child processes.

For detailed information about process lifecycle, see Process Lifecycle.

Sources: src/lib.rs(L16) 

Thread Management

Threads are execution contexts within a process:

flowchart TD
subgraph subGraph0["Thread Management"]
    p["Process"]
    tb["ThreadBuilder"]
    t["Thread"]
end

p --> tb
t --> p
tb --> t

Each process can have multiple threads, and the last thread's exit typically triggers the process to exit as well. Thread creation is handled through the ThreadBuilder pattern, providing a flexible way to configure new threads.
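The builder flow above can be sketched with a toy stand-in; the real `ThreadBuilder` also registers the thread with the process's thread group, and the field types here are simplified assumptions.

```rust
struct Thread { tid: u32, data: Option<String> }

struct ThreadBuilder { tid: u32, data: Option<String> }

impl ThreadBuilder {
    fn new(tid: u32) -> Self { Self { tid, data: None } }
    // chainable configuration step, as in ThreadBuilder::data()
    fn data(mut self, d: impl Into<String>) -> Self {
        self.data = Some(d.into());
        self
    }
    // finalize, as in ThreadBuilder::build()
    fn build(self) -> Thread { Thread { tid: self.tid, data: self.data } }
}
```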

For more information on thread management, see Thread Management.

Sources: src/lib.rs(L19) 

Integration with ArceOS

axprocess serves as a foundational component in the ArceOS kernel, providing essential process management capabilities that other kernel subsystems build upon:

flowchart TD
subgraph subGraph0["ArceOS Kernel Components"]
    axprocess["axprocess (Process Management)"]
    scheduler["Scheduler"]
    memory["Memory Management"]
    fs["File System"]
end

axprocess --> fs
axprocess --> memory
axprocess --> scheduler

The abstractions provided by axprocess enable the development of higher-level operating system features and applications.

Sources: Cargo.toml(L6)  README.md(L3) 

Next Steps

For more detailed information about specific components and features of the axprocess system, refer to these wiki pages:

Core Architecture

Relevant source files

This document explains the high-level architecture of the axprocess system, focusing on the core components and their relationships. It describes the hierarchical structure, component interactions, and memory management strategy used in the system. For specific details about process lifecycle management, see Process Lifecycle, and for thread management details, see Thread Management.

Component Overview

The axprocess system consists of four primary components that form a hierarchical structure:

  1. Session: A collection of process groups
  2. Process Group: A collection of processes
  3. Process: A basic unit of program execution that contains threads
  4. Thread: An execution unit within a process

Title: Core Component Hierarchy

Sources: src/process.rs src/process_group.rs src/session.rs src/thread.rs

Hierarchical Structure

The system follows a Unix-like hierarchical structure where components are organized in a containment hierarchy:

  1. Sessions contain multiple process groups and are identified by a session ID (sid)
  2. Process Groups contain multiple processes and are identified by a process group ID (pgid)
  3. Processes contain multiple threads and are identified by a process ID (pid)
  4. Threads are the execution units and are identified by a thread ID (tid)

Additionally, processes can have parent-child relationships with other processes, forming a separate process hierarchy.

flowchart TD
subgraph subGraph2["Session (sid=100)"]
    subgraph subGraph0["ProcessGroup (pgid=100)"]
        P100["Process (pid=100)"]
        P101["Process (pid=101)"]
        P102["Process (pid=102)"]
    end
    subgraph subGraph1["ProcessGroup (pgid=200)"]
        P200["Process (pid=200)"]
        P201["Process (pid=201)"]
    end
end
T100["Thread (tid=100)"]
T101["Thread (tid=101)"]
T102["Thread (tid=102)"]

P100 --> P101
P100 --> P102
P100 --> T100
P100 --> T101
P101 --> T102

Title: Hierarchical Container Relationships

Sources: src/process.rs(L34 - L164)  src/process_group.rs(L12 - L17)  src/session.rs(L12 - L16) 

Component Relationships

Session and Process Group Relationship

Sessions contain process groups, and each process group belongs to exactly one session:

  • A session is identified by a unique sid (Session ID)
  • Sessions maintain a weak map of process groups (process_groups)
  • Process groups hold a strong reference (Arc) to their session
  • New sessions are created using the Session::new(sid) method

Sources: src/session.rs(L12 - L27)  src/process_group.rs(L14 - L30) 

Process Group and Process Relationship

Process groups contain processes, and each process belongs to exactly one process group:

  • A process group is identified by a unique pgid (Process Group ID)
  • Process groups maintain a weak map of processes (processes)
  • Processes hold a strong reference (Arc) to their process group
  • Processes can move between process groups using Process::move_to_group()

Sources: src/process_group.rs(L12 - L47)  src/process.rs(L84 - L164) 

Process and Thread Relationship

Processes contain threads, and each thread belongs to exactly one process:

  • A process contains a ThreadGroup which manages its threads
  • Threads hold a strong reference (Arc) to their process
  • Processes maintain weak references to their threads
  • New threads are created using Process::new_thread() and built with ThreadBuilder

Sources: src/process.rs(L18 - L31)  src/process.rs(L167 - L192)  src/thread.rs(L6 - L88) 

Process Parent-Child Relationship

Processes form a hierarchy through parent-child relationships:

  • Each process (except the init process) has a parent process
  • Processes maintain strong references to their children
  • Processes maintain weak references to their parents
  • Child processes are created using Process::fork()
  • When a process exits, its children are inherited by the init process

Sources: src/process.rs(L70 - L81)  src/process.rs(L195 - L237)  src/process.rs(L261 - L282) 

Reference Management Strategy

The system uses a carefully designed reference counting strategy to prevent memory leaks while ensuring proper cleanup:

flowchart TD
subgraph subGraph1["Weak References (Weak)"]
    ProcessGroup2["ProcessGroup"]
    Process2["Process"]
    ParentProcess2["Parent Process"]
    Process3["Process"]
    Thread2["Thread"]
end
subgraph subGraph0["Strong References (Arc)"]
    Process["Process"]
    ProcessGroup["ProcessGroup"]
    Session["Session"]
    Thread["Thread"]
    ParentProcess["ParentProcess"]
    ChildProcess["Child Process"]
end

ParentProcess --> ChildProcess
Process --> ProcessGroup
Process2 --> ParentProcess2
Process3 --> Thread2
ProcessGroup --> Session
ProcessGroup2 --> Process2
Session --> ProcessGroup2
Thread --> Process

Title: Reference Management Strategy

Key patterns in the reference management strategy:

  1. Upward References: Strong references (Arc) are used for upward relationships:
  • Threads strongly reference their process
  • Processes strongly reference their process group
  • Process groups strongly reference their session
  • Parent processes strongly reference their children
  2. Downward References: Weak references (Weak) are used for downward relationships:
  • Sessions weakly reference their process groups
  • Process groups weakly reference their processes
  • Processes weakly reference their threads
  • Processes weakly reference their parent
  3. Maps and Collections:
  • WeakMap is used for downward references
  • StrongMap is used for the children collection in a process

This strategy ensures that components are kept alive as long as they're needed while preventing reference cycles that would cause memory leaks.

Sources: src/process.rs(L36 - L46)  src/process_group.rs(L14 - L16)  src/session.rs(L14 - L15)  src/thread.rs(L7 - L11) 

Process and Thread Lifecycle

Process Lifecycle


Title: Process Lifecycle States

The process lifecycle consists of these key stages:

  1. Creation: A process is created using ProcessBuilder::build()
  • Init process is created using Process::new_init()
  • Child processes are created using Process::fork()
  2. Execution: The process is active and can create threads
  3. Termination: The process becomes a zombie when Process::exit() is called
  • Its children are inherited by the init process
  • It remains in the zombie state until freed
  4. Cleanup: The process resources are freed when Process::free() is called

Sources: src/process.rs(L195 - L237)  src/process.rs(L261 - L331) 

Thread Lifecycle

sequenceDiagram
    participant Process as Process
    participant ThreadBuilder as ThreadBuilder
    participant Thread as Thread
    participant ThreadGroup as ThreadGroup

    Process ->> ThreadBuilder: new_thread(tid)
    ThreadBuilder ->> ThreadBuilder: data(custom_data)
    ThreadBuilder ->> Thread: build()
    Thread ->> ThreadGroup: add to thread group
    Note over Thread: Thread execution
    Thread ->> ThreadGroup: exit(exit_code)
    ThreadGroup ->> ThreadGroup: remove thread
    ThreadGroup ->> Process: check if last thread
    alt Last thread
        Process ->> Process: may trigger process exit
    end

Title: Thread Lifecycle Flow

The thread lifecycle consists of these key stages:

  1. Creation: A thread is created using ThreadBuilder::build()
  • Process creates a new thread using Process::new_thread()
  • Thread is added to the process's thread group
  2. Execution: The thread executes its workload
  3. Termination: The thread exits using Thread::exit()
  • If it's the last thread, it may trigger process termination
  • Thread is removed from the thread group

Sources: src/thread.rs(L29 - L40)  src/thread.rs(L51 - L88)  src/process.rs(L167 - L177) 

Builder Pattern Implementation

The system uses the Builder pattern for creating processes and threads, allowing for flexible configuration:

Process Builder


Title: Process Builder Pattern

  • Process::new_init() creates a ProcessBuilder for the init process
  • Process::fork() creates a ProcessBuilder for a child process
  • ProcessBuilder::data() sets custom data for the process
  • ProcessBuilder::build() creates and initializes the process

Sources: src/process.rs(L261 - L331) 

Thread Builder


Title: Thread Builder Pattern

  • Process::new_thread() creates a ThreadBuilder
  • ThreadBuilder::data() sets custom data for the thread
  • ThreadBuilder::build() creates and initializes the thread

Sources: src/thread.rs(L51 - L88)  src/process.rs(L167 - L177) 

System Integration

The axprocess crate is designed to provide process management capabilities for the ArceOS kernel:

flowchart TD
subgraph subGraph1["External Systems"]
    Scheduler["OS Scheduler"]
    MemoryManagement["Memory Management"]
    FileSystem["File System"]
end
subgraph subGraph0["axprocess Crate"]
    Process["Process"]
    ProcessLifecycle["Process Lifecycle"]
    ThreadManagement["Thread Management"]
    ProcessGroups["Process Groups"]
    Sessions["Sessions"]
    ProcessBuilder["ProcessBuilder"]
    ParentChild["Parent-Child Relations"]
    ThreadBuilder["ThreadBuilder"]
    ThreadGroup["ThreadGroup"]
end

Process --> FileSystem
Process --> MemoryManagement
Process --> ProcessGroups
Process --> ProcessLifecycle
Process --> Scheduler
Process --> Sessions
Process --> ThreadManagement
ProcessLifecycle --> ParentChild
ProcessLifecycle --> ProcessBuilder
ThreadManagement --> ThreadBuilder
ThreadManagement --> ThreadGroup

Title: System Integration Overview

This process management system provides the foundation for:

  1. Creating and managing processes and threads
  2. Organizing processes into hierarchical structures
  3. Managing process lifecycle from creation to cleanup
  4. Supporting Unix-like process relationships

The design emphasizes:

  • Memory safety through careful reference management
  • Clear separation of concerns with distinct component types
  • Flexibility through builder patterns
  • Performance with minimal locking

Sources: src/lib.rs src/process.rs src/thread.rs

Process Management

Relevant source files

This document explains the process abstraction in the axprocess crate, detailing its internal structure, lifecycle, and key operations. The Process Management system provides the core functionality for creating, maintaining, and terminating processes within the ArceOS kernel.

For details on process creation, see Process Creation and Initialization. For information on parent-child relationships, see Parent-Child Relationships.

Process Structure

The Process struct is the central component of the process management system, encapsulating all resources and state information for a running process.

classDiagram
class Process {
    pid: Pid
    is_zombie: AtomicBool
    tg: SpinNoIrq~ThreadGroup~
    data: Box~dyn Any + Send + Sync~
    children: SpinNoIrq~StrongMap~Pid, Arc~Process~~~
    parent: SpinNoIrq~Weak~Process~~
    group: SpinNoIrq~Arc~ProcessGroup~~
    +pid() Pid
    +data() Option~&T~
    +is_init() bool
    +parent() Option~Arc~Process~~
    +children() Vec~Arc~Process~~
    +exit()
    +free()
    +fork(pid: Pid) ProcessBuilder
}

class ThreadGroup {
    threads: WeakMap~Pid, Weak~Thread~~
    exit_code: i32
    group_exited: bool
}

Process --> ThreadGroup : contains

Sources: src/process.rs(L35 - L47)  src/process.rs(L18 - L31) 

The Process struct maintains:

  • A unique process ID (pid)
  • Zombie state tracking (is_zombie)
  • Thread management through ThreadGroup
  • Custom data storage (data)
  • Process hierarchy relationships (children, parent)
  • Process group membership (group)

Process Creation

Processes are created using the Builder pattern, which provides a flexible way to initialize a new process with various configurations.

sequenceDiagram
    participant ParentProcess as "Parent Process"
    participant ProcessBuilder as "ProcessBuilder"
    participant NewProcess as "New Process"
    participant ProcessGroup as "ProcessGroup"

    Note over ParentProcess: Exists already
    ParentProcess ->> ProcessBuilder: "fork(new_pid)"
    ProcessBuilder ->> ProcessBuilder: "data(custom_data)"
    ProcessBuilder ->> NewProcess: "build()"
    NewProcess ->> ProcessGroup: Join group
    NewProcess ->> ParentProcess: Add as child
    Note over NewProcess: Ready to run

Sources: src/process.rs(L262 - L281)  src/process.rs(L284 - L332) 

There are two primary ways to create processes:

  1. Init Process Creation: The first process in the system is created using Process::new_init(), which returns a ProcessBuilder configured for the init process.
  2. Child Process Creation: Existing processes can create child processes using Process::fork(), which returns a ProcessBuilder with the parent relationship already established.

The ProcessBuilder allows setting custom data before finalizing process creation with build(), which:

  • Creates the process object
  • Establishes parent-child relationships
  • Adds the process to its process group
  • Initializes the thread group

Process Lifecycle

Processes in axprocess follow a defined lifecycle from creation to termination and cleanup.


Sources: src/process.rs(L196 - L236)  tests/process.rs(L16 - L44) 

The process lifecycle consists of these key stages:

  1. Active: After creation, a process is active and can create threads, spawn child processes, and perform operations.
  2. Zombie: When a process terminates via Process::exit(), it becomes a zombie - the process has terminated but its resources are not fully released. At this point:
  • The process is marked as zombie (is_zombie = true)
  • Child processes are reassigned to the init process
  • The process remains in its parent's children list
  3. Freed: The parent process must call Process::free() on a zombie process to complete cleanup, which removes it from the parent's children list.

Note that the init process cannot exit, as enforced by a panic check in the exit() method.
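The zombie transition can be modeled minimally. This is illustrative only: the real `Process::exit()` also reparents children to the init process and updates the process group, and `free()` removes the entry from the parent's children map.

```rust
use std::sync::atomic::{AtomicBool, Ordering};

struct Proc { is_zombie: AtomicBool }

impl Proc {
    fn new() -> Self { Self { is_zombie: AtomicBool::new(false) } }
    // exit(): mark the process a zombie; resources are not yet released
    fn exit(&self) { self.is_zombie.store(true, Ordering::Release); }
    fn is_zombie(&self) -> bool { self.is_zombie.load(Ordering::Acquire) }
    // free(): only valid once the process is a zombie
    fn free(&self) -> Result<(), &'static str> {
        if self.is_zombie() { Ok(()) } else { Err("not a zombie") }
    }
}
```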

Process Hierarchy

Processes are organized in a hierarchical parent-child structure, similar to Unix-like systems.

flowchart TD
subgraph subGraph0["Process Hierarchy"]
    Init["Init Process (PID 1)"]
    P1["Process (PID 2)"]
    P2["Process (PID 3)"]
    C1["Child Process (PID 4)"]
    C2["Child Process (PID 5)"]
    C3["Child Process (PID 6)"]
end

Init --> P1
Init --> P2
P1 --> C1
P1 --> C2
P2 --> C3

Sources: src/process.rs(L71 - L81)  src/process.rs(L207 - L224)  tests/process.rs(L47 - L55) 

Key aspects of process hierarchy:

  1. Init Process: The root of the process hierarchy, created during system initialization. It cannot be terminated and serves as the fallback parent for orphaned processes.
  2. Parent-Child Relationships:
  • Each process except init has exactly one parent
  • A process can have multiple children
  • These relationships are maintained using Arc/Weak references to prevent reference cycles
  3. Orphan Handling: When a parent process exits, its children are reassigned to the init process (known as "reaping"). This ensures all processes always have a valid parent.

Thread Management

Each process can contain multiple threads, managed through a thread group.


Sources: src/process.rs(L18 - L31)  src/process.rs(L167 - L191) 

The thread management system includes:

  1. ThreadGroup: Each process contains a ThreadGroup that tracks:
  • All threads belonging to the process
  • Exit code information
  • Group exit status
  2. Thread Creation: New threads are created using:
process.new_thread(tid) -> ThreadBuilder
  3. Thread Listing: All threads in a process can be retrieved with:
process.threads() -> Vec<Arc<Thread>>
  4. Group Exit: A process can be marked as "group exited", which affects all its threads:
process.group_exit()

Custom Process Data

The Process structure allows associating arbitrary data with each process through a type-erased container.

Process
└── data: Box<dyn Any + Send + Sync>

Sources: src/process.rs(L40)  src/process.rs(L55 - L58)  src/process.rs(L293 - L297) 

Custom data can be:

  • Set during process creation via ProcessBuilder::data<T>(data: T)
  • Retrieved with process.data<T>(), which returns Option<&T>

This mechanism provides flexibility for higher-level subsystems to extend process functionality without modifying the core Process structure.
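The type-erased slot works via `Any` downcasting. A self-contained sketch follows; the struct name is hypothetical, but the `Box<dyn Any + Send + Sync>` storage and the `Option<&T>` retrieval mirror the description above.

```rust
use std::any::Any;

struct ProcData {
    data: Box<dyn Any + Send + Sync>, // type-erased per-process payload
}

impl ProcData {
    fn new<T: Any + Send + Sync>(value: T) -> Self {
        Self { data: Box::new(value) }
    }
    // mirrors process.data::<T>() -> Option<&T>
    fn data<T: Any>(&self) -> Option<&T> {
        self.data.downcast_ref::<T>()
    }
}
```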

Process Management API Summary

| Operation | Method | Description |
| --- | --- | --- |
| Create init process | `Process::new_init(pid)` | Creates the first process in the system |
| Create child process | `parent.fork(pid)` | Creates a new process with the specified parent |
| Get process ID | `process.pid()` | Returns the process ID |
| Get parent | `process.parent()` | Returns the parent process, if any |
| Get children | `process.children()` | Returns all child processes |
| Create thread | `process.new_thread(tid)` | Creates a new thread in the process |
| Check zombie state | `process.is_zombie()` | Returns true if process is a zombie |
| Terminate process | `process.exit()` | Terminates the process, making it a zombie |
| Clean up zombie | `process.free()` | Frees resources for a zombie process |
| Get custom data | `process.data()` | Returns custom data associated with process |

Sources: src/process.rs(L49 - L341) 

Process Creation and Initialization

Relevant source files

This page documents how processes are created and initialized in the axprocess crate. We'll explore the creation of the init process, the ProcessBuilder pattern for process construction, and how child processes are created through forking. For information about the complete process lifecycle, including termination and cleanup, see Process Lifecycle.

Overview

In axprocess, all processes are created using a builder pattern that ensures proper initialization and establishment of hierarchical relationships. The system supports two primary creation paths:

  1. Creating the special "init process" (the first process in the system)
  2. Creating child processes by "forking" from existing parent processes
flowchart TD
A["Process Creation"]
B["Init Process Creation"]
C["Child Process Creation"]
D["Process::new_init()"]
E["parent.fork()"]
F["ProcessBuilder::build()"]
G["New Process Instance"]

A --> B
A --> C
B --> D
C --> E
D --> F
E --> F
F --> G

Sources: src/process.rs(L262 - L332) 

The Init Process

The init process is the first process in the system and serves as the "root" of the process hierarchy. It has no parent and adopts orphaned processes when their parents exit.

Creating the Init Process

The init process is created using the Process::new_init method and stored in a static variable for system-wide access.

sequenceDiagram
    participant ClientCode as "Client Code"
    participant ProcessBuilder as "ProcessBuilder"
    participant INIT_PROCstatic as "INIT_PROC (static)"
    participant NewSession as "New Session"
    participant NewProcessGroup as "New Process Group"

    ClientCode ->> ProcessBuilder: "Process::new_init(pid)"
    ClientCode ->> ProcessBuilder: "build()"
    ProcessBuilder ->> NewSession: "Session::new(pid)"
    ProcessBuilder ->> NewProcessGroup: "ProcessGroup::new(pid, session)"
    ProcessBuilder ->> INIT_PROCstatic: "INIT_PROC.init_once(process)"
    Note over INIT_PROCstatic: "Init process stored for<br>system-wide access"

Sources: src/process.rs(L262 - L272)  src/process.rs(L301 - L331)  src/process.rs(L334 - L341) 

The Init Process Responsibilities

The init process has special responsibilities in the system:

  • Cannot be terminated (the system enforces this)
  • Adopts orphaned processes when their parents exit
  • Provides the foundation for the process hierarchy

The code explicitly prevents the init process from exiting:

pub fn exit(self: &Arc<Self>) {
    if self.is_init() {
        panic!("init process cannot exit");
    }
    // Exit code continues...
}

Sources: src/process.rs(L207 - L226)  src/process.rs(L334 - L341) 

The ProcessBuilder Pattern

The ProcessBuilder struct provides a flexible way to configure and create new processes. It follows the builder pattern, allowing optional configurations before building the actual process.

ProcessBuilder Fields

| Field | Type | Description |
| --- | --- | --- |
| `pid` | `Pid` | Process identifier |
| `parent` | `Option<Arc<Process>>` | Parent process (`None` for init) |
| `data` | `Box<dyn Any + Send + Sync>` | Custom data associated with the process |

Sources: src/process.rs(L285 - L289) 

Process Construction Flow

sequenceDiagram
    participant ClientCode as "Client Code"
    participant ProcessBuilder as "ProcessBuilder"
    participant NewProcess as "New Process"
    participant ProcessGroup as "Process Group"

    ClientCode ->> ProcessBuilder: "new_init(pid) or fork(pid)"
    opt Configure Process
        ClientCode ->> ProcessBuilder: "data(custom_data)"
    end
    ClientCode ->> ProcessBuilder: "build()"
    ProcessBuilder ->> ProcessGroup: "Get parent's group or create new"
    ProcessBuilder ->> NewProcess: "Create process with necessary fields"
    ProcessBuilder ->> ProcessGroup: "Add process to group"
    alt Has Parent
        ProcessBuilder ->> NewProcess: "Add as child to parent"
    else No Parent (Init)
        ProcessBuilder ->> NewProcess: "Store as INIT_PROC"
    end
    ProcessBuilder -->> ClientCode: "Return Arc<Process>"

Sources: src/process.rs(L262 - L332) 

Child Process Creation (Forking)

Child processes are created by "forking" from an existing parent process. This is done using the fork method on a parent process.

Fork Process

flowchart TD
A["Parent Process"]
B["ProcessBuilder"]
C["Configured Builder"]
D["Child Process"]
E["Parent's Process Group"]
F["Parent-Child Relationship"]

A --> B
B --> C
C --> D
D --> E
D --> F

Sources: src/process.rs(L275 - L281)  src/process.rs(L301 - L331) 

Inheritance During Forking

When a process is forked, the child process inherits several properties from its parent:

  1. Process Group: The child joins the parent's process group by default
  2. Parent Reference: The child maintains a reference to its parent
  3. Children Collection: The parent adds the child to its children collection

The code establishes these relationships during the build method:

// Set parent-child relationship
if let Some(parent) = parent {
    parent.children.lock().insert(pid, process.clone());
}

// Child inherits parent's group or creates new group for init
let group = parent.as_ref().map_or_else(
    || {
        let session = Session::new(pid);
        ProcessGroup::new(pid, &session)
    },
    |p| p.group(),
);

Sources: src/process.rs(L303 - L330) 

Process Initialization Details

When a new process is created, several key initialization steps occur:

  1. Process Structure: A new Process structure is allocated with the provided PID
  2. Zombie State: Set to false initially
  3. Thread Group: Empty thread group is initialized
  4. Custom Data: Any provided custom data is stored
  5. Children Map: Empty children map is created
  6. Parent Reference: Weak reference to parent is stored (if any)
  7. Process Group: Process joins parent's group or creates a new group
  8. Registration: Process is registered with its group

Memory Management Strategy

Process creation uses a careful reference counting strategy to prevent memory leaks:

  1. Strong References (Arc<Process>):
  • From parent to children
  • From process group to processes
  • From threads to their process
  2. Weak References (Weak<Process>):
  • From child to parent (prevents reference cycles)
  • From process group to processes in weak maps
flowchart TD
subgraph subGraph0["Reference Relationship"]
    P1["Parent Process"]
    C1["Child Process"]
    PG["Process Group"]
    T1["Thread"]
end

C1 --> P1
C1 --> PG
P1 --> C1
P1 --> PG
PG --> C1
PG --> P1
T1 --> P1

Sources: src/process.rs(L301 - L331) 
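These counting rules can be observed directly with std's `Arc`/`Weak`; the helpers below are illustrative only:

```rust
use std::sync::{Arc, Weak};

// A child's upward link is a Weak pointer: it raises the weak count
// but not the strong count, so no reference cycle is created.
pub fn counts_with_weak_backlink() -> (usize, usize) {
    let parent = Arc::new(String::from("parent"));
    let _back: Weak<String> = Arc::downgrade(&parent); // child -> parent link
    (Arc::strong_count(&parent), Arc::weak_count(&parent))
}

// Once the last strong reference is dropped, weak upgrades fail.
// This is how dead entries are detected in the weak maps.
pub fn upgrade_after_drop() -> bool {
    let parent = Arc::new(String::from("parent"));
    let back = Arc::downgrade(&parent);
    drop(parent); // last strong reference gone
    back.upgrade().is_some()
}
```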

Practical Example

Here's how a process hierarchy might be created in code:

// Create init process with PID 0
let init = Process::new_init(0).build();

// Create a child process with PID 1
let child1 = init.fork(1).build();

// Create another child with PID 2 and custom data
let child2 = init.fork(2).data(MyCustomData { value: 42 }).build();

// Create a "grandchild" process with PID 3
let grandchild = child1.fork(3).build();

The resulting hierarchy would look like:

flowchart TD
Init["Init Process (PID 0)"]
Child1["Child Process (PID 1)"]
Child2["Child Process (PID 2)with custom data"]
Grandchild["Grandchild Process (PID 3)"]

Child1 --> Grandchild
Init --> Child1
Init --> Child2

Sources: src/process.rs(L262 - L332)  tests/common/mod.rs(L15 - L28) 

Implementation of ProcessBuilder::build

The build method is the core of process initialization. It takes the builder's configuration and constructs a fully initialized process with proper relationships.

The method performs these key steps:

  1. Determines the process group (from parent or creates new one)
  2. Constructs the Process struct with all fields
  3. Adds the process to its group
  4. Establishes parent-child relationship or marks as init process
  5. Returns the new process wrapped in an Arc

Sources: src/process.rs(L301 - L331) 

System Integration

The process creation system integrates with other components of axprocess:

  1. Thread Creation: After a process is created, threads can be added using process.new_thread(tid)
  2. Process Group Management: Processes can create or join process groups after creation
  3. Session Management: Processes can create new sessions
  4. Parent-Child Relations: The system maintains a hierarchy for resource inheritance and cleanup

Sources: src/process.rs(L168 - L177)  src/process.rs(L100 - L163) 

Summary

The axprocess crate provides a robust system for process creation and initialization that:

  1. Uses the builder pattern for flexible configuration
  2. Establishes proper hierarchical relationships
  3. Manages memory safely with appropriate reference counting
  4. Supports special handling for the init process
  5. Maintains proper process group and session memberships

This foundation enables the subsequent lifecycle management and inter-process relationships that are essential to an operating system's process subsystem.

Process Lifecycle

Relevant source files

This document details the lifecycle of a process in the axprocess crate, from creation through execution to termination and cleanup. For information about process creation techniques and initialization, see Process Creation and Initialization. For details on parent-child relationships, see Parent-Child Relationships.

Overview

Processes in the axprocess crate follow a well-defined lifecycle that ensures proper resource management and cleanup. The lifecycle consists of three primary states: active, zombie, and freed (cleaned up).


Sources: src/process.rs(L207 - L236)  src/process.rs(L285 - L332) 

Process Creation

Processes are created using the ProcessBuilder pattern, which configures and then builds a new process instance.

Init Process Creation

The initialization of the first process (init process) is a special case:

// Create the init process
let init = Process::new_init(pid).build();

The init process is stored in a static INIT_PROC variable and serves as the ancestor of all other processes. It cannot be terminated and acts as the "reaper" for orphaned processes.

Child Process Creation

Regular processes are created as children of existing processes using the fork method:

// Creating a child process
let child = parent.fork(new_pid).build();

The ProcessBuilder allows customizing the process before creation, such as setting associated data:

let child = parent.fork(new_pid)
    .data(custom_data)
    .build();

Sources: src/process.rs(L262 - L282)  src/process.rs(L285 - L332) 

Process States and Transitions


Sources: src/process.rs(L179 - L186)  src/process.rs(L196 - L236) 

Active State

An active process is fully functioning and can:

  • Create child processes
  • Create or join sessions and process groups
  • Create threads
  • Access and modify its associated data

An active process can be marked as "group exited" using the group_exit() method, which sets an internal flag but doesn't terminate the process.

Sources: src/process.rs(L179 - L186) 

Zombie State

When a process calls exit(), it enters the zombie state:

  1. It is marked as a zombie using an atomic boolean flag
  2. Its children are reassigned to the init process (or nearest subreaper)
  3. Resources are partially released, but the process structure remains in memory
  4. The process remains in its parent's child list

A zombie process retains minimal information needed for the parent to retrieve its exit status.

Sources: src/process.rs(L196 - L225) 

Process Cleanup

The final state transition occurs when a zombie process is freed using the free() method:

  1. The process is removed from its parent's child list
  2. This allows for complete deallocation when all references are dropped

The free() method will panic if called on a non-zombie process.

Sources: src/process.rs(L227 - L236) 

Process Exit Mechanism

sequenceDiagram
    participant ExitingProcess as "Exiting Process"
    participant ParentProcess as "Parent Process"
    participant InitProcess as "Init Process"
    participant ChildProcesses as "Child Processes"

    ExitingProcess ->> ExitingProcess: "is_zombie.store(true, Ordering::Release)"
    Note over ExitingProcess: Process is now a zombie
    ExitingProcess ->> InitProcess: "Get init process"
    ExitingProcess ->> ChildProcesses: "For each child"
    loop Transfer children
        ChildProcesses ->> InitProcess: "Set init as new parent"
        ExitingProcess ->> ExitingProcess: "Remove from children list"
        InitProcess ->> InitProcess: "Add to children list"
    end
    Note over ExitingProcess,ParentProcess: Parent must call free() later
    ParentProcess ->> ExitingProcess: "free()"
    ParentProcess ->> ParentProcess: "Remove child from children list"

Sources: src/process.rs(L207 - L236) 

The exit mechanism includes several key aspects:

  1. Atomic state change: The process uses atomic operations to mark itself as a zombie
  2. Child inheritance: All children are transferred to the init process
  3. Parent notification: The parent is responsible for calling free() to complete cleanup

Note that attempting to exit the init process will cause a panic, as the init process must always exist in the system.

Sources: src/process.rs(L207 - L225)  tests/process.rs(L32 - L35) 
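The exit mechanism above can be modeled with std primitives. This is a hedged sketch: `Proc`, `adopt`, and the explicit `reaper` parameter are inventions for illustration (the crate's `exit()` locates the init process itself):

```rust
use std::collections::BTreeMap;
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::{Arc, Mutex, Weak};

type Pid = u32;

// Simplified stand-in for the crate's Process type.
pub struct Proc {
    pid: Pid,
    is_zombie: AtomicBool,
    parent: Mutex<Weak<Proc>>,
    children: Mutex<BTreeMap<Pid, Arc<Proc>>>,
}

impl Proc {
    pub fn new(pid: Pid) -> Arc<Self> {
        Arc::new(Self {
            pid,
            is_zombie: AtomicBool::new(false),
            parent: Mutex::new(Weak::new()),
            children: Mutex::new(BTreeMap::new()),
        })
    }

    pub fn is_zombie(&self) -> bool {
        self.is_zombie.load(Ordering::Acquire)
    }

    // Attach `child` under `self` (simplified fork bookkeeping).
    pub fn adopt(self: &Arc<Self>, child: &Arc<Proc>) {
        *child.parent.lock().unwrap() = Arc::downgrade(self);
        self.children.lock().unwrap().insert(child.pid, child.clone());
    }

    pub fn has_child(&self, pid: Pid) -> bool {
        self.children.lock().unwrap().contains_key(&pid)
    }

    // Mark self as a zombie and hand all children over to `reaper`.
    pub fn exit(self: &Arc<Self>, reaper: &Arc<Proc>) {
        self.is_zombie.store(true, Ordering::Release);
        let children = std::mem::take(&mut *self.children.lock().unwrap());
        for (pid, child) in children {
            *child.parent.lock().unwrap() = Arc::downgrade(reaper);
            reaper.children.lock().unwrap().insert(pid, child);
        }
    }

    // The parent acknowledges the zombie; panics if called on a live process.
    pub fn free(self: &Arc<Self>) {
        assert!(self.is_zombie(), "only zombies can be freed");
        if let Some(parent) = self.parent.lock().unwrap().upgrade() {
            parent.children.lock().unwrap().remove(&self.pid);
        }
    }
}
```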

Process Exit and Cleanup Code Flow

flowchart TD
A["Process::exit() called"]
B["Is init process?"]
C["Panic"]
D["Get init process as reaper"]
E["Mark self as zombie"]
F["Get all children"]
G["For each child"]
H["Set child's parent to reaper"]
I["Add child to reaper's children"]
J["More children?"]
K["Exit complete (zombie state)"]
L["Process::free() called"]
M["Is process zombie?"]
N["Panic"]
O["Get parent"]
P["Remove self from parent's children"]
Q["Cleanup complete"]

A --> B
B --> C
B --> D
D --> E
E --> F
F --> G
G --> H
H --> I
I --> J
J --> G
J --> K
L --> M
M --> N
M --> O
O --> P
P --> Q

Sources: src/process.rs(L207 - L236) 

Special Considerations

Init Process

The init process has special properties in the lifecycle:

  • It cannot exit (attempting to call exit() on it will panic)
  • It acts as the "reaper" for orphaned processes
  • It is created at system initialization and persists until system shutdown

Zombies and Resource Management

Zombie processes maintain minimal state while waiting for their parent to acknowledge their termination via free(). This approach:

  1. Allows parents to retrieve exit status information
  2. Prevents resource leaks by ensuring proper cleanup
  3. Maintains a clean process hierarchy in the system

Sources: src/process.rs(L196 - L236)  tests/process.rs(L16 - L23)  tests/process.rs(L37 - L44) 

Testing Process Lifecycle

The process lifecycle is validated through several test cases:

| Test | Description |
| --- | --- |
| `exit()` | Verifies that a process can exit and becomes a zombie |
| `free()` | Ensures a zombie process can be freed and removed from parent |
| `free_not_zombie()` | Confirms that freeing a non-zombie process causes a panic |
| `init_proc_exit()` | Verifies that attempting to exit the init process causes a panic |
| `reap()` | Tests that children of an exited process are reassigned to init |
Sources: tests/process.rs(L16 - L55) 

Implementation Details

The process lifecycle is primarily implemented in the Process struct, with key lifecycle methods:

  • ProcessBuilder::build(): Creates and initializes a new process
  • Process::exit(): Terminates the process, making it a zombie
  • Process::free(): Removes the zombie process from its parent's children list
  • Process::is_zombie(): Checks if the process is in the zombie state
  • Process::group_exit(): Marks the process as group exited

Internally, the zombie state is tracked using an atomic boolean, ensuring thread-safe state transitions:

pub fn is_zombie(&self) -> bool {
    self.is_zombie.load(Ordering::Acquire)
}

Sources: src/process.rs(L196 - L236)  src/process.rs(L36 - L47) 

Parent-Child Relationships

Relevant source files

This document details how parent-child relationships between processes are managed in the axprocess crate. It covers the implementation of process hierarchy, relationship establishment, orphan handling, and cleanup mechanisms. For information about the complete process lifecycle, see Process Lifecycle.

Relationship Structure

In the axprocess crate, processes are organized in a hierarchical structure similar to Unix-like operating systems. Each process maintains references to both its parent and its children.

classDiagram
class Process {
    pid: Pid
    parent: SpinNoIrq~Weak~Process~~
    children: SpinNoIrq~BTreeMap~Pid, Arc~Process~~~
    +parent() Option~Arc~Process~~
    +children() Vec~Arc~Process~~
}

Process --> Process : parent (weak)
Process "1" --> "*" Process : children

Process Hierarchy Implementation

A process stores:

  • A weak reference to its parent process (to avoid reference cycles)
  • Strong references to all its child processes (ensuring children don't get dropped prematurely)

This implementation allows for proper resource management while maintaining the process hierarchy.

Sources: src/process.rs(L43 - L44)  src/process.rs(L73 - L80) 

Establishing Parent-Child Relationships

Parent-child relationships are established during process creation. A new process is created using the fork method on an existing process, which returns a ProcessBuilder.

sequenceDiagram
    participant ParentProcess as "Parent Process"
    participant ProcessBuilder as "ProcessBuilder"
    participant ChildProcess as "Child Process"

    ParentProcess ->> ProcessBuilder: "fork(pid)"
    ProcessBuilder ->> ProcessBuilder: "data(...)"
    ProcessBuilder ->> ChildProcess: "build()"
    ChildProcess -->> ParentProcess: "Add to children"
    ChildProcess -->> ChildProcess: "Store parent reference"

When ProcessBuilder::build() is called:

  1. The new process stores a weak reference to its parent
  2. The parent adds the new process to its children collection

Sources: src/process.rs(L275 - L281)  src/process.rs(L301 - L331) 

Code Examples

Here's how the parent-child relationship is established during process creation:

  1. Parent reference in the new process:
parent: SpinNoIrq::new(parent.as_ref().map(Arc::downgrade).unwrap_or_default())
  2. Adding the child to the parent's children:
if let Some(parent) = parent {
    parent.children.lock().insert(pid, process.clone());
}

Sources: src/process.rs(L318 - L328) 

Accessing Relationships

Processes provide methods to access their relationships:

flowchart TD
P["Process"]
PP["parent()"]
C["children()"]
OAP["Option<Arc<Process>>"]
VAC["Vec<Arc<Process>>"]

C --> VAC
P --> C
P --> PP
PP --> OAP

  • parent(): Returns the parent process if it exists
  • children(): Returns a vector of all child processes

Sources: src/process.rs(L73 - L80) 
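A minimal sketch of such accessors, assuming the weak-parent / strong-children layout described above (`Node` and its methods are illustrative, not the crate's API):

```rust
use std::collections::BTreeMap;
use std::sync::{Arc, Mutex, Weak};

type Pid = u32;

pub struct Node {
    parent: Mutex<Weak<Node>>,
    children: Mutex<BTreeMap<Pid, Arc<Node>>>,
}

impl Node {
    pub fn new() -> Arc<Self> {
        Arc::new(Self {
            parent: Mutex::new(Weak::new()),
            children: Mutex::new(BTreeMap::new()),
        })
    }

    // parent(): upgrade the weak pointer; None if the parent is gone.
    pub fn parent(&self) -> Option<Arc<Node>> {
        self.parent.lock().unwrap().upgrade()
    }

    // children(): clone the strong references out of the map.
    pub fn children(&self) -> Vec<Arc<Node>> {
        self.children.lock().unwrap().values().cloned().collect()
    }

    // Establish the relationship in both directions.
    pub fn add_child(self: &Arc<Self>, pid: Pid, child: &Arc<Node>) {
        *child.parent.lock().unwrap() = Arc::downgrade(self);
        self.children.lock().unwrap().insert(pid, child.clone());
    }
}
```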

Orphan Handling

When a process exits, its children become orphans. The axprocess system handles this by reparenting these orphan processes to the init process.

sequenceDiagram
    participant Process as "Process"
    participant ChildProcesses as "Child Processes"
    participant InitProcess as "Init Process"

    Process ->> Process: "exit()"
    Process ->> Process: "is_zombie = true"
    Process ->> InitProcess: "Get init_proc"
    Process ->> ChildProcesses: "For each child"
    loop For each child
        ChildProcesses ->> ChildProcesses: "Set parent to init"
        ChildProcesses ->> InitProcess: "Add to init's children"
    end

When a process calls exit():

  1. It's marked as a zombie
  2. Its children are transferred to the init process:
  • Each child updates its parent reference to point to init
  • The child is added to init's children collection

This ensures no process becomes truly orphaned, maintaining the integrity of the process hierarchy.

Sources: src/process.rs(L207 - L224) 

Process Cleanup

When a zombie process is freed, it's removed from its parent's children collection:

flowchart TD
Z["Zombie Process"]
P["Parent"]
C["children collection"]

P --> C
Z --> P

This is performed by the free() method, which can only be called on zombie processes:

  1. It checks that the process is a zombie
  2. It removes itself from its parent's children collection

Sources: src/process.rs(L230 - L236) 

Special Role of the Init Process

The init process serves as the root of the process hierarchy and has special characteristics:

flowchart TD
subgraph subGraph0["Special Properties"]
    NP["No Parent"]
    CE["Cannot Exit"]
    OR["Orphan Reaper"]
end
I["Init Process"]
C1["Child 1"]
C2["Child 2"]
O["Orphaned Processes"]

I --> C1
I --> C2
I --> CE
I --> NP
I --> O
I --> OR

The init process:

  • Is created without a parent
  • Cannot exit (attempting to call exit() on it will panic)
  • Serves as the adoptive parent for all orphaned processes
  • Is created at system initialization and accessible via the init_proc() function

Sources: src/process.rs(L208 - L210)  src/process.rs(L262 - L272)  src/process.rs(L333 - L341) 

Testing Behavior

The process relationship behavior is verified through tests that demonstrate:

| Test Case | Description |
| --- | --- |
| `child` | Verifies child processes correctly reference their parent |
| `exit` | Checks that exited processes become zombies but remain in their parent's children list |
| `free` | Ensures freed zombie processes are removed from their parent's children list |
| `reap` | Confirms orphaned processes (children of an exited process) are reparented to the init process |

Sources: tests/process.rs(L9 - L55) 

Complete Parent-Child Lifecycle

The following diagram shows the complete lifecycle of parent-child relationships from creation through exit to cleanup:


This diagram illustrates how a process moves through its lifecycle while maintaining appropriate parent-child relationships throughout.

Sources: src/process.rs(L207 - L236)  src/process.rs(L275 - L331) 

Process Groups and Sessions

Relevant source files

Purpose and Scope

This document details the process group and session management subsystem in the axprocess crate. Process groups and sessions are hierarchical abstractions that organize processes into logical collections, similar to Unix-like operating systems. They play a crucial role in managing process relationships and controlling process behavior.

For information about specific process management and parent-child relationships, see Process Management and Parent-Child Relationships. For thread management within processes, see Thread Management.

Hierarchical Organization

Process groups and sessions form a three-level hierarchy in the process management system:

flowchart TD
S["Session"]
PG1["ProcessGroup"]
PG2["ProcessGroup"]
P1["Process"]
P2["Process"]
P3["Process"]
P4["Process"]
T1["Thread"]
T2["Thread"]
T3["Thread"]

P1 --> T1
P1 --> T2
P2 --> T3
PG1 --> P1
PG1 --> P2
PG2 --> P3
PG2 --> P4
S --> PG1
S --> PG2

This hierarchical organization provides:

  • Structured process management
  • Logical grouping of related processes
  • Potential for process control operations at different granularity levels

Sources: src/session.rs(L12 - L17)  src/process_group.rs(L12 - L17) 

Session Implementation

A session is a collection of process groups, represented by the Session struct:

classDiagram
class Session {
    sid: Pid
    process_groups: SpinNoIrq~WeakMap~Pid, Weak~ProcessGroup~~~
    +new(sid: Pid) : Arc~Session~
    +sid() : Pid
    +process_groups() : Vec~Arc~ProcessGroup~~
}

Key characteristics:

  • Each session has a unique Session ID (sid)
  • Sessions contain multiple process groups stored in a thread-safe weak reference map
  • Process groups are referenced by their Process Group ID (pgid)
  • The session implementation uses SpinNoIrq for synchronization and WeakMap for memory management

Sources: src/session.rs(L12 - L45) 

Process Group Implementation

A process group is a collection of processes, represented by the ProcessGroup struct:

classDiagram
class ProcessGroup {
    pgid: Pid
    session: Arc~Session~
    processes: SpinNoIrq~WeakMap~Pid, Weak~Process~~~
    +new(pgid: Pid, session: &Arc~Session~) : Arc~ProcessGroup~
    +pgid() : Pid
    +session() : Arc~Session~
    +processes() : Vec~Arc~Process~~
}

Key characteristics:

  • Each process group has a unique Process Group ID (pgid)
  • Process groups maintain a strong reference to their containing session
  • Processes within a group are stored in a thread-safe weak reference map
  • Processes are referenced by their Process ID (pid)

Sources: src/process_group.rs(L12 - L47) 

Reference Management

The memory management strategy prevents memory leaks while ensuring objects remain alive as needed:

flowchart TD
subgraph subGraph0["Reference Structure"]
    P["Process"]
    PG["ProcessGroup"]
    S["Session"]
end

P --> PG
PG --> P
PG --> S
S --> PG

This approach:

  • Uses strong references (Arc) for upward relationships (Process → Process Group → Session)
  • Uses weak references for downward relationships (Session → Process Group → Process)
  • Prevents reference cycles that could cause memory leaks
  • Ensures objects persist when needed but can be garbage collected when no longer referenced

Sources: src/session.rs(L7 - L16)  src/process_group.rs(L7 - L17) 
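The upward-strong / downward-weak rule can be demonstrated directly with `Arc` and `Weak`; `Sess` and `Group` below are toy stand-ins, not the crate's types:

```rust
use std::sync::{Arc, Weak};

// A group holds its session strongly, while the session only
// sees the group through a Weak handle.
pub struct Sess;

pub struct Group {
    pub session: Arc<Sess>,
}

// Returns whether the session's weak view of the group is live
// before and after the group itself is dropped.
pub fn weak_view_after_group_drop() -> (bool, bool) {
    let session = Arc::new(Sess);
    let group = Arc::new(Group { session: session.clone() });
    let weak_view: Weak<Group> = Arc::downgrade(&group); // what Session stores
    let before = weak_view.upgrade().is_some();
    drop(group); // last strong reference to the group goes away
    let after = weak_view.upgrade().is_some();
    (before, after)
}
```

The session outlives the group here, yet the group is reclaimed as soon as its last strong reference disappears, exactly the cleanup behavior the weak maps rely on.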

Creation and Relationship Management

The creation flow and relationship management between these entities follows a pattern:

sequenceDiagram
    participant Process as Process
    participant ProcessGroup as ProcessGroup
    participant Session as Session

    Note over Session: Session::new(sid)
    Note over ProcessGroup: ProcessGroup::new(pgid, &session)
    Session ->> ProcessGroup: Store weak reference to group
    Note over Process: Process joins a group
    Process ->> ProcessGroup: Store weak reference to process
    Process ->> ProcessGroup: Maintain strong reference to group
    ProcessGroup ->> Session: Maintain strong reference to session

Key operations:

  1. Sessions are created with a unique SID
  2. Process groups are created within a session with a unique PGID
  3. Processes join process groups, establishing the necessary reference relationships
  4. Reference counting manages the lifecycle of these objects

Sources: src/session.rs(L19 - L26)  src/process_group.rs(L19 - L29) 

Memory Safety and Synchronization

The implementation ensures thread safety and proper memory management:

  1. Thread Safety:
  • SpinNoIrq locks protect shared data structures
  • Used for both session's process groups and process group's processes
  2. Memory Management:
  • WeakMap collections store weak references to prevent reference cycles
  • Strong references (Arc) ensure objects persist as long as needed
  • Weak references allow objects to be garbage collected when no longer needed
  3. Collection Methods:
  • Both Session and ProcessGroup provide methods to retrieve contained objects
  • Collection methods create strong references (Arc) from weak references
  • Only live objects are returned from collection methods

Sources: src/session.rs(L29 - L39)  src/process_group.rs(L32 - L47) 

Future Extensions

The session implementation contains a TODO comment about shell job control, suggesting future functional extensions:

// TODO: shell job control

This indicates planned future support for Unix-like shell job control features, which typically include:

  • Foreground/background process management
  • Job suspension and resumption
  • Terminal signal handling for process groups

Sources: src/session.rs(L16) 

Relationship to Unix Process Management

The implementation mirrors Unix-like process management concepts:

| Concept | Unix-like Systems | axprocess Implementation |
| --- | --- | --- |
| Process | Basic execution unit | `Process` struct |
| Process Group | Collection of related processes | `ProcessGroup` struct |
| Session | Collection of process groups | `Session` struct |
| Process Group Leader | First process in a group | Process with PID matching PGID |
| Session Leader | Process that creates a session | Process with PID matching SID |

This familiar design makes the system more intuitive for developers with Unix system programming experience while leveraging Rust's memory safety features.

Process Groups

Relevant source files

Purpose and Scope

This document explains process groups in the axprocess crate, their implementation, and how they fit into the process management hierarchy. Process groups are collections of related processes that enable group-based operations and organization. For information about sessions, which contain process groups, see Sessions. For parent-child process relationships, see Parent-Child Relationships.

Process Group Hierarchy

Process groups form a middle layer in the process management hierarchy of axprocess. They provide a way to organize related processes and enable group-based operations.

flowchart TD
S["Session"]
PG["ProcessGroup"]
P1["Process"]
P2["Process"]
P3["Process"]

PG --> P1
PG --> P2
PG --> P3
S --> PG

Sources: src/process_group.rs(L12 - L17) 

Process Group Implementation

A process group is represented by the ProcessGroup struct, which maintains references to a collection of processes and belongs to a session.

classDiagram
class ProcessGroup {
    pgid: Pid
    session: Arc~Session~
    processes: SpinNoIrq~WeakMap~Pid, Weak~Process~~~
    +new(pgid: Pid, session: &Arc~Session~) Arc~Self~
    +pgid() Pid
    +session() Arc~Session~
    +processes() Vec~Arc~Process~~
}

class Session {
    sid: Pid
    process_groups: WeakMap~Pid, Weak~ProcessGroup~~
}

class Process {
    pid: Pid
    group: Arc~ProcessGroup~
}

ProcessGroup "*" --> "1" Session : belongs to
ProcessGroup "1" --> "*" Process : contains

The key components of a process group are:

| Component | Type | Description |
| --- | --- | --- |
| `pgid` | `Pid` | The unique identifier for the process group |
| `session` | `Arc<Session>` | The session this process group belongs to |
| `processes` | `SpinNoIrq<WeakMap<Pid, Weak<Process>>>` | A map of processes belonging to this group |

Sources: src/process_group.rs(L12 - L17)  src/process_group.rs(L19 - L29) 

Process Group Creation and Management

Creation

Process groups are created within an existing session. Typically, a process creates a new group and becomes the leader of that group.

sequenceDiagram
    participant Process as Process
    participant NewProcessGroup as New ProcessGroup
    participant Session as Session
    participant OldProcessGroup as Old ProcessGroup

    Process ->> NewProcessGroup: create with P.pid() as pgid
    NewProcessGroup ->> Session: register with session
    Process ->> OldProcessGroup: remove from old group
    Process ->> NewProcessGroup: add to new group
    Note over Process,NewProcessGroup: Process becomes group leader

The process group ID (pgid) is typically set to the process ID (pid) of the creating process, making that process the group leader.

Sources: src/process_group.rs(L19 - L29)  tests/group.rs(L22 - L43) 

Process Movement Between Groups

Processes can move between process groups, which involves removing them from their current group and adding them to a new one.

sequenceDiagram
    participant Process as Process
    participant OldProcessGroup as Old ProcessGroup
    participant NewProcessGroup as New ProcessGroup

    Process ->> OldProcessGroup: remove from processes map
    Process ->> NewProcessGroup: add to processes map
    Process ->> Process: update group reference
    Note over OldProcessGroup: If empty, may be cleaned up

When a process moves to a new group, it's removed from its old group's process map and added to the new group's map. If the old group becomes empty (no more processes), it may be cleaned up.

Sources: tests/group.rs(L77 - L113) 
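The move described above amounts to transferring one weak entry between two maps. A sketch, where `GroupMap` and `move_process` are hypothetical helpers rather than crate API:

```rust
use std::collections::BTreeMap;
use std::sync::{Arc, Weak};

type Pid = u32;
pub struct Proc(pub Pid);

// A group's process map stores weak references, as in the crate.
pub type GroupMap = BTreeMap<Pid, Weak<Proc>>;

// Remove `p` from `old`, add it to `new`; returns true if the old
// group is now empty and thus eligible for cleanup.
pub fn move_process(p: &Arc<Proc>, old: &mut GroupMap, new: &mut GroupMap) -> bool {
    old.remove(&p.0);
    new.insert(p.0, Arc::downgrade(p));
    old.is_empty()
}
```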

Memory Management

Process groups use a combination of strong (Arc) and weak (Weak) references to manage memory and prevent reference cycles:

  1. Processes hold strong references (Arc) to their process group
  2. Process groups hold weak references (Weak) to their processes
  3. Sessions hold weak references to process groups
flowchart TD
subgraph subGraph0["Reference Direction"]
    P["Process"]
    PG["ProcessGroup"]
    S["Session"]
end

P --> PG
PG --> P
PG --> S
S --> PG

This reference strategy ensures proper cleanup when processes or groups are no longer needed.

Sources: src/process_group.rs(L15 - L16)  tests/group.rs(L54 - L65) 

Process Group Inheritance

When a process is forked (a new child is created), the child typically inherits the parent's process group. This maintains the group relationship across process creation.

sequenceDiagram
    participant ParentProcess as Parent Process
    participant ChildProcess as Child Process
    participant ProcessGroup as Process Group

    ParentProcess ->> ChildProcess: fork() creates
    ChildProcess ->> ProcessGroup: inherits parent's group
    ProcessGroup ->> ProcessGroup: adds child to processes map
    Note over ParentProcess,ChildProcess: Both in same group

Sources: tests/group.rs(L67 - L75) 

Process Group Cleanup

Process groups are automatically cleaned up when:

  1. All processes in the group have exited and been freed
  2. There are no more strong references to the process group

This automatic cleanup is handled through Rust's reference counting mechanism and the weak reference strategy used in axprocess.

When the last process in a group is removed (either by exiting or moving to another group), the process group becomes eligible for cleanup if there are no other strong references to it.

Sources: tests/group.rs(L54 - L65)  tests/group.rs(L102 - L113) 

API Summary

The ProcessGroup struct provides the following key methods:

| Method | Purpose |
| --- | --- |
| `new(pgid, session)` | Creates a new process group with the given ID in the specified session |
| `pgid()` | Returns the process group ID |
| `session()` | Returns a reference to the session this group belongs to |
| `processes()` | Returns a vector of all processes in this group |

Sources: src/process_group.rs(L19 - L46) 

Use Cases

Process groups serve several important purposes in operating systems:

  1. Job Control: Allow signals to be sent to multiple related processes at once
  2. Organization: Group related processes together (e.g., a shell pipeline)
  3. Termination Control: Enable orderly shutdown of related processes

The implementation in axprocess follows patterns similar to Unix-like systems but with Rust's memory safety guarantees.

Sources: tests/group.rs(L9 - L141) 

Sessions

Relevant source files

Purpose and Scope

This document explains the concept of Sessions in the axprocess codebase. A Session represents a collection of Process Groups and forms the top level of the process hierarchy. Each process belongs to exactly one process group, and each process group belongs to exactly one session.

The session abstraction is inspired by Unix-like operating systems, where sessions are typically used to manage groups of related processes, such as those associated with a terminal login session.

Sources: src/session.rs(L12 - L17) 

Session Structure

A Session in axprocess is a simple structure that maintains a collection of process groups:

classDiagram
note for Session "Each session has a unique session ID (sid)"
class Session {
    sid: Pid
    process_groups: SpinNoIrq~WeakMap~Pid, Weak~ProcessGroup~~~
    +sid() Pid
    +process_groups() Vec~Arc~ProcessGroup~~
    +new(sid: Pid) Arc~Session~
}

class ProcessGroup {
    pgid: Pid
    session: Arc~Session~
    processes: WeakMap~Pid, Weak~Process~~
}

Session "1" o-- "*" ProcessGroup : contains

The Session struct contains:

  • sid: A unique session ID (of type Pid)
  • process_groups: A thread-safe weak map that links process group IDs to weak references of process groups

Sessions use weak references to their process groups to avoid reference cycles, as process groups hold strong references to their sessions.

Sources: src/session.rs(L12 - L17)  src/session.rs(L19 - L26) 

Session Hierarchy

Sessions form the top level of the process management hierarchy in axprocess. Each element in the hierarchy has specific relationships with others:

flowchart TD
S["Session"]
PG1["Process Group 1"]
PG2["Process Group 2"]
PG3["Process Group 3"]
P1["Process 1"]
P2["Process 2"]
P3["Process 3"]
P4["Process 4"]
P5["Process 5"]
P6["Process 6"]
T1["Thread 1.1"]
T2["Thread 1.2"]
T3["Thread 4.1"]

P1 --> T1
P1 --> T2
P4 --> T3
PG1 --> P1
PG1 --> P2
PG2 --> P3
PG3 --> P4
PG3 --> P5
PG3 --> P6
S --> PG1
S --> PG2
S --> PG3

This hierarchical structure allows for logical grouping of related processes and simplifies operations that need to be performed on sets of processes.

Sources: src/session.rs(L12 - L17)  tests/session.rs(L9 - L19) 

Memory Management

The session implementation uses a careful reference counting approach to prevent memory leaks and ensure proper cleanup:

flowchart TD
subgraph subGraph0["Reference Structure"]
    P["Process"]
    PG["Process Group"]
    S["Session"]
    PG_ref["Process Group (Reference)"]
    P_ref["Process (Reference)"]
end
note["Strong references point upwardWeak references point downward"]

P --> PG
PG --> P_ref
PG --> S
S --> PG_ref

Key aspects of memory management for sessions:

  • Processes hold strong references to their process groups
  • Process groups hold strong references to their sessions
  • Sessions hold weak references to their process groups
  • This prevents circular references while ensuring objects stay alive as needed

When all processes in a process group are freed, the process group is dropped, and when all process groups in a session are dropped, the session is freed.

Sources: src/session.rs(L15)  tests/session.rs(L51 - L64) 

Session Creation and Management

Creation

A session is created when a process calls create_session(). The process becomes the leader of both the new session and a new process group:

sequenceDiagram
    participant Process as Process
    participant ProcessGroup as Process Group
    participant OldProcessGroup as Old Process Group
    participant NewSession as New Session
    participant OldSession as Old Session

    Process ->> Process: create_session()
    Note over Process: Process must not be a group leader
    Process ->> NewSession: Session::new(pid)
    NewSession -->> Process: new session
    Process ->> ProcessGroup: ProcessGroup::new(pid, &session)
    ProcessGroup -->> Process: new process group
    Process ->> OldProcessGroup: leave old group
    OldProcessGroup -->> OldSession: update group membership
    Process ->> ProcessGroup: join new group
    Process -->> Process: return (session, group)

The process must not already be a group leader to create a new session. This is enforced at runtime.

Sources: src/session.rs(L19 - L26)  tests/session.rs(L21 - L44)  tests/session.rs(L47 - L49) 

Session Management

Sessions provide methods to access their properties and process groups:

  • sid(): Returns the session ID
  • process_groups(): Returns all process groups that belong to this session

A process cannot move to a process group that belongs to a different session.


Sources: src/session.rs(L29 - L39)  tests/session.rs(L86 - L96) 

Cleanup

When all processes in a session exit and are freed, the session's process groups will be empty, and eventually, the session itself will be cleaned up through Rust's reference counting mechanism:

sequenceDiagram
    participant Process as Process
    participant ProcessGroup as Process Group
    participant Session as Session

    Process ->> Process: exit()
    Process ->> Process: free()
    Note over Process: Process no longer in group
    Note over ProcessGroup: When all processes are gone
    ProcessGroup ->> ProcessGroup: Drop
    Note over ProcessGroup: Process group removes itself from session
    Note over Session: When all process groups are gone
    Session ->> Session: Drop
    Note over Session: Session is freed

Sources: tests/session.rs(L51 - L64)  tests/session.rs(L99 - L108) 

Practical Examples

Basic Session Structure

The initial process (init) automatically creates a session and process group:

let init = init_proc();
let group = init.group();
let session = group.session();

// The group and session IDs match the init process ID
assert_eq!(group.pgid(), init.pid());
assert_eq!(session.sid(), init.pid());

Sources: tests/session.rs(L9 - L19) 

Creating a New Session

A child process can create its own session:

let parent = init_proc();
let child = parent.new_child();
let (child_session, child_group) = child.create_session().unwrap();

// The child becomes the leader of both the new session and group
assert_eq!(child_group.pgid(), child.pid());
assert_eq!(child_session.sid(), child.pid());

Sources: tests/session.rs(L21 - L44) 

Implementation Details

The Session struct is implemented in src/session.rs with these key methods:

  • fn new(sid: Pid) -> Arc<Self>: Creates a new session with the given session ID
  • fn sid(&self) -> Pid: Returns the session ID
  • fn process_groups(&self) -> Vec<Arc<ProcessGroup>>: Returns all process groups in this session

The implementation uses SpinNoIrq locks for thread safety and concurrent access to session data.

Sources: src/session.rs(L19 - L39) 

For more information about how process groups interact with sessions, see Process Groups.

For details on how processes move between groups and the hierarchical relationship between sessions, groups, and processes, see Hierarchy and Movement.

Hierarchy and Movement

Relevant source files

This page explains the hierarchical relationships between sessions, process groups, and processes in the axprocess crate, and details how processes can move between different groups and sessions. For information about parent-child relationships between processes, see Parent-Child Relationships. For details about thread management within processes, see Thread Management.

Hierarchical Structure Overview

The axprocess system implements a three-level hierarchical structure inspired by Unix-like operating systems:

  1. Sessions: The top-level container that groups related process groups
  2. Process Groups: The middle-level container that groups related processes
  3. Processes: The individual execution units that can contain multiple threads

This hierarchy is used for organizing processes and implementing features like job control. Each entity in the hierarchy is identified by a unique process identifier (Pid).

flowchart TD
subgraph Session["Session"]
    S["Session (sid)"]
    PG1["ProcessGroup (pgid1)"]
    PG2["ProcessGroup (pgid2)"]
    P1["Process (pid1)"]
    P2["Process (pid2)"]
    P3["Process (pid3)"]
    P4["Process (pid4)"]
end

PG1 --> P1
PG1 --> P2
PG2 --> P3
PG2 --> P4
S --> PG1
S --> PG2

Sources: src/session.rs(L1 - L45)  src/process_group.rs(L1 - L56)  src/process.rs(L83 - L89) 

Process Group and Session Relationships

In the axprocess crate, the relationships between sessions, process groups, and processes are implemented using strong and weak references:

  • Each process stores a strong reference (Arc) to its process group
  • Each process group stores a strong reference to its session
  • Both process groups and sessions store weak references (WeakMap) to their contained entities to prevent reference cycles
classDiagram
class Session {
    sid: Pid
    process_groups: SpinNoIrq~WeakMap~Pid, Weak~ProcessGroup~~~
    +sid() Pid
    +process_groups() Vec~Arc~ProcessGroup~~
}

class ProcessGroup {
    pgid: Pid
    session: Arc~Session~
    processes: WeakMap~Pid, Weak~Process~~
    +pgid() Pid
    +session() Arc~Session~
    +processes() Vec~Arc~Process~~
}

class Process {
    pid: Pid
    group: SpinNoIrq~Arc~ProcessGroup~~
    +group() Arc~ProcessGroup~
    +set_group() void
    +create_session() Option~(Arc~Session~, Arc~ProcessGroup~)~
    +create_group() Option~Arc~ProcessGroup~~
    +move_to_group() bool
}

Process "*" --> "1" ProcessGroup : belongs to
ProcessGroup "*" --> "1" Session : belongs to

Sources: src/session.rs(L12 - L18)  src/process_group.rs(L12 - L17)  src/process.rs(L83 - L164) 

Creating New Sessions

A process can create a new session and become its session leader by calling the create_session() method. This operation also creates a new process group within the new session, with the process as the process group leader.

sequenceDiagram
    participant Process as Process
    participant NewSession as New Session
    participant NewProcessGroup as New Process Group
    participant OldProcessGroup as Old Process Group

    Process ->> Process: create_session()
    Note over Process: Check if already session leader
    Process ->> NewSession: Session::new(pid)
    Process ->> NewProcessGroup: ProcessGroup::new(pid, &new_session)
    Process ->> OldProcessGroup: Remove process from old group
    Process ->> NewProcessGroup: Add process to new group
    Process ->> Process: Update group reference
    Process -->> Process: Return (new_session, new_group)

Key features of session creation:

  • A process cannot create a new session if it is already a session leader (when process.group().session().sid() == process.pid())
  • When a new session is created, a new process group is also created with the same ID
  • The process is moved from its old process group to the new one

Sources: src/process.rs(L100 - L123)  tests/session.rs(L21 - L44) 

Creating New Process Groups

A process can create a new process group within its current session and become its group leader by calling the create_group() method.

sequenceDiagram
    participant Process as Process
    participant Session as Session
    participant NewProcessGroup as New Process Group
    participant OldProcessGroup as Old Process Group

    Process ->> Process: create_group()
    Note over Process: Check if already group leader
    Process ->> NewProcessGroup: ProcessGroup::new(pid, &current_session)
    Process ->> OldProcessGroup: Remove process from old group
    Process ->> NewProcessGroup: Add process to new group
    Process ->> Process: Update group reference
    Process -->> Process: Return new_group

Key features of process group creation:

  • A process cannot create a new process group if it is already a process group leader (when process.group().pgid() == process.pid())
  • The new process group is created within the process's current session
  • The process is moved from its old process group to the new one

Sources: src/process.rs(L124 - L143)  tests/group.rs(L22 - L43) 

Moving Between Process Groups

A process can move to a different process group within the same session by calling the move_to_group() method.

sequenceDiagram
    participant Process as Process
    participant DestinationProcessGroup as Destination Process Group
    participant CurrentProcessGroup as Current Process Group

    Process ->> Process: move_to_group(destination_group)
    alt Already in the group
        Process -->> Process: Return true (no action needed)
    else Different session
        Process -->> Process: Return false (operation not allowed)
    else Move allowed
        Process ->> CurrentProcessGroup: Remove process from current group
        Process ->> DestinationProcessGroup: Add process to destination group
        Process ->> Process: Update group reference
        Process -->> Process: Return true (move successful)
    end

Key constraints on process movement:

  • A process can only move to a process group within the same session
  • If a process is already in the specified process group, no action is taken
  • The process is removed from its old process group and added to the new one

Sources: src/process.rs(L145 - L163)  tests/group.rs(L77 - L100)  tests/session.rs(L86 - L96) 

Process Creation and Inheritance

When a new process is created using Process::fork() and then ProcessBuilder::build(), it inherits its parent's process group by default. This behavior ensures that related processes stay within the same group unless explicitly moved.

flowchart TD
PP["Parent Process"]
PB["ProcessBuilder"]
CP["Child Process"]
PPG["Parent Process Group"]

PB --> CP
PP --> PB
PPG --> CP

Sources: src/process.rs(L285 - L332) 

Resource Cleanup and Memory Management

The hierarchical structure is designed to ensure proper cleanup of resources:

  1. When a process moves to a different group, its reference to the old group is dropped
  2. If a process was the last one in a group, the group will be automatically cleaned up when all references to it are dropped
  3. Similarly, when a process group is removed from a session, the session will be cleaned up if it was the last group

This approach prevents memory leaks while maintaining the hierarchical relationships.

flowchart TD
subgraph Before["Before"]
    P1["Process"]
    PG1["Process Group"]
    PG2["Process Group"]
    S["Session"]
end
subgraph subGraph1["After Move"]
    P2["Process"]
    PG3["Process Group"]
    PG4["Process Group (empty)"]
    S2["Session"]
    GC["Garbage Collection"]
end

P1 --> PG1
PG1 --> S
PG2 --> S
P2 --> PG3
PG3 --> S2
PG4 --> S2
PG4 --> GC

Sources: tests/group.rs(L54 - L65)  tests/group.rs(L102 - L113)  tests/session.rs(L52 - L64) 

Example: Moving Processes Between Groups

Here's a practical example of how process groups can be manipulated in code:

sequenceDiagram
    participant Parent as Parent
    participant Child1 as Child1
    participant Child2 as Child2
    participant Group1 as Group1
    participant Group2 as Group2

    Parent ->> Child1: new_child()
    Child1 ->> Group1: create_group()
    Parent ->> Child2: new_child()
    Child2 ->> Group2: create_group()
    Child2 ->> Child2: move_to_group(Group1)
    Note over Child2,Group1: Child2 now belongs to Group1
    Note over Group2: Group2 is now empty

This diagram illustrates the flow from tests where:

  1. A parent process creates two child processes
  2. Each child creates its own process group
  3. The second child moves to the first child's group
  4. The second child's original group becomes empty

Sources: tests/group.rs(L77 - L100) 

Constraints and Rules Summary

The following rules govern process movement in the hierarchy:

| Operation | Condition | Result |
| --- | --- | --- |
| Create session | Process is already a session leader | Operation fails, returns None |
| Create session | Process is not a session leader | New session and group created, process moved |
| Create group | Process is already a group leader | Operation fails, returns None |
| Create group | Process is not a group leader | New group created, process moved |
| Move to group | Target group in different session | Operation fails, returns false |
| Move to group | Target group in same session | Process moved, returns true |
| Move to group | Already in target group | No action, returns true |

Sources: src/process.rs(L100 - L163)  tests/group.rs(L44 - L52)  tests/session.rs(L46 - L48) 

Practical Implications

Understanding the hierarchical structure and movement capabilities in axprocess allows for effective process management:

  1. Related processes can be grouped together for collective management
  2. Session boundaries provide isolation between unrelated process groups
  3. Process movement enables dynamic reorganization of processes based on their relationships or roles
  4. The hierarchy forms the foundation for implementing job control and terminal management

By organizing processes into groups and sessions, the system can implement sophisticated process management features commonly found in Unix-like operating systems.

Sources: src/process.rs(L83 - L164)  src/process_group.rs(L1 - L56)  src/session.rs(L1 - L45) 

Thread Management

Relevant source files

This document explains how threads are implemented and managed within the axprocess crate. It covers thread creation, lifecycle, and the relationship between threads and processes. For process-specific features, see Process Management, and for memory management aspects, see Memory Management.

Thread Structure and Components

The thread management system consists of several key components that work together to provide thread functionality:

classDiagram
class Thread {
    tid: Pid
    process: Arc~Process~
    data: Box~dyn Any + Send + Sync~
    +tid() Pid
    +process() &Arc~Process~
    +exit(exit_code: i32) bool
    +data() Option~&T~
}

class ThreadBuilder {
    tid: Pid
    process: Arc~Process~
    data: Box~dyn Any + Send + Sync~
    +new(tid: Pid, process: Arc~Process~) ThreadBuilder
    +data(data: T) ThreadBuilder
    +build() Arc~Thread~
}

class Process {
    pid: Pid
    tg: SpinNoIrq~ThreadGroup~
    +new_thread(tid: Pid) ThreadBuilder
    +threads() Vec~Arc~Thread~~
    +is_group_exited() bool
    +group_exit() void
}

class ThreadGroup {
    threads: WeakMap~Pid, Weak~Thread~~
    exit_code: i32
    group_exited: bool
}

Thread  -->  Process : belongs to
ThreadBuilder  -->  Thread : builds
Process  -->  ThreadGroup : contains
ThreadGroup  -->  Thread : tracks

Sources: src/thread.rs(L6 - L28)  src/thread.rs(L51 - L88)  src/process.rs(L18 - L32)  src/process.rs(L167 - L192) 

Thread Structure

The Thread struct represents an individual thread within a process:

  • It has a unique thread ID (tid) of type Pid
  • It maintains a strong reference to its parent process using Arc<Process>
  • It can store arbitrary data via a type-erased Box<dyn Any + Send + Sync>
  • It provides methods to access its properties and manage its lifecycle

Sources: src/thread.rs(L6 - L28) 

Thread Group

Each process contains a ThreadGroup which manages all threads within that process:

  • The ThreadGroup maintains a collection of weak references to threads using WeakMap<Pid, Weak<Thread>>
  • It tracks the process exit code, which is set when threads exit
  • It has a group_exited flag that can be set to indicate the entire thread group should exit

Sources: src/process.rs(L18 - L32) 

Thread Creation Process

Threads are created through a multi-step process using the builder pattern:

sequenceDiagram
    participant Process as "Process"
    participant ThreadBuilder as "ThreadBuilder"
    participant Thread as "Thread"
    participant ThreadGroup as "ThreadGroup"

    Process ->> ThreadBuilder: new_thread(tid)
    Note over Thread,ThreadBuilder: Configure thread
    ThreadBuilder ->> ThreadBuilder: data(custom_data)
    ThreadBuilder ->> Thread: build()
    Thread ->> ThreadGroup: register thread
    Note over Process,ThreadGroup: Thread is now part of the process's thread group

Sources: src/process.rs(L168 - L171)  src/thread.rs(L58 - L88) 

  1. Thread creation begins by calling Process::new_thread(tid), which returns a ThreadBuilder instance
  2. The builder can be configured with custom data using the data() method
  3. Calling build() on the builder creates the actual thread
  4. During building, the thread is registered in the process's thread group
  5. The builder returns an Arc<Thread> as the final product

This builder pattern allows for optional configuration while ensuring proper registration of the thread with its process.

Sources: src/thread.rs(L51 - L88)  src/process.rs(L168 - L171) 

Thread Lifecycle Management

Threads in axprocess go through several states during their lifetime:


Sources: src/thread.rs(L29 - L39)  src/process.rs(L167 - L192) 

Thread Exit

The thread exit process is a critical part of thread management:

  1. When a thread is ready to terminate, it calls Thread::exit(exit_code)
  2. This method:
  • Updates the thread group's exit code (if group exit hasn't been set)
  • Removes the thread from the process's thread group
  • Returns a boolean indicating if it was the last thread in the group
  3. If the thread was the last one to exit, typically the caller would trigger process termination

Sources: src/thread.rs(L29 - L39) 

Process-Thread Relationship

The relationship between processes and threads is fundamental to the system design:

flowchart TD
subgraph subGraph2["Thread 2"]
    T2["Thread Methods:- tid()- process()- exit()"]
end
subgraph subGraph1["Thread 1"]
    T1["Thread Methods:- tid()- process()- exit()"]
end
subgraph Process["Process"]
    TG["ThreadGroup"]
    P["Process Methods:- new_thread()- threads()- is_group_exited()- group_exit()"]
end

P --> TG
T1 --> P
T2 --> P
TG --> T1
TG --> T2

Sources: src/process.rs(L167 - L192)  src/thread.rs(L6 - L39) 

Process Thread Management Functions

A process provides several methods to manage its threads:

  • new_thread(tid): Creates a new thread with the given thread ID
  • threads(): Returns a list of all threads in the process
  • is_group_exited(): Checks if the thread group has been marked for exit
  • group_exit(): Marks the thread group as exited, signaling all threads to terminate

When a process's group_exit() method is called, its group_exited flag is set to true. This doesn't directly terminate threads, but serves as a signal that they should exit. Individual threads need to check this flag and respond accordingly.

Sources: src/process.rs(L167 - L192) 

Thread Exit and Process Status

When a thread exits, it may affect the process state:

  1. If the exiting thread is the last thread in the process, the process should typically be terminated
  2. The thread's exit code may become the process's exit code (unless group_exited is true)
  3. When all threads exit, resources associated with the thread group can be cleaned up

Sources: src/thread.rs(L29 - L39)  src/process.rs(L167 - L192) 

Data Storage in Threads

Both Thread and Process contain a data field of type Box<dyn Any + Send + Sync>, which allows storing arbitrary data that satisfies the Send and Sync traits:

  • The data<T: Any + Send + Sync>() method on both types allows retrieving this data when its exact type is known
  • The builder patterns for both types allow setting this data during creation
  • This mechanism provides a flexible way to associate custom data with threads and processes

This type-erased data storage enables client code to store task-specific information without modifying the core thread and process implementations.

Sources: src/thread.rs(L24 - L27)  src/thread.rs(L67 - L73) 

Thread Management Best Practices

When working with the thread management system in axprocess, consider these guidelines:

  1. Always check the return value of Thread::exit() to determine if process termination is needed
  2. Use the builder pattern properly by calling methods in a chain and ending with build()
  3. Manage thread references carefully to prevent memory leaks
  4. Be aware of the process lifecycle and how thread termination affects it

The thread management system in axprocess provides a flexible foundation for multithreaded applications while maintaining proper resource management and cleanup.

Thread Creation and Builder

Relevant source files

Purpose and Scope

This document explains the thread creation process and the ThreadBuilder pattern in axprocess. It covers how new threads are instantiated, configured, and registered with their parent processes. For information about the thread lifecycle and exit procedures, see Thread Lifecycle and Exit.

Thread Structure

In axprocess, a thread is represented by the Thread struct, which contains the following key components:

| Field | Type | Description |
| --- | --- | --- |
| tid | Pid | Unique thread identifier |
| process | Arc&lt;Process&gt; | Reference to the process that owns the thread |
| data | Box&lt;dyn Any + Send + Sync&gt; | Custom data associated with the thread |

The Thread struct provides methods to access its properties and manage its lifecycle:

classDiagram
class Thread {
    tid: Pid
    process: Arc~Process~
    data: Box~dyn Any + Send + Sync~
    +tid() Pid
    +process() &Arc~Process~
    +data() Option~&T~
    +exit(exit_code: i32) bool
}

Sources: src/thread.rs(L6 - L40) 

ThreadBuilder Pattern

Thread creation follows the builder pattern through the ThreadBuilder struct. This pattern allows for flexible configuration before final construction.

classDiagram
class ThreadBuilder {
    tid: Pid
    process: Arc~Process~
    data: Box~dyn Any + Send + Sync~
    +new(tid: Pid, process: Arc~Process~) Self
    +data(data: T) Self
    +build() Arc~Thread~
}

class Thread {
}

ThreadBuilder  -->  Thread : builds

The ThreadBuilder provides a clean interface for configuring a thread before its construction:

  1. Instantiate the builder with a thread ID and process reference
  2. Optionally set custom data
  3. Build the thread, which registers it with the process's thread group

Sources: src/thread.rs(L51 - L88) 

Thread Creation Flow

The thread creation process involves several steps from initialization to registration with the thread group:

sequenceDiagram
    participant Caller as "Caller"
    participant ThreadBuilder as "ThreadBuilder"
    participant Thread as "Thread"
    participant Process as "Process"
    participant ThreadGroup as "ThreadGroup"

    Caller ->> ThreadBuilder: new(tid, process)
    Note over ThreadBuilder: Initialize builder with<br>thread ID and process
    opt Configure thread
        Caller ->> ThreadBuilder: data(custom_data)
        Note over ThreadBuilder: Set custom data
    end
    Caller ->> ThreadBuilder: build()
    ThreadBuilder ->> Thread: Create Thread
    Note over Thread: Initialize with<br>tid, process, data
    ThreadBuilder ->> Process: process.tg.lock()
    Process ->> ThreadGroup: Return locked ThreadGroup
    ThreadBuilder ->> ThreadGroup: threads.insert(tid, &thread)
    Note over ThreadGroup: Register thread in the<br>thread group
    ThreadBuilder -->> Caller: Return Arc<Thread>

Sources: src/thread.rs(L59 - L87) 

ThreadBuilder API

The ThreadBuilder API provides the following methods:

  1. new(tid: Pid, process: Arc<Process>) - Creates a new ThreadBuilder with the specified thread ID and process
  2. data<T: Any + Send + Sync>(data: T) - Associates custom data with the thread
  3. build() - Constructs the Thread and registers it with the process's thread group

Thread Construction

When build() is called, the ThreadBuilder performs these steps:

  1. Creates a new Thread instance with the configured parameters
  2. Wraps the thread in an Arc for shared ownership
  3. Registers the thread with the process's thread group
  4. Returns the Arc<Thread> to the caller

Sources: src/thread.rs(L76 - L87) 

Thread Data Management

The data field in both Thread and ThreadBuilder uses Rust's Any trait to allow storing any type that is Send and Sync. This enables the thread to associate arbitrary data types with itself.

flowchart TD
subgraph subGraph1["Data Retrieval"]
    E["Get Typed Data"]
    F["Access Custom Data"]
end
subgraph subGraph0["Thread Creation"]
    B["ThreadBuilder"]
    C["Configure"]
    D["Thread"]
end

B --> C
C --> D
D --> E
E --> F

The data<T>() method on Thread allows retrieving this custom data as a specific type by using the downcast_ref method provided by the Any trait.

Sources: src/thread.rs(L25 - L27)  src/thread.rs(L68 - L73) 

Integration with Process and Thread Group

Threads are managed within a Process through a ThreadGroup:

flowchart TD
P["Process"]
TG["ThreadGroup"]
T1["Thread 1"]
T2["Thread 2"]
TN["Thread N"]
TB["ThreadBuilder"]
TNP["New Thread"]

P --> TB
P --> TG
TB --> TG
TB --> TNP
TG --> T1
TG --> T2
TG --> TN

When a thread is created using ThreadBuilder.build(), it is automatically registered with its process's thread group (stored in the tg field of the Process struct). This registration happens by inserting the thread's ID and a reference to the thread into the thread group's threads collection.

Sources: src/thread.rs(L84) 

Thread Identity and Ownership

Each thread has a unique thread ID (tid), which is a Pid type. The thread maintains a strong reference (Arc) to its parent process, establishing a clear ownership relationship:

flowchart TD
T["Thread"]
P["Process"]
TG["ThreadGroup"]
TR["Thread References"]

P --> T
T --> P
TG --> TR
TR --> T

This reference pattern ensures:

  1. A thread cannot outlive its process
  2. Processes can track their threads without creating reference cycles
  3. Thread cleanup can occur properly when a thread exits

Sources: src/thread.rs(L7 - L11)  src/thread.rs(L19 - L22) 

Thread Lifecycle and Exit

Relevant source files

Purpose and Scope

This document explains the lifecycle of threads within the axprocess system, focusing on thread creation, execution, and termination processes. It details how threads are managed within processes and the impact of thread exit on the overall process lifecycle. For information about thread creation specifically, see Thread Creation and Builder.

Thread Structure and Components

In the axprocess system, a thread represents an execution context within a process. Each thread has its own identity and data but operates within the context of its parent process.

classDiagram
class Thread {
    tid: Pid
    process: Arc~Process~
    data: Box~dyn Any + Send + Sync~
    +tid() Pid
    +process() &Arc~Process~
    +data() Option~&T~
    +exit(exit_code: i32) bool
}

class Process {
    pid: Pid
    tg: SpinNoIrq~ThreadGroup~
    +threads() Vec~Arc~Thread~~
    +new_thread(tid: Pid) ThreadBuilder
}

class ThreadGroup {
    threads: WeakMap~Pid, Weak~Thread~~
    exit_code: i32
    group_exited: bool
}

Thread  -->  Process : belongs to
Process  -->  ThreadGroup : contains
ThreadGroup  -->  Thread : tracks

Thread-to-Process Relationship Diagram

Sources: src/thread.rs(L7 - L27)  src/process.rs(L18 - L31)  src/process.rs(L167 - L191) 

Key Components

  1. Thread: A single execution unit with its own thread ID (tid), a reference to its parent process, and associated data.
  2. ThreadGroup: Manages all threads within a process, tracking:
  • Active threads
  • Exit code
  • Group exit status
  3. Process: Contains the thread group and provides methods for thread management.

Sources: src/thread.rs(L7 - L27)  src/process.rs(L18 - L31)  src/process.rs(L34 - L47) 

Thread Lifecycle States

Threads in axprocess move through several distinct states throughout their existence:


Thread Lifecycle States Diagram

Sources: src/thread.rs(L76 - L87)  src/thread.rs(L29 - L39) 

State Transitions

  1. Creation: A thread is created using ThreadBuilder::build(), which:
  • Creates a new Thread object with the specified parameters
  • Adds the thread to the process's thread group
  • Returns an Arc<Thread> for subsequent operations
  2. Running: After creation, a thread is considered to be in the running state (though actual scheduling is handled outside axprocess)
  3. Exit: When Thread::exit(exit_code) is called:
  • The thread is removed from the thread group
  • If this was not a group exit, the exit code is stored
  • The method returns a boolean indicating if this was the last thread in the group

Sources: src/thread.rs(L76 - L87)  src/thread.rs(L29 - L39) 

Thread Exit Process

When a thread exits, a specific sequence of operations occurs to handle cleanup and potential process termination:

sequenceDiagram
    participant Thread as "Thread"
    participant ThreadGroup as "ThreadGroup"
    participant Process as "Process"

    Thread ->> ThreadGroup: exit(exit_code)
    ThreadGroup ->> ThreadGroup: Lock thread group
    alt group_exited is false
        ThreadGroup ->> ThreadGroup: Set exit_code
    end
    ThreadGroup ->> ThreadGroup: Remove thread from threads map
    ThreadGroup -->> Thread: Return whether threads map is empty
    alt Last thread exited
        Thread ->> Process: May trigger process exit
    end

Thread Exit Process Sequence Diagram

Sources: src/thread.rs(L29 - L39)  src/process.rs(L195 - L225) 

Exit Process Details

  1. Acquire Lock: The thread acquires a lock on the process's thread group.
  2. Update Exit Code: If the thread group hasn't already been marked as exited (through group_exit()), the exit code is updated with the provided value.
  3. Remove Thread: The thread is removed from the thread group's thread map.
  4. Check Last Thread: The method returns true if this thread was the last one in the group, which may trigger further actions:
pub fn exit(&self, exit_code: i32) -> bool {
    let mut tg = self.process.tg.lock();
    if !tg.group_exited {
        tg.exit_code = exit_code;
    }
    tg.threads.remove(&self.tid);
    tg.threads.is_empty()
}
  5. Process Termination: If the last thread exits, the caller is responsible for handling process termination if needed.

Sources: src/thread.rs(L29 - L39) 

Group Exit Mechanism

The thread group can be marked for group exit, which affects how individual thread exits are handled:

flowchart TD
A["Process::group_exit()"]
B["Set group_exited = true"]
C["Thread::exit(exit_code)"]
D["Check group_exited"]
E["group_exited?"]
F["Keep existing exit_code"]
G["Set exit_code = new value"]
H["Remove thread from group"]
I["Last thread?"]
J["Return true"]
K["Return false"]

A --> B
C --> D
D --> E
E --> F
E --> G
F --> H
G --> H
H --> I
I --> J
I --> K

Group Exit Mechanism Diagram

Sources: src/process.rs(L179 - L186)  src/thread.rs(L29 - L39) 

Group Exit Details

  1. Initiation: A process can be marked for group exit by calling Process::group_exit():
pub fn group_exit(&self) {
    self.tg.lock().group_exited = true;
}
  2. Effect on Threads: When threads exit after group exit is set:
  • The exit code from individual threads is ignored
  • The previously set exit code (before group exit) is preserved
  3. Exit Status Preservation: This mechanism allows the exit status to be fixed at a specific value regardless of how individual threads exit.

Sources: src/process.rs(L179 - L186)  src/thread.rs(L32 - L34) 

Impact on Process Lifecycle

Thread exits play a critical role in the process lifecycle:

flowchart TD
A["Thread::exit(exit_code)"]
B["Last thread?"]
C["May trigger Process::exit()"]
D["Mark process as zombie"]
E["Reparent children to init process"]
F["Process becomes zombie"]
G["Process continues running"]
H["Process::group_exit()"]
I["All threads exit with same status"]
J["Eventually leads to Process::exit()"]

A --> B
B --> C
B --> G
C --> D
D --> E
E --> F
H --> I
I --> J

Thread Exit Impact on Process Diagram

Sources: src/thread.rs(L29 - L39)  src/process.rs(L195 - L225) 

Key Considerations

  1. Last Thread Exit: When the last thread exits, the process itself may need to exit, which is typically handled by the scheduler or executor.
  2. Zombie Process: When a process exits, it becomes a zombie until its parent collects its exit status and frees it:
pub fn exit(self: &Arc<Self>) {
    // Check not init process
    // Mark as zombie
    self.is_zombie.store(true, Ordering::Release);
    // Reparent children to init process
    // Additional cleanup
}
  3. Resource Cleanup:
  • Thread resources are cleaned up when the thread is removed from the thread group
  • Process resources are only fully cleaned up when the zombie process is freed

Sources: src/process.rs(L195 - L236) 

Memory Management and Reference Counting

The axprocess system employs careful memory management to ensure proper resource cleanup:

flowchart TD
subgraph subGraph1["Weak References"]
    ThreadGroup["ThreadGroup"]
end
subgraph subGraph0["Strong References"]
    Thread["Thread"]
    Process["Process"]
end

Process --> ThreadGroup
Thread --> Process
ThreadGroup --> Thread

Reference Relationship Diagram

Sources: src/thread.rs(L7 - L11)  src/process.rs(L18 - L22)  src/thread.rs(L76 - L87) 

Key Memory Management Patterns

  1. Thread to Process: Threads maintain strong references (Arc) to their parent processes to ensure the process remains alive as long as any thread is running.
  2. ThreadGroup to Thread: The thread group uses weak references to threads, allowing threads to be dropped when they exit.
  3. Creation: When a thread is created, it's added to the process's thread group using a weak reference.
  4. Cleanup: When a thread exits, it's removed from the thread group, allowing its memory to be reclaimed if there are no other references.
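
The effect of pattern 2 can be demonstrated with plain standard-library Arc/Weak (a generic sketch; the crate uses a WeakMap for this, but the upgrade semantics are the same):

```rust
use std::sync::{Arc, Weak};

fn main() {
    // "Thread" side: the strong owner of the value.
    let thread = Arc::new(String::from("worker"));

    // "Thread group" side: holds only a weak handle.
    let in_group: Weak<String> = Arc::downgrade(&thread);
    assert!(in_group.upgrade().is_some()); // thread still running

    // The thread exits: the last strong reference is dropped.
    drop(thread);

    // The group's entry is now dead and can be pruned on the next access.
    assert!(in_group.upgrade().is_none());
}
```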

Sources: src/thread.rs(L76 - L87)  src/thread.rs(L29 - L39)  src/process.rs(L18 - L22) 

Thread-Process Interaction Summary

The lifecycle of threads is tightly coupled with the lifecycle of their parent process:

| Thread Action | Process Effect |
| --- | --- |
| Thread creation | Added to process's thread group |
| Normal thread exit | Removed from thread group; exit code recorded unless group exit was set |
| Last thread exit | May trigger process termination |
| Process group exit | All subsequent thread exits preserve the existing exit code |
| Process exit | Resources partially released, process becomes a zombie |
| Process free | All resources fully released |

Sources: src/thread.rs(L29 - L39)  src/process.rs(L167 - L191)  src/process.rs(L195 - L236) 

Conclusion

Understanding the thread lifecycle and exit process is crucial for effective process management in the axprocess system. Threads are the execution units of processes, and their creation and termination directly impact the process lifecycle. The system provides mechanisms for individual thread exit as well as coordinated group exit, with careful resource management through Rust's ownership model.

Memory Management

Relevant source files

This document explains how memory is managed in the axprocess crate, focusing on reference counting patterns, hierarchical object management, and cleanup mechanisms. For information about zombie processes and cleanup specifically, see Zombie Processes and Cleanup. For details about reference counting and ownership patterns, see Reference Counting and Ownership.

Overview of Memory Management Strategy

The axprocess crate implements a hierarchical process management system that uses Rust's ownership model and reference counting patterns to ensure memory safety while maintaining proper object relationships. The system employs both strong references (Arc) and weak references (Weak) strategically to prevent memory leaks and reference cycles.

flowchart TD
subgraph subGraph0["Memory Management Strategy"]
    A["Process Management Objects"]
    B["Strong Reference (Arc)"]
    C["Weak Reference (Weak)"]
    D["Upward References (Child→Parent Type)"]
    E["Downward References (Parent→Child Type)"]
    F["Circular References"]
end

A --> B
A --> C
B --> D
C --> E
C --> F

Sources: src/process.rs(L1 - L10)  src/process_group.rs(L1 - L9)  src/session.rs(L1 - L9) 

Reference Hierarchy and Ownership Model

The axprocess crate implements a hierarchical memory management model with carefully designed ownership relationships between different components:


Sources: src/process.rs(L35 - L47)  src/process_group.rs(L12 - L17)  src/session.rs(L12 - L17)  src/thread.rs(L7 - L11) 

Strong vs Weak References

The system carefully balances the use of strong references (Arc) and weak references (Weak) to maintain object relationships while preventing memory leaks:

| Component | Field | Reference Type | Purpose |
| --- | --- | --- | --- |
| Process | children | Strong | Keep child processes alive while parent exists |
| Process | parent | Weak | Prevent reference cycles between parent and child |
| Process | group | Strong | Keep process group alive while process exists |
| ProcessGroup | session | Strong | Keep session alive while process group exists |
| ProcessGroup | processes | Weak | Allow processes to be cleaned up independently |
| Session | process_groups | Weak | Allow process groups to be cleaned up independently |
| Thread | process | Strong | Keep process alive while thread exists |
| Process | tg.threads | Weak | Allow threads to be cleaned up independently |

Sources: src/process.rs(L35 - L47)  src/process_group.rs(L12 - L17)  src/session.rs(L12 - L17)  src/thread.rs(L7 - L11) 

Core Data Structures

The memory management system relies on specialized data structures for managing references:

classDiagram
class StrongMap {
    
    +insert(key, value)
    +remove(key)
    +values() Vec~Arc~T~~
}

class WeakMap {
    
    +insert(key, &Arc~T~)
    +remove(key)
    +values() Vec~Arc~T~~
    +upgrade(key) Option~Arc~T~~
}

class SpinNoIrq {
    
    +lock() MutexGuard
    +new(T) SpinNoIrq~T~
}

class Process {
    
    
}

class ProcessGroup {
    
    
}

class Session {
    
    
}

Process  -->  StrongMap : "children"
Process  -->  WeakMap : "tg.threads"
ProcessGroup  -->  WeakMap : "processes"
Session  -->  WeakMap : "process_groups"
Process  -->  SpinNoIrq : "contains (thread-safe)"

Sources: src/process.rs(L14)  src/process.rs(L18 - L22)  src/process_group.rs(L16)  src/session.rs(L15) 
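
A simplified model of how such a weak-valued map behaves is sketched below. The `WeakValueMap` name and shape here are hypothetical; the crate's actual WeakMap comes from the weak-map dependency, but the core idea — storing `Weak<V>` and skipping dead entries on access — is the same:

```rust
use std::collections::BTreeMap;
use std::sync::{Arc, Weak};

// Hypothetical weak-valued map (sketch, not the crate's WeakMap).
struct WeakValueMap<K: Ord, V> {
    inner: BTreeMap<K, Weak<V>>,
}

impl<K: Ord, V> WeakValueMap<K, V> {
    fn new() -> Self {
        WeakValueMap { inner: BTreeMap::new() }
    }

    fn insert(&mut self, key: K, value: &Arc<V>) {
        self.inner.insert(key, Arc::downgrade(value));
    }

    // Dead entries (value dropped elsewhere) are silently skipped.
    fn values(&self) -> Vec<Arc<V>> {
        self.inner.values().filter_map(Weak::upgrade).collect()
    }
}

fn main() {
    let mut map = WeakValueMap::new();
    let a = Arc::new(String::from("a"));
    let b = Arc::new(String::from("b"));
    map.insert(1u32, &a);
    map.insert(2u32, &b);

    drop(b); // entry 2 dies with its value
    let alive = map.values();
    assert_eq!(alive.len(), 1);
    assert_eq!(*alive[0], "a");
}
```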

Process Creation and Memory Allocation

The ProcessBuilder pattern manages memory allocation during process creation, ensuring proper initialization of reference relationships:

sequenceDiagram
    participant Client as Client
    participant ProcessBuilder as "ProcessBuilder"
    participant Process as "Process"
    participant ProcessGroup as "ProcessGroup"
    participant Session as "Session"
    participant INIT_PROC as INIT_PROC

    Client ->> ProcessBuilder: fork(pid) or new_init(pid)
    Client ->> ProcessBuilder: data(custom_data)
    Client ->> ProcessBuilder: build()
    ProcessBuilder ->> Process: create Process object
    alt Init Process
        Process ->> Session: new(pid)
        Session ->> ProcessGroup: new(pid, session)
    else Child Process
        Process -->> Process: inherit parent's group
    end
    Process ->> ProcessGroup: add self (weak ref)
    alt Init Process
        Process ->> INIT_PROC: initialize lazy static
    else Child Process
        Process ->> Process: add to parent's children (strong ref)
    end
    ProcessBuilder -->> Client: return Arc<Process>

Sources: src/process.rs(L260 - L341)  src/process_group.rs(L19 - L29)  src/session.rs(L19 - L27) 

Zombie Process Management

When a process exits, it becomes a zombie, and its memory management changes:


During the zombie state:

  1. Process marks itself as a zombie using atomic boolean
  2. Child processes are reparented to the init process
  3. Process resources are partially released
  4. The parent must call free() to complete cleanup

Sources: src/process.rs(L195 - L237)  src/thread.rs(L29 - L40) 

Parent-Child Memory Management

The parent-child relationship memory management is particularly important:


The parent keeps strong references to children in a StrongMap, while children have weak references to their parent. This prevents reference cycles while maintaining the parent-child relationship.

Sources: src/process.rs(L70 - L81)  src/process.rs(L195 - L237) 

Thread Memory Management

Threads are managed within a process using a thread group:


Threads maintain strong references to their parent process, ensuring the process stays alive as long as any thread is running. The process maintains weak references to its threads, preventing reference cycles.

Sources: src/process.rs(L18 - L31)  src/thread.rs(L7 - L40) 

Session and Process Group Memory Management

Sessions and process groups form the higher levels of the hierarchy:


Process groups maintain strong references to their session, while processes maintain strong references to their process group. This upward ownership pattern ensures that higher-level objects remain alive as long as any lower-level object needs them.

Sources: src/process.rs(L83 - L164)  src/process_group.rs(L12 - L47)  src/session.rs(L12 - L45) 

Memory Safety Mechanisms

The axprocess crate employs several mechanisms to ensure memory safety:

  1. Thread-safe access: Using SpinNoIrq locks for shared mutable state
  2. Atomic operations: Using AtomicBool for zombie state tracking
  3. Builder pattern: Ensuring proper initialization with ProcessBuilder and ThreadBuilder
  4. Reference counting: Using Arc and Weak for managing object lifetimes
  5. Explicit cleanup: Using exit() and free() methods for proper resource cleanup

Sources: src/process.rs(L35 - L47)  src/process.rs(L195 - L237)  src/thread.rs(L29 - L40) 

Summary

The memory management system in axprocess creates a hierarchical model where:

  1. Objects lower in the hierarchy (threads, processes) hold strong references to objects higher up (process groups, sessions)
  2. Objects higher in the hierarchy hold weak references to objects lower down
  3. Special cases like parent-child process relationships use weak references for parents to avoid reference cycles
  4. Thread-safe access is ensured through spinlocks and atomic operations
  5. Zombie state management prevents premature cleanup while allowing proper resource release

This design ensures memory safety while maintaining the flexibility required for process management in an operating system.

Sources: src/process.rs src/process_group.rs src/session.rs src/thread.rs

Reference Counting and Ownership

Relevant source files

This document explains how the axprocess crate implements memory management through Rust's reference counting mechanisms. It details the ownership patterns between system components (Sessions, ProcessGroups, Processes, and Threads) and how they prevent memory leaks while maintaining proper object lifetimes.

For information about cleanup of terminated processes, see Zombie Processes and Cleanup.

Reference Counting Fundamentals

The axprocess crate relies on Rust's smart pointers to manage memory and object lifetimes:

  • Arc (Atomic Reference Counting): Provides shared ownership of a value with thread-safe reference counting
  • Weak: A non-owning reference that doesn't prevent deallocation when all Arc references are dropped

This approach avoids both manual memory management and garbage collection, guaranteeing memory safety while maintaining predictable resource cleanup.
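
These mechanics can be observed directly with the standard library, independent of axprocess:

```rust
use std::sync::{Arc, Weak};

fn main() {
    let owner = Arc::new(5);
    let observer: Weak<i32> = Arc::downgrade(&owner);

    // One strong owner, one weak observer.
    assert_eq!(Arc::strong_count(&owner), 1);
    assert_eq!(Arc::weak_count(&owner), 1);

    // A second strong owner keeps the value alive.
    let owner2 = Arc::clone(&owner);
    assert_eq!(Arc::strong_count(&owner), 2);
    drop(owner);
    assert_eq!(*observer.upgrade().unwrap(), 5); // still alive via owner2

    // Last strong owner gone: the value is deallocated and upgrade fails.
    drop(owner2);
    assert!(observer.upgrade().is_none());
}
```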

flowchart TD
A["Arc"]
O["Object"]
W["Weak"]
RC["Reference Count"]
D["Deallocate"]

A --> O
A --> RC
RC --> D
W --> O
W --> RC

Diagram: Reference Counting Basics

Sources: src/process.rs(L1 - L4) 

Ownership Hierarchy in axprocess

The axprocess system employs a careful hierarchy of strong and weak references to maintain proper component ownership while preventing reference cycles.

classDiagram
class Process {
    pid: Pid
    children: StrongMap<Pid, Arc<Process>>
    parent: Weak<Process>
    group: Arc<ProcessGroup>
}

class ProcessGroup {
    pgid: Pid
    session: Arc<Session>
    processes: WeakMap<Pid, Weak<Process>>
}

class Session {
    sid: Pid
    process_groups: WeakMap<Pid, Weak<ProcessGroup>>
}

class Thread {
    tid: Pid
    process: Arc<Process>
}

Process  -->  Process : "strong references to children"
Process  -->  Process : "weak reference to parent"
Process  -->  ProcessGroup : "strong reference"
ProcessGroup  -->  Session : "strong reference"
ProcessGroup  ..>  Process : "weak references"
Session  ..>  ProcessGroup : "weak references"
Thread  -->  Process : "strong reference"

Diagram: Reference Relationships Between Components

Sources: src/process.rs(L35 - L47)  src/process_group.rs(L13 - L17)  src/session.rs(L13 - L17) 

Reference Direction Strategy

The system uses a deliberate pattern for determining which direction uses strong vs. weak references:

Upward Strong References

Components hold strong references (Arc) to their "container" components:

  • Processes strongly reference their ProcessGroup
  • ProcessGroups strongly reference their Session

This ensures container components remain alive as long as any child component needs them.

Downward Weak References

Container components hold weak references to their "members":

  • Sessions weakly reference their ProcessGroups
  • ProcessGroups weakly reference their Processes
  • ThreadGroups weakly reference their Threads

This prevents reference cycles while allowing containers to access their members.

Hierarchical Strong References

Processes hold strong references to their children, ensuring child processes remain valid while the parent exists. This reflects the parent-child ownership model where parents are responsible for their children's lifecycle.
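
A minimal sketch shows why the weak back-edge matters: with a strong parent-to-child edge and a weak child-to-parent edge, dropping the parent frees the whole subtree. The `Node` type here is hypothetical, not the crate's Process:

```rust
use std::sync::{Arc, Mutex, Weak};

// Hypothetical parent/child model (sketch, not the crate's Process type).
struct Node {
    parent: Mutex<Weak<Node>>,
    children: Mutex<Vec<Arc<Node>>>,
}

impl Node {
    fn new() -> Arc<Node> {
        Arc::new(Node {
            parent: Mutex::new(Weak::new()),
            children: Mutex::new(Vec::new()),
        })
    }
}

fn main() {
    let parent = Node::new();
    let child = Node::new();

    // Parent -> child: strong; child -> parent: weak.
    *child.parent.lock().unwrap() = Arc::downgrade(&parent);
    parent.children.lock().unwrap().push(Arc::clone(&child));

    // The weak back-edge does not contribute to the strong count.
    assert_eq!(Arc::strong_count(&parent), 1);
    assert_eq!(Arc::strong_count(&child), 2); // local var + parent's list

    drop(parent); // the children list is dropped with the parent
    assert_eq!(Arc::strong_count(&child), 1);
    assert!(child.parent.lock().unwrap().upgrade().is_none());
}
```

Had the child held a strong reference to its parent instead, the pair would form a cycle and neither node would ever be deallocated.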

flowchart TD
subgraph subGraph0["Upward Strong References"]
    P["Process"]
    PG["ProcessGroup"]
    S["Session"]
end
subgraph subGraph1["Downward Weak References"]
    S2["Session"]
    PG2["ProcessGroup"]
    P2["Process"]
end
subgraph subGraph2["Hierarchical Strong References"]
    Parent["Parent Process"]
    Child["Child Process"]
end

Child --> Parent
P --> PG
PG --> S
PG2 --> P2
Parent --> Child
S2 --> PG2

Diagram: Reference Direction Strategy

Sources: src/process.rs(L43 - L46)  src/process_group.rs(L14 - L16)  src/session.rs(L14 - L15) 

Implementation Details

Process Ownership

The Process struct maintains:

  • Strong references to children in a StrongMap
  • Weak reference to its parent
  • Strong reference to its ProcessGroup
Process {
    children: SpinNoIrq<StrongMap<Pid, Arc<Process>>>,
    parent: SpinNoIrq<Weak<Process>>,
    group: SpinNoIrq<Arc<ProcessGroup>>,
}

Sources: src/process.rs(L43 - L46) 

ProcessGroup Ownership

The ProcessGroup struct maintains:

  • Strong reference to its Session
  • Weak references to its member Processes
ProcessGroup {
    session: Arc<Session>,
    processes: SpinNoIrq<WeakMap<Pid, Weak<Process>>>,
}

Sources: src/process_group.rs(L14 - L16) 

Session Ownership

The Session struct maintains:

  • Weak references to its member ProcessGroups
Session {
    process_groups: SpinNoIrq<WeakMap<Pid, Weak<ProcessGroup>>>,
}

Sources: src/session.rs(L14 - L15) 

Reference Management During Object Creation

When creating objects, the system carefully establishes the appropriate references:

  1. When a Process is created:
  • It acquires a strong reference to its ProcessGroup
  • The ProcessGroup stores a weak reference back to the Process
  • If it has a parent, the parent stores a strong reference to it
  • It stores a weak reference to its parent
  2. When a ProcessGroup is created:
  • It acquires a strong reference to its Session
  • The Session stores a weak reference back to the ProcessGroup
sequenceDiagram
    participant ProcessBuilder as ProcessBuilder
    participant Process as Process
    participant ProcessGroup as ProcessGroup
    participant Session as Session

    Note over ProcessBuilder: ProcessBuilder::build()
    alt No parent (init process)
        ProcessBuilder ->> Session: Session::new(pid)
        ProcessBuilder ->> ProcessGroup: ProcessGroup::new(pid, &session)
    else Has parent
        ProcessBuilder ->> ProcessGroup: parent.group()
    end
    ProcessBuilder ->> Process: Create new Process
    Process ->> ProcessGroup: group.processes.insert(pid, weak_ref)
    alt Has parent
        Process ->> Process: parent.children.insert(pid, strong_ref)
    else No parent (init process)
        Process ->> Process: INIT_PROC.init_once(process)
    end

Diagram: Reference Setup During Process Creation

Sources: src/process.rs(L302 - L331)  src/process_group.rs(L21 - L28)  src/session.rs(L20 - L26) 

Reference Management During Process Termination

When a Process exits:

  1. It is marked as a zombie
  2. Its children are re-parented to the init process
  3. The children update their weak parent reference to point to the init process
  4. The init process takes strong ownership of the children
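
The reparenting steps above can be sketched with a minimal model. The `Proc` type and `reparent_to_init` helper are hypothetical illustrations, not the crate's actual code:

```rust
use std::collections::BTreeMap;
use std::sync::{Arc, Mutex, Weak};

// Hypothetical process model (sketch, not the crate's Process).
struct Proc {
    pid: u32,
    parent: Mutex<Weak<Proc>>,
    children: Mutex<BTreeMap<u32, Arc<Proc>>>,
}

impl Proc {
    fn new(pid: u32) -> Arc<Proc> {
        Arc::new(Proc {
            pid,
            parent: Mutex::new(Weak::new()),
            children: Mutex::new(BTreeMap::new()),
        })
    }
}

// On exit, move all children to `init`, updating their parent back-edges.
fn reparent_to_init(exiting: &Arc<Proc>, init: &Arc<Proc>) {
    let children = std::mem::take(&mut *exiting.children.lock().unwrap());
    let mut init_children = init.children.lock().unwrap();
    for (pid, child) in children {
        *child.parent.lock().unwrap() = Arc::downgrade(init);
        init_children.insert(pid, child);
    }
}

fn main() {
    let init = Proc::new(1);
    let parent = Proc::new(10);
    let child = Proc::new(11);
    *child.parent.lock().unwrap() = Arc::downgrade(&parent);
    parent.children.lock().unwrap().insert(child.pid, Arc::clone(&child));

    reparent_to_init(&parent, &init);

    // The orphan now lives in init's children map and points back at init.
    assert!(init.children.lock().unwrap().contains_key(&11));
    let new_parent = child.parent.lock().unwrap().upgrade().unwrap();
    assert_eq!(new_parent.pid, 1);
}
```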
sequenceDiagram
    participant Process as Process
    participant InitProcess as Init Process
    participant ChildProcesses as Child Processes

    Process ->> Process: is_zombie.store(true)
    Process ->> InitProcess: Get init_proc()
    Process ->> Process: Take children
    loop For each child
        Process ->> ChildProcesses: Update weak parent reference to init
        Process ->> InitProcess: Add child to init's children
    end

Diagram: Reference Management During Process Exit

Sources: src/process.rs(L207 - L225) 

Memory Safety Considerations

The reference counting design in axprocess provides several safety guarantees:

| Safety Feature | Implementation | Benefit |
| --- | --- | --- |
| No reference cycles | Strategic use of weak references | Prevents memory leaks |
| Component lifetime guarantees | Upward strong references | Components can't be deallocated while in use |
| Clean resource release | Weak references in containers | Enables efficient cleanup without dangling pointers |
| Automatic cleanup | Arc drop semantics | Resources are freed when no longer needed |
| Thread safety | Arc's atomic reference counting | Safe to use across threads |

Sources: src/process.rs(L35 - L47)  src/process_group.rs(L13 - L17)  src/session.rs(L13 - L17) 

Practical Example: Process Lifecycle References

Let's trace the reference management during a process's lifecycle:

  1. Process creation:
  • Parent process creates a child using fork() and ProcessBuilder::build()
  • Child gets a strong reference to parent's process group
  • Parent stores a strong reference to child
  • Child stores a weak reference to parent
  2. Process execution:
  • Process maintains its references throughout execution
  3. Process termination:
  • Process calls exit() and is marked as zombie
  • Child processes are re-parented to init
  • Parent process eventually calls free() to remove its strong reference
  • When all strong references are gone, process is deallocated

Sources: src/process.rs(L207 - L236)  src/process.rs(L275 - L331) 

Utility Functions for Reference Management

The codebase provides several methods to manage references between components:

| Method | Purpose | Reference Type |
| --- | --- | --- |
| Process::parent() | Get parent process | Weak → Strong conversion |
| Process::children() | Get child processes | Strong references |
| Process::group() | Get process group | Strong reference |
| ProcessGroup::session() | Get session | Strong reference |
| ProcessGroup::processes() | Get member processes | Weak → Strong conversion |
| Session::process_groups() | Get process groups | Weak → Strong conversion |

Sources: src/process.rs(L73 - L80)  src/process.rs(L86 - L88)  src/process_group.rs(L33 - L46)  src/session.rs(L30 - L38) 

Conclusion

The reference counting and ownership model in axprocess provides a robust foundation for memory management by:

  1. Using strong references strategically to ensure components remain alive as needed
  2. Using weak references to prevent reference cycles
  3. Following a consistent pattern of upward strong references and downward weak references
  4. Maintaining proper parent-child relationships through appropriate reference types

This approach leverages Rust's ownership model to provide memory safety without garbage collection, ensuring efficient and predictable resource management.

Zombie Processes and Cleanup

Relevant source files

This document explains how axprocess manages terminated processes (zombies) and their eventual cleanup. It covers the zombie state, resource management, process inheritance, and the cleanup mechanisms that ensure proper resource deallocation. For related information about the overall process lifecycle, see Process Lifecycle.

Zombie Process Concept

In axprocess, a zombie process is a process that has terminated execution but still exists in the system's process table. When a process exits, it doesn't immediately disappear - it enters a zombie state where some minimal information is retained until its parent process acknowledges the termination.


Sources: src/process.rs(L196 - L236) 

Zombie State Implementation

When a process terminates, it's marked as a zombie through the Process::exit() method, which sets the is_zombie atomic flag to true. In this state:

  1. The process is no longer executing but still exists in the process table
  2. Resources are partially released
  3. Exit status is preserved for the parent process to retrieve
  4. Child processes are reassigned to the init process

The zombie state allows the parent process to retrieve exit information from its children before they're completely deallocated.

classDiagram
class Process {
    pid: Pid
    is_zombie: AtomicBool
    tg: SpinNoIrq<ThreadGroup>
    data: Box
    children: StrongMap<Pid, Arc<Process>>
    parent: Weak<Process>
    group: Arc<ProcessGroup>
    +is_zombie() bool
    +exit() void
    +free() void
}

class ThreadGroup {
    threads: WeakMap<Pid, Weak<Thread>>
    exit_code: i32
    group_exited: bool
}

Process  -->  ThreadGroup : contains

Sources: src/process.rs(L35 - L47)  src/process.rs(L196 - L225) 

Zombie Process Cleanup

Cleanup of zombie processes is a two-step process:

  1. A process terminates by calling Process::exit(), which marks it as a zombie
  2. The parent process calls Process::free() to complete the cleanup

The free() method removes the zombie process from its parent's children list. If a process is freed before it's marked as a zombie, the system will panic to prevent incorrect resource management.
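
The zombie invariant can be sketched with a minimal model. The `Proc` type below is hypothetical; only the atomic-flag pattern and the "free a non-zombie panics" rule mirror the documented behavior:

```rust
use std::sync::atomic::{AtomicBool, Ordering};

// Hypothetical process model (sketch, not the crate's Process type).
struct Proc {
    is_zombie: AtomicBool,
}

impl Proc {
    fn new() -> Proc {
        Proc { is_zombie: AtomicBool::new(false) }
    }

    // Mirrors Process::exit(): mark as zombie.
    fn exit(&self) {
        self.is_zombie.store(true, Ordering::Release);
    }

    // Mirrors the documented invariant: freeing a live process is a bug.
    fn free(&self) {
        assert!(
            self.is_zombie.load(Ordering::Acquire),
            "only zombies can be freed"
        );
        // ... remove self from the parent's children map ...
    }
}

fn main() {
    let p = Proc::new();
    p.exit();
    p.free(); // fine: exit() ran first
    println!("freed a zombie process");
}
```

Calling free() before exit() trips the assertion, which corresponds to the panic described above.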

sequenceDiagram
    participant ChildProcess as "Child Process"
    participant ParentProcess as "Parent Process"
    participant InitProcess as "Init Process"

    ChildProcess ->> ChildProcess: exit()
    Note over ChildProcess: Sets is_zombie = true
    alt Parent still alive
        ParentProcess ->> ChildProcess: free()
        Note over ChildProcess,ParentProcess: Remove from parent's children
    else Parent already exited
        InitProcess ->> ChildProcess: free()
        Note over ChildProcess,InitProcess: Remove from init's children
    end

Sources: src/process.rs(L227 - L236)  tests/process.rs(L25 - L44) 

Resource Management During Exit

The exit() implementation handles several key cleanup tasks:

  1. Marks the process as a zombie using atomic operations
  2. Reassigns child processes to a reaper (currently always the init process)
  3. Updates parent references in all child processes
  4. Maintains the process in the parent's children list for later cleanup

Table: Key Resources in Zombie Processes

| Resource | Status in Zombie Process | Cleaned Up By |
| --- | --- | --- |
| Memory for Process structure | Still allocated | free() method |
| Child process references | Transferred to init | exit() method |
| Parent reference | Maintained | Parent's children map |
| Process Group membership | Maintained | Not removed until free() |
| Exit code | Preserved | Stored in ThreadGroup |

Sources: src/process.rs(L196 - L225) 

Orphan Process Handling

When a parent process exits before its children, the children become "orphaned" and are inherited by the init process. This prevents zombie processes from becoming permanent if their parents exit without cleaning them up.

flowchart TD
subgraph subGraph0["Before Parent Exit"]
    Parent["Parent Process"]
    Child1["Child Process 1"]
    Child2["Child Process 2"]
end
subgraph subGraph1["After Parent Exit"]
    Init["Init Process"]
    ParentZ["Parent Process (Zombie)"]
    Child1a["Child Process 1"]
    Child2a["Child Process 2"]
end

Parent --> Child1
Parent --> Child2
Init --> Child1a
Init --> Child2a
Init --> ParentZ

Implementation details:

  1. When a process exits, it transfers all its children to the init process (or designated subreaper)
  2. Each child's parent reference is updated to point to the new parent
  3. These processes now appear in the init process's children collection
  4. The init process becomes responsible for cleaning them up when they exit

Sources: src/process.rs(L207 - L224)  tests/process.rs(L47 - L55) 

Cleanup Implementation Details

The zombie cleanup is implemented through reference management. Let's examine how this is done:

flowchart TD
subgraph subGraph0["Reference Management"]
    Parent["Parent Process"]
    Child["Child Process"]
    Zombie["Zombie State"]
    Freed["Removed from parent"]
end

Child --> Parent
Child --> Zombie
Parent --> Child
Zombie --> Freed

Key implementation points:

  1. The parent holds strong references (Arc<Process>) to its children in a StrongMap
  2. Children hold weak references (Weak<Process>) to their parent
  3. When free() is called, the zombie process is removed from its parent's children map
  4. This removes the strong reference, allowing memory deallocation when all references are gone

The Process::free() method also checks that a process is actually a zombie before freeing it, to prevent accidental cleanup of active processes.

Sources: src/process.rs(L227 - L236)  tests/process.rs(L25 - L29) 

Special Case: Init Process

The init process requires special handling in the context of zombies:

  1. The init process cannot exit (calling exit() on it will panic)
  2. It's responsible for cleaning up orphaned processes
  3. It must properly handle zombie processes inherited from terminated parents

This special status ensures that there's always a process available to clean up orphaned zombies, preventing resource leaks.

Sources: src/process.rs(L207 - L209)  tests/process.rs(L31 - L35) 

Resource Management Considerations

Proper zombie process management is essential for preventing resource leaks:

  1. Memory leaks: Zombie processes that are never freed can accumulate and waste memory
  2. Process ID exhaustion: Each zombie still occupies a process ID
  3. Parent responsibility: Parents must clean up their zombie children

Users of this API must ensure they properly handle the cleanup of zombie processes by calling free() after retrieving any needed exit information.

Sources: src/process.rs(L227 - L236) 

Development and Testing

Relevant source files

This document outlines the development practices, testing methodologies, and CI/CD pipeline for the axprocess crate. It provides information for developers who want to contribute to or modify the codebase, explaining how to set up a development environment, run tests, and understand the automated workflows in place.

For information about specific process management functionality, see Process Management or Thread Management.

Development Environment

The axprocess crate is built using Rust's standard development tools and follows modern Rust development practices. The codebase uses the nightly Rust toolchain for development and testing.

Code Style and Formatting

Code formatting is strictly defined through the project's rustfmt.toml configuration file. All code contributions should adhere to these formatting guidelines.

# Key rustfmt settings
unstable_features = true
style_edition = "2024"
group_imports = "StdExternalCrate"
imports_granularity = "Crate"
normalize_comments = true
wrap_comments = true
reorder_impl_items = true
format_strings = true
format_code_in_doc_comments = true

To ensure consistent formatting, run rustfmt with the project's configuration before submitting any code changes:

cargo +nightly fmt

Sources: rustfmt.toml(L1 - L19) 

Development Workflow

Typical Development Workflow

flowchart TD
A["Clone Repository"]
B["Setup Nightly Toolchain"]
C["Implement Feature/Bug Fix"]
D["Run Tests Locally"]
E["Format Code with rustfmt"]
F["Check with Clippy"]
G["Create Pull Request"]
H["CI Checks Run"]
I["Code Review"]
J["Address Feedback"]
K["Merge to main"]

A --> B
B --> C
C --> D
D --> E
E --> F
F --> G
G --> H
H --> I
I --> J
I --> K
J --> H

Sources: .github/workflows/ci.yml(L1 - L62)  rustfmt.toml(L1 - L19) 

Testing Methodology

The axprocess crate employs several testing approaches to ensure code quality and correctness. The codebase follows Rust's standard testing conventions, with integration tests organized under the tests/ directory.

Types of Tests

Testing Structure in axprocess

flowchart TD
subgraph subGraph0["Test Execution"]
    E["cargo test --all-features"]
end
A["axprocess Tests"]
B["Unit Tests"]
C["Integration Tests"]
D["Documentation Tests"]
B1["Process Component Tests"]
B2["Thread Management Tests"]
B3["Session/Group Tests"]
C1["Component Interaction Tests"]
D1["API Example Tests"]

A --> B
A --> C
A --> D
B --> B1
B --> B2
B --> B3
B1 --> E
B2 --> E
B3 --> E
C --> C1
C1 --> E
D --> D1
D1 --> E

Running Tests Locally

To run the full test suite locally:

cargo test --all-features

For running specific tests:

cargo test <test_name> --all-features

For verbose test output:

cargo test -- --nocapture

Sources: .github/workflows/ci.yml(L29 - L30) 

CI/CD Pipeline

The project uses GitHub Actions for continuous integration and deployment, ensuring that all code changes are automatically tested and documented.

CI Workflow

CI/CD Pipeline Architecture

flowchart TD
A["Push to main/PR"]
B["GitHub Actions CI Workflow"]
C["check job"]
D["doc job"]
C1["Setup nightly toolchain"]
C2["Run clippy linter"]
C3["Run cargo test"]
D1["Setup nightly toolchain"]
D2["Build documentation"]
D3["Prepare doc artifact"]
E["deploy job"]
E1["Deploy to GitHub Pages"]

A --> B
B --> C
B --> D
C --> C1
C1 --> C2
C2 --> C3
D --> D1
D1 --> D2
D2 --> D3
D3 --> E
E --> E1

Sources: .github/workflows/ci.yml(L1 - L62) 

CI Jobs and Tasks

The CI pipeline consists of three main jobs:

| Job | Purpose | Key Tasks |
| --- | --- | --- |
| check | Code quality & testing | Run clippy linter, execute test suite |
| doc | Documentation | Build API documentation, prepare artifact |
| deploy | Publication | Deploy documentation to GitHub Pages |

The CI workflow is triggered on:

  • Push events to the main branch
  • Pull requests targeting the main branch

Each job in the workflow runs on the latest Ubuntu environment.

Environment Variables

The CI environment sets the following variables:

RUST_BACKTRACE: 1

This ensures that any test failures provide detailed backtraces to help identify the source of problems.

Sources: .github/workflows/ci.yml(L15 - L16) 

Documentation Generation

The documentation job automatically generates API documentation using cargo doc and deploys it to GitHub Pages. This ensures that the latest documentation is always available online.

sequenceDiagram
    participant GitHubActions as "GitHub Actions"
    participant DocJob as "Doc Job"
    participant GitHubPages as "GitHub Pages"

    GitHubActions ->> DocJob: Trigger on main branch changes
    DocJob ->> DocJob: Setup nightly toolchain
    DocJob ->> DocJob: Run "cargo doc --all-features --no-deps"
    DocJob ->> DocJob: Create index.html with redirect
    DocJob ->> GitHubActions: Upload artifact
    GitHubActions ->> GitHubPages: Deploy artifact
    GitHubPages ->> GitHubPages: Publish documentation

Sources: .github/workflows/ci.yml(L32 - L61) 

Best Practices for Contributors

When contributing to the axprocess crate:

  1. Always use the nightly Rust toolchain as specified in the CI configuration
  2. Ensure code passes clippy linting with cargo clippy --all-features --all-targets
  3. Add appropriate tests for new functionality
  4. Format code according to the project's rustfmt configuration
  5. Add documentation comments for public APIs
  6. Verify that all tests pass before submitting a pull request

Following these practices ensures that contributions integrate smoothly with the existing codebase and pass the automated CI checks.

Sources: .github/workflows/ci.yml(L1 - L62)  rustfmt.toml(L1 - L19) 

Testing Approach

Relevant source files

This document outlines the testing methodology used for the axprocess crate, focusing on how process management components are tested within the system. The axprocess crate leverages Rust's built-in testing framework to ensure proper functionality of process, process group, and session abstractions.

Test Organization

The test suite is organized into multiple files, each focused on testing specific subsystems:

flowchart TD
A["Tests Structure"]
B["process.rs"]
C["group.rs"]
D["session.rs"]
E["common/mod.rs"]
B1["Process lifecycle tests"]
B2["Parent-child relationship tests"]
C1["Process group functionality tests"]
C2["Group membership tests"]
D1["Session management tests"]
D2["Session-group relationship tests"]
E1["Common test utilities"]
E2["ProcessExt trait"]

A --> B
A --> C
A --> D
A --> E
B --> B1
B --> B2
C --> C1
C --> C2
D --> D1
D --> D2
E --> E1
E --> E2

Sources: tests/process.rs tests/group.rs tests/session.rs tests/common/mod.rs

Testing Infrastructure

Test Initialization

The axprocess tests utilize a common initialization mechanism that runs before any tests:

sequenceDiagram
    participant TestFramework as "Test Framework"
    participant Commoninitfunction as "Common init function"
    participant Processnew_init as "Process::new_init"

    Note over Commoninitfunction: Runs before any tests, via the ctor crate
    TestFramework ->> Commoninitfunction: Load test module
    Commoninitfunction ->> Commoninitfunction: alloc_pid()
    Commoninitfunction ->> Processnew_init: new_init(pid)
    Processnew_init ->> Processnew_init: build()
    Note over Processnew_init: init process created

The ctor crate is used to automatically initialize the test environment by creating an initial process before any tests run.

Sources: tests/common/mod.rs(L15 - L18) 

PID Allocation

Tests use a simple atomic counter to allocate unique process IDs:

use std::sync::atomic::{AtomicU32, Ordering};

static PID: AtomicU32 = AtomicU32::new(0);

fn alloc_pid() -> u32 {
    PID.fetch_add(1, Ordering::SeqCst)
}

This ensures that each test process receives a unique PID without conflicts.

Sources: tests/common/mod.rs(L9 - L13) 

ProcessExt Trait

To simplify test code, a ProcessExt trait provides helper methods for common operations:

use std::sync::Arc;

use axprocess::Process;

pub trait ProcessExt {
    fn new_child(&self) -> Self;
}

impl ProcessExt for Arc<Process> {
    fn new_child(&self) -> Self {
        self.fork(alloc_pid()).build()
    }
}

This extension trait makes test code more concise by providing shortcuts for creating child processes.

Sources: tests/common/mod.rs(L20 - L28) 

Test Categories

Process Lifecycle Tests

These tests verify the fundamental process management capabilities:

| Test Name | Purpose |
| --- | --- |
| child | Verifies parent-child relationship creation |
| exit | Tests process termination and zombie state transition |
| free_not_zombie | Verifies that freeing non-zombie processes causes panic |
| init_proc_exit | Ensures init process cannot be terminated |
| free | Tests resource cleanup after process termination |
| reap | Verifies orphan handling when parent processes exit |

Example test verifying process exit:

#[test]
fn exit() {
    let parent = init_proc();
    let child = parent.new_child();
    child.exit();
    assert!(child.is_zombie());
    assert!(parent.children().iter().any(|c| Arc::ptr_eq(c, &child)));
}

Sources: tests/process.rs(L8 - L55) 

Process Group Tests

Tests in this category verify process group functionality:

| Test Name | Purpose |
| --- | --- |
| basic | Tests basic process group properties |
| create | Verifies process group creation |
| create_leader | Tests group leader constraints |
| cleanup | Verifies resource cleanup |
| inherit | Tests group inheritance by child processes |
| move_to | Tests moving processes between groups |
| move_cleanup | Verifies empty group cleanup |
| move_back | Tests moving processes back to previous groups |
| cleanup_processes | Tests group cleanup after processes exit |

Sources: tests/group.rs(L8 - L141) 

Session Tests

These tests verify session functionality:

| Test Name | Purpose |
| --- | --- |
| basic | Tests basic session properties |
| create | Verifies session creation |
| create_leader | Tests session leader constraints |
| cleanup | Verifies resource cleanup |
| create_group | Tests group creation within a session |
| move_to_different_session | Verifies cross-session move constraints |
| cleanup_groups | Tests session cleanup after groups disappear |

Sources: tests/session.rs(L8 - L108) 

Test Method Patterns

The axprocess test suite follows several patterns:

flowchart TD
A["Standard Test Pattern"]
B["Initialize environment"]
C["Perform operation under test"]
D["Assert expected outcomes"]
E["Error Test Pattern"]
F["Mark test with #[should_panic]"]
G["Perform operation that must panic"]
H["Cleanup Test Pattern"]
I["Create resource"]
J["Take Arc::downgrade weak reference"]
K["Exit, free, and drop strong references"]
L["Assert weak upgrade fails"]

A --> B
B --> C
C --> D
E --> F
F --> G
H --> I
I --> J
J --> K
K --> L

Sources: tests/process.rs tests/group.rs tests/session.rs

Standard Tests

Most tests follow a structure of:

  1. Initialize the test environment (create necessary processes)
  2. Perform the operation being tested
  3. Assert the expected outcomes using assert! or similar functions

Example from group.rs:

#[test]
fn basic() {
    let init = init_proc();
    let group = init.group();
    assert_eq!(group.pgid(), init.pid());
    
    let child = init.new_child();
    assert!(Arc::ptr_eq(&group, &child.group()));
    
    let processes = group.processes();
    assert!(processes.iter().any(|p| Arc::ptr_eq(p, &init)));
    assert!(processes.iter().any(|p| Arc::ptr_eq(p, &child)));
}

Sources: tests/group.rs(L8 - L20) 

Error Tests

Tests that verify error handling use the #[should_panic] attribute:

#[test]
#[should_panic]
fn free_not_zombie() {
    init_proc().new_child().free();
}

This verifies that attempting to free a non-zombie process triggers a panic as expected.

Sources: tests/process.rs(L25 - L29) 

Resource Cleanup Tests

Tests that verify proper resource cleanup often use weak references to ensure resources are properly deallocated:

#[test]
fn cleanup() {
    let child = init_proc().new_child();
    
    let group = Arc::downgrade(&child.create_group().unwrap());
    assert!(group.upgrade().is_some());
    
    child.exit();
    child.free();
    drop(child);
    assert!(group.upgrade().is_none());
}

Sources: tests/group.rs(L54 - L65) 

Running the Tests

Tests can be run using the standard Cargo test command:

cargo test

For more specific subsets of tests:

cargo test --test process  # Run only process tests
cargo test --test group    # Run only group tests
cargo test --test session  # Run only session tests

Relationship to System Architecture

The testing approach directly mirrors the core architecture of the axprocess system:

flowchart TD
subgraph subGraph1["Test Modules"]
    TP["process.rs"]
    TG["group.rs"]
    TS["session.rs"]
end
subgraph subGraph0["System Architecture"]
    PA["Process Abstraction"]
    PG["Process Groups"]
    S["Sessions"]
    PR["Parent-Child Relationships"]
end

PA --> TP
PG --> TG
PR --> TP
S --> TS

This one-to-one mapping between system components and test modules ensures comprehensive test coverage.

Sources: tests/process.rs tests/group.rs tests/session.rs

Best Practices for Adding Tests

Based on the existing test patterns, here are the best practices for adding new tests to the axprocess crate:

  1. Use the common utilities: Leverage the ProcessExt trait and other utilities in common/mod.rs
  2. Follow the established patterns: Maintain consistency with existing test structure
  3. Test one behavior per test: Each test should focus on a specific functionality
  4. Test both success and failure paths: Add #[should_panic] tests for error conditions
  5. Verify resource cleanup: Use weak references to verify proper resource deallocation
  6. Maintain independence: Tests should not depend on each other's state

By following these practices, new tests will integrate well with the existing test suite and maintain test quality.

CI/CD Pipeline

Relevant source files

This document details the Continuous Integration and Continuous Deployment (CI/CD) pipeline configured for the axprocess repository. It explains how automated testing, linting, and documentation generation are set up to ensure code quality and maintain up-to-date documentation.

For information about testing approaches and how to run tests manually, see Testing Approach.

Pipeline Overview

The axprocess repository uses GitHub Actions for its CI/CD pipeline, which automatically runs on code changes to verify quality and deploy documentation. The pipeline ensures that:

  1. Code follows style guidelines and passes static analysis
  2. All tests pass successfully
  3. Documentation is automatically generated and deployed
flowchart TD
subgraph subGraph1["CI/CD Pipeline"]
    check["check job:Linting & Testing"]
    doc["doc job:Documentation Generation"]
    deploy["deploy job:Deploy to GitHub Pages"]
end
subgraph subGraph0["Trigger Events"]
    push["Push to main branch"]
    pr["Pull Request to main branch"]
end

check --> doc
doc --> deploy
pr --> check
push --> check

Sources: .github/workflows/ci.yml(L1 - L62) 

Pipeline Trigger Events

The CI/CD pipeline is configured to run automatically in response to specific Git events:

| Event Type | Branch | Action |
| --- | --- | --- |
| Push | main | Run full pipeline |
| Pull Request | main | Run full pipeline |

The pipeline uses GitHub's concurrency controls to avoid redundant runs:

concurrency:
  group: ${{ github.workflow }}-${{ github.ref }}-${{ github.event_name }}
  cancel-in-progress: true

This means that if multiple commits are pushed in quick succession, earlier workflow runs will be canceled in favor of the most recent one, saving CI resources.

Sources: .github/workflows/ci.yml(L3 - L13) 

CI Jobs and Steps

The pipeline consists of three main jobs:

flowchart TD
subgraph check["check"]
    c1["Checkout Code"]
    c2["Setup Rust toolchain"]
    c3["Run Clippy"]
    c4["Run Tests"]
end
subgraph doc["doc"]
    d1["Checkout Code"]
    d2["Setup Rust toolchain"]
    d3["Build Documentation"]
    d4["Upload Artifact"]
end
subgraph deploy["deploy"]
    dp1["Deploy to GitHub Pages"]
end

c1 --> c2
c2 --> c3
c3 --> c4
d1 --> d2
d2 --> d3
d3 --> d4
d4 --> dp1

Sources: .github/workflows/ci.yml(L18 - L61) 

Check Job

The check job runs on Ubuntu and performs the following steps:

  1. Checks out the repository code
  2. Sets up the Rust nightly toolchain with the Clippy component
  3. Runs Clippy for static analysis with warnings treated as errors
  4. Runs all tests with all features enabled
check:
  runs-on: ubuntu-latest
  steps:
    - uses: actions/checkout@v4
    - name: Setup Rust toolchain
      run: |
        rustup default nightly
        rustup component add clippy
    - name: Clippy
      run: cargo clippy --all-features --all-targets -- -Dwarnings
    - name: Test
      run: cargo test --all-features

Sources: .github/workflows/ci.yml(L19 - L30) 

Documentation Job

The doc job is responsible for generating the Rust documentation:

  1. Checks out the repository code
  2. Sets up the Rust nightly toolchain
  3. Builds the documentation with all features enabled
  4. Creates an index.html redirect page
  5. Uploads the generated documentation as an artifact
doc:
  runs-on: ubuntu-latest
  steps:
    - uses: actions/checkout@v4
    - name: Setup Rust toolchain
      run: |
        rustup default nightly
    - name: Build docs
      run: |
        cargo doc --all-features --no-deps
        printf '<meta http-equiv="refresh" content="0;url=%s/index.html">' $(cargo tree | head -1 | cut -d' ' -f1 | tr '-' '_') > target/doc/index.html
    - name: Upload artifact
      uses: actions/upload-pages-artifact@v3
      with:
        path: target/doc

Sources: .github/workflows/ci.yml(L32 - L46) 
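The printf step encodes one detail worth noting: rustdoc places output under the crate name with hyphens replaced by underscores. A standalone sketch (assuming the crate name axprocess) shows the redirect page that step writes to target/doc/index.html:

```shell
# Doc paths use '_' where crate names use '-'; axprocess has none to replace.
crate_dir=$(echo "axprocess" | tr '-' '_')
redirect="<meta http-equiv=\"refresh\" content=\"0;url=${crate_dir}/index.html\">"
echo "$redirect"
```

Visiting the published root URL then immediately redirects the browser to the crate's generated documentation.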

Deploy Job

The deploy job takes the documentation artifact and deploys it to GitHub Pages:

  1. Uses GitHub's deploy-pages action to publish the documentation
  2. Requires appropriate GitHub permissions configured in the workflow
deploy:
  runs-on: ubuntu-latest
  needs: doc
  permissions:
    contents: read
    pages: write
    id-token: write
  environment:
    name: github-pages
    url: ${{ steps.deployment.outputs.page_url }}
  steps:
    - name: Deploy to GitHub Pages
      id: deployment
      uses: actions/deploy-pages@v4

Sources: .github/workflows/ci.yml(L48 - L61) 

Relationship to Code Structure

The CI/CD pipeline interacts with different parts of the axprocess codebase:

flowchart TD
subgraph subGraph1["axprocess Codebase"]
    src["src/ Directory"]
    tests_dir["tests/ Directory"]
    cargo["Cargo.toml"]
end
subgraph subGraph0["CI/CD Pipeline Components"]
    clippy["Clippy Static Analysis"]
    tests["Unit & Integration Tests"]
    docs["Documentation Generator"]
end

clippy --> cargo
clippy --> src
docs --> cargo
docs --> src
tests --> tests_dir

Sources: .github/workflows/ci.yml(L1 - L62)  Cargo.toml(L1 - L16) 

Environment Configuration

The CI/CD pipeline uses specific environment configurations:

  1. Uses Rust nightly toolchain for all steps
  2. Sets RUST_BACKTRACE=1 for better error reporting
  3. Runs on Ubuntu Linux
flowchart TD
subgraph subGraph0["Environment Setup"]
    env["Environment Variables:RUST_BACKTRACE=1"]
    rust["Rust Setup:- Channel: nightly- Components: clippy"]
    platform["Platform:Ubuntu Latest"]
end
CI["CI/CD Jobs"]

env --> CI
platform --> CI
rust --> CI

Sources: .github/workflows/ci.yml(L15 - L16)  .github/workflows/ci.yml(L23 - L26) 

Documentation Deployment Flow

The documentation deployment process follows these steps:

sequenceDiagram
    participant GitHubActions as GitHub Actions
    participant CargoDoc as Cargo Doc
    participant GitHubPages as GitHub Pages

    GitHubActions ->> CargoDoc: Generate Documentation
    Note over CargoDoc: Processes all source files<br>with rustdoc
    CargoDoc ->> GitHubActions: Create doc artifacts
    Note over GitHubActions: Generates index.html redirect
    GitHubActions ->> GitHubPages: Upload documentation
    Note over GitHubPages: Documentation published to<br>Github Pages URL

Sources: .github/workflows/ci.yml(L32 - L61) 

Best Practices for Developers

When working with the axprocess repository, developers should be aware of the CI/CD pipeline requirements:

  1. Clippy Compliance: All code must pass Clippy checks with no warnings (-Dwarnings flag is enabled)
  2. Test Coverage: New features should include tests, which will be automatically run by the pipeline
  3. Documentation: Code should be properly documented as it will be automatically published
  4. Build Requirements: The pipeline uses the nightly Rust toolchain, so code should be compatible with it

Conclusion

The CI/CD pipeline for axprocess provides automated quality checks and documentation deployment, ensuring that:

  1. Code meets style and quality standards through static analysis
  2. All tests pass on each change
  3. Documentation is automatically built and deployed to GitHub Pages
  4. Developers receive quick feedback on their code changes

This automation helps maintain a high-quality codebase and up-to-date documentation with minimal manual intervention.

Sources: .github/workflows/ci.yml(L1 - L62)