Compile-time Hex String Validation in Rust using Const Evaluation

Learn how to leverage Rust's const evaluation features to validate hex strings at compile time, ensuring type safety and zero runtime overhead.

A deep dive into using Rust’s const evaluation features to validate hex strings at compile time, ensuring both correctness and zero runtime overhead.

Introduction

Compile-time evaluation is a powerful feature in Rust that allows us to perform computations during compilation rather than at runtime. This can lead to better performance and earlier error detection. In this post, we’ll explore how to use Rust’s const evaluation features to validate hex strings at compile time, focusing on a practical example of validating Sui blockchain’s object IDs.

The Problem

When working with blockchain or cryptographic systems, we often need to handle fixed-length hex strings that represent identifiers or addresses. For example, in the Sui blockchain, object IDs are 32-byte values typically represented as 64-character hex strings (with an optional “0x” prefix).

Working with these hex strings presents several challenges. First, we need to validate that the string has exactly 64 characters after any prefix. Second, we must ensure that all characters are valid hexadecimal digits (0-9, a-f, A-F). Third, we want to perform these validations with zero runtime overhead. Finally, we need to provide helpful compile-time error messages when validation fails.

The Solution: Compile-time Validation

Our solution leverages Rust’s const evaluation features to validate hex strings at compile time. We achieve this through a combination of const functions for parsing, trait-based length validation, custom diagnostic messages, and const trait bounds. This approach ensures that any validation errors are caught during compilation, leading to zero runtime overhead.

Type-level Length Validation

First, we define types and traits to validate string length at compile time. The implementation centers around a zero-sized struct IsValidLength<const LENGTH: usize> with a const generic parameter. We create a marker trait ValidLength with a custom error message, and implement it only for IsValidLength<64>. Finally, we provide a const function that requires the ValidLength trait.

pub struct IsValidLength<const LENGTH: usize>;

#[diagnostic::on_unimplemented(
    message = "literal passed into has the wrong length, its length should exactly be 64",
)]
pub trait ValidLength {}

impl ValidLength for IsValidLength<64> {}

pub const fn is_valid_length<T: ValidLength>() {}

Character Validation

For character validation, we use a similar trait-based approach. We define a ValidCharacters trait with custom error messages, and implement it only for valid inputs using a HasValidLength<const VALID: bool> type. This pattern allows us to provide clear error messages while maintaining type safety.

#[diagnostic::on_unimplemented(
    message = "the literal passed in has some invalid characters",
)]
pub trait ValidCharacters {}

pub struct HasValidLength<const VALID: bool>;

impl ValidCharacters for HasValidLength<true> {}

pub const fn has_valid_characters<T: ValidCharacters>() {}

Const Parsing Function

The core parsing logic runs at compile time through a const function. This implementation handles both uppercase and lowercase hex characters, returns a Result to handle parsing errors, and uses fixed-size arrays for zero-allocation parsing. The function takes a string slice and a starting position, returning either a 32-byte array or an error message.

pub const fn parse_sui_address(address: &str, start: usize) -> Result<[u8; 32], &'static str> {
    let mut trimmed_address = [0u8; 64];
    let mut i = 0;

    while i < 64 && start + i < address.len() {
        trimmed_address[i] = address.as_bytes()[start + i];
        i += 1;
    }

    let mut bytes = [0u8; 32];
    let mut j = 0;

    while j < 32 {
        let byte = match (
            hex_char_to_nibble(trimmed_address[j * 2]),
            hex_char_to_nibble(trimmed_address[j * 2 + 1]),
        ) {
            (Some(high), Some(low)) => (high << 4) | low,
            _ => return Err("failed to parse address"),
        };
        bytes[j] = byte;
        j += 1;
    }

    Ok(bytes)
}

const fn hex_char_to_nibble(c: u8) -> Option<u8> {
    match c {
        b'0'..=b'9' => Some(c - b'0'),
        b'a'..=b'f' => Some(c - b'a' + 10),
        b'A'..=b'F' => Some(c - b'A' + 10),
        _ => None,
    }
}

The ObjectID Type

To ensure type safety, we wrap the parsed bytes in a dedicated struct. The ObjectID type encapsulates a 32-byte array and provides a const constructor function.

pub struct ObjectID([u8; 32]);

impl ObjectID {
    pub const fn new(bytes: [u8; 32]) -> Self {
        Self(bytes)
    }
}

Putting It All Together: The Macro

The real magic happens in the object_id! macro. This macro orchestrates the entire validation process, combining length checking, character validation, and byte parsing into a seamless compile-time operation.

#[macro_export]
macro_rules! object_id {
    ($address:expr) => {{
        const ADDRESS_AND_START: (&str, usize) = $crate::check_length!($address);
        const BYTES_AND_ERR: ([u8; 32], bool) = match $crate::parse_sui_address(ADDRESS_AND_START.0, ADDRESS_AND_START.1) {
            Ok(parsed_address) => (parsed_address, true),
            Err(_) => ([0_u8; 32], false),
        };
        const ERR: bool = BYTES_AND_ERR.1;
        $crate::has_valid_characters::<HasValidLength<ERR>>();
        $crate::ObjectID::new(BYTES_AND_ERR.0)
    }};
}

The macro is complemented by a helper macro for length checking that handles the “0x” prefix and validates the string length:

#[macro_export]
macro_rules! check_length {
    ($address:expr) => {{
        const START: usize = 0;
        const ADDRESS_BYTES_LENGTH_AND_START: (usize, usize) = if $address.len() >= 2 && $address.as_bytes()[0] == b'0' && $address.as_bytes()[1] == b'x' {
            const START: usize = 2;
            ($address.as_bytes().len() - START, START)
        } else {
            ($address.as_bytes().len() - START, START)
        };
        const ADDRESS_BYTES_LENGTH: usize = ADDRESS_BYTES_LENGTH_AND_START.0;
        $crate::is_valid_length::<IsValidLength<ADDRESS_BYTES_LENGTH>>();
        ($address, ADDRESS_BYTES_LENGTH_AND_START.1)
    }}
}

How It Works

The validation system operates in several stages. First, the check_length! macro examines the input string, checking for the optional “0x” prefix and calculating the effective string length. It uses type-level validation through IsValidLength<LENGTH> to ensure the correct length at compile time.

Next, the parse_sui_address function attempts to parse each character of the input string. It returns a tuple containing both the parsed bytes and a success flag. This flag is then used with HasValidLength<VALID> to trigger appropriate compile-time errors for invalid characters.

Throughout this process, custom diagnostic messages provide clear feedback about any validation failures. All of this happens during compilation, ensuring that no runtime overhead is incurred for these checks.

Usage Examples

The macro can be used in various contexts, with the compiler providing helpful error messages for invalid inputs. Here’s a practical example:

fn main() {
    // This compiles successfully
    let valid_id = object_id!("0xd1ec56d5d92d3e3d74ce6e50e1b2a6b505bedd7d7305ead61d5093619bedb2a7");
    
    // This fails to compile due to invalid characters
    let invalid_char = object_id!("0xd1ec56d5d92d3e3d74ce6e50e1b2a6b505beGG7d7305ead61d5093619bedb2a7");
    
    // This fails to compile due to incorrect length
    let invalid_length = object_id!("0xd1ec56d5d92d3e3d74ce6e50e1b2a6b505bedd7d7305ead61d5093619bedb2a7abc");
}

When validation fails, the compiler produces clear error messages that help developers identify and fix the issues:

error[E0277]: literal passed into has the wrong length, its length should exactly be 64
   --> src/main.rs💯35
    |
100 |         $crate::is_valid_length::<IsValidLength<ADDRESS_BYTES_LENGTH>>();
    |                                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ the trait `ValidLength` is not implemented for `IsValidLength<73>`
...
114 |     object_id!("0xd1ec56d5d92d3e3d74ce6e50e1b2a6b505beGG7d7305ead61d5093619bedb2a7abcabcabc");                                           ...
    |     ----------------------------------------------------------------------------------------- in this macro invocation
    |
    = help: the trait `ValidLength` is implemented for `IsValidLength<64>`
note: required by a bound in `is_valid_length`
   --> src/main.rs:11:33
    |
11  | pub const fn is_valid_length<T: ValidLength>() {}
    |                                 ^^^^^^^^^^^ required by this bound in `is_valid_length`
    = note: this error originates in the macro `$crate::check_length` which comes from the expansion of the macro `object_id` (in Nightly builds, run with -Z macro-backtrace for more info)

error[E0277]: the literal passed in has some invalid characters
   --> src/main.rs:84:40
    |
84  |         $crate::has_valid_characters::<HasValidLength<ERR>>();
    |                                        ^^^^^^^^^^^^^^^^^^^ the trait `ValidCharacters` is not implemented for `HasValidLength<false>`
...
110 |     object_id!("0xd1ec56d5d92d3e3d74ce6e50e1b2a6b505beGG7d7305ead61d5093619bedb2a7");
    |     -------------------------------------------------------------------------------- in this macro invocation
    |
    = help: the trait `ValidCharacters` is implemented for `HasValidLength<true>`
note: required by a bound in `has_valid_characters`
   --> src/main.rs:22:38
    |
22  | pub const fn has_valid_characters<T: ValidCharacters>() {}
    |                                      ^^^^^^^^^^^^^^^ required by this bound in `has_valid_characters`
    = note: this error originates in the macro `object_id` (in Nightly builds, run with -Z macro-backtrace for more info)

For more information about this error, try `rustc --explain E0277`.
error: could not compile `playground` (bin "playground") due to 2 previous errors

Advanced Considerations

The performance benefits of this approach are significant. Since all validation happens at compile time, there is zero runtime overhead. The use of fixed-size arrays instead of vectors eliminates allocation costs, and the compile-time checks mean no runtime validation is necessary.

From a type safety perspective, the system provides strong guarantees. Length and character checks are enforced by the type system itself, the ObjectID type properly encapsulates the underlying bytes, and the compile-time validation ensures that invalid inputs simply cannot compile.

The error handling system is both sophisticated and user-friendly. Rust’s diagnostic attributes provide clear, contextual error messages, while the separation of length and character validation concerns makes it easy to identify exactly what went wrong. The early detection of errors during compilation prevents issues from manifesting at runtime.

Best Practices

When implementing similar compile-time validation systems, several practices have proven effective. Using const generics for compile-time numeric validation provides type-level guarantees. Custom traits with clear error messages make validation failures easy to understand. Separating different validation concerns into distinct traits improves code organization and maintainability.

Fixed-size arrays should be preferred over vectors when the size is known at compile time, as they eliminate allocation overhead. Making functions const whenever possible allows more work to be done during compilation, reducing runtime costs. These practices together create robust, efficient validation systems.

Conclusion

Rust’s const evaluation features enable powerful compile-time validation capabilities. By combining const functions, traits, and macros, we can create validation systems that catch errors early while incurring no runtime overhead. This approach is particularly valuable for domains like blockchain and cryptography where string validation is critical.

The implementation we’ve explored demonstrates the power of compile-time validation, providing clear error messages, ensuring type safety, and maintaining high performance. This pattern can be adapted for many scenarios where compile-time validation of string literals is needed, offering both safety and efficiency.

Playground Link

You can try out the code in the Playground.

References

The Rust Reference’s section on const evaluation provides detailed information about what operations are allowed in const contexts. The Rust RFC on const generics explains the type-level features we’ve used. The Rust documentation on custom diagnostics shows how to create helpful error messages. Finally, the Rust By Example chapter on macros demonstrates the macro system we’ve leveraged for our implementation.

Rust Reference: Const Evaluation

Rust RFC: Const Generics

Rust Documentation: Custom Diagnostics

Rust By Example: Macros