Benchmarks

TeaLeaf includes a Criterion-based benchmark suite that measures encode/decode performance and output size across multiple serialization formats.

Running Benchmarks

# Run all benchmarks
cargo bench -p tealeaf-core

# Run a specific scenario
cargo bench -p tealeaf-core -- small_object
cargo bench -p tealeaf-core -- large_array_1000
cargo bench -p tealeaf-core -- tabular_5000

# List available benchmarks
cargo bench -p tealeaf-core -- --list

Results are saved to target/criterion/ with HTML reports and JSON data. Criterion tracks historical performance across runs.

Formats Compared

Each scenario benchmarks encode and decode across six formats:

| Format | Library | Notes |
|--------|---------|-------|
| TeaLeaf Parse | tealeaf | Text parsing (.tl → in-memory) |
| TeaLeaf Binary | tealeaf | Binary compile/read (.tlbx) |
| JSON | serde_json | Standard JSON serialization |
| MessagePack | rmp_serde | Binary, schemaless |
| CBOR | ciborium | Binary, schemaless |
| Protobuf | prost | Binary with generated code from .proto definitions |

Note: Protobuf benchmarks use prost with code generation via build.rs. The generated structs have known field offsets at compile time, giving Protobuf a structural speed advantage over TeaLeaf’s dynamic key-based access.
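For reference, a prost build script usually takes a shape like the following. This is a hedged sketch: the `.proto` file name and directory are illustrative assumptions, not paths taken from this repository.

```rust
// build.rs — hypothetical sketch of prost code generation.
// The proto path "proto/bench.proto" is illustrative, not the repo's actual file.
fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Compiles .proto definitions into Rust structs at build time, which is
    // why Protobuf decode works with field offsets known to the compiler.
    prost_build::compile_protos(&["proto/bench.proto"], &["proto/"])?;
    Ok(())
}
```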

Benchmark Scenarios

| Group | Data Shape | Sizes | What It Tests |
|-------|------------|-------|---------------|
| small_object | Config-like object | 1 | Header overhead, small payload efficiency |
| large_array_100 | Array of Point structs | 100 | Array encoding at small scale |
| large_array_1000 | Array of Point structs | 1,000 | Array encoding at medium scale |
| large_array_10000 | Array of Point structs | 10,000 | Array encoding at large scale, throughput |
| nested_structs | Nested objects | 2 levels | Nesting overhead |
| nested_structs_100 | Nested objects | 100 levels | Deep nesting scalability |
| mixed_types | Heterogeneous data | 1 | Strings, numbers, booleans mixed |
| tabular_100 | @table User records | 100 | Schema-bound tabular data, small |
| tabular_1000 | @table User records | 1,000 | Schema-bound tabular data, medium |
| tabular_5000 | @table User records | 5,000 | Schema-bound tabular data, large |

Each group measures both encode (serialize) and decode (deserialize) operations, using Throughput::Elements for per-element metrics on scaled scenarios.
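The per-element metric can be illustrated with a stdlib-only timing sketch. This is not the Criterion harness itself (which handles warm-up, sampling, and statistics); it only shows the division that `Throughput::Elements` performs, with a toy serialization workload as the measured operation.

```rust
use std::time::Instant;

// Minimal illustration of per-element throughput, analogous to what
// Criterion reports with Throughput::Elements. Not the real harness:
// no warm-up, no repeated sampling, no outlier analysis.
fn time_per_element<F: FnMut()>(elements: u64, mut op: F) -> f64 {
    let start = Instant::now();
    op();
    start.elapsed().as_nanos() as f64 / elements as f64
}

fn main() {
    // Toy workload: serialize 10_000 points to a throwaway text format.
    let points: Vec<(f64, f64)> = (0..10_000)
        .map(|i| (i as f64, i as f64 * 2.0))
        .collect();
    let mut out = String::new();
    let ns_per_point = time_per_element(points.len() as u64, || {
        out.clear();
        for (x, y) in &points {
            out.push_str(&format!("{x},{y}\n"));
        }
    });
    println!("~{ns_per_point:.1} ns/element");
    assert_eq!(out.lines().count(), 10_000);
}
```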

Size Comparison Results

From cargo run --example size_report on tealeaf-core (sizes relative to JSON; lower is smaller):

| Format | Small Object | 10K Points | 1K Users |
|--------|--------------|------------|----------|
| JSON | 1.00x | 1.00x | 1.00x |
| Protobuf | 0.38x | 0.65x | 0.41x |
| MessagePack | 0.35x | 0.63x | 0.38x |
| TeaLeaf Binary | 3.56x | 0.15x | 0.47x |

Key observations:

  • Small objects: TeaLeaf has a 64-byte header overhead. For objects under ~200 bytes, JSON or MessagePack is more compact.
  • Large arrays: String deduplication and schema-based compression produce 6-7x better compression than JSON for 10K+ records.
  • Tabular data: @table encoding with positional storage is competitive with Protobuf, with the advantage of embedded schemas.
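The string-deduplication effect behind the large-array numbers can be illustrated with a toy string table. This is a stdlib-only sketch of the general technique, not TeaLeaf's actual encoder: each unique string is stored once and records reference it by index.

```rust
use std::collections::HashMap;

// Toy string table: each distinct string is stored once and referenced
// by a u32 index. Repetitive data then pays for indices, not strings.
#[derive(Default)]
struct StringTable {
    strings: Vec<String>,
    index: HashMap<String, u32>,
}

impl StringTable {
    fn intern(&mut self, s: &str) -> u32 {
        if let Some(&id) = self.index.get(s) {
            return id;
        }
        let id = self.strings.len() as u32;
        self.strings.push(s.to_string());
        self.index.insert(s.to_string(), id);
        id
    }
}

fn main() {
    let mut table = StringTable::default();
    // 10_000 records that all share the same two field values.
    let ids: Vec<(u32, u32)> = (0..10_000)
        .map(|_| (table.intern("active"), table.intern("admin")))
        .collect();

    // Only 2 unique strings are stored; each record holds 8 bytes of
    // indices instead of repeating the string bytes.
    assert_eq!(table.strings.len(), 2);
    assert_eq!(ids.len(), 10_000);
}
```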

Speed Characteristics

TeaLeaf’s dynamic key-based access is ~2-5x slower than Protobuf’s generated code:

| Operation | TeaLeaf | Protobuf | JSON (serde) |
|-----------|---------|----------|--------------|
| Parse text | Moderate | N/A | Fast |
| Decode binary | Moderate | Fast | N/A |
| Random key access | O(1) hash | O(1) field | O(n) parse |

Why TeaLeaf is slower than Protobuf:

  1. Dynamic dispatch – fields resolved by name at runtime; Protobuf uses generated code with known offsets
  2. String table lookup – each string access requires a table lookup
  3. Schema resolution – schema structure parsed from binary at load time
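The first two points can be contrasted in a few lines of stdlib Rust. This is illustrative only: generated-code access reads a field at a compile-time-known offset, while dynamic access hashes a key string and probes a map at runtime.

```rust
use std::collections::HashMap;

// Stand-in for a prost-generated struct: field offsets are fixed at
// compile time, so access compiles to direct loads.
struct GeneratedPoint {
    x: f64,
    y: f64,
}

fn main() {
    let fixed = GeneratedPoint { x: 1.0, y: 2.0 };
    let sum_fixed = fixed.x + fixed.y; // no lookup at all

    // Stand-in for dynamic key-based access: O(1) on average, but every
    // read hashes the key string and probes the table.
    let mut dynamic: HashMap<&str, f64> = HashMap::new();
    dynamic.insert("x", 1.0);
    dynamic.insert("y", 2.0);
    let sum_dynamic = dynamic["x"] + dynamic["y"];

    assert_eq!(sum_fixed, sum_dynamic);
}
```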

When this matters:

  • Hot loops decoding millions of records → consider Protobuf
  • Cold reads or moderate throughput → TeaLeaf is fine
  • Size-constrained transmission → TeaLeaf’s smaller binary compensates for slower decode

Code Structure

tealeaf-core/benches/
├── benchmarks.rs          # Entry point: criterion_group + criterion_main
├── common/
│   ├── mod.rs             # Module exports
│   ├── data.rs            # Test data generation functions
│   └── structs.rs         # Rust struct definitions (serde-compatible)
└── scenarios/
    ├── mod.rs             # Module exports
    ├── small_object.rs    # Small config object benchmarks
    ├── large_array.rs     # Scaled array benchmarks (100-10K)
    ├── nested_structs.rs  # Nesting depth benchmarks (2-100)
    ├── mixed_types.rs     # Heterogeneous data benchmarks
    └── tabular_data.rs    # @table User record benchmarks (100-5K)

Each scenario module exports bench_encode and bench_decode functions. Scaled scenarios accept a size parameter.
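A data-generation helper like those in common/data.rs might look roughly as follows. The struct fields and function name here are assumptions for illustration, not the repository's actual code; the point is that scaled scenarios pass their size parameter straight through to the generator.

```rust
// Hypothetical shape of a test-data generator (common/data.rs analog).
// Field names and the function name are illustrative assumptions.
#[derive(Clone, Debug, PartialEq)]
struct Point {
    x: f64,
    y: f64,
    label: String,
}

fn generate_points(n: usize) -> Vec<Point> {
    (0..n)
        .map(|i| Point {
            x: i as f64,
            y: (i * 2) as f64,
            label: format!("p{i}"),
        })
        .collect()
}

fn main() {
    // A scaled scenario would call this once per size it benchmarks.
    let small = generate_points(100);
    let large = generate_points(10_000);
    assert_eq!(small.len(), 100);
    assert_eq!(large.len(), 10_000);
    assert_eq!(large[9_999].label, "p9999");
}
```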

For optimization tips and practical guidance on when to use each format, see Performance.