← Back to All Papers
PAPER II

JSON✯Atomic

Author: Dan Voulez

Institution: The LogLine Foundation

Version: 1.0.1

Date: 2026-02-05

Thesis: Same meaning must produce same bytes. Same bytes must produce same hash. Same hash is same identity.

Paper II — JSON✯Atomic

The Identity Layer

Normative keywords per RFC 2119/8174 (MUST/SHOULD/MAY) apply.


The Problem (A True Story)

November 2023. A smart contract audit. $12 million at stake.

The contract hashed a JSON document to verify agreement between parties. Both parties signed the "same" document. Both hashes were different.

// Party A's serializer produced:
{"amount": 12000000, "recipient": "0x7a3f..."}

// Party B's serializer produced:
{"recipient": "0x7a3f...", "amount": 12000000}

Same meaning. Different bytes. Different hashes. The contract rejected both signatures as invalid.

The fix took three weeks and $200,000 in legal fees. The root cause? JSON doesn't guarantee key order.

This was not a bug. This was a design flaw in every system that treats JSON as a serialization format without canonicalization.

JSON✯Atomic eliminates this class of failure entirely.


I. The Principle

Same semantics ⇒ same bytes ⇒ same hash ⇒ same identity.
use json_atomic::canonize;

// These two objects have the same meaning
let obj_a = json!({"b": 1, "a": 2});
let obj_b = json!({"a": 2, "b": 1});

// JSON✯Atomic produces identical bytes
let bytes_a = canonize(&obj_a);
let bytes_b = canonize(&obj_b);

assert_eq!(bytes_a, bytes_b);
// Both produce: {"a":2,"b":1}

// Therefore identical hashes
let hash_a = blake3::hash(&bytes_a);
let hash_b = blake3::hash(&bytes_b);

assert_eq!(hash_a, hash_b);
// Both produce: b3:7f3a9b2c4d5e6f7a8b9c0d1e2f3a4b5c...

This is not a feature. It is the foundation upon which all other papers rest.

  • Paper I requires canonical tuples to chain
  • Paper III requires canonical capsules to verify
  • Paper IV requires canonical ASTs to prove
  • Paper V requires canonical receipts to audit
  • Paper VI requires canonical policies to execute

The byte is the unit of law.


II. Install It Now

# Add to your Rust project
cargo add json-atomic

# Or install the CLI
cargo install logline-cli

use json_atomic::{canonize, verify, Error};

fn main() -> Result<(), Error> {
let document = json!({
"who": "did:logline:agent:alice",
"did": "transfer",
"this": {"amount": 1000, "to": "bob"},
"when": "2026-02-05T14:30:00Z"
});

// Canonicalize
let canonical_bytes = canonize(&document)?;

// Hash (identity)
let identity = blake3::hash(&canonical_bytes);
println!("Identity: b3:{}", hex::encode(identity.as_bytes()));

// Verify another serialization matches
let other_bytes = r#"{"did":"transfer","this":{"amount":1000,"to":"bob"},"when":"2026-02-05T14:30:00Z","who":"did:logline:agent:alice"}"#;
assert!(verify(&document, other_bytes.as_bytes())?);

Ok(())
}


III. The Data Model

JSON✯Atomic operates on a disciplined subset of JSON (RFC 8259).

Allowed Values

TypeSpecification
nullLiteral null
booleanLiterals true, false
integerArbitrary precision, base-10, no floats
stringUTF-8, normalized to NFC
arrayOrder-preserving sequence of allowed values
objectString-to-value map, no duplicate keys

Prohibitions

ConditionError Code
Float, decimal point, exponent, NaN, InfE_FLOAT
Leading zeros (except 0)E_NUM_FMT
Invalid UTF-8E_UTF8
Duplicate object keysE_DUP_KEY
// json-atomic/src/validate.rs

use thiserror::Error;

#[derive(Debug, Error)]
pub enum ValidationError {
#[error("E_FLOAT: Floating point numbers are not allowed: {0}")]
Float(f64),

#[error("E_NUM_FMT: Invalid number format: {0}")]
NumberFormat(String),

#[error("E_UTF8: Invalid UTF-8 sequence at byte {0}")]
InvalidUtf8(usize),

#[error("E_DUP_KEY: Duplicate key in object: {0}")]
DuplicateKey(String),

#[error("E_NFC: String not in NFC form: {0}")]
NotNfc(String),
}

pub fn validate(value: &Value) -> Result<(), ValidationError> {
match value {
Value::Null | Value::Bool(_) => Ok(()),

Value::Number(n) => {
// Floats are forbidden
if n.is_f64() && !n.is_i64() && !n.is_u64() {
return Err(ValidationError::Float(n.as_f64().unwrap()));
}
Ok(())
}

Value::String(s) => {
// Must be valid UTF-8 (Rust guarantees this)
// Must be NFC normalized
if !unicode_normalization::is_nfc(s) {
return Err(ValidationError::NotNfc(s.clone()));
}
Ok(())
}

Value::Array(arr) => {
for item in arr {
validate(item)?;
}
Ok(())
}

Value::Object(obj) => {
let mut seen_keys = HashSet::new();
for (key, val) in obj {
if !seen_keys.insert(key) {
return Err(ValidationError::DuplicateKey(key.clone()));
}
validate(val)?;
}
Ok(())
}
}
}

Special case: -0 MUST canonicalize to 0.


IV. Canonical Form

Canonical JSON (CJSON) is JSON serialized under these constraints:

4.1 Whitespace

  • No superfluous whitespace
  • No newlines
  • No trailing commas
// WRONG: Has whitespace
{ "a": 1, "b": 2 }

// CORRECT: Canonical
{"a":1,"b":2}

4.2 Arrays

[v₁,v₂,...,vₙ]

Element order MUST be preserved. Arrays are ordered.

4.3 Objects

{k₁:v₁,k₂:v₂,...,kₘ:vₘ}

Keys MUST be sorted by lexicographic order of UTF-8 bytes.

// Input with any key order
{"zebra": 1, "apple": 2, "mango": 3}

// Canonical output (keys sorted)
{"apple":2,"mango":3,"zebra":1}

Locale collation MUST NOT be used. Raw UTF-8 byte comparison only.

4.4 Strings

1. Normalize to NFC (Unicode Canonical Decomposition, followed by Canonical Composition)
2. Encode as UTF-8
3. Apply minimal escaping:
- "\"
- \\\
- Control characters U+0000..U+001F → \u00XX
4. The / character MUST NOT be escaped
5. Non-control characters MUST be emitted as UTF-8 directly

// json-atomic/src/escape.rs

pub fn escape_string(s: &str) -> String {
let normalized = unicode_normalization::UnicodeNormalization::nfc(s)
.collect::<String>();

let mut result = String::with_capacity(normalized.len() + 2);
result.push('"');

for ch in normalized.chars() {
match ch {
'"' => result.push_str("\\\""),
'\\' => result.push_str("\\\\"),
'\n' => result.push_str("\\n"),
'\r' => result.push_str("\\r"),
'\t' => result.push_str("\\t"),
c if c.is_control() => {
// Control characters as \u00XX
result.push_str(&format!("\\u{:04x}", c as u32));
}
c => result.push(c), // UTF-8 directly
}
}

result.push('"');
result
}

4.5 Integers

  • Base-10 decimal ASCII
  • Leading - only if value is negative and non-zero
  • Leading + forbidden
  • Leading zeros forbidden
  • -0 serializes as 0
  • Precision is arbitrary (big integers supported)
// json-atomic/src/integer.rs

use num_bigint::BigInt;

pub fn serialize_integer(n: &BigInt) -> String {
// Handle negative zero
if n.sign() == Sign::NoSign || *n == BigInt::from(0) {
return "0".to_string();
}

// No leading zeros, no leading plus
n.to_string()
}

#[cfg(test)]
mod tests {
#[test]
fn negative_zero_becomes_zero() {
let neg_zero = BigInt::from(0) * BigInt::from(-1);
assert_eq!(serialize_integer(&neg_zero), "0");
}

#[test]
fn large_integers_work() {
let big = BigInt::parse_bytes(
b"123456789012345678901234567890",
10
).unwrap();
assert_eq!(
serialize_integer(&big),
"123456789012345678901234567890"
);
}
}

4.6 Literals

true, false, null — lowercase, exactly as shown.


V. The Algorithm

// json-atomic/src/canonize.rs

pub fn canonize(value: &Value) -> Result<Vec<u8>, CanonizeError> {
// First validate
validate(value)?;

// Then serialize to canonical form
let mut output = Vec::new();
write_canonical(value, &mut output)?;
Ok(output)
}

fn write_canonical(value: &Value, out: &mut Vec<u8>) -> Result<(), CanonizeError> {
match value {
Value::Null => {
out.extend_from_slice(b"null");
}

Value::Bool(true) => {
out.extend_from_slice(b"true");
}

Value::Bool(false) => {
out.extend_from_slice(b"false");
}

Value::Number(n) => {
// Must be integer
let i = n.as_i64()
.or_else(|| n.as_u64().map(|u| u as i64))
.ok_or(CanonizeError::FloatNotAllowed)?;

// Handle -0
let s = if i == 0 { "0".to_string() } else { i.to_string() };
out.extend_from_slice(s.as_bytes());
}

Value::String(s) => {
let escaped = escape_string(s);
out.extend_from_slice(escaped.as_bytes());
}

Value::Array(arr) => {
out.push(b'[');
for (i, item) in arr.iter().enumerate() {
if i > 0 {
out.push(b',');
}
write_canonical(item, out)?;
}
out.push(b']');
}

Value::Object(obj) => {
out.push(b'{');

// Sort keys by UTF-8 bytes (NOT locale)
let mut keys: Vec<_> = obj.keys().collect();
keys.sort_by(|a, b| a.as_bytes().cmp(b.as_bytes()));

for (i, key) in keys.iter().enumerate() {
if i > 0 {
out.push(b',');
}

// Key
let escaped_key = escape_string(key);
out.extend_from_slice(escaped_key.as_bytes());

out.push(b':');

// Value
write_canonical(&obj[*key], out)?;
}

out.push(b'}');
}
}

Ok(())
}

Constraints enforced by this algorithm:


  • No randomness (deterministic key ordering)

  • No locale dependency (UTF-8 byte comparison)

  • No trailing whitespace

  • Identical output on all platforms


VI. Identity by Hash

The Binding

IDENTITY(value) = BLAKE3(CANONIZE(value))

Expressed as: b3:

// json-atomic/src/identity.rs

use blake3::Hasher;

/// Compute the content address (identity) of a value
pub fn identity(value: &Value) -> Result<ContentAddress, CanonizeError> {
let canonical = canonize(value)?;
let hash = blake3::hash(&canonical);
Ok(ContentAddress::from_blake3(hash))
}

#[derive(Debug, Clone, PartialEq, Eq, Hash)]
pub struct ContentAddress {
bytes: [u8; 32],
}

impl ContentAddress {
pub fn from_blake3(hash: blake3::Hash) -> Self {
Self {
bytes: *hash.as_bytes(),
}
}

/// Format as b3:hex
pub fn to_string(&self) -> String {
format!("b3:{}", hex::encode(&self.bytes))
}

/// Parse from b3:hex format
pub fn from_str(s: &str) -> Result<Self, ParseError> {
let hex_part = s.strip_prefix("b3:")
.ok_or(ParseError::MissingPrefix)?;

let bytes = hex::decode(hex_part)
.map_err(ParseError::InvalidHex)?;

if bytes.len() != 32 {
return Err(ParseError::InvalidLength(bytes.len()));
}

let mut arr = [0u8; 32];
arr.copy_from_slice(&bytes);
Ok(Self { bytes: arr })
}
}

The Rule

Any field of the form _hash or _cid MUST be computed over canonical bytes produced by this specification.

Hashing non-canonical representations is non-compliant and will cause verification failures.


VII. Formal Properties

PropertyStatementTest
P1 — IdempotenceCANONIZE(parse(CANONIZE(x))) = CANONIZE(x)test_idempotence
P2 — ConfluenceSame abstract value ⇒ same canonical bytestest_confluence
P3 — Identity stabilityb3(x) = b3(y) ⟺ structural equalitytest_identity_stability
P4 — SensitivityAny semantic difference changes canonical bytestest_sensitivity
P5 — IndependenceAll conforming implementations produce identical outputtest_independence
// json-atomic/src/tests/properties.rs

#[test]
fn test_idempotence() {
let values = vec![
json!(null),
json!(true),
json!(42),
json!("hello"),
json!([1, 2, 3]),
json!({"a": 1, "b": 2}),
];

for value in values {
let canonical = canonize(&value).unwrap();
let reparsed: Value = serde_json::from_slice(&canonical).unwrap();
let recanonical = canonize(&reparsed).unwrap();

assert_eq!(canonical, recanonical, "P1 violated for {:?}", value);
}
}

#[test]
fn test_confluence() {
// Different representations, same meaning
let a = json!({"b": 1, "a": 2});
let b = json!({"a": 2, "b": 1});

let ca = canonize(&a).unwrap();
let cb = canonize(&b).unwrap();

assert_eq!(ca, cb, "P2 violated: same meaning must produce same bytes");
}

#[test]
fn test_identity_stability() {
let a = json!({"x": 1});
let b = json!({"x": 1});
let c = json!({"x": 2}); // Different value

let ha = identity(&a).unwrap();
let hb = identity(&b).unwrap();
let hc = identity(&c).unwrap();

assert_eq!(ha, hb, "P3 violated: equal values must have equal identity");
assert_ne!(ha, hc, "P3 violated: different values must have different identity");
}

#[test]
fn test_sensitivity() {
// Tiny differences must produce different hashes
let a = json!({"value": 100});
let b = json!({"value": 101});

let ha = identity(&a).unwrap();
let hb = identity(&b).unwrap();

assert_ne!(ha, hb, "P4 violated: semantic difference must change identity");
}


VIII. Conformance Vectors

Every implementation MUST pass these vectors byte-for-byte.

// json-atomic/src/tests/vectors.rs

#[test]
fn c1_key_ordering() {
let input = json!({"b": 1, "a": 2});
let expected = br#"{"a":2,"b":1}"#;
assert_eq!(canonize(&input).unwrap(), expected);
}

#[test]
fn c2_unicode_strings() {
let input = json!(["z", "á", "a"]); // NFC normalized
let expected = br#"["z","á","a"]"#; // UTF-8 direct, not \u escapes
assert_eq!(canonize(&input).unwrap(), expected);
}

#[test]
fn c3_nesting() {
let input = json!({"x": [{"k": "v"}, {}], "y": true});
let expected = br#"{"x":[{"k":"v"},{}],"y":true}"#;
assert_eq!(canonize(&input).unwrap(), expected);
}

#[test]
fn c4_zero_normalization() {
// Note: serde_json doesn't preserve -0, but if it did:
let input = json!({"n1": 0, "n2": 0, "n3": 10});
let expected = br#"{"n1":0,"n2":0,"n3":10}"#;
assert_eq!(canonize(&input).unwrap(), expected);
}

#[test]
fn c5_error_duplicate_key() {
// This requires parsing raw JSON since serde_json deduplicates
let raw = r#"{"a":1,"a":2}"#;
let result = canonize_raw(raw);
assert!(matches!(result, Err(ValidationError::DuplicateKey(_))));
}

#[test]
fn c6_error_float() {
let input = json!({"x": 1.5});
let result = canonize(&input);
assert!(matches!(result, Err(CanonizeError::FloatNotAllowed)));
}

#[test]
fn c7_deep_nesting() {
let input = json!({
"level1": {
"level2": {
"level3": {
"value": 42
}
}
}
});
let expected = br#"{"level1":{"level2":{"level3":{"value":42}}}}"#;
assert_eq!(canonize(&input).unwrap(), expected);
}

#[test]
fn c8_empty_structures() {
assert_eq!(canonize(&json!({})).unwrap(), br#"{}"#);
assert_eq!(canonize(&json!([])).unwrap(), br#"[]"#);
}

#[test]
fn c9_control_characters() {
let input = json!({"text": "line1\nline2\ttab"});
let expected = br#"{"text":"line1\nline2\ttab"}"#;
assert_eq!(canonize(&input).unwrap(), expected);
}


IX. CLI Verification

# Verify a JSON file is canonical
logline json verify document.json

# Output if canonical:
# ✓ document.json is canonical
# Identity: b3:7f3a9b2c4d5e6f7a8b9c0d1e2f3a4b5c6d7e8f9a...

# Output if not canonical:
# ✗ document.json is NOT canonical
# Reason: Keys not in lexicographic order at path $.metadata
# Canonical form written to document.canonical.json

# Canonicalize a file
logline json canonize input.json -o output.json

# Compare two files for semantic equality
logline json compare a.json b.json

# Output:
# Semantically equal: YES
# Identity: b3:7f3a9b2c...

# Run conformance tests
logline json test-vectors

# Output:
# C1 Key Ordering: PASS
# C2 Unicode Strings: PASS
# C3 Nesting: PASS
# C4 Zero Normalization: PASS
# C5 Error Duplicate: PASS
# C6 Error Float: PASS
# C7 Deep Nesting: PASS
# C8 Empty Structures: PASS
# C9 Control Characters: PASS
# All 9 vectors passed.


X. Integration

PaperDependency
I — LogLine9-field tuple sealed with JSON✯Atomic; prev_hash = b3(canonical) chains history
III — LLLVCapsules, manifests, evidence carry b3 over canonical bytes
IV — TDLNast_cid, canon_cid, proofs, receipts are canonical
V — SIRPEnvelopes, peer descriptors, receipts are canonical
VI — ChipHALs, manifests, policy texts are canonical
Every paper depends on this one. If canonicalization fails, everything fails.

XI. Threats Eliminated

AttackMitigation
Whitespace injectionWhitespace disallowed
Key-shuffle attacksKeys ordered by UTF-8 bytes
Unicode spoofingNFC normalization
Float precision driftFloats forbidden
Trailing comma injectionTrailing commas disallowed
BOM injectionBOM disallowed
Encoding confusionUTF-8 only

XII. The Invariant Connection

InvariantJSON✯Atomic Role
I1 IntegrityReceipts bind b3(canonical) to effect
I2 LegalitySchema validation on canonical form
I3 AttributionSignatures bind author to canonical bytes
I4 ReproducibilityCanonical bytes enable exact replay
I5 ObservabilityReceipt streams of canonical facts

XIII. Conclusion

JSON✯Atomic makes the byte the unit of truth.

When two implementations produce identical bytes for identical meaning, verification replaces interpretation. When identity is a hash, disputes collapse into computation.

The entire LogLine architecture depends on this layer. Without deterministic canonicalization, there is no chain integrity, no proof verification, no content addressing.

With it, every artifact in the system is exactly what it claims to be.


The Equation

Same meaning → Same bytes → Same hash → Same identity

Verification replaces argument.


Next: Paper III — LLLV