
There is a class of bug that shows up in nearly every language when you work with binary data. You read a file as a string, pass it to a hashing function that wants bytes, and get the wrong answer because the string went through a UTF-8 encode that you did not ask for. Or you receive an HTTP response body as bytes, try to concatenate it with a string, and get a type error at runtime. Or you read bytes from disk, write them to a network socket through a layer that silently converts to a string, and the data arrives corrupted.

These bugs are hard to find because they often look like they work. The hash comes back as a valid hex string. The HTTP response parses fine in your test environment. The corruption only shows up when the input contains bytes that are not valid UTF-8, which might be never in your test data and always in production.

The root cause is always the same: the language does not distinguish between "a sequence of bytes" and "a string of text" at the type level, or it distinguishes them but makes conversion implicit.

## How Vary handles it

Vary has two distinct types: `Str` for text and `Bytes` for binary data. They are not interchangeable. You cannot pass a `Str` where a `Bytes` is expected, or vice versa. Converting between them requires an explicit function call that names the encoding.

This distinction runs through every module that does I/O.

The filesystem module has two pairs of read/write functions:

```python
import fs

# Text I/O (reads and writes Str)
let text = fs.read_text("config.json").unwrap()
fs.write_text("output.txt", text).unwrap()

# Binary I/O (reads and writes Bytes)
let data = fs.read_bytes("image.png").unwrap()
fs.write_bytes("copy.png", data).unwrap()
```

There is no function called `read_file` that returns "whatever the file contains." You decide up front whether you are working with text or binary data.

The HTTP module follows the same split. Response bodies come back as `Str` by default (for JSON APIs), but you can get the raw `Bytes`:

```python
import http

let response = http.get("https://example.com/api/data")
let json_text: Str = response.body       # text
let raw: Bytes = response.body_bytes()    # binary
```

The crypto module goes further. All primary operations accept and return `Bytes`:

```python
import crypto

let data: Bytes = fs.read_bytes("document.pdf").unwrap()
let hash = crypto.sha256(data)           # Bytes in, CryptoHash out
print(hash.hex())                        # explicit conversion to Str
```

If you want to hash a string, you use the `_str` suffix variant:

```python
let hash = crypto.sha256_str("hello")    # Str convenience
```

The suffix makes the type boundary visible. You always know whether you are in string-land or bytes-land.

## Encoding is explicit

Converting between `Bytes` and `Str` goes through encoding functions that name the encoding:

```python
import crypto

let raw: Bytes = crypto.random_bytes(32)

# Bytes to Str (explicit encoding)
let hex: Str = crypto.hex_encode(raw)
let b64: Str = crypto.base64_encode(raw)

# Str to Bytes (explicit decoding)
let decoded: Bytes = crypto.hex_decode(hex)
let also_decoded: Bytes = crypto.base64_decode(b64)
```

There is no implicit `toString()` on `Bytes`. If you try to print a `Bytes` value or concatenate it with a string, the type checker stops you. You have to pick an encoding.

This is annoying the first time you hit it. It is a relief the fiftieth time, when you realize you have never had to debug a "wrong encoding" issue.

## A real pipeline

Here is what a realistic binary workflow looks like in Vary. Read a file, hash it, upload it, verify the server's response:

```python
import fs
import crypto
import http
import json

# Read binary file
let payload: Bytes = fs.read_bytes("artifact.tar.gz").unwrap()

# Hash it locally
let local_hash = crypto.sha256(payload)

# Upload (HTTP module accepts Bytes body)
let response = http.post("https://storage.example.com/upload", payload)

# Server returns the hash it computed
let server_hash = json.parse(response.body).get("sha256").as_str()

# Compare
if local_hash.hex() == server_hash {
    print("Upload verified")
}
```

Every variable has a clear type. `payload` is `Bytes`, `local_hash` is `CryptoHash`, `response.body` is `Str` (JSON text), `server_hash` is `Str`. The only type boundary crossing is `local_hash.hex()`, which converts the hash to a hex string for comparison. At no point did binary data silently become a string or vice versa.

If you have spent time debugging a corrupted file upload where everything worked in tests but production data came through mangled, this kind of explicitness starts to feel less like ceremony and more like insurance.

## What we avoided

Python famously struggles with this. In Python 2, `str` was bytes and `unicode` was text, and mixing them produced silent mojibake. Python 3 fixed it by making `str` always text and adding a separate `bytes` type, but the ecosystem still has libraries that accept either, and the error messages when you mix them up are confusing.

Go takes a different approach: `string` and `[]byte` are freely convertible with a type cast, but every conversion copies the data. The compiler does not warn you when you convert back and forth unnecessarily.

Vary is stricter than both. No implicit conversion, no cheap cast, and every I/O function picks a side. The cost is a few extra characters when you need to cross the boundary. The benefit is that "why is my binary data corrupted" is a bug you never have to debug.

## The design principle

The idea behind the `Bytes` type is the same one behind `Decimal` vs `Float` and `Money` vs `Decimal`: when two things look similar but have different semantics, give them different types.

Text and binary data look similar in memory. Both are sequences of values. But text must be valid UTF-8 while bytes have no encoding constraint. You can uppercase text but not bytes. You can XOR bytes but not text. Text can fail to decode; bytes never do. These are not edge cases. They are fundamental differences in how the data behaves, and collapsing them into one type hides that.

Fewer keystrokes now, more 3am debugging sessions later. We picked the other tradeoff.
