JSON Best Practices: 12 Patterns Every Developer Should Use
Beyond formatting and validation: how to structure JSON for readability, performance, and forward compatibility. Real patterns from production codebases, the anti-patterns that bite at scale, and what to do about timestamps, IDs, optional fields, and schema evolution.
Most JSON 'style guides' boil down to formatting rules. Useful, but barely scratching the surface. The real wins come from how you structure the data — key naming, nesting depth, when to use arrays vs objects, when to denormalize, what to do about timestamps and IDs, and how you evolve the schema without breaking every consumer. This guide is the cumulative output of a decade of writing JSON for APIs, configs, logs, and inter-service contracts. None of it is theoretical; every pattern is one I've seen either save weeks of work or cause a multi-day outage.
1. Pick one casing convention and enforce it
Most public APIs settle on either snake_case (Stripe, Twitter, GitHub, AWS) or camelCase (Google, Slack, modern JavaScript frameworks). Either works. Which one is better is a matter of taste; what matters is that you pick one and apply it consistently across every endpoint, every field, every nested object.
The worst possible choice is mixing them within the same payload: { firstName: 'Alice', last_name: 'Smith' }. Consumers immediately need a manual list of which fields use which convention. Anyone writing client code in a strongly-typed language gets confused; type-generators produce weird hybrid types. Anyone querying with jq or similar has to remember exact spellings.
Lint for this in CI. If you're using OpenAPI, you can configure spectral to fail builds on convention violations. If you have a custom JSON Schema, write a script that recursively walks it and asserts every property name matches your convention's regex.
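The recursive schema walk can be sketched in a few lines of plain Node. This is a minimal illustration, not a production linter; the schema literal and the snake_case policy are hypothetical examples.

```javascript
// Minimal sketch of a CI lint step: walk a JSON Schema and collect every
// property name that violates a snake_case convention.
const SNAKE_CASE = /^[a-z][a-z0-9]*(_[a-z0-9]+)*$/;

function findBadKeys(schema, path = "") {
  const bad = [];
  if (schema && typeof schema === "object" && schema.properties) {
    for (const [key, child] of Object.entries(schema.properties)) {
      const keyPath = path ? `${path}.${key}` : key;
      if (!SNAKE_CASE.test(key)) bad.push(keyPath);
      bad.push(...findBadKeys(child, keyPath)); // recurse into nested objects
    }
  }
  if (schema && typeof schema === "object" && schema.items) {
    bad.push(...findBadKeys(schema.items, `${path}[]`)); // recurse into arrays
  }
  return bad;
}

// Hypothetical schema with one deliberate violation:
const schema = {
  type: "object",
  properties: {
    user_id: { type: "integer" },
    firstName: { type: "string" }, // violation: camelCase
    address: {
      type: "object",
      properties: { zip_code: { type: "string" } },
    },
  },
};

const violations = findBadKeys(schema);
console.log(violations); // ["firstName"]
```

In CI, exit non-zero when `violations.length > 0` and print the offending paths so the fix is obvious from the build log.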
2. Use ISO 8601 strings for every timestamp
Unix epochs (seconds or milliseconds since 1970-01-01T00:00:00 UTC) feel performant — they're just numbers, smaller to store, faster to compare. But they're unreadable in logs and require timezone reasoning at every consumer. ISO 8601 strings — '2026-05-16T14:30:00Z' — are self-documenting, sort lexicographically (as long as they share a UTC offset and precision), parse cleanly in every language with a date library, and survive every JSON parser without precision loss.
Always include the timezone suffix. The 'Z' at the end means UTC; '+05:30' means India Standard Time, '-08:00' means Pacific. A timestamp without a timezone (a 'naive' timestamp) is ambiguous and a source of bugs. If you're getting timestamps from a system that doesn't include the timezone, document the assumption ('all timestamps in this API are UTC') and validate that assumption in your CI.
When to use epoch numbers anyway
Two cases: when you need millisecond-precision arithmetic across distributed systems (e.g., distributed tracing), or when you're packing data into a tight wire format (e.g., MQTT messages from sensors). For everything else — REST APIs, configs, event payloads — use ISO strings.
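The lexicographic-sort property is easy to verify in Node. A small sketch (the timestamps are made-up examples):

```javascript
// ISO 8601 timestamps in UTC are self-describing, and a plain string sort
// is also a chronological sort.
const ts = new Date(Date.UTC(2026, 4, 16, 14, 30, 0)).toISOString();
console.log(ts); // "2026-05-16T14:30:00.000Z" — the trailing Z marks UTC

const events = [
  "2026-05-16T14:30:00Z",
  "2025-12-31T23:59:59Z",
  "2026-01-01T00:00:00Z",
];
const ordered = [...events].sort(); // lexicographic === chronological here
console.log(ordered[0]); // "2025-12-31T23:59:59Z"
```

Note that this only holds when every timestamp shares the same offset and precision — one more reason to normalize everything to UTC at the boundary.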
3. Prefer arrays over numbered keys
// Bad — numbered keys
{
"item_1": "alpha",
"item_2": "beta",
"item_3": "gamma"
}
// Good — an array
{
"items": ["alpha", "beta", "gamma"]
}
Numbered keys are an anti-pattern. They force consumers to loop over Object.keys() and parse out the numeric suffix. String-key iteration order has only been guaranteed (insertion order) since ES2015 — historically it wasn't — and any code that sorts the keys alphabetically puts item_10 before item_2. The moment you add an item_10, anything that pattern-matches on the suffix breaks.
If the keys carry meaning (not just position), use an array of objects: [{ id: 1, label: 'alpha' }, { id: 2, label: 'beta' }]. Now you can sort, filter, and reference items by id without parsing strings.
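If you're stuck migrating a legacy payload, the conversion is a one-liner-ish helper. A sketch, assuming a hypothetical `item_` prefix:

```javascript
// One-off migration: convert legacy numbered keys into an array, sorting by
// the numeric suffix so item_10 lands after item_2 (not alphabetically).
function numberedKeysToArray(obj, prefix = "item_") {
  return Object.keys(obj)
    .filter((k) => k.startsWith(prefix))
    .sort((a, b) => Number(a.slice(prefix.length)) - Number(b.slice(prefix.length)))
    .map((k) => obj[k]);
}

const legacy = { item_2: "beta", item_10: "kappa", item_1: "alpha" };
console.log(numberedKeysToArray(legacy)); // ["alpha", "beta", "kappa"]
```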
4. Be precise about null vs missing vs empty
These states mean different things and consumers will treat them differently. Pick one to mean 'this value exists but is empty' and document it.
- null — the field is present in the schema but has no value. Use for optional fields that are explicitly unset.
- missing key — the field doesn't apply in this context. Use sparingly; forces consumers to write 'if 'address' in obj' checks.
- empty object {} — distinct from null. Use only when the field is a container that happens to be empty (e.g., 'metadata: {}' for an item with no metadata yet).
- empty string '' — distinct from null. Use when the value is a string that happens to be empty (e.g., a search query that the user cleared).
- empty array [] — distinct from null. Use when the value is a list that happens to be empty (e.g., no comments yet).
My rule: prefer null over missing keys for optional fields. The reason is simpler client code. Instead of writing `obj.address && obj.address.city`, you write `obj.address?.city`. The null check becomes one safer operator instead of two property accesses with an implicit truthiness check.
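A small sketch of what this buys the consumer (field names are illustrative):

```javascript
// Why explicit null beats a missing key on the consumer side: optional
// chaining plus nullish coalescing handle both states with one operator,
// but the null convention keeps every record the same shape.
const withNull = { name: "Alice", address: null };
const withMissing = { name: "Bob" };

// The same one-liner works for both:
const city1 = withNull.address?.city ?? "unknown";
const city2 = withMissing.address?.city ?? "unknown";
console.log(city1, city2); // "unknown" "unknown"

// But only the null convention distinguishes "explicitly unset" from
// "this field was never part of the record":
console.log("address" in withNull);    // true  — present, explicitly unset
console.log("address" in withMissing); // false — absent entirely
```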
5. Always document optionality, ideally with JSON Schema
Two responses with the same shape but different missing keys force consumers to write defensive checks everywhere. JSON Schema lets you declare which fields are required and which are optional, plus type constraints, min/max values, regex patterns, and enum values. Generate types from your schema; never the other way around.
{
"$schema": "https://json-schema.org/draft/2020-12/schema",
"type": "object",
"required": ["id", "name"],
"properties": {
"id": { "type": "integer", "minimum": 1 },
"name": { "type": "string", "minLength": 1, "maxLength": 100 },
"email": { "type": "string", "format": "email" },
"created_at": { "type": "string", "format": "date-time" }
}
}
Validate at API boundaries (HTTP request/response, queue consumers, database loaders), in unit tests (every fixture should validate against the schema), and in CI (block merges if schemas change in breaking ways). Don't validate inside business logic — by then it's too late and too slow.
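To make the boundary check concrete, here is a deliberately tiny validator for flat schemas like the one above. In production you'd reach for a real validator such as ajv; this toy version only checks `required` and primitive `type`, and the user schema is a trimmed-down example.

```javascript
// Toy boundary validator: checks `required` fields and primitive `type`
// constraints for a flat (non-nested) JSON Schema fragment.
function validateFlat(schema, data) {
  const errors = [];
  for (const field of schema.required ?? []) {
    if (!(field in data)) errors.push(`missing required field: ${field}`);
  }
  for (const [field, rules] of Object.entries(schema.properties ?? {})) {
    if (!(field in data)) continue; // optional and absent — fine
    const value = data[field];
    // JSON Schema "integer" maps to the JS "number" typeof
    const jsType = rules.type === "integer" ? "number" : rules.type;
    if (typeof value !== jsType) {
      errors.push(`${field}: expected ${rules.type}, got ${typeof value}`);
    }
  }
  return errors;
}

const userSchema = {
  required: ["id", "name"],
  properties: { id: { type: "integer" }, name: { type: "string" } },
};

console.log(validateFlat(userSchema, { id: 1, name: "Alice" })); // []
console.log(validateFlat(userSchema, { id: "1" }));
// ["missing required field: name", "id: expected integer, got string"]
```

Pattern 10 below covers doing this properly with a compiled validator.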
6. Cap nesting at four levels deep
Deeply nested JSON is hard to read, hard to query, and hard to validate. Beyond four levels, the cognitive cost is real. Flatten where you can. Instead of { user: { profile: { contact: { email: '[email protected]' } } } }, use { user_id: 1, email: '[email protected]' } and provide an endpoint that joins back to the full user object if needed.
When you genuinely need nesting (representing a tree of categories, an org chart, a nested comment thread), keep each level shallow. Use IDs and pagination instead of embedding entire sub-trees. GraphQL-style 'pick the fields you need' patterns help here too, even if you're using REST.
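The cap is also enforceable in CI. A sketch of a depth guard (the payloads are illustrative):

```javascript
// CI guard: measure the maximum nesting depth of a payload and fail the
// build if it exceeds the cap. Scalars count as depth 0; each object or
// array wrapper adds 1.
function maxDepth(value) {
  if (value === null || typeof value !== "object") return 0;
  const children = Array.isArray(value) ? value : Object.values(value);
  return 1 + Math.max(0, ...children.map(maxDepth));
}

const shallow = { user_id: 1, email: "[email protected]" };
const deep = { user: { profile: { contact: { email: "[email protected]" } } } };

console.log(maxDepth(shallow)); // 1
console.log(maxDepth(deep));    // 4 — right at the cap
```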
7. Be deliberate about numbers vs strings
Send numbers as numbers. { 'price': 9.99 }, not { 'price': '9.99' }. Stringified numbers force every consumer to parseFloat them, lose type information, and add bytes. JSON's number type is well-defined and parsers handle it cleanly.
The big exception: identifiers that are technically numeric but larger than 2^53. JavaScript represents all numbers as 64-bit floats, which can only precisely represent integers up to 2^53 − 1, i.e. 9,007,199,254,740,991 (Number.MAX_SAFE_INTEGER). Twitter learned this early: their tweet IDs grew past that limit and silently lost precision in JavaScript clients. The fix: send them as strings. Discord, Twitter, and most modern APIs that use snowflake IDs serialize them as strings for exactly this reason.
Same rule for phone numbers, ZIP codes, account numbers — anything where leading zeros matter or the value happens to be numeric. They're not really numbers; they're identifiers that look like numbers. String them.
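The 2^53 hazard is easy to demonstrate in any JavaScript runtime:

```javascript
// Two distinct snowflake-sized IDs collapse to the same JS number after
// parsing, but survive intact as strings.
const a = JSON.parse('{"id": 9007199254740993}'); // 2^53 + 1
const b = JSON.parse('{"id": 9007199254740992}'); // 2^53
console.log(a.id === b.id); // true — precision silently lost

const c = JSON.parse('{"id": "9007199254740993"}');
const d = JSON.parse('{"id": "9007199254740992"}');
console.log(c.id === d.id); // false — strings keep every digit
```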
8. Pretty-print for humans, minify for the wire
Indentation is a development concern, not a production one. API responses should be minified — they're 20–40% smaller and parsers don't care about whitespace. Gzip narrows the gap, because repeated indentation compresses extremely well, but minified input still gzips smaller and wastes no CPU compressing whitespace you never needed to send.
If you're worried about debuggability, every modern HTTP client (Postman, Insomnia, browser DevTools, curl with --compressed) auto-pretty-prints JSON responses for human display. The wire stays compact; the human view stays readable. You get the best of both.
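In JavaScript the switch is just the third argument to JSON.stringify. A sketch with a made-up payload:

```javascript
// The same object, minified vs pretty-printed. Whitespace is pure overhead
// on the wire; JSON.stringify's third argument controls indentation.
const payload = {
  items: ["alpha", "beta", "gamma"],
  created_at: "2026-05-16T14:30:00Z",
};

const minified = JSON.stringify(payload);        // for the wire
const pretty = JSON.stringify(payload, null, 2); // for humans and fixtures

console.log(minified.length < pretty.length); // true
console.log(minified);
// {"items":["alpha","beta","gamma"],"created_at":"2026-05-16T14:30:00Z"}
```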
9. Version every payload
Even for internal JSON — config files, event payloads, log records — add a top-level 'version' or 'schema_version' field from day one. The day you need to change the shape and support both old and new consumers, you'll thank yourself. Without a version field, consumers can only sniff the shape ('does this object have field X?') to detect the format, which is fragile and gets worse over time.
Use semantic versioning (1.0, 1.1, 2.0) and document what each version means. Backward-compatible changes bump the minor version; breaking changes bump the major version. Consumers can switch on the major version to choose how to parse.
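A sketch of a consumer switching on the major version. The payload shapes and field names here are hypothetical, invented purely to show the branching:

```javascript
// Consumer that dispatches on the major component of schema_version.
// Minor bumps (2.0 → 2.1) are backward-compatible, so only the major
// version selects a parser.
function parseEvent(raw) {
  const event = JSON.parse(raw);
  const major = parseInt(String(event.schema_version ?? "1.0"), 10);
  switch (major) {
    case 1:
      // hypothetical v1: flat "username" and "timestamp" fields
      return { user: event.username, at: event.timestamp };
    case 2:
      // hypothetical v2: user moved under an object, timestamp renamed
      return { user: event.user.name, at: event.created_at };
    default:
      throw new Error(`unsupported schema_version: ${event.schema_version}`);
  }
}

const v1 = '{"schema_version":"1.0","username":"alice","timestamp":"2026-05-16T14:30:00Z"}';
const v2 = '{"schema_version":"2.1","user":{"name":"alice"},"created_at":"2026-05-16T14:30:00Z"}';
console.log(parseEvent(v1).user); // "alice"
console.log(parseEvent(v2).at);   // "2026-05-16T14:30:00Z"
```

Without the version field, this function would have to sniff for the presence of `username` vs `user` — exactly the fragility the pattern avoids.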
10. Validate with JSON Schema at runtime
TypeScript types only check at compile time. JSON Schema validates at runtime — at API boundaries, in tests, in CI pipelines, in admin tools. The two systems are complementary. Generate types from your schema (json-schema-to-typescript is the standard tool) so your runtime checks and compile-time checks always agree.
Use ajv (the de facto Node validator) or similar in your language of choice. It's fast — modern validators compile schemas into JIT-optimized validation functions that can validate millions of objects per second. The cost is essentially zero for any real-world payload.
11. Think about security: prototype pollution and JSON hijacking
JSON has two well-known security pitfalls that bite at scale.
Prototype pollution
If you naively merge user-provided JSON into a JavaScript object via Object.assign or lodash.merge, an attacker can set '__proto__' and pollute the prototype chain of every object in the process. Mitigation: don't merge untrusted objects, use Object.create(null) for the target, or use a merge that filters out dangerous keys ('__proto__', 'constructor', 'prototype') — recent lodash releases do this filtering; older versions were famously vulnerable.
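Here is a minimal sketch of a merge that refuses the dangerous keys. It's an illustration of the filtering idea, not a replacement for a hardened library:

```javascript
// Recursive merge that drops the classic prototype-pollution keys.
const DANGEROUS = new Set(["__proto__", "constructor", "prototype"]);

function safeMerge(target, source) {
  for (const key of Object.keys(source)) {
    if (DANGEROUS.has(key)) continue; // drop pollution attempts
    const value = source[key];
    if (value && typeof value === "object" && !Array.isArray(value)) {
      target[key] = safeMerge(target[key] ?? {}, value);
    } else {
      target[key] = value;
    }
  }
  return target;
}

// JSON.parse creates "__proto__" as an ordinary own property, so it shows
// up in Object.keys and gets filtered here.
const attack = JSON.parse('{"name":"x","__proto__":{"polluted":true}}');
const merged = safeMerge({}, attack);
console.log(merged.name);     // "x"
console.log({}.polluted);     // undefined — the prototype was never touched
```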
JSON hijacking
An older attack against APIs that return a JSON array at the top level. A malicious site loads your API endpoint as a <script> and overrides the Array constructor to capture the data. Modern browsers closed this hole years ago — array literals no longer invoke overridden constructors or setters — but you should still wrap top-level arrays in an object: { data: [...] } instead of [...]. Belt and suspenders.
12. Avoid these anti-patterns in production
- Returning different shapes from the same endpoint based on conditions. If your endpoint sometimes returns an object and sometimes an array, your consumers have to type-check first. Pick one shape and stick with it.
- Empty-string-as-null. Some APIs use '' to mean null. This forces consumers to special-case it. If a value is missing, use null or omit the field.
- Mixed-type arrays. JSON allows them, but they break type-generators and confuse consumers. Use uniform types in arrays.
- Stringified JSON inside JSON. Sometimes seen: { 'data': '{\"name\": \"Alice\"}' }. Now consumers have to JSON.parse twice. Just nest the object directly.
- Date strings without timezone information. As discussed above, a reliable source of bugs.
- API responses gzipped but not advertised. If you compress responses, set the Content-Encoding header so consumers know to decompress.
- Numeric IDs that exceed 2^53 sent as numbers. Twitter learned this; you don't have to.
Validating your JSON in practice
All of the above is easier when you have a fast feedback loop. Our JSON Formatter at yalikit.com/tools/json-formatter validates as you type, points you to syntax errors with line/column precision, and can auto-fix the common JS-object-isms. For schema-level validation, the JSON Schema Generator can infer a schema from a sample payload, which you can then validate other payloads against in CI.
How these patterns play out at scale
Every one of the patterns above looks small in isolation. The compound interest hits at scale. A 100-endpoint API with consistent naming, ISO timestamps, schema validation, and version fields is dramatically easier to evolve than one without. New endpoints become trivial because they follow established patterns. Onboarding new engineers takes hours instead of weeks because they only have to learn the conventions once.
I've worked on APIs that followed all twelve practices and APIs that followed none. The first kind ships new features without breaking consumers; the second kind has a permanent feature freeze because every change is risky. The difference isn't talent or budget — it's whether the team made these decisions before they had to.
Common questions about JSON conventions
Is camelCase or snake_case better for JSON?
Neither is objectively better. Pick the one that matches your codebase's primary language. Python and Ruby codebases lean snake_case; JavaScript/TypeScript codebases lean camelCase. Consistency within a project matters more than which one you pick.
Should I use JSON Schema or OpenAPI?
OpenAPI uses JSON Schema for its data definitions, so they're not competing — they're complementary. Use OpenAPI to describe your full API (endpoints, parameters, responses); use JSON Schema for the individual data shapes. If you're not building a public API, JSON Schema alone is enough.
Is BSON or MessagePack better than JSON for performance?
For most web APIs, no. The bottleneck is usually the network and the database, not JSON parsing. Switching to a binary format gets you maybe 10–20% on hot paths but costs you human-readability and broad tooling support. Stay with JSON unless profiling proves it's the bottleneck.
Open JSON Formatter now
No signup. No upload. Runs entirely in your browser.
Open tool →
Founder of YaliKit. Builds developer tools full-time and ships every tool you see on the site. Previously worked on data platforms at scale. Writes about JSON, CSV, regex, performance, and the small details that make browser tools feel native.