JSON Formatting, Validation, and Schema in Practice
Summary (TL;DR)
A team I worked with last month shipped a one-line edit to feature_flags.json. JSON.parse accepted it, CI was green, and once it hit staging every checkout branch fell back to the legacy code path because flags.checkout_v2 was the string "true" instead of the boolean true. “Checking JSON” had quietly become three different activities, and they had only performed the first two. Pretty-printing reshapes whitespace so a human can read a blob; it does not verify anything. Syntax validation confirms that a string is parseable JSON according to RFC 8259 — matching brackets, proper quoting, correct number literals — and that is precisely what JSON.parse does. Structural validation is a separate step that asks whether the parsed value has the shape your program expects: required fields, correct types, allowed enum values, string lengths, numeric ranges. That last step is exactly what JSON Schema was designed for, and it is the one most teams skip until a production bug like the one above forces them to add it. A reliable pipeline uses all three at the right layer: format when you need to read the data, parse to catch malformed input, and run an Ajv-style schema check at trust boundaries — incoming API requests, configuration files, and cross-service messages.
Background
JSON is defined by RFC 8259 (and the equivalent ECMA-404). The grammar is small on purpose. A JSON document is one of: a string, a number, true, false, null, an array, or an object. Strings are double-quoted and support a short list of escapes including \n, \t, \", \\, and \uXXXX for Unicode code units. Numbers follow a decimal grammar with optional sign, fractional part, and exponent, but there is no distinction between integer and floating-point at the syntax level. Objects are unordered collections of string-keyed members, arrays are ordered lists, and whitespace outside of strings is insignificant.
What JSON intentionally does not include is almost as important. There are no comments, no trailing commas, no single-quoted strings, and no hex or binary literals. Unicode characters outside the ASCII range must appear either as raw UTF-8 bytes in the encoded stream or as \u escapes. Extensions like JSON5 and HJSON relax some of these rules, but they are separate formats, and strict RFC 8259 parsers will reject them.
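That strictness is easy to verify directly. The snippet below feeds JSON.parse a few strings that other config formats would accept; the helper function and the sample inputs are illustrative, not from any particular codebase:

```javascript
// Strict parsers reject the conveniences JSON deliberately omits.
function parses(text) {
  try {
    JSON.parse(text);
    return true;
  } catch {
    return false;
  }
}

console.log(parses('{"a": 1}'));        // true  — plain object
console.log(parses('{"a": 1,}'));       // false — trailing comma
console.log(parses("{'a': 1}"));        // false — single-quoted string
console.log(parses('{"a": 1} // hi'));  // false — comment
console.log(parses('{"a": 0x10}'));     // false — hex literal
```

Each false line would parse under JSON5; none survives a strict parser.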
Once a document is parsed, its syntactic validity says nothing about whether it is the right data. A login payload like {"user": "x", "pass": "y"} is perfectly valid JSON, even if your endpoint expected {"username": "...", "password": "..."}. To catch that class of mistake you need a schema: a machine-readable description of what counts as an acceptable document. JSON Schema (the current recommended meta-schema is Draft 2020-12) fills that gap. It supports required fields, type constraints, enum and const, string patterns via regex, numeric ranges, array items and uniqueness, object properties and additionalProperties, and composition via allOf, oneOf, anyOf, and $ref. Ajv 8.12 compiles a schema once into a JavaScript validator function, which is fast enough for hot paths — in one Node 20.11 service I instrumented, a compiled Ajv check on a typical request body sat in the tens-of-microseconds range.
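To make the idea concrete without pulling in a dependency, here is a stdlib-only sketch of what a compiled validator checks. This is not Ajv — it covers only a tiny subset of JSON Schema keywords (type, required, properties), and the flags schema is a made-up example echoing the opening incident:

```javascript
// Minimal sketch of structural validation: returns a list of problems,
// empty when the data matches the schema. Real validators (Ajv) support
// far more keywords and compile the schema to optimized code.
function validate(schema, data) {
  const errors = [];
  if (schema.type === "object") {
    if (typeof data !== "object" || data === null || Array.isArray(data)) {
      errors.push("expected an object");
      return errors;
    }
    for (const key of schema.required ?? []) {
      if (!(key in data)) errors.push(`missing required field "${key}"`);
    }
    for (const [key, sub] of Object.entries(schema.properties ?? {})) {
      if (key in data) {
        errors.push(...validate(sub, data[key]).map((e) => `${key}: ${e}`));
      }
    }
  } else if (schema.type && typeof data !== schema.type) {
    errors.push(`expected ${schema.type}, got ${typeof data}`);
  }
  return errors;
}

const flagsSchema = {
  type: "object",
  required: ["checkout_v2"],
  properties: { checkout_v2: { type: "boolean" } },
};

// The opening incident: syntactically valid, structurally wrong.
console.log(validate(flagsSchema, { checkout_v2: "true" }));
// [ 'checkout_v2: expected boolean, got string' ]
console.log(validate(flagsSchema, { checkout_v2: true })); // []
```

The string "true" sails through JSON.parse and is caught only here, which is the entire argument for a third layer.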
Pretty-printing is the most mundane of the three. It only inserts whitespace — indentation and newlines — without changing meaning. Most editors and command-line tools do it; browser dev tools do it automatically in the network panel. It is useful for humans and irrelevant to machines.
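In JavaScript the whole operation is one call; the sample payload below is arbitrary:

```javascript
const compact = '{"user":"x","roles":["admin","dev"]}';

// Parse, then re-serialize with two-space indentation.
const pretty = JSON.stringify(JSON.parse(compact), null, 2);
console.log(pretty);

// Round-tripping back to compact form proves no data changed —
// only whitespace differs.
console.log(JSON.stringify(JSON.parse(pretty)) === compact); // true
```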
Data / Comparison
| Aspect | Pretty-printing | Syntax validation | JSON Schema validation |
|---|---|---|---|
| Purpose | Make data readable | Ensure the text parses as JSON | Ensure parsed data matches an expected shape |
| Detects | Nothing — whitespace only | Unmatched brackets, bad quotes, invalid escapes, trailing commas | Missing fields, wrong types, out-of-range values, unknown keys |
| Does not detect | Structural issues, semantic issues | Structural issues (expected shape), business rules | Semantic rules beyond the schema, cross-field invariants without custom keywords |
| Typical tooling | JSON.stringify(obj, null, 2), jq, IDE formatter | JSON.parse, jq -e, any parser | Ajv 8.x, python-jsonschema, OpenAPI validators |
| Where to run | Developer tools, logs | At every parse boundary (implicit) | At API request/response, config load, message boundaries |
The three columns are not alternatives; they are sequential layers. Opening a 12 MB single-line package-lock.json in a raw editor is how you end up with a string your diff tool refuses to compare, and prettifying it into indented form does not validate anything — it just makes the shape legible. That distinction matters because the next two layers, syntax and schema validation, are often skipped on the grounds that "I pretty-printed it and nothing exploded." Pretty-print to read, parse to catch malformed text, and validate with a schema to catch wrong shape. Shipping only the first two is a common gap.
Real-world Scenarios
Scenario 1 — Debugging an API response. A third-party payment gateway returns a 600-line JSON body and the browser shows it as a single line. Pretty-printing it — either with browser dev tools, a local formatter, or curl ... | jq . — turns it into something a human can skim. No validation happens here; the goal is legibility while you hunt for the field that looks wrong, and the important discipline is not to call this step “validated.”
Scenario 2 — Loading a configuration file. A service reads config.json at startup. A strict JSON parser catches syntax errors like a stray trailing comma and refuses to start, which is the right behavior. But, as the opening incident showed, a valid file with retries: "three" instead of retries: 3 will parse just fine and only fail when the code tries to compare a string to a number. The pattern that has worked for me is to compile the schema with new Ajv().compile(schema), run the resulting validator on the parsed config in the very first lines of the entry point, and call process.exit(1) on failure — a five-minute change that has paid for itself the next two times someone hand-edited the file.
Scenario 3 — Verifying an OpenAPI contract. A team ships an OpenAPI 3.1 document that describes endpoints, request bodies, and response shapes under components.schemas. Contract tests take the example payloads from the spec and validate them against the schemas using a JSON Schema validator. When a server implementation drifts — say, starts returning an integer where the spec promised a string — the contract test flags the mismatch before a client breaks in production. This is the same JSON Schema engine you would use for a lone config file; the difference is the scale of coverage.
Common Misconceptions
“JSON.parse is enough to validate JSON.” It validates syntax, not shape. A parser will happily return an object missing half the fields your code needs. Treat JSON.parse as a gate against malformed text and layer a schema check on top when the data crosses a trust boundary.
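The failure mode is easy to reproduce; the payload below reuses the login example from earlier:

```javascript
// Syntactically valid, but not the payload the endpoint expects.
const body = JSON.parse('{"user": "x", "pass": "y"}');

// JSON.parse raised no error, yet the fields the handler needs are absent.
console.log(body.username); // undefined
console.log(body.password); // undefined

// Code that trusts the parse alone fails later, far from the real cause:
const greeting = `Hello, ${body.username}`;
console.log(greeting); // "Hello, undefined"
```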
“JSON Schema is server-side only.” Validating in the browser before sending a request gives users instant feedback and cuts load on the server. Many form libraries and Ajv itself run comfortably in the browser. Server-side validation still has to run — never trust the client — but client-side checks improve UX without weakening the security model.
“JSON5 is just JSON with comments.” JSON5 adds comments, unquoted keys, trailing commas, hex numbers, and more. That makes it friendlier as a human-edited config format — the best-known consumer is tsconfig.json, which actually uses JSONC (a different non-standard superset) — but anything that strictly follows RFC 8259 will not accept it. Use JSON5/JSONC only where the consuming tool documents support for them; emit strict JSON when writing to the network or to any tool whose parser you do not control.
“YAML is just JSON with indentation.” Every valid JSON document is valid YAML, but YAML adds features — anchors and aliases, tagged types, multiple documents in one file, block and folded scalars, indentation-sensitive parsing — that introduce classes of bugs JSON cannot have. The classic one is the so-called “Norway problem”: YAML 1.1 interprets NO as the boolean false unless you quote it. Moving from JSON to YAML to “make it readable” trades one class of problems for another.
Checklist
- Do you just need to read the data? Pretty-print it. Do not claim it is “validated.”
- Is it crossing a trust boundary (HTTP request, message queue, config file)? Parse and then run a JSON Schema check, not only a parse.
- Do you care about unknown fields? Set additionalProperties: false in the schema and decide whether to reject or strip extras.
- Are the errors actionable? Configure the validator (for example, Ajv with allErrors: true) to return every violation so users see all problems at once.
- Does the schema live next to the code that uses the data? Drift between a spec and an implementation is easier to prevent when the schema is a source of truth for both.
- Is strict JSON the right format at all? If humans need to edit it and you control the parser, JSON5/JSONC or TOML may be more forgiving. If machines exchange it, stay with strict JSON.
Related Tool
The Patrache Studio JSON formatter runs in the browser, so the payload you paste never leaves your machine — useful when you are inspecting a production response that contains personal data. Payloads rarely live alone: if the JSON you are formatting includes an embedded binary, the rules in Base64 and URL Encoding: Purpose, Pitfalls, Correct Usage explain why the blob expands and when a different transport is appropriate. If the payload includes IDs, UUID v1 vs v4 vs v7: Picking a DB Primary Key covers why the exact version you generate on the server affects the way downstream systems index, sort, and cache that JSON.
References
- IETF RFC 8259, “The JavaScript Object Notation (JSON) Data Interchange Format” — https://datatracker.ietf.org/doc/html/rfc8259
- JSON Schema Specification (Draft 2020-12) — https://json-schema.org/specification
- Ajv — Another JSON Schema Validator — https://ajv.js.org/
- MDN, “Working with JSON” — https://developer.mozilla.org/en-US/docs/Learn/JavaScript/Objects/JSON