YAML and JSON solve the same fundamental problem — representing structured data as text — but they make different tradeoffs. JSON optimizes for machines: strict syntax, no ambiguity, fast parsing. YAML optimizes for humans: indentation-based nesting, comments, less punctuation. The choice between them depends on who’s reading the file and what’s consuming it.
YAML is (mostly) a superset of JSON
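The YAML 1.2 spec states that any valid JSON document is also valid YAML. In practice this holds for nearly every document you will encounter: you can paste a JSON object into a YAML parser and it will work. The "mostly" comes from parsers that still implement YAML 1.1, which predates that guarantee and can disagree on edge cases. The reverse isn’t true: YAML supports features that JSON doesn’t, including comments, anchors and aliases, multi-line strings, unquoted keys, and type inference.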
This superset relationship is useful when migrating between formats. If you have a JSON config file and want to add comments, you can rename it to .yaml and start editing without converting the existing content. The JSON syntax remains valid.
The Norway problem
YAML’s most infamous gotcha involves implicit type coercion. In YAML 1.1 (the version most parsers default to), bare values are automatically interpreted based on their content. The value NO becomes the boolean false.
    # country_codes.yaml
    countries:
      - name: Norway
        code: NO   # parsed as boolean false
      - name: Germany
        code: DE   # parsed as string "DE"

    # After parsing:
    # { countries: [{ name: "Norway", code: false }, { name: "Germany", code: "DE" }] }

The YAML 1.1 boolean recognition list is absurdly broad: y, Y, yes, Yes, YES, on, On, ON all become true. Their counterparts all become false. YAML 1.2 narrowed this to only true and false, but most tools still use YAML 1.1 parsing by default. PyYAML, the most popular Python YAML library, uses 1.1 rules. Ruby’s Psych uses 1.1 rules.
The fix: quote any string that could be misinterpreted. Or switch to a YAML 1.2–strict parser. Or use JSON, where strings are always quoted and there’s no ambiguity.
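The quoting rule is easy to forget, so it can help to check for risky values automatically. What follows is a minimal sketch using only the Python standard library. The pattern transcribes the boolean spellings from the YAML 1.1 bool type, and `needs_quoting` is a hypothetical helper name, not part of any YAML library; real parsers may differ in the details.

```python
import re

# Boolean spellings recognized by the YAML 1.1 bool type.
# Transcribed for illustration; individual parsers may accept fewer forms.
YAML11_BOOL = re.compile(
    r"y|Y|yes|Yes|YES|n|N|no|No|NO"
    r"|true|True|TRUE|false|False|FALSE"
    r"|on|On|ON|off|Off|OFF"
)


def needs_quoting(value: str) -> bool:
    """Return True if a bare scalar would be read as a boolean under YAML 1.1."""
    return YAML11_BOOL.fullmatch(value) is not None


codes = ["NO", "DE", "on", "Yes", "SE"]
risky = [code for code in codes if needs_quoting(code)]
print(risky)  # ['NO', 'on', 'Yes']
```

A check like this only covers booleans. Dates and numbers need their own patterns, which is part of the argument for quoting every string by convention.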
More YAML gotchas
Tabs vs spaces
YAML forbids tabs for indentation. Only spaces are allowed. This isn’t just a style preference — a tab character in a YAML file produces a parse error. Many editors default to tabs, and the error message from most parsers is unhelpful: found character '\t' that cannot start any token. If you’re editing YAML by hand, configure your editor to insert spaces on Tab.
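If editor settings can’t be enforced, a pre-commit style check can catch tabs before the parser does. The sketch below is one possible approach using only the Python standard library; `find_tab_indents` is a hypothetical helper name. It looks only at leading whitespace, since that is where the rule described above applies.

```python
def find_tab_indents(text: str) -> list[int]:
    """Return 1-based line numbers whose leading whitespace contains a tab."""
    bad_lines = []
    for number, line in enumerate(text.splitlines(), start=1):
        # Everything before the first non-space, non-tab character.
        indent = line[: len(line) - len(line.lstrip(" \t"))]
        if "\t" in indent:
            bad_lines.append(number)
    return bad_lines


sample = "services:\n  web:\n\timage: nginx\n"
print(find_tab_indents(sample))  # [3]
```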
Multiline strings
YAML has multiple multiline string syntaxes, and the differences are subtle:
    # Literal block scalar: preserves newlines
    description: |
      This is line one.
      This is line two.
    # Result: "This is line one.\nThis is line two.\n"

    # Folded block scalar: folds newlines into spaces
    description: >
      This is a long
      paragraph.
    # Result: "This is a long paragraph.\n"

    # Literal block, strip trailing newline
    description: |-
      No trailing newline here.
    # Result: "No trailing newline here."

    # Folded block, strip trailing newline
    description: >-
      Also no trailing
      newline.
    # Result: "Also no trailing newline."

That’s four variants (|, >, |-, >-), and there are additional chomping indicators (|+ and >+) that keep all trailing newlines. JSON has one way to represent a string: double quotes with \n for newlines. More verbose, but no ambiguity.
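To see the JSON side of that comparison, here are the four result strings from the examples above serialized with Python’s built-in `json` module. This is a small illustration rather than a conversion tool, and the dictionary keys are labels made up for this sketch.

```python
import json

# The four results from the block scalar examples, as Python strings.
# The keys are illustrative labels, not YAML terminology.
results = {
    "literal": "This is line one.\nThis is line two.\n",
    "folded": "This is a long paragraph.\n",
    "literal_strip": "No trailing newline here.",
    "folded_strip": "Also no trailing newline.",
}

# Every newline inside a string is written as the two-character escape \n.
encoded = json.dumps(results, indent=2)
print(encoded)

# Round-tripping returns exactly the same strings.
assert json.loads(encoded) == results
```

Whether a trailing newline is present is visible directly in the JSON text, with no indicator character to decode.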
Dates parsed as dates
YAML auto-parses values that look like dates into date objects:
    version: 2024-01-15     # parsed as a Date object, not a string
    version: "2024-01-15"   # parsed as the string "2024-01-15"
    port: 8080              # parsed as integer 8080
    port: "8080"            # parsed as string "8080"
    ratio: 1e3              # parsed as float 1000.0 under YAML 1.2 rules (PyYAML keeps the string "1e3")
    ratio: "1e3"            # parsed as string "1e3"

If you have a version string like 2024-01-15 or an octal-looking value like 0123 (which YAML 1.1 reads as the integer 83), YAML may parse it as something you didn’t intend. Again, quoting fixes this, but you have to remember to do it. JSON requires quotes around every string key and value, which eliminates the problem entirely.
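For contrast, here is how the same values behave in JSON, using Python’s built-in `json` module. The `id` field is a made-up addition for this sketch, included to show the octal-looking case.

```python
import json

document = '{"version": "2024-01-15", "port": 8080, "ratio": 1e3, "id": "0123"}'
parsed = json.loads(document)

# Quoted values stay strings; only unquoted numbers become numbers.
for key, value in parsed.items():
    print(f"{key}: {value!r} ({type(value).__name__})")


def is_valid_json(text: str) -> bool:
    """Return True if the built-in json module accepts the text."""
    try:
        json.loads(text)
    except json.JSONDecodeError:
        return False
    return True


# A bare 0123 is not silently reinterpreted; it is a syntax error.
print(is_valid_json('{"id": 0123}'))
```

The type of each value is decided by its syntax alone, so there is nothing to guess.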
When to use YAML
YAML shines when humans are the primary audience:
- Configuration files: Docker Compose, Kubernetes manifests, GitHub Actions workflows, Ansible playbooks. These are hand-edited regularly, and comments are critical for explaining why a setting exists. JSON doesn’t support comments at all (JSON5 does, but adoption is limited).
- CI/CD pipelines: GitHub Actions, GitLab CI, CircleCI all use YAML. The indentation-based nesting reads naturally for pipeline definitions, and inline comments document non-obvious steps.
- Infrastructure as code: Helm charts, CloudFormation templates, Compose files. These are often hundreds of lines and benefit from YAML’s reduced punctuation and anchors for reusing blocks.
YAML’s anchor/alias feature (& and *) lets you define a block once and reference it elsewhere. This is useful for DRY config, but it also makes files harder to read if overused — the reader has to chase references to understand the final structure.
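To make the anchor/alias tradeoff concrete, the sketch below shows roughly what a parser produces for a small aliased document and what happens when that structure is written back out as JSON. It uses only the Python standard library, so the YAML appears as a comment rather than being parsed. The key names (`defaults`, `staging`, `production`) are invented for illustration.

```python
import json

# Roughly what a parser produces for this YAML:
#
#   defaults: &defaults
#     retries: 3
#     timeout: 30
#   staging: *defaults
#   production: *defaults
defaults = {"retries": 3, "timeout": 30}
config = {"defaults": defaults, "staging": defaults, "production": defaults}

# JSON has no alias syntax, so serializing writes the block out three times.
expanded = json.dumps(config, indent=2)
print(expanded)
```

The JSON version is longer, but every value sits where it is used. The YAML version is shorter, and a reader has to find `&defaults` to know what `staging` contains.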
When to use JSON
JSON wins when machines are the primary audience:
- APIs: REST and GraphQL APIs use JSON almost universally. Most mainstream languages have a built-in or standard-library JSON parser. Parsing is fast, serialization is fast, and the format is unambiguous.
- Data interchange: when two services communicate, the data crosses a serialization boundary. JSON’s strict grammar means both sides parse the payload identically. YAML’s implicit typing means the same document can produce different results in different parsers.
- Anything that gets programmatically generated or consumed: package.json, tsconfig.json, lock files. These are read/written by tools more often than by humans. JSON’s deterministic serialization makes diffing and merging predictable.
- When security matters: YAML parsers that support the full spec are complex. PyYAML’s yaml.load() with an unsafe loader (historically the default) can construct arbitrary Python objects, which is a deserialization vulnerability; yaml.safe_load() restricts input to plain data types. JSON parsers don’t have this problem because the format is too simple to encode executable logic.
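The security point can be illustrated from the JSON side with the standard library alone. In this sketch, a string that looks like a PyYAML object tag is carried inside a JSON document; the payload text and the `job` key are invented for the example.

```python
import json

# A string that resembles a PyYAML object tag, wrapped in a JSON document.
payload = json.dumps({"job": "!!python/object/apply:os.system ['echo hi']"})
data = json.loads(payload)

# With default settings, json.loads only builds dict, list, str, int, float,
# bool and None, so the tag is inert text and nothing is executed.
print(type(data["job"]).__name__)
print(data["job"])
```

For YAML input from untrusted sources in Python, the usual mitigation is to call PyYAML’s `yaml.safe_load()` rather than `yaml.load()` with an unsafe loader.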
Tooling support
JSON has better tooling support across the board:
- JSON Schema is mature and widely supported. Editors like VS Code use it for autocompletion and validation in tsconfig.json, package.json, and other well-known files. YAML schemas exist but are less standardized.
- jq is the standard command-line JSON processor. There’s yq for YAML, but it has multiple incompatible implementations (Mike Farah’s Go version vs Andrey Kislyuk’s Python wrapper around jq) with different syntax.
- Parsing speed: JSON parsers are significantly faster than YAML parsers. YAML’s grammar is context-sensitive, and the parser also has to resolve anchors and tags. For a small config file this doesn’t matter. For processing thousands of documents in a pipeline, it adds up.
- Language support: many programming languages include a JSON parser in the standard library (Python json, Go encoding/json, Ruby JSON, JavaScript JSON.parse). YAML requires a third-party library in most languages.
JSONC and JSON5: the middle ground
Some projects use JSON variants that add the most-requested YAML features without the full complexity:
- JSONC (JSON with Comments): used by VS Code’s settings.json and by tsconfig.json. It adds // and /* */ comments, and its parsers typically tolerate trailing commas. Everything else is standard JSON.
- JSON5: adds comments, trailing commas, unquoted keys, single-quoted strings, multi-line strings, and hex numbers. Closer to JavaScript object literal syntax.
If you want comments in a config file but don’t want the full YAML spec, JSONC or JSON5 might be the right choice. The downside is that standard JSON parsers reject them, so your toolchain needs explicit support.
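The toolchain caveat is easy to demonstrate with Python’s built-in `json` module, which implements standard JSON only. The sample text and the `is_standard_json` helper are made up for this sketch.

```python
import json

# JSONC-style input: a line comment and a trailing comma.
JSONC_TEXT = """{
  // enable strict type checks
  "strict": true,
}"""

PLAIN_TEXT = '{"strict": true}'


def is_standard_json(text: str) -> bool:
    """Return True if Python's built-in json module accepts the text."""
    try:
        json.loads(text)
    except json.JSONDecodeError:
        return False
    return True


print(is_standard_json(JSONC_TEXT))  # False
print(is_standard_json(PLAIN_TEXT))  # True
```

Any tool in the pipeline that reads the file with a standard parser will fail the same way, so check every consumer before adopting either variant.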
The quick decision
Pick YAML when: humans edit the file regularly, comments are needed, the file is part of a YAML-native ecosystem (Kubernetes, GitHub Actions, Ansible), and you can enforce quoting conventions to dodge the implicit typing traps.
Pick JSON when: machines generate or consume the file, you need cross-language compatibility, parsing speed matters, or you don’t want to worry about implicit type coercion.
When you need to move data between the two formats, the YAML ↔ JSON Converter handles the conversion in your browser — paste YAML and get JSON, or paste JSON and get YAML, with multi-document support and syntax highlighting.