Structured Data

Information organized in a defined format with explicit field names, data types, and relationships -- making it directly readable by machines without human interpretation.

Also known as: structured information, machine-readable data

Structured data is information that has been organized according to a predefined schema: field names are explicit, data types are consistent, and the format is predictable from record to record. Common formats include JSON, CSV, XML, SQL tables, and JSON-LD.
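As a minimal sketch of what a predefined schema looks like in practice, the Python example below reads two CSV rows into typed records. The Product class, its field names, and the sample rows are hypothetical, chosen only for illustration.

```python
import csv
import io
from dataclasses import dataclass
from datetime import date


@dataclass
class Product:
    # The schema: explicit field names with one consistent type each.
    name: str
    category: str
    price: float
    listed_on: date


# Hypothetical sample data; every row follows the same column layout.
CSV_TEXT = """name,category,price,listed_on
Desk Lamp,Lighting,24.99,2024-03-01
Oak Shelf,Furniture,89.50,2024-03-04
"""


def load_products(text: str) -> list[Product]:
    """Parse CSV text into Product records, converting each field to its declared type."""
    rows = csv.DictReader(io.StringIO(text))
    return [
        Product(
            name=row["name"],
            category=row["category"],
            price=float(row["price"]),
            listed_on=date.fromisoformat(row["listed_on"]),
        )
        for row in rows
    ]


products = load_products(CSV_TEXT)
```

Because the format is the same from record to record, the same few lines of parsing code work for every row, and a malformed value (a non-numeric price, for example) fails loudly at load time instead of passing through unnoticed.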

The alternative is unstructured data: free text, images, audio, and video that machines must interpret rather than read directly. Semi-structured data (like HTML) falls in between -- it carries some organizational markers, such as tags, but those markers do not reliably identify what each piece of content means, so individual fields cannot be read out consistently.

For AI systems, structured data is preferable because it removes ambiguity. A machine reading structured data knows exactly which field contains the price, the category, the name, and the date -- it does not have to extract or infer them from prose. This makes structured datasets faster to process, less error-prone to use, and easier to share between systems.
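The difference can be illustrated with a small Python sketch that looks for the same price in a sentence and in a JSON record. The sample sentence, the record, and the naive "first number that looks like a price" rule are all assumptions made for this example, not a recommended extraction method.

```python
import json
import re

# Hypothetical inputs describing the same product.
PROSE = "The Desk Lamp, part of our Lighting range, was 29.99 but now sells for 24.99."
RECORD = '{"name": "Desk Lamp", "category": "Lighting", "price": 24.99}'


def price_from_prose(text):
    """Guess the price by taking the first number with two decimal places."""
    match = re.search(r"\d+\.\d{2}", text)
    return float(match.group()) if match else None


def price_from_record(text):
    """Read the price from the field that is explicitly named 'price'."""
    return json.loads(text)["price"]


guessed_price = price_from_prose(PROSE)    # picks up the old price, 29.99
actual_price = price_from_record(RECORD)   # reads the labeled value, 24.99
```

The heuristic returns the superseded price because nothing in the sentence marks which number is current, while the structured record needs no heuristic at all.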

In the context of web publishing, 'structured data' often refers specifically to schema.org markup embedded in HTML pages, most commonly written as JSON-LD. In the AI dataset context, it refers more broadly to any data organized in a machine-readable schema.
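For the web publishing sense, the following Python sketch pulls a JSON-LD block out of an HTML page using only the standard library. The page content and the JsonLdExtractor class are hypothetical; the script type "application/ld+json" and the schema.org Product and Offer types are the standard ones.

```python
import json
from html.parser import HTMLParser

# Hypothetical page with schema.org markup embedded as JSON-LD.
PAGE = """<html><head>
<script type="application/ld+json">
{"@context": "https://schema.org", "@type": "Product", "name": "Desk Lamp",
 "offers": {"@type": "Offer", "price": "24.99", "priceCurrency": "USD"}}
</script>
</head><body><h1>Desk Lamp</h1></body></html>"""


class JsonLdExtractor(HTMLParser):
    """Collect the parsed contents of every JSON-LD script element."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.items = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        # Script text may arrive in several chunks, so accumulate it.
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self.items.append(json.loads("".join(self._buffer)))
            self._in_jsonld = False


extractor = JsonLdExtractor()
extractor.feed(PAGE)
product = extractor.items[0]
```

The visible page is ordinary HTML, but the embedded block gives a machine the product name and price as labeled fields, without any need to interpret the heading or body text.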