Semi-structured data

Structured data is data that is contained in a field for a particular record, such as data in a spreadsheet or a relational database. Unstructured data has no such field layout. Most data is unstructured: electronic files, metadata, web pages.

Semi-structured data is data that is not organized into the tables of a relational database, but which contains tags that separate elements of the data. An object database uses semi-structured data. It does not use tables, but instead creates relationships between two data sets directly. Many-to-many relations can be established directly, whereas a relational database builds them out of one-to-many links through an intermediate table. Object databases tend to work better with complex data.
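As a rough illustration of tags separating the elements of the data, the sketch below parses a small JSON document with Python's standard json module. The records, names, and field values are made up for the example. The point is that each record carries its own tags, and two records in the same collection need not share the same fields, which a fixed table would not allow.

```python
import json

# Hypothetical records: every value is labelled by a tag (the key),
# and the two records do not have the same set of fields.
raw = """
[
  {"name": "Ada", "email": "ada@example.com"},
  {"name": "Grace",
   "phones": ["555-0100", "555-0101"],
   "address": {"city": "Arlington"}}
]
"""

records = json.loads(raw)


def field_names(record):
    """Return the sorted top-level tags of one record."""
    return sorted(record.keys())


# One list of tags per record; the lists differ from record to record.
tags_per_record = [field_names(r) for r in records]

# Nested elements are reached by following the tags downward.
city = records[1]["address"]["city"]

print(tags_per_record)
print(city)
```

A relational table would need a column for every possible field, or extra tables for the phone numbers and the address. Here the structure travels with the data itself.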

JSON, email, electronic data interchange (EDI) files, and .xml files are examples of semi-structured files. The XML that describes the contents of a Word document uses hundreds of tags for character formatting, footnotes, and so on, nested at different levels. A .docx file consists of XML files inside a ZIP archive.
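The ZIP-plus-XML layout can be sketched with Python's standard zipfile and xml.etree.ElementTree modules. To keep the example self-contained, it builds a tiny stand-in archive in memory rather than opening a real file. The XML is a heavily simplified assumption of what word/document.xml looks like and is not a valid Word document on its own.

```python
import io
import xml.etree.ElementTree as ET
import zipfile

# Namespace used by WordprocessingML elements (the "w:" prefix).
W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"

# Simplified stand-in for word/document.xml: one paragraph holding
# one bold run of text. A real document has far more tags.
DOCUMENT_XML = (
    f'<w:document xmlns:w="{W}">'
    "<w:body><w:p><w:r>"
    "<w:rPr><w:b/></w:rPr>"
    "<w:t>Hello</w:t>"
    "</w:r></w:p></w:body>"
    "</w:document>"
)


def build_archive():
    """Return an in-memory ZIP archive holding the stand-in XML part."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("word/document.xml", DOCUMENT_XML)
    buffer.seek(0)
    return buffer


def max_depth(element, depth=1):
    """Return how many levels of tags are nested under element."""
    return max([depth] + [max_depth(child, depth + 1) for child in element])


with zipfile.ZipFile(build_archive()) as archive:
    names = archive.namelist()
    root = ET.fromstring(archive.read("word/document.xml"))

# Collect the visible text from every <w:t> element.
text = "".join(t.text or "" for t in root.iter(f"{{{W}}}t"))
depth = max_depth(root)

print(names, text, depth)
```

Passing the path of an actual .docx file to zipfile.ZipFile in place of build_archive() should list the XML parts of that document in the same way.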
