JSON Schema Features
dotjson
uses a JSON Schema to constrain language model output. This guide explains the supported features and provides practical examples.
dotjson
supports most features from JSON Schema specification version 2020-12. For an introduction to JSON Schema, visit the official documentation.
dotjson
does not impose limits on the number of properties or depth of nested objects. dotjson
supports a variety of rich JSON schema features such as recursive schemas, inline regular expressions, array and string length constraints, and more.
Tip
Throughout this guide, examples are shown with both the schema definition and a valid example object that conforms to the schema.
Basic schema example
A JSON schema looks like this:
{
"$schema": "https://json-schema.org/draft/2020-12/schema",
"type": "object",
"properties": {
"first_name": {"type": "string"},
"age": {"type": "integer"},
"height": {"type": "number"},
"is_customer": {"type": "boolean"}
},
"required": ["first_name", "age", "height", "is_customer"]
}
dotjson
disables tokens inconsistent with the structure of a JSON object, or that would not validate against the given schema. A language model constrained by dotjson
might output a result like this:
{
"first_name": "John",
"age": 37,
"height": 1.80,
"is_customer": true
}
Note
The $schema
keyword is not technically required, but including it is considered good practice for validation and interoperability.
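To make "validates against the given schema" concrete, here is a minimal sketch in plain Python that checks a parsed object against the four required properties of the example schema. The helper name `conforms` and the hand-written type table are illustrative assumptions, not part of dotjson; a full JSON Schema validator covers far more.

```python
import json

# Expected Python type for each required property in the example schema.
EXPECTED = {
    "first_name": str,
    "age": int,
    "height": (int, float),
    "is_customer": bool,
}

def conforms(obj):
    """Return True if obj has exactly the expected keys with matching types."""
    if not isinstance(obj, dict) or set(obj) != set(EXPECTED):
        return False
    for key, expected_type in EXPECTED.items():
        value = obj[key]
        # bool is a subclass of int in Python, so reject it for numeric fields
        if expected_type is not bool and isinstance(value, bool):
            return False
        if not isinstance(value, expected_type):
            return False
    return True

output = json.loads(
    '{"first_name": "John", "age": 37, "height": 1.80, "is_customer": true}'
)
print(conforms(output))
```

The exact-key comparison mirrors the fact that every property in the example is listed in required.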
Core JSON data types
The type of a value in dotjson
is specified using the type
keyword.
For example, string values are defined as:
{"type": "string"}
This section provides an overview of valid entries for the type
keyword.
String
Strings represent text data. dotjson
supports the following string constraints:
- minLength and maxLength, specifying the minimum and maximum number of characters in the text.
- pattern, a regular expression defining the text the language model must generate.
- const, specifying that the value must be exactly a pre-specified value. const is useful when you do not want the language model to generate a value in your schema, such as for identifiers, names, etc.
- enum, which requires a field to be one of several pre-specified options.
- format, a commonly-used format for various types of text data, such as email, ipv4 addresses, or UUIDs.
Note
dotjson
does not support combining string constraints of different kinds on the same field. For example, minLength
and maxLength
cannot be used with format
.
String length constraints
Strings may be constrained to have minimum and/or maximum character lengths using the minLength
and maxLength
keywords.
Using minLength
:
{
"type": "object",
"properties": {
"password": {
"type": "string",
"minLength": 8,
"description": "Password must be at least 8 characters"
}
},
"required": ["password"]
}
{
"password": "secureP@ssw0rd"
}
Specifying a maximum number of characters with maxLength
:
{
"type": "object",
"properties": {
"username": {
"type": "string",
"maxLength": 20,
"description": "Username cannot exceed 20 characters."
}
},
"required": ["username"]
}
{
"username": "kilroy"
}
Note
String length constraints can increase compilation times, though they do not significantly impact runtime performance.
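As a rough illustration of what the two keywords mean, the sketch below checks a string against optional bounds. The function name `within_length` is a hypothetical helper for this guide. JSON Schema measures string length in characters rather than bytes, which matches what Python's `len` reports for `str` values.

```python
def within_length(value, min_length=None, max_length=None):
    """Check a string against optional minLength / maxLength bounds."""
    n = len(value)  # counts Unicode code points, not bytes
    if min_length is not None and n < min_length:
        return False
    if max_length is not None and n > max_length:
        return False
    return True

print(within_length("secureP@ssw0rd", min_length=8))
print(within_length("kilroy", max_length=20))
```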
Regular expressions with pattern
JSON strings can be constrained to follow a particular format determined by a regular expression through the use of the pattern
keyword. dotjson
supports most standard regular expression features; see Unsupported features for the exceptions.
Here is a simple example of pattern
usage to match a five-digit US zip code, or the extended nine-digit zip code:
{
"type": "object",
"properties": {
"zipCode": {
"type": "string",
"pattern": "[0-9]{5}(-[0-9]{4})?",
"description": "US ZIP code in 5-digit or 5+4 format"
}
},
"required": ["zipCode"]
}
{
"zipCode": "12345"
}
or
{
"zipCode": "12345-6789"
}
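The ZIP code pattern can be tried out locally with Python's `re` module. This sketch assumes the generated string must match the pattern in full, so it uses `re.fullmatch`; note that general-purpose JSON Schema validators usually treat `pattern` as an unanchored search, so add explicit `^` and `$` anchors if you also validate the output elsewhere. The helper name `is_zip` is illustrative.

```python
import re

ZIP_PATTERN = r"[0-9]{5}(-[0-9]{4})?"

def is_zip(value):
    # fullmatch requires the entire string to match the pattern
    return re.fullmatch(ZIP_PATTERN, value) is not None

for candidate in ["12345", "12345-6789", "1234", "12345-67"]:
    print(candidate, is_zip(candidate))
```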
Common text formats with format
The format
keyword allows you to specify that a string field must conform to a predefined, commonly used format. This is useful for ensuring that a model’s output text adheres to a standardized format, such as email, URI, UUID, dates, etc.
Supported format
values are:
- email: Email addresses (e.g., [email protected])
- hostname: Domain names (e.g., example.com)
- ipv4: IPv4 addresses (e.g., 192.168.1.1)
- uri: URIs (e.g., https://example.com/resource?param=value)
- uri-reference: URI references (e.g., https://example.com/resource?param=value)
- uuid: UUIDs (e.g., 123e4567-e89b-12d3-a456-426614174000)
- date: ISO8601 dates (e.g., 2023-04-15)
- time: ISO8601 times (e.g., 14:30:15Z)
- date-time: ISO8601 date-times (e.g., 2023-04-15T14:30:15Z)
- duration: ISO8601 durations (e.g., P1Y2M3DT4H5M6S)
Applying format
values is useful when model output must be consumed by standard tools. Using the uri
format ensures that the model’s output is a syntactically valid URI, though, as with any language model output, there is no guarantee that the URI is semantically correct (for example, that it points to a real resource).
The following schema demonstrates the use of all format
types available.
{
"type": "object",
"properties": {
"email": {
"type": "string",
"format": "email",
"description": "Email address"
},
"hostname": {
"type": "string",
"format": "hostname",
"description": "Domain name"
},
"ipv4": {
"type": "string",
"format": "ipv4",
"description": "IPv4 address"
},
"uri": {
"type": "string",
"format": "uri",
"description": "URI address"
},
"uri_reference": {
"type": "string",
"format": "uri-reference",
"description": "URI reference"
},
"uuid": {
"type": "string",
"format": "uuid",
"description": "Universally Unique Identifier"
},
"date": {
"type": "string",
"format": "date",
"description": "ISO8601 date"
},
"time": {
"type": "string",
"format": "time",
"description": "ISO8601 time"
},
"date_time": {
"type": "string",
"format": "date-time",
"description": "ISO8601 date-time"
},
"duration": {
"type": "string",
"format": "duration",
"description": "ISO8601 duration"
}
},
"required": ["email", "hostname", "ipv4", "uri", "uri_reference", "uuid", "date", "time", "date_time", "duration"]
}
{
"email": "[email protected]",
"hostname": "example.com",
"ipv4": "192.168.1.1",
"uri": "https://example.com/resource?param=value",
"uri_reference": "https://example.com/resource?param=value",
"uuid": "123e4567-e89b-12d3-a456-426614174000",
"date": "2023-04-15",
"time": "14:30:15Z",
"date_time": "2023-04-15T14:30:15Z",
"duration": "P1Y2M3DT4H5M6S"
}
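Several of these formats can be spot-checked downstream with the Python standard library. The sketch below is an approximation under stated assumptions: the `looks_like_*` helper names are hypothetical, and the standard-library parsers are not exact implementations of the JSON Schema format definitions (for example, `uuid.UUID` also accepts UUIDs written without hyphens).

```python
import ipaddress
import uuid
from datetime import date, datetime
from urllib.parse import urlparse

def _parses(parser, value):
    """Return True if parser(value) succeeds without raising ValueError."""
    try:
        parser(value)
        return True
    except ValueError:
        return False

def looks_like_ipv4(value):
    return _parses(ipaddress.IPv4Address, value)

def looks_like_uuid(value):
    return _parses(uuid.UUID, value)

def looks_like_date(value):
    return _parses(date.fromisoformat, value)

def looks_like_date_time(value):
    # Map a trailing "Z" to an explicit offset, which older Python versions require
    return _parses(datetime.fromisoformat, value.replace("Z", "+00:00"))

def looks_like_absolute_uri(value):
    parts = urlparse(value)
    return bool(parts.scheme and parts.netloc)

print(looks_like_uuid("123e4567-e89b-12d3-a456-426614174000"))
print(looks_like_date_time("2023-04-15T14:30:15Z"))
```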
Number & integer
Numeric values use types integer
for integer values, and number
for any numeric value. number
also includes integers.
A common use case for numeric types is extracting information from webpages, transcripts, or images.
{
"type": "object",
"properties": {
"product_id": {
"type": "integer",
"description": "Product identifier"
},
"price": {
"type": "number",
"description": "Current product price"
},
"name": {
"type": "string"
}
},
"required": ["product_id", "price", "name"]
}
{
"product_id": 1235,
"price": 79.99,
"name": "Wireless Headphones"
}
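When the output is parsed in Python, the integer/number distinction has two pitfalls: `bool` is a subclass of `int`, and a JSON value such as `3.0` parses to a `float`. The sketch below shows one way to classify parsed values. The helper names are illustrative, and treating `3.0` as an integer follows the JSON Schema 2020-12 definition; whether dotjson ever emits such a value for an integer field is not something this guide states.

```python
import json

def is_json_integer(value):
    """True for ints and for floats with no fractional part (e.g. 3.0)."""
    if isinstance(value, bool):  # bool is a subclass of int in Python
        return False
    if isinstance(value, int):
        return True
    return isinstance(value, float) and value.is_integer()

def is_json_number(value):
    """True for any int or float, excluding booleans."""
    return isinstance(value, (int, float)) and not isinstance(value, bool)

product = json.loads(
    '{"product_id": 1235, "price": 79.99, "name": "Wireless Headphones"}'
)
print(is_json_integer(product["product_id"]), is_json_number(product["price"]))
```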
Boolean
boolean
s are true
or false
values and are useful for simple binary classification tasks.
Let’s start with a simple complaint classifier that identifies whether a customer message is a complaint:
{
"type": "object",
"properties": {
"is_complaint": {
"type": "boolean"
}
},
"required": [
"is_complaint"
]
}
{
"is_complaint": false
}
This simple schema allows a language model to classify messages as complaints or not. In a real application, you might want to extend this with additional fields, as we’ll see in later examples.
Null
The null
type explicitly defines a field that must have a null value. Setting fields to null explicitly is not a common practice on its own, but null
is frequently used to define optional values in combination with other JSON schema features, such as anyOf
.
{
"type": "object",
"properties": {
"always_null": {
"type": "null"
}
},
"required": ["always_null"]
}
{
"always_null": null
}
null
becomes more useful when combined with other types to create optional fields or to represent the absence of a value.
Let’s extend our earlier complaint classifier to make it more practical. In addition to determining whether a message is a complaint, we’ll add an optional response field for customer service representatives to use:
{
"type": "object",
"properties": {
"is_complaint": {
"type": "boolean",
"description": "Whether the customer message is classified as a complaint"
},
"response_to_complaint": {
"anyOf": [
{"type": "string"},
{"type": "null"}
],
"default": null,
"description": "Optional response text for customer service to use if this is a complaint"
}
},
"required": [
"is_complaint"
]
}
{
"is_complaint": true,
"response_to_complaint": "I'm sorry to hear that you're having trouble with the Super Happy Fun Ball! Let's get you a refund right away."
}
Or for a non-complaint:
{
"is_complaint": false,
"response_to_complaint": null
}
Note
This example doesn’t implement conditional validation (which is not fully supported in dotjson
). A language model might generate inconsistent values - for example, setting is_complaint
to false but still providing a response. Your application should handle these potential inconsistencies, as in this Python code:
# data is the parsed model output, e.g. data = json.loads(model_output)
# response_to_complaint is optional, so read it with .get() to avoid a KeyError
response = data.get("response_to_complaint")
if data["is_complaint"] and response is not None:
    print(f"Response to complaint: {response}")
elif data["is_complaint"]:
    print("This is a complaint that needs a response")
else:
    print("Not a complaint")
Arrays
Arrays are used for collections of elements. Arrays can be specified in two ways:
- The items keyword, which defines a subschema shared by all elements in the array.
- The prefixItems keyword, which specifies the subschemas for the first N elements in the array.
Homogeneous arrays are specified using
{ "type": "array", "items": { "type": "number" } }
dotjson
supports:
- Item validation using the items keyword for homogeneous arrays.
- Tuple validation with prefixItems for ordered, mixed-type arrays.
- Array length enforcement using the minItems and maxItems keywords.
Specifying element type with items
Item validation is used to ensure that all elements in an array are of the same type, using the keyword items
.
All items in an item-validated array must be of the same type, e.g. [1, 2, 3]
. Mixed types like [1, "2", true]
are disallowed.
This schema requires the model to produce only an array of number
s:
{
"type": "object",
"properties": {
"some_numbers": {
"type": "array",
"items": {
"type": "number"
}
}
},
"required": ["some_numbers"]
}
{"some_numbers":[1,2,3,4,5]}
Here’s a more practical example schema designed to generate hashtags for a piece of text content. Note that pattern
is assigned the regular expression #[a-z]+
. All elements of the hashtags
array must begin with a #
and use only lowercase letters.
{
"type": "object",
"properties": {
"hashtags": {
"type": "array",
"items": {
"type": "string",
"pattern": "#[a-z]+"
}
}
},
"required": ["hashtags"]
}
{
"hashtags":["#gardening", "#plants", "#zombies"]
}
You can specify any valid JSON schema as the value of items
. Here’s an example of an array containing objects with contact information:
{
"type": "object",
"properties": {
"contacts": {
"type": "array",
"items": {
"type": "object",
"properties": {
"name": { "type": "string" },
"email": { "type": "string", "format": "email" },
"phone": { "type": "string" }
},
"required": ["name", "email"]
},
"description": "List of contact information"
}
},
"required": ["contacts"]
}
{
"contacts": [
{
"name": "Jane Smith",
"email": "[email protected]",
"phone": "555-1234"
},
{
"name": "John Doe",
"email": "[email protected]"
}
]
}
Array length constraints
Control the number of items in an array using minItems
and maxItems
:
- minItems: Minimum number of elements required in the array
- maxItems: Maximum number of elements allowed in the array
{
"type": "object",
"properties": {
"topFiveMovies": {
"type": "array",
"items": { "type": "string" },
"minItems": 1,
"maxItems": 5,
"description": "User's top 1-5 favorite movies"
}
},
"required": ["topFiveMovies"]
}
{
"topFiveMovies": ["The Matrix", "Inception", "Interstellar"]
}
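The combination of items with minItems and maxItems can be pictured as a length check followed by a per-element check. The sketch below is illustrative only; `check_array` and its `item_ok` callback are hypothetical names, not a dotjson API.

```python
def check_array(values, item_ok, min_items=0, max_items=None):
    """Check array length bounds, then apply item_ok to every element."""
    if not isinstance(values, list):
        return False
    if len(values) < min_items:
        return False
    if max_items is not None and len(values) > max_items:
        return False
    return all(item_ok(v) for v in values)

def is_string(value):
    return isinstance(value, str)

movies = ["The Matrix", "Inception", "Interstellar"]
print(check_array(movies, is_string, min_items=1, max_items=5))
```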
Object
Required properties
JSON object properties are not required by default, but can be made required by including the property name in the required
array. If a property is not included in required
, the model will choose whether or not to include it in the final output.
In this example, the model must produce a name and email, but can choose whether to include a phone number:
{
"type": "object",
"properties": {
"name": { "type": "string" },
"email": { "type": "string" },
"phone": { "type": "string" }
},
"required": ["name", "email"]
}
{
"name": "Jane Smith",
"email": "[email protected]"
}
or
{
"name": "Jane Smith",
"email": "[email protected]",
"phone": "(123) 456-7890"
}
Note
Property validation with additionalProperties
The additionalProperties
keyword defaults to false
, meaning language models cannot generate properties that are not explicitly defined in the schema.
additionalProperties: true
is not currently supported, and will result in an error.
For nested objects, you can specify required properties at each level independently. The required
array applies only to properties at the same level where it is defined.
This example shows a user profile schema with contact information as a nested object. In the main object, username
and contact
are required, while birthdate
is optional. Within the nested contact
object, email
is required, while phone
and address
are optional:
{
"type": "object",
"properties": {
"username": {
"type": "string",
"description": "User's login name"
},
"birthdate": {
"type": "string",
"format": "date",
"description": "User's date of birth"
},
"contact": {
"type": "object",
"properties": {
"email": {
"type": "string",
"format": "email",
"description": "Primary contact email"
},
"phone": {
"type": "string",
"description": "Contact phone number"
},
"address": {
"type": "object",
"properties": {
"street": { "type": "string" },
"city": { "type": "string" },
"country": { "type": "string" }
},
"required": ["street", "city"],
"description": "Physical address (street and city required if address is provided)"
}
},
"required": ["email"],
"description": "Contact information (email required)"
}
},
"required": ["username", "contact"]
}
{
"username": "jsmith2024",
"contact": {
"email": "[email protected]",
"address": {
"street": "123 Main Street",
"city": "Springfield"
}
}
}
With optional fields included:
{
"username": "jsmith2024",
"birthdate": "1990-05-15",
"contact": {
"email": "[email protected]",
"phone": "555-123-4567",
"address": {
"street": "123 Main Street",
"city": "Springfield",
"country": "United States"
}
}
}
In this example:
- At the top level, only username and contact must be included
- Within the contact object, only email is required
- If an address is provided, it must include both street and city properties
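The level-by-level behaviour of required can be sketched as a recursive walk that reports which required properties are absent at each depth. This is a simplified illustration: `missing_required` is a hypothetical helper, the schema is a trimmed version of the profile schema, and only nested objects are followed (not arrays or references).

```python
def missing_required(schema, instance, path=""):
    """Collect dotted paths of required properties absent from instance."""
    missing = []
    for name in schema.get("required", []):
        if name not in instance:
            missing.append(f"{path}{name}")
    for name, subschema in schema.get("properties", {}).items():
        value = instance.get(name)
        # Only descend into nested objects that are actually present
        if subschema.get("type") == "object" and isinstance(value, dict):
            missing += missing_required(subschema, value, f"{path}{name}.")
    return missing

schema = {
    "type": "object",
    "properties": {
        "username": {"type": "string"},
        "contact": {
            "type": "object",
            "properties": {
                "email": {"type": "string"},
                "address": {
                    "type": "object",
                    "properties": {
                        "street": {"type": "string"},
                        "city": {"type": "string"},
                    },
                    "required": ["street", "city"],
                },
            },
            "required": ["email"],
        },
    },
    "required": ["username", "contact"],
}

profile = {"username": "jsmith2024", "contact": {"address": {"street": "123 Main Street"}}}
print(missing_required(schema, profile))
```

An absent optional object such as address contributes nothing, because its own required list only applies once the object is present.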
Enumerated values with enum
The enum
keyword restricts a value to a predefined set of options. It can be used with any JSON data type – strings, numbers, booleans, or even objects and arrays. When enum
is used, the model can only generate one of the specified values.
This is particularly useful for:
- Forcing the model to choose from a specific set of categories
- Ensuring standardized responses
- Creating controlled vocabularies
- Building classification or tagging systems
Here’s a simple example that forces the model to classify a product into one of three categories:
{
"type": "object",
"properties": {
"product_category": {
"type": "string",
"enum": ["electronics", "clothing", "home_goods"],
"description": "The category this product belongs to"
},
"product_name": {
"type": "string",
"description": "Name of the product"
}
},
"required": ["product_category", "product_name"]
}
{
"product_category": "electronics",
"product_name": "Wireless Headphones"
}
You can also use enum
with numeric values:
{
"type": "object",
"properties": {
"priority": {
"type": "integer",
"enum": [1, 2, 3, 5, 8],
"description": "Task priority using Fibonacci sequence"
},
"task": {
"type": "string",
"description": "Description of the task"
}
},
"required": ["priority", "task"]
}
{
"priority": 3,
"task": "Update user documentation"
}
Practical enum applications
Enums are particularly useful for building structured classification systems. This example demonstrates a more complex feedback categorization system:
{
"type": "object",
"properties": {
"feedback_type": {
"type": "string",
"enum": ["bug_report", "feature_request", "compliment", "complaint", "question"],
"description": "The category of user feedback"
},
"severity": {
"type": "string",
"enum": ["critical", "high", "medium", "low"],
"description": "How severe or important the feedback is"
},
"product_area": {
"type": "string",
"enum": ["ui", "performance", "security", "documentation", "billing", "other"],
"description": "The area of the product this feedback relates to"
},
"message": {
"type": "string",
"description": "The actual feedback message from the user"
},
"suggested_action": {
"type": "string",
"description": "Suggested next steps based on the feedback"
}
},
"required": ["feedback_type", "severity", "product_area", "message", "suggested_action"]
}
{
"feedback_type": "bug_report",
"severity": "high",
"product_area": "performance",
"message": "The application becomes extremely slow after uploading more than 5 images at once.",
"suggested_action": "Investigate image processing queue and implement batch processing with progress indicators."
}
Tip
Communicate available enum options to the model when using enums. You can inform the model directly by providing a list of options, or by describing the options in such a way that the model can infer the options.
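One way to follow this tip is to derive the option list from the schema itself, so the prompt and the schema cannot drift apart. The sketch below is a minimal illustration; `describe_enums`, the prompt wording, and the trimmed schema are assumptions made for this example.

```python
def describe_enums(schema):
    """Build one prompt line per enum-constrained top-level property."""
    lines = []
    for name, subschema in schema.get("properties", {}).items():
        if "enum" in subschema:
            options = ", ".join(str(option) for option in subschema["enum"])
            lines.append(f"{name}: choose one of {options}")
    return "\n".join(lines)

schema = {
    "type": "object",
    "properties": {
        "feedback_type": {
            "type": "string",
            "enum": ["bug_report", "feature_request", "compliment", "complaint", "question"],
        },
        "severity": {"type": "string", "enum": ["critical", "high", "medium", "low"]},
        "message": {"type": "string"},
    },
}

prompt = "Classify the feedback below.\n" + describe_enums(schema)
print(prompt)
```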
Constant values with const
The const
keyword allows you to specify an exact value that must be generated. When const
is used, the model can only generate tokens consistent with the specified value. This applies to any JSON data type - strings, numbers, booleans, objects, or arrays.
const
is particularly useful when:
- You don’t want the model to freely generate certain values (like IDs, usernames, or other fixed data)
- You want to simplify your data model by showing the model all your data and asking it to fill in only specific fields
- You want to provide context to the model without including it in the prompt
String constants
Here’s an example where we use const
to pin a UUID and genre to known values, and allow the model to generate a story:
{
"type": "object",
"properties": {
"id": {
"type": "string",
"const": "e29d2712-8e93-4486-b8a9-d99e84f3dd6b"
},
"genre":{
"type":"string",
"const": "science fiction"
},
"story": {
"type": "string",
"minLength": 20,
"description": "A short story generated by the language model."
}
},
"required": ["id", "story", "genre"]
}
{
"id": "e29d2712-8e93-4486-b8a9-d99e84f3dd6b",
"genre": "science fiction",
"story": "Once upon a time, there was a very depressed robot named Marvin."
}
Tip
In this example, setting "genre": "science fiction"
as a constant provides thematic guidance to the model. This is a powerful technique because:
- It ensures consistent metadata in your output (the ID and genre never change)
- It allows you to guide the content generation without putting those instructions in your prompt
- The model recognizes the genre constraint and produces content that fits that theme
- You can easily change the genre constant to get different themed content without changing your prompt
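Because the constants usually change per request, it can be convenient to keep a template schema and fill in the const values in code. The following sketch assumes a hypothetical helper named `pin_constants`; it copies the template so the original stays reusable and raises a KeyError if a named property does not exist.

```python
import copy

def pin_constants(schema, constants):
    """Return a copy of schema with const set on the named properties."""
    pinned = copy.deepcopy(schema)
    for name, value in constants.items():
        pinned["properties"][name]["const"] = value
    return pinned

template = {
    "type": "object",
    "properties": {
        "id": {"type": "string"},
        "genre": {"type": "string"},
        "story": {"type": "string", "minLength": 20},
    },
    "required": ["id", "story", "genre"],
}

schema = pin_constants(
    template,
    {"id": "e29d2712-8e93-4486-b8a9-d99e84f3dd6b", "genre": "science fiction"},
)
print(schema["properties"]["genre"])
```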
Object constants
The const
keyword can also be applied to objects and arrays. This is useful when you want to fix certain structural elements while allowing the model to generate others.
Here’s an example of a product configuration where the available options are fixed, but the user preferences are generated by the model:
{
"type": "object",
"properties": {
"product_info": {
"type": "object",
"const": {
"name": "Smart Speaker",
"model": "Echo-2023",
"available_colors": ["black", "white", "blue"],
"available_features": ["voice_control", "music_streaming", "smart_home"]
}
},
"model_selection": {
"type": "object",
"properties": {
"color": {
"type": "string",
"enum": ["black", "white", "blue"],
"description": "Model's color preference"
},
"selected_features": {
"type": "array",
"items": {
"type": "string",
"enum": ["voice_control", "music_streaming", "smart_home"]
},
"description": "Features the model wants to enable"
},
"quantity": {
"type": "integer",
"minimum": 1,
"description": "Number of units to purchase"
}
},
"required": ["color", "selected_features", "quantity"]
}
},
"required": ["product_info", "model_selection"]
}
{
"product_info": {
"name": "Smart Speaker",
"model": "Echo-2023",
"available_colors": ["black", "white", "blue"],
"available_features": ["voice_control", "music_streaming", "smart_home"]
},
"model_selection": {
"color": "blue",
"selected_features": ["voice_control", "smart_home"],
"quantity": 2
}
}
In this example, the product_info
object is entirely fixed with const
, while the model generates the model_selection
object based on the schema constraints. This pattern is useful for scenarios where you want to provide the model with fixed reference information while it generates related content.
Schema combinations
References and schema reuse with $defs
References allow you to define a common structure once and reuse it throughout your schema, which is particularly valuable for large, complex schemas. Their advantages include:
- Improved readability
- Consistency across the schema
- Modular design
The $defs
keyword at the root level of the schema creates a library of reusable schema definitions, while the $ref
keyword references those definitions in the primary schema.
dotjson
only supports references to definitions within the same schema, not external references such as file://
or http://
. The reference #
references the root of the schema, and subschemas can be referenced using #/$defs/<name>
.
Here’s a minimal example of reference syntax:
{
"type": "object",
"properties": {
"user": { "$ref": "#/$defs/person" }
},
"required": ["user"],
"$defs": {
"person": {
"type": "object",
"properties": {
"name": { "type": "string" },
"age": { "type": "integer" }
}
}
}
}
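To show how the two same-document reference forms map onto the schema, here is a small sketch of a resolver. `resolve_ref` is a hypothetical helper for illustration and deliberately simple: it handles "#" and paths such as "#/$defs/person", but not JSON Pointer escape sequences or array indices.

```python
def resolve_ref(root, ref):
    """Resolve a same-document reference such as "#" or "#/$defs/person"."""
    if not ref.startswith("#"):
        raise ValueError(f"only same-document references are handled: {ref}")
    node = root
    for part in ref[1:].split("/"):
        if part:  # skip the empty segment before the first "/"
            node = node[part]
    return node

schema = {
    "type": "object",
    "properties": {"user": {"$ref": "#/$defs/person"}},
    "required": ["user"],
    "$defs": {
        "person": {
            "type": "object",
            "properties": {"name": {"type": "string"}, "age": {"type": "integer"}},
        }
    },
}

target = resolve_ref(schema, schema["properties"]["user"]["$ref"])
print(sorted(target["properties"]))
```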
Here’s a more practical example that demonstrates schema reuse. We’ll define an address
structure once and reference it for both billing and shipping addresses, avoiding redundant definitions:
{
"type": "object",
"properties": {
"billing_address": { "$ref": "#/$defs/address" },
"shipping_address": { "$ref": "#/$defs/address" }
},
"$defs": {
"address": {
"type": "object",
"properties": {
"street": { "type": "string" },
"city": { "type": "string" },
"state": { "type": "string" },
"zip": { "type": "string" }
},
"required": ["street", "city", "state", "zip"]
}
},
"required": ["billing_address"]
}
Here is an example of a schema that should use references but does not. Note the repeated definition of the same schema.
{
"type": "object",
"properties": {
"billing_address": {
"type": "object",
"properties": {
"street": { "type": "string" },
"city": { "type": "string" },
"state": { "type": "string" },
"zip": { "type": "string" }
},
"required": ["street", "city", "state", "zip"]
},
"shipping_address": {
"type": "object",
"properties": {
"street": { "type": "string" },
"city": { "type": "string" },
"state": { "type": "string" },
"zip": { "type": "string" }
},
"required": ["street", "city", "state", "zip"]
}
},
"required": ["billing_address"]
}
{
"billing_address": {
"street": "123 Main St",
"city": "Anytown",
"state": "CA",
"zip": "12345"
},
"shipping_address": {
"street": "456 Market St",
"city": "Somewhere",
"state": "NY",
"zip": "67890"
}
}
Without references, you would need to duplicate the entire address structure definition, making the schema harder to maintain. With references, if you later needed to add a field like country
to all addresses, you would only need to update the shared address definition in $defs.
References can also be nested to create more complex hierarchical structures. Here, we define both a person
schema and an address
schema, with person
referencing address
:
{
"type": "object",
"properties": {
"people": {
"type": "array",
"items": { "$ref": "#/$defs/person" }
}
},
"required": ["people"],
"$defs": {
"person": {
"type": "object",
"properties": {
"firstName": { "type": "string" },
"lastName": { "type": "string" },
"address": { "$ref": "#/$defs/address" }
},
"required": ["firstName", "lastName", "address"]
},
"address": {
"type": "object",
"properties": {
"street": { "type": "string" },
"city": { "type": "string" },
"country": { "type": "string" }
},
"required": ["street", "city", "country"]
}
}
}
{
"people": [
{
"firstName": "Elizabeth",
"lastName": "Hodges",
"address": {
"street": "1479 Harbor Oaks Drive",
"city": "Los Angeles",
"country": "United States"
}
},
{
"firstName": "Henry",
"lastName": "Wicket",
"address": {
"street": "4628 Summerfield Place",
"city": "Fargo",
"country": "United States"
}
}
]
}
Recursive schemas
dotjson
supports recursive, self-referencing schemas of unlimited depth. See the JSON Schema guide for more information on recursion.
A simple example is an object with a string name
field and a children
field containing an array of objects, also with name
and children
fields. We denote the recursion with "items": { "$ref": "#" }
, which means that all items in the children
array must follow the same schema as the root schema.
{
"type": "object",
"properties": {
"name": { "type": "string" },
"children": {
"type": "array",
"items": { "$ref": "#" }
}
},
"required": ["name"]
}
{
"name": "Root",
"children": [
{
"name": "Child 1",
"children": [
{
"name": "Grandchild 1",
"children": []
}
]
},
{
"name": "Child 2"
}
]
}
Recursive schemas are useful for using language models to generate hierarchical structures. This can include modeling/extracting organization charts, product taxonomies, navigation menus, knowledge graphs, etc.
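Output that follows a recursive schema is naturally processed with recursive code. The sketch below walks the name/children example above; the function names `tree_depth` and `all_names` are illustrative.

```python
def tree_depth(node):
    """Number of levels in a name/children tree; a leaf has depth 1."""
    children = node.get("children", [])
    return 1 + max((tree_depth(child) for child in children), default=0)

def all_names(node):
    """Yield every name in the tree, parents before children."""
    yield node["name"]
    for child in node.get("children", []):
        yield from all_names(child)

tree = {
    "name": "Root",
    "children": [
        {"name": "Child 1", "children": [{"name": "Grandchild 1", "children": []}]},
        {"name": "Child 2"},
    ],
}

print(tree_depth(tree))
print(list(all_names(tree)))
```

Using `.get("children", [])` matters because children is not a required property, so leaf nodes may omit it entirely.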
The following example demonstrates a recursive schema for modeling knowledge graphs of arbitrary size and depth. Each node in the graph represents a concept with a name, definition, and list of related concepts. The recursive nature of the schema means that each related concept follows the same structure, allowing for a rich network of interconnected concepts and definitions.
{
"type": "object",
"properties": {
"concept": {
"type": "string",
"description": "The main concept or topic"
},
"definition": {
"type": "string",
"description": "Brief definition of the concept"
},
"related_concepts": {
"type": "array",
"items": {
"$ref": "#"
},
"description": "Related sub-concepts that help explain the main concept"
}
},
"required": ["concept", "definition"]
}
{
"concept": "Artificial Intelligence (AI)",
"definition": "AI is the simulation of human intelligence in machines that are programmed to think and learn like humans do, and to perform tasks that typically require human intelligence to complete.",
"related_concepts": [
{
"concept": "Machine Learning (ML)",
"definition": "A subset of AI that involves the development of algorithms and statistical models that enable computers to perform a specific task without explicit instructions.",
"related_concepts": [
{
"concept": "Supervised Learning",
"definition": "A type of machine learning where a model is trained on labeled data to learn and predict outcomes."
},
{
"concept": "Unsupervised Learning",
"definition": "A type of machine learning where a model is trained on data without labels to identify patterns and groupings within the data."
}
]
},
{
"concept": "Knowledge Representation",
"definition": "The encoding of knowledge in a form that can be easily understood and processed by AI.",
"related_concepts": [
{
"concept": "Semantic Networks",
"definition": "Graphs used to represent knowledge using nodes and edges to show the relationships between concepts."
},
{
"concept": "Ontologies",
"definition": "Formal representations of knowledge describing a set of concepts and the relationships between them."
}
]
}
]
}
Note
Complex recursive schemas can be slow to compile. Consider caching the compiled index so that the compilation cost is paid once and amortized across requests.
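A caching layer for compiled schemas might look like the sketch below. It is deliberately library-agnostic: `compile_fn` is a caller-supplied placeholder for whatever expensive compilation step your setup uses, and neither `CompiledSchemaCache` nor `fake_compile` is a dotjson API. Property order is kept as part of the cache key, on the cautious assumption that it may affect generation.

```python
import hashlib
import json

class CompiledSchemaCache:
    """Cache compiled artifacts keyed by a hash of the serialized schema."""

    def __init__(self, compile_fn):
        # compile_fn stands in for the expensive compilation step
        self._compile_fn = compile_fn
        self._cache = {}

    def get(self, schema):
        serialized = json.dumps(schema, separators=(",", ":"))
        key = hashlib.sha256(serialized.encode("utf-8")).hexdigest()
        if key not in self._cache:
            self._cache[key] = self._compile_fn(schema)
        return self._cache[key]

calls = []

def fake_compile(schema):
    # Stand-in that records each call so cache hits are visible
    calls.append(schema)
    return {"compiled_from": schema}

cache = CompiledSchemaCache(fake_compile)
schema = {"type": "object", "properties": {"name": {"type": "string"}}}
first = cache.get(schema)
second = cache.get(schema)
print(len(calls))
```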
Choosing a subschema with anyOf
anyOf
allows the model to choose from one of the available subschemas. anyOf
is useful when you want to define several “paths” for the model to respond in.
Here is an example of a schema that requires a language model to respond as if it were a web server, returning a status code (200, 400, 404, etc.) together with a matching payload. Schemas like this are useful when the model manages an application's internals and sends messages directly back to the caller.
{
"type": "object",
"properties": {
"response": {
"anyOf": [
{
"type": "object",
"properties": {
"success": { "type": "boolean", "const": true },
"status": { "type": "integer", "enum": [200] },
"data": {
"type": "object",
"properties": {
"id": { "type": "string" },
"name": { "type": "string" },
"timestamp": { "type": "string" }
},
"required": ["id", "name", "timestamp"]
}
},
"required": ["success", "status", "data"]
},
{
"type": "object",
"properties": {
"success": { "type": "boolean", "const": false },
"status": { "type": "integer", "enum": [400, 401, 403, 404, 500] },
"error": {
"type": "object",
"properties": {
"code": { "type": "string" },
"message": { "type": "string" },
"details": { "type": "string" }
},
"required": ["code", "message"]
}
},
"required": ["success", "error"]
},
{
"type": "object",
"properties": {
"success": { "type": "boolean", "const": false },
"status": { "type": "integer", "enum": [301, 302] },
"redirect": {
"type": "object",
"properties": {
"url": { "type": "string" },
"temporary": { "type": "boolean" }
},
"required": ["url", "temporary"]
}
},
"required": ["success", "redirect"]
}
]
}
},
"required": ["response"]
}
Successful response:
{
"response": {
"success": true,
"status": 200,
"data": {
"id": "user_12345",
"name": "John Doe",
"timestamp": "2025-03-20T14:32:10Z"
}
}
}
Code 400 response:
{
"response": {
"success": false,
"status": 400,
"error": {
"code": "INVALID_INPUT",
"message": "The provided email address is invalid",
"details": "Email must be in the format [email protected]"
}
}
}
Redirect response:
{
"response": {
"success": false,
"status": 302,
"redirect": {
"url": "https://api.example.com/v2/resources",
"temporary": true
}
}
}
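On the application side, code consuming this output needs to work out which anyOf branch the model followed. A minimal sketch, assuming the three branches are distinguished by their payload key (data, error, or redirect); the function name `response_kind` is illustrative.

```python
def response_kind(payload):
    """Identify which anyOf branch a parsed response object followed."""
    response = payload["response"]
    for key in ("data", "error", "redirect"):
        if key in response:
            return key
    raise ValueError("response matches none of the expected branches")

redirect = {
    "response": {
        "success": False,
        "status": 302,
        "redirect": {"url": "https://api.example.com/v2/resources", "temporary": True},
    }
}
print(response_kind(redirect))
```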
Real-world examples
Sentiment analysis
{
"type": "object",
"properties": {
"sentiment": {
"type": "string",
"enum": ["positive", "neutral", "negative"],
"description": "The classified sentiment category"
},
"confidence": {
"type": "number",
"minimum": 0,
"maximum": 1,
"description": "Confidence score of the sentiment classification"
},
"analysis": {
"type": "string",
"description": "Detailed analysis explaining the sentiment classification"
}
},
"required": ["sentiment", "confidence", "analysis"]
}
{
"sentiment": "positive",
"confidence": 0.87,
"analysis": "The text contains multiple positive expressions and enthusiastic language, with no significant negative elements."
}
Tip
For language model applications, consider using a combination of enum
for categorizable outputs and free-form strings for explanations and analysis.
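Because numeric constraints such as minimum and maximum appear in the list of features that are not fully supported, an application may want to re-check the confidence range itself. The sketch below is one possible approach; `SentimentResult` and `parse_sentiment` are hypothetical names introduced for this example.

```python
from dataclasses import dataclass

SENTIMENTS = ("positive", "neutral", "negative")

@dataclass
class SentimentResult:
    sentiment: str
    confidence: float
    analysis: str

def parse_sentiment(obj):
    """Build a SentimentResult, re-checking the enum and the 0..1 range."""
    if obj["sentiment"] not in SENTIMENTS:
        raise ValueError(f"unexpected sentiment: {obj['sentiment']}")
    confidence = float(obj["confidence"])
    if not 0.0 <= confidence <= 1.0:
        raise ValueError(f"confidence out of range: {confidence}")
    return SentimentResult(obj["sentiment"], confidence, obj["analysis"])

result = parse_sentiment(
    {"sentiment": "positive", "confidence": 0.87, "analysis": "Mostly enthusiastic language."}
)
print(result)
```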
Unsupported features
The following JSON Schema features are not fully supported in dotjson
:
- Conditional validation (if-then-else), dependent properties and schemas
- patternProperties
- Numeric constraints (e.g. multipleOf, maximum, minimum, exclusiveMaximum, exclusiveMinimum)
- Unique items in arrays
- Schema composition with oneOf, allOf and not
Certain regular expression patterns are not supported:
- \b for word boundaries
- Backreferences, e.g. \1 or (?P=open)
- Conditional matches, e.g. (?(1)a|b)
- Lookaheads, e.g. (?=bar) or foo(?=bar)
- Lookbehinds, e.g. (?<=foo)bar
- Atomic groups, e.g. (?>pattern)
- Recursion, e.g. (?R) or (?1)
- Non-capturing groups, e.g. (?:pattern)
- Named captures, e.g. (?P<name>pattern)
- Inline modifiers, e.g. (?i)case-insensitive
- Subroutines, e.g. \g<1>
- Branch resets, e.g. (?|pattern1|pattern2)
- Inline comments, e.g. (?#comment)
- Code callouts, e.g. (*MARK:name)
- Version checks, e.g. (*VERSION)
- Whitespace-insensitive patterns, e.g. (?x)pattern # comment
Warning
Using these unsupported features may result in unexpected behavior or validation errors.
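A quick pre-flight check can catch some of these constructs before a schema is submitted. The sketch below is a heuristic, not a regex parser: `find_unsupported` is a hypothetical helper, it covers only a subset of the list above, and it can produce false positives (for example, on an escaped backslash followed by the letter b).

```python
import re

# Heuristic probes for a few of the unsupported constructs listed above
UNSUPPORTED = {
    "word boundary": r"\\b",
    "backreference": r"\\[1-9]|\(\?P=",
    "lookahead": r"\(\?=",
    "lookbehind": r"\(\?<=",
    "non-capturing group": r"\(\?:",
    "named capture": r"\(\?P<",
    "inline modifier": r"\(\?[ix]+\)",
    "atomic group": r"\(\?>",
}

def find_unsupported(pattern):
    """Return the names of unsupported constructs that appear in pattern."""
    return [name for name, probe in UNSUPPORTED.items() if re.search(probe, pattern)]

print(find_unsupported(r"[0-9]{5}(-[0-9]{4})?"))
print(find_unsupported(r"\bfoo(?=bar)"))
```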