Example
This page provides a complete working example of how to use dotjson
to constrain an LLM to generate valid JSON.
Depending on your installation method, you may need to adjust the include paths. The example below shows both system-wide and project-local include options, assuming you’ve followed the installation guide.
This example demonstrates:
- Creating a Vocabulary for gpt2
- Building an Index for a simple JSON schema
- Using the LogitsProcessor to mask out invalid tokens during generation
- Simulating an LLM’s token generation with random logits
- Performing weighted sampling from the valid tokens
- Continuing generation until either the EOS token is generated or a maximum token limit is reached
This example is a standalone implementation without an actual LLM backend. It uses random values to simulate token logits. In a real implementation, these would come from your language model’s forward pass.
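To show where a real backend would plug in, here is a minimal sketch of a single generation step in which the random logits are replaced by a model forward pass. The names fill_logits_from_model and generation_step are hypothetical placeholders for your own inference code, not part of dotjson; the dotjson calls themselves mirror the full listing below.
// Sketch only: one generation step with a real model in place of the random
// logits used in example.cpp below.
#include <dottxt/dotjson/dotjson.hpp>
#include <dottxt/dotjson/cxx.h>

#include <random>
#include <span>
#include <vector>

// Hypothetical placeholder for your inference backend: run a forward pass on
// `context` and write one logit score per vocabulary entry into `logits`.
// Here it just writes a constant so the sketch compiles.
void fill_logits_from_model(const std::vector<u_int32_t>& context,
                            std::span<u_int16_t> logits) {
    (void)context;
    for (u_int16_t& logit : logits) {
        logit = 1;
    }
}

// One generation step: compute logits, mask them against the schema, sample a
// token, and advance the guide. Mirrors one iteration of the loop below.
u_int32_t generation_step(dotjson::LogitsProcessor& processor,
                          dotjson::Guide& guide,
                          dotjson::BatchedTokenSet& allowed_tokens,
                          std::vector<std::span<u_int16_t>>& logits,
                          std::vector<u_int32_t>& context,
                          std::mt19937& gen) {
    fill_logits_from_model(context, logits[0]);          // real forward pass goes here
    processor(logits, allowed_tokens);                   // mask schema-invalid tokens
    std::discrete_distribution<> distrib(logits[0].begin(), logits[0].end());
    u_int32_t sampled_token_id = distrib(gen);           // weighted sampling
    context.push_back(sampled_token_id);
    allowed_tokens = guide.get_next_tokensets({sampled_token_id});
    return sampled_token_id;
}
Note that this sketch keeps the u_int16_t score type used throughout the example; with a real model you would map its floating-point logits onto whatever score representation your pipeline passes to the processor.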
Example code
Save the following code as example.cpp.
example.cpp
// For system-wide installation
#include <dottxt/dotjson/dotjson.hpp>
#include <dottxt/dotjson/cxx.h>

// For project-local installation, use these instead:
// #include "dottxt/dotjson/include/dotjson.hpp"
// #include "dottxt/dotjson/include/cxx.h"

#include <algorithm>
#include <iostream>
#include <limits>
#include <random>
#include <span>
#include <string>
#include <vector>

int main() {
    // The vocabulary for gpt2 has 50257 tokens
    std::string model = "gpt2";
    u_int32_t eos_token_id = 50256; // EOS token, used to check whether sampling is complete

    // The schema is a simple object with one integer property
    std::string schema =
        "{\"type\":\"object\",\"properties\":{\"x\":{\"type\":\"integer\"}}}";

    // Create the vocabulary and index
    dotjson::Vocabulary vocabulary(model);
    dotjson::Index index(schema, vocabulary);

    // The mask value is 0, batch size is 1
    u_int16_t mask_value = 0;
    u_int32_t batch_size = 1;
    u_int32_t vocab_size = vocabulary.max_token_id() + 1;

    // Set up random number generation, used to simulate language model
    // logit calculation.
    std::random_device rd;
    std::mt19937 gen(rd());

    // Since we don't have an LLM in this example, we'll use randomly
    // assigned logit scores between 1 and 100. Note that 0 is excluded from
    // this range, as our mask_value is 0.
    std::uniform_int_distribution<u_int16_t> logits_distrib(1, 100);

    // Create the logits buffer, initialized with random numbers
    std::vector<u_int16_t> logits_base(vocab_size);
    for (u_int16_t& logit : logits_base) {
        logit = logits_distrib(gen);
    }
    std::span<u_int16_t> logits_span(logits_base.data(), logits_base.size());
    std::vector<std::span<u_int16_t>> logits = {logits_span};

    // Construct a guide
    dotjson::Guide guide(index, batch_size);

    // Create the set of initially permitted tokens
    dotjson::BatchedTokenSet allowed_tokens = guide.get_start_tokensets();

    // Construct the processor, a function that masks out invalid tokens
    dotjson::LogitsProcessor processor(guide, mask_value);

    // Initialize the sequence of previously generated tokens
    std::vector<std::vector<u_int32_t>> context(batch_size);

    //
    // The following loop will:
    //
    // 1. Randomly assign logits to all tokens (to simulate an LLM forward pass)
    // 2. Mask out tokens inconsistent with the schema by setting their values
    //    to mask_value (0 in this case)
    // 3. Select a token randomly, proportional to its logit
    // 4. Add that token to the context
    // 5. Check if the new token is the EOS token -- if so, stop sampling
    //

    // Set a maximum number of tokens to sample
    const int max_tokens_to_sample = 100;

    // Begin sampling loop
    for (int i = 0; i < max_tokens_to_sample; ++i) {
        // Reset logits_base with new random values before processing
        for (u_int16_t& logit : logits_base) {
            logit = logits_distrib(gen);
        }

        // Mask out any tokens that are inconsistent with the schema
        processor(logits, allowed_tokens);

        // Sample proportionally from the logits vector using
        // std::discrete_distribution; the sampled index is the token id.
        std::discrete_distribution<> distrib(logits_base.begin(), logits_base.end());
        u_int32_t sampled_token_id = distrib(gen);

        // Add the sampled token to the context for the current batch entry (0)
        context[0].push_back(sampled_token_id);

        // Update the set of allowed tokens for the next iteration
        allowed_tokens = guide.get_next_tokensets({sampled_token_id});

        // Check if the sampled token is the EOS token; if so, exit.
        if (sampled_token_id == eos_token_id) {
            std::cout << "EOS token (" << eos_token_id
                      << ") sampled. Stopping generation." << std::endl;
            break;
        }
    }

    std::cout << "Final tokens: ";
    for (const auto& token : context[0]) {
        std::cout << token << " ";
    }
    std::cout << std::endl;
}
Compilation
To compile the example, run the command that matches your installation method.
For a system-wide installation:
g++ -o example example.cpp -ldotjson -ldotjsoncpp -Wall -Wextra -std=c++20 -O3
For a project-local installation:
g++ -o example example.cpp -I. -L./dottxt/dotjson/lib -ldotjson -ldotjsoncpp -Wall -Wextra -std=c++20 -O3 -Wl,-rpath,\$ORIGIN/dottxt/dotjson/lib
Execution
To execute the example, run the following command:
./example
You should see output similar to the following:
EOS token (50256) sampled. Stopping generation.
Final tokens: 90 197 628 198 92 628 197 198 201 197 201 628 628 197 198 197 198 50256