API Reference
Namespaces
dotjson
Main namespace containing the public API.
Classes
BatchedTokenSet
An opaque class containing the information needed by the dotjson
LogitsProcessor to efficiently mask a batch of logits vectors.
Constructors
explicit BatchedTokenSet(rust::Box<internal::BatchedTokenSet> token_set);
Methods
std::vector<bool> contains(u_int32_t token_id);
- Description: Returns a vector of booleans indicating whether a given token is marked as allowed in each token set
- Parameters:
  - token_id: The token id to check
- Returns: A vector of booleans with length equal to the batch size, where each element indicates whether the token is allowed in the corresponding batch element
[!TIP]
This can be used to stop early if the EOS token is available in the set of allowed tokens. For example, if the EOS token has ID 42, you can use

```cpp
if (batched_token_set.contains(42)[0]) {
  // EOS token is allowed in the first batch element; stop this sequence early
}
```

Early stopping may not be appropriate for all use cases. Typically, you will want to stop when the model has sampled the EOS token, not when it is first available.
std::vector<std::size_t> num_allowed();
- Description: Returns a vector of length batch_size containing the number of allowed tokens in each token set
- Returns: A vector of integers with length equal to the batch size, where each element indicates the number of allowed tokens in the corresponding batch element
[!TIP]
num_allowed can be useful for skipping model forward passes: if only one token is allowed in a batch element, you can sample that token directly without evaluating the model.
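This shortcut can be sketched without the library. Assuming one batch element's allowed-token mask is available as a `std::vector<bool>` (a hypothetical stand-in for querying a BatchedTokenSet per token id), the forced token can be detected like this:

```cpp
#include <cstdint>
#include <optional>
#include <vector>

// If exactly one token is allowed for a batch element, that token can be
// emitted directly, skipping the model forward pass. `allowed` is a
// hypothetical stand-in for one batch element's allowed-token mask.
std::optional<uint32_t> forced_token(const std::vector<bool> &allowed) {
    std::optional<uint32_t> only;
    for (uint32_t id = 0; id < allowed.size(); ++id) {
        if (allowed[id]) {
            if (only) return std::nullopt; // more than one token is allowed
            only = id;
        }
    }
    return only; // empty if no token is allowed at all
}
```

In a generation loop you would call num_allowed() first and only take this path when it reports exactly one allowed token for the element.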
Vocabulary
A vocabulary for constructing an index.
Constructors
Vocabulary(
std::string &model,
const std::string &revision = "",
const std::vector<std::pair<std::string, std::string>> &user_agent = {},
const std::string &auth_token = "");
- Description: Constructs a serializable vocabulary object for a given tokenizer
- Parameters:
  - model: The name of the tokenizer in the Hugging Face hub
  - revision: (default: "") The specific revision of the tokenizer on the hf-hub
  - user_agent: (default: {}) The user agent info in the form of a dictionary or a single string. It will be completed with information about the installed packages.
  - auth_token: (default: "") A valid user access token for the hf-hub.
- Throws: This will throw a std::exception if the vocabulary cannot be built. This will typically happen in one of two cases: the tokenizer could not be found in the Hugging Face hub, or the tokenizer has unsupported features.
Vocabulary(std::vector<std::pair<std::string, u_int32_t>> &dictionary,
u_int32_t eos_token_id);
- Description: Constructs a serializable vocabulary object from a set of (Token, TokenId) pairs
- Parameters:
  - dictionary: The pairs of (Token, TokenId).
  - eos_token_id: The token id of the end-of-string token. This must NOT be present in the dictionary.
- Throws: This will throw a std::exception if the vocabulary cannot be built. This will typically happen if the provided dictionary contains the EOS token id or is otherwise invalid.
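Since the constructor throws when the EOS token id appears among the pairs, a quick pre-check can give a friendlier error path. This is a sketch with a hypothetical helper, not part of the dotjson API:

```cpp
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

// Hypothetical pre-check: the EOS token id must not appear among the
// (Token, TokenId) pairs passed to the Vocabulary constructor.
bool dictionary_is_valid(
    const std::vector<std::pair<std::string, uint32_t>> &dictionary,
    uint32_t eos_token_id) {
    for (const auto &entry : dictionary) {
        if (entry.second == eos_token_id) return false; // EOS must stay out
    }
    return true;
}
```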
explicit Vocabulary(std::filesystem::path &path);
- Description: Constructs a vocabulary object from a serialized object on disk
- Parameters:
  - path: Path to the serialized object
- Throws: This will throw a std::exception in the event that a properly serialized vocabulary cannot be found at the specified path.
Methods
void serialize_to_disk(std::string &path);
- Description: Serialize the Vocabulary object to disk
- Parameters:
  - path: The path of the file you want to serialize your vocabulary to
- Throws: This will throw a std::exception in the event that the serialization fails.
u_int32_t max_token_id();
- Description: Returns the largest token id in the vocabulary. This is used for bounds checking in the logits processor.
u_int32_t eos_token_id();
- Description: Returns the EOS (end of string) TokenId for the vocabulary. This can be useful for checking termination.
Index
An index for constructing a LogitsProcessor.
Constructors
Index(std::string &schema, Vocabulary &vocabulary,
bool disallow_whitespace = true);
- Description: Constructs a serializable index object for a given JSON schema
- Parameters:
  - schema: A JSON schema that generations should match
  - vocabulary: A Vocabulary object encoding information about the model's tokenizer
  - disallow_whitespace: (default: true) Don't generate JSON containing extra whitespace (such as spaces after commas or line breaks after {)
- Throws: This will throw a std::exception if the index cannot be built. This will typically happen in one of two cases: the JSON schema is malformed, or it contains unsupported features.
[!NOTE]
Using disallow_whitespace=true (the default) may cause unanticipated model performance issues, as it disables formatting that may be natural for the model.
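To make the flag concrete: the two strings below encode the same JSON value, but with disallow_whitespace=true only the compact form can be generated. The helper is a simplistic illustration (it ignores whitespace inside string values) and is not part of the library:

```cpp
#include <string>

// Simplistic check used only for illustration: does the serialized JSON
// contain any whitespace characters at all?
bool contains_whitespace(const std::string &json) {
    return json.find_first_of(" \t\n") != std::string::npos;
}
```

With the flag on, a generation like `{"x":1,"y":2}` is reachable while the pretty-printed equivalent is not.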
explicit Index(std::filesystem::path &path);
- Description: Constructs an index object from a serialized object on disk
- Parameters:
  - path: Path to the serialized object
- Throws: This will throw a std::exception in the event that a properly serialized index cannot be found at the specified path.
Methods
void serialize_to_disk(std::string &path);
- Description: Serialize the Index object to disk
- Parameters:
  - path: The path of the file you want to serialize your index to
- Throws: This will throw a std::exception in the event that the serialization fails.
Guide
A guide class that reads batches of TokenIds and produces BatchedTokenSets
that can be used in the logits processor.
Constructors
Guide(const Index &index, size_t batch_size) noexcept;
- Description: Constructs the Guide object
- Parameters:
  - index: The index used for developing the mask
  - batch_size: The batch size for the logits updates
Methods
BatchedTokenSet get_start_tokensets() noexcept;
- Description: Constructs the set of allowed tokens needed to generate the first token.
BatchedTokenSet get_next_tokensets(const std::vector<u_int32_t> &token_ids);
- Description: Read a new batch of tokens in and produce the mask
- Parameters:
  - token_ids: The vector of TokenIds that have just been sampled.
- Throws: This will throw an exception when the wrong number of tokens is given or when at least one of the tokens is not in the allowed token set.
LogitsProcessor
A logits processor that modifies logits arrays in place for structured generation.
Constructors
LogitsProcessor(Guide &guide, u_int16_t mask_value) noexcept;
- Description: Constructs the logits processor
- Parameters:
  - guide: The Guide object used to produce the token sets.
  - mask_value: The u_int16_t equivalent of -std::numeric_limits<float>::infinity().
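The documentation does not spell out the logit encoding, but if the u_int16_t values are IEEE 754 binary16 (half-precision) bit patterns — a common choice for LLM logits — negative infinity is the pattern with the sign bit set, an all-ones exponent, and a zero mantissa. This is a sketch under that assumption:

```cpp
#include <cstdint>

// IEEE 754 binary16 negative infinity: sign = 1, exponent = 11111,
// mantissa = 0000000000. Assumes the logits are binary16 bit patterns.
constexpr uint16_t half_negative_infinity() {
    return static_cast<uint16_t>((1u << 15) | (0x1Fu << 10)); // 0xFC00
}
```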
Methods
void operator()(std::vector<std::span<u_int16_t>> &logits,
BatchedTokenSet &token_set);
- Description: Adaptively compute the mask and apply it in place to the logits array
- Parameters:
  - logits: A vector of std::span<u_int16_t> containing the logits computed after reading the tokens in context. The vector's length must equal the batch_size used to initialize the processor, and the spans must all have the same size.
  - token_set: The set of tokens produced by the guide.
- Throws: A std::exception will be thrown on bounds errors.
Example Usage
Here is a minimal example of how to use the dotjson library:
```cpp
// Create vocabulary and index
std::string model = "gpt2";
std::string schema =
    "{\"type\":\"object\",\"properties\":{\"x\":{\"type\":\"integer\"}}}";
dotjson::Vocabulary vocabulary(model);
dotjson::Index index(schema, vocabulary);

// Create guide and processor
std::size_t batch_size = 1;
u_int16_t mask_value = 0; // substitute an appropriate mask value
dotjson::Guide guide(index, batch_size);
dotjson::LogitsProcessor processor(guide, mask_value);

// Get the initial token set
dotjson::BatchedTokenSet token_set = guide.get_start_tokensets();

// Initial logits vector (produced by the LLM)
std::vector<std::span<u_int16_t>> logits;
// Populate logits...

// Mask the logits in place using the token set
processor(logits, token_set);

// Sample tokens from the processed logits (sample_tokens is user-provided)
std::vector<u_int32_t> sampled_tokens = sample_tokens(logits);

// Get the next token set based on the sampled tokens
token_set = guide.get_next_tokensets(sampled_tokens);
```