One of the challenges of integrating a large language model into your backend processes is that the response you get back is non-deterministic. That is not a big problem if you only want to show the response as text, but it makes processing the response in an automated fashion harder.
Prompting for JSON
Of course you can use prompt engineering to ask the LLM to return the response as JSON, and even provide an example to steer the model, but it can still happen that the JSON you get back is not formatted correctly.
Here is a possible prompt:
var json_prompt = """
    Ensure the output is valid JSON.
    It should be in the schema:
    <output>
    {
        "ingredients": [
            {
                "name": "<ingredient_name1>",
                "quantity": "<quantity1>",
                "unit": "<unit1>"
            },
            {
                "name": "<ingredient_name2>",
                "quantity": "<quantity2>",
                "unit": "<unit2>"
            }
        ]
    }
    </output>
    """;

var system_prompt = "You are an AI language model that provides structured JSON outputs.";
Another trick that can help, mentioned in the Anthropic documentation, is to prefill the response with the first part of the JSON message.
ChatHistory chatHistory = new ChatHistory();
chatHistory.AddAssistantMessage("<output>\n{\n\"ingredients\": [");
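Keep in mind that with prefilling the model continues from the prefilled text, so the completion you receive does not contain the prefill itself. A minimal sketch (the string values are illustrative, not real model output) of stitching the two together and stripping the `<output>` wrapper before parsing:

```csharp
using System;
using System.Text.Json;

// The prefill we sent as the start of the assistant message.
string prefill = "<output>\n{\n\"ingredients\": [";

// Illustrative completion; a real one comes back from the chat service.
string completion = "{\"name\": \"turkey\", \"quantity\": \"1\", \"unit\": \"piece\"}]\n}\n</output>";

// Reassemble the full message, then strip the <output> tags to get raw JSON.
string json = (prefill + completion)
    .Replace("<output>", string.Empty)
    .Replace("</output>", string.Empty)
    .Trim();

using JsonDocument doc = JsonDocument.Parse(json);
Console.WriteLine(doc.RootElement.GetProperty("ingredients")[0].GetProperty("name").GetString());
```

`JsonDocument.Parse` will throw a `JsonException` if the reassembled text is still not valid JSON, which is exactly the failure mode prefilling tries to reduce but cannot eliminate.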
JSON mode
Although the techniques above will certainly help, they are not foolproof. A first improvement on this approach was the introduction of JSON mode in the OpenAI API. When JSON mode is turned on, the model's output is guaranteed to be valid JSON, except for some edge cases that are described in the documentation.
To use this technique you need to update the execution settings of Semantic Kernel:
var executionSettings = new OpenAIPromptExecutionSettings
{
    ResponseFormat = "json_object"
};

// Send a request and pass prompt execution settings with desired response format.
var result = await kernel.InvokePromptAsync("What are the ingredients needed to prepare a Christmas turkey?", new(executionSettings));
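Note that JSON mode guarantees syntactically valid JSON, not that the document matches the shape you asked for in the prompt. It is therefore still wise to deserialize defensively. A sketch, using the same `Recipe` and `Ingredient` classes that appear later in this post (the sample response string is illustrative):

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;

// Illustrative JSON-mode response; a real one comes back from the model.
string response = """{ "Ingredients": [ { "Name": "turkey", "Quantity": "1", "Unit": "piece" } ] }""";

// JSON mode only guarantees well-formed JSON; the shape can still drift,
// so catch deserialization failures and validate what came back.
Recipe? recipe = null;
try
{
    recipe = JsonSerializer.Deserialize<Recipe>(response);
}
catch (JsonException ex)
{
    Console.WriteLine($"Model returned unexpected JSON: {ex.Message}");
}

if (recipe?.Ingredients is { Count: > 0 })
{
    Console.WriteLine(recipe.Ingredients[0].Name);
}

public class Ingredient
{
    public string Name { get; set; }
    public string Quantity { get; set; }
    public string Unit { get; set; }
}

public class Recipe
{
    public List<Ingredient> Ingredients { get; set; }
}
```

Structured output, described next, removes the need for most of this defensiveness because the schema itself is enforced.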
Structured output
With structured output we can take this approach one step further and specify the exact JSON schema that we want to get back.
We have two options when using this approach. Either we specify the schema directly:
public class Ingredient
{
    public string Name { get; set; }
    public string Quantity { get; set; }
    public string Unit { get; set; }
}
using Microsoft.Extensions.Configuration;
using Microsoft.SemanticKernel;
using Microsoft.SemanticKernel.ChatCompletion;
using Microsoft.SemanticKernel.Connectors.OpenAI;
using OpenAI.Chat;
using System.Text.Json;

#pragma warning disable SKEXP0010 // Type is for evaluation purposes only and is subject to change or removal in future updates. Suppress this diagnostic to proceed.

var builder = new ConfigurationBuilder()
    .SetBasePath(Directory.GetCurrentDirectory())
    .AddUserSecrets<Program>();
IConfiguration configuration = builder.Build();

// Initialize kernel.
Kernel kernel = Kernel.CreateBuilder()
    .AddAzureOpenAIChatCompletion(deploymentName: "gpt-4o", endpoint: configuration["OpenAI:apiUrl"], apiKey: configuration["OpenAI:apiKey"])
    .Build();

// Initialize ChatResponseFormat object with JSON schema of desired response format.
ChatResponseFormat chatResponseFormat = ChatResponseFormat.CreateJsonSchemaFormat(
    jsonSchemaFormatName: "recipe",
    jsonSchema: BinaryData.FromString("""
        {
            "type": "object",
            "properties": {
                "Ingredients": {
                    "type": "array",
                    "items": {
                        "type": "object",
                        "properties": {
                            "Name": { "type": "string" },
                            "Quantity": { "type": "string" },
                            "Unit": { "type": "string" }
                        },
                        "required": ["Name", "Quantity", "Unit"],
                        "additionalProperties": false
                    }
                }
            },
            "required": ["Ingredients"],
            "additionalProperties": false
        }
        """),
    jsonSchemaIsStrict: true);

// Specify response format by setting ChatResponseFormat object in prompt execution settings.
var executionSettings = new OpenAIPromptExecutionSettings
{
    ResponseFormat = chatResponseFormat
};

// Send a request and pass prompt execution settings with desired response format.
var result = await kernel.InvokePromptAsync("What are the ingredients needed to prepare a Christmas turkey?", new(executionSettings));
Console.WriteLine(result);

var recipe = JsonSerializer.Deserialize<Recipe>(result.ToString());

#pragma warning restore SKEXP0010 // Type is for evaluation purposes only and is subject to change or removal in future updates. Suppress this diagnostic to proceed.
public class Recipe
{
    public List<Ingredient> Ingredients { get; set; }
}
Or we let Semantic Kernel generate the JSON schema automatically from a provided type:
public class Ingredient
{
    public string Name { get; set; }
    public string Quantity { get; set; }
    public string Unit { get; set; }
}
using Microsoft.Extensions.Configuration;
using Microsoft.SemanticKernel;
using Microsoft.SemanticKernel.ChatCompletion;
using Microsoft.SemanticKernel.Connectors.OpenAI;
using OpenAI.Chat;
using System.Text.Json;

#pragma warning disable SKEXP0010 // Type is for evaluation purposes only and is subject to change or removal in future updates. Suppress this diagnostic to proceed.

var builder = new ConfigurationBuilder()
    .SetBasePath(Directory.GetCurrentDirectory())
    .AddUserSecrets<Program>();
IConfiguration configuration = builder.Build();

// Initialize kernel.
Kernel kernel = Kernel.CreateBuilder()
    .AddAzureOpenAIChatCompletion(deploymentName: "gpt-4o", endpoint: configuration["OpenAI:apiUrl"], apiKey: configuration["OpenAI:apiKey"])
    .Build();

// Specify response format by setting a type in prompt execution settings.
var executionSettings = new OpenAIPromptExecutionSettings
{
    ResponseFormat = typeof(Recipe)
};

// Send a request and pass prompt execution settings with desired response format.
var result = await kernel.InvokePromptAsync("What are the ingredients needed to prepare a Christmas turkey?", new(executionSettings));
Console.WriteLine(result);

var recipe = JsonSerializer.Deserialize<Recipe>(result.ToString());

#pragma warning restore SKEXP0010 // Type is for evaluation purposes only and is subject to change or removal in future updates. Suppress this diagnostic to proceed.
public class Recipe
{
    public List<Ingredient> Ingredients { get; set; }
}
Remark: Structured output is only supported by the more recent language models.
More information
Entity extraction with Azure OpenAI Structured Outputs | Microsoft Community Hub
Using JSON Schema for Structured Output in .NET for OpenAI Models | Semantic Kernel
Prefill Claude's response for greater output control | Anthropic