Using Structured Outputs with Spring AI

What are Structured Outputs?

Using Structured Outputs is an important aspect of interacting with LLMs programmatically. LLMs, by their nature, generate unstructured output, which does not work well with programming languages that need structured data.

Through prompting techniques, you can direct the LLM to create structured output. The format can be just about anything. You can ask for XML, JSON, YAML, key-value pairs, or even provide a custom format you wish to see.

A common technique is to provide the LLM a JSON schema for the expected output. This generally works well with most LLMs. The caveat is that the LLM might not adhere to the provided schema. Thus, it might work most of the time and fail some of the time.

OpenAI is the first major LLM vendor to guarantee adherence to a provided JSON schema. On August 6th, 2024, OpenAI announced formal support for Structured Outputs. With this change, you can now instruct the LLM to adhere to the given JSON schema.

From the OpenAI blog post, you can see this significantly improves reliability.

At the time of writing, OpenAI is the only major LLM vendor with this feature. I expect other vendors will offer it as well.

The Structured Outputs features in Spring AI work with all supported LLMs. However, understand that only OpenAI will be consistently reliable. Other LLMs offer a JSON mode, which ensures the output is valid JSON but does not guarantee adherence to the provided JSON schema.

Structured Outputs with Spring AI

Spring AI offers several converters for working with Structured Output.

The most commonly used converter is the BeanOutputConverter, which is used to bind the LLM output to a Java POJO.

Also available are the MapOutputConverter and ListOutputConverter. The MapOutputConverter binds the LLM response to an instance of java.util.Map<String, Object>, while the ListOutputConverter binds to an instance of java.util.List.
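As a conceptual sketch (this is a plain-Java illustration, not Spring AI's actual implementation), the ListOutputConverter approach amounts to instructing the LLM to reply with comma-separated values and then splitting that reply into a List:

```java
import java.util.Arrays;
import java.util.List;

public class ListConversionSketch {

    // Conceptual sketch: Spring AI's ListOutputConverter instructs the LLM
    // to reply with comma-separated values, then converts that reply to a List.
    public static List<String> convert(String llmOutput) {
        return Arrays.stream(llmOutput.split(","))
                .map(String::trim)
                .toList();
    }

    public static void main(String[] args) {
        // Simulated LLM reply to a prompt like "List three EU capitals":
        System.out.println(convert("Paris, Berlin, Madrid"));
        // prints [Paris, Berlin, Madrid]
    }
}
```

The real converter also contributes the format instructions to the prompt, which is what makes the comma-separated reply likely in the first place.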

Example of Using Spring AI’s BeanOutputConverter

In my Introduction to Spring AI post we set up a simple Spring AI application. We added an example using String Templates in my post Using String Templates with Spring AI.

In this post, let’s explore using Structured Outputs with StringTemplates. The source code used in this post is available here on GitHub.

First, we need to update the StringTemplate to accept the format binding of the request.

get-capital-prompt.st
What is the capital of {stateOrCountry}? {format}

Note we simply added {format} to the StringTemplate.
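Conceptually, the placeholder binding works like a simple string substitution. The sketch below is a plain-Java illustration only; Spring AI's PromptTemplate actually delegates to the StringTemplate engine, which is far more capable than this naive replace:

```java
import java.util.Map;

public class PromptSubstitutionDemo {

    // Plain-Java illustration of placeholder binding. Spring AI's
    // PromptTemplate uses the StringTemplate engine under the hood;
    // this simple replace only shows the idea.
    public static String render(String template, Map<String, String> vars) {
        String result = template;
        for (Map.Entry<String, String> entry : vars.entrySet()) {
            result = result.replace("{" + entry.getKey() + "}", entry.getValue());
        }
        return result;
    }

    public static void main(String[] args) {
        String template = "What is the capital of {stateOrCountry}? {format}";
        System.out.println(render(template,
                Map.of("stateOrCountry", "France", "format", "Respond in JSON.")));
        // prints: What is the capital of France? Respond in JSON.
    }
}
```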

We can update our getCapital method as follows.

    @Override
    public GetCapitalResponse getCapital(GetCapitalRequest getCapitalRequest) {
        BeanOutputConverter<GetCapitalResponse> converter 
                = new BeanOutputConverter<>(GetCapitalResponse.class);
        
        String format = converter.getFormat();

        PromptTemplate promptTemplate = new PromptTemplate(getCapitalPrompt);

        Prompt prompt = promptTemplate.create(Map.of("stateOrCountry", 
                getCapitalRequest.stateOrCountry(),
                "format", format));

        ChatResponse response = chatModel.call(prompt);

        return converter.convert(response.getResult().getOutput().getContent());
    }

The first update defines a new BeanOutputConverter. You can see this accepts the class of the desired output. We then use the converter to generate the format string.

The format string and stateOrCountry are bound to the StringTemplate when the prompt is created from the PromptTemplate.

The last line of the method now uses the converter to convert the JSON payload returned by the LLM into a GetCapitalResponse.
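For reference, the GetCapitalRequest passed into the method above is a simple one-property record. A minimal sketch (the record in the actual project may differ) could look like this:

```java
public class GetCapitalRequestDemo {

    // Hypothetical sketch of the request record the service method accepts;
    // the record in the actual project may differ.
    public record GetCapitalRequest(String stateOrCountry) {
    }

    public static void main(String[] args) {
        GetCapitalRequest request = new GetCapitalRequest("France");
        System.out.println(request.stateOrCountry()); // prints France
    }
}
```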

To help the LLM understand the context of the properties, we can update the GetCapitalResponse as follows.

GetCapitalResponse.java
public record GetCapitalResponse(@JsonPropertyDescription("This is the city name") String answer) {
}

By using the @JsonPropertyDescription annotation, we are supplying metadata to the LLM.

If we inspect the format String in our Java debugger, we can examine the generated format.

Your response should be in JSON format.
Do not include any explanations, only provide a RFC8259 compliant JSON response following this format without deviation.
Do not include markdown code blocks in your response.
Remove the ```json markdown from the output.
Here is the JSON Schema instance your output must adhere to:
```{
  "$schema" : "https://json-schema.org/draft/2020-12/schema",
  "type" : "object",
  "properties" : {
    "answer" : {
      "type" : "string",
      "description" : "This is the city name"
    }
  },
  "additionalProperties" : false
}```

You can see from the above, the converter generates instructions for the LLM about the expected output. It also generates a JSON schema, which includes the metadata we added using the @JsonPropertyDescription annotation.
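Note the generated instructions explicitly tell the model not to wrap the JSON in markdown code fences. Some models add the fences anyway, so a defensive cleanup step before parsing can help. The following is a conceptual sketch of such a cleanup, not Spring AI's actual implementation:

```java
public class FenceStripSketch {

    // Conceptual sketch only: some models still wrap JSON in ```json fences
    // despite the instructions, so we defensively strip them before parsing.
    public static String stripFences(String raw) {
        String text = raw.trim();
        if (text.startsWith("```")) {
            // Drop the opening fence line (e.g. ```json)
            text = text.substring(text.indexOf('\n') + 1);
        }
        if (text.endsWith("```")) {
            // Drop the closing fence
            text = text.substring(0, text.lastIndexOf("```"));
        }
        return text.trim();
    }

    public static void main(String[] args) {
        String raw = "```json\n{\"answer\":\"Paris\"}\n```";
        System.out.println(stripFences(raw));
        // prints {"answer":"Paris"}
    }
}
```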

We can run this example in Postman as follows. The request remains the same.

Spring AI Request

However, the response has changed slightly.

OpenAI Response

Rather than responding with an informative sentence, now the LLM returns just the city name.

Using Spring AI’s BeanOutputConverter with Multiple Properties

The above example just returns the city name. While it demonstrates the functionality, it's pretty simple.

What if we wished to use the output to generate dynamic content on a web page?

We can ask the LLM for additional information about the city.

Let's create a second example with this additional information. First, add a new Java record to the project as follows.

GetCapitalWithInfoResponse.java

public record GetCapitalWithInfoResponse(
        @JsonPropertyDescription("The name of the city") String city,
        @JsonPropertyDescription("The population of the city") Integer population,
        @JsonPropertyDescription("The region the city is located in") String region,
        @JsonPropertyDescription("The primary language spoken") String language,
        @JsonPropertyDescription("The currency used") String currency) {
}

Add the following method to the Spring MVC controller.

    @PostMapping("/capitalWithInfo")
    public GetCapitalWithInfoResponse getCapitalWithInfo(@RequestBody GetCapitalRequest 
                                                             getCapitalRequest) {
        return this.openAIService.getCapitalWithInfo(getCapitalRequest);
    }

We can add the following method to our service to process this request.

    @Override
    public GetCapitalWithInfoResponse getCapitalWithInfo(GetCapitalRequest getCapitalRequest) {
        BeanOutputConverter<GetCapitalWithInfoResponse> converter
                = new BeanOutputConverter<>(GetCapitalWithInfoResponse.class);

        String format = converter.getFormat();

        PromptTemplate promptTemplate = new PromptTemplate(getCapitalPrompt);

        Prompt prompt = promptTemplate.create(Map.of("stateOrCountry",
                getCapitalRequest.stateOrCountry(),
                "format", format));

        ChatResponse response = chatModel.call(prompt);

        return converter.convert(response.getResult().getOutput().getContent());
    }

The method implementation remains nearly the same, with the exception of the record used for the response.

The format sent to the LLM now becomes this:

Your response should be in JSON format.
Do not include any explanations, only provide a RFC8259 compliant JSON response following this format without deviation.
Do not include markdown code blocks in your response.
Remove the ```json markdown from the output.
Here is the JSON Schema instance your output must adhere to:
```{
  "$schema" : "https://json-schema.org/draft/2020-12/schema",
  "type" : "object",
  "properties" : {
    "city" : {
      "type" : "string",
      "description" : "The name of the city"
    },
    "currency" : {
      "type" : "string",
      "description" : "The currency used"
    },
    "language" : {
      "type" : "string",
      "description" : "The primary language spoken"
    },
    "population" : {
      "type" : "integer",
      "description" : "The population of the city"
    },
    "region" : {
      "type" : "string",
      "description" : "The region the city is located in"
    }
  },
  "additionalProperties" : false
}```

Using the metadata supplied in the JSON schema, the LLM will now supply the additional information.

The response in Postman is now:

structured output from OpenAI

Our example seems to be working as expected. However, OpenAI’s documentation states the properties should be marked as required.

We can update the response record as follows to require the properties.

public record GetCapitalWithInfoResponse(
        @JsonProperty(required = true) @JsonPropertyDescription("The name of the city") String city,
        @JsonProperty(required = true) @JsonPropertyDescription("The population of the city") Integer population,
        @JsonProperty(required = true) @JsonPropertyDescription("The region the city is located in") String region,
        @JsonProperty(required = true) @JsonPropertyDescription("The primary language spoken") String language,
        @JsonProperty(required = true) @JsonPropertyDescription("The currency used") String currency) {
}

The format supplied to the LLM now becomes this:

Your response should be in JSON format.
Do not include any explanations, only provide a RFC8259 compliant JSON response following this format without deviation.
Do not include markdown code blocks in your response.
Remove the ```json markdown from the output.
Here is the JSON Schema instance your output must adhere to:
```{
  "$schema" : "https://json-schema.org/draft/2020-12/schema",
  "type" : "object",
  "properties" : {
    "city" : {
      "type" : "string",
      "description" : "The name of the city"
    },
    "currency" : {
      "type" : "string",
      "description" : "The currency used"
    },
    "language" : {
      "type" : "string",
      "description" : "The primary language spoken"
    },
    "population" : {
      "type" : "integer",
      "description" : "The population of the city"
    },
    "region" : {
      "type" : "string",
      "description" : "The region the city is located in"
    }
  },
  "required" : [ "city", "currency", "language", "population", "region" ],
  "additionalProperties" : false
}```

You can see at the end of the JSON schema a list of the required properties.

While this change did not seem to affect the output of this example, I expect it will improve the reliability of using Structured Output with OpenAI.

Summary

As you can see from this post, using Structured Output is very important when interacting with LLMs. LLMs are far more capable than simple chatbots. Through the use of Structured Output, the logic and reasoning of LLMs can be incorporated into your applications. This builds an important bridge between the world of unstructured data and the structure needed by programming languages.

Spring AI: Beginner to Guru

If you wish to learn much more about Spring AI, check out my Udemy course Spring AI: Beginner to Guru!
