Structured Outputs with Vertex AI¶
Vertex AI is the recommended way to deploy the Gemini family of models in production. These models support up to 1 million tokens in their context window and boast native multimodality with files, video, and audio. The Vertex AI SDK offers a preview of tool calling that we can use to obtain structured outputs.
By the end of this blog post, you will learn how to effectively utilize Instructor with the Gemini family of models.
Patching¶
Instructor's patch enhances the Gemini API with the following features:
- response_model in create calls, which returns a pydantic model
- max_retries in create calls, which retries the call with a backoff strategy if it fails
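To make the max_retries behaviour above more concrete, here is a minimal, library-free sketch of what retrying with a backoff strategy means. It is not Instructor's implementation: call_with_retries, flaky_extract, and the delay values are hypothetical names and numbers chosen purely for illustration.

```python
import time


def call_with_retries(fn, max_retries=3, base_delay=0.01):
    """Call fn(); on failure wait base_delay * 2**attempt, then try again.

    Illustrative only: Instructor handles retries internally when you pass
    max_retries to create().
    """
    last_error = None
    for attempt in range(max_retries):
        try:
            return fn()
        except ValueError as err:  # stand-in for a validation failure
            last_error = err
            if attempt < max_retries - 1:
                time.sleep(base_delay * 2**attempt)  # exponential backoff
    raise last_error


# Hypothetical flaky call: fails twice, then returns a valid payload.
attempts = {"count": 0}


def flaky_extract():
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise ValueError("response did not match the schema")
    return {"name": "Jason", "age": 20}


result = call_with_retries(flaky_extract, max_retries=3)
print(result, "after", attempts["count"], "attempts")
```

In real code you would not write this loop yourself; passing max_retries to client.create is enough.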
Learn More
To learn more, please refer to the docs. To understand the benefits of using Pydantic with Instructor, visit the tips and tricks section of the why use Pydantic page.
Vertex AI Client¶
Vertex AI uses a different client than OpenAI, so the patching process differs slightly from the other examples.
Getting access
If you want to try this out for yourself check out the Vertex AI console. You can get started here.
import instructor
from pydantic import BaseModel
import vertexai.generative_models as gm
import vertexai
vertexai.init()
client = gm.GenerativeModel("gemini-1.5-pro-preview-0409")
# enables `response_model` in chat call
client = instructor.from_vertexai(client)
if __name__ == "__main__":

    class UserDetails(BaseModel):
        name: str
        age: int

    resp = client.create(
        response_model=UserDetails,
        messages=[
            {
                "role": "user",
                "content": 'Extract the following entities: "Jason is 20"',
            },
        ],
    )
    print(resp)
    #> name='Jason' age=20
JSON Mode¶
By default, instructor.from_vertexai() uses the mode instructor.Mode.VERTEXAI_TOOLS, which means it uses tool calling to create the model response. Alternatively, you can use instructor.Mode.VERTEXAI_JSON to use the response_schema parameter provided by the Vertex AI SDK. This parameter prompts Gemini to respond with JSON directly, which can then be parsed into a model response.
If you are not getting good results with tool calling, or prefer this method for any reason, you can switch to this mode:
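# ... rest of the code as above ...
client = gm.GenerativeModel("gemini-1.5-pro-preview-0409")
# the mode is passed to instructor.from_vertexai, not to GenerativeModel
client = instructor.from_vertexai(client, mode=instructor.Mode.VERTEXAI_JSON)
# ... rest of the code as above ...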
Limitations¶
Currently, Vertex AI does not support the following attributes from the OpenAPI schema: optional, maximum, anyOf. This means that not all pydantic models will be supported. Below, I'll share some models that trigger this error, along with some workarounds.
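Because the failure only appears once a request is built, it can help to check a schema up front. The sketch below is a hypothetical helper: find_unsupported_keys and the UNSUPPORTED set are illustrative names, and the set contains only the keywords mentioned in this post, not an official list. It walks a JSON schema dictionary and reports where those keywords occur. The sample dictionary mirrors the schema pydantic v2 generates for a model with name: str and age: Optional[int].

```python
# Keywords this post reports as rejected by Vertex AI (assumed, not exhaustive).
UNSUPPORTED = {"anyOf", "maximum", "exclusiveMinimum", "exclusiveMaximum"}


def find_unsupported_keys(schema, path="$"):
    """Return (path, keyword) pairs for unsupported keywords in a JSON schema."""
    found = []
    if isinstance(schema, dict):
        for key, value in schema.items():
            if key in UNSUPPORTED:
                found.append((path, key))
            if key in ("properties", "$defs") and isinstance(value, dict):
                # Children here are field or model names, not schema keywords,
                # so a field that happens to be called "maximum" is not flagged.
                for name, sub in value.items():
                    found.extend(find_unsupported_keys(sub, f"{path}.{key}.{name}"))
            else:
                found.extend(find_unsupported_keys(value, f"{path}.{key}"))
    elif isinstance(schema, list):
        for index, item in enumerate(schema):
            found.extend(find_unsupported_keys(item, f"{path}[{index}]"))
    return found


# Schema for a model with `name: str` and `age: Optional[int]`.
user_schema = {
    "title": "User",
    "type": "object",
    "properties": {
        "name": {"title": "Name", "type": "string"},
        "age": {"anyOf": [{"type": "integer"}, {"type": "null"}], "title": "Age"},
    },
    "required": ["name", "age"],
}

print(find_unsupported_keys(user_schema))
# [('$.properties.age', 'anyOf')]
```

With pydantic v2, the same check can be run on User.model_json_schema() before calling client.create.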
optional / anyOf¶
Using a pydantic model with an Optional field raises an exception, because the Optional type is translated to "anyOf": [integer, null], which is not yet supported.
from typing import Optional


class User(BaseModel):
    name: str
    age: Optional[int]


resp = client.create(
    messages=[
        {
            "role": "user",
            "content": "Extract Anibal is 23 years old.",
        }
    ],
    response_model=User,
)
print(resp)
# ValueError: Protocol message Schema has no "anyOf" field.
A workaround is to set a default value that Gemini can fall back on if the information is not present:
from pydantic import Field


class User(BaseModel):
    name: str
    age: int = Field(default=0)  # or just age: int = 0


resp = client.create(
    messages=[
        {
            "role": "user",
            "content": "Extract Anibal is _ years old.",
        }
    ],
    response_model=User,
)
print(resp)
# name='Anibal' age=0
This workaround also works with default_factory:
class User(BaseModel):
    name: str
    age: int
    siblings: list[str] = Field(default_factory=list)


resp = client.create(
    messages=[
        {
            "role": "user",
            "content": "Extract Anibal is 23 years old.",
        }
    ],
    response_model=User,
)
print(resp)
# name='Anibal' age=23 siblings=[]
maximum¶
Using the lt (less than) or gt (greater than) parameters in a pydantic field will raise exceptions:
class User(BaseModel):
    name: str
    age: int = Field(gt=0)


resp = client.create(
    messages=[
        {
            "role": "user",
            "content": "Extract Anibal is 23 years old.",
        }
    ],
    response_model=User,
)
print(resp)
# ValueError: Protocol message Schema has no "exclusiveMinimum" field.


class User(BaseModel):
    name: str
    age: int = Field(lt=100)


resp = client.create(
    messages=[
        {
            "role": "user",
            "content": "Extract Anibal is _ years old.",
        }
    ],
    response_model=User,
)
print(resp)
# ValueError: Protocol message Schema has no "exclusiveMaximum" field.
A workaround is to use pydantic validators to clamp out-of-range values when the response is validated:
from pydantic import field_validator


class User(BaseModel):
    name: str
    age: int

    @field_validator("age")
    @classmethod
    def age_range_limit(cls, age: int) -> int:
        # clamp the extracted age into the range [0, 100]
        if age > 100:
            age = 100
        elif age < 0:
            age = 0
        return age


resp = client.create(
    messages=[
        {
            "role": "user",
            "content": "Extract Anibal is 1023 years old.",
        }
    ],
    response_model=User,
)
print(resp)
# name='Anibal' age=100

resp = client.create(
    messages=[
        {
            "role": "user",
            "content": "Extract Anibal is -12 years old.",
        }
    ],
    response_model=User,
)
print(resp)
# name='Anibal' age=0
So by relying on pydantic, we can mitigate some of the current limitations with the Gemini models 😊.