Philosophy¶
The instructor values simplicity and flexibility in leveraging language models (LLMs). It offers a streamlined approach for structured output, avoiding unnecessary dependencies or complex abstractions. Let Pydantic do the heavy lifting.
“Simplicity is a great virtue but it requires hard work to achieve it and education to appreciate it. And to make matters worse: complexity sells better.” — Edsger Dijkstra
Proof that its simple¶
- Most users will only need to learn
response_model
andpatch
to get started. - No new prompting language to learn, no new abstractions to learn.
Proof that its transparent¶
- We write very little prompts, and we don't try to hide the prompts from you.
- We'll do better in the future to give you config over the 2 prompts we do write, Reasking and JSON_MODE prompts.
Proof that its flexible¶
- If you build a system with OpenAI directly, it is easy to incrementally adopt instructor.
- Add
response_model
and if you want to revert, just remove it.
The zen of instructor
¶
Maintain the flexibility and power of Python, without unnecessary constraints.
Begin with a function and a return type hint – simplicity is key. With my experience maintaining a large enterprise framework at my previous job over many years I've learned that the goal of making a useful framework is minimizing regret, both for the author and hopefully for the user.
- Define a Schema
class StructuredData(BaseModel):
- Define validators and methods on your schema.
- Encapsulate all your LLM logic into a function
def extract(a) -> StructuredData:
- Define typed computations against your data with
def compute(data: StructuredData):
or call methods on your schemadata.compute()
It should be that simple.
My Goals¶
The goal for the library, documentation, and blog, is to help you be a better python programmer and as a result a better AI engineer.
- The library is a result of my desire for simplicity.
- The library should help maintain simplicity in your codebase.
- I won't try to write prompts for you,
- I don't try to create indirections or abstractions that make it hard to debug in the future
Please note that the library is designed to be adaptable and open-ended, allowing you to customize and extend its functionality based on your specific requirements. If you have any further questions or ideas hit me up on twitter
Cheers!