This directory contains notebooks and example code for demonstrating various Llama Stack capabilities and use cases.
An introduction to the Llama Stack Responses API, including:
- Simple model inference
- Retrieval-Augmented Generation (RAG)
- Model Context Protocol (MCP) tool calling
- Integration examples with the Llama Stack client, the OpenAI client, and LangChain
We welcome contributions of new example notebooks! If you've built something interesting with Llama Stack, please consider sharing it with the community. Good examples share a few qualities:
- Clear documentation: Well-commented code with explanations
- Complete setup: Include all prerequisites and dependencies
- Realistic use cases: Practical applications that others can learn from
- Clean structure: Organized code that's easy to follow
- Create a new directory following the naming pattern `NN-topic-name/`
- Add a notebook with instructions, code, and explanations
- Include a comprehensive README.md explaining your example
- If needed, add supporting files your notebook depends on
- Ensure your notebook runs from start to finish
- Submit a pull request with your contribution
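The directory layout from the steps above might look like this (the topic name and number are hypothetical):

```shell
# Create a new example directory following the NN-topic-name/ pattern
mkdir -p 07-custom-tools

# Add the notebook and its README
touch 07-custom-tools/custom_tools.ipynb
touch 07-custom-tools/README.md
```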
Some areas we'd love to see examples for:
- Advanced RAG implementations
- Multi-modal applications
- Custom tool development
- Production deployment patterns
- Performance optimization techniques
- Integration with other AI frameworks
Each example directory contains its own README with specific instructions for getting started.
For questions about specific examples, check the README in each directory. For general Llama Stack questions, see the official documentation.