Lab 5: Go Bananas

Advanced Developer Exercises

Lab Overview

Ready to become an AI security engineer? This is where the real fun begins! Lab 5 is your chance to apply everything you've learned and create your own security solutions from scratch. While the previous labs taught you about existing vulnerabilities and defenses, this lab empowers you to build the next generation of AI security tools that could one day protect real-world systems.

We live in a constantly evolving threat landscape. Attackers are always getting smarter, and as the use-cases for GenAI expand, so do the ways in which these systems can be attacked. The defenses that work today may not be enough tomorrow. That's why it's critical to learn how to customize and expand your security measures—adapting to new threats and new applications as they emerge. This lab is your chance to practice those skills in a safe, hands-on environment.

You'll tackle real-world challenges like protecting against medical advice liability (a common concern for healthcare AI systems), detecting personally identifiable information (essential for privacy compliance), and creating robust detection systems that can adapt to new threats. These aren't just academic exercises - they're the same problems that companies face when deploying AI systems in production, and the solutions you develop here could form the foundation for professional security tools. By the end of this lab, you'll have the skills and confidence to secure AI systems in the real world, and you'll understand why the field of AI security is both challenging and incredibly rewarding. This is where you transition from learning about AI security to becoming an AI security practitioner.

Skill Level: 3 Prerequisites: Developer skills

View Extensibility Documentation

Exercises

Exercise 5.A: Create a Blocklist

Skill Level: 3 Prerequisites: Developer skills
Directions:

Look at the sex and violence examples. Create another topic like "Medical" and try to keep your bot from giving out medical advice. You might do this in real life to mitigate legal risk. Add it to the output filters list.

Reference the existing blocklists:

Exercise 5.B: Create a PII Guardrail

Skill Level: 3 Prerequisites: Developer skills
Directions:

Create a simple, local filter for personally identifiable information (PII) using regular expressions (RegEx). Look for patterns like phone numbers, social security numbers, etc. Go nuts! Add it to the output filters list and try it out.

Consider patterns like:

  • Phone numbers (various formats)
  • Social Security Numbers (XXX-XX-XXXX)
  • Email addresses
  • Credit card numbers
  • IP addresses

Exercise 5.C: Make a Robust PII Guardrail

Skill Level: 3 Prerequisites: Developer skills + OpenAI API Key
Directions:

Make a much more robust PII using an LLM as the judge. Use the prompt injection (AI) filter as an example. Add it to the output filters list and try it out.

Reference the AI prompt injection filter:

View AI Prompt Injection Filter

Extra Credit:

Create a PII test suite that tests the local and robust versions. Put it in /tests. Use the Prompt Injection suite as an example. Look at how the AI version and the LLM-based versions compare.

View Test Data Examples

Extra, Extra Credit:

Tune both the simple PII filter and the advanced one and see how good you can make them against your test suite cases.

Key Learning Points

Advanced Tips

Congratulations!

You've completed all the labs in Steve's Chat Playground Lab Book! You now have hands-on experience with:

Keep exploring, experimenting, and building secure AI systems!