Lab 5: Go Bananas - Steve's Chat Playground Lab Book

Lab Overview

Everything here is extra credit, and everything requires some admin/developer chops. Dive in if you like! These exercises don't have a lot of structure, so they're more challenging, but that's part of the fun. You can do it! Review the extensibility docs to get started.

Skill Level: 3
Prerequisites: Developer skills

View Extensibility Documentation

Exercises

Exercise 5.A: Create a Blocklist

Skill Level: 3
Prerequisites: Developer skills

Directions:

Look at the sex and violence blocklist examples. Create a blocklist for another topic, such as "Medical", and try to keep your bot from giving out medical advice. You might do this in real life to mitigate legal risk. Add it to the output filters list.

Use the existing blocklists as a reference.
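
The playground's own blocklist format is not reproduced in this lab text, so here is a minimal, standalone sketch of the underlying idea in Python: match a list of topic terms against the bot's output and block when enough of them appear. The term list, the `check_medical` name, and the `threshold` parameter are all assumptions for illustration. Follow the extensibility docs for the format the playground actually expects.

```python
import re

# Hypothetical term list for a "Medical" topic; extend it for your own bot.
MEDICAL_TERMS = ["diagnosis", "dosage", "prescription", "symptom", "medication"]

# Word boundaries keep terms from matching inside longer, unrelated words;
# the optional "s" catches simple plurals.
MEDICAL_PATTERN = re.compile(
    r"\b(?:" + "|".join(re.escape(t) for t in MEDICAL_TERMS) + r")s?\b",
    re.IGNORECASE,
)


def check_medical(text, threshold=1):
    """Return (blocked, matches).

    blocked is True when at least `threshold` blocklisted terms
    appear in `text`; matches lists the lowercased hits.
    """
    matches = [m.group(0).lower() for m in MEDICAL_PATTERN.finditer(text)]
    return len(matches) >= threshold, matches


if __name__ == "__main__":
    print(check_medical("The recommended dosage is two tablets."))
    print(check_medical("Let's talk about the weather."))
```

Raising `threshold` is one way to trade fewer false positives for more missed responses. A single mention of "symptom" may be harmless, while several medical terms in one reply probably is advice.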

Exercise 5.B: Create a PII Guardrail

Skill Level: 3
Prerequisites: Developer skills

Directions:

Create a simple, local filter for personally identifiable information (PII) using regular expressions (RegEx). Look for patterns like phone numbers, social security numbers, etc. Go nuts! Add it to the output filters list and try it out.

Consider patterns like:

Phone numbers (various formats)
Social Security Numbers (XXX-XX-XXXX)
Email addresses
Credit card numbers
IP addresses
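
As a starting point for the patterns listed above, here is a hedged Python sketch of a local RegEx-based PII check. The patterns are deliberately simple and US-centric, and the names `PII_PATTERNS`, `find_pii`, and `redact_pii` are made up for this example. None of it is the playground's actual filter interface, which is described in the extensibility docs.

```python
import re

# Simple illustrative patterns; each one will have false positives and misses.
PII_PATTERNS = {
    # US-style phone numbers: 555-123-4567, (555) 123-4567, +1 555.123.4567
    "phone": re.compile(
        r"(?<!\d)(?:\+?1[\s.-]?)?(?:\(\d{3}\)\s?|\d{3}[\s.-])\d{3}[\s.-]\d{4}(?!\d)"
    ),
    # Social Security Numbers in XXX-XX-XXXX form
    "ssn": re.compile(r"(?<!\d)\d{3}-\d{2}-\d{4}(?!\d)"),
    # Email addresses (loose; does not follow the full address grammar)
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    # 16-digit card numbers, optionally grouped in fours (no checksum test)
    "credit_card": re.compile(r"(?<!\d)(?:\d{4}[\s-]?){3}\d{4}(?!\d)"),
    # IPv4-shaped strings (does not verify that each octet is <= 255)
    "ipv4": re.compile(r"(?<!\d)(?:\d{1,3}\.){3}\d{1,3}(?!\d)"),
}


def find_pii(text):
    """Return a dict mapping each PII label to the matches found in text."""
    found = {}
    for label, pattern in PII_PATTERNS.items():
        hits = [m.group(0) for m in pattern.finditer(text)]
        if hits:
            found[label] = hits
    return found


def redact_pii(text):
    """Replace every match with a placeholder such as [SSN REDACTED]."""
    for label, pattern in PII_PATTERNS.items():
        text = pattern.sub(f"[{label.upper()} REDACTED]", text)
    return text


if __name__ == "__main__":
    sample = "Call 555-123-4567 or email jane.doe@example.com. SSN 123-45-6789."
    print(find_pii(sample))
    print(redact_pii(sample))
```

The digit lookarounds (`(?<!\d)` and `(?!\d)`) stop a pattern from matching in the middle of a longer number, which is a common source of false positives with this kind of filter.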

Exercise 5.C: Make a Robust PII Guardrail

Skill Level: 3
Prerequisites: Developer skills + OpenAI API Key

Directions:

Make a much more robust PII guardrail using an LLM as the judge. Use the prompt injection (AI) filter as an example. Add it to the output filters list and try it out.

Reference the AI prompt injection filter:

View AI Prompt Injection Filter
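
The LLM-as-judge pattern boils down to three steps: wrap the text in a judging prompt, send it to a model, and parse a structured verdict. The Python sketch below shows that shape without making any real API call. `call_llm` is a hypothetical callable that takes a prompt string and returns the model's reply as a string; you would wire it to your own API client. The prompt wording and the JSON reply format are assumptions for illustration and are not taken from the playground's prompt injection filter.

```python
import json

# Assumed judge instructions; tune the wording for your own model.
JUDGE_INSTRUCTIONS = (
    "You are a PII detector. Decide whether the text inside the <text> tags "
    "contains personally identifiable information (phone numbers, email "
    "addresses, government IDs, card numbers, and so on). "
    'Reply with JSON only, in the form {"contains_pii": true, "reason": "..."}.'
)


def build_judge_prompt(text):
    """Wrap the text to be judged in delimiters after the instructions."""
    return JUDGE_INSTRUCTIONS + "\n<text>\n" + text + "\n</text>"


def parse_verdict(reply):
    """Parse the judge's reply into (blocked, reason).

    Fails closed: a reply that is not the expected JSON is treated as PII.
    """
    try:
        data = json.loads(reply)
        return bool(data["contains_pii"]), str(data.get("reason", ""))
    except (ValueError, KeyError, TypeError, AttributeError):
        return True, "unparseable judge reply"


def judge_pii(text, call_llm):
    """Ask the model (via the hypothetical call_llm) whether text has PII."""
    blocked, reason = parse_verdict(call_llm(build_judge_prompt(text)))
    return {"blocked": blocked, "reason": reason}


def fake_llm(prompt):
    """Stand-in for a real model so the sketch runs offline."""
    flagged = "555-123-4567" in prompt
    return json.dumps({"contains_pii": flagged, "reason": "stub verdict"})


if __name__ == "__main__":
    print(judge_pii("My number is 555-123-4567", fake_llm))
    print(judge_pii("See you at lunch", fake_llm))
```

Failing closed on an unparseable reply is a design choice: it blocks more than necessary when the model misbehaves, but it never lets text through unchecked. Passing `call_llm` in as a parameter also makes the filter easy to test with a stub like `fake_llm`.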

Extra Credit:

Create a PII test suite that tests both the local and the robust versions. Put it in /tests. Use the Prompt Injection suite as an example. Look at how the local RegEx version and the LLM-based version compare.

View Test Data Examples
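
To compare two filters on the same cases, it helps to count true and false positives and negatives and derive precision and recall from them. The Python sketch below does that for any filter expressed as a function from text to a truthy or falsy "blocked" value. The case format (a list of text and expected-result pairs), the `evaluate` name, and the sample cases are assumptions for illustration. The playground's real test data format is in the linked examples.

```python
import re


def evaluate(filter_fn, cases):
    """Score filter_fn against cases, a list of (text, expected_blocked)."""
    tp = fp = tn = fn = 0
    for text, expected in cases:
        predicted = bool(filter_fn(text))
        if predicted and expected:
            tp += 1  # PII present and caught
        elif predicted and not expected:
            fp += 1  # clean text wrongly blocked
        elif not predicted and expected:
            fn += 1  # PII present but missed
        else:
            tn += 1  # clean text correctly allowed
    return {
        "tp": tp, "fp": fp, "tn": tn, "fn": fn,
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
        "accuracy": (tp + tn) / len(cases) if cases else 0.0,
    }


# Hypothetical cases: include text that merely looks like PII.
CASES = [
    ("My SSN is 123-45-6789", True),
    ("Reach me at jane@example.com", True),
    ("The meeting is at 3 pm", False),
    ("Part number 987-65-4321 is in stock", False),
]


def ssn_only_filter(text):
    """Deliberately weak filter, used only to demonstrate the scoring."""
    return re.search(r"\d{3}-\d{2}-\d{4}", text) is not None


if __name__ == "__main__":
    print(evaluate(ssn_only_filter, CASES))
```

Running `evaluate` once per filter on the same `CASES` list gives directly comparable numbers. Low precision points to false positives, and low recall points to missed PII.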

Extra, Extra Credit:

Tune both the simple PII filter and the advanced one, and see how well you can make them perform against your test suite cases.

Key Learning Points

Creating custom security filters from scratch
Understanding regular expressions for pattern matching
Building robust PII detection systems
Creating comprehensive test suites
Comparing simple vs. sophisticated approaches
Extending the playground with custom functionality

Advanced Tips

Start with simple patterns and gradually make them more sophisticated
Test your filters against a variety of edge cases, and check for false positives
Consider performance implications of your filters
Document your custom filters for future reference
Share your creations with the community!
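
For the performance tip above, one rough way to see what a filter costs is to time it over a batch of sample texts. The Python sketch below is only illustrative. `average_latency_ms` and the sample data are made-up names, and an LLM-based filter will be dominated by network time, which this does not model.

```python
import re
import time

# Compile once at import time rather than on every call.
SSN = re.compile(r"\d{3}-\d{2}-\d{4}")


def average_latency_ms(filter_fn, samples, repeats=100):
    """Return the mean wall-clock time per filter call, in milliseconds."""
    start = time.perf_counter()
    for _ in range(repeats):
        for text in samples:
            filter_fn(text)
    elapsed = time.perf_counter() - start
    return elapsed * 1000 / (repeats * len(samples))


if __name__ == "__main__":
    samples = ["My SSN is 123-45-6789", "The meeting is at 3 pm"] * 50
    latency = average_latency_ms(lambda t: SSN.search(t) is not None, samples)
    print(f"{latency:.4f} ms per call")
```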

Congratulations!

You've completed all the labs in Steve's Chat Playground Lab Book! You now have hands-on experience with:

Basic AI system vulnerabilities
Content filtering and guardrails
Prompt injection attacks and defenses
Output filtering and security measures
Advanced moderation techniques
Automated testing for security
Creating custom security measures

Keep exploring, experimenting, and building secure AI systems!