Debugging AI
If you’re using LLM-based AI as part of your product, how do you debug it?
There are some significant limitations. A hosted model is a black box, so you can only get so much information in and out. There is no way to request verbose output, hook into internal logs, or set breakpoints.
When debugging, I would isolate prompts. Treat them like any other external API or closed-source library. You can save outputs and feed those back into the rest of your application, where you can use the usual array of debugging tools and techniques.
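One way to sketch that isolation is a record-and-replay wrapper around the model call. This is a minimal example with a stubbed `call_llm` function standing in for whatever API you actually use: the first call is saved to disk, and later runs replay the saved response so the rest of the application can be debugged deterministically.

```python
import hashlib
import json
from pathlib import Path

CACHE_DIR = Path("llm_cache")

def call_llm(prompt: str) -> str:
    # Hypothetical stand-in for a real LLM API call.
    return f"response to: {prompt}"

def cached_llm(prompt: str) -> str:
    """Replay a saved response if one exists; otherwise call the model and save it."""
    CACHE_DIR.mkdir(exist_ok=True)
    key = hashlib.sha256(prompt.encode()).hexdigest()
    path = CACHE_DIR / f"{key}.json"
    if path.exists():
        return json.loads(path.read_text())["response"]
    response = call_llm(prompt)
    path.write_text(json.dumps({"prompt": prompt, "response": response}))
    return response
```

With the response pinned, everything downstream of the model behaves like ordinary code, and the usual debugger and test tooling applies.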
For the prompts themselves, you're limited to non-introspective debugging techniques. Since the results are non-deterministic, exact-match assertions won't work; instead, verify and test with heuristics.
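Heuristic verification might look like the sketch below: instead of asserting an exact string, check properties the output should always have. The specific checks (valid JSON, a `summary` key, a length bound) are illustrative assumptions, not a standard.

```python
import json

def check_summary(output: str) -> list[str]:
    """Run heuristic checks on an LLM response; return a list of failures."""
    failures = []
    if not output.strip():
        failures.append("empty output")
    if len(output) > 2000:
        failures.append("output too long")
    try:
        data = json.loads(output)
    except json.JSONDecodeError:
        failures.append("not valid JSON")
    else:
        # Structural check: the response should carry the field we asked for.
        if "summary" not in data:
            failures.append("missing 'summary' key")
    return failures
```

Checks like these can run in CI against a batch of saved responses, flagging regressions even when no single output is ever byte-identical to another.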
Maybe there are better tools for those who are training models, but the vast majority of folks are using pre-built ones.
Have you worked with applications that integrated with LLM-based AI? How did you debug issues?