GenAI application development - a Cynefin perspective
These days a lot of energy is being poured into developing Generative AI applications. A key part of building these applications is prompt engineering: carefully crafting prompts to get the desired behavior from the LLM. This often requires a lot of experimentation and iteration to get right. What's more, a prompt that works for one LLM (version) might not work with another. Arguably, generative AI application development is what the Cynefin framework would call a complex domain.
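One way to make that experimentation systematic is to treat prompt variants like candidates in a test harness: run each variant against a fixed set of test cases and measure accuracy. The sketch below shows that shape; `call_llm` is a stand-in for a real LLM client (here it is just a trivial regex heuristic so the example runs on its own), and the prompts and test cases are made up for illustration.

```python
import re

def call_llm(prompt: str, text: str) -> str:
    """Stand-in for a real LLM call; here, a trivial regex heuristic."""
    match = re.search(r"\+?[\d\-\s()]{7,}", text)
    return match.group().strip() if match else "NONE"

# Hypothetical prompt variants under evaluation.
PROMPT_VARIANTS = [
    "Extract the phone number from the text. Reply with the number only.",
    "Find any phone number. If none, reply NONE. Text:",
]

# Fixed test cases: (input text, expected answer).
TEST_CASES = [
    ("Call me at 555-123-4567 tomorrow.", "555-123-4567"),
    ("No contact info here.", "NONE"),
]

def score(prompt: str) -> float:
    """Fraction of test cases the prompt gets right."""
    hits = sum(call_llm(prompt, text) == expected
               for text, expected in TEST_CASES)
    return hits / len(TEST_CASES)

for prompt in PROMPT_VARIANTS:
    print(f"{score(prompt):.0%}  {prompt[:45]}")
```

The payoff of this loop is that when you switch LLMs (or LLM versions), you rerun the same test cases and immediately see which prompts regress.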
About Cynefin
Cynefin is a framework that helps us understand the nature of the problem we’re facing. It’s a sense-making framework that characterizes problems as either clear, complicated, complex, or chaotic.
In clear and complicated domains, the relationship between cause and effect is knowable.
You simply need to understand the rules.
In the case of a complicated domain, you might need to do some analysis to figure out the rules.
You could consider User Profile Management to be a clear domain, and the Federal Income Tax Calculator to be a complicated domain.
Developer tools are clear or complicated
Typical developer technologies - languages, tools, etc. - are usually clear or complicated. You simply need to learn a set of rules in order to use them. And even when you do need to experiment, the problem is almost always poorly written documentation.
About Cynefin complex domains
In a complex domain, the relationship between cause and effect is only knowable in retrospect.
In order to devise a solution, a lot of experimentation is required.
What’s more, you might not even know whether a given solution is the best one.
An example of a complex domain is Delivery Management.
Creating a courier scheduling algorithm requires a lot of experimentation.
LLMs are complex and random
As I mentioned earlier, generative AI development is a complex domain. You typically need to experiment a lot to get the desired behavior. And, what works for one LLM might not work for another. Moreover, LLMs exhibit random behavior, which makes development even more difficult.
For example, the Faramir phone number parser prompt required a lot of experimentation to get right. Moreover, it works with GPT-4 but fails with other LLMs, for reasons unknown.
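One common tactic for coping with that randomness is to sample the same prompt several times and take a majority vote over the answers (often called self-consistency). The sketch below illustrates the idea; `sample_llm` is a stand-in that simulates a model answering correctly only 70% of the time, not a real API call.

```python
import random
from collections import Counter

def sample_llm(prompt: str, rng: random.Random) -> str:
    """Stand-in for a nondeterministic LLM call: right ~70% of the time."""
    return "555-123-4567" if rng.random() < 0.7 else "NONE"

def majority_vote(prompt: str, n: int = 5, seed: int = 0) -> str:
    """Sample the model n times and return the most common answer."""
    rng = random.Random(seed)
    answers = [sample_llm(prompt, rng) for _ in range(n)]
    return Counter(answers).most_common(1)[0][0]

print(majority_vote("Extract the phone number: ..."))
```

Voting doesn't eliminate randomness, but it turns a single unreliable sample into a more stable signal - which is exactly the kind of probe-and-respond tactic the complex domain calls for.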