Agent Recipes

Summary

Chain of Draft (CoD) is a novel prompting technique that makes reasoning more efficient by reducing verbosity. Unlike traditional Chain-of-Thought (CoT) prompting, CoD instructs the model to produce extremely concise, information-dense intermediate steps - typically limited to about five words per step. This approach mimics how humans often jot down quick notes when solving problems, capturing only the critical elements needed. CoD drastically reduces token usage (as little as 7.6% compared to CoT) while maintaining comparable accuracy, resulting in faster response times and lower computational costs.

Implementation

Include clear instructions in your prompt such as "think step by step, but only keep a minimum draft for each step with five words at most" and indicate that the final answer should use a clear separator. Providing few-shot examples that demonstrate the desired concise reasoning format helps the model adapt to this minimalistic style.

When to use

High-volume requests: Lower token costs
Latency-sensitive apps: Faster generation
Multi-step reasoning: Efficient thought process
Resource-constrained: Reduced compute requirements

Best for

Cost-sensitive applications
Real-time response scenarios
High-throughput batch processing

Summary

Implementation

When to use

Best for

Chain of Draft Implementation