Article 09 · May 2026

Prompt Engineering is Dead. Long Live Context Engineering.

May 14, 2026 · by Satish K C · 8 min read
Tags: LLMs · Agents · RAG · Prompt Design

The Big Idea

In June 2025, Shopify CEO Tobi Lutke tweeted a definition that instantly went viral: context engineering is "the art of providing all the context for the task to be plausibly solvable by the LLM." Two days later, Andrej Karpathy amplified it - calling context engineering "the delicate art and science of filling the context window with just the right information for the next step." Within weeks, the term replaced prompt engineering in practitioner discourse. The shift is not cosmetic. Prompt engineering optimized a single input string. Context engineering designs entire dynamic systems that assemble instructions, retrieved knowledge, tool outputs, conversation history, and persistent memory into the context window at runtime. As AI moves from chatbots to autonomous agents, the bottleneck is no longer what you ask - it is what information is available when you ask it.

Before vs After

Prompt engineering treated the LLM interaction as a writing exercise - craft the perfect sentence, add the right examples, iterate on phrasing. Context engineering treats it as systems engineering - design the infrastructure that dynamically populates the context window with everything the model needs to succeed.

Prompt Engineering (2022-2024)

  • Craft a clever single query or instruction
  • Static text optimization - same prompt every time
  • Focus on phrasing, word choice, few-shot examples
  • One interaction at a time, no state
  • "Find the magic sentence that makes GPT do the thing"
  • Skills needed: writing, trial-and-error

Context Engineering (2025+)

  • Design dynamic systems that populate the context window
  • Runtime assembly - different context per task, user, state
  • Focus on RAG, tools, memory, state management, compression
  • Manage full agent lifecycle across multiple turns
  • "Build the operating system that feeds the LLM the right data"
  • Skills needed: systems design, retrieval, infrastructure

How It Works

Simon Willison framed the naming problem clearly: "prompt engineering" acquired an unfortunate inferred definition - "typing into a chatbot." The real work was always more complex, but the name undersold it. "Context engineering" sticks because its inferred meaning matches the actual complexity of what practitioners do. Philipp Schmid, formerly of Hugging Face, formalized the definition: context engineering is "the discipline of designing and building dynamic systems that provide the right information and tools, in the right format, at the right time."

The Seven Components of Context Engineering
The mental model: the LLM's context window is RAM, and the engineer plays the role of the operating system. Seven components get loaded into that window:

  1. Instructions - system prompt + rules
  2. User prompt - the immediate task request
  3. State / history - conversation memory
  4. Long-term memory - persistent knowledge
  5. Retrieved information (RAG) - external documents
  6. Available tools - functions + APIs
  7. Output format - response schema
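A minimal sketch of those seven components as a data structure that renders into a final prompt. The class, field, and section names are illustrative, not any library's API:

```python
from dataclasses import dataclass, field

@dataclass
class Context:
    """The seven components assembled into the context window at runtime."""
    instructions: str                                       # 1. system prompt + rules
    user_prompt: str                                        # 2. immediate task request
    history: list[str] = field(default_factory=list)        # 3. state / conversation memory
    long_term_memory: list[str] = field(default_factory=list)  # 4. persistent knowledge
    retrieved: list[str] = field(default_factory=list)      # 5. RAG documents
    tools: list[str] = field(default_factory=list)          # 6. function/API schemas
    output_format: str = ""                                 # 7. response schema

    def render(self) -> str:
        """Flatten to the final prompt string, skipping empty sections."""
        sections = [
            ("Instructions", self.instructions),
            ("Long-term memory", "\n".join(self.long_term_memory)),
            ("Retrieved documents", "\n".join(self.retrieved)),
            ("Conversation history", "\n".join(self.history)),
            ("Available tools", "\n".join(self.tools)),
            ("Output format", self.output_format),
            ("Task", self.user_prompt),
        ]
        return "\n\n".join(f"## {name}\n{body}" for name, body in sections if body)
```

The point of the structure is that only the first field is static; everything else is populated per task, per user, per turn.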

LangChain formalized four core strategies in their July 2025 framework. Write - persist information outside the active context (scratchpads, long-term memory stores). Select - strategically retrieve only what is relevant (semantic search over tools improved selection accuracy 3x). Compress - retain only essential tokens (Claude Code auto-compacts after 95% window usage). Isolate - split work across separate context windows via multi-agent architectures, though Anthropic found this can consume up to 15x more tokens than single-agent approaches.
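The four strategies can be sketched as toy functions. All names here are hypothetical, naive keyword overlap stands in for real semantic search, and plain truncation stands in for LLM summarization:

```python
scratchpad: list[str] = []  # Write: storage that lives outside the active window

def write(note: str) -> None:
    """Write: persist information outside the context (scratchpad, memory store)."""
    scratchpad.append(note)

def select(query: str, docs: list[str], k: int = 2) -> list[str]:
    """Select: retrieve only the most relevant items. Word overlap is a
    stand-in for semantic search over an embedding index."""
    terms = set(query.lower().split())
    return sorted(docs, key=lambda d: len(terms & set(d.lower().split())),
                  reverse=True)[:k]

def compress(history: list[str], max_items: int = 4) -> list[str]:
    """Compress: past a threshold, fold older turns into a summary line
    (a real system would call an LLM to produce the summary)."""
    if len(history) <= max_items:
        return history
    summary = f"[summary of {len(history) - max_items + 1} earlier turns]"
    return [summary] + history[-(max_items - 1):]

def isolate(task_contexts: dict[str, list[str]], agent: str) -> list[str]:
    """Isolate: each sub-agent sees only its own context window."""
    return task_contexts[agent]
```

Each function maps to one failure it prevents: Write stops losing state between turns, Select stops irrelevant retrieval, Compress stops window overflow, Isolate stops cross-task contamination.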

Prompt Engineering vs Context Engineering - System Complexity
Prompt engineering: a human writes a static prompt ("You are a helpful..."), the LLM generates output, and you hope it works. A typical prompt runs ~50 tokens.

Context engineering: a RAG pipeline (docs + embeddings), tool results (APIs + functions), and memory (state + history) feed a context engine that selects, compresses, formats, and assembles the window. The LLM receives rich context and produces better output; tool calls execute actions; the grounded, accurate response and its results feed back into context (the Write strategy). 50,000-200,000 tokens are assembled at runtime.

The O'Reilly analogy from Addy Osmani captures it precisely: treat the LLM as a CPU and its context window as RAM. The context engineer functions as an operating system - loading the right programs, managing memory, scheduling I/O. A CPU with the wrong data in RAM produces garbage regardless of its processing power. Same with LLMs.
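The operating-system analogy suggests a simple sketch: sections of candidate context carry priorities, and a budget manager loads "RAM" most-important-first, skipping what will not fit. Word counts approximate token counts here; a real system would use the model's tokenizer:

```python
def fit_to_window(sections: list[tuple[int, str, str]], budget: int) -> list[str]:
    """sections: (priority, name, text), lower priority number = more important.
    Returns the names of sections that fit in the budget, best-first."""
    kept, used = [], 0
    for priority, name, text in sorted(sections):
        cost = len(text.split())  # crude token estimate
        if used + cost <= budget:
            kept.append(name)
            used += cost
    return kept
```

The garbage-in-garbage-out point falls out directly: if low-value material fills the budget first, the high-value material never reaches the model at all.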

Key Findings

  • 3x - tool selection improvement with semantic search
  • 15x - token overhead of multi-agent isolation
  • 95% - context window threshold for auto-compaction
  • 200K+ - tokens before explicit memory persistence

Why This Matters for AI and Automation Practitioners

If you are building AI agents, RAG pipelines, or any system where an LLM does real work - this shift redefines your job description. You are no longer a prompt writer. You are a context architect. The difference between a demo that impresses and a system that works in production is almost never the prompt. It is whether the model had access to the right customer data, the right tool definitions, the right conversation history, and the right constraints when it generated its response.

The cheap demo vs. production agent gap: A poorly-contexted agent that only sees the user request responds generically - "Thank you for your message. Tomorrow works for me." A richly-contexted agent with calendar access, email history, contact preferences, and available tools responds with: "Hey Jim! Tomorrow's packed. Thursday AM free - sent an invite." Same model. Same prompt. Completely different context. Completely different outcome.

For automation practitioners specifically, context engineering is the bridge between "AI chatbot" and "AI that actually does things." Every n8n workflow that feeds data into an LLM node, every RAG pipeline that retrieves documents before generation, every agent that calls tools - these are all context engineering. The discipline gives a name and framework to what practitioners were already doing, and provides systematic patterns (Write, Select, Compress, Isolate) for doing it better.

The failure mode to watch: Context Distraction - loading too much into the window and degrading performance. More context is not always better context. Windsurf uses multi-layered retrieval (AST parsing, semantic chunking, embeddings, grep, knowledge graphs, re-ranking) specifically to surface only what matters. The engineering is in the selection, not the accumulation.
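Windsurf's multi-layered retrieval is far richer than anything shown here, but the select-then-rerank shape can be sketched in miniature. Keyword overlap is a stand-in for embeddings and a re-ranker model; both function names are illustrative:

```python
def recall(query: str, corpus: list[str], n: int = 10) -> list[str]:
    """Stage 1: cheap, high-recall filter - keep anything sharing a word."""
    terms = set(query.lower().split())
    return [d for d in corpus if terms & set(d.lower().split())][:n]

def rerank(query: str, candidates: list[str], k: int = 2) -> list[str]:
    """Stage 2: stricter scoring - only the top-k reach the context window."""
    terms = set(query.lower().split())
    return sorted(candidates,
                  key=lambda d: len(terms & set(d.lower().split())),
                  reverse=True)[:k]
```

The key design choice is the hard cap `k`: no matter how much the corpus grows, the context window receives a fixed, small slice - selection, not accumulation.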

My Take

The naming shift matters more than it looks. "Prompt engineering" attracted writers and marketers. "Context engineering" attracts systems engineers and infrastructure builders. The second group is who actually ships production AI systems. The term change filters the talent pipeline toward the right skill set - people who think in pipelines, retrieval strategies, and memory management rather than linguistic tricks.

That said, prompting skill does not disappear inside context engineering - it becomes one component among seven. The system prompt still matters. Few-shot examples still help. But they are 15% of the problem now, not 90%. The other 85% is infrastructure: what gets retrieved, when it gets retrieved, how it gets compressed, and whether the model has the right tools available at the moment of generation. If you are still spending most of your time iterating on prompt wording rather than building retrieval pipelines and memory systems, you are optimizing the wrong layer.

The practitioners who will excel in this paradigm are those who already think in systems - backend engineers, data engineers, platform builders. They have been building the plumbing that context engineering requires for decades. The domain knowledge transfers directly. What is new is that the "application" running on top of that plumbing is now an LLM rather than a deterministic program.

Discussion question: Context engineering requires infrastructure (vector stores, memory systems, tool registries, orchestration layers) that prompt engineering never needed. At what point does the infrastructure overhead of proper context engineering exceed the value it delivers - and how do you decide when a simple, well-crafted prompt is still the right answer?
