Boost AI Precision with Context Construction: RAG & Agents
Learn how constructing context with retrieval-augmented generation (RAG) and agents fills knowledge gaps, steers model output, and prevents hallucinations.
By: Amir Tadrisi
Published on: 6/3/2025
Last updated on: 6/3/2025
Context is the core of guiding a language model to the right answer. Context is whatever the model sees and knows before it answers our question or executes a requested task. It is different from the model's weights (its pre-trained knowledge): context is what we provide at runtime to guide the model's output.
We construct context for three main reasons:
Fill Knowledge Gaps: A model's pre-trained knowledge is static and only current up to its training cutoff. If you need it to know about events or proprietary details that came afterward (for example, your latest product manual), you must feed those facts in as context (see the sketch after this list).
Steer Output via Prompt Engineering: Context is one of the components of prompt engineering that steer the model's output. It doesn't need to be a giant document, a book, or a dump of internet content; it can be as simple as a few well-chosen, precise bullet points that explain something to the model.
Prevent Hallucination: When humans lack information, we guess, and sometimes we err. Models do the same. Providing factual snippets and clear instructions keeps the model grounded, minimizing the risk of invented or misleading statements.
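To make the first point concrete, here is a minimal sketch of what "feeding facts in as context" looks like. The product-manual snippet and the question are hypothetical, and a real application would send the resulting string to an actual model API:

```python
# A minimal sketch of runtime context injection.
# The manual snippet and question are hypothetical examples.
manual_snippet = (
    "AcmeWidget 3.2 (released after the model's training cutoff): "
    "to reset the device, hold the power button for 10 seconds."
)
question = "How do I reset my AcmeWidget 3.2?"

# The Context block carries facts the model's weights cannot contain, and the
# instruction keeps the model grounded instead of guessing.
prompt = f"""Answer using ONLY the context below.
If the context is insufficient, say you don't know.

Context:
{manual_snippet}

Question: {question}"""

print(prompt)  # this full string is what the model sees at runtime
```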
The two main ways to construct context are RAG (retrieval-augmented generation) and agents.
RAG (Retrieval-Augmented Generation)
In this method we retrieve relevant information from external memory sources such as documents, knowledge bases, internal databases, books, or even the user's chat session. RAG gives us two key advantages:
Token Efficiency: A language model's context window is limited to a fixed number of tokens. Instead of stuffing the prompt with every possible fact, bloating both the prompt and our bills, RAG supplies only the knowledge related to the question.
Query-Specific Context: We no longer have one static context for all queries; the context is built at runtime from the query the user asked. This improves the model's accuracy and precision.
Term-Based Retrieval
In this method we convert our external knowledge into chunks of documents. When a user sends a query, we use its keywords to find the documents with the highest occurrence of those keywords; this is the same method Elasticsearch uses.
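As a rough illustration, here is a toy version of that keyword scoring (real engines like Elasticsearch use more sophisticated ranking functions such as BM25); the documents are hypothetical:

```python
import re
from collections import Counter

# Toy chunked knowledge base (hypothetical content).
docs = {
    "doc1": "To reset your password, open Settings and choose Reset Password.",
    "doc2": "Billing questions: invoices are emailed on the first of each month.",
    "doc3": "Password rules: a password must contain at least 12 characters.",
}

def tokenize(text: str) -> list[str]:
    return re.findall(r"[a-z]+", text.lower())

def keyword_score(query: str, text: str) -> int:
    """Score a document by how often the query's keywords repeat in it."""
    counts = Counter(tokenize(text))
    return sum(counts[keyword] for keyword in tokenize(query))

query = "reset password"
ranked = sorted(docs, key=lambda d: keyword_score(query, docs[d]), reverse=True)
print(ranked)  # ['doc1', 'doc3', 'doc2']: doc1 repeats the keywords most
```

Note the weakness this method inherits: a query for "change password" would miss a document that only says "reset credentials", which is exactly the gap embedding-based retrieval closes.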
Embedding-Based Retrieval
In this method we use semantic similarity to find relevant documents. An embedding-based RAG pipeline is built from the following components (a runnable sketch of the full query path follows the list):
Query
What it is: The user’s input or task description (“How do I reset my password?”).
Role: Drives retrieval—defines what information the system needs to fetch.
External Memory
What it is: Your knowledge sources—documents, wikis, product manuals, prior chat logs, databases, etc.
Role: The raw material from which relevant snippets are drawn.
Embedding Model
What it is: A neural encoder that turns text (queries or documents) into fixed-length vector representations.
Role: Maps semantically similar text into nearby points in vector space—so “password reset” and “change password” embeddings sit close together.
Vector Database
What it is: A specialized store (e.g., Pinecone, Weaviate, FAISS) for massive collections of vectors.
Role: Efficiently stores, indexes and searches millions of embeddings to find the top-K closest matches to your query vector.
Retriever
What it is: The component that orchestrates vector look-up and document fetch.
Roles:
1) Sends the query embedding to the vector DB
2) Retrieves the IDs of the most similar document embeddings
3) Loads the corresponding text snippets from your external memory
Language Model (LM)
What it is: The generative core (e.g., GPT-4, LLaMA) that produces the final answer.
Roles:
1) Receives the user’s original query plus the retrieved snippets as “context”
2) Generates a response that’s grounded in those snippets—minimizing hallucinations and staying on point
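Putting those components together, here is a minimal, self-contained sketch of the query path. The bag-of-words `embed` function and the in-memory `index` list are toy stand-ins for a real embedding model and a vector database (e.g., a sentence-transformer encoder plus Pinecone or FAISS); only the flow itself, embedding the query, ranking by cosine similarity, and passing the top-K snippets to the LM, mirrors a real pipeline:

```python
import math
import re
from collections import Counter

VOCAB = ["password", "reset", "change", "invoice", "billing", "email"]

def tokenize(text: str) -> list[str]:
    return re.findall(r"[a-z]+", text.lower())

def embed(text: str) -> list[float]:
    """Toy stand-in for an embedding model: bag-of-words over a tiny vocabulary."""
    counts = Counter(tokenize(text))
    return [float(counts[word]) for word in VOCAB]

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity: nearby vectors mean semantically similar text."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

# Toy stand-in for a vector database: (embedding, snippet) pairs built at index time.
index = [
    (embed("reset password change password settings"),
     "Open Settings > Security to change your password."),
    (embed("billing invoice email"),
     "Invoices are emailed on the first of each month."),
]

def retrieve(query: str, k: int = 1) -> list[str]:
    """The retriever: embed the query, rank by similarity, return the top-K snippets."""
    q = embed(query)
    ranked = sorted(index, key=lambda pair: cosine(q, pair[0]), reverse=True)
    return [snippet for _, snippet in ranked[:k]]

question = "How do I change my password?"
context = "\n".join(retrieve(question))
prompt = f"Context:\n{context}\n\nQuestion: {question}"
print(prompt)  # this grounded prompt is what the language model receives
```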
Indexing
Indexing is a prerequisite for the querying process. To make our external knowledge queryable, we break it into smaller units (paragraphs, sentences, or a specific number of words or tokens), pass each unit to the embedding model to transform it into a vector, and save that vector representation in our vector database so it is searchable for future queries. Besides querying, our retriever is also in charge of indexing.
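Here is what that indexing step can look like, with a toy hashing `embed` function standing in for a real embedding model and a plain Python list standing in for the vector database:

```python
import hashlib

def embed(text: str, dim: int = 16) -> list[float]:
    """Toy hashing embedding: each word lands in one of `dim` buckets."""
    vec = [0.0] * dim
    for word in text.lower().split():
        bucket = int(hashlib.md5(word.encode()).hexdigest(), 16) % dim
        vec[bucket] += 1.0
    return vec

def chunk(text: str, size: int = 12) -> list[str]:
    """Break knowledge into units of roughly `size` words each."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]

document = (
    "To reset your password open Settings and choose Reset Password. "
    "Invoices are emailed on the first of each month. "
    "Contact support if your invoice looks wrong."
)

# Indexing: chunk -> embed -> store, so future queries can search the vectors.
vector_index = [(embed(piece), piece) for piece in chunk(document)]
print(f"{len(vector_index)} chunks indexed")
```

Chunk size is a real design decision: chunks that are too large waste context tokens, while chunks that are too small lose their surrounding meaning.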
Agents
The second way to construct context for our models is to take advantage of agents. An agent is a system that can read our request, plan which steps to take, call the right tools, keep track of what it has learned, and finally hand off a concise prompt to the language model (LM) for a polished answer. In more technical terms, agents are orchestration systems that wire up language models with a set of tools so they can take actions and gain knowledge about topics that are not in the model's weights.
Let's say we want to know: "What's the latest price of Bitcoin in USD, and can you summarize any major recent news?" In this case we need a way to:
1) Fetch the live Bitcoin price from an external source
2) Search for recent news about Bitcoin
3) Hand both results to the LM as context so it can compose the answer
Here is the workflow for our example: the agent receives the query, plans which tools it needs, calls a price tool and a news-search tool, collects the results in its working memory, and finally passes the query plus the gathered context to the LM to generate the answer.
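Below is a minimal, hand-rolled sketch of that loop. The two tools and `call_lm` are hypothetical stubs: a real agent would call a live price API, a news or search API, and an actual language model, and would usually let the LM itself pick the tools, for example via function calling:

```python
# A minimal, hand-rolled agent loop for the Bitcoin example.
# All tools and `call_lm` are hypothetical stubs, not real APIs.

def get_btc_price() -> str:
    return "BTC/USD: 67,250 (example value; fetch from a price API in practice)"

def search_news(topic: str) -> str:
    return f"Recent {topic} headlines: ... (fetch from a news/search API)"

def call_lm(prompt: str) -> str:
    return f"[LM answer grounded in]\n{prompt}"  # stand-in for a real model call

TOOLS = {"get_btc_price": get_btc_price, "search_news": search_news}

def run_agent(request: str) -> str:
    # 1) Plan: decide which steps the request needs. Hard-coded here; a real
    #    agent asks the LM to pick tools, e.g., via function calling.
    plan = ["get_btc_price", "search_news"]

    # 2) Act: call each tool and keep track of what we learn.
    scratchpad = []
    for name in plan:
        result = TOOLS[name]("Bitcoin") if name == "search_news" else TOOLS[name]()
        scratchpad.append(f"{name} -> {result}")

    # 3) Respond: hand the request plus the gathered context to the LM.
    context = "\n".join(scratchpad)
    return call_lm(f"Request: {request}\n\nGathered context:\n{context}")

print(run_agent("What's the latest price of Bitcoin in USD, plus major news?"))
```

Frameworks such as LangChain automate this plan/act/respond cycle, but the underlying idea is exactly this loop: gather knowledge with tools, then hand it to the LM as context.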