The Complete LangChain Guide: Build AI Applications from Scratch
Tutorials · 35 min read · December 10, 2025


Master LangChain with this comprehensive guide. Learn to build chatbots, RAG systems, AI agents, and more with step-by-step code examples in Python and TypeScript.

LangChain has become the go-to framework for building AI applications. Whether you want to create a chatbot, build a document Q&A system, or deploy autonomous AI agents, LangChain provides the building blocks. This guide takes you from zero to production-ready applications.

What You'll Learn

  • LangChain fundamentals and architecture
  • Working with different LLM providers
  • Building conversational chatbots
  • Retrieval-Augmented Generation (RAG)
  • Creating autonomous AI agents
  • Production deployment strategies

Prerequisites: Basic Python or TypeScript knowledge. No prior AI/ML experience required—we'll explain everything from first principles.

Part 1: Understanding LangChain

What is LangChain?

LangChain is a framework for developing applications powered by large language models (LLMs). Think of it as the "Rails for AI"—it provides structure, conventions, and reusable components so you can focus on building your application instead of reinventing the wheel.

At its core, LangChain solves several problems:

  • 🔗 Chaining: Connect multiple LLM calls and operations together into coherent workflows
  • 🧠 Memory: Give your AI applications persistent context across conversations
  • 🔧 Tools: Let LLMs interact with external APIs, databases, and services

LangChain vs Alternatives

Before diving in, let's understand when to use LangChain versus alternatives:

Framework          Best For                             Learning Curve
LangChain          Complex applications, RAG, agents    Medium
LlamaIndex         Data indexing and retrieval          Low
Semantic Kernel    Enterprise .NET applications         Medium
Direct API         Simple, single-purpose apps          Low

Our recommendation: Use LangChain when you need chains, memory, or agents. For simple chat completions, the direct OpenAI/Anthropic APIs might be simpler.
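For comparison, here is what the "direct API" route looks like with the official OpenAI Python SDK. This is a minimal sketch, assuming the openai package is installed and OPENAI_API_KEY is set in your environment:

Python direct_api.py
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "What is LangChain in one sentence?"}],
)
print(response.choices[0].message.content)

This is perfectly fine for a single prompt-and-response. The value of LangChain shows up once you start composing prompts, retrievers, memory, and tools together.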

Part 2: Setting Up Your Environment

Installation

LangChain is available in both Python and JavaScript/TypeScript. We'll show examples in Python, but the concepts translate directly.

Terminal
# Create a virtual environment
python -m venv langchain-env
source langchain-env/bin/activate  # On Windows: langchain-env\Scripts\activate

# Install LangChain and dependencies
pip install langchain langchain-openai langchain-community

# For RAG applications, also install:
pip install chromadb tiktoken pypdf

API Keys Setup

Create a .env file in your project root:

.env
OPENAI_API_KEY=sk-your-openai-key-here
ANTHROPIC_API_KEY=sk-ant-your-anthropic-key-here

# Optional: for LangSmith tracing (highly recommended)
LANGCHAIN_TRACING_V2=true
LANGCHAIN_API_KEY=your-langsmith-key

⚠️ Security Note: Never commit your .env file to version control. Add it to your .gitignore.

Your First LangChain Program

Let's verify everything is working:

Python hello_langchain.py
from dotenv import load_dotenv
from langchain_openai import ChatOpenAI

# Load environment variables
load_dotenv()

# Initialize the model
llm = ChatOpenAI(model="gpt-4o-mini")

# Simple invocation
response = llm.invoke("What is LangChain in one sentence?")
print(response.content)

Run it:

$ python hello_langchain.py
LangChain is a framework for developing applications powered by language models,
enabling features like chaining, memory, and tool use.

Part 3: Core Concepts Deep Dive

The LangChain Expression Language (LCEL)

LCEL is LangChain's declarative way to compose chains. It uses the pipe operator (|) to connect components:

Python lcel_example.py
from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser

# Define components
prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a helpful assistant that explains concepts simply."),
    ("user", "Explain {topic} in {style} style.")
])

llm = ChatOpenAI(model="gpt-4o-mini")
output_parser = StrOutputParser()

# Chain them together with LCEL
chain = prompt | llm | output_parser

# Invoke the chain
result = chain.invoke({
    "topic": "quantum computing",
    "style": "explain like I'm 10"
})

print(result)

How it works:

  1. prompt takes your inputs and formats them into a proper message
  2. The formatted message goes to llm (the model)
  3. output_parser extracts the string content from the response
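The composed chain is not limited to invoke. LCEL runnables also expose batch and stream out of the box; a quick sketch, reusing the chain defined in lcel_example.py above:

Python lcel_example.py (continued)
# Process several inputs concurrently
results = chain.batch([
    {"topic": "black holes", "style": "one short paragraph"},
    {"topic": "photosynthesis", "style": "one short paragraph"},
])

# Stream the output as it is generated (chunks are plain strings
# because the chain ends with StrOutputParser)
for chunk in chain.stream({"topic": "gravity", "style": "a haiku"}):
    print(chunk, end="", flush=True)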

Prompt Templates

Prompt templates are reusable structures for your prompts. LangChain supports several types:

Python prompt_types.py
from langchain_core.prompts import (
    PromptTemplate,
    ChatPromptTemplate,
    FewShotPromptTemplate,
    MessagesPlaceholder
)

# 1. Simple string template
simple = PromptTemplate.from_template(
    "Write a {adjective} poem about {subject}."
)

# 2. Chat template (for chat models)
chat = ChatPromptTemplate.from_messages([
    ("system", "You are a {role}."),
    ("user", "{question}")
])

# 3. With message history placeholder
chat_with_history = ChatPromptTemplate.from_messages([
    ("system", "You are a helpful assistant."),
    MessagesPlaceholder(variable_name="history"),
    ("user", "{input}")
])

# 4. Few-shot template (for providing examples)
examples = [
    {"input": "happy", "output": "sad"},
    {"input": "tall", "output": "short"},
]

example_template = PromptTemplate(
    input_variables=["input", "output"],
    template="Input: {input}\nOutput: {output}"
)

few_shot = FewShotPromptTemplate(
    examples=examples,
    example_prompt=example_template,
    prefix="Give the opposite of each input.",
    suffix="Input: {input}\nOutput:",
    input_variables=["input"]
)

Output Parsers

Output parsers transform the LLM's response into structured data:

Python structured_output.py
from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate
from pydantic import BaseModel, Field
from typing import List

# Define your output structure
class MovieReview(BaseModel):
    """A structured movie review."""
    title: str = Field(description="The movie title")
    rating: int = Field(description="Rating from 1-10")
    pros: List[str] = Field(description="List of positive aspects")
    cons: List[str] = Field(description="List of negative aspects")
    summary: str = Field(description="One sentence summary")

# Use with_structured_output for guaranteed schema
llm = ChatOpenAI(model="gpt-4o-mini")
structured_llm = llm.with_structured_output(MovieReview)

prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a movie critic. Analyze the given movie."),
    ("user", "Review the movie: {movie}")
])

chain = prompt | structured_llm

# The result is a MovieReview object!
review = chain.invoke({"movie": "The Matrix"})
print(f"Rating: {review.rating}/10")
print(f"Pros: {review.pros}")

Part 4: Building a Chatbot with Memory

One of the most common LangChain use cases is building chatbots. The key challenge is maintaining conversation history—LLMs are stateless, so we need to manage memory ourselves.

Simple Chatbot Implementation

Python chatbot.py
from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
from langchain_core.messages import HumanMessage, AIMessage

# Setup
llm = ChatOpenAI(model="gpt-4o-mini")

prompt = ChatPromptTemplate.from_messages([
    ("system", """You are a helpful AI assistant. You remember the conversation
and can refer back to earlier messages. Be conversational and friendly."""),
    MessagesPlaceholder(variable_name="history"),
    ("user", "{input}")
])

chain = prompt | llm

# In-memory conversation history
conversation_history = []

def chat(user_input: str) -> str:
    """Send a message and get a response."""
    # Add user message to history
    conversation_history.append(HumanMessage(content=user_input))

    # Get response
    response = chain.invoke({
        "history": conversation_history[:-1],  # All except current
        "input": user_input
    })

    # Add AI response to history
    conversation_history.append(AIMessage(content=response.content))

    return response.content

# Example conversation
if __name__ == "__main__":
    print("Chatbot ready! Type 'quit' to exit.\n")

    while True:
        user_input = input("You: ")
        if user_input.lower() == "quit":
            break

        response = chat(user_input)
        print(f"Assistant: {response}\n")

Persistent Memory with Redis

For production applications, you'll want persistent memory. Here's how to use Redis:

Python chatbot_redis.py
from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
from langchain_community.chat_message_histories import RedisChatMessageHistory
from langchain_core.runnables.history import RunnableWithMessageHistory

llm = ChatOpenAI(model="gpt-4o-mini")

prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a helpful assistant."),
    MessagesPlaceholder(variable_name="history"),
    ("user", "{input}")
])

chain = prompt | llm

# Function to get session history from Redis
def get_session_history(session_id: str):
    return RedisChatMessageHistory(
        session_id=session_id,
        url="redis://localhost:6379"
    )

# Wrap chain with message history
chain_with_history = RunnableWithMessageHistory(
    chain,
    get_session_history,
    input_messages_key="input",
    history_messages_key="history"
)

# Use with session ID
response = chain_with_history.invoke(
    {"input": "Hi, my name is Alice!"},
    config={"configurable": {"session_id": "user_123"}}
)
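Because the history lives in Redis rather than in the process, a later call with the same session_id picks up where the conversation left off. A sketch continuing the example above:

Python chatbot_redis.py (continued)
# A second call with the same session ID reuses the stored history
follow_up = chain_with_history.invoke(
    {"input": "What's my name?"},
    config={"configurable": {"session_id": "user_123"}}
)
print(follow_up.content)  # should recall "Alice" from the stored history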

Part 5: Retrieval-Augmented Generation (RAG)

RAG is one of the most powerful patterns in LangChain. It lets your AI answer questions based on your own documents—essentially giving it a custom knowledge base.

How RAG Works

  1. Index: Split documents into chunks and create embeddings (numeric vectors; see the sketch after this list)
  2. Retrieve: When a question comes in, find relevant chunks
  3. Generate: Send the question + relevant context to the LLM
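"Create embeddings" in step 1 simply means turning text into numeric vectors so that similar text ends up close together in vector space. A minimal, illustrative sketch with OpenAIEmbeddings (the model and vector size shown are library defaults):

Python embeddings_sketch.py
from dotenv import load_dotenv
from langchain_openai import OpenAIEmbeddings

load_dotenv()

embeddings = OpenAIEmbeddings()
vector = embeddings.embed_query("What is the refund policy?")

print(len(vector))  # vector dimensionality (typically 1536 for OpenAI's default embedding model)
print(vector[:5])   # the first few numbers of the embedding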

Building a Document Q&A System

Let's build a complete RAG system that can answer questions about PDF documents:

Python rag_system.py
from langchain_openai import ChatOpenAI, OpenAIEmbeddings
from langchain_community.document_loaders import PyPDFLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain_community.vectorstores import Chroma
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.runnables import RunnablePassthrough
from langchain_core.output_parsers import StrOutputParser

# Step 1: Load and split documents
def load_documents(pdf_path: str):
    """Load PDF and split into chunks."""
    loader = PyPDFLoader(pdf_path)
    documents = loader.load()

    splitter = RecursiveCharacterTextSplitter(
        chunk_size=1000,
        chunk_overlap=200,
        separators=["\n\n", "\n", " ", ""]
    )

    return splitter.split_documents(documents)

# Step 2: Create vector store
def create_vectorstore(documents):
    """Create embeddings and store in Chroma."""
    embeddings = OpenAIEmbeddings()

    vectorstore = Chroma.from_documents(
        documents=documents,
        embedding=embeddings,
        persist_directory="./chroma_db"
    )

    return vectorstore

# Step 3: Create RAG chain
def create_rag_chain(vectorstore):
    """Create the RAG chain for Q&A."""

    retriever = vectorstore.as_retriever(
        search_type="similarity",
        search_kwargs={"k": 4}  # Return top 4 chunks
    )

    llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)

    prompt = ChatPromptTemplate.from_template("""
Answer the question based on the following context. If you cannot
find the answer in the context, say "I don't have enough information
to answer that question."

Context:
{context}

Question: {question}

Answer:""")

    # Format retrieved documents
    def format_docs(docs):
        return "\n\n".join(doc.page_content for doc in docs)

    # Build the chain
    chain = (
        {"context": retriever | format_docs, "question": RunnablePassthrough()}
        | prompt
        | llm
        | StrOutputParser()
    )

    return chain

# Usage
if __name__ == "__main__":
    # Load your documents
    docs = load_documents("your_document.pdf")
    print(f"Loaded {len(docs)} chunks")

    # Create vector store
    vectorstore = create_vectorstore(docs)
    print("Vector store created")

    # Create RAG chain
    rag_chain = create_rag_chain(vectorstore)

    # Ask questions!
    question = "What are the main topics covered in this document?"
    answer = rag_chain.invoke(question)
    print(f"Answer: {answer}")

Advanced RAG: Hybrid Search

For better retrieval, combine vector search with keyword search:

Python hybrid_search.py
from langchain.retrievers import EnsembleRetriever
from langchain_community.retrievers import BM25Retriever
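# Note: BM25Retriever depends on the rank_bm25 package (pip install rank_bm25)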

def create_hybrid_retriever(documents, vectorstore):
    """Combine vector search with BM25 keyword search."""

    # Vector retriever (semantic search)
    vector_retriever = vectorstore.as_retriever(search_kwargs={"k": 4})

    # BM25 retriever (keyword search)
    bm25_retriever = BM25Retriever.from_documents(documents)
    bm25_retriever.k = 4

    # Combine with ensemble
    ensemble_retriever = EnsembleRetriever(
        retrievers=[vector_retriever, bm25_retriever],
        weights=[0.6, 0.4]  # 60% vector, 40% keyword
    )

    return ensemble_retriever

Part 6: Building AI Agents

Agents are LLMs that can use tools and make decisions about which actions to take. They're the most powerful—and complex—part of LangChain.

Creating Custom Tools

Python custom_tools.py
from langchain_core.tools import tool
from langchain_openai import ChatOpenAI
from langgraph.prebuilt import create_react_agent
# Define custom tools
@tool
def get_weather(city: str) -> str:
    """Get the current weather for a city."""
    # In production, use a real weather API
    return f"The weather in {city} is 72°F and sunny."

@tool
def search_web(query: str) -> str:
    """Search the web for information."""
    # In production, use a real search API
    return f"Search results for '{query}': [Example results...]"

@tool
def calculate(expression: str) -> str:
    """Evaluate a mathematical expression."""
    try:
        result = eval(expression)  # Use a safe eval in production!
        return str(result)
    except Exception:
        return "Could not evaluate expression"

# Create the agent
tools = [get_weather, search_web, calculate]
llm = ChatOpenAI(model="gpt-4o-mini")

agent = create_react_agent(llm, tools)

# Run the agent
result = agent.invoke({
    "messages": [{"role": "user", "content": "What's the weather in Tokyo?"}]
})

print(result["messages"][-1].content)

Multi-Step Agent Example

Here's a more complex agent that can perform research tasks:

Python research_agent.py
from langchain_core.tools import tool
from langchain_openai import ChatOpenAI
from langchain_community.tools import DuckDuckGoSearchRun
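# Note: DuckDuckGoSearchRun requires the duckduckgo-search package
# (pip install duckduckgo-search)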
from langgraph.prebuilt import create_react_agent

# Real web search tool
search = DuckDuckGoSearchRun()

@tool
def web_search(query: str) -> str:
    """Search the web for current information."""
    return search.run(query)

@tool
def save_note(content: str) -> str:
    """Save a research note to a file."""
    with open("research_notes.txt", "a") as f:
        f.write(content + "\n\n")
    return "Note saved successfully."

@tool
def read_notes() -> str:
    """Read all saved research notes."""
    try:
        with open("research_notes.txt", "r") as f:
            return f.read()
    except FileNotFoundError:
        return "No notes saved yet."

# Create research agent
tools = [web_search, save_note, read_notes]
llm = ChatOpenAI(model="gpt-4o", temperature=0)

agent = create_react_agent(
    llm,
    tools,
    state_modifier="""You are a research assistant. When given a topic:
1. Search for relevant information
2. Save important findings as notes
3. Synthesize your findings into a summary

Be thorough but concise."""
)

# Example: Research a topic
result = agent.invoke({
    "messages": [{
        "role": "user",
        "content": "Research the latest developments in quantum computing in 2024"
    }]
})

Part 7: Production Deployment

Streaming Responses

For better UX, stream responses instead of waiting for the full completion:

Python streaming.py
from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate

llm = ChatOpenAI(model="gpt-4o-mini", streaming=True)

prompt = ChatPromptTemplate.from_template("Tell me a story about {topic}")
chain = prompt | llm

# Stream the response
for chunk in chain.stream({"topic": "a robot learning to paint"}):
    print(chunk.content, end="", flush=True)

Error Handling and Retries

Python robust_chain.py
from langchain_openai import ChatOpenAI
from langchain_core.rate_limiters import InMemoryRateLimiter

# Configure retries and timeouts
llm = ChatOpenAI(
    model="gpt-4o-mini",
    max_retries=3,
    timeout=30
)

# Add fallback model
fallback_llm = ChatOpenAI(model="gpt-3.5-turbo")
robust_llm = llm.with_fallbacks([fallback_llm])

# Use with rate limiting
rate_limiter = InMemoryRateLimiter(
    requests_per_second=1,
    check_every_n_seconds=0.1,
    max_bucket_size=10
)

llm_rate_limited = ChatOpenAI(
    model="gpt-4o-mini",
    rate_limiter=rate_limiter
)

LangSmith for Monitoring

LangSmith is essential for debugging and monitoring production applications:

.env
LANGCHAIN_TRACING_V2=true
LANGCHAIN_API_KEY=your-langsmith-api-key
LANGCHAIN_PROJECT=my-production-app

With tracing enabled, you get:

  • Full visibility into every chain execution
  • Latency breakdowns for each component
  • Token usage and cost tracking
  • Error logging and debugging
  • Dataset collection for fine-tuning
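You can also label individual runs so they are easy to find in the LangSmith UI. A sketch using the standard config argument that every LCEL runnable accepts (run_name, tags, and metadata), applied to the chain from Part 3:

Python langsmith_labels.py
# Reusing the LCEL chain from Part 3; with tracing enabled, this run
# appears in LangSmith under the given name, tags, and metadata.
result = chain.invoke(
    {"topic": "vector databases", "style": "two sentences"},
    config={
        "run_name": "explain-topic",
        "tags": ["tutorial", "v1"],
        "metadata": {"user_id": "user_123"},
    },
)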

Part 8: Real-World Project: AI Customer Support

Let's put everything together into a production-ready customer support bot:

Python support_bot.py
from langchain_openai import ChatOpenAI, OpenAIEmbeddings
from langchain_community.vectorstores import Chroma
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
from langchain_core.tools import tool
from langgraph.prebuilt import create_react_agent
import json

# Initialize components
llm = ChatOpenAI(model="gpt-4o", temperature=0)
embeddings = OpenAIEmbeddings()

# Load knowledge base (your FAQ/documentation)
vectorstore = Chroma(
    persist_directory="./support_kb",
    embedding_function=embeddings
)
retriever = vectorstore.as_retriever(search_kwargs={"k": 3})

# Define tools
@tool
def search_knowledge_base(query: str) -> str:
    """Search the knowledge base for relevant information."""
    docs = retriever.invoke(query)
    return "\n\n".join([doc.page_content for doc in docs])

@tool
def check_order_status(order_id: str) -> str:
    """Check the status of an order by order ID."""
    # In production, query your database
    return json.dumps({
        "order_id": order_id,
        "status": "shipped",
        "estimated_delivery": "2024-01-15"
    })

@tool
def create_support_ticket(
    subject: str,
    description: str,
    priority: str = "medium"
) -> str:
    """Create a support ticket for issues that need human attention."""
    # In production, create in your ticketing system
    ticket_id = "TKT-12345"
    return f"Created ticket {ticket_id}: {subject}"

# Create the support agent
tools = [search_knowledge_base, check_order_status, create_support_ticket]

system_prompt = """You are a helpful customer support agent for TechCo.

Guidelines:

  1. Always be polite and empathetic
  2. Search the knowledge base before saying you don't know
  3. For order issues, always check the order status first
  4. Create a support ticket if you cannot resolve the issue
  5. Never make up information - if unsure, escalate to human support

Company info:

  • Return policy: 30 days, no questions asked
  • Support hours: 24/7 for chat, phone support 9am-5pm EST
  • Shipping: Free over $50, otherwise $5.99"""

agent = create_react_agent(llm, tools, state_modifier=system_prompt)

# Conversation handler with memory
class SupportBot:
    def __init__(self):
        self.history = []

    def chat(self, user_message: str) -> str:
        self.history.append({"role": "user", "content": user_message})

        result = agent.invoke({"messages": self.history})

        assistant_message = result["messages"][-1].content
        self.history.append({"role": "assistant", "content": assistant_message})

        return assistant_message

# Usage
bot = SupportBot()
print(bot.chat("Hi, I need help with my order #12345"))
print(bot.chat("When will it arrive?"))

Common Pitfalls and How to Avoid Them

❌ Pitfall: Ignoring Token Limits

LLMs have context limits. If you pass too much text, you'll get errors or truncated responses.

✓ Solution: Use text splitters and summarization for long documents. Monitor token usage with LangSmith.
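For a rough guardrail, count tokens before sending a large context to the model. A sketch with tiktoken (installed in Part 2); cl100k_base is used here as an approximation, since exact encodings vary by model:

Python token_check.py
import tiktoken

# cl100k_base is an approximation; the gpt-4o family uses o200k_base
# if your tiktoken version includes it.
encoding = tiktoken.get_encoding("cl100k_base")

def count_tokens(text: str) -> int:
    return len(encoding.encode(text))

context = "...your retrieved chunks or conversation history..."
if count_tokens(context) > 100_000:  # keep a margin below the model's context window
    print("Context too large: trim, summarize, or retrieve fewer chunks")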

❌ Pitfall: No Error Handling

API calls can fail. Rate limits hit. Networks timeout.

✓ Solution: Use retries, fallbacks, and proper exception handling. Always have a graceful degradation path.

❌ Pitfall: Not Testing Prompts

Prompts that work in testing may fail with real user input.

✓ Solution: Create evaluation datasets. Use LangSmith to track prompt performance. A/B test prompt variations.

❌ Pitfall: Exposing API Keys

Hardcoded keys or keys in version control are security risks.

✓ Solution: Use environment variables. Add .env to .gitignore. Use secrets management in production.
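A small habit that helps: fail fast at startup if a key is missing, instead of hitting a confusing auth error mid-request. A minimal sketch:

Python check_env.py
import os
from dotenv import load_dotenv

load_dotenv()  # local development; in production, inject secrets via your platform's secret manager

if not os.environ.get("OPENAI_API_KEY"):
    raise RuntimeError("OPENAI_API_KEY is not set. Check your .env file or secrets manager.")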

Next Steps

You now have the foundation to build powerful AI applications with LangChain. Here's where to go next:

📚 Learn More

  • LangChain documentation
  • LangGraph for complex workflows
  • LangSmith for observability

🛠️ Build Projects

  • Personal knowledge assistant
  • Code review bot
  • Content generation pipeline

Ready to dive deeper?

Check out our other guides on building AI applications, including tutorials on Claude, GPT-4, and more advanced patterns like multi-agent systems.

Tags: LangChain · AI Development · Python · RAG · AI Agents · LLM · Tutorial