The Complete LangChain Guide: Build AI Applications from Scratch
Tutorials · 35 min read · December 10, 2025


Master LangChain with this comprehensive guide. Learn to build chatbots, RAG systems, AI agents, and more with step-by-step code examples in Python and TypeScript.

LangChain has become the go-to framework for building AI applications. Whether you want to create a chatbot, build a document Q&A system, or deploy autonomous AI agents, LangChain provides the building blocks. This guide takes you from zero to production-ready applications.

What You'll Learn

  • LangChain fundamentals and architecture
  • Working with different LLM providers
  • Building conversational chatbots
  • Retrieval-Augmented Generation (RAG)
  • Creating autonomous AI agents
  • Production deployment strategies

Prerequisites: Basic Python or TypeScript knowledge. No prior AI/ML experience required—we'll explain everything from first principles.

Part 1: Understanding LangChain

What is LangChain?

LangChain is a framework for developing applications powered by large language models (LLMs). Think of it as the "Rails for AI"—it provides structure, conventions, and reusable components so you can focus on building your application instead of reinventing the wheel.

At its core, LangChain solves several problems:

  • 🔗 Chaining: Connect multiple LLM calls and operations together into coherent workflows
  • 🧠 Memory: Give your AI applications persistent context across conversations
  • 🔧 Tools: Let LLMs interact with external APIs, databases, and services

LangChain vs Alternatives

Before diving in, let's understand when to use LangChain versus alternatives:

Framework          Best For                             Learning Curve
LangChain          Complex applications, RAG, agents    Medium
LlamaIndex         Data indexing and retrieval          Low
Semantic Kernel    Enterprise .NET applications         Medium
Direct API         Simple, single-purpose apps          Low

Our recommendation: Use LangChain when you need chains, memory, or agents. For simple chat completions, the direct OpenAI/Anthropic APIs might be simpler.
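For comparison, here is what the "direct API" route looks like with the official OpenAI Python SDK. This is a minimal sketch, assuming the openai package is installed and OPENAI_API_KEY is set in your environment:

Python direct_api.py
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "What is LangChain in one sentence?"}],
)
print(response.choices[0].message.content)

This is perfectly fine for a single prompt-and-response. The value of LangChain shows up once you start composing prompts, retrievers, memory, and tools together.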

Part 2: Setting Up Your Environment

Installation

LangChain is available in both Python and JavaScript/TypeScript. We'll show examples in Python, but the concepts translate directly.

Terminal
# Create a virtual environment
python -m venv langchain-env
source langchain-env/bin/activate  # On Windows: langchain-env\Scripts\activate

# Install LangChain and dependencies
pip install langchain langchain-openai langchain-community

# For RAG applications, also install:
pip install chromadb tiktoken pypdf

API Keys Setup

Create a .env file in your project root:

.env
OPENAI_API_KEY=sk-your-openai-key-here
ANTHROPIC_API_KEY=sk-ant-your-anthropic-key-here

# Optional: for LangSmith tracing (highly recommended)
LANGCHAIN_TRACING_V2=true
LANGCHAIN_API_KEY=your-langsmith-key

⚠️ Security Note: Never commit your .env file to version control. Add it to your .gitignore.

Your First LangChain Program

Let's verify everything is working:

Python hello_langchain.py
from dotenv import load_dotenv
from langchain_openai import ChatOpenAI

# Load environment variables
load_dotenv()

# Initialize the model
llm = ChatOpenAI(model="gpt-4o-mini")

# Simple invocation
response = llm.invoke("What is LangChain in one sentence?")
print(response.content)

Run it:

$ python hello_langchain.py
LangChain is a framework for developing applications powered by language models,
enabling features like chaining, memory, and tool use.

Part 3: Core Concepts Deep Dive

The LangChain Expression Language (LCEL)

LCEL is LangChain's declarative way to compose chains. It uses the pipe operator (|) to connect components:

Python lcel_example.py
from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser

# Define components
prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a helpful assistant that explains concepts simply."),
    ("user", "Explain {topic} in {style} style.")
])

llm = ChatOpenAI(model="gpt-4o-mini")
output_parser = StrOutputParser()

# Chain them together with LCEL
chain = prompt | llm | output_parser

# Invoke the chain
result = chain.invoke({
    "topic": "quantum computing",
    "style": "explain like I'm 10"
})

print(result)

How it works:

  1. prompt takes your inputs and formats them into a proper message
  2. The formatted message goes to llm (the model)
  3. output_parser extracts the string content from the response
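The composed chain is not limited to invoke. LCEL runnables also expose batch and stream out of the box; a quick sketch, reusing the chain defined in lcel_example.py above:

Python lcel_example.py (continued)
# Process several inputs concurrently
results = chain.batch([
    {"topic": "black holes", "style": "one short paragraph"},
    {"topic": "photosynthesis", "style": "one short paragraph"},
])

# Stream the output as it is generated (chunks are plain strings
# because the chain ends with StrOutputParser)
for chunk in chain.stream({"topic": "gravity", "style": "a haiku"}):
    print(chunk, end="", flush=True)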

Prompt Templates

Prompt templates are reusable structures for your prompts. LangChain supports several types:

Python prompt_types.py
from langchain_core.prompts import (
    PromptTemplate,
    ChatPromptTemplate,
    FewShotPromptTemplate,
    MessagesPlaceholder
)

# 1. Simple string template
simple = PromptTemplate.from_template(
    "Write a {adjective} poem about {subject}."
)

# 2. Chat template (for chat models)
chat = ChatPromptTemplate.from_messages([
    ("system", "You are a {role}."),
    ("user", "{question}")
])

# 3. With message history placeholder
chat_with_history = ChatPromptTemplate.from_messages([
    ("system", "You are a helpful assistant."),
    MessagesPlaceholder(variable_name="history"),
    ("user", "{input}")
])

# 4. Few-shot template (for providing examples)
examples = [
    {"input": "happy", "output": "sad"},
    {"input": "tall", "output": "short"},
]

example_template = PromptTemplate(
    input_variables=["input", "output"],
    template="Input: {input}\nOutput: {output}"
)

few_shot = FewShotPromptTemplate(
    examples=examples,
    example_prompt=example_template,
    prefix="Give the opposite of each input.",
    suffix="Input: {input}\nOutput:",
    input_variables=["input"]
)

Output Parsers

Output parsers transform the LLM's response into structured data:

Python structured_output.py
from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate
from pydantic import BaseModel, Field
from typing import List

# Define your output structure
class MovieReview(BaseModel):
    """A structured movie review."""
    title: str = Field(description="The movie title")
    rating: int = Field(description="Rating from 1-10")
    pros: List[str] = Field(description="List of positive aspects")
    cons: List[str] = Field(description="List of negative aspects")
    summary: str = Field(description="One sentence summary")

# Use with_structured_output for guaranteed schema
llm = ChatOpenAI(model="gpt-4o-mini")
structured_llm = llm.with_structured_output(MovieReview)

prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a movie critic. Analyze the given movie."),
    ("user", "Review the movie: {movie}")
])

chain = prompt | structured_llm

# The result is a MovieReview object!
review = chain.invoke({"movie": "The Matrix"})
print(f"Rating: {review.rating}/10")
print(f"Pros: {review.pros}")

Part 4: Building a Chatbot with Memory

One of the most common LangChain use cases is building chatbots. The key challenge is maintaining conversation history—LLMs are stateless, so we need to manage memory ourselves.

Simple Chatbot Implementation

Python chatbot.py
from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
from langchain_core.messages import HumanMessage, AIMessage

# Setup
llm = ChatOpenAI(model="gpt-4o-mini")

prompt = ChatPromptTemplate.from_messages([
    ("system", """You are a helpful AI assistant. You remember the conversation
and can refer back to earlier messages. Be conversational and friendly."""),
    MessagesPlaceholder(variable_name="history"),
    ("user", "{input}")
])

chain = prompt | llm

# In-memory conversation history
conversation_history = []

def chat(user_input: str) -> str:
    """Send a message and get a response."""
    # Add user message to history
    conversation_history.append(HumanMessage(content=user_input))

    # Get response
    response = chain.invoke({
        "history": conversation_history[:-1],  # All except current
        "input": user_input
    })

    # Add AI response to history
    conversation_history.append(AIMessage(content=response.content))

    return response.content

# Example conversation
if __name__ == "__main__":
    print("Chatbot ready! Type 'quit' to exit.\n")

    while True:
        user_input = input("You: ")
        if user_input.lower() == "quit":
            break

        response = chat(user_input)
        print(f"Assistant: {response}\n")

Persistent Memory with Redis

For production applications, you'll want persistent memory. Here's how to use Redis:

Python chatbot_redis.py
from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
from langchain_community.chat_message_histories import RedisChatMessageHistory
from langchain_core.runnables.history import RunnableWithMessageHistory

llm = ChatOpenAI(model="gpt-4o-mini")

prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a helpful assistant."),
    MessagesPlaceholder(variable_name="history"),
    ("user", "{input}")
])

chain = prompt | llm

# Function to get session history from Redis
def get_session_history(session_id: str):
    return RedisChatMessageHistory(
        session_id=session_id,
        url="redis://localhost:6379"
    )

# Wrap chain with message history
chain_with_history = RunnableWithMessageHistory(
    chain,
    get_session_history,
    input_messages_key="input",
    history_messages_key="history"
)

# Use with session ID
response = chain_with_history.invoke(
    {"input": "Hi, my name is Alice!"},
    config={"configurable": {"session_id": "user_123"}}
)
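Because the history lives in Redis rather than in the process, a later call with the same session_id picks up where the conversation left off. A sketch continuing the example above:

Python chatbot_redis.py (continued)
# A second call with the same session ID reuses the stored history
follow_up = chain_with_history.invoke(
    {"input": "What's my name?"},
    config={"configurable": {"session_id": "user_123"}}
)
print(follow_up.content)  # should recall "Alice" from the stored history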

Part 5: Retrieval-Augmented Generation (RAG)

RAG is one of the most powerful patterns in LangChain. It lets your AI answer questions based on your own documents—essentially giving it a custom knowledge base.

How RAG Works

  1. Index: Split documents into chunks and create embeddings (numeric vectors; see the sketch after this list)
  2. Retrieve: When a question comes in, find relevant chunks
  3. Generate: Send the question + relevant context to the LLM
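"Create embeddings" in step 1 simply means turning text into numeric vectors so that similar text ends up close together in vector space. A minimal, illustrative sketch with OpenAIEmbeddings (the model and vector size shown are library defaults):

Python embeddings_sketch.py
from dotenv import load_dotenv
from langchain_openai import OpenAIEmbeddings

load_dotenv()

embeddings = OpenAIEmbeddings()
vector = embeddings.embed_query("What is the refund policy?")

print(len(vector))  # vector dimensionality (typically 1536 for OpenAI's default embedding model)
print(vector[:5])   # the first few numbers of the embedding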

Building a Document Q&A System

Let's build a complete RAG system that can answer questions about PDF documents:

Python rag_system.py
from langchain_openai import ChatOpenAI, OpenAIEmbeddings
from langchain_community.document_loaders import PyPDFLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain_community.vectorstores import Chroma
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.runnables import RunnablePassthrough
from langchain_core.output_parsers import StrOutputParser

# Step 1: Load and split documents
def load_documents(pdf_path: str):
    """Load PDF and split into chunks."""
    loader = PyPDFLoader(pdf_path)
    documents = loader.load()

    splitter = RecursiveCharacterTextSplitter(
        chunk_size=1000,
        chunk_overlap=200,
        separators=["\n\n", "\n", " ", ""]
    )

    return splitter.split_documents(documents)

# Step 2: Create vector store
def create_vectorstore(documents):
    """Create embeddings and store in Chroma."""
    embeddings = OpenAIEmbeddings()

    vectorstore = Chroma.from_documents(
        documents=documents,
        embedding=embeddings,
        persist_directory="./chroma_db"
    )

    return vectorstore

# Step 3: Create RAG chain
def create_rag_chain(vectorstore):
    """Create the RAG chain for Q&A."""

    retriever = vectorstore.as_retriever(
        search_type="similarity",
        search_kwargs={"k": 4}  # Return top 4 chunks
    )

    llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)

    prompt = ChatPromptTemplate.from_template("""
Answer the question based on the following context. If you cannot
find the answer in the context, say "I don't have enough information
to answer that question."

Context:
{context}

Question: {question}

Answer:""")

    # Format retrieved documents
    def format_docs(docs):
        return "\n\n".join(doc.page_content for doc in docs)

    # Build the chain
    chain = (
        {"context": retriever | format_docs, "question": RunnablePassthrough()}
        | prompt
        | llm
        | StrOutputParser()
    )

    return chain

# Usage
if __name__ == "__main__":
    # Load your documents
    docs = load_documents("your_document.pdf")
    print(f"Loaded {len(docs)} chunks")

    # Create vector store
    vectorstore = create_vectorstore(docs)
    print("Vector store created")

    # Create RAG chain
    rag_chain = create_rag_chain(vectorstore)

    # Ask questions!
    question = "What are the main topics covered in this document?"
    answer = rag_chain.invoke(question)
    print(f"Answer: {answer}")

Advanced RAG: Hybrid Search

For better retrieval, combine vector search with keyword search:

Python hybrid_search.py
from langchain.retrievers import EnsembleRetriever
from langchain_community.retrievers import BM25Retriever
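# Note: BM25Retriever depends on the rank_bm25 package (pip install rank_bm25)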

def create_hybrid_retriever(documents, vectorstore):
    """Combine vector search with BM25 keyword search."""

    # Vector retriever (semantic search)
    vector_retriever = vectorstore.as_retriever(search_kwargs={"k": 4})

    # BM25 retriever (keyword search)
    bm25_retriever = BM25Retriever.from_documents(documents)
    bm25_retriever.k = 4

    # Combine with ensemble
    ensemble_retriever = EnsembleRetriever(
        retrievers=[vector_retriever, bm25_retriever],
        weights=[0.6, 0.4]  # 60% vector, 40% keyword
    )

    return ensemble_retriever

Part 6: Building AI Agents

Agents are LLMs that can use tools and make decisions about which actions to take. They're the most powerful—and complex—part of LangChain.

Creating Custom Tools

Python custom_tools.py
from langchain_core.tools import tool
from langchain_openai import ChatOpenAI
from langgraph.prebuilt import create_react_agent
# Define custom tools
@tool
def get_weather(city: str) -> str:
    """Get the current weather for a city."""
    # In production, use a real weather API
    return f"The weather in {city} is 72°F and sunny."

@tool
def search_web(query: str) -> str:
    """Search the web for information."""
    # In production, use a real search API
    return f"Search results for '{query}': [Example results...]"

@tool
def calculate(expression: str) -> str:
    """Evaluate a mathematical expression."""
    try:
        result = eval(expression)  # Use a safe eval in production!
        return str(result)
    except Exception:
        return "Could not evaluate expression"

# Create the agent
tools = [get_weather, search_web, calculate]
llm = ChatOpenAI(model="gpt-4o-mini")

agent = create_react_agent(llm, tools)

# Run the agent
result = agent.invoke({
    "messages": [{"role": "user", "content": "What's the weather in Tokyo?"}]
})

print(result["messages"][-1].content)

Multi-Step Agent Example

Here's a more complex agent that can perform research tasks:

Python research_agent.py
from langchain_core.tools import tool
from langchain_openai import ChatOpenAI
from langchain_community.tools import DuckDuckGoSearchRun
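# Note: DuckDuckGoSearchRun requires the duckduckgo-search package
# (pip install duckduckgo-search)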
from langgraph.prebuilt import create_react_agent

# Real web search tool
search = DuckDuckGoSearchRun()

@tool
def web_search(query: str) -> str:
    """Search the web for current information."""
    return search.run(query)

@tool
def save_note(content: str) -> str:
    """Save a research note to a file."""
    with open("research_notes.txt", "a") as f:
        f.write(content + "\n\n")
    return "Note saved successfully."

@tool
def read_notes() -> str:
    """Read all saved research notes."""
    try:
        with open("research_notes.txt", "r") as f:
            return f.read()
    except FileNotFoundError:
        return "No notes saved yet."

# Create research agent
tools = [web_search, save_note, read_notes]
llm = ChatOpenAI(model="gpt-4o", temperature=0)

agent = create_react_agent(
    llm,
    tools,
    state_modifier="""You are a research assistant. When given a topic:
1. Search for relevant information
2. Save important findings as notes
3. Synthesize your findings into a summary

Be thorough but concise."""
)

# Example: Research a topic
result = agent.invoke({
    "messages": [{
        "role": "user",
        "content": "Research the latest developments in quantum computing in 2024"
    }]
})

Part 7: Production Deployment

Streaming Responses

For better UX, stream responses instead of waiting for the full completion:

Python streaming.py
from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate

llm = ChatOpenAI(model="gpt-4o-mini", streaming=True)

prompt = ChatPromptTemplate.from_template("Tell me a story about {topic}")
chain = prompt | llm

# Stream the response
for chunk in chain.stream({"topic": "a robot learning to paint"}):
    print(chunk.content, end="", flush=True)

Error Handling and Retries

Python robust_chain.py
from langchain_openai import ChatOpenAI
from langchain_core.rate_limiters import InMemoryRateLimiter

# Configure retries and timeouts
llm = ChatOpenAI(
    model="gpt-4o-mini",
    max_retries=3,
    timeout=30
)

# Add fallback model
fallback_llm = ChatOpenAI(model="gpt-3.5-turbo")
robust_llm = llm.with_fallbacks([fallback_llm])

# Use with rate limiting
rate_limiter = InMemoryRateLimiter(
    requests_per_second=1,
    check_every_n_seconds=0.1,
    max_bucket_size=10
)

llm_rate_limited = ChatOpenAI(
    model="gpt-4o-mini",
    rate_limiter=rate_limiter
)

LangSmith for Monitoring

LangSmith is essential for debugging and monitoring production applications:

.env
LANGCHAIN_TRACING_V2=true
LANGCHAIN_API_KEY=your-langsmith-api-key
LANGCHAIN_PROJECT=my-production-app

With tracing enabled, you get:

  • Full visibility into every chain execution
  • Latency breakdowns for each component
  • Token usage and cost tracking
  • Error logging and debugging
  • Dataset collection for fine-tuning
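You can also label individual runs so they are easy to find in the LangSmith UI. A sketch using the standard config argument that every LCEL runnable accepts (run_name, tags, and metadata), applied to the chain from Part 3:

Python langsmith_labels.py
# Reusing the LCEL chain from Part 3; with tracing enabled, this run
# appears in LangSmith under the given name, tags, and metadata.
result = chain.invoke(
    {"topic": "vector databases", "style": "two sentences"},
    config={
        "run_name": "explain-topic",
        "tags": ["tutorial", "v1"],
        "metadata": {"user_id": "user_123"},
    },
)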

Part 8: Real-World Project: AI Customer Support

Let's put everything together into a production-ready customer support bot:

Python support_bot.py
from langchain_openai import ChatOpenAI, OpenAIEmbeddings
from langchain_community.vectorstores import Chroma
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder
from langchain_core.tools import tool
from langgraph.prebuilt import create_react_agent
import json

# Initialize components
llm = ChatOpenAI(model="gpt-4o", temperature=0)
embeddings = OpenAIEmbeddings()

# Load knowledge base (your FAQ/documentation)
vectorstore = Chroma(
    persist_directory="./support_kb",
    embedding_function=embeddings
)
retriever = vectorstore.as_retriever(search_kwargs={"k": 3})

# Define tools
@tool
def search_knowledge_base(query: str) -> str:
    """Search the knowledge base for relevant information."""
    docs = retriever.invoke(query)
    return "\n\n".join([doc.page_content for doc in docs])

@tool
def check_order_status(order_id: str) -> str:
    """Check the status of an order by order ID."""
    # In production, query your database
    return json.dumps({
        "order_id": order_id,
        "status": "shipped",
        "estimated_delivery": "2024-01-15"
    })

@tool
def create_support_ticket(
    subject: str,
    description: str,
    priority: str = "medium"
) -> str:
    """Create a support ticket for issues that need human attention."""
    # In production, create in your ticketing system
    ticket_id = "TKT-12345"
    return f"Created ticket {ticket_id}: {subject}"

# Create the support agent
tools = [search_knowledge_base, check_order_status, create_support_ticket]

system_prompt = """You are a helpful customer support agent for TechCo.

Guidelines:

  1. Always be polite and empathetic
  2. Search the knowledge base before saying you don't know
  3. For order issues, always check the order status first
  4. Create a support ticket if you cannot resolve the issue
  5. Never make up information - if unsure, escalate to human support

Company info:

  • Return policy: 30 days, no questions asked
  • Support hours: 24/7 for chat, phone support 9am-5pm EST
  • Shipping: Free over $50, otherwise $5.99"""

agent = create_react_agent(llm, tools, state_modifier=system_prompt)

# Conversation handler with memory
class SupportBot:
    def __init__(self):
        self.history = []

    def chat(self, user_message: str) -> str:
        self.history.append({"role": "user", "content": user_message})

        result = agent.invoke({"messages": self.history})

        assistant_message = result["messages"][-1].content
        self.history.append({"role": "assistant", "content": assistant_message})

        return assistant_message

# Usage
bot = SupportBot()
print(bot.chat("Hi, I need help with my order #12345"))
print(bot.chat("When will it arrive?"))

Common Pitfalls and How to Avoid Them

❌ Pitfall: Ignoring Token Limits

LLMs have context limits. If you pass too much text, you'll get errors or truncated responses.

✓ Solution: Use text splitters and summarization for long documents. Monitor token usage with LangSmith.
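For a rough guardrail, count tokens before sending a large context to the model. A sketch with tiktoken (installed in Part 2); cl100k_base is used here as an approximation, since exact encodings vary by model:

Python token_check.py
import tiktoken

# cl100k_base is an approximation; the gpt-4o family uses o200k_base
# if your tiktoken version includes it.
encoding = tiktoken.get_encoding("cl100k_base")

def count_tokens(text: str) -> int:
    return len(encoding.encode(text))

context = "...your retrieved chunks or conversation history..."
if count_tokens(context) > 100_000:  # keep a margin below the model's context window
    print("Context too large: trim, summarize, or retrieve fewer chunks")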

❌ Pitfall: No Error Handling

API calls can fail. Rate limits hit. Networks timeout.

✓ Solution: Use retries, fallbacks, and proper exception handling. Always have a graceful degradation path.

❌ Pitfall: Not Testing Prompts

Prompts that work in testing may fail with real user input.

✓ Solution: Create evaluation datasets. Use LangSmith to track prompt performance. A/B test prompt variations.

❌ Pitfall: Exposing API Keys

Hardcoded keys or keys in version control are security risks.

✓ Solution: Use environment variables. Add .env to .gitignore. Use secrets management in production.
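A small habit that helps: fail fast at startup if a key is missing, instead of hitting a confusing auth error mid-request. A minimal sketch:

Python check_env.py
import os
from dotenv import load_dotenv

load_dotenv()  # local development; in production, inject secrets via your platform's secret manager

if not os.environ.get("OPENAI_API_KEY"):
    raise RuntimeError("OPENAI_API_KEY is not set. Check your .env file or secrets manager.")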

Next Steps

You now have the foundation to build powerful AI applications with LangChain. Here's where to go next:

📚 Learn More

  • LangChain documentation
  • LangGraph for complex workflows
  • LangSmith for observability

🛠️ Build Projects

  • Personal knowledge assistant
  • Code review bot
  • Content generation pipeline

Ready to dive deeper?

Check out our other guides on building AI applications, including tutorials on Claude, GPT-4, and more advanced patterns like multi-agent systems.

Tags: LangChain · AI Development · Python · RAG · AI Agents · LLM · Tutorial