
Build an Agent

By themselves, language models can't take actions; they just output text. A big use case for LangChain is creating agents. Agents are systems that use an LLM as a reasoning engine to determine which actions to take and what the inputs to those actions should be. The results of those actions can then be fed back into the agent, which determines whether more actions are needed or whether it is okay to finish.

In this tutorial we will build an agent that can interact with multiple different tools: one being a local database, the other being a search engine. You will be able to ask this agent questions, watch it call tools, and have conversations with it.


Concepts we will cover are:

  • Using language models, in particular their tool calling ability
  • Creating a Retriever to expose specific information to our agent
  • Using a Search Tool to look up things online
  • Using LangGraph Agents which use an LLM to think about what to do and then execute upon that
  • Debugging and tracing your application using LangSmith

Setup: LangSmith

By definition, agents take a self-determined, input-dependent sequence of steps before returning a user-facing output. This makes debugging these systems particularly tricky, and observability particularly important. LangSmith is especially useful for such cases.

When building with LangChain, all steps will automatically be traced in LangSmith. To set up LangSmith, we just need to set the following environment variables:

export LANGCHAIN_TRACING_V2="true"
export LANGCHAIN_API_KEY="<your-api-key>"

Define tools

We first need to create the tools we want to use. We will use two tools: Tavily (to search online) and then a retriever over a local index we will create.


LangChain has a built-in tool that makes it easy to use the Tavily search engine as a tool. Note that this requires a Tavily API key set as an environment variable named TAVILY_API_KEY. They have a free tier, but if you don’t have a key or don’t want to create one, you can always skip this step.

import { TavilySearchResults } from "@langchain/community/tools/tavily_search";

const searchTool = new TavilySearchResults();

const toolResult = await searchTool.invoke("what is the weather in SF?");


[{"title":"Weather in December 2023 in San Francisco, California, USA","url":"","content":"Currently: 52 °F. Broken clouds. (Weather station: San Francisco International Airport, USA). See more current weather Select month: December 2023 Weather in San Francisco — Graph °F Sun, Dec 17 Lo:55 6 pm Hi:57 4 Mon, Dec 18 Lo:54 12 am Hi:55 7 Lo:54 6 am Hi:55 10 Lo:57 12 pm Hi:64 9 Lo:63 6 pm Hi:64 14 Tue, Dec 19 Lo:61","score":0.96006},...]
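The sample output above looks like a JSON array of hits, each with a title, url, content, and score. As a minimal sketch, assuming the tool output is a JSON string with that shape, we could parse it and pick the highest-scoring hit. The `SearchHit` interface and the `topHit` helper are hypothetical names introduced here for illustration; they are not part of LangChain or Tavily.

```typescript
// Shape assumed from the sample output above; the real payload may carry more fields.
interface SearchHit {
  title: string;
  url: string;
  content: string;
  score: number;
}

// Hypothetical helper: parse the tool's string output and return the best-scoring hit.
function topHit(rawToolOutput: string): SearchHit | undefined {
  const hits = JSON.parse(rawToolOutput) as SearchHit[];
  // Copy before sorting so the parsed array is left untouched.
  return [...hits].sort((a, b) => b.score - a.score)[0];
}

// Made-up sample data standing in for a real search response.
const sampleOutput = JSON.stringify([
  { title: "A", url: "", content: "cloudy", score: 0.41 },
  { title: "B", url: "", content: "52 °F, broken clouds", score: 0.96 },
]);

const bestHit = topHit(sampleOutput); // the hit titled "B"
const noHit = topHit("[]"); // undefined when there are no results
```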


We will also create a retriever over some data of our own. For a deeper explanation of each step here, see our how-to guides.

import { RecursiveCharacterTextSplitter } from "langchain/text_splitter";
import { CheerioWebBaseLoader } from "langchain/document_loaders/web/cheerio";
import { MemoryVectorStore } from "langchain/vectorstores/memory";
import { OpenAIEmbeddings } from "@langchain/openai";

const loader = new CheerioWebBaseLoader(
  "<url-of-the-page-to-index>"
);
const rawDocs = await loader.load();

const splitter = new RecursiveCharacterTextSplitter({
  chunkSize: 1000,
  chunkOverlap: 200,
});
const docs = await splitter.splitDocuments(rawDocs);

const vectorstore = await MemoryVectorStore.fromDocuments(
  docs,
  new OpenAIEmbeddings()
);
const retriever = vectorstore.asRetriever();

const retrieverResult = await retriever.invoke(
  "how to upload a dataset"
);

Document {
pageContent: "your application progresses through the beta testing phase, it's essential to continue collecting data to refine and improve its performance. LangSmith enables you to add runs as examples to datasets (from both the project page and within an annotation queue), expanding your test coverage on real-world scenarios. This is a key benefit in having your logging system and your evaluation/testing system in the same platform.Production​Closely inspecting key data points, growing benchmarking datasets, annotating traces, and drilling down into important data in trace view are workflows you’ll also want to do once your app hits production. However, especially at the production stage, it’s crucial to get a high-level overview of application performance with respect to latency, cost, and feedback scores. This ensures that it's delivering desirable results at scale.Monitoring and A/B Testing​LangSmith provides monitoring charts that allow you to track key metrics over time. You can expand to",
metadata: {
  source: '',
  loc: { lines: [Object] }
}
}
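To build some intuition for what chunkSize and chunkOverlap control, here is a simplified sketch of fixed-window splitting. It is not the algorithm RecursiveCharacterTextSplitter uses (that splitter also tries to break on natural separators); `splitWithOverlap` is a hypothetical helper that only shows how consecutive chunks share their boundary characters.

```typescript
// Simplified fixed-window splitter, for illustration only.
// Each chunk has at most `chunkSize` characters and repeats the last
// `chunkOverlap` characters of the previous chunk.
function splitWithOverlap(
  text: string,
  chunkSize: number,
  chunkOverlap: number,
): string[] {
  if (chunkSize <= 0 || chunkOverlap < 0 || chunkOverlap >= chunkSize) {
    throw new Error("need 0 <= chunkOverlap < chunkSize");
  }
  const step = chunkSize - chunkOverlap; // how far the window advances each time
  const result: string[] = [];
  for (let start = 0; start < text.length; start += step) {
    result.push(text.slice(start, start + chunkSize));
    // Stop once the current window already reaches the end of the text.
    if (start + chunkSize >= text.length) break;
  }
  return result;
}

const pieces = splitWithOverlap("abcdefghij", 4, 2);
// pieces: ["abcd", "cdef", "efgh", "ghij"]
```

The overlap means a sentence that straddles a chunk boundary still appears intact in at least one chunk, which tends to help retrieval at the cost of storing some text twice.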

Now that we have populated the index that we will be doing retrieval over, we can easily turn it into a tool (the format needed for an agent to properly use it):

import { createRetrieverTool } from "langchain/tools/retriever";

const retrieverTool = createRetrieverTool(retriever, {
  name: "langsmith_search",
  description:
    "Search for information about LangSmith. For any questions about LangSmith, you must use this tool!",
});


Now that we have created both, we can create a list of tools that we will use downstream:

const tools = [searchTool, retrieverTool];
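Conceptually, a tool is a name, a description the model reads when deciding what to call, and a function that takes the model's input and returns text. The sketch below illustrates that idea with a keyword-matching stand-in for a retriever. `SimpleTool`, `retrieve`, `makeRetrieverTool`, and the two-sentence corpus are all hypothetical and far simpler than what createRetrieverTool produces.

```typescript
// Hypothetical minimal tool shape: what the model sees (name, description)
// plus the function that runs when the tool is chosen.
interface SimpleTool {
  name: string;
  description: string;
  call: (query: string) => string;
}

// Made-up corpus standing in for an indexed document collection.
const corpus = [
  "LangSmith lets you upload datasets.",
  "Tavily is a search engine.",
];

// Stand-in retriever: keyword match instead of embedding similarity.
function retrieve(query: string): string[] {
  const words = query.toLowerCase().split(/\s+/).filter(Boolean);
  return corpus.filter((doc) =>
    words.some((w) => doc.toLowerCase().includes(w)),
  );
}

// Wrap the retriever so its results come back as one text observation.
function makeRetrieverTool(name: string, description: string): SimpleTool {
  return {
    name,
    description,
    call: (query) => retrieve(query).join("\n\n"),
  };
}

const sketchTool = makeRetrieverTool(
  "langsmith_search",
  "Search for information about LangSmith.",
);
const observation = sketchTool.call("datasets");
// observation: "LangSmith lets you upload datasets."
```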

Create the agent

Now that we have defined the tools, we can create the agent. We will be using an OpenAI Functions agent - for more information on this type of agent, as well as other options, see this guide.

First, we choose the LLM we want to be guiding the agent.

import { ChatOpenAI } from "@langchain/openai";

const llm = new ChatOpenAI({
  model: "gpt-3.5-turbo",
  temperature: 0,
});

Next, we choose the prompt we want to use to guide the agent:

import type { ChatPromptTemplate } from "@langchain/core/prompts";
import { pull } from "langchain/hub";

// Get the prompt to use - you can modify this!
// If you want to see the prompt in full, you can at:
const prompt = await pull<ChatPromptTemplate>(

Now, we can initialize the agent with the LLM, the prompt, and the tools. The agent is responsible for taking in input and deciding what actions to take. Crucially, the agent does not execute those actions; that is done by the AgentExecutor (next step). For more information about how to think about these components, see our conceptual guide.

import { createOpenAIFunctionsAgent } from "langchain/agents";

const agent = await createOpenAIFunctionsAgent({
  llm,
  tools,
  prompt,
});

Finally, we combine the agent (the brains) with the tools inside the AgentExecutor (which will repeatedly call the agent and execute tools).

import { AgentExecutor } from "langchain/agents";

const agentExecutor = new AgentExecutor({
  agent,
  tools,
});
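To make the split between deciding and executing concrete, here is a rough sketch of the loop an executor runs: ask the agent for a decision, run the chosen tool, record the observation, and repeat until the agent finishes. This is an illustration under simplifying assumptions, not the AgentExecutor implementation. `Decision`, `Step`, `fakeAgent`, and `runExecutor` are hypothetical names, and `fakeAgent` is a hard-coded stand-in for the LLM.

```typescript
type ToolFn = (input: string) => string;

// What the decision step may return: call a tool, or finish with an answer.
type Decision =
  | { kind: "action"; tool: string; toolInput: string }
  | { kind: "finish"; output: string };

// One completed tool call, fed back to the agent on the next iteration.
interface Step {
  tool: string;
  toolInput: string;
  observation: string;
}

// Hypothetical stand-in for the LLM: asks for one lookup, then finishes.
function fakeAgent(input: string, steps: Step[]): Decision {
  const last = steps[steps.length - 1];
  if (last === undefined) {
    return { kind: "action", tool: "langsmith_search", toolInput: input };
  }
  return { kind: "finish", output: `Based on: ${last.observation}` };
}

// The executor owns the loop and is the only place where tools are run.
function runExecutor(
  decide: (input: string, steps: Step[]) => Decision,
  toolsByName: Record<string, ToolFn>,
  input: string,
  maxIterations = 5,
): string {
  const steps: Step[] = [];
  for (let i = 0; i < maxIterations; i++) {
    const decision = decide(input, steps);
    if (decision.kind === "finish") return decision.output;
    const toolFn = toolsByName[decision.tool];
    const observation = toolFn
      ? toolFn(decision.toolInput)
      : `Unknown tool: ${decision.tool}`;
    steps.push({
      tool: decision.tool,
      toolInput: decision.toolInput,
      observation,
    });
  }
  // Guard against an agent that never decides to finish.
  return "Stopped: iteration limit reached.";
}

const answer = runExecutor(
  fakeAgent,
  { langsmith_search: (q) => `docs about "${q}"` },
  "testing",
);
// answer: 'Based on: docs about "testing"'
```

The iteration limit is a design choice in this sketch: because the agent decides its own sequence of steps, a cap keeps a looping agent from running forever.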

Run the agent

We can now run the agent on a few queries! Note that for now, these are all stateless queries (it won’t remember previous interactions).

const result1 = await agentExecutor.invoke({
  input: "hi!",
});

[chain/start] [1:chain:AgentExecutor] Entering Chain run with input: {
  "input": "hi!"
}
[chain/end] [1:chain:AgentExecutor] [1.36s] Exiting Chain run with output: {
  "output": "Hello! How can I assist you today?"
}
{
  input: 'hi!',
  output: 'Hello! How can I assist you today?'
}

const result2 = await agentExecutor.invoke({
  input: "how can langsmith help with testing?",
});

[chain/start] [1:chain:AgentExecutor] Entering Chain run with input: {
"input": "how can langsmith help with testing?"
[chain/end] [1:chain:AgentExecutor > 2:chain:RunnableAgent > 7:parser:OpenAIFunctionsAgentOutputParser] [66ms] Exiting Chain run with output: {
"tool": "langsmith_search",
"toolInput": {
"query": "how can LangSmith help with testing?"
"log": "Invoking \"langsmith_search\" with {\"query\":\"how can LangSmith help with testing?\"}\n",
"messageLog": [
"lc": 1,
"type": "constructor",
"id": [
"kwargs": {
"content": "",
"additional_kwargs": {
"function_call": {
"name": "langsmith_search",
"arguments": "{\"query\":\"how can LangSmith help with testing?\"}"
[tool/start] [1:chain:AgentExecutor > 8:tool:langsmith_search] Entering Tool run with input: "{"query":"how can LangSmith help with testing?"}"
[retriever/start] [1:chain:AgentExecutor > 8:tool:langsmith_search > 9:retriever:VectorStoreRetriever] Entering Retriever run with input: {
"query": "how can LangSmith help with testing?"
[retriever/end] [1:chain:AgentExecutor > 8:tool:langsmith_search > 9:retriever:VectorStoreRetriever] [294ms] Exiting Retriever run with output: {
"documents": [
"pageContent": "You can also quickly edit examples and add them to datasets to expand the surface area of your evaluation sets or to fine-tune a model for improved quality or reduced costs.Monitoring​After all this, your app might finally ready to go in production. LangSmith can also be used to monitor your application in much the same way that you used for debugging. You can log all traces, visualize latency and token usage statistics, and troubleshoot specific issues as they arise. Each run can also be assigned string tags or key-value metadata, allowing you to attach correlation ids or AB test variants, and filter runs accordingly.We’ve also made it possible to associate feedback programmatically with runs. This means that if your application has a thumbs up/down button on it, you can use that to log feedback back to LangSmith. This can be used to track performance over time and pinpoint under performing data points, which you can subsequently add to a dataset for future testing — mirroring the",
"metadata": {
"source": "",
"loc": {
"lines": {
"from": 11,
"to": 11
"pageContent": "the time that we do… it’s so helpful. We can use LangSmith to debug:An unexpected end resultWhy an agent is loopingWhy a chain was slower than expectedHow many tokens an agent usedDebugging​Debugging LLMs, chains, and agents can be tough. LangSmith helps solve the following pain points:What was the exact input to the LLM?​LLM calls are often tricky and non-deterministic. The inputs/outputs may seem straightforward, given they are technically string → string (or chat messages → chat message), but this can be misleading as the input string is usually constructed from a combination of user input and auxiliary functions.Most inputs to an LLM call are a combination of some type of fixed template along with input variables. These input variables could come directly from user input or from an auxiliary function (like retrieval). By the time these input variables go into the LLM they will have been converted to a string format, but often times they are not naturally represented as a string",
"metadata": {
"source": "",
"loc": {
"lines": {
"from": 3,
"to": 3
"pageContent": "inputs, and see what happens. At some point though, our application is performing\nwell and we want to be more rigorous about testing changes. We can use a dataset\nthat we’ve constructed along the way (see above). Alternatively, we could spend some\ntime constructing a small dataset by hand. For these situations, LangSmith simplifies",
"metadata": {
"source": "",
"loc": {
"lines": {
"from": 4,
"to": 7
"pageContent": "feedback back to LangSmith. This can be used to track performance over time and pinpoint under performing data points, which you can subsequently add to a dataset for future testing — mirroring the debug mode approach.We’ve provided several examples in the LangSmith documentation for extracting insights from logged runs. In addition to guiding you on performing this task yourself, we also provide examples of integrating with third parties for this purpose. We're eager to expand this area in the coming months! If you have ideas for either -- an open-source way to evaluate, or are building a company that wants to do analytics over these runs, please reach out.Exporting datasets​LangSmith makes it easy to curate datasets. However, these aren’t just useful inside LangSmith; they can be exported for use in other contexts. Notable applications include exporting for use in OpenAI Evals or fine-tuning, such as with FireworksAI.To set up tracing in Deno, web browsers, or other runtime",
"metadata": {
"source": "",
"loc": {
"lines": {
"from": 11,
"to": 11
[chain/start] [1:chain:AgentExecutor > 10:chain:RunnableAgent] Entering Chain run with input: {
"input": "how can langsmith help with testing?",
"steps": [
"action": {
"tool": "langsmith_search",
"toolInput": {
"query": "how can LangSmith help with testing?"
"log": "Invoking \"langsmith_search\" with {\"query\":\"how can LangSmith help with testing?\"}\n",
"messageLog": [
"lc": 1,
"type": "constructor",
"id": [
"kwargs": {
"content": "",
"additional_kwargs": {
"function_call": {
"name": "langsmith_search",
"arguments": "{\"query\":\"how can LangSmith help with testing?\"}"
"observation": "You can also quickly edit examples and add them to datasets to expand the surface area of your evaluation sets or to fine-tune a model for improved quality or reduced costs.Monitoring​After all this, your app might finally ready to go in production. LangSmith can also be used to monitor your application in much the same way that you used for debugging. You can log all traces, visualize latency and token usage statistics, and troubleshoot specific issues as they arise. Each run can also be assigned string tags or key-value metadata, allowing you to attach correlation ids or AB test variants, and filter runs accordingly.We’ve also made it possible to associate feedback programmatically with runs. This means that if your application has a thumbs up/down button on it, you can use that to log feedback back to LangSmith. This can be used to track performance over time and pinpoint under performing data points, which you can subsequently add to a dataset for future testing — mirroring the\n\nthe time that we do… it’s so helpful. We can use LangSmith to debug:An unexpected end resultWhy an agent is loopingWhy a chain was slower than expectedHow many tokens an agent usedDebugging​Debugging LLMs, chains, and agents can be tough. LangSmith helps solve the following pain points:What was the exact input to the LLM?​LLM calls are often tricky and non-deterministic. The inputs/outputs may seem straightforward, given they are technically string → string (or chat messages → chat message), but this can be misleading as the input string is usually constructed from a combination of user input and auxiliary functions.Most inputs to an LLM call are a combination of some type of fixed template along with input variables. These input variables could come directly from user input or from an auxiliary function (like retrieval). 
By the time these input variables go into the LLM they will have been converted to a string format, but often times they are not naturally represented as a string\n\ninputs, and see what happens. At some point though, our application is performing\nwell and we want to be more rigorous about testing changes. We can use a dataset\nthat we’ve constructed along the way (see above). Alternatively, we could spend some\ntime constructing a small dataset by hand. For these situations, LangSmith simplifies\n\nfeedback back to LangSmith. This can be used to track performance over time and pinpoint under performing data points, which you can subsequently add to a dataset for future testing — mirroring the debug mode approach.We’ve provided several examples in the LangSmith documentation for extracting insights from logged runs. In addition to guiding you on performing this task yourself, we also provide examples of integrating with third parties for this purpose. We're eager to expand this area in the coming months! If you have ideas for either -- an open-source way to evaluate, or are building a company that wants to do analytics over these runs, please reach out.Exporting datasets​LangSmith makes it easy to curate datasets. However, these aren’t just useful inside LangSmith; they can be exported for use in other contexts. Notable applications include exporting for use in OpenAI Evals or fine-tuning, such as with FireworksAI.To set up tracing in Deno, web browsers, or other runtime"
[chain/end] [1:chain:AgentExecutor] [5.83s] Exiting Chain run with output: {
"input": "how can langsmith help with testing?",
"output": "LangSmith can help with testing in several ways:\n\n1. Debugging: LangSmith can be used to debug unexpected end results, agent loops, slow chains, and token usage. It helps in pinpointing underperforming data points and tracking performance over time.\n\n2. Monitoring: LangSmith can monitor applications by logging all traces, visualizing latency and token usage statistics, and troubleshooting specific issues as they arise. It also allows for associating feedback programmatically with runs, which can be used to track performance over time.\n\n3. Exporting Datasets: LangSmith makes it easy to curate datasets, which can be exported for use in other contexts such as OpenAI Evals or fine-tuning with FireworksAI.\n\nOverall, LangSmith simplifies the process of testing changes, constructing datasets, and extracting insights from logged runs, making it a valuable tool for testing and evaluation."
{
  input: 'how can langsmith help with testing?',
  output: 'LangSmith can help with testing in several ways:\n' +
    '\n' +
    '1. Initial Test Set: LangSmith allows developers to create datasets of inputs and reference outputs to run tests on their LLM applications. These test cases can be uploaded in bulk, created on the fly, or exported from application traces.\n' +
    '\n' +
    "2. Comparison View: When making changes to your applications, LangSmith provides a comparison view to see whether you've regressed with respect to your initial test cases. This is helpful for evaluating changes in prompts, retrieval strategies, or model choices.\n" +
    '\n' +
    '3. Monitoring and A/B Testing: LangSmith provides monitoring charts to track key metrics over time and allows for A/B testing changes in prompt, model, or retrieval strategy.\n' +
    '\n' +
    '4. Debugging: LangSmith offers tracing and debugging information at each step of an LLM sequence, making it easier to identify and root-cause issues when things go wrong.\n' +
    '\n' +
    '5. Beta Testing and Production: LangSmith enables the addition of runs as examples to datasets, expanding test coverage on real-world scenarios. It also provides monitoring for application performance with respect to latency, cost, and feedback scores at the production stage.\n' +
    '\n' +
    'Overall, LangSmith provides comprehensive testing and monitoring capabilities for LLM applications.'
}

Adding in memory

As mentioned earlier, this agent is stateless. This means it does not remember previous interactions. To give it memory we need to pass in previous chat_history.

Note: the input variable below needs to be called chat_history because of the prompt we are using. If we use a different prompt, we could change the variable name.

const result3 = await agentExecutor.invoke({
  input: "hi! my name is cob.",
  chat_history: [],
});

{
  input: 'hi! my name is cob.',
  chat_history: [],
  output: "Hello Cob! It's nice to meet you. How can I assist you today?"
}

import { HumanMessage, AIMessage } from "@langchain/core/messages";

const result4 = await agentExecutor.invoke({
  input: "what's my name?",
  chat_history: [
    new HumanMessage("hi! my name is cob."),
    new AIMessage("Hello Cob! How can I assist you today?"),
  ],
});

{
  input: "what's my name?",
  chat_history: [
    HumanMessage {
      content: 'hi! my name is cob.',
      additional_kwargs: {}
    },
    AIMessage {
      content: 'Hello Cob! How can I assist you today?',
      additional_kwargs: {}
    }
  ],
  output: 'Your name is Cob. How can I assist you today, Cob?'
}

If we want to keep track of these messages automatically, we can wrap this in a RunnableWithMessageHistory. For more information on how to use this, see this guide.

import { ChatMessageHistory } from "langchain/stores/message/in_memory";
import { RunnableWithMessageHistory } from "@langchain/core/runnables";

const messageHistory = new ChatMessageHistory();

const agentWithChatHistory = new RunnableWithMessageHistory({
  runnable: agentExecutor,
  // This is needed because in most real world scenarios, a session id is needed per user.
  // It isn't really used here because we are using a simple in memory ChatMessageHistory.
  getMessageHistory: (_sessionId) => messageHistory,
  inputMessagesKey: "input",
  historyMessagesKey: "chat_history",
});

const result5 = await agentWithChatHistory.invoke(
  {
    input: "hi! i'm cob",
  },
  {
    // This is needed because in most real world scenarios, a session id is needed per user.
    // It isn't really used here because we are using a simple in memory ChatMessageHistory.
    configurable: {
      sessionId: "foo",
    },
  }
);

{
  input: "hi! i'm cob",
  chat_history: [
    HumanMessage {
      content: "hi! i'm cob",
      additional_kwargs: {}
    },
    AIMessage {
      content: 'Hello Cob! How can I assist you today?',
      additional_kwargs: {}
    }
  ],
  output: 'Hello Cob! How can I assist you today?'
}
const result6 = await agentWithChatHistory.invoke(
  {
    input: "what's my name?",
  },
  {
    // This is needed because in most real world scenarios, a session id is needed per user.
    // It isn't really used here because we are using a simple in memory ChatMessageHistory.
    configurable: {
      sessionId: "foo",
    },
  }
);

{
  input: "what's my name?",
  chat_history: [
    HumanMessage {
      content: "hi! i'm cob",
      additional_kwargs: {}
    },
    AIMessage {
      content: 'Hello Cob! How can I assist you today?',
      additional_kwargs: {}
    },
    HumanMessage {
      content: "what's my name?",
      additional_kwargs: {}
    },
    AIMessage {
      content: 'Your name is Cob. How can I assist you today, Cob?',
      additional_kwargs: {}
    }
  ],
  output: 'Your name is Cob. How can I assist you today, Cob?'
}
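The pattern behind this wrapper can be sketched in a few lines: look up the history for a session, pass it to the stateless runnable alongside the new input, then append both the input and the output to that session's history. The code below is a simplified illustration, not how RunnableWithMessageHistory is implemented. `Message`, `echoName`, and `withMessageHistory` are hypothetical names, and `echoName` is a toy stand-in for the agent.

```typescript
interface Message {
  role: "human" | "ai";
  content: string;
}

type StatelessRunnable = (args: {
  input: string;
  chat_history: Message[];
}) => string;

// Toy stateless "agent": it only knows what is passed in chat_history.
const echoName: StatelessRunnable = ({ input, chat_history }) => {
  if (/what's my name/i.test(input)) {
    for (const m of chat_history) {
      const match = m.role === "human" ? /i'm (\w+)/i.exec(m.content) : null;
      if (match) return `Your name is ${match[1]}.`;
    }
    return "I don't know your name yet.";
  }
  return "Hello!";
};

// Wrap a stateless runnable so that history is tracked per session id.
function withMessageHistory(runnable: StatelessRunnable) {
  const sessions = new Map<string, Message[]>();
  return (input: string, sessionId: string): string => {
    const history = sessions.get(sessionId) ?? [];
    const output = runnable({ input, chat_history: [...history] });
    // Record both sides of the turn for the next call in this session.
    sessions.set(sessionId, [
      ...history,
      { role: "human", content: input },
      { role: "ai", content: output },
    ]);
    return output;
  };
}

const chat = withMessageHistory(echoName);
chat("hi! i'm cob", "foo");
const sameSession = chat("what's my name?", "foo"); // "Your name is cob."
const otherSession = chat("what's my name?", "bar"); // "I don't know your name yet."
```

Keying the store by session id is what lets one wrapped agent serve several users without mixing their conversations, which is why the real wrapper asks for a sessionId even when a single in-memory history is used.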


That’s a wrap! In this quick start we covered how to create a simple agent. Agents are a complex topic, and there’s a lot to learn! Head back to the main agent page to find more resources on conceptual guides, different types of agents, how to create custom tools, and more!
