Archaeologist · Field Notes from simonw/llm
Vol. I · Field Notes

simonw/llm

CLI utility and Python library for interacting with Large Language Models from providers such as OpenAI, Anthropic, and Google's Gemini, plus local models installed on your own machine.

8 May 2026 · a substantial project
Reading Posture
From the Field
Solid Swiss Army knife for LLM CLI work, but not revolutionary.
Verdict: Reach for it
Reach for it when

You want a single, scriptable CLI to chat with multiple LLM providers and store conversation history.

Look elsewhere when

You need deep model-specific features, advanced prompt chaining, or a GUI.

In context

It's like Ollama but with first-class support for OpenAI/Anthropic and a plugin system for custom models.

Complexity: ●● Light
Read time: ~30 minutes
Language: Python
Runtime: Python >= 3.10
Dependencies: 0 total

What using it looks like

Drawn from the project's README

From the README
pip install llm
Fig. 1 — example 1 of 6

What this is

As told to the tourist

What Is This?

LLM is a tool that lets you talk to AI language models (like ChatGPT, Claude, or Gemini) directly from your computer's command line or from your own Python code. Think of it as a universal remote control for all the different AI chatbots out there — instead of opening a dozen different websites or apps, you type a single command and get answers from whichever AI you want.
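
The Python side mirrors the CLI. A minimal sketch of the documented API, assuming an OpenAI API key has been configured (for example via llm keys set openai) and that the model alias is available in your version:

import llm

model = llm.get_model("gpt-4o-mini")  # look a model up by name or alias
response = model.prompt("Explain how DNS works in one paragraph")
print(response.text())  # wait for and print the full reply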

What Can You Do With It?

You could use this to ask questions without ever opening a browser. For example, you can type:

llm "Explain how DNS works in one paragraph"

And get an answer from OpenAI's GPT-4 or Anthropic's Claude right in your terminal. You could also save all your conversations automatically — every prompt and response gets stored in a local database, so you can search through your history later.

You could use it to generate and store "embeddings" (think of these as mathematical fingerprints of text that help you find similar content). Or you could ask the AI to extract structured data from messy text — like pulling names, dates, and prices out of an email and formatting them neatly.
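
On the command line, that embedding workflow looks roughly like this sketch. Here 3-small is an alias for one of the bundled OpenAI embedding models and needs an API key; the collection and item names are invented for illustration:

llm embed -m 3-small -c "A mathematical fingerprint of this sentence"
llm embed quotes q1 -m 3-small -c "It was the best of times"
llm similar quotes -c "a very good era"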

You can even give the AI the ability to run tools on your computer, like searching your files or running calculations, and then have it use those results to answer your questions.
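
Recent releases expose this through the --functions option. A rough sketch, assuming your default model supports tool calling:

llm --functions 'def multiply(a, b): return a * b' "What is 12345 times 6789?"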

llm "Explain how DNS works in one paragraph"

How It Works (No Jargon)

It's like a universal adapter for AI models. Just like a travel adapter lets you plug your phone into different wall sockets around the world, LLM lets you talk to different AI models using the same commands. You don't need to learn a new way of asking questions for each AI service.

It's like a tape recorder for your conversations. Every time you ask something, LLM automatically saves both your question and the AI's answer into a little database on your computer. Later, you can rewind and look up what you asked last week, or search through all your past conversations.
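
Rewinding looks something like this sketch (exact options vary by version; -q runs a full-text search over the saved history):

llm logs          # show recent prompts and responses
llm logs -q dns   # search everything you have asked about DNS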

It's like a toolbox with expandable drawers. The core tool is simple, but you can add new "plugins" (extra pieces of software) that give it new abilities. Want to use a model running on your own computer instead of through the cloud? There's a plugin for that. Need to connect to a new AI service that just launched? Someone probably already made a plugin.
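
For example (llm-gpt4all is one real plugin from the directory; the availability of specific plugins varies):

llm install llm-gpt4all   # add support for local GPT4All models
llm plugins               # list installed plugins
llm models                # list every model now available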

What's Cool About It?

The coolest thing is that it treats every AI model the same way. Whether you're using OpenAI's expensive flagship model or a free one running on your laptop, the commands are identical. You just swap the model name. This means you're not locked into any one company's AI — if one service gets too expensive or goes down, you switch with a single word change.
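
A sketch of that one-word swap (the second command assumes the llm-anthropic plugin is installed; model IDs depend on plugin versions):

llm -m gpt-4o-mini "Write a haiku about DNS"
llm -m claude-3.5-haiku "Write a haiku about DNS"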

Also, the fact that it saves everything locally is surprisingly powerful. You can ask "What did I ask about Python last Tuesday?" and get an instant answer, because all your history is stored on your machine, not in some cloud service you can't search.
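
Because the history is a plain SQLite file, you can poke at it directly. A sketch assuming the documented schema, which keeps prompts and replies in a responses table (sqlite-utils is a separate tool by the same author):

llm logs path   # print where the history database lives
sqlite-utils "$(llm logs path)" "select prompt from responses limit 5"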

Who Should Care?

Reach for this if you're a developer who spends a lot of time in the terminal and wants AI help without context-switching to a browser. Or if you're building your own Python applications and want to add AI features without rewriting the connection code for every different model provider. It's also great if you care about privacy and want to run models on your own computer.

Skip it if you're happy using ChatGPT's website or the Claude app, and you don't need to automate anything or save your conversations for later searching. If you never touch a command line, this tool isn't for you — it's built for people comfortable typing commands.

Start Here

A recommended reading path through the code

  1. llm/__init__.py (Core Abstractions)

     This module, `llm`, is the core of a CLI-driven LLM interaction framework.

  2. llm/cli.py (CLI Orchestration)

     This module is the CLI and orchestration layer for the `llm` package, built with Click and pluggy.

  3. llm/utils.py (Utility Services)

     This module provides foundational utility services for the LLM application.

  4. llm/models.py (Models & Embeddings)

     This module provides a unified abstraction layer for interacting with large language models and managing embeddings.

What's inside

4 sections of the codebase

Read Next

Where to go from here

Sibling Projects

Codebases that occupy adjacent space

Related Expeditions
llm · litellm · Ollama · GPT4All · OpenAI Python SDK · llm-cluster

Export & Share

Take the field notes with you

Words You'll Hear

Definitions for terms that appear throughout these notes

async generator

concept

A Python function that can produce multiple values over time, allowing other code to run while waiting for each value. It uses 'async' and 'yield' keywords to enable non-blocking streaming.
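
A generic illustration, not code from this project:

import asyncio

async def stream_words():
    # An async generator: yields values one at a time without blocking the event loop.
    for word in ["tokens", "arrive", "one", "by", "one"]:
        await asyncio.sleep(0.1)  # simulate waiting on the network
        yield word

async def main():
    async for word in stream_words():
        print(word)

asyncio.run(main())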

CLI facade

pattern

A command-line interface that acts as a simplified front-end to a more complex system, hiding internal complexity from the user. Users interact only with the CLI commands.

CLI-first

concept

A design approach where the primary way to interact with the software is through a command-line interface (typing commands in a terminal), rather than a graphical user interface.

Click

library

A Python library for building command-line interfaces with minimal boilerplate. It handles argument parsing, command grouping, and help text generation.

Composite Pattern

pattern

A design pattern that allows treating individual objects and groups of objects uniformly. The Part hierarchy lets you handle a single text part or a complex message with multiple parts the same way.

embeddings

concept

Numerical vector representations of text that capture semantic meaning, used for tasks like similarity search and clustering. They convert words or sentences into lists of numbers.

Factory Pattern

pattern

A design pattern that provides an interface for creating objects without specifying their exact class. The get_models_with_aliases function creates the right model instance based on a name.

hexagonal architecture

pattern

A design pattern that isolates core business logic from external systems (databases, APIs, UIs) using ports and adapters, making the core testable and swappable. This project explicitly does not use it.

Observer Pattern

pattern

A design pattern where an object (the subject) maintains a list of dependents (observers) and notifies them of state changes. The pluggy hook system implements this for plugin notifications.

pluggy

library

A Python library that enables a plugin system by allowing plugins to register hooks that the host application calls at specific points. It is the backbone of this project's extensibility.
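
A generic sketch of the pluggy handshake; the hook name here is a simplified stand-in for this project's real hooks:

import pluggy

hookspec = pluggy.HookspecMarker("demo")
hookimpl = pluggy.HookimplMarker("demo")

class DemoSpec:
    @hookspec
    def register_models(self):
        """Plugins implement this to contribute model names."""

class SamplePlugin:
    @hookimpl
    def register_models(self):
        return "sample-model"

pm = pluggy.PluginManager("demo")
pm.add_hookspecs(DemoSpec)
pm.register(SamplePlugin())
print(pm.hook.register_models())  # ['sample-model']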

plugin-based framework

pattern

A software structure that allows external code (plugins) to add or modify features without changing the core codebase. Plugins hook into predefined extension points.

plugin-core architecture

pattern

A software design where the core system provides essential functionality and defines extension points, while plugins add specialized features without modifying the core.

port/adapter separation

pattern

A design principle where interfaces (ports) are defined for external interactions, and concrete implementations (adapters) are written separately. This allows swapping databases or APIs without changing core logic.

pydantic

library

A Python library for data validation and settings management using Python type annotations. It ensures that data conforms to defined schemas and provides automatic serialization.
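
A generic pydantic v2 sketch, not code from this repository:

from pydantic import BaseModel, ValidationError

class Receipt(BaseModel):
    vendor: str
    total: float

r = Receipt.model_validate({"vendor": "ACME", "total": "19.99"})
print(r.total)  # 19.99, coerced from the string during validation

try:
    Receipt.model_validate({"vendor": "ACME"})
except ValidationError as err:
    print(err)  # reports that 'total' is missing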

serialization roundtrips

concept

The process of converting an object to a format (like JSON) for storage or transmission, and later reconstructing the original object from that format without losing information.

service layer

pattern

An architectural layer that contains business logic and coordinates operations between different parts of an application, sitting between the user interface and data access code.

setuptools entry points

concept

A mechanism in Python's packaging system that allows packages to advertise plugins or extensions. Other tools can discover and load these plugins automatically at runtime.
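
A small sketch of discovery from the consuming side; "llm" is the entry point group this project documents for its plugins:

from importlib.metadata import entry_points

for ep in entry_points(group="llm"):  # Python 3.10+ signature
    plugin = ep.load()  # import the advertised object
    print(ep.name, plugin)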

sqlite_utils

library

A Python library that provides convenient methods for creating, querying, and managing SQLite databases. It simplifies database operations like inserting rows and running migrations.
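
A minimal sketch of the library's feel (the table and column names are invented):

import sqlite_utils

db = sqlite_utils.Database(memory=True)  # pass a file path instead for persistence
db["responses"].insert({"prompt": "hi", "reply": "hello"})  # table created on first insert
print(list(db["responses"].rows))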

SQLite-backed

tool

Using the SQLite database engine to store and retrieve data persistently. SQLite is a lightweight, file-based database that requires no separate server process.

stateless factory

pattern

An object that creates other objects without remembering any previous interactions. Each call to the factory is independent and does not rely on stored state.

Strategy Pattern

pattern

A design pattern that defines a family of interchangeable algorithms, allowing the client to choose which one to use at runtime. Here, different LLM providers are strategies for generating responses.

streaming

concept

A technique where data is sent in small chunks as it becomes available, rather than waiting for the entire response. For LLMs, this means showing tokens one by one as they are generated.

structured output

concept

LLM responses formatted according to a predefined schema (e.g., JSON), rather than free-form text, making them easier for programs to parse and use.

Template Method Pattern

pattern

A design pattern that defines the skeleton of an algorithm in a base class, letting subclasses override specific steps without changing the overall structure. The Response class uses this for streaming.

tool calling

concept

The ability for an LLM to request execution of external functions (like searching a database or running code) and receive the results to incorporate into its response.