Archaeologist · Field Notes from simonw/llm
Vol. I · Field Notes

simonw/llm

CLI utility and Python library for interacting with Large Language Models from providers such as OpenAI, Anthropic, and Google's Gemini, plus local models installed on your own machine.

8 May 2026 · a substantial project
Reading Posture
From the Field
Solid Swiss Army knife for LLM CLI work, but not revolutionary.
Verdict: Reach for it
Reach for it when

You want a single, scriptable CLI to chat with multiple LLM providers and store conversation history.

Look elsewhere when

You need deep model-specific features, advanced prompt chaining, or a GUI.

In context

It's like Ollama but with first-class support for OpenAI/Anthropic and a plugin system for custom models.

Complexity: ●● Light
Read time: ~30 minutes
Language: Python
Runtime: Python >= 3.10
Dependencies: 0 total

What using it looks like

Drawn from the project's README

From the README
pip install llm
Fig. 1 — example 1 of 6

What this is

As told to the tourist

What Is This?

LLM is a tool that lets you talk to AI language models (like ChatGPT, Claude, or Gemini) directly from your computer's command line or from your own Python code. Think of it as a universal remote control for all the different AI chatbots out there — instead of opening a dozen different websites or apps, you type a single command and get answers from whichever AI you want.
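
The Python side mirrors the CLI. A minimal sketch of the documented API, assuming an OpenAI API key has been configured (for example via llm keys set openai) and that the model alias is available in your version:

import llm

model = llm.get_model("gpt-4o-mini")  # look a model up by name or alias
response = model.prompt("Explain how DNS works in one paragraph")
print(response.text())  # wait for and print the full reply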

What Can You Do With It?

You could use this to ask questions without ever opening a browser. For example, you can type:

llm "Explain how DNS works in one paragraph"

And get an answer from OpenAI's GPT-4 or Anthropic's Claude right in your terminal. You could also save all your conversations automatically — every prompt and response gets stored in a local database, so you can search through your history later.

You could use it to generate and store "embeddings" (think of these as mathematical fingerprints of text that help you find similar content). Or you could ask the AI to extract structured data from messy text — like pulling names, dates, and prices out of an email and formatting them neatly.
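
On the command line, that embedding workflow looks roughly like this sketch. Here 3-small is an alias for one of the bundled OpenAI embedding models and needs an API key; the collection and item names are invented for illustration:

llm embed -m 3-small -c "A mathematical fingerprint of this sentence"
llm embed quotes q1 -m 3-small -c "It was the best of times"
llm similar quotes -c "a very good era"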

You can even give the AI the ability to run tools on your computer, like searching your files or running calculations, and then have it use those results to answer your questions.
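
Recent releases expose this through the --functions option. A rough sketch, assuming your default model supports tool calling:

llm --functions 'def multiply(a, b): return a * b' "What is 12345 times 6789?"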

llm "Explain how DNS works in one paragraph"

How It Works (No Jargon)

It's like a universal adapter for AI models. Just like a travel adapter lets you plug your phone into different wall sockets around the world, LLM lets you talk to different AI models using the same commands. You don't need to learn a new way of asking questions for each AI service.

It's like a tape recorder for your conversations. Every time you ask something, LLM automatically saves both your question and the AI's answer into a little database on your computer. Later, you can rewind and look up what you asked last week, or search through all your past conversations.
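
Rewinding looks something like this sketch (exact options vary by version; -q runs a full-text search over the saved history):

llm logs          # show recent prompts and responses
llm logs -q dns   # search everything you have asked about DNS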

It's like a toolbox with expandable drawers. The core tool is simple, but you can add new "plugins" (extra pieces of software) that give it new abilities. Want to use a model running on your own computer instead of through the cloud? There's a plugin for that. Need to connect to a new AI service that just launched? Someone probably already made a plugin.
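
For example (llm-gpt4all is one real plugin from the directory; the availability of specific plugins varies):

llm install llm-gpt4all   # add support for local GPT4All models
llm plugins               # list installed plugins
llm models                # list every model now available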

What's Cool About It?

The coolest thing is that it treats every AI model the same way. Whether you're using OpenAI's expensive flagship model or a free one running on your laptop, the commands are identical. You just swap the model name. This means you're not locked into any one company's AI — if one service gets too expensive or goes down, you switch with a single word change.
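
A sketch of that one-word swap (the second command assumes the llm-anthropic plugin is installed; model IDs depend on plugin versions):

llm -m gpt-4o-mini "Write a haiku about DNS"
llm -m claude-3.5-haiku "Write a haiku about DNS"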

Also, the fact that it saves everything locally is surprisingly powerful. You can ask "What did I ask about Python last Tuesday?" and get an instant answer, because all your history is stored on your machine, not in some cloud service you can't search.
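
Because the history is a plain SQLite file, you can poke at it directly. A sketch assuming the documented schema, which keeps prompts and replies in a responses table (sqlite-utils is a separate tool by the same author):

llm logs path   # print where the history database lives
sqlite-utils "$(llm logs path)" "select prompt from responses limit 5"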

Who Should Care?

Reach for this if you're a developer who spends a lot of time in the terminal and wants AI help without context-switching to a browser. Or if you're building your own Python applications and want to add AI features without rewriting the connection code for every different model provider. It's also great if you care about privacy and want to run models on your own computer.

Skip it if you're happy using ChatGPT's website or the Claude app, and you don't need to automate anything or save your conversations for later searching. If you never touch a command line, this tool isn't for you — it's built for people comfortable typing commands.

Start Here

A recommended reading path through the code

  1. llm/__init__.py (Core Abstractions)

     This module, `llm`, is the core of a CLI-driven LLM interaction framework.

  2. llm/cli.py (CLI Orchestration)

     This module is the CLI and orchestration layer for the `llm` package, built with Click and pluggy.

  3. llm/utils.py (Utility Services)

     This module provides foundational utility services for the LLM application.

  4. llm/models.py (Models & Embeddings)

     This module provides a unified abstraction layer for interacting with large language models and managing embeddings.

What's inside

4 sections of the codebase

Read Next

Where to go from here

Sibling Projects

Codebases that occupy adjacent space

Related Expeditions
llm · litellm · Ollama · GPT4All · OpenAI Python SDK · llm-cluster

Export & Share

Take the field notes with you

Words You'll Hear

Definitions for terms that appear throughout these notes

async generator

concept

A Python function that can produce multiple values over time, allowing other code to run while waiting for each value. It uses 'async' and 'yield' keywords to enable non-blocking streaming.
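
A generic illustration, not code from this project:

import asyncio

async def stream_words():
    # An async generator: yields values one at a time without blocking the event loop.
    for word in ["tokens", "arrive", "one", "by", "one"]:
        await asyncio.sleep(0.1)  # simulate waiting on the network
        yield word

async def main():
    async for word in stream_words():
        print(word)

asyncio.run(main())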

CLI facade

pattern

A command-line interface that acts as a simplified front-end to a more complex system, hiding internal complexity from the user. Users interact only with the CLI commands.

CLI-first

concept

A design approach where the primary way to interact with the software is through a command-line interface (typing commands in a terminal), rather than a graphical user interface.

Click

library

A Python library for building command-line interfaces with minimal boilerplate. It handles argument parsing, command grouping, and help text generation.

Composite Pattern

pattern

A design pattern that allows treating individual objects and groups of objects uniformly. The Part hierarchy lets you handle a single text part or a complex message with multiple parts the same way.

embeddings

concept

Numerical vector representations of text that capture semantic meaning, used for tasks like similarity search and clustering. They convert words or sentences into lists of numbers.

Factory Pattern

pattern

A design pattern that provides an interface for creating objects without specifying their exact class. The get_models_with_aliases function creates the right model instance based on a name.

hexagonal architecture

pattern

A design pattern that isolates core business logic from external systems (databases, APIs, UIs) using ports and adapters, making the core testable and swappable. This project explicitly does not use it.

Observer Pattern

pattern

A design pattern where an object (the subject) maintains a list of dependents (observers) and notifies them of state changes. The pluggy hook system implements this for plugin notifications.

pluggy

library

A Python library that enables a plugin system by allowing plugins to register hooks that the host application calls at specific points. It is the backbone of this project's extensibility.
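
A generic sketch of the pluggy handshake; the hook name here is a simplified stand-in for this project's real hooks:

import pluggy

hookspec = pluggy.HookspecMarker("demo")
hookimpl = pluggy.HookimplMarker("demo")

class DemoSpec:
    @hookspec
    def register_models(self):
        """Plugins implement this to contribute model names."""

class SamplePlugin:
    @hookimpl
    def register_models(self):
        return "sample-model"

pm = pluggy.PluginManager("demo")
pm.add_hookspecs(DemoSpec)
pm.register(SamplePlugin())
print(pm.hook.register_models())  # ['sample-model']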

plugin-based framework

pattern

A software structure that allows external code (plugins) to add or modify features without changing the core codebase. Plugins hook into predefined extension points.

plugin-core architecture

pattern

A software design where the core system provides essential functionality and defines extension points, while plugins add specialized features without modifying the core.

port/adapter separation

pattern

A design principle where interfaces (ports) are defined for external interactions, and concrete implementations (adapters) are written separately. This allows swapping databases or APIs without changing core logic.

pydantic

library

A Python library for data validation and settings management using Python type annotations. It ensures that data conforms to defined schemas and provides automatic serialization.
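
A generic pydantic v2 sketch, not code from this repository:

from pydantic import BaseModel, ValidationError

class Receipt(BaseModel):
    vendor: str
    total: float

r = Receipt.model_validate({"vendor": "ACME", "total": "19.99"})
print(r.total)  # 19.99, coerced from the string during validation

try:
    Receipt.model_validate({"vendor": "ACME"})
except ValidationError as err:
    print(err)  # reports that 'total' is missing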

serialization roundtrips

concept

The process of converting an object to a format (like JSON) for storage or transmission, and later reconstructing the original object from that format without losing information.

service layer

pattern

An architectural layer that contains business logic and coordinates operations between different parts of an application, sitting between the user interface and data access code.

setuptools entry points

concept

A mechanism in Python's packaging system that allows packages to advertise plugins or extensions. Other tools can discover and load these plugins automatically at runtime.
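
A small sketch of discovery from the consuming side; "llm" is the entry point group this project documents for its plugins:

from importlib.metadata import entry_points

for ep in entry_points(group="llm"):  # Python 3.10+ signature
    plugin = ep.load()  # import the advertised object
    print(ep.name, plugin)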

sqlite_utils

library

A Python library that provides convenient methods for creating, querying, and managing SQLite databases. It simplifies database operations like inserting rows and running migrations.
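
A minimal sketch of the library's feel (the table and column names are invented):

import sqlite_utils

db = sqlite_utils.Database(memory=True)  # pass a file path instead for persistence
db["responses"].insert({"prompt": "hi", "reply": "hello"})  # table created on first insert
print(list(db["responses"].rows))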

SQLite-backed

tool

Using the SQLite database engine to store and retrieve data persistently. SQLite is a lightweight, file-based database that requires no separate server process.

stateless factory

pattern

An object that creates other objects without remembering any previous interactions. Each call to the factory is independent and does not rely on stored state.

Strategy Pattern

pattern

A design pattern that defines a family of interchangeable algorithms, allowing the client to choose which one to use at runtime. Here, different LLM providers are strategies for generating responses.

streaming

concept

A technique where data is sent in small chunks as it becomes available, rather than waiting for the entire response. For LLMs, this means showing tokens one by one as they are generated.

structured output

concept

LLM responses formatted according to a predefined schema (e.g., JSON), rather than free-form text, making them easier for programs to parse and use.

Template Method Pattern

pattern

A design pattern that defines the skeleton of an algorithm in a base class, letting subclasses override specific steps without changing the overall structure. The Response class uses this for streaming.

tool calling

concept

The ability for an LLM to request execution of external functions (like searching a database or running code) and receive the results to incorporate into its response.