simonw/llm 0.17a0 on GitHub

Alpha support for attachments, allowing multi-modal models to accept images, audio, video and other formats. #578

Attachments in the CLI can be URLs:

llm "describe this image" \
  -a https://static.simonwillison.net/static/2024/pelicans.jpg

Or file paths:

llm "extract text" -a image1.jpg -a image2.jpg

Or binary data, which may need to use --attachment-type to specify the MIME type:

cat image | llm "extract text" --attachment-type - image/jpeg

Attachments are also available in the Python API:

model = llm.get_model("gpt-4o-mini")
response = model.prompt(
    "Describe these images",
    attachments=[
        llm.Attachment(path="pelican.jpg"),
        llm.Attachment(url="https://static.simonwillison.net/static/2024/pelicans.jpg"),
    ]
)

Plugins that provide alternative models can support attachments, see Attachments for multi-modal models for details.