I built a new plugin for LLM called llm-jq. It lets you pipe JSON into the tool along with a short description of what you want; it then uses an LLM to generate a jq program and executes it against the JSON for you: https://simonwillison.net/2024/Oct/27/llm-jq/
Example usage:
llm install llm-jq
curl -s 'https://api.github.com/repos/simonw/datasette/issues' | \
llm jq 'count by user login, top 3' -
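For readers unfamiliar with jq, here is a rough Python sketch of the aggregation that prompt describes (counting issues by user login and taking the top 3). The sample records are hypothetical, shaped after the GitHub issues API:

```python
from collections import Counter

# Hypothetical sample shaped like the GitHub issues API response
issues = [
    {"user": {"login": "simonw"}},
    {"user": {"login": "simonw"}},
    {"user": {"login": "dependabot"}},
]

# Count issues by user login, then take the three most common
counts = Counter(issue["user"]["login"] for issue in issues)
top_3 = counts.most_common(3)
print(top_3)  # [('simonw', 2), ('dependabot', 1)]
```

This is the logic the generated jq program would have to express; llm-jq's job is getting the model to produce a correct jq equivalent.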
@simon Neat. I’ll try this. Is it effective with smallish local models?
It’d be neat to have a flag to explain the jq expressions, so it creates an opportunity to learn jq instead of always outsourcing it to an LLM.
-
@twp I've not tried it with a local model yet - it might work OK, needs to be a model that supports system prompts though (which most of them do)
-
Simon Willison replied to Simon Willison
@twp I just tried Phi-3.1-mini-128k-instruct-Q8_0 and Meta-Llama-3.1-8B-Instruct-Q4_K_M running locally and neither of them quite worked - they both returned either wrong or invalid jq programs for my prompt
-
Michael Hunger replied to Simon Willison
@simon Do you pass all or part of the JSON to the LLM? Or a JSON schema with the instructions? How does it work with large JSON data? Do you use a sample?
-
Simon Willison replied to Simon Willison
@akaihola @mesirii using https://pypi.org/project/genson/ to summarize the JSON is an interesting idea
Currently I avoid reading the whole stream into memory at once, which means I can't easily do a two-step process that first reads the whole thing and then replays it later, but I'm sure that could be fixed
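genson (linked above) builds a proper JSON Schema from example objects, so the model could see the data's shape without the data itself. As a toy illustration of that idea, here is a stdlib-only sketch that infers a schema-like summary from a sample record; the field names are hypothetical, and genson would do this far more thoroughly:

```python
import json

def summarize(value):
    """Infer a minimal JSON-Schema-like summary of a value's structure.
    (A toy sketch of what the genson library does properly.)"""
    if isinstance(value, dict):
        return {"type": "object",
                "properties": {k: summarize(v) for k, v in value.items()}}
    if isinstance(value, list):
        items = [summarize(v) for v in value]
        return {"type": "array", "items": items[0] if items else {}}
    if isinstance(value, bool):  # check bool before int: bool subclasses int
        return {"type": "boolean"}
    if isinstance(value, int):
        return {"type": "integer"}
    if isinstance(value, float):
        return {"type": "number"}
    if value is None:
        return {"type": "null"}
    return {"type": "string"}

# Hypothetical sample record shaped like a GitHub issue
sample = {"user": {"login": "simonw"}, "title": "Bug report", "comments": 3}
print(json.dumps(summarize(sample), indent=2))
```

A summary like this stays small regardless of how large the JSON stream is, so it sidesteps the streaming concern above: only a sample record needs to be buffered.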