@simon Did you notice a speed difference between mlx and ollama?
Posts
-
Wrote up some notes on the new Qwen2.5-Coder-32B model, which is the first model I've run on my own Mac (64GB M2) that appears to be highly competent at writing code -
Video scraping: extracting JSON data from a 35 second screen capture for less than 1/10th of a cent https://simonwillison.net/2024/Oct/17/video-scraping/
@simon Now that you mention it, I'm curious how different each platform is with tokens and how that might affect pricing (or just be a wash)
-
Video scraping: extracting JSON data from a 35 second screen capture for less than 1/10th of a cent https://simonwillison.net/2024/Oct/17/video-scraping/
@simon Nice! You should drop a tokenizer in there for people.