Video scraping: extracting JSON data from a 35 second screen capture for less than 1/10th of a cent https://simonwillison.net/2024/Oct/17/video-scraping/
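For context, a minimal sketch of how that kind of video-to-JSON extraction could look if run through the Gemini API via the google-generativeai Python package (the filename and prompt below are placeholders, and the post itself worked through Google AI Studio's UI rather than code):

```python
import time
import google.generativeai as genai

genai.configure(api_key="...")  # needs a Gemini API key

# "screen-capture.mp4" is a placeholder for the 35 second recording
video = genai.upload_file(path="screen-capture.mp4")
while video.state.name == "PROCESSING":  # uploaded videos are processed asynchronously
    time.sleep(2)
    video = genai.get_file(video.name)

model = genai.GenerativeModel("gemini-1.5-flash")
response = model.generate_content([
    video,
    # placeholder prompt - describe whatever structure you want back
    "Extract the data shown on screen as a JSON array of objects",
])
print(response.text)
```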
-
@dbreunig I'm still frustrated that Anthropic don't release their tokenizer!
Gemini have an API endpoint for counting tokens but I think it needs an API key
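For reference, a minimal sketch of that endpoint via the google-generativeai Python package - the API key requirement is why it isn't quite a substitute for a local tokenizer:

```python
import google.generativeai as genai

genai.configure(api_key="...")  # counting tokens still requires a key

model = genai.GenerativeModel("gemini-1.5-flash")
# the count goes over the network rather than running locally
print(model.count_tokens("The quick brown fox jumps over the lazy dog").total_tokens)
```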
-
@simon Now that you mention it, I'm curious how different each platform is with tokens and how that might affect pricing (or just be a wash)
-
@dbreunig yeah it's frustratingly difficult to compare tokenizers, which sure makes price per million tokens less directly comparable
-
@dbreunig running a benchmark that processes a long essay and records the input token count for different models could be interesting though
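A rough sketch of that benchmark, assuming tiktoken for the OpenAI side and the Gemini countTokens endpoint for comparison (Anthropic's tokenizer isn't published, so it's left out):

```python
import tiktoken
import google.generativeai as genai

essay = open("essay.txt").read()  # any long essay

# OpenAI tokenizers run locally via tiktoken
enc = tiktoken.encoding_for_model("gpt-4o")
print("gpt-4o:", len(enc.encode(essay)))

# Gemini token counts come back from the countTokens API endpoint
genai.configure(api_key="...")
for name in ("gemini-1.5-pro", "gemini-1.5-flash"):
    print(name + ":", genai.GenerativeModel(name).count_tokens(essay).total_tokens)
```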
-
@simon At what environmental cost though?
-
@axleyjc I'd love to understand that more
In particular, how does the energy usage of running that prompt for a few seconds compare to the energy usage of me running my laptop for a few minutes longer to achieve the task by hand?
-
@axleyjc I was using Gemini Flash here which is a much cheaper, faster and (presumably) less energy intensive model than Gemini Pro
There's also the new Gemini Flash 8B, which is cheaper still, and the "8B" in the name suggests that the parameter count may be low enough that it could run on a laptop if they ever released the weights (I run open 8B models locally all the time)
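As an illustration of running an open 8B model locally, here's a sketch using llama-cpp-python - the GGUF filename and prompt are placeholders:

```python
from llama_cpp import Llama

# model path is a placeholder - any quantized 8B GGUF works
llm = Llama(model_path="llama-3.1-8b-instruct.Q4_K_M.gguf", n_ctx=8192)

output = llm.create_chat_completion(messages=[
    {"role": "user", "content": "Summarize this transcript in one sentence: ..."}
])
print(output["choices"][0]["message"]["content"])
```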
-
@simon great documentation ... Any details on accuracy? How much did you have to clean up the output and did you have to check it all by hand?
-
@th0ma5 I checked it all, didn't take long (I watched the 35s video and scanned the JSON) - it was exactly correct
-
@simon Is it also possible to calculate how much energy these things use, and some comparisons of what that's equivalent to? I hear that AI is energy intensive but I have zero concept of what that means in reality for a single "thing" like this.
-
@philgyford if that's possible I haven't seen anyone do it yet - the industry don't seem to want to talk specifics
GPUs apparently draw a lot more power when they are actively computing than when they are idle, so there's an energy cost associated with running a prompt that wouldn't exist if the hardware was turned on but not doing anything
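On a local NVIDIA GPU that difference is directly observable; a rough sketch using pynvml (numbers will vary, and this obviously says nothing about what happens inside Google's datacenters):

```python
import pynvml

pynvml.nvmlInit()
handle = pynvml.nvmlDeviceGetHandleByIndex(0)

def watts():
    # nvmlDeviceGetPowerUsage reports milliwatts
    return pynvml.nvmlDeviceGetPowerUsage(handle) / 1000

print("idle:", watts(), "W")
# ... run a prompt against a local model here, sampling watts() as it goes ...
print("busy:", watts(), "W")

pynvml.nvmlShutdown()
```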
-
Video scraping in Ars Technica! https://arstechnica.com/ai/2024/10/cheap-ai-video-scraping-can-now-extract-data-from-any-screen-recording/
-
@superFelix5000 @th0ma5 right - the single hardest thing about learning to productively work with LLMs is figuring out how to get useful results out of inherently unreliable technology
-
@simon Are you also considering that the low costs are illusory? They are artificially low, either because they are subsidized by VC money or, in Google's case, by advertising revenue. OpenAI is bleeding $ and not making a profit. Just given the energy and hardware costs, the current prices are unsustainable.
There's a serious risk that companies who have hard dependencies on cloud genai will get a big surprise bill down the road or have to pivot to a local model that's not as good.
-
@axleyjc I think about that all the time! It's one of the reasons I've been hesitant to commit to building substantial things on Gemini that assume the price will stay constant