I added multi-modal (image, audio, video) support to my LLM command-line tool and Python library, so now you can use it to run all sorts of content through LLMs such as GPT-4o, Claude and Google Gemini.
-
@simon I was using a 5 MB MP4. The error just says "internal error". I downloaded the video from here: https://www.pexels.com/video/catching-and-releasing-a-big-carp-fish-in-the-lake-5538137/
-
@xsc I've seen a few of those "Internal error" messages too - I think it's Gemini being a little bit flaky; resubmitting sometimes works fine the second time
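
If resubmitting by hand gets tedious, the "try it again" approach can be sketched as a small retry wrapper. This is only an illustration: `retry`, `attempts` and `delay` are hypothetical names, not part of LLM or the Gemini plugin, and it assumes the failing call raises an exception.

```python
import time


def retry(call, attempts=3, delay=1.0):
    """Run call() until it succeeds or the attempts run out.

    Hypothetical helper, not part of the LLM library.
    """
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return call()
        except Exception as error:
            last_error = error
            if attempt < attempts:
                # Wait a little longer after each failed attempt.
                time.sleep(delay * attempt)
    # Every attempt failed: re-raise the most recent error.
    raise last_error
```

Here `call` could wrap whatever sends the prompt, for example a `subprocess.run(..., check=True)` invocation of the `llm` command, assuming the CLI exits with a non-zero status when Gemini returns an error.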
-
@florenciocano @djh @simon
Just not very useful for solving maths problems that haven't already been solved and scraped into the training data
https://youtu.be/8_Nr5oKIAmI
And students are supposedly using this to cheat on their homework?
-
@bornach @florenciocano @djh right - LLMs are notoriously bad at math (and logic puzzles too)
-
@simon I was using the following command
> llm 'please explain what is happening in the video' -a man-in-water.mp4 -m gemini-1.5-flash-latest
Does it look like it should work?
-
@xsc yes, if you have the llm-gemini plugin installed and configured with an API key
You could try using this script here (or using Google's AI Studio tool) to check it's not an LLM bug: https://til.simonwillison.net/llms/prompt-gemini