File storage and bandwidth.
-
File storage and bandwidth. These expenses will also only get more expensive over time. File bandwidth is the #1 charge on the monthly server bill right now. I live in fear of an AI scraper figuring out how to scrape all of these files and bankrupting me overnight.
from RIP botsin.space.
I would expect a "fastly will solve all your problems" from this and not an attempt to write an optimized codebase. But that's just me.
Now for the advertisement (thought you were safe from it here). My benchmarking at https://data.funfedi.dev/ doesn't indicate a performance degradation in Mastodon v4.3.1. Of course, it's the receiving messages use case, and not the sending and offering up downloads. So it's completely irrelevant.
Now Helge out.
-
Let's fix media storage in the Fediverse. There's no reason for each of the almost 30 thousand servers to have an AWS S3 account where it mirrors all the cute pictures posted on botsin.space.
-
@helge one imagines a lot of duplicate media must be uploaded by bots whose entire gimmick is uploading the same picture every day
-
-
Emelia 👸🏻 replied to infinite love ⴳ
-
:PUA: Shlee fucked around and replied to Emelia 👸🏻
@thisismissem @helge I've messaged the Jortage dev to see if I can throw money at them to build a Jortage 2.0 (deployable edition)... they haven't replied
my main rant: https://shlee.fedipress.au/2024/call-to-action-fediverse-media-server/
-
@thisismissem @helge @shlee .... That's precisely how this Akkoma instance is configured. I could also configure it to just do things through a caching proxy if I wanted a little more anonymity/to be a little more bandwidth usage friendly but well (a) you can identify me by instance and (b) there's one user's worth of bandwidth usage here abreast
-
@thisismissem @helge @shlee (even the Akkoma media proxy does not itself cache anything. It's just a proxy. The documentation tells you how to configure nginx to cache it)
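The difference between a plain proxy and a caching one can be sketched in a few lines of Python. This is only an illustration and not how Akkoma or nginx implement it: `make_caching_fetch`, `fake_origin_fetch` and the `example.invalid` URL are hypothetical names, and the cache here never expires or evicts anything.

```python
from typing import Callable, Dict


def make_caching_fetch(fetch: Callable[[str], bytes]) -> Callable[[str], bytes]:
    """Wrap a plain fetch function so repeated requests are served locally."""
    cache: Dict[str, bytes] = {}

    def cached_fetch(url: str) -> bytes:
        # Only go upstream the first time a URL is requested.
        if url not in cache:
            cache[url] = fetch(url)
        return cache[url]

    return cached_fetch


# Hypothetical stand-in for the remote media host; it records every hit.
upstream_hits = []


def fake_origin_fetch(url: str) -> bytes:
    upstream_hits.append(url)
    return b"bytes of " + url.encode()


proxy = make_caching_fetch(fake_origin_fetch)
proxy("https://example.invalid/media/cat.png")
proxy("https://example.invalid/media/cat.png")  # served from the cache
```

Calling `fake_origin_fetch` directly corresponds to the plain proxy case: every request reaches the origin. With the wrapper, the second request for the same URL costs the origin no bandwidth.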
-
@helge why not just use IPFS-or-equivalent to sync content-addressed media storage? original hosting server can "pin" (i.e. persist until user asks them to delete, whether using an IPFS-style DHT or just regular HTTP distro), relays cache, individual servers can just fetch live and not cache at all. content-addressing solves dedup, trusted relays can cache and be allowlisted/CORS-policied, etc.
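A minimal sketch of why content-addressing solves dedup, assuming a plain SHA-256 hex digest as the address (IPFS itself uses CIDs, which this does not reproduce). The `ContentAddressedStore` class is hypothetical and keeps everything in memory.

```python
import hashlib


class ContentAddressedStore:
    """Toy in-memory store keyed by the SHA-256 digest of the stored bytes."""

    def __init__(self) -> None:
        self._blobs = {}

    def put(self, data: bytes) -> str:
        key = hashlib.sha256(data).hexdigest()
        # Identical bytes hash to the same key, so a repeat upload stores nothing new.
        self._blobs.setdefault(key, data)
        return key

    def get(self, key: str) -> bytes:
        return self._blobs[key]

    def __len__(self) -> int:
        return len(self._blobs)


store = ContentAddressedStore()
first = store.put(b"cute picture bytes")
second = store.put(b"cute picture bytes")  # the same picture posted again
other = store.put(b"a different picture")
```

Here `first` and `second` are the same address and the store holds two blobs, not three. Any relay or server that sees that address can tell whether it already has the file without comparing contents.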
https://codeberg.org/fediverse/fep/src/branch/main/fep/cd47/fep-cd47.md
-
@helge (full disclosure, i work for the IPFS Foundation, but i'm happy to outline the pros and cons of "rolling one's own" and using equivalent technologies rather than the IPFS bundlings thereof. i just use "IPFS-or-equivalent" in my explanations cuz lots of people know how IPFS works, approximately.)
-
bumblefudge replied to :PUA: Shlee fucked around and
@shlee @thisismissem @helge if they don't want to DO the work but are ok "blessing" the work/fork, i would recommend rolling this up into a more full-featured Production-Grade Relay type project. I think such a Relay could charge less per month than the hosting fees it saves its servers, for a certain sweet spot of traffic/usercount... not exactly bankable in the VC sense but maybe a sideline that could help an already-trusted resource pooler like IFTAS?
-
:PUA: Shlee fucked around and replied to bumblefudge
@by_caballero @thisismissem @helge I've been singing the praises of shared services in the fediverse for a long time.
I think IFTAS has their hands full.. but I also think one or two people with domain specific experience could get this done... if I knew the right people I'd ask them :annoyingdog:
-
Emelia 👸🏻 replied to :PUA: Shlee fucked around and
I don't think IFTAS would be the right host for a project like this, but maybe Social Web Foundation or Fedihosting from @ruud would be good hosts?
-
:PUA: Shlee fucked around and replied to Emelia 👸🏻
@thisismissem @by_caballero @helge @ruud I'm moving to better colo soon with 10x bandwidth....
so I could host a regional node once that's done (still need to write the server itself)
-
@helge why not just use
This has a simple answer: "I don't know".
However, I expect the engineering effort to adapt Mastodon and its clients to be able to make use of IPFS to be substantial. So this is not something that can be used for short term relief.
The ideal solution would be something like "Update to Mastodon 4.4.0 and add these three config variables". I don't see something like that involving new technologies.
-
@helge oh totally, it's definitely not a quick fix and only useful as part of a bigger effort to achieve economies of scale using relays or other shared services across many many servers.
-
@shlee et al. I've added some relevant sequence diagrams to my fairer federation draft fep. Main point is the media server case, which has two consequences:
- Less bandwidth used by the posting server
- Less storage used by distributing servers
Both are good. Is this the picture everybody has in mind?
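The two consequences above can be put into a back-of-envelope model. This is a toy calculation and not taken from the draft FEP: it assumes every receiving server fetches a file exactly once, that in the shared case only the posting server and the media server keep a copy, and the function names and numbers are made up for illustration.

```python
def mirror_everywhere(file_mb: float, n_servers: int) -> dict:
    """Every receiving server downloads from the posting server and keeps a copy."""
    return {
        "origin_egress_mb": file_mb * n_servers,
        "total_stored_mb": file_mb * (n_servers + 1),  # origin plus each mirror
    }


def shared_media_server(file_mb: float, n_servers: int) -> dict:
    """The posting server uploads once; receiving servers link to the media server."""
    return {
        "origin_egress_mb": file_mb,       # a single upload to the media server
        "total_stored_mb": file_mb * 2,    # origin plus the media server
    }


# Hypothetical figures: a 2 MB picture delivered to 1000 servers.
mirrored = mirror_everywhere(2, 1000)
shared = shared_media_server(2, 1000)
```

The delivery bandwidth does not disappear in the shared case; it moves from the posting server to the media server, which is why the hosting arrangement for such a server matters.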
-
@helge @shlee @thisismissem @ruud @raphael
It's what _I_ have in mind, but I'm also the least familiar with benchmarking/performance metrics, so maybe wait for more qualified people to confirm
-
I'm not sure if follower (client) should connect directly to the media server. When client GETs new content, media is already cached on the follower's server (according to the diagram).