File storage and bandwidth.
-
:PUA: Shlee fucked around and replied to bumblefudge last edited by
@by_caballero @thisismissem @helge I've been singing the praises of shared services in the fediverse for a long time.
I think IFTAS has their hands full, but I also think one or two people with domain-specific experience could get this done. If I knew the right people I'd ask them :annoyingdog:
-
Emelia replied to :PUA: Shlee fucked around and last edited by
I don't think IFTAS would be the right host for a project like this, but maybe Social Web Foundation or Fedihosting from @ruud would be good hosts?
-
:PUA: Shlee fucked around and replied to Emelia last edited by
@thisismissem @by_caballero @helge @ruud I'm moving to better colo soon with 10x bandwidth....
so I could host a regional node once that's done (still need to write the server itself)
-
@helge why not just use
This has a simple answer: "I don't know".
However, I expect the engineering effort to adapt Mastodon and its clients to be able to make use of IPFS to be substantial. So this is not something that can be used for short term relief.
The ideal solution would be something like "Update to Mastodon 4.4.0 and add these three config variables". I don't see something like that involving new technologies.
-
@helge oh totally, it's definitely not a quick fix and only useful as part of a bigger effort to achieve economies of scale using relays or other shared services across many many servers.
-
@shlee et al. I've added some relevant sequence diagrams to my fairer federation draft FEP. The main point is the media server case, which has two consequences:
- Less bandwidth used by the posting server
- Less storage used by distributing servers
Both are good. Is this the picture everybody has in mind?
-
@helge @shlee @thisismissem @ruud @raphael
It's what _I_ have in mind, but I'm also the least familiar with benchmarking/performance metrics, so maybe wait for more qualified people to confirm
-
I'm not sure if follower (client) should connect directly to the media server. When client GETs new content, media is already cached on the follower's server (according to the diagram).
-
:PUA: Shlee fucked around and replied to Helge last edited by
@helge @thisismissem @ruud @raphael @by_caballero I don't understand the protocol enough so please take this toot as a note.
My objective is closer to just offering an S3-compatible service and adding the AP smarts to that in the backend (relay style, to push/pull media between media servers).
But having AP-level support for media server location makes sense (maybe falling back to local if the media server goes down or gets replaced), or even supporting a primary/secondary media server at the instance level in case of failure.
also, I'd like to see the AP include a UUID/hash of every media file so that in theory, instances/clients can migrate/roam from media server to media server or new media servers can easily find missing files.
But in principle, yes. Thank you for your efforts.
edit: I'm interested in the options around per toot media location. Because right now, the media URL is hardcoded on a toot by toot basis, and changing that media path retroactively feels hard/impossible. see https://github.com/mastodon/mastodon/pull/16414
-
bumblefudge replied to :PUA: Shlee fucked around and last edited by
@shlee @helge @thisismissem @ruud @raphael
> also, I'd like to see the AP include a UUID/hash of every media file so that in theory, instances/clients can migrate/roam from media server to media server or new media servers can easily find missing files.
This would be a major change to/extension of the protocol, BUT I completely agree that it would be worth exploring. Hash-addressing opens up multiple "fallback" methods to implementations that want to "heal" broken paths...
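To make the hash-addressing idea concrete, here is a minimal sketch in Python. It is not part of ActivityPub or of any existing implementation: the names `media_digest` and `resolve_media` are hypothetical, SHA-256 is an assumed choice of hash, and media servers are modelled as plain in-memory dicts. It only shows how a content hash lets a client try several media servers in turn and verify whatever it gets back.

```python
import hashlib
from typing import Optional


def media_digest(data: bytes) -> str:
    """Return a hex SHA-256 digest used as the media identifier (assumed scheme)."""
    return hashlib.sha256(data).hexdigest()


def resolve_media(digest: str, servers: list[dict[str, bytes]]) -> Optional[bytes]:
    """Try each media server in order and return the first blob matching the digest.

    Each "server" is a hypothetical in-memory mapping from digest to bytes.
    A blob whose content does not hash to the requested digest is skipped.
    """
    for store in servers:
        blob = store.get(digest)
        if blob is not None and media_digest(blob) == digest:
            return blob
    return None


# The primary media server has lost the file, the secondary still has it.
image = b"example image bytes"
digest = media_digest(image)
primary: dict[str, bytes] = {}
secondary: dict[str, bytes] = {digest: image}

assert resolve_media(digest, [primary, secondary]) == image
```

Because the identifier is derived from the content, it does not matter which server answers, which is what would allow a broken media path to be "healed" from another location.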
-
@shlee @helge @thisismissem @ruud @raphael also, as far as making the *protocol itself* smarter about the expensive and clunky nature of uploads, it's worth mentioning that the spec itself has extremely little to say on the subject, and it would seem the authors of the protocol spec *assumed ongoing work would be done by the CG at the implementation/software level* : https://www.w3.org/TR/activitypub/#uploading-media
We could... spin that CG work item back up? it feels like a placeholder that Rhiaro made 3 commits to...
-
Well, I guess one needs to explain more than I've done so far. I'll try to update fairer federation.
Also this is really not an AP thing. I also believe that it is imperative to decouple the solution here. It's about sharing the work load in hosting media. That's relevant to every decentralized network ...
-
@helge @shlee @thisismissem @ruud @raphael @silverpill feel free to tag me if you think my review would be helpful! if i miss the codeberg email you're welcome to DM and/or tag me here to nudge me
-
smallcircles (Humanity Now) replied to Emelia last edited by
FYI Specifically on #S3 there's #Garage, a project in #Rust funded by @EC_NGI @nlnet which implements the Amazon S3 API.
Garage - An open-source distributed object storage service
An open-source distributed storage service you can self-host to fulfill many needs.
(garagehq.deuxfleurs.fr)
-
:PUA: Shlee fucked around and replied to smallcircles (Humanity Now) last edited by
@smallcircles @thisismissem @helge @EC_NGI @nlnet these tools generally don't handle the dedupe at all
You need something with some smarts to redirect requests… like jortage
But the replication HA side of garage (and the other similar s3 clone) is great
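As a rough illustration of what "dedupe" means here, the following Python sketch stores each distinct file once and lets several object keys point at it. This is an assumption-laden toy, not how Jortage, Garage, or any S3 implementation works: `DedupStore` and its methods are hypothetical names, and everything lives in memory.

```python
import hashlib


class DedupStore:
    """Toy object store that keeps one copy of each distinct blob (hypothetical)."""

    def __init__(self) -> None:
        self.blobs: dict[str, bytes] = {}  # digest -> content, stored once
        self.keys: dict[str, str] = {}     # object key -> digest

    def put(self, key: str, data: bytes) -> str:
        """Store data under key and return its digest; identical content is reused."""
        digest = hashlib.sha256(data).hexdigest()
        self.blobs.setdefault(digest, data)
        self.keys[key] = digest
        return digest

    def get(self, key: str) -> bytes:
        """Look up the key's digest, then return the shared blob."""
        return self.blobs[self.keys[key]]

    def stored_bytes(self) -> int:
        """Bytes actually held, counting each distinct blob once."""
        return sum(len(blob) for blob in self.blobs.values())


# Two instances cache the same remote image under their own keys.
store = DedupStore()
image = b"x" * 1000
store.put("instance-a/cache/cat.png", image)
store.put("instance-b/cache/cat.png", image)

assert store.get("instance-b/cache/cat.png") == image
assert store.stored_bytes() == 1000  # one copy, not two
```

The key-to-digest indirection is the "smarts" part: requests for different keys end up at the same stored object.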
-
jon replied to :PUA: Shlee fucked around and last edited by
@shlee
Garage is the only FLOSS S3 service I could find that allows publishing websites directly from the bucket, which is neat.
-
Helge replied to :PUA: Shlee fucked around and last edited by
After thinking and self reflecting some more, I have two things to say:
- There is a business case: if the costs are 330 $ / month for bandwidth and 70 $ / month for storage, and one can halve storage and pass half the savings along to the instance owners, one makes roughly 20 $ / month profit per instance. So if one has like 500 instances using one's services, one makes quite a bit of cash.
- If my economics is right, this means that for 380 $ in revenue one makes 20 $. That's about 5% profit. I think that's okish for a company. But there's taxes, etc ...
- Also with 500 instances, that's 190k$ a month of bills to pay. That's not a small business.
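The arithmetic above can be checked with a few lines of Python. This is only a sketch of one reading of the numbers: the variable names are made up, and the 20 $ figure is taken as the rounded value of the 17.5 $ that halving a 70 $ storage bill and splitting the saving actually gives.

```python
# Figures quoted in the post (USD per instance per month).
BANDWIDTH_COST = 330
STORAGE_COST = 70
INSTANCES = 500

# Halving storage saves half of the storage bill.
storage_saving = STORAGE_COST / 2    # 35.0
# Half of that saving goes to the instance owner, half stays with the provider.
provider_share = storage_saving / 2  # 17.5, rounded up to 20 in the post

# Rounded figures as used in the post (assumed interpretation).
PASSED_ALONG = 20         # discount the instance owner receives
PROFIT_PER_INSTANCE = 20  # what the provider keeps
revenue_per_instance = BANDWIDTH_COST + STORAGE_COST - PASSED_ALONG  # 380

margin = PROFIT_PER_INSTANCE / revenue_per_instance
monthly_revenue = INSTANCES * revenue_per_instance
monthly_profit = INSTANCES * PROFIT_PER_INSTANCE

print(f"margin: {margin:.1%}")
print(f"monthly revenue: {monthly_revenue} $")
print(f"monthly profit: {monthly_profit} $")
```

With the rounded figures the margin comes out slightly above 5%, and the monthly revenue across 500 instances is the 190k $ mentioned above.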
So what one would really need for this is a founder who knows whether this is a "serious business case". I have no idea how to convince a bank to give me an account that can handle that kind of volume. I have no idea how I would collect the bills. I have no idea how I would fairly split the bills between instances.
If you are a founder, and need someone to estimate the coding costs for this, and do the actual work, please feel free to reach out to me.
-
:PUA: Shlee fucked around and replied to Helge last edited by
@helge @thisismissem @ruud @raphael @by_caballero (Unless I misunderstand you) I think you need to consider the economies of scale here. Those numbers are based on the first instance costing the same as the last.
Deduplication and the right CDN will just make the costs drop dramatically.