"In 2024, it cost $N to run a Mastodon instance with ~1000 active users for a year.
-
craignicol replied to mekka okereke :verified:
@mekkaokereke I suspect there's more than one startup that will offer AI moderation as one of the three - there must be enough public social media data to train a decent spam filter.
Mastodon/ActivityPub extensions for common platforms so anyone with an existing community can extend it into the Fedi without new hardware
A Mastodon fork optimised for shared hosting so a small admin team can run hundreds of instances on a handful of servers
-
Tim Bray replied to mekka okereke :verified:
@mekkaokereke
X: Cheap competitors in generic SaaS offerings like blob storage and CDN.
Y: IFTAS, maybe.
Z: (stretching here) AI (but not genAI) models to help moderators with traffic the way it helps radiologists with images.
-
{Insert Pasta Pun} replied to mekka okereke :verified:
Storage costs: content addressable media FEP becomes widely adopted (image bandwidth) - fairly explored, interest in implementing
Compute costs: *shrug* some kind of batching Mastodon doesn't do yet reduces the compute hurdle - not explored, little interest
Moderation: some amazing new cross-instance moderation collaboration modalities are explored. Probably 15 get proposed, 10 get accepted as drafts for building, and 5 make it to production usage. 3 are really, really good.
They aren't all huge complex things that take a lot, they aren't all things that need to be hard-coded, they aren't all standalone programs that hook in, and maybe one of them is a full protocol. Each of the 15 things explores some way to reduce mod burden
(Thinking out of the box for stuff that isn't strictly a mod tool: reply controls, shared user block lists, shared notification inboxes, spam brigade detection, downranking inflammatory posts pending mod actions, rate limiting users.)
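To make the storage point concrete, here is a minimal sketch of content addressing in general, assuming media is keyed by a SHA-256 digest of its bytes. It is not the scheme from the FEP mentioned above; the class name `ContentAddressedStore` and its methods are hypothetical and only illustrate why identical media fetched from many servers would be stored once.

```python
import hashlib


class ContentAddressedStore:
    """Toy in-memory store keyed by the SHA-256 digest of the content (illustrative only)."""

    def __init__(self):
        self._blobs = {}

    def put(self, data: bytes) -> str:
        # The key is derived from the bytes themselves, so identical
        # media always maps to the same key, whichever server sent it.
        key = hashlib.sha256(data).hexdigest()
        self._blobs.setdefault(key, data)
        return key

    def get(self, key: str) -> bytes:
        return self._blobs[key]

    def stored_bytes(self) -> int:
        # Total bytes actually kept after deduplication.
        return sum(len(blob) for blob in self._blobs.values())


store = ContentAddressedStore()
image = b"pretend these are the bytes of a popular image"
key_a = store.put(image)  # first copy, e.g. seen via one remote server
key_b = store.put(image)  # same bytes seen again via another server
print(key_a == key_b, store.stored_bytes() == len(image))
```

The second `put` adds nothing to storage, which is the saving a content-addressed media scheme is after.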
-
Dave Alvarado replied to mekka okereke :verified:
X = abandoning all pretenses at moderation.
Y = mastodon.social being sold to Silicon Valley tech bros, enriching Eugen and pretty much nobody else.
Z = the introduction of a centralized content server so it takes way less disk or network to run an instance. And the tech bros get to spy on you, as a treat.
-
mekka okereke :verified: replied to Dave Alvarado
That all sounds bad
-
Dave Alvarado replied to mekka okereke :verified:
@mekkaokereke yeah, I just don't see how you cut hosting costs by 50% without Z. X and Y get you there.
-
Emelia 👸🏻 replied to mekka okereke :verified:
@mekkaokereke current estimates of yearly cost per account are $0.30 to $0.80 based on infrastructure, storage, etc., from what I've seen.
I'm pretty sure @esk or @dma worked out the numbers for Hachyderm too.
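As a rough worked example, the per-account range above can be scaled to the ~1000-user instance from the original question. This is only a sketch: it assumes every active user counts as one account and that cost scales linearly, neither of which the thread confirms. The function name is made up for illustration.

```python
# Per-account yearly cost range quoted in the thread, in cents to avoid float noise.
LOW_CENTS_PER_ACCOUNT = 30
HIGH_CENTS_PER_ACCOUNT = 80


def yearly_cost_range(accounts):
    """Return (low, high) yearly cost in dollars, assuming linear scaling per account."""
    low = accounts * LOW_CENTS_PER_ACCOUNT / 100
    high = accounts * HIGH_CENTS_PER_ACCOUNT / 100
    return low, high


low, high = yearly_cost_range(1000)
print(f"~1000 accounts: ${low:.0f} to ${high:.0f} per year")
```

Under those assumptions a 1000-account instance lands between $300 and $800 a year, though small instances may not see the same economies as the large ones the range comes from.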
-
Samir Al-Battran replied to mekka okereke :verified:
@mekkaokereke
X = Fediverse discovery project
Y = Mastodon implementation of it
Z = Getting rid of relays and local copy of remote data
Fediverse discovery project will actually be a huge advantage.
I don't know if it will be complete in 2025, but if it is then it's a game changer (also assuming Mastodon's architecture takes advantage of it).
The existing design is not efficient, and also you end up missing most of the conversations (esp. on smaller instances with 1K users)
-
Brandon Jones replied to mekka okereke :verified:
@mekkaokereke I've heard ActivityPub is pretty chatty as far as protocols go. Wonder how much headroom there is for reducing server costs purely through protocol improvements?
(That does nothing to reduce the human costs for moderation, but I think we all know tech folks are more likely to go for the easily quantifiable tech solutions first.)
-
@samir @mekkaokereke we are doing this under a grant from @ngisearch and the agreed plan is to have it done by next June. The spec work is mostly done (unfortunately we did not get much feedback) and we hope to have a first implementation for trends (our first capability) in 2 months, both the "provider" side and the Mastodon implementation.
-
Renaud Chaput replied to mekka okereke :verified:
@mekkaokereke if we focus on cost, then shared moderation and shared storage. Those 2 things build on our FASP idea that we are currently actively working on.
I see a lot of people pointing at technical things like switching from Rails or some other brick, but those are really not the issues. At least from my experience running instances with many hundreds of thousands of users.
-
gkrnours replied to mekka okereke :verified:
@mekkaokereke I wonder if a shared moderation team could help. Like a moderation panel that could handle reports for multiple instances, and a few small instances that help moderate each other using such a tool. This way, a dozen instances that each have one mod available 2h a day, instead of each being unmoderated 22h a day, could be moderated around the clock.
In the past, spam filters have been used to classify text content. Maybe the same could be done for triage in moderation.
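The pooled-moderation arithmetic above (a dozen instances, one mod each, 2h a day) can be checked with a small sketch. It assumes shifts are whole hours and that the mods agree to stagger them; the function name and shift layout are made up for illustration.

```python
def covered_hours(shifts):
    """Return the set of hours of the day (0-23) covered by at least one shift.

    Each shift is a (start_hour, length_in_hours) pair; shifts may wrap past midnight.
    """
    hours = set()
    for start, length in shifts:
        for offset in range(length):
            hours.add((start + offset) % 24)
    return hours


# Twelve mods, two hours each, staggered back to back.
staggered = [(2 * i, 2) for i in range(12)]
# Same twelve mods, but all online during the same evening slot.
overlapping = [(18, 2)] * 12

print(len(covered_hours(staggered)), len(covered_hours(overlapping)))
```

Pooling only gets to round-the-clock coverage if the shifts are actually spread out, which in practice probably means mods in different time zones.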
-
Joby :gts: (he/him) replied to Brandon Jones
@tojiro @mekkaokereke I've been kinda blown away by how much traffic AP generates. I'm running a GoToSocial instance that's just me. I only have like 200 followers, and it gets almost two million requests per month and uses like 500MB of RAM 24/7 (and this is a fairly efficient AP implementation!).
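For scale, the numbers in this post can be turned into an average request rate. This is back-of-the-envelope only: it treats "almost two million" as 2,000,000, "like 200 followers" as 200, and a month as 30 days, and it says nothing about bursts.

```python
SECONDS_PER_MONTH = 30 * 24 * 60 * 60  # assuming a 30-day month

requests_per_month = 2_000_000  # "almost two million requests per month"
followers = 200                 # "like 200 followers"

avg_requests_per_second = requests_per_month / SECONDS_PER_MONTH
requests_per_follower = requests_per_month / followers

print(f"{avg_requests_per_second:.2f} requests/second on average")
print(f"{requests_per_follower:.0f} requests per follower per month")
```

That works out to roughly 0.77 requests per second on average, or about 10,000 requests per follower per month, for a single-user instance.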
-
{Insert Pasta Pun} replied to Joby :gts: (he/him)
@joby @tojiro @mekkaokereke what's a good comparison point for the amount of traffic generated?
Like if you're publishing from one to many, are there any comparison points for lighter traffic?
I know one trivial inefficiency: if you have 50 accounts, each followed by 50 accounts on each of 50 other servers, and each one publishes one post, the sender could naively send one message per follower (2,500 per receiving server, 125,000 from the sender in total), or optimally send only 50 to each server, or reduce that further if the posts are bundled, gossiped, or shared (though some gossiping increases traffic rather than reducing it)
Like if your criteria is N hosts must sync M feeds...
There still needs to be that delta of changes over the wire (unless they're sending the complete object twice vs the update) and it's mostly haggling over how fast or batched the send is?
Unless there's some part that's duplicating work somewhere
Or if it's just encoding overhead?
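The 50 x 50 x 50 example above can be worked through in a few lines. This is a sketch of the counting only, assuming each local account has the same number of followers on every remote server; the function name is made up. ActivityPub's optional `sharedInbox` endpoint is the usual way implementations get from the per-follower count to the per-server count.

```python
def delivery_counts(local_accounts, remote_servers, followers_per_server, posts_per_account=1):
    """Count outbound deliveries for per-follower vs. per-server (shared) delivery."""
    posts = local_accounts * posts_per_account

    # Naive: one message per (post, remote follower) pair.
    naive_per_server = posts * followers_per_server
    # Shared: one message per (post, remote server) pair.
    shared_per_server = posts

    return {
        "per_follower": {
            "per_server": naive_per_server,
            "total": naive_per_server * remote_servers,
        },
        "shared": {
            "per_server": shared_per_server,
            "total": shared_per_server * remote_servers,
        },
    }


counts = delivery_counts(local_accounts=50, remote_servers=50, followers_per_server=50)
print(counts["per_follower"])  # 2,500 per server, 125,000 total
print(counts["shared"])        # 50 per server, 2,500 total
```

Batching several posts into one request would shrink the shared numbers further, but the payload bytes for each new post still have to cross the wire once per server.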
-
{Insert Pasta Pun} replied to {Insert Pasta Pun}
@joby @tojiro @mekkaokereke or maybe it's caching problems?
-
Emelia 👸🏻 replied to {Insert Pasta Pun}
@risottobias @mekkaokereke @esk @dma that range is based on information from a half dozen large instances based on their expenses
-
@puppygirlhornypost2 @risottobias @mekkaokereke @esk @dma hachyderm uses digitalocean spaces, but has a custom CDN on top