Pruning of remote content
-
This week I sorted out a feature that I've been putting off for far too long... content pruning!
ActivityPub has been enabled since March of this year, and in that time the database has accumulated tens of thousands¹ of pieces of content from the fediverse. This was causing our database to grow, but that was hardly insurmountable, since it's mostly text.
However, the principle of the matter was that we were being pushed a mountain of content, most of which wasn't even consumed. For reference, when I ran the script against the data on this site:
2024-06-10T18:04:52.445Z [4567,4568/3631171] - info: [notes/prune] Found 32531 topics older than 30 days (since last activity).
We then take those topics and determine whether each one received "engagement", which essentially means a user replied to or liked a post within the topic.
Filtering for those, we get:
2024-06-10T18:07:31.860Z [4567,4568/3631171] - info: [notes/prune] 32252 topics eligible for pruning
... which essentially means ~99.14% of those older topics received zero engagement. That's not entirely surprising given that content streams in at all hours of the day, and I only reply to or like a fraction of it.
In the backend, I'm making the age cutoff configurable, with the default being 30 days.
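For anyone curious what the two-step selection looks like, here is a minimal Python sketch of the logic described above. It is not NodeBB's actual code. The `Topic` class, its field names, and `find_prunable` are hypothetical, and it assumes "engagement" means a reply or like from a user of this forum.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass
class Topic:
    """Hypothetical stand-in for a remote topic; not NodeBB's real schema."""
    tid: int
    last_activity: datetime
    local_replies: int = 0  # assumed: replies written by users of this forum
    local_likes: int = 0    # assumed: likes given by users of this forum


def find_prunable(topics, now, cutoff_days=30):
    """Return (stale, prunable).

    stale: topics whose last activity is older than the cutoff.
    prunable: the stale topics that received no replies or likes.
    """
    threshold = now - timedelta(days=cutoff_days)
    stale = [t for t in topics if t.last_activity < threshold]
    prunable = [
        t for t in stale
        if t.local_replies == 0 and t.local_likes == 0
    ]
    return stale, prunable


now = datetime(2024, 6, 10, tzinfo=timezone.utc)
topics = [
    Topic(1, now - timedelta(days=45)),                   # stale, no engagement
    Topic(2, now - timedelta(days=45), local_replies=1),  # stale, but replied to
    Topic(3, now - timedelta(days=90), local_likes=2),    # stale, but liked
    Topic(4, now - timedelta(days=3)),                    # still recent
]

stale, prunable = find_prunable(topics, now)
print(f"Found {len(stale)} topics older than 30 days (since last activity).")
print(f"{len(prunable)} topics eligible for pruning")
```

Passing a different `cutoff_days` mirrors the configurable setting: a larger value leaves fewer topics eligible.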
Next up, remote user pruning!
¹ Actually, the number ended up being closer to 116k pieces of content. In contrast, this forum has been running for about a decade and has recorded just under 100k posts!
-
@[email protected] The other users on this NodeBB do not typically interact with fedi content, but that's because it's still under development and not heavily marketed.
Plus there's that whole "what is the fediverse" thing we have to figure out how to teach, heh.
-
@[email protected] The other part of it is... unlike Mastodon, we're not looking for NodeBB to become the app to use to interact with the fediverse.
It's certainly something I want out of my own use of NodeBB, but what forums are great for is cultivating niche communities based on shared interest. If I'm able to preserve that aspect while also allowing remote content to interact with the forum, then it's a win-win.