Traversing the reply chain when working with topics
-
i meant the latter but you could probably do the former if there were a mechanism to link together equivalent contexts as aliases of each other. for now, the easiest thing to do would be to just copy the "authoritative" one by whoever created the thread.
-
@[email protected] NodeBB now supplies context with every as:Note object, and it is resolvable as an OrderedCollection.
One thing that is not currently done is what we talked about here: inheriting the authoritative context and serving that instead. I will need to think that through a bit more.
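For anyone consuming such a context, a minimal Python sketch of walking it page by page might look like the following. The `fetch` callable (id in, parsed JSON out), the `max_items` cap, and the function name are assumptions for illustration; only `first`, `next`, and `orderedItems` come from the ActivityStreams vocabulary, and this is not a description of how NodeBB serves or reads the collection.

```python
from typing import Callable, Iterator, Union

Ref = Union[str, dict]  # items may be bare ids or embedded objects


def iter_context(context_id: str,
                 fetch: Callable[[str], dict],
                 max_items: int = 500) -> Iterator[Ref]:
    """Yield the items of an OrderedCollection, following first/next pages.

    `fetch` is a hypothetical helper that dereferences an id to a dict.
    """
    collection = fetch(context_id)
    yielded = 0
    seen_pages = set()
    # A small collection may inline its items instead of paging.
    page = collection if "orderedItems" in collection else collection.get("first")
    while page is not None and yielded < max_items:
        if isinstance(page, str):
            if page in seen_pages:  # guard against pages that link in a loop
                break
            seen_pages.add(page)
            page = fetch(page)
        for item in page.get("orderedItems", []):
            if yielded >= max_items:
                return
            yield item
            yielded += 1
        page = page.get("next")
```

The cap and the page-loop guard are there because the collection comes from a remote server and should not be trusted to terminate.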
-
Angus McLeod replied to Angus McLeod:
The Discourse plugin will implement reply chain traversal for the purpose of topic detection when this is merged: https://github.com/discourse/discourse-activity-pub/pull/98
Essentially it implements the following (but with a limit of 3 instead of 5):
angus: Go back N number of replies (perhaps 5) to see if there is a Note already associated with an existing post. If we find a Note (say at the 4th iteration) we import ALL of the intervening Notes, and add ALL notes as new posts in the relevant topic. So we’d end up with 5 new posts in the existing topic in this example.
If you're curious about the detail see the ContextResolver spec: spec/lib/discourse_activity_pub/context_resolver_spec.rb
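To make the quoted idea concrete, here is a rough Python sketch of a bounded walk back along inReplyTo. It is not the plugin's ContextResolver (which is Ruby); the names `resolve_topic`, `fetch`, and `known_topic_for` are hypothetical, inReplyTo is assumed to be a plain id string, and exactly how hops are counted against the limit is also an assumption.

```python
def resolve_topic(note, fetch, known_topic_for, limit=3):
    """Walk inReplyTo up to `limit` hops looking for an already-imported Note.

    fetch(id) -> dict or None   (hypothetical: dereference a remote Note)
    known_topic_for(id) -> topic id or None   (hypothetical local lookup)

    Returns (topic_id, notes_to_import) with notes ordered oldest first,
    or (None, []) if no known Note is found within the limit.
    """
    pending = [note]  # notes seen so far that are not yet imported
    current = note
    for _ in range(limit):
        parent_id = current.get("inReplyTo")
        if not parent_id:
            break  # reached the root without finding anything we know
        topic_id = known_topic_for(parent_id)
        if topic_id is not None:
            # Found an existing post: import everything in between.
            return topic_id, list(reversed(pending))
        parent = fetch(parent_id)
        if parent is None:
            break  # unreachable or unprocessable parent
        pending.append(parent)
        current = parent
    return None, []
```

Nothing is imported unless a known Note is reached, so a chain that runs past the limit costs a few requests and nothing else.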
-
@[email protected] may I ask why you add a limit to the traversal logic?
I can see an argument made against doing so if it locks up the process, but the downside is you'd still have some cases where you don't get the full context.
Either way this may be moot if an iterable context is found, so inReplyTo traversal is ideal as a fallback mechanism.
Edit: in NodeBB's case, we call an internal recursive method called getParentChain which just makes the S2S call and adds it to a Set. The method terminates when it encounters an object with no inReplyTo or is unprocessable.
-
The honest answer is that a limit makes some intuitive sense to me, but I have medium to low confidence in the cogency of my thinking on both the limit and where it's set. I've set it at 3 as that seems to be the more "conservative" (read "safer") approach while I think it through further / see how this first version works in practice.
In terms of the "risks" (to the extent they exist) I think I'm thinking a version of the following:
- You could be sent a random Note inReplyTo an unrelated Note that's part of a large chain which you end up traversing for no reason.
- Even if you eventually get to a Note in an existing topic, say 20 replies in, is it still right to say that those replies are part of your topic in a coherent sense? In what scenario would you be missing 20 odd replies? Perhaps there is one.
-
Angus McLeod replied to Angus McLeod:
I guess one of the things I'm assuming is that other services are implementing the Inbox Forwarding spec correctly, which would mean that, in an ideal world, you should already have the replies you should have anyway and this is more of a "stop gap". https://www.w3.org/TR/activitypub/#inbox-forwarding
However, I note that Mastodon violates the spec here, which means that more replies from Mastodon might be missed than is ideal: https://github.com/mastodon/mastodon/issues/5631#issuecomment-343039649
-
@[email protected] said in Traversing the reply chain when working with topics:
more replies from Mastodon might be missed than is ideal
You are not incorrect. In practice the following situation happens occasionally, especially in larger/busy topics:
- You post a reply to a topic/thread (branch A), but a different branch (B) of the topic occurs outside of your view (since the activities are not forwarded to you)
- Later on, someone you do follow replies in branch B, and you receive it.
- Traversal finds 20 posts in between that you missed, and they are all added at once. You receive the notification of new posts in the topic, except now all of the "new" posts are scattered throughout the linear flow
- Additionally, some of these new posts might appear in places higher up than where you last read
So this violates the assumption (at least in NodeBB) that if you have a "read up to" point in a topic, that there will not be new content above that point.
@[email protected] said in Traversing the reply chain when working with topics:
is it still right to say that those replies are part of your topic in a coherent sense?
From a purely technical point of view, yes, they are part of the same context (at least as derived via reply chain traversal), but from a UX POV, you could make that argument.
A forum with a linear flow of posts tends to diverge less often due to the nature of the presentation of posts themselves; something threaded models don't need to contend with.
-
@[email protected] said in Traversing the reply chain when working with topics:
You could be sent a random Note inReplyTo an unrelated Note that's part of a large chain which you end up traversing for no reason.
Another legitimate concern. My counter is that traversing the chain is rather inexpensive:
XHR => (do other things while waiting) => inReplyTo? XHR... etc.
Actual note processing is done only once the chain is complete, and a positive relation is found.
... but I can see how this could lock up the process in other languages where processing literally stops when waiting for the XHR to complete.
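A small asyncio sketch of that two-phase shape (collect the chain without blocking, then process only if it connects to something known) could look like this. It is an illustration in Python, not NodeBB's code; `fetch`, `is_known`, and `process` are hypothetical callables, inReplyTo is assumed to be a plain id string, and the hop limit and cycle checks are deliberately left out to keep the flow visible.

```python
import asyncio


async def collect_chain(note, fetch):
    """Follow inReplyTo one request at a time.

    Each `await` hands control back to the event loop, so other work
    proceeds while a request is in flight.
    """
    chain = [note]
    while chain[-1].get("inReplyTo"):
        parent = await fetch(chain[-1]["inReplyTo"])
        if parent is None:  # unreachable or unprocessable
            break
        chain.append(parent)
    return chain


async def handle_incoming(note, fetch, is_known, process):
    """Process notes only when the chain reaches one we already have."""
    chain = await collect_chain(note, fetch)
    if any(is_known(n["id"]) for n in chain[1:]):
        for n in reversed(chain):  # oldest first
            if not is_known(n["id"]):
                process(n)
        return True
    return False  # unrelated chain: requests were made, nothing stored
```

In a runtime without cooperative scheduling, the same loop would hold a worker for the duration of every request, which is the case the previous sentence is pointing at.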
-
Yeah, I agree with you on both points. It’s similarly inexpensive to make 20 requests in Discourse (it’s in a background process on a separate thread). On reflection I think part of my conservatism here is that I don’t like that Mastodon doesn’t forward activities properly and I don’t like starting from a position that I need to import 20 notes that Mastodon failed to forward to me. It causes similar metadata issues in Discourse.
-
@julian @angus A bad actor with some programming skill could send you a Note that's part of an infinite inReplyTo chain.
This gets even worse if you want to look at the replies collections of individual Notes - which could form an infinitely branching tree.
None of this happens if there's a One True Collection from which the whole thread can be fetched in one gulp.
-
@[email protected] said in Traversing the reply chain when working with topics:
infinite inReplyTo chain.
I think this could be solved in part by having the chain traversal sanity-check that the id has not already been retrieved, but I'm not naive enough to assume that it can't be circumvented.
... so yes, in that sense a limit makes sense from a security standpoint.
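A sketch of combining both safeguards, in Python for illustration: a set of already-seen ids catches a chain that loops back on itself, and a hop limit catches a server that mints a fresh id for every parent. The function name, the `fetch` callable, and the default of 20 hops are assumptions, and inReplyTo is assumed to be a plain id string.

```python
def parent_chain(note, fetch, max_hops=20):
    """Collect a note's ancestors, returning (chain, reason_stopped).

    fetch(id) -> dict or None is a hypothetical helper that
    dereferences a remote object.
    """
    seen = {note["id"]}
    chain = [note]
    while True:
        parent_id = chain[-1].get("inReplyTo")
        if not parent_id:
            return chain, "root"
        if parent_id in seen:
            return chain, "cycle"  # id already retrieved
        if len(chain) - 1 >= max_hops:
            return chain, "limit"  # an endless chain of fresh ids lands here
        parent = fetch(parent_id)
        if parent is None:
            return chain, "unprocessable"
        seen.add(parent_id)
        chain.append(parent)
```

The seen-set alone is not enough, since a malicious server never has to repeat an id; the limit is what bounds the work.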