on #bluesky there literally is the guy with the big dial constantly looking back at the audience for approval like a contestant on the price is right

jonny (good kind)

it is a powerful default if the statistics of the network are to be believed (strongly homogeneous at a network level way way above randomness)

jonny (good kind)

maaaaannnnnn ily but noooooooooo

jonny (good kind)

discussions of network-level algo perf are discussions you should have had before you made a network that requires a network level algorithm

Irenes (many)

@jonny .... they should definitely be using pagerank for this, it can incorporate feedback along the graph without making the query every time, that's the thing it's good at

but also, they should not have a recommendation algorithm at all that is a bullshit tool of oppression that ought not exist

jonny (good kind)

i genuinely feel for this person and wish them well if they were left in charge of the Whole Algorithm

jonny (good kind)

@ireneista it feels way more by the seat of its pants than that! the last public information about the discover algorithm was from back when it was "what's hot" and it was a transcription of the hacker news algo into a single materialized sql query
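
For context, a minimal sketch of the time-decay ranking usually attributed to Hacker News, in its commonly cited form: score = (points - 1) / (age in hours + 2) ^ gravity. This is not the actual Bluesky query, which isn't shown in this thread. The function name `hot_score`, the sample posts, and the gravity value of 1.8 (the figure most often quoted) are assumptions for illustration.

```python
def hot_score(points, age_hours, gravity=1.8):
    """Commonly cited Hacker News-style score: votes decayed by post age.

    gravity=1.8 is the widely quoted value, assumed here; it is not taken
    from any Bluesky source.
    """
    return (points - 1) / (age_hours + 2) ** gravity


# Hypothetical posts: (name, points, age in hours).
posts = [
    ("old_popular", 200, 48.0),
    ("new_modest", 20, 1.0),
    ("fresh", 3, 0.5),
]

# Highest score first, the way a "what's hot" feed would order them.
ranked = sorted(posts, key=lambda p: hot_score(p[1], p[2]), reverse=True)
for name, points, age in ranked:
    print(name, round(hot_score(points, age), 3))
```

The whole ranking is one expression per post, which is why it would fit in a single SQL query: no graph structure is consulted at all.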

Irenes (many)

@jonny pagerank is really less than a week of work to implement

optimizing it to run in the large may be harder, but 25 million fits in memory just fine, so really it should be fine
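
A rough sketch of what the small version might look like: plain power-iteration PageRank over a follow graph held in memory. The edge list, the `pagerank` function name, the damping factor of 0.85, and the fixed iteration count are all assumptions for illustration, not anything Bluesky is known to run.

```python
def pagerank(edges, damping=0.85, iterations=50):
    """Power-iteration PageRank over a list of (source, target) edges."""
    nodes = sorted({n for edge in edges for n in edge})
    out_links = {n: [] for n in nodes}
    for src, dst in edges:
        out_links[src].append(dst)

    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Rank held by nodes with no outgoing edges is spread evenly over everyone.
        dangling = sum(rank[node] for node in nodes if not out_links[node])
        base = (1.0 - damping) / n + damping * dangling / n
        new_rank = {node: base for node in nodes}
        for src in nodes:
            if out_links[src]:
                # Each node splits its damped rank equally among the nodes it links to.
                share = damping * rank[src] / len(out_links[src])
                for dst in out_links[src]:
                    new_rank[dst] += share
        rank = new_rank
    return rank


# Hypothetical follow graph: ("a", "b") means a follows b.
follows = [("a", "b"), ("b", "c"), ("c", "a"), ("d", "c")]
ranks = pagerank(follows)
for node, score in sorted(ranks.items(), key=lambda kv: kv[1], reverse=True):
    print(node, round(score, 4))
```

On the memory point: one 8-byte float per account for 25 million accounts is 200 MB for the rank vector. The edge list is the larger part, and its size depends on how many follows each account has.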

Asta [AMP]

@[email protected] @[email protected] pagerank isn’t even under patent anymore! You can literally just use it! For free!

Irenes (many)

@aud @jonny yes indeed

Irenes (many)

@aud @jonny we want to be SUPER clear that we aren't making fun of the person. especially since it really does sound like there's just one person doing this part of the system, it's a super easy idea to miss.

Asta [AMP]

@[email protected] @[email protected] I was really surprised by how relatively simple the algorithm seemed. I haven’t tried to implement it, so maybe it’s harder than it seems, but I would also be surprised if there wasn’t some existing implementation that they could at least test.

Asta [AMP]

@[email protected] @[email protected] oh no, for sure. I didn’t get that impression, nor was I trying to make fun of them either.

jonny (good kind)

wow that's so much responsibility, serious props

jonny (good kind)

@ireneista @aud confirmed mostly one person, dang: https://neuromatch.social/@jonny/113638937294602495
https://bsky.app/profile/why.bsky.team/post/3ld3uammbjs2j

Asta [AMP]

@[email protected] @[email protected] Jesus

If they’re hiring, can I apply <_<

jonny (good kind)

i guess my background assumption was that would be something you would want constant feedback on from everyone, and eventually if i was in charge of it my whole goal would be to make it not exist by giving the handles to everyone because that's too much responsibility for anyone to have lmao

Ulrike Hahn

@jonny understudied question: how perceptions of algorithmically determined feed quality change as a function of network size (by algorithm, including reverse chronological)

Cassandrich

@aud @jonny @ireneista If you can scrape enough and explicitly disregard junk, you can probably make a genuine search engine (not reskinned aggregator) at least as good as early Google (i.e. way better than late Google) this way.

Cassandrich

@aud @jonny @ireneista Proposed algorithm for first stage disregarding junk: statistical model for SEO spam (that doubles as model for AI spam since it was trained off SEO spam). 🤪

Irenes (many)

@dalias @aud @jonny mm. the models that detect spam are essentially the same models that generate spam, just used in a slightly different mode. who wins is a question of who has more data.

we do not have more data than google does, and the spammers have breached the walls of google's castle and are pouring into the keep.