I know the Internet Archive has been under a ton of infrastructural pressure lately, but anyone have any idea about how long they might take to review an application and get back to you?
-
squeaky worm gets the sweet sweet grease motherfuckers
(I mean, potentially)
-
@aud it would be very difficult to get laughed out of the room by contributing code
-
@[email protected] fair enough! I also went into their gitter channel and was like "I put in an application for a job at the IA so I decided to also try and fix an issue while I was at it to draw some attention to myself (but also just to help out in general)"
subtle!
-
@aud that is useful info for them to know!
-
@[email protected] right!? why not. Worst case scenario, I spent 30 minutes remembering how to do stuff in Python and with REST APIs, and contributed some code that either works as is or could be worked into something that fixes the problem in a way they need.
-
@aud so, belatedly... pieces like this make us wonder if the idea of searching the entire web is just over https://www.giantfreakinrobot.com/ent/independent-ends.html
-
@aud we wonder if there's a way to, like... make the act of curation into something more social
like, the early post-based social media platforms may have thought of themselves as doing that, but the artifacts they produce are fundamentally bound to a moment in time in the sense that you can't make sense of them without knowing the person and having the same social context they did when they wrote it
-
@aud our thinking here is that when it's a communal activity, people are more likely to see it as important, which is necessary if it's going to be a long-term pattern
-
@aud we also want to call your attention to this art project which is, essentially, a curated "catalog" of physical stuff you can get free directions for making (and source code, where applicable)
-
@aud in the past, when we've seen people do lists of favorite projects as github notes repos or whatever, it's felt like the lists are both too hard to get actionable information from, and too un-memorable to go back to when we have a need for them. this addresses both problems. it's particularly fascinating because paper catalogs are nowhere near as common as they used to be...
-
@aud ... and yet it does feel like the book is filling a clear need. browsing the various how-to sites and 3D-printing "stores" doesn't make it easy to find the good stuff, because although these places can resemble catalogs in their structure, there's no editorial judgement that goes into them, only algorithmic optimization and search features
-
@[email protected] so, I haven’t read the piece you linked yet but yes. That’s what I’ve been thinking, too. “Search” but as an act of curation; little emphasis on crawling, and instead focusing on selected portions that humans have decided to include. I kind of envision, in my head, something akin to a library, in which humans have applied some degree of categorisation. I think “Google is only useful if you append ‘Reddit’ to your search” is an example of that; for instance, we know a lot of the sites Google returns are trash, so why bother including them in the first place? There’s a ton of sites that are already just AI slop that don’t need to be indexed or rewarded. Humans adding (and removing) links (or a degree of supervised crawling) can avoid ever falling into that quagmire in the first place.
I think the fundamental thing that has stuck with me is that we can now generate garbage at an incredible scale. Combine that with the fact that I think “search”, or maybe more appropriately “access to knowledge and information”, must be held outside corporate interests, and the idea of searching the entire web is… just, why, even? What does that bring except the arms race of truth vs profit? So I think, more than ever, the expert eye of a human (not a gatekeeper; we already have Google) is critical. I think the “social” aspect is very important, too, which is why I was kicking around ActivityPub as the communication standard for it (allowing humans to curate and rank, for example).
tl;dr it sounds like we are very much on the same page. I do not think a tool needs to be perfect here; it just needs to exist and be reasonably devoted to the idea of preserving access to information.
-
@aud that makes a ton of sense to us. we're very glad to be on the same page, as well!
-
@aud a thing we've asked ourselves recently is whether the world has changed enough that something like the original Yahoo, which was a list put together entirely by humans, makes sense again
it originally stopped making sense around the time the web got bigger than 30,000 pages (which also messed up altavista, in a totally different way)
-
@aud the web is, of course, still much larger than that today, even if we exclude corporate sites
-
@[email protected] so maybe I’ve been saying it incorrectly; what I think is important is not the search algorithm itself, but the specific set of webpages that it’s searching through. Any algorithm I could make could be gamed to return click bait genAI slop, but it’s much harder to ruin a curated set.
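A minimal sketch of that idea, assuming a tiny in-memory index: the `CuratedIndex` class, its method names, and the example URLs are all hypothetical, and the scoring is deliberately naive. The point is only that the ranking never sees a page a human did not add, and that removing a link takes it out of every future result.

```python
from dataclasses import dataclass, field


@dataclass
class CuratedIndex:
    """Hypothetical index that only searches links humans have added."""

    pages: dict[str, str] = field(default_factory=dict)  # url -> description

    def add(self, url: str, description: str) -> None:
        # A human curator decides this page belongs in the set.
        self.pages[url] = description

    def remove(self, url: str) -> None:
        # Removing a link drops it from all later searches.
        self.pages.pop(url, None)

    def search(self, query: str) -> list[str]:
        # Naive scoring: count how often each query term appears
        # in the curator-written description.
        terms = query.lower().split()
        scored = []
        for url, description in self.pages.items():
            words = description.lower().split()
            score = sum(words.count(term) for term in terms)
            if score > 0:
                scored.append((score, url))
        # Highest score first; ties broken alphabetically by URL.
        scored.sort(key=lambda pair: (-pair[0], pair[1]))
        return [url for _, url in scored]


index = CuratedIndex()
index.add("https://example.org/sourdough", "sourdough bread baking guide")
index.add("https://example.org/rye", "rye bread notes")
index.add("https://example.com/slop", "bread bread bread best bread deals")
index.remove("https://example.com/slop")  # a curator pulls the junk page
results = index.search("bread baking")
```

Even though the removed page would have won on raw term counts, it cannot appear in `results`, because it is no longer in the set being searched.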
-
@[email protected] I’ve been asking exactly the same question.
-
@aud the first thing we see tugging against that is that there are a variety of reasons that people choose to obscure things rather than clarify them, with any sort of crowdsourcing. we do think that this stuff has been around long enough that those motivations can be taxonomized and reasoned about...
-
@aud on quora we've seen people do ideologically motivated hostile taxonomy, for example to suppress write-ups surrounding the ethics of abortion
-
@aud on stack overflow we've seen people use the site's various unlockable moderation powers as tools for building their personal fame and "winning" petty grievances