I know the Internet Archive has been under a ton of infrastructural pressure lately, but anyone have any idea about how long they might take to review an application and get back to you?

d@nny "disc@" mc²

@aud @ireneista i am very interested in making smaller and more powerful document search and filtering techniques. i don't think anything really "requires" surveillance at all

Jeremy Kahn

@hipsterelectron @aud @ireneista

If y'all can tolerate a white straight cis dude I would love to contribute to this conversation about information access, decentralization, shared curation, and resisting surveillance capital

Demi Marie Obenour

@aud @ireneista I think it would be best to start with text only, as opposed to images, video, or audio. The reason is that the requirements for moderating media are much stricter than for text IIUC. (IANAL, of course.)

Irenes (many)

@trochee @hipsterelectron @aud no objection from us, at least!

Irenes (many)

@alwayscurious @aud you make a good point; we're also not a lawyer but we've worked in stuff adjacent to this and that does agree with our understanding

Jeremy Kahn

@aud

> mashing up usenet and the Dewey decimal system and not stopping till the money runs out!

It's a 21st century SNEAKERS technoheist and I. Want. In.

Asta [AMP]

@[email protected] @[email protected] @[email protected] yes please!

Asta [AMP]

@[email protected] @[email protected] I guess I never said it, but I was definitely thinking of text only to start (and maybe end) with. Even if we ignore the legal minefield, it’s just a rather complex topic that likely employs technology which is often purposed to much… darker ends.

Demi Marie Obenour

@aud @ireneista I think it is best to avoid generating results directly, instead merely linking to results found elsewhere. Generating results means generative AI and that is a nightmare.

Asta [AMP]

@[email protected] @[email protected] oh, this is… I dunno if we’ve explicitly said it, but we’re thinking as faaaaaaaar from generative AI as possible. If I used the word “generate” it was just a little sloppiness on my part. There should be as little between the question and the information source as possible.

Demi Marie Obenour

@ireneista @aud Maybe the solution is to instead change the incentive structure. If people aren’t incentivized to game the system, then they won’t.

One option might be to tie results to something that has value and cannot be replaced easily, such as the author’s real-world verified identity. Legitimate scientists and other scholars will not have a problem with this, but it will hopefully make life much harder for spammers.

Irenes (many)

@alwayscurious @aud mm

that's a decision about what kinds of information can be there

our perspective on that topic is colored by our experience growing up as a closeted queer person during a period when it was illegal to favorably portray queer people on television

Irenes (many)

@alwayscurious @aud we had a conversation about legal-name requirements on here recently in which someone pointed us to a public source finding they don't work, which we need to see if we can find again... we've known that for quite some time but the source we had for it isn't public

Demi Marie Obenour

@ireneista @aud I was thinking of long-established reputations, such as those of well-known scientists and organizations. That would be able to filter out the vast majority of e.g. COVID-19 or climate change misinformation.

Irenes (many)

@alwayscurious @aud yeah.... well as an anarchist we don't really want this to only be able to exist because of large institutions, but we definitely think that kind of approach has to be at least on the table

Demi Marie Obenour

@aud To me, and after further thoughts, this seems a lot like Wikipedia, so I would start by describing how you would differ from Wikipedia.

Asta [AMP]

@[email protected] Well, there's a couple quick ones.

1. Wikipedia is run by a non-profit I believe; this would be a community effort
2. Emphasis would be on curation as it applies to a mapping of what is out there on the web, not so much of original articles with references. Wikipedia writes encyclopedia-like articles that discuss certain topics, whereas the idea here would be to directly provide links to websites that are the sources of data.
3. Wikipedia has "notability" and other editorial standards that aren't necessarily in line with what I think would be required here; for instance, Wikipedia isn't likely to host a page that's just links to a bunch of personal blogs, but I think that information absolutely should be available somewhere.

Those are just the first that spring to mind!

Asta [AMP]

@[email protected] @[email protected] I had similar thoughts and concerns, especially around, you know, what if we had a project where the people doing the contributing were librarians, for instance? We're already in a period of time where libraries are being targeted, so even though they're potentially the right type of people to be doing this work, if it was 'just' librarians it'd be easy to target them.

I like the idea of disincentivizing potential abuse targets entirely, except if something is widely used and available it likely becomes a potential target. I mean, ffs, look at how much money the tech industry has poured into taking away work from small scale writers and artists. Just ridiculous.

(the for fuck's sake is about the tech companies, for the record)

Walnut

@ireneista @alwayscurious @aud
Maybe one of these? They aren't strictly about the implications of requiring a legal name...

In other organizations I've seen that require a "legal" name, they never had any way to check or enforce it, so you only had to pick a name that "looked" real to largely white American/European men. That has its own problems of course.

https://www.kalzumeus.com/2010/06/17/falsehoods-programmers-believe-about-names/

https://shinesolutions.com/2018/01/08/falsehoods-programmers-believe-about-names-with-examples/

Asta [AMP]

@[email protected] @[email protected] @[email protected] yeah, that… oooof at the whole… all of it regarding just needing to pick a “real” sounding name according to racial standards. I think in general, I’d definitely prefer a system that didn’t require a real name; I think that introduces far more problems than it helps solve. Anonymity is good. And just because people can be anonymous doesn’t mean moderation can’t be done (in the same way that using a real name doesn’t guarantee people behave).

I don’t (yet?) know how one gets around the issue of requiring something so that auditing can be done, but I think specifically requiring real names…. Mmmmm. Sorry, just sorta rambling now. I just think having real names attached to data is just asking for trouble, in general (even if everyone behaves, data about what sort of links they’ve added or ranked could be telling. And if a server doesn’t delete that type of data or specially leaks it, well…)