I know the Internet Archive has been under a ton of infrastructural pressure lately, but anyone have any idea about how long they might take to review an application and get back to you?

Asta [AMP]

@[email protected] @[email protected] library scientists are literally the ideal people I would want to.. build tools for. When I first started thinking about who should run and curate this, I was thinking specifically of librarians and library scientists (not sure if they’re the same).

Asta [AMP]

@[email protected] @[email protected] which, unfortunately, I’m in the same position on: I’m not well read on the topic. I imagined software that could be run and managed by librarians and libraries, for instance, and that people could use.

The funding is barely there for the libraries, let alone servers, but I don’t think the idea is inherently bad.

Asta [AMP]

@[email protected] @[email protected] since we’re on the subject, I suppose this is why I haven’t thought too much about anything EXCEPT for how to distribute the computational workload and datasets while maintaining integrity. I just don’t currently have the knowledge about what type of algorithm and UI should be implemented for this sort of thing. I’m interested, for sure, but it’s definitely not my area of expertise.

Irenes (many)

@aud @hipsterelectron it is a topic we have significant background on, and can definitely advise on, but we need to figure out what we want first

Demi Marie Obenour

@aud @ireneista Are anonymous contributions a requirement? My understanding is that one can have anonymous contributions or prevent ban evasion but not both. One can use a trusted third party but that just leads to an infinite descent problem.

Asta [AMP]

@[email protected] @[email protected] fuck it, we’re mashing up usenet and the Dewey decimal system and not stopping till the money runs out!

(but in all seriousness, perhaps I should start looking up some library science stuff. I could try and start building code around some idea of what the data might look like, except if I’m wrong the nature of how it might need to scale out will change, so there’s not much point in building anything (unless I just want a search clone) till I have some idea of the best way to…

Forgive me for not closing the parentheses, but given the adversarial nature of the ad-surveillance-industrial complex, I’m going to call it “finding a piece of hay in a needlestack”. Anyway, seems I’ve probably got some reading to do.

Asta [AMP]

@[email protected] @[email protected] well, no, they’re not. This is much more of the “floating ideas around” stage so nothing is a hard requirement : )

I think, if there are contributions, they need as much protection against exploitation as possible. Both to tamp down the desire of distributed server operators to misuse the information, and against hostile authorities.

d@nny "disc@" mc²

@aud @ireneista i am very interested in making smaller and more powerful document search and filtering techniques. i don't think anything really "requires" surveillance at all

Jeremy Kahn

@hipsterelectron @aud @ireneista

If y'all can tolerate a white straight cis dude I would love to contribute to this conversation about information access, decentralization, shared curation, and resisting surveillance capital

Demi Marie Obenour

@aud @ireneista I think it would be best to start with text only, as opposed to images, video, or audio. The reason is that the requirements for moderating media are much stricter than for text IIUC. (IANAL, of course.)

Irenes (many)

@trochee @hipsterelectron @aud no objection from us, at least!

Irenes (many)

@alwayscurious @aud you make a good point; we're also not a lawyer but we've worked in stuff adjacent to this and that does agree with our understanding

Jeremy Kahn

@aud

> mashing up usenet and the Dewey decimal system and not stopping till the money runs out!

It's a 21st century SNEAKERS technoheist and I. Want. In.

Asta [AMP]

@[email protected] @[email protected] @[email protected] yes please!

Asta [AMP]

@[email protected] @[email protected] I guess I never said it, but I was definitely thinking of text only to start (and maybe end) with. Even if we ignore the legal minefield, it’s just a rather complex topic that likely employs technology which is often purposed to much… darker ends.

Demi Marie Obenour

@aud @ireneista I think it is best to avoid generating results directly, instead merely linking to results found elsewhere. Generating results means generative AI and that is a nightmare.

Asta [AMP]

@[email protected] @[email protected] oh, this is… I dunno if we’ve explicitly said it, but we’re thinking as faaaaaaaar from generative AI as possible. If I used the word “generate” it was just a little sloppiness on my part. There should be as little between the question and the information source as possible.

Demi Marie Obenour

@ireneista @aud Maybe the solution is to instead change the incentive structure. If people aren’t incentivized to game the system, then they won’t.

One option might be to tie results to something that has value and cannot be replaced easily, such as the author’s real-world verified identity. Legitimate scientists and other scholars will not have a problem with this, but it will hopefully make life much harder for spammers.

Irenes (many)

@alwayscurious @aud mm

that's a decision about what kinds of information can be there

our perspective on that topic is colored by our experience growing up as a closeted queer person during a period when it was illegal to favorably portray queer people on television

Irenes (many)

@alwayscurious @aud we had a conversation about legal-name requirements on here recently in which someone pointed us to a public source finding they don't work, which we need to see if we can find again... we've known that for quite some time but the source we had for it isn't public