@ireneista
-
d@nny "disc@" mc² replied to just adrienne
@adrienne @ireneista i was looking into whether wasm/webgpu were far enough along to do something like a fawkes facial rec poisoning on a phone browser (obv with greatly reduced settings) and even if that ends up being impossible, OCR seems much less strenuous https://circumstances.run/@hipsterelectron/113239764489703522
-
Irenes (many) replied to d@nny "disc@" mc²
@hipsterelectron @adrienne right but like OCR just gives you (character, coordinates) tuples. if you then also reconstruct bounding boxes, that gets you back to what you'd have if you'd started by looking at the draw calls, but it doesn't tell you the logical structure of the characters.
-
@hipsterelectron @adrienne most tools that do this have some heuristics where they try to figure out when two things are next to each other and have the same baseline, and then also use font-specific knowledge to guess at where the spaces are (it's hard because they aren't all the same size.....)
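
A rough sketch of the kind of heuristic described above, not any particular tool's implementation: group glyphs that share a baseline, sort them left to right, and insert a space wherever the horizontal gap is large relative to the neighbouring glyph widths. The `Glyph` type, the function names, the PDF-style coordinates (y grows upward), and the `baseline_tol` and `space_ratio` thresholds are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Glyph:
    char: str
    x: float      # left edge of the glyph
    y: float      # baseline position (assumed PDF-style: larger y is higher on the page)
    width: float  # horizontal extent of the glyph


def group_lines(glyphs: List[Glyph], baseline_tol: float = 2.0) -> List[List[Glyph]]:
    """Group glyphs whose baseline is within baseline_tol of the line's first glyph."""
    lines: List[List[Glyph]] = []
    # Visit glyphs from the top of the page down.
    for g in sorted(glyphs, key=lambda g: -g.y):
        if lines and abs(lines[-1][0].y - g.y) <= baseline_tol:
            lines[-1].append(g)
        else:
            lines.append([g])
    # Within each line, order glyphs left to right.
    return [sorted(line, key=lambda g: g.x) for line in lines]


def line_to_text(line: List[Glyph], space_ratio: float = 0.3) -> str:
    """Join glyphs, guessing a space where the gap is wide relative to glyph width."""
    out = [line[0].char]
    for prev, cur in zip(line, line[1:]):
        gap = cur.x - (prev.x + prev.width)
        # The threshold scales with the local glyph size because
        # characters (and the spaces between them) are not all the same width.
        if gap > space_ratio * max(prev.width, cur.width):
            out.append(" ")
        out.append(cur.char)
    return "".join(out)


def reconstruct(glyphs: List[Glyph]) -> List[str]:
    """Turn an unordered bag of positioned glyphs into lines of text."""
    return [line_to_text(line) for line in group_lines(glyphs)]
```

This only recovers lines and word breaks. It says nothing about paragraphs, columns, or reading order across them, which is the logical structure the posts above point out is missing.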
-
@ireneista @hipsterelectron in fairness, the thing that prompted this was legal briefs, which are a reasonably clean subset of PDFs. they're all gonna be in one of like 4 fonts because courts are picky as shit, and the only bit where there's ever going to be multicolumn text is at the very beginning of the doc.
-
@ireneista @hipsterelectron like it's actually potentially a very usefully-limited case bc the format is so standardized! there are variations, sure, but there are also a lot of assumptions it's totally safe to make about the entire class.
-
d@nny "disc@" mc² replied to just adrienne
@adrienne @ireneista oh yes i'm not letting this discourage me from just pulling the text layer first i'm just fascinated that i hadn't really considered the multiple levels of difficulty beyond the character recognition part of "OCR"
-
Irenes (many) replied to d@nny "disc@" mc²
@hipsterelectron @adrienne yeah as far as we can tell, neither did Adobe >< (or, more likely, they decided that was fine)
-
d@nny "disc@" mc² replied to Irenes (many)
@ireneista @adrienne making the format difficult to parse makes authoring and consuming by external tools difficult, which redounds to monopolistic objectives (i also believe this may be a goal of gpg's reportedly horrific file format)
-
Irenes (many) replied to d@nny "disc@" mc²
@hipsterelectron @adrienne sigh gpg was doing backwards compatibility with pgp, so, maybe
-
@ireneista @hipsterelectron @adrienne PGP's file format was Fine until it needed 13 different extensions to keep up with developments in cryptography