Wow, people who speak only English (or only Western languages, for that matter) are so naïve. In case you didn't know, the lang attribute is very important for East Asian languages.
-
Paul McO'Smith III replied to 洪 民憙 (Hong Minhee)
@hongminhee are the pictograms so precise when written that people in Korea would giggle if you didn't get that small vertical line exactly vertical? or vice versa?
-
洪 民憙 (Hong Minhee) replied to Paul McO'Smith III
@pavsmith No, in handwriting, it's not that important. It's like the difference between an a and an ɑ, or between crossing a 7 and leaving it uncrossed. In print, though, the wrong form looks awkward to people.
-
@hongminhee Thank you for this post.
I have added the lang attribute to posts and comments on https://piefed.social #PieFed
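For readers wondering what tagging a post involves, here is a minimal Python sketch. It is not PieFed's actual code; `tag_post` is a hypothetical helper, and the only assumption is that each post's language is known as a BCP 47 tag such as `ja` or `zh-Hans`.

```python
from html import escape


def tag_post(text: str, lang: str) -> str:
    """Wrap post text in a paragraph that carries a language tag.

    Hypothetical helper for illustration only.
    """
    # Escape both the attribute value and the body so user text cannot break the markup.
    return f'<p lang="{escape(lang, quote=True)}">{escape(text)}</p>'


# The same code point, tagged for two languages; a browser can then pick
# a font with the glyph shape readers of that language expect.
print(tag_post("直", "ja"))
print(tag_post("直", "zh-Hans"))
```

The markup differs only in the `lang` value, which is the whole point: the character data is identical, so the tag is the only hint the renderer (or a screen reader) gets.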
-
@rimu Oh, I'm glad you found my post helpful!
-
Franchesko replied to 洪 民憙 (Hong Minhee)
@hongminhee reminded me of this: https://github.com/kdeldycke/awesome-falsehood
-
洪 民憙 (Hong Minhee) replied to Franchesko
@franchesko Thanks for sharing this great resource.
-
Paul McO'Smith III replied to 洪 民憙 (Hong Minhee)
@hongminhee thanks. had always kinda wondered, as most of what i see is really detailed calligraphy, which can't be practical when writing notes in a meeting!
i suspect the problem is like right handers trying to decipher left handed writing, especially when you're writing at speed and start to smudge all of the ink!
-
洪 民憙 (Hong Minhee) replied to Paul McO'Smith III
@pavsmith Of course, there's cursive script in East Asia too. Also, each Chinese character is more like a word than a single letter, so the information density of a sentence is high (hence sentences are short).
-
Braw ☕🏳️🌈 replied to 洪 民憙 (Hong Minhee)
@hongminhee it's also very important for screen readers so they don't attempt to read foreign language with the wrong synthesiser
-
Alberto de Murga replied to 洪 民憙 (Hong Minhee)
@hongminhee Those are the same people who then believe that every person has a single given name and a single surname of 4 to 10 letters, and that every address has a state
-
洪 民憙 (Hong Minhee) replied to Alberto de Murga
@threkk Every East Asian sighs every time they see the first/last name fields.
-
Dr. Evan J. Gowan replied to 洪 民憙 (Hong Minhee)
@hongminhee I remember back when I first started learning Japanese and my phone was seemingly incapable of using Japanese fonts, and I was learning the wrong way to write characters like 過.
-
洪 民憙 (Hong Minhee) replied to Dr. Evan J. Gowan
@DrEvanGowan Haha, that's funny!
-
Janne Moren replied to 洪 民憙 (Hong Minhee)
@hongminhee @threkk
And as a non-native, I sigh each time I see a Japanese site with only first and family name fields, and nowhere to put my middle name...
-
concept of a display name replied to 洪 民憙 (Hong Minhee)
@hongminhee it’s not trivial to determine per se, but a cross-entropy classifier on character bigrams (that is, 1990s NLP) is surprisingly accurate at determining the language of a string.
However—and this is the big caveat—it’s only trivial if (1) you know where the language boundaries are and (2) the string is long enough to get robust bigram statistics.
Even if you weren’t to specify the language, “lang” solves problem (1) readily.
-
洪 民憙 (Hong Minhee) replied to concept of a display name
@thedansimonson Yeah, but text in East Asian languages is often too short for that, e.g., 孤立無援, which is a valid sentence in Korean, Chinese, and Japanese.
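To make the exchange above concrete, here is a rough Python sketch of a character-bigram cross-entropy classifier and of how it behaves on a short Han-only string. Everything in it is an assumption for illustration: the three training strings are tiny made-up samples, the function names are hypothetical, and add-one smoothing is just one simple choice. A real classifier would be trained on far more text.

```python
import math
from collections import Counter

# Tiny, made-up training samples; real models need much larger corpora.
SAMPLES = {
    "en": "the quick brown fox jumps over the lazy dog and the cat",
    "ko": "나는 오늘 학교에 갑니다 그리고 친구를 만납니다",
    "ja": "わたしはきょうがっこうにいきますそしてともだちにあいます",
}

# Shared vocabulary size for add-one smoothing (+1 stands in for unseen characters).
VOCAB = len(set("".join(SAMPLES.values()))) + 1


def bigrams(text):
    """Return the overlapping two-character windows of text."""
    return [text[i:i + 2] for i in range(len(text) - 1)]


def train(text):
    """Count bigrams and the characters that start them."""
    return Counter(bigrams(text)), Counter(text[:-1])


MODELS = {lang: train(text) for lang, text in SAMPLES.items()}


def cross_entropy(text, model):
    """Average bits per bigram of text under one model (text needs 2+ characters)."""
    pairs, contexts = model
    grams = bigrams(text)
    total = 0.0
    for gram in grams:
        # Add-one smoothed estimate of P(second char | first char).
        prob = (pairs[gram] + 1) / (contexts[gram[0]] + VOCAB)
        total -= math.log2(prob)
    return total / len(grams)


def score(text):
    """Cross-entropy of text under every language model."""
    return {lang: cross_entropy(text, model) for lang, model in MODELS.items()}


def classify(text):
    """Pick the language whose model gives the lowest cross-entropy."""
    scores = score(text)
    return min(scores, key=scores.get)


print(classify("the lazy dog"))
print(classify("학교에 갑니다"))
print(score("孤立無援"))
```

With these samples the first two strings are separated cleanly, but 孤立無援 gets the same score from every model, so whatever `classify` returns for it is an arbitrary tie-break. Here the tie arises because none of the samples contain Han characters; with real corpora the scores would likely be close rather than identical, since the characters are shared across languages and three bigrams are very little evidence. An explicit `lang` value sidesteps the guess entirely.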
-
concept of a display name replied to 洪 民憙 (Hong Minhee)
@hongminhee oh yea totally—but is it common to mix strings of that length on a website containing multiple languages? I’d suspect generally, where things get mixed up, the shortest you might have is a link or button.
-
洪 民憙 (Hong Minhee) replied to concept of a display name
@thedansimonson You're right, the shortest ones are buttons or links in a navigation bar.
-
James Wood replied to 洪 民憙 (Hong Minhee)
@hongminhee How accurately are `lang` attributes placed in practice? I remember seeing “直” displayed wrong for the intended language on social media before (and, by the way, I don't think it's possible for me to specify the intended language in my quote there), and I often see people on the Fediverse who set-and-forget their language and then post in a different language, which you can tell on the client I use because it offers to translate the message.
-
洪 民憙 (Hong Minhee) replied to James Wood
@mudri Yes, in practice, people often don't specify the lang attribute at all, and as you said, even on the Fediverse, many people post without setting the language correctly.