Slightly better titles from fediverse topics
-
An update from last night brings some additional logic to the title generation of topics from the fediverse.
Previously, if a title was provided in the name property, that was used as the topic title.
While that hasn't changed (and is the strongest signal for a topic title), not all fediverse content contains titles. Specifically, Mastodon posts do not require or even have a space to put a title in.
For those cases, we fall back to generating one based on the content. Until now, we literally grabbed the first 128 characters or so and added an ellipsis to the end.
While that worked okay as a stopgap, it meant that a lot of topics ended up with really long titles — not ideal.
The new logic tries to grab the first line of text (either the first <p> or line), and from there, the first sentence, using some naive regular expressions.
While still not a proper alternative to... you know... specifying a title, it's better than nothing I suppose!
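For anyone curious, the old and new approaches can be sketched roughly as below. This is only an illustration in Python, not the actual NodeBB code: the function names, the regular expressions, and the choice to cut at a period followed by whitespace are all assumptions.

```python
import re
from html import unescape


def old_title(text: str, limit: int = 128) -> str:
    """Old behaviour as described: the first ~128 characters plus an ellipsis."""
    return text if len(text) <= limit else text[:limit] + "..."


def first_line(html: str) -> str:
    """Text of the first <p>, or the first non-empty line if there is no <p>."""
    match = re.search(r"<p(?:\s[^>]*)?>(.*?)</p>", html, flags=re.IGNORECASE | re.DOTALL)
    text = match.group(1) if match else html
    text = re.sub(r"<br\s*/?>", "\n", text, flags=re.IGNORECASE)
    text = unescape(re.sub(r"<[^>]+>", "", text))  # naive tag strip
    for line in text.splitlines():
        if line.strip():
            return line.strip()
    return ""


def first_sentence(line: str) -> str:
    """Cut at the first period that is followed by whitespace or end of line."""
    match = re.search(r"\.(?=\s|$)", line)
    return line[: match.start()] if match else line


def generate_title(html: str) -> str:
    return first_sentence(first_line(html))
```

Requiring whitespace after the period is a small guard so that something like "v4.0" is not treated as the end of a sentence.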
I wonder if other fediverse software implements title generation logic like this...
-
@julian What Lemmy understands is this:
Title
@Community
Post body
It was added back in the day to make it possible for Mastodon users to start new threads in connected Lemmy communities.
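To make that layout concrete, here is a rough Python sketch of how a receiver could split such a post into its three parts. It is not Lemmy's code: the function name, the return type, and the rule that the second line must be a single @mention are assumptions for illustration only.

```python
import re
from typing import NamedTuple, Optional


class ThreadStarter(NamedTuple):
    title: str
    community: str
    body: str


def parse_thread_starter(text: str) -> Optional[ThreadStarter]:
    """Split 'title / @community / body'; return None if the layout does not match."""
    # Split on runs of newlines so blank lines between the parts are tolerated.
    parts = re.split(r"\n+", text.strip(), maxsplit=2)
    if len(parts) < 3:
        return None
    title, mention, body = (part.strip() for part in parts)
    # Assumption: the second line holds nothing but one @mention.
    if not title or not mention.startswith("@") or " " in mention:
        return None
    return ThreadStarter(title, mention, body)
```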
#FediMeta #FediverseMeta #CWFediMeta #CWFediverseMeta
-
Rimu replied to julian
@julian PieFed uses the first 150 chars (adjusted for word boundaries) of the first <p>.
I like your method of stopping at the first '.', that would yield better results more often.
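As an illustration of "first 150 chars, adjusted for word boundaries", a minimal Python sketch could look like this. It is not PieFed's actual implementation; the function name and the exact back-off rule are assumptions.

```python
def truncate_at_word(text: str, limit: int = 150) -> str:
    """Return at most `limit` characters, backing up so no word is cut in half."""
    if len(text) <= limit:
        return text
    cut = text[:limit]
    # If the next character is not whitespace, the cut landed mid-word:
    # back up to the last space, if there is one.
    if not text[limit].isspace():
        space = cut.rfind(" ")
        if space > 0:
            cut = cut[:space]
    return cut.rstrip()
```

A single word longer than the limit is simply cut at the limit, since there is no earlier boundary to fall back to.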
-
@julian thanks Julian! what about AI-backed title generation with character limitation?
for example, I put your post here: https://seo.ai/tools/ai-title-generator
and got this:
not bad I guess...
I love "topic sentences" and try to use them when I start a new topic, but unfortunately they are not commonly used by others.
-
@crazycells no, I will never use AI for this purpose.
Because the resulting content is in the title, it would be implicitly misattributed to the topic author, without their consent.
You're of course free to use AI to generate a title for your own topics! No problem with that hehe
-
@[email protected] said in Slightly better titles from fediverse topics:
Title
@Community
Post body
Thanks, I hate it.
I should say, rather, that I get why it was done, and bonus points for just getting it done, but it reads so much like a "hack it until it works" methodology that I feel like we ought to be better than that by now.
-
@[email protected] said in Slightly better titles from fediverse topics:
I like your method of stopping at the first '.', that would yield better results more often.
Thanks, it worked decently until I remembered that there were additional punctuation marks besides the lowly period.
So I had to add in support for ? and !, and update the logic to actually add those punctuation marks back in to the title.
... and yet there are more edge cases... some bot accounts post a title-esque first line along with a link, which needs to be teased out.
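Roughly, the extra punctuation handling and the link clean-up could be sketched like this in Python. Again, this is not the real NodeBB code: the names, the regexes, and the decision to keep ? and ! while dropping a plain period are assumptions.

```python
import re

SENTENCE_END = re.compile(r"[.?!]+(?=\s|$)")
URL = re.compile(r"https?://\S+")


def title_from_line(line: str) -> str:
    """First sentence of a line, with bare links removed beforehand."""
    # Strip links first so the dots inside a URL are not taken for sentence ends.
    line = re.sub(r"\s+", " ", URL.sub("", line)).strip()
    match = SENTENCE_END.search(line)
    if not match:
        return line
    mark = match.group(0)
    # Assumption: a plain period is dropped, while ? and ! are added back.
    keep = "" if mark == "." else mark
    return line[: match.start()] + keep
```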
-
Scott M. Stolz replied to julian
@julian
...not all fediverse content contains titles. Specifically, Mastodon posts do not require or even have a space to put a title in.
Hubzilla dealt with that issue by putting the title of the post in the post itself, so people on Mastodon and other platforms could see the title. So the title is transmitted in both the title field and the body field. It looks redundant, but I see why they did it that way, especially since Mastodon is so dominant in the space.
-
Best of luck. We gave up trying to force or generate a title back in 2010, because some posts are nothing but a photo or video. And as you noticed, titles are somewhat incompatible with the microblog side of the fediverse. If your own software requires a title, then in the cases where you can't extract any words from the content you're probably stuck with something like '[unknown title]'.
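A minimal Python sketch of that last-resort fallback, assuming a hypothetical helper name and a naive tag strip (this is not taken from any particular project):

```python
import re
from html import unescape


def title_or_placeholder(html: str, placeholder: str = "[unknown title]") -> str:
    """Return the visible text of a post, or a placeholder if there is none."""
    text = unescape(re.sub(r"<[^>]+>", " ", html))  # drop tags, keep text
    text = " ".join(text.split())                   # collapse whitespace
    return text if text else placeholder
```

A post that contains only an image or video tag ends up with no text at all, so it gets the placeholder.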
-
@julian Ooo good point about adding the ? back on.
If you're interested in a non-regex solution, here's what I have - https://codeberg.org/rimu/pyfedi/src/branch/main/app/utils.py#L247