Not a great resolution to this inquiry. But I do appreciate the direct communication from masto.host.
https://social.polotek.net/@polotek/113125789856269132
-
For the early days of the fediverse, getting what you want requires running your own infrastructure. I think that's to be expected and not a bad thing. I'm probably going to invest in migrating social.polotek.net to hosting that I control.
At least I'll get familiar with the state of doing migrations. In my opinion, portability should be the number one killer feature of the fediverse. But we've got work to do in order to make that true.
-
@polotek hachyderm struggled with this for a while during our migration. We had to do some weird shenanigans with DNS and it still didn't quite work. Changing URLs for media was absolutely painful and mastodon makes this way harder than it needs to be
-
@hazelweakly did you end up rewriting post content?
When you say mastodon makes it harder, is that because of the post cache thing?
All I'm looking for today is to control the domains on my urls. It's a small step, but I think it matters.
-
@polotek we experimented with rewriting post content but had to abandon that. It turns out that when servers fetch things they store the media themselves. So you basically get one chance to send them the correct file, or otherwise they cache the missing file in a weird way and changing the post content on your side doesn't fix that
We ended up with an nginx configuration that tried 1-3 different fallback URLs + a server disk location before finally failing
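The fallback idea above can be sketched in a few lines of Python. This is only an illustration of the lookup order (fallback URLs first, then a disk location, then failure), not the actual Hachyderm nginx configuration. The function name `resolve_media`, the `fetch` and `read_local` callables, and the example hostnames are all assumptions made for the sketch.

```python
from typing import Callable, Iterable, Optional


def resolve_media(
    path: str,
    fallback_bases: Iterable[str],
    fetch: Callable[[str], Optional[bytes]],
    read_local: Callable[[str], Optional[bytes]],
) -> Optional[bytes]:
    """Try each fallback base URL in order, then local disk, then give up.

    `fetch` and `read_local` are hypothetical callables that return the
    file body, or None when the file is missing at that location.
    """
    for base in fallback_bases:
        candidate = base.rstrip("/") + "/" + path.lstrip("/")
        body = fetch(candidate)
        if body is not None:
            return body
    # Last resort: the copy on the server's own disk (None means "not found").
    return read_local(path)


# Usage with in-memory stand-ins for the remote stores and the disk.
remote = {"https://new.example/media/a.png": b"new"}
disk = {"b.png": b"disk"}
bases = ["https://old.example/media", "https://new.example/media"]
found = resolve_media("a.png", bases, remote.get, disk.get)
```

Returning `None` only after every location has been tried matters here, because a remote server may cache whatever answer it gets on its first fetch.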
-
@polotek unfortunately the rest was just waiting the 7 to 30 days for the cache to expire on everyone else's mastodon instances before they re-fetched things. But due to timing and inconsistent configuration settings on other servers, it ended up being several months to almost a year before things were settled
-
@polotek but yeah, to clarify, you'll never *own* the URL to your media from the perspective of any other server but yours, and it's one thing that's a bit contentious about the design of mastodon.
Unless someone clicks on "view this post in the web" or otherwise goes to *your* server to see the post, they see the media content from their own server's URL
I'll put an example here since most people don't know how this works. See next post
-
@polotek For example: https://social.polotek.net/@polotek/113094222116728333 is the link to one of your posts, but my mastodon client is logged into my hachyderm account.
So viewing it from hachyderm's perspective means that the media image has the URL: https://media.hachyderm.io/cache/media_attachments/files/113/094/222/472/200/225/original/fe7e306f0983bf31.png
And that'll stay there however long our server stores it. And the CDN caches it however long *that* ttl is. Which means that if you don't carefully handle error responses, it's very easy to cache a broken image "forever"
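One way to picture "carefully handle error responses" is to pick the `Cache-Control` header based on the status code, so a missing or broken file is never cached for as long as a good one. This is a minimal sketch under assumed defaults. The function name and TTL values are hypothetical, and it is not how Mastodon or any particular CDN is configured.

```python
def cache_control_for(status: int, success_ttl: int = 86400, error_ttl: int = 0) -> str:
    """Return a Cache-Control value for a media response.

    Only 2xx responses get the long TTL. Everything else is either cached
    briefly (when error_ttl > 0) or not stored at all.
    """
    if 200 <= status < 300:
        return f"public, max-age={success_ttl}"
    if error_ttl > 0:
        # Short-lived caching protects the origin without pinning the error.
        return f"public, max-age={error_ttl}"
    return "no-store"


# Usage: a 404 for a not-yet-migrated file should not stick around.
header_ok = cache_control_for(200)
header_missing = cache_control_for(404)
```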
-
@polotek Of course that also means that rewriting URLs is a bit of a nightmare and will result in the original URL being hit for months after as every server in the fediverse rediscovers the image
Because of how social media works, the half-life of the image is like 4 hours, but the long tail can be multiple years. It's kind of a nightmare for CDNs, which explains why a ton of storage and caching innovation came from search engine and social media companies
(I'm sure you knew all that though)
-
@hazelweakly I appreciate this detail. All of this makes sense and I think it matches my understanding. I'm only trying to fix a small part of this puzzle. Which is that I at least want to be responsible for the original url from my post. In this case, it says cdn.masto.host. And I want it to be cdn.polotek.net. I can't control what other instances do. But I can at least be responsible for my source image staying consistent and available even if I migrate my instance.
https://cdn.masto.host/socialpoloteknet/media_attachments/files/113/094/221/878/618/552/original/9199449b86d4a6f2.png
-
@hazelweakly you can also tell me that I'm overthinking it and that step isn't very meaningful. I'd be open to that feedback.
-
@polotek It seems like a poor architectural decision that the CDN URLs are embedded into posts in the first place. If the URLs in posts were `social.polotek.net/media/…` that should be very easy to redirect to a CDN (just redirect that whole media namespace) and then it could be changed out as necessary.
That would avoid the masto.host issue of complicating their initial setup with a subdomain, but make it easier to migrate later.
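As a rough illustration of "redirect that whole media namespace", the mapping could look like the Python sketch below. The `/media/` prefix comes from the example URL in this post. The function name and the CDN base URL are assumptions, and in practice this would more likely live in a web server or proxy rule than in application code.

```python
from typing import Optional
from urllib.parse import urlsplit

MEDIA_PREFIX = "/media/"  # namespace taken from the example URL above


def media_redirect_target(url: str, cdn_base: str) -> Optional[str]:
    """Map a post's media URL onto whichever CDN is currently in use.

    Returns None for URLs outside the media namespace. Query strings are
    dropped in this sketch.
    """
    path = urlsplit(url).path
    if not path.startswith(MEDIA_PREFIX):
        return None
    rest = path[len(MEDIA_PREFIX):]
    return cdn_base.rstrip("/") + "/" + rest


# Usage: changing hosts only means changing cdn_base; post URLs stay put.
target = media_redirect_target(
    "https://social.polotek.net/media/a/b.png",
    "https://cdn.example.net/files",  # hypothetical CDN location
)
```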
-
@polotek I think that step makes sense. I would actually expect that to be the case if you already got social.polotek.net for your hosting URL (or at least a configuration possibility).
Changing the CDN url would be meaningful enough, especially as it would be essentially *the* blocker to migrating to another host
That said, old content should remain under the old cdn url even though new content would use the new one. Patching posts would break more than it would fix unfortunately
-
@jimw @polotek something like that is actually the default, it's just that the default implies heavily that you are hosting the files locally on a physical server (it's a ruby on rails monolith + Postgres + sidekiq setup)
To be able to have a cdn at all that's on a separate server means you need a different domain (or you have to host a weird proxy thing).
The "sensible" setup would be my.domain and cdn.my.domain by default and ask people to configure the other DNS records on setup too
-
Jenniferplusplus replied to Hazel Weakly:
I would expect federating out a bunch of updates for all the image objects would address many of the things that break.
Of course, that's a daunting prospect for other reasons.
-
@hazelweakly @jimw this is the conversation I had with masto.host. This is what I would expect when I choose to set up a custom domain. They seem to feel like the added setup burden would be unreasonable. I do not agree.