I have several thousand links to pages on the Wayback Machine on my blog.
-
Is it possible to download a specific page from archive.org in such a way that it is viewable offline? As in, a self-contained HTML file with all CSS, JS, and Images inlined?
How would one go about doing that?
I'd like to decentralise my outbound links to dead sources.
(I am very happy for you to mansplain how to do this. Or to wildly speculate.)
-
@Edent https://github.com/hartator/wayback-machine-downloader
I have used this once but not sure if it still works.
-
@Edent I'm pretty sure they used to offer the ability to download the WARC files, but I'm struggling to find a working link with an example.
-
@ldodds yeah, that's where I got stuck too!
-
vfig (aka leviathan_bound) replied to Terence Eden
@Edent i bookmarked this cli tool a couple days ago—havent had a chance to try it yet: https://mastodon.gamedev.place/@termin[email protected]/113313023810994984
-
Terence Eden replied to vfig (aka leviathan_bound)
@vfig oh! That's nice. Ta
-
@Edent curious what you have in mind for this. Just looking at them yourself locally, or hosting them so your site is more resilient?
-
@danielittlewood probably sticking them on a subdirectory and linking to them?
Not sure yet.
-
Thanks for all the brilliant suggestions.
A nasty scrap of MySQL to get all the archive.org links out of my blog.
A bit of grep and sed to clean them up.
Using https://github.com/gildas-lormeau/single-file-cli
I am now downloading about 600 URLs which I can statically host somewhere.
Then it's just a case of replacing all the existing URLs in my blog posts!
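For anyone wanting to try the same thing, here is a rough Python sketch of the "pull out the links, then swap them for local copies" part. It is not the MySQL/grep/sed pipeline used above, just one way it might be done. The regex, the hashed filename scheme, and the `/archive/` subdirectory are all assumptions made up for illustration, and it assumes the links sit inside quoted `href` attributes (a URL at the end of a plain sentence would pick up the trailing full stop).

```python
import hashlib
import re

# Matches Wayback Machine snapshot links such as
# https://web.archive.org/web/20200101000000/http://example.com/page
# (assumed pattern; the URL ends at whitespace, quotes, angle brackets or ")").
WAYBACK_RE = re.compile(
    r"https?://(?:web\.)?archive\.org/web/\d+[a-z_]*/[^\s\"'<>)]+"
)


def extract_wayback_links(text):
    """Return the unique snapshot URLs found in text, in first-seen order."""
    return list(dict.fromkeys(WAYBACK_RE.findall(text)))


def local_name(url):
    """Derive a stable filename for a snapshot (hypothetical naming scheme)."""
    digest = hashlib.sha256(url.encode("utf-8")).hexdigest()
    return digest[:16] + ".html"


def rewrite_links(text, base="/archive/"):
    """Point every snapshot URL in text at its locally hosted copy.

    `base` is an assumed subdirectory on the blog, not a real path.
    """
    return WAYBACK_RE.sub(lambda m: base + local_name(m.group(0)), text)
```

Each URL returned by `extract_wayback_links` could then be handed to single-file-cli and saved under the name `local_name` gives it, so that `rewrite_links` points at files that actually exist.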
-
@Edent what's wrong with archive.org?
-
@nickcolley it was down most of this week. They suffered a DDoS and cyber attack.