The Internet Archive
The Internet Archive is way cool. Full stop. I remember discovering their Wayback Machine early in the century. The Archive is a not-for-profit operation that archives and preserves things — kind of all the things — that appear (or should appear) on the internet, particularly archiving web pages going back to 1995. The Wayback Machine lets you see what pages looked like in the past. It’s also invaluable for referencing web pages in a way that is resistant to link rot. As of this month, there are over one trillion web pages archived in the Wayback Machine. (One trillion seconds is over 31,000 years.)
The Internet Archive is way cool. Support them if you can.
archiving twoprops.net
I like that websites get archived. I like that my websites get archived. For my own sites, I relied on the Internet Archive’s regular crawling schedule. When I moved to the island and my servers were down for a while, I was a bit frustrated to find that a page I wanted to refer someone to had never been archived. Of course, keeping up with all new and changed websites is an impossible task.
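If you want to check whether a page has ever been captured, the Wayback Machine has a small availability lookup that returns JSON. Here is a minimal sketch of how I understand it to work. It only builds the query URL and reads a response; it makes no network request itself. The function names and both sample responses (including the timestamp) are made up for illustration, so check the Archive's own documentation before relying on the field names.

```python
import json
from urllib.parse import urlencode

# Availability endpoint as I understand it; verify against the Archive's docs.
AVAILABILITY_ENDPOINT = "https://archive.org/wayback/available"


def availability_query(page_url):
    """Build the lookup URL for a page (hypothetical helper name)."""
    return AVAILABILITY_ENDPOINT + "?" + urlencode({"url": page_url})


def closest_snapshot(response_text):
    """Return the archived URL from a lookup response, or None if there is none."""
    data = json.loads(response_text)
    closest = data.get("archived_snapshots", {}).get("closest")
    if closest and closest.get("available"):
        return closest.get("url")
    return None


# Illustrative responses only: the timestamp below is invented.
sample_archived = json.dumps({
    "archived_snapshots": {
        "closest": {
            "available": True,
            "url": "http://web.archive.org/web/20250101000000/https://twoprops.net/",
            "timestamp": "20250101000000",
            "status": "200",
        }
    }
})
sample_never_archived = json.dumps({"archived_snapshots": {}})

print(availability_query("twoprops.net"))
print(closest_snapshot(sample_archived))
print(closest_snapshot(sample_never_archived))
```

A result of None is the situation I ran into: the page existed, but nobody, human or crawler, had ever saved it.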
save page now
I started regularly entering twoprops.net into the Wayback Machine’s “save page now” feature which, supposedly, will follow all the links on the page and archive those pages as well. However, it fails to archive a lot of pages from twoprops.net (either a bug or user error) so I started using the Wayback Machine extension for Firefox, and just activating it for pages shortly after I created them.
You don’t have to be the owner, creator, or host of a page to save it. Anyone can request that (almost) any page be saved at any time. Nice if you’re, say, writing an article and want to put in a link to a web site and know that future changes to the site won’t mess up your article.
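For the curious, both halves of that trick come down to URL patterns. The sketch below assumes the two patterns as I understand them: a save request is the page address appended to web.archive.org/save/, and a link to an archived copy is web.archive.org/web/, a timestamp, and the page address. The helper names are my own, nothing here touches the network, and the "2025" timestamp is just an example.

```python
# URL patterns as I understand them; treat both as assumptions to verify.
SAVE_PREFIX = "https://web.archive.org/save/"
WAYBACK_PREFIX = "https://web.archive.org/web/"


def save_page_now_url(page_url):
    """Return the address that asks Save Page Now to capture page_url."""
    return SAVE_PREFIX + page_url


def wayback_link(page_url, timestamp):
    """Return a link to the capture nearest timestamp (digits of YYYYMMDDhhmmss).

    A partial timestamp such as "2025" is accepted here; the Wayback Machine
    is expected to redirect to the closest capture it has.
    """
    if not (timestamp.isdigit() and 4 <= len(timestamp) <= 14):
        raise ValueError("timestamp must be 4 to 14 digits, e.g. 20250101")
    return WAYBACK_PREFIX + timestamp + "/" + page_url


page = "https://twoprops.net/ads-out-of-control"
print(save_page_now_url(page))
print(wayback_link(page, "2025"))
```

Linking to the second form in an article is what makes the citation resistant to link rot: the link points at the Archive's copy, not at whatever the site says next year.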
SPN blocklist
I discovered that Save Page Now has a blocklist. Site owners can request that their sites or pages not be archived. That makes sense, if you’re trying to control access to, say, copyrighted content. (Though, as I’ve said in the past, letting search engines crawl your content but then paywalling it to regular users is sleazy.)
I discovered the SPN blocklist when I went to save a page on twoprops.net and got an error message:
This URL is in the Save Page Now service block list and cannot be captured. Please email us at “info@archive.org” if you would like to discuss this more.
Obviously, I did not ask The Internet Archive to block archiving of my page. I was wondering if, perhaps, I had pissed off somebody powerful. After all, the blocked page did end with:
Google. We’re evil now. Deal with it.
I emailed, as suggested, and got a rapid reply:
Hi [Twoprops],
Thank you for contacting us.
Occasionally a URL can end up on the SPN blocklist accidentally due to automation (nothing to do with content). Luckily, when this happens, it’s just a matter of us manually unblocking the URL in question.
I’ve just let our engineers know to unblock https://twoprops.net/ads-out-of-control, and will let you know when that’s done!
Please feel free to reach out if you have any other questions!
So, perhaps my paranoia was misplaced, and it was just some automated filter gone awry.
But for just a moment, I was concerned and a bit excited to think that twoprops.net might have actually generated that much attention. It will be interesting to see if this page causes the same issues. As of this writing, I’m still blocklisted.
reputation management
There are organizations out there known as reputation management services. They’re one small step above search engine optimization services on the scum-and-villainy scale. One of the many sleazebag techniques they use is, if they find a website that says something bad about one of their clients, they’ll copy the webpage (copyright be damned) to their own obscure corner of the net, then claim they posted it first, and ask that the original be taken down as a copyright violation.
My paranoia was not without cause. Around 2007 I published a story about how a major insurance company screwed me over for a trivial amount of money. This big insurer, who would go on to be pivotal in fomenting the financial crisis of 2008, hired a reputation management service to try to break into my system and delete the story. In other words, the company solicited the commission of multiple felonies to try to erase one obscure mention of their perfidy around a $19 charge.
So there really are organizations that feel the need to go to great, often unlawful, lengths to protect themselves against people otherwise exercising their free speech rights. Another reason why we can’t have a nice internet.
—2p