What’s Twitterbot/0.1?
Some webmasters have recently seen HEAD requests with the User-Agent Twitterbot/0.1 hit their sites, and as usual this has caused a flurry of questions (I’ve never really cared too much about bots unless they hit hard, but some webmasters take it really seriously). I did some digging, and the IP that Twitterbot is using (128.242.241.134) is in the same data center where Twitter is hosted. That’s unusual: most bots that follow URLs in tweets come from Amazon EC2 or similar services. I did a little more digging, and apparently links in DMs are also being followed, which means it must be an official Twitter-run bot. Now the question is: what is Twitter doing with the data? Is it for analytics? Anti-spam?
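If you want to check whether the bot has visited your own site, a rough sketch like the one below would do it. It assumes your server writes the common "combined" access log format; the function name `twitterbot_head_requests`, the regex, and the sample log lines (including their timestamps and the second IP) are made up for illustration, so adjust them to whatever your logs actually look like.

```python
import re

# Assumed layout (combined log format):
# host ident user [time] "request" status bytes "referer" "user-agent"
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>[A-Z]+) (?P<path>\S+)[^"]*" '
    r'(?P<status>\d{3}) \S+ "[^"]*" "(?P<agent>[^"]*)"'
)


def twitterbot_head_requests(lines, agent_prefix="Twitterbot/"):
    """Return (ip, path) pairs for HEAD requests whose User-Agent starts with agent_prefix."""
    hits = []
    for line in lines:
        match = LOG_LINE.match(line)
        if not match:
            continue  # skip lines that are not in the assumed format
        if match.group("method") == "HEAD" and match.group("agent").startswith(agent_prefix):
            hits.append((match.group("ip"), match.group("path")))
    return hits


# Illustrative log lines, not real traffic.
sample = [
    '128.242.241.134 - - [09/Jun/2010:10:00:00 +0000] "HEAD /post/1 HTTP/1.1" 200 0 "-" "Twitterbot/0.1"',
    '203.0.113.7 - - [09/Jun/2010:10:00:05 +0000] "GET /post/1 HTTP/1.1" 200 5120 "-" "Mozilla/5.0"',
]

print(twitterbot_head_requests(sample))
```

To run it against a real log, pass an open file instead of `sample`, for example `twitterbot_head_requests(open("access.log"))`, where `access.log` stands in for your own log path. Keep in mind that a User-Agent string is trivial to spoof, so comparing the source IPs against the address above is a sensible extra check.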
Update: Looks like it may be in preparation for launching a URL shortener. I wonder if they will play hardball and rewrite existing links to use their own shortener? That would help future-proof tweets for things like the Library of Congress, but it would instantly screw Bit.ly and similar services.
Update 6/9/10: Bing!
Isn’t it the other way around? Some Twitter clients will show you the long URL when you hover over the short URL; my guess is that Twitter somehow also retrieves the long URL just by navigating to the site.
What it is doing is wasting bandwidth and increasing the load on your server. Nonsense.