22 Jan
Posted by Harper as Technology at 05:08 PM
Tags: 37signals, downtime, infrastructure, network, rackspace, rails, tumblr, web20
In the past week a couple of companies (tumblr and 37signals) publicly blamed Rackspace for the downtime of their sites (tumblr here and 37signals here, here). In both cases the downtime wasn’t widespread and was limited to the individual sites. In both cases someone from the respective site blogged about how it was Rackspace’s fault they were down. It got me thinking…
I don’t like how these companies are blaming their vendors when their sites are down. I understand how frustrating it is. I totally realize how you want a scapegoat. And I have personally had experiences where Rackspace has screwed up and I have said to Jake: “WTF. Rackspace broke all of our shit” (which was quickly followed up with “Yeah! Rackspace fixed all of our shit!” - seriously, I love those guys). But what gets me is when that scapegoating and blaming bleeds into the conversation a company has with its clients/consumers. The customers don’t need to hear which vendor broke everything; they just want to hear about solutions and uptime.
It seems to me that these companies are basically saying “this is the entity who is at fault. It isn’t us. In fact - we are so smart that we don’t need them. we are changing hosting.” But they should be saying “We were down. bummer. we are up now.” The snideness is quite annoying to me.
What do y’all think - should companies be blaming their vendors - or just take the blame on themselves? I think it’s hard. I know I have blamed people for faults that really lie with myself. It is certainly easier to write off a simple failure that happens to occur on the vendor’s property as not my fault. But in the end, didn’t I pick the vendor? Didn’t I make the decisions that put my company in a place where the vendor could fail?
What I would like to see is companies taking the blame and just rocking through the casual failures. Making sure their clients know that even though there was a failure, they are on it. That they have their hands in every single part of their infrastructure. One side effect of this blaming is that you ruin the black box effect your customers have with your software. You suddenly have many more complicated pieces of the puzzle, and your customers may be leery of trusting their data to such a hodgepodge of systems/vendors/etc. Of course I am hypothesizing here, but I do know that the first time I thought about how tumblr did their infrastructure was not when it went down, but when they blamed Rackspace. Before they mentioned that, I just wanted them to be back up.
I am interested to hear what others think of this trend. I am guessing that people will learn that blaming is annoying. I hope that I don’t ever do it. I will surely think about this next time I am faced with downtime and have a choice in what to tell my customers.
UPDATE: I talked to Gabriel last night about this issue, and he reminded me of the Joel on Software post that influenced me to write this post. That post outlines how I feel a software service company should treat and handle outages, specifically how following the five whys philosophy usually ferrets out the root of the issue.
11 Responses
chriskalani
January 22nd, 2008 at 5:45 pm
I agree. That is THEIR problem… not the users. All the users care about is the site they are trying to access. I saw the tumblr one today and thought it was a bit lame. But I didn’t say anything. They should only say something about their host if something insane happens, like all the servers explode or get eaten or something.
Matt Hellige
January 22nd, 2008 at 6:48 pm
I totally agree. I think it’s an in-group/out-group thing. From the user’s point of view, you and your hosting company are one entity. The right thing to do is to represent the whole group and take responsibility. If you want to sue them behind the scenes or get them to pay up according to some SLAs or whatever, go for it. But don’t try to put yourself on the customer’s side against “them.” It’s condescending and kind of shitty and nobody buys it.
Obviously this is what you mean by ‘black box’. So yeah. I agree. Take responsibility for yourself.
holcomb
January 22nd, 2008 at 7:55 pm
Blaming things on your vendor just makes you look like a middleman. Middlemen tend to get cut out (or should be at least).
Fitz
January 22nd, 2008 at 11:30 pm
Blaming is a waste of time. If you have a service, you have chosen to use certain providers for your service, be they datacenters, networks, or what have you. If a provider fails and you don’t have a backup, no one is to blame but you.
slicematt
January 23rd, 2008 at 4:40 pm
As a partner at Slicehost, we take the responsibility of keeping our customers’ servers running very seriously. Everything outside of customer control (power, cooling, server hardware, network) falls on our shoulders, not yours.
I’ve been on both sides of this equation - as a service provider trying to get customers back up and as a customer at the mercy of a service provider. There isn’t a worse feeling in the world than having to tell your customers you know what the problem is, but someone else has to fix it.
37signals accepted full responsibility and explained what happened. At some point we all have to trust someone else to do *their* job, sadly they appear to have been let down.
Jengates Blog » Blog Archive » links for 2008-01-23
January 23rd, 2008 at 6:25 pm
[...] Putting the blame on your vendors for the win. » Harper Reed: Tech, Phones, Yo-yoing and Death Meta… [...]
Harper
January 23rd, 2008 at 11:29 pm
Hi Matt. Thanks for the comment.
I agree - at some point we do have to trust someone to do their job - and I agree that it is a harrowing experience to have to tell your clients that something is broken. However, I don’t agree that you need to get into specifics of what is broken and who is going to fix it. Because - in the long run - it is the company who is choosing vendors, and if the vendor fails, I would go as far as saying that it is the company’s fault for choosing a faulty vendor. Or at least for not making sure that the vendor is doing it right/having backups and redundancy.
Whenever I have had large failures, I have tried to apologize in the following form:
Dear users. I am sorry for the downtime. We are working hard to fix it. I am sorry for any problems this has caused.
However - tumblr and 37signals seem to say stuff like:
Dear users. Somebody caused us some problems and they are doing a poor job of fixing it fast. I am sorry for any problems this has caused.
Those two statements are VERY different. Which would you rather hear as a consumer? I would rather hear the first one.
Gil
January 24th, 2008 at 11:39 am
The 37signals thing bummed me out. They delicately phrased their communication so that it sounded like they were taking the blame, but when you read it closely, they are actually blaming Rackspace for taking 2 hours to replace a failed piece of hardware. The fact is that hardware does fail, and you can’t reasonably expect them to replace it much faster than that.
The transformer thing was a totally separate issue, and it sounds like Rackspace hadn’t been keeping up with their backup systems to make sure they were working properly. Hopefully they learned their lesson, and I won’t discredit them unless it happens again.
I’ve had my own frustrations with Rackspace, but after the fact they’ve always gone out of their way to fix the problem. I still refer people to them on a regular basis, and will continue to do so in spite of the 37s/tumblr nonsense.
Harper
January 24th, 2008 at 11:56 am
Hi Gil!
I think that having frustrations with vendors is totally part of the game. It happens to everyone. It is just a bummer that people are so into passing the blame onto someone else - instead of taking responsibility.
I think that blaming is easy and a bit thoughtless - whereas taking responsibility is usually hard and possibly shows weakness (because it is YOUR fault instead of someone else’s).
But yeah, Rackspace has had a couple of rough spots with us too, but they are always quickly resolved and remedied within a day at the most, and usually they do not cause downtime. So regardless of tumblr and 37signals bitching about how bad it is, I will also continue to represent them in any way I can.
slicematt
January 25th, 2008 at 6:44 pm
Harper, in our experience, customers WANT (perhaps demand?) to know exactly what happened. Granted we serve a very technical community, but the explanation you outlined in your response would not fly. They would rather hear exactly what broke (insert technical jargon here), how we’ll prevent it going forward, no matter if it’s our fault or not. I think 37signals has a similar user base and perhaps reached the same conclusion.
Ryan Kanno
January 25th, 2008 at 8:08 pm
Harper - I blame you for writing this damn post. As much as I agree with you, I also agree with sliceMatt. As a consumer, I want to know what went wrong and why, as long as everyone is honest - I can make my own decision about who to blame.
As for 37signals, I think that if they’re blaming their service provider, as Fitz points out, they’re to blame for putting all their eggs in one basket - albeit a generally big, protective basket in Rackspace, but a single point of failure nonetheless. Even in DHH’s comments on TechCrunch he admits that Rackspace’s hardware agreement has long since been broken. Umm, I’m not a smart man, but if that’s the case, um, why are you still with them (and since you are, why aren’t you mirroring crap?) Without either a backup service, or a Google Gears type of app, they’re the ones to blame. Hopefully, the customer can see this. ;)
Nata2.org is © 1997-2008 Harper Reed. Theme stoled and inspired by the great BloggingPro theme by: Design Disease