iPhone Rollout Redux?


Well, July 11th, iPhone Day, came and went.  The Believers waited and most got their phones, but even I could not have predicted the farcical mess that then ensued.  Apple was unable to handle the registration of some 1 million phones over the course of a weekend, and their provisioning infrastructure ground to a halt.  This is the added kick in the pants Believers must really enjoy.

While we wait for news to leak out of Apple as to what actually happened, let me speculate just a bit.  Let us assume the following statements are true:

  1. Apple did in fact test their provisioning capability prior to rollout.
  2. Of the three days over which the million phones were sold, most were sold and activated in the first twenty-four hours.  In particular, let’s assume a 70%/20%/10% distribution.  I don’t actually know the real one, but we have reason to believe that the load was top-heavy on Friday, as problems dissipated later in the weekend.
  3. There was an average of two transactions per registration.  That is, one to provision the phone with services, and one to create MobileMe or whatever additional functionality Apple offers.  Normally we’d include a third for creation of an iTunes account, but since we’re talking about Believers, they already have their account.

700,000 sales times 2 transactions over 24 hours would be about 16 transactions per second.  That’s really not that many transactions, considering that benchmarking systems measure that number in the hundreds and thousands.  This makes one wonder: what if we introduced latency into a transaction?  Latency can occur for many reasons, but the biggest one would be some sort of wide area communication.  For instance, if transactions are handled one at a time over a single link, an 80 millisecond round trip time would mean that one could not process any more than about 12.5 transactions per second, which is already below the 16 we need.  Now add a second round trip and you cut the transaction rate in half, to about 6.25 per second.
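For anyone who wants to check the arithmetic, here is a rough back-of-the-envelope sketch in Python.  The function names are mine, and it bakes in the same assumptions as the guess above: load spread evenly over the day, and transactions processed strictly one after another over a single link.  Neither is known to be true of Apple's actual system.

```python
# Back-of-the-envelope model only; the inputs are the guesses from the post,
# not real figures from Apple.

SECONDS_PER_DAY = 24 * 60 * 60


def required_tps(sales, transactions_per_sale, window_seconds=SECONDS_PER_DAY):
    """Average transactions per second needed to clear the load in the window."""
    return sales * transactions_per_sale / window_seconds


def serial_ceiling_tps(rtt_ms, round_trips=1):
    """Upper bound on throughput when each transaction waits on `round_trips`
    network round trips and nothing is processed in parallel (an assumption)."""
    return 1000.0 / (rtt_ms * round_trips)


if __name__ == "__main__":
    demand = required_tps(700_000, 2)
    one_trip = serial_ceiling_tps(80)
    two_trips = serial_ceiling_tps(80, round_trips=2)
    print(f"needed on day one:        {demand:.1f} tps")
    print(f"ceiling, 1 round trip:    {one_trip:.2f} tps")
    print(f"ceiling, 2 round trips:   {two_trips:.2f} tps")
```

With the numbers above, this works out to roughly 16.2 needed against a ceiling of 12.5, or 6.25 with a second round trip.  A real system would run many transactions concurrently, so treat this as an illustration of how latency eats into a serialized path, not as a capacity estimate.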

As to Apple’s testing, if they tested their provisioning system either on a local area network or on a network with latency low enough to keep up with the day’s transactions, they wouldn’t have caught the problem.  This is actually a classic concern that most database vendors fully understand, and it is often the reason to use stored procedures: they keep the multi-step work on the database server, so each transaction costs fewer round trips over the wire.

Anyway, that’s my guess.