Saturday, March 12, 2005

A note while I wait for Air 10 to download

Here's something I've learned in the last couple of months that may be useful to other people.

Say you discover experimentally that you can blog an episode of anime every day and a half or so. Just to keep things simple, in fact, let's say you can do 4 a week. It would seem like you should pick 4 shows to blog, then, right? Do that, and your queue will stay at its initial length, which is 0, right?

Wrong.

In fact, if you have new episodes coming in at exactly the same rate you can blog them, your backlog will grow uncontrollably.

We can demonstrate this mathematically. The question we're interested in is how large your backlog will be after a very long time. We can handle this probabilistically - how likely you are to be totally caught up, how likely you are to have 1 episode waiting, and so on. To keep this reasonably concise, I'm going to start using notation - Pn will be "the probability of having, at any given time, n episodes waiting."

Obviously all those numbers are related, and we can work out how by remembering that we're assuming you've been doing this for a very long time. So, we can ask ourselves how you were doing a few minutes ago, and how you've done since.

How you were doing must, of course, have had the same probability distribution. And since then, one of three things must have happened: You blogged an episode, a new episode was released, or everything stayed the same. Here's the trick: If we pick a short enough time, the chances of more than one thing changing is astronomical - so we can assume that only one thing happened. And we can figure out the chances of each change - for example, with the numbers I gave above, we know you blog 4 episodes a week, so the chance of you blogging an episode in a given day is 4/7, the chance of you blogging in a given minute is 4/10080 (or 1/2520), and so on.

So, the chance of you having n episodes waiting right now is equal to the chance of you having n-1 episodes waiting a little bit ago and one being released (time times rate of release, or tR), plus the chance of you having n+1 waiting and blogging one (time times rate of bloggine, tB), plus the chance of you having n waiting and nothing happening. Or, in notation:

Pn = (1-t*B-t*R)Pn + t*B*Pn+1 + t*R*Pn-1

A little algebra gives us:
Pn = Pn - tBPn - tRPn + tBPn+1 + tRPn-1
tBPn + tRPn = tBPn+1 + tRPn-1

And we can divide out the time! So it turns out it didn't matter what time interval we chose - we can just figure it was a few microseconds and leave it at that.
(B+R)Pn + Pn = BPn+1 + RPn-1

This relationship has a simplified solution, it turns out: Pn = (R/B)Pn-1.

But wait! In the special case that R = B - that is, the rate of releases equals the rate of blogging - R/B = 1, which means that the probability of every possible backlog size - P3, P7, P1,000,000 - is equal. Since those Ps go on forever - or at least until you get sick of it and quit - the chance of your backlog staying a reasonable size is so small it basically makes no difference.

So what this all means is that you may not be able to handle the blogging load you think you can. Actually, another analysis like the one above reveals what you can expect your backlog to average - (R/B)/(1 - R/B).

(This was all worked out in the early days of computer networking, when people were trying to figure out how much server capacity they were going to need.)

And Air 10 just finished downloading, so normal anime blogging will resume shortly, along with a special treat in (hopefully) a few days.

2 Comments:

Anonymous Satoshi said...

Yeah, I remember this from my queuing theory/performance class. Server (or blog) load is only bad when you're close to 1 (i.e. R = B). If you're close to 1 and decrease load a bit, your backlog decreases rapidly (till you hit something more reasonable like 0.5).

Of course, you have a couple other options:
(a) blog faster.
(b) Hire trained monkeys to blog for you
(c) Time travel. Time travel is always an option :)

For the mere mortals that cannot devise a time machine, nor afford the maintenance nightmares that monkeys bring, we have to either get more efficient at blogging, spend more time at it, or cut blogging activities down to a more reasonable level.

Or, just build a backlog. The bonus here is that you can complain incessantly about it (c.f. me).

8:15 PM  
Blogger Charlie said...

"I remember this from my queuing theory/performance class."

Yes! And now that I've explained/derived it, I understand it reasonably well for the upcoming exam! (Who said blogging was a waste of time, eh?)

12:29 PM  

Post a Comment

<< Home