Thought Toys · Strategy & computation · Exhibit 24
Imagine someone clicking links at random, forever, with no idea what any page says. Where do they end up spending their time? That long-run fraction is a page's rank — the idea that launched Google. Release the surfer and watch a tangle of links sort itself into an order. Then turn one dial and watch dead-end pages try to hoard it all.
Where does random clicking lead?
Release the surfer, or solve it instantly. Each page's size = its share of the surfer's time.
Eight pages, joined by arrows — an arrow from one page to another means it links there. The middle page, C, is the hub: nearly everything in the cluster links to it. Off to the right, a thin tail leaks into two pages, G and H, that only link to each other — a dead end.
Now drop a surfer onto a random page. Each tick they either follow a random link from where they stand, or — with a small chance — get bored and teleport to a page picked at random anywhere. Run it long enough and the fraction of time spent on each page stops wobbling and settles. That settled fraction is PageRank. Watch the circles: they grow toward exactly the share of visits each page earns, and the well-linked hub C floats to the top — not because we counted its links, but because important pages keep sending the surfer its way.
The dial is the whole story. Damping d is the chance of following a link instead of teleporting. Slide it to 1 — never teleport — and watch the surfer get sucked into G and H and never come back: the dead-end pair bounces the surfer between them forever and swallows all the rank, while the real web starves to zero. That's why a pure link-follower is broken. Pull d back down and the occasional teleport yanks the surfer out of the trap, and a sensible ranking returns. Google's famous setting is 0.85: follow links most of the time, but jump free often enough to escape the dead ends. (Notice even at 0.85 the reciprocal pair rides high — a hint of why link-farms work, and why search engines fight them.)
improve/verify/24-pagerank.js): M is column-stochastic; power iteration converges to the
same vector from any start (a unique stationary distribution, sums to 1); a 2-million-step random walk matches
that vector to within 0.01 — so rank really is "time spent"; and at d = 1 the pair {G, H}
holds 100% of the rank, which 0.85 bleeds back down to 36%.
Counter-example: forget to split each page's vote by its number of links (use raw link counts)
and the total "rank" stops adding to 1 — it balloons without bound, so the numbers mean nothing. Dividing by
out-degree is what makes it a probability.
← the cabinet · Thought Toys — a cabinet of explorable explanations. Exhibit 24.