David Weinberger's Intranet Buzz:
Mapping the Web
Y'all know the short subject "Powers of Ten"? If
not, you must have been sleeping during the science
class when the AV Squad rolled in the projector.
It's a great film, depicting the universe seen at
10x increments, starting with you standing in your
backyard in New Jersey. It goes all the way from
the Solar System to the Galaxy to Larry Ellison's
Ego to Super Gigantic Meta-Galaxial Clusters with
Caramel and Nougat ... and then goes into reverse
through molecules, atoms, sub-atomic particles, and
footnotes to a senior thesis in logical positivism.
Looking at maps of the Web has something of the
same effect, except the scale runs according to the
level of abstraction, not distance.
Mapping the Web is a huge field that falls into two
main pieces: maps that show us something
interesting about the Web, and maps that help us
navigate the Web. These two need not be essentially
connected.
We begin with Bill Cheswick's work:
www.cs.bell-labs.com/who/ches/map/index.html. He's
a security guru at Bell Labs and a wacky guy. He
and a colleague, Hal Burch, put together software
that draws a beautiful map of the routers that are
(as he says) the tin cans and string of the Web. It
puts these 100,000 nodes together using colors to
indicate congestion and arcs instead of boring old
straight lines. The result looks like a fireworks
display. While this map has a geographic correlate
-- you could position the routers relative to a
geographic map -- Ches declines to follow it.
Rather, Ches is fond of pointing to a dense cluster
of nodes, saying, "That's UUNET" or whatever. (PS:
Try out Ches's non-optical illusion page on the
McCollough Effect:
www.cs.bell-labs.com/who/ches/me.)
John Quarterman
(http://order.mids.org/~jsq/index.html) has been
mapping the Internet in various ways for over ten
years. For example, he plots the growth of Net
hosts in the U.S. on a geographic map of the
U.S. (www.mids.org/mmq/604/pub/us.i.gr.c.html) and
the presence of hosts on a world map
(www.mids.org/mapsale/world/). Quarterman's group
also produces the Internet Weather Report
(www.mids.org/weather/index.html), a set of maps
that display the Internet's latency, or response delay.
Every four hours, the IWR "pings" several thousand
servers worldwide and measures how long it takes
each packet to make the round trip.
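For the curious, here is a minimal sketch of that kind of round-trip timing. It is not the IWR's software: a real ping uses ICMP, which usually requires special privileges, so this sketch assumes that timing a TCP connection is a good enough stand-in. The names round_trip_seconds, tcp_probe, and survey are invented for illustration.

```python
import socket
import time


def round_trip_seconds(probe):
    """Time one call to probe(), a callable that contacts a server."""
    start = time.perf_counter()
    probe()
    return time.perf_counter() - start


def tcp_probe(host, port=80, timeout=2.0):
    """Build a probe that opens and closes a TCP connection.

    Assumption: a TCP connect stands in for an ICMP ping here.
    """
    def probe():
        with socket.create_connection((host, port), timeout=timeout):
            pass
    return probe


def survey(probes):
    """Run every probe once.

    probes maps a label to a callable. The result maps each label to its
    round-trip time in seconds, or to None if the probe failed.
    """
    results = {}
    for name, probe in probes.items():
        try:
            results[name] = round_trip_seconds(probe)
        except OSError:  # unreachable host, timeout, failed DNS lookup
            results[name] = None
    return results
```

To try it against a live host, pass something like survey({"example": tcp_probe("example.com")}) and run it on whatever schedule you like; the host name is only a placeholder.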
If you want to see the geography of a hyperlink,
you can use a utility such as neotrace
(www.neotrace.com/). Give it a URL and it will
plot on a world map exactly how your request to see
that page has been routed. While it's intended to
help you diagnose problems reaching sites, it also
gives you a warm feeling about just how global --
physically and literally -- the Web is, and also,
on occasion, just how bone-headed stupid it is.
(You can see a dynamic map based on traceroutes at
http://home.online.no/~ggunners/NetBird.html.)
But the Web isn't merely hardware. Upside recently
ran an article by Robert Buderi on Ravi Kumar's
charting of 200 million pages and 1.5 billion links
to and from them. (He's backed by Altavista, Compaq
and IBM.) His initial finding is that there are
four main regions on the Web, each with about 50M
pages. Sites in the Strongly Connected Core are no
more than 7 clicks from one another. Sites in the
In region point to the Core but the Core doesn't
have the good graces to point back. Similarly, the
Out region is pointed to by the Core but doesn't
point back. Finally, there are the Tendrils which
run off of the In and Out regions but can't be
reached by the Core, presumably because they're
behind firewalls. According to the article, "Kumar
says, despite its ad hoc creation and constant
evolution, it seems the Web is actually highly
organized." This is a weird type of map, sort of
like clustering streets not by how they're
connected but by whether they're one-way or two-way.
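To make those regions concrete, here is a small sketch that sorts the pages of a toy link graph into Core, In, Out, and everything else. It is not Kumar's method, which the article doesn't describe; it is just one straightforward reading of the definitions above, and the function names and the sample graph are invented for illustration.

```python
from collections import deque


def reachable(start_nodes, edges):
    """Return every node reachable from start_nodes by following edges."""
    seen = set(start_nodes)
    queue = deque(seen)
    while queue:
        node = queue.popleft()
        for nxt in edges.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def bow_tie(links):
    """Split pages into regions. links maps a page to the pages it links to."""
    nodes = set(links)
    for targets in links.values():
        nodes.update(targets)

    # Reverse graph: who links *to* each page.
    reverse = {n: set() for n in nodes}
    for src, targets in links.items():
        for dst in targets:
            reverse[dst].add(src)

    # Core: the largest set of pages that can all reach one another.
    core = set()
    unassigned = set(nodes)
    while unassigned:
        seed = next(iter(unassigned))
        component = reachable([seed], links) & reachable([seed], reverse)
        unassigned -= component
        if len(component) > len(core):
            core = component

    upstream = reachable(core, reverse)   # pages that can reach the Core
    downstream = reachable(core, links)   # pages the Core can reach
    return {
        "core": core,
        "in": upstream - core,
        "out": downstream - core,
        "other": nodes - upstream - downstream,  # tendrils and strays
    }


# Hypothetical six-page Web.
toy_web = {
    "a": ["b"], "b": ["c"], "c": ["a", "x"],  # a, b, c link in a loop
    "i": ["a", "t"],                          # i points into the loop
    "x": [], "t": [],
}
regions = bow_tie(toy_web)
for name in ("core", "in", "out", "other"):
    print(name, sorted(regions[name]))
```

The search for the Core here is the slow, obvious one; at 200 million pages you would want a linear-time strongly-connected-components algorithm instead.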
Valdis Krebs (www.orgnet.com) is an expert in
organizational network mapping. Rather than doing
simple org charts, he aims at creating dynamic maps
of the various types of structures of an
organization, including the flow of information. He
has created a map of the Internet industry, showing
the relationship of the various Net players:
www.orgnet.com/netindustry.html. This map obviously
is hugely dynamic, and would need to be updated
with every press release.
So far, none of these maps is intended to help you
find your way on the Web. If you want help
navigating, you need to decide among the Web's
three basic information structures: random
hyperlinks, clustered hyperlinks, and hyperlinks
organized into browsable hierarchies, a la Yahoo.
Each can be presented as a map.
You can see an example of Yahoo's attempt to render
its hierarchical directory as a geographical map at
www.cybergeography.org/atlas/yahoo3d_large.jpg.
Yahoo no longer offers that view, possibly because a
geographical view of a hierarchical directory
defeats the purpose of such a directory, which is
to show a whole lot of information in a very small
space.
There's a mixed mode of clustered hyperlinking
that's extremely common: every Web site can be
presented as a set of pages all linked (eventually)
to a home page. Frequently, maps of this structure
show the home page in the center with the other
pages linked to it and to one another. Many Web
utilities show this to you in your role as
webmaster.
There have been lots of attempts to make random
hyperlinks viewable and navigable. This isn't just a
Web problem, however, so work has been done in lots
of fields, from semantic mapping to document
management. The Brain, for example -- a utility
that some people swear by but which holds no
appeal for me -- helps organize scraps of information;
Mappa.Mundi (http://mappa.mundi.net/map/), a site
about mapping, has an example of The Brain linked on
its home page that maps the ideas on the site.
ThinkMap (www.thinkmap.com) doesn't confine itself
to the Web but it does provide a Web utility which
you can see at www.bacardi.com. Be sure to click on
the ThinkMap button on the bottom left. You will
then see a dynamically updated set of maps,
presented as circles and lines and mysterious
symbols, that are supposed to help you visualize
where you are. For me, they only simulate one too
many pina coladas. Inxight (www.inxight.com/), a
spin-off of Xerox "Kiss of Death" PARC, for years
has been trying to convince people that they want to
think about information as a spinning ice cream cone
(oops, I mean "hyperbolic tree") and other oddities.
The demos are very cool
(www.inxight.com/products_wb/tree_studio/tree_studio_demos.html),
but are they
useful? Not to me, but, then, the right half of my
brain was removed after a series of unfortunate
homework assignments in my freshman Drawing the Nude
course in college.
One of the hottest areas is clustering sites based
upon an analysis of their content. The wildly
overhyped Autonomy engine does this
(www.autonomy.com), as do others, some commercially
available (e.g., www.fulcrum.com) and some in
research labs around the world. Putting related
sites together visually even if they are not linked
is a way of manufacturing serendipity. The hard
part is figuring out what's related, but you
shouldn't underestimate the difficulty of coming up
with a visual metaphor that works. There are
already 2-D maps, browsable hierarchies, and some
mappings into 3-D space. (Won't someone please
adapt the Quake III engine for this purpose?
Thanks.) [Note: I'm on the board of a company still
in stealth mode that's involved in one of these
areas.]
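As a rough illustration of "figuring out what's related," here is a toy sketch that scores pages by the words they share. It is emphatically not how Autonomy or any other commercial engine works; it assumes the simplest possible stand-in, cosine similarity over raw word counts, and every name and sample page in it is invented.

```python
import math
import re
from collections import Counter


def word_counts(text):
    """Count the lowercase words in a page's text."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine_similarity(a, b):
    """Similarity of two word-count vectors: 0.0 (unrelated) to 1.0."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def web_distance(text_a, text_b):
    """Treat irrelevance as distance: identical texts are 0 apart."""
    return 1.0 - cosine_similarity(word_counts(text_a), word_counts(text_b))


def most_related(pages, name):
    """Return the name of the page closest to pages[name]."""
    others = [other for other in pages if other != name]
    return min(others, key=lambda o: web_distance(pages[name], pages[o]))


# Hypothetical pages, none of which link to one another.
pages = {
    "routers": "maps of internet routers and network latency",
    "hosts": "internet hosts and network maps by country",
    "rum": "cocktail recipes with rum and lime",
}
print(most_related(pages, "routers"))
```

Raw word counts let filler words like "and" count as common ground, which is one small reason the real problem is hard.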
There will be many solutions to this problem, and
which ones we like will have everything to do with
our personal way of thinking and the type of problem
we're trying to solve at the moment. But navigable
maps of Web sites clustered by relevancy to our
interests are uniquely important to the Web because
the Web space is itself organized not by uniform
units of distance but by *interest* itself.
Distance, on the Web, is measured by irrelevance.
Navigable maps capture this essential fact of our
new world, and thus not only map Web distance but
conquer it.
---
Here are two sources which were helpful to me:
http://mappa.mundi.net/maps/
http://www.cybergeography.org/
The Author
David Weinberger writes JOHO and is one of the Ringleaders of cluetrain.com,
a manifesto of web ethics. He also provides strategic marketing
consulting to high-tech companies, writes for several magazines
(including Wired), and is a commentator on NPR's "All Things Considered."
He was, as VP of Strategic Marketing, one of the shapers of Open
Text's intranet strategy. David sits on several conference boards
and is a member of AIIM's Emerging Technology Advisory Group. Reach
him at self@evident.com.