DIAGNOSING WEB SITE PROBLEMS
When something goes wrong with a Web site, people often look to the Webmaster
as if he or she is the problem, or they simply point the finger at the ISP
or Web hosting provider. While many companies are all too willing to blame
problems on a "screwed up Web hosting service," they must also be aware of
other contributing factors that can quickly lead to the demise of a Web
site. Let's look at a few.
FAULTY CACHING
Defective caching can be a big problem. You usually hear about it
first from an irate person calling and complaining that the Web site isn't
up to date or it's not appearing on the Internet at all. Little do they
realize that it's often their ISP that's the root of the problem. Here's a
good test: Put a clock on your Web site, ask the caller what time it shows
when they look at your site, and then check the clock on the live Web site
yourself. If there's a discrepancy, it's most likely due to a caching
server not updating its stored pages. This is an easy way to tell if
there's a problem on the viewer's network.
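The clock test boils down to a single comparison, sketched here in Python. The function name, the two-minute tolerance, and the sample times are assumptions made for illustration, and both times are assumed to be in the same time zone.

```python
from datetime import datetime, timedelta


def looks_cached(page_clock: datetime, actual_time: datetime,
                 tolerance: timedelta = timedelta(minutes=2)) -> bool:
    """Return True when the clock shown on the page lags the real time
    by more than the tolerance (an assumed two minutes by default)."""
    return actual_time - page_clock > tolerance


# Hypothetical call: the viewer reads 11:15 off the page at noon.
actual = datetime(2000, 1, 1, 12, 0)
print(looks_cached(datetime(2000, 1, 1, 11, 15), actual))  # True
print(looks_cached(datetime(2000, 1, 1, 11, 59), actual))  # False
```

A lag of a minute or so is normal clock drift; a lag of 45 minutes suggests the viewer is being served a stored copy.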
Another problem that can surface is a situation where certain
brands of caching servers don't differentiate between dynamic (generated)
HTML and normal (static) HTML. This causes these particular caching servers
to repeatedly
download the page for their users and, in the process, overload the Web
page they're downloading. In other words, they show up but never go away.
Every Web server is limited to a finite number of connections.
That means if these types of machines eat up half of your connections,
you're now limited to half the users you can normally serve. This is not to
mention the stress this type of pounding puts on the Web server itself. It
could make your Web page download slowly and pound the underlying SQL
server into overload.
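The connection arithmetic can be made concrete with a small Python sketch. It assumes you can obtain a list with one client address per open connection; the function name, the addresses, and the limit of 100 connections are all made up for the example.

```python
from collections import Counter


def connection_report(max_connections, open_connections):
    """Summarize who is holding the server's connections.

    open_connections: one client address per open connection
    (an assumed input format, e.g. taken from a server status page).
    """
    per_client = Counter(open_connections)
    free = max(max_connections - sum(per_client.values()), 0)
    if per_client:
        worst_client, worst_count = per_client.most_common(1)[0]
    else:
        worst_client, worst_count = None, 0
    return {
        "free": free,
        "worst_client": worst_client,
        "worst_share": worst_count / max_connections,
    }


# Hypothetical snapshot: one caching server holds 50 of 100 connections.
snapshot = ["cache.example.net"] * 50 + ["10.0.0.%d" % i for i in range(10)]
print(connection_report(100, snapshot))
```

In this made-up snapshot a single caching server accounts for half of the limit, which is the situation described above.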
DNS SERVER PROBLEMS
Trouble with the DNS server can also be an ugly problem. DNS is
the service that maps domain names to actual Internet IP
addresses. Unless this important service runs flawlessly, you won't
see your Web site anywhere on the Internet. Don't underestimate the
importance of this server. If inept or sloppy people run this machine, you
may find your site unreachable for quite some time.
Under the Internet's current setup, every domain has at least two
name servers: the primary server and a secondary server (which serves as
a backup). If the primary server is busy or if it fails, the secondary
server receives a subsequent lookup request and answers it with
the IP address of your Web site.
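The primary/secondary arrangement can be sketched as a simple fallback loop in Python. This is a simplified model, not how any particular resolver is implemented (real resolvers do not necessarily try servers in a fixed order). The lookup callable, the server names, and the address are hypothetical stand-ins so that the example runs without touching the network.

```python
def resolve_with_fallback(name, servers, lookup):
    """Ask each name server in turn and return (server, address) from
    the first one that answers. `lookup(server, name)` is a hypothetical
    callable that raises OSError when a server is busy or down."""
    errors = []
    for server in servers:
        try:
            return server, lookup(server, name)
        except OSError as exc:
            errors.append(f"{server}: {exc}")
    raise LookupError(f"no name server answered for {name}: " + "; ".join(errors))


def fake_lookup(server, name):
    # Stand-in for a real query: the primary times out, the backup answers.
    if server == "ns1.example.com":
        raise TimeoutError("primary is busy")
    return "192.0.2.10"


print(resolve_with_fallback("www.example.com",
                            ["ns1.example.com", "ns2.example.com"],
                            fake_lookup))
```

If every server in the list fails, the function raises an error, which corresponds to the site disappearing from the Internet.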
It's vital that you make sure whoever runs your DNS server is
very accessible. You never know when you'll need to make DNS changes
immediately when something goes wrong. This isn't something everyone knows
how to do; it's a specialized area at any ISP. If the person with his or
her hand on the switch is asleep, you're out of luck. If something happens
to your domain routing (DNS), you have only two chances a day to update
the records at the root level. Miss one, and you could be "down" for the
next 12 hours.
When a Web site isn't performing as expected, many
organizations tend to blame their Web hosting provider. While
this can be a good place to start, several other factors
can also degrade a Web site's performance. Faulty caching and DNS server
problems are two of them. Now, let's look
at two more factors that can affect a site.
BGP-4 AND OTHER ROUTING PROBLEMS
A more insidious issue involves actual routing on the Internet. These days, a
lot of Web sites use "components," and the combination of all these parts
makes up the complete Web site. One example is MSNBC. It routes its Web
site to users in one way (from its own Web servers), but content hosts,
such as Akamai, host some of the banner ads on the site.
When a request is returned to a user at the ISP, part of the
Web site arrives over one Internet connection to the ISP, while another
part comes over a second connection. This is due to a routing protocol
called BGP-4, which core Internet routers (such as Cisco's) use. If BGP-4
or the routers aren't configured properly, or if an Internet connection
flaps (goes up and down repeatedly, as connections sometimes do), this in
turn affects the routes BGP-4 advertises to adjacent networks, and it can
cause traffic from the sending Web server to take improper routes. The end
result is a Web page that doesn't completely show up.
The main symptom of a routing problem is a Web page that doesn't load
all the way. Web sites assembled from components on different hosts are the
most exposed, because they depend on accurate and reliable BGP-4 sessions.
This problem could be caused by the host's ISP, the visitor's ISP, or any
ISP in between, meaning the problem could lie anywhere. However, flapping
Internet connections and their effect on the associated BGP sessions often
cause it. This type of problem will be noticed by ISPs with a diligent IT
staff, but such a staff can be hard to find if you're dealing with a small
company that has one person running the network as a part-time job. The
ISP may not even notice problems until someone complains.
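When a page loads only partway, a useful first step is to work out which host the missing pieces come from. The Python sketch below groups failed components by host. It assumes you already have a record of which component URLs loaded; the function name and the URLs are invented for the example, and it cannot tell you which ISP along the path is at fault.

```python
from collections import defaultdict
from urllib.parse import urlparse


def failed_hosts(results):
    """Group the components that failed to load by the host serving them.

    results: mapping of component URL -> True if it loaded, else False
    (an assumed input, e.g. collected by hand or by a page checker).
    """
    by_host = defaultdict(list)
    for url, loaded in results.items():
        if not loaded:
            by_host[urlparse(url).hostname].append(url)
    return dict(by_host)


# Hypothetical page: the main page loads, the banner ads do not.
page = {
    "http://www.example.com/index.html": True,
    "http://ads.example.net/banner1.gif": False,
    "http://ads.example.net/banner2.gif": False,
}
print(failed_hosts(page))
```

If every failure points at one host while the rest of the page arrives, the route to that host is a reasonable place to start asking questions.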
BAD CODE
Bad code is probably the most common cause of hard-to-pinpoint
problems. Sometimes these problems don't arise until much later, or they
occur only under stress from too much traffic. Examples include
saturation of the SQL server and breakdowns between the Web site and the
SQL server under heavy traffic.
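A small simulation shows why this kind of problem hides until traffic rises. The Python sketch below models a site whose code can open only a fixed number of database connections and gives up immediately when none is free. The pool size, request counts, and function name are assumptions for illustration; a real site would be tested with a proper load-testing tool.

```python
import threading


def simulate_burst(pool_size, requests):
    """Simulate `requests` simultaneous page hits against a site with
    `pool_size` database connections. Returns (served, failed)."""
    pool = threading.BoundedSemaphore(pool_size)
    barrier = threading.Barrier(requests)
    outcomes = []
    lock = threading.Lock()

    def handle_request():
        # Try to grab a database connection without waiting.
        got_connection = pool.acquire(blocking=False)
        # Hold it until every request has arrived, so the hits overlap.
        barrier.wait()
        if got_connection:
            pool.release()
        with lock:
            outcomes.append(got_connection)

    threads = [threading.Thread(target=handle_request) for _ in range(requests)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    served = sum(outcomes)
    return served, requests - served


print(simulate_burst(5, 3))   # light traffic: (3, 0)
print(simulate_burst(5, 20))  # heavy traffic: (5, 15)
```

With three visitors the code looks flawless. With 20 at once, 15 of them get an error, even though nothing in the code has changed.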
How do you identify and resolve these types of problems? For
starters, if your Web hosting provider is a large national company, don't
count on it for help. You'll need a company that can dedicate itself to
pinning down the exact nature of the problem. This requires someone who's
willing to drop what he or she is doing and actually put some time into
it. Your best bet is to find a "smart" and quick Web hosting provider
that's very familiar with network issues as well as running a Web
server.