DIAGNOSING WEB SITE PROBLEMS

When something goes wrong with a Web site, people often look to the Webmaster as if he or she were the problem, or they simply point the finger at the ISP or Web hosting provider. While many companies are all too willing to blame problems on a "screwed up Web hosting service," they should also be aware of other factors that can quickly bring a Web site down. Let's look at a few.


FAULTY CACHING

Defective caching can be a big problem. You usually hear about it first from an irate caller complaining that the Web site isn't up to date or isn't appearing on the Internet at all. Little do they realize that their own ISP is often the root of the problem. Here's a good test: Put a clock on your Web site, ask the caller what time the clock shows when they look at your site, and then compare it with the time the clock shows you. If there's a discrepancy, it's most likely due to a caching server not updating its stored pages. This is an easy way to tell if there's a problem on the viewer's network.
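
The clock test above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the timestamp format, and the five-minute tolerance are all hypothetical choices made for the example, not part of any particular Web server or caching product.

```python
from datetime import datetime, timedelta

# Assumed format for the clock printed on the page, e.g. "2024-01-15 14:30:00".
CLOCK_FORMAT = "%Y-%m-%d %H:%M:%S"


def cache_lag(page_clock: str, actual_time: str) -> timedelta:
    """Return how far the clock the viewer sees trails the actual time."""
    shown = datetime.strptime(page_clock, CLOCK_FORMAT)
    actual = datetime.strptime(actual_time, CLOCK_FORMAT)
    return actual - shown


def looks_cached(page_clock: str, actual_time: str,
                 tolerance: timedelta = timedelta(minutes=5)) -> bool:
    """Treat the page as stale when its clock trails by more than the tolerance."""
    return cache_lag(page_clock, actual_time) > tolerance
```

If the caller reads "12:00:00" off the page while the live site shows "14:30:00", the lag is two and a half hours, which points to a stale copy somewhere between your server and the viewer.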


Another problem can surface when certain brands of caching servers don't differentiate between dynamic HTML and normal HTML. These caching servers repeatedly download the page for their users and, in the process, overload the Web server they're downloading from. In other words, they show up but never go away. Every Web server is limited to a finite number of connections. If these machines eat up half of your connections, you can serve only half the users you normally could. That's not to mention the stress this type of pounding puts on the Web server itself. It could make your Web page download slowly and push the underlying SQL server into overload.
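
One way to spot a caching server that never goes away is to count requests per client address in the Web server's access log. The sketch below assumes each log line begins with the client address, as in the Common Log Format; the function name and the sample lines are hypothetical, and the addresses are drawn from ranges reserved for documentation.

```python
from collections import Counter


def top_clients(log_lines, limit=3):
    """Count requests per client address and return the busiest clients.

    Assumes the client address is the first whitespace-separated field
    on each line, as in the Common Log Format.
    """
    counts = Counter()
    for line in log_lines:
        fields = line.split()
        if fields:  # skip blank lines
            counts[fields[0]] += 1
    return counts.most_common(limit)


# Hypothetical sample data for illustration only.
sample = [
    '203.0.113.7 - - [15/Jan/2024:14:30:00 +0000] "GET /news HTTP/1.0" 200 5120',
    '203.0.113.7 - - [15/Jan/2024:14:30:01 +0000] "GET /news HTTP/1.0" 200 5120',
    '198.51.100.2 - - [15/Jan/2024:14:30:02 +0000] "GET / HTTP/1.0" 200 2048',
    '203.0.113.7 - - [15/Jan/2024:14:30:03 +0000] "GET /news HTTP/1.0" 200 5120',
]

for address, hits in top_clients(sample):
    print(address, hits)
```

A single address that accounts for a large share of requests to the same dynamic page is a reasonable candidate for a misbehaving cache, though it could also be a busy proxy with many real users behind it.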


DNS SERVER PROBLEMS


Trouble with the DNS server can also be an ugly problem. DNS is the service that maps domain names to actual Internet IP addresses. Unless this important service runs flawlessly, no one will be able to find your Web site on the Internet. Don't underestimate the importance of this server. If inept or sloppy people run this machine, you may find your site unreachable for quite some time.


Under the Internet's current setup, every domain has at least two name servers: a primary server and a secondary server (which serves as a backup). If the primary server is busy or fails, the secondary server receives the subsequent lookup request and answers with the IP address of your Web site.
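
A quick way to confirm that a name still resolves is to ask the operating system's resolver, as in the minimal sketch below. It uses Python's standard `socket.getaddrinfo`; the function name `resolve_addresses` is a hypothetical helper. Note the limitation: this checks whatever resolver the local machine is configured to use and does not query the primary and secondary name servers individually.

```python
import socket


def resolve_addresses(hostname):
    """Return the sorted, de-duplicated IP addresses a host name resolves to.

    An empty list means the lookup failed.
    """
    try:
        infos = socket.getaddrinfo(hostname, None)
    except socket.gaierror:
        return []
    # Each entry is (family, type, proto, canonname, sockaddr);
    # the address string is the first element of sockaddr.
    return sorted({info[4][0] for info in infos})
```

Calling `resolve_addresses` with your own domain name from several different networks, and comparing the answers, can help show whether a DNS problem is local to one ISP or visible everywhere.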


It's vital that whoever runs your DNS server be readily accessible. You never know when you'll need to make DNS changes immediately because something has gone wrong. Not everyone knows how to do this; it's a specialized area at any ISP. If the person with his or her hand on the switch is asleep, you're out of luck. If something happens to your domain routing (DNS), you have only two chances a day to get corrections applied at the root level. Miss one, and you could be "down" for the next 12 hours.


When a Web site isn't performing as expected, many organizations have a tendency to blame their Web hosting provider. While this can be a good place to start, there are also several other factors that can degrade a Web site's performance. Last time, we looked at how faulty caching and DNS server problems can cause trouble. Now, let's look at two more factors that can affect a site.


BGP-4 AND OTHER ROUTING PROBLEMS


A more insidious issue involves actual routing on the Internet. These days, many Web sites are built from "components," and the combination of all these parts makes up the complete Web site. One example is MSNBC. It routes its pages to users one way (from its own Web servers), but content hosts, such as Akamai, serve some of the banner ads on the site.


When a page is returned to a user at the ISP, part of the Web site arrives over one of the ISP's Internet connections, while another part arrives over a second connection. This is due to a routing protocol called BGP-4 that core routers, such as Cisco's, use. If BGP-4 or the routers aren't configured properly, or if an Internet connection flaps (as connections sometimes do), this in turn affects the way BGP-4 reports itself across adjacent networks, and it can cause traffic from the sending Web server to follow improper routes. The end result is a Web page that doesn't completely show up.


Symptoms of routing problems include Web pages that don't load all the way, particularly on Web sites that are assembled from several sources and thereby depend on accurate and reliable BGP-4 sessions. The problem could be caused by the host's ISP, the visitor's ISP, or any ISP in between, meaning it could lie anywhere. Most often, though, flapping Internet connections and their effect on the associated BGP sessions are the cause. ISPs with a diligent IT staff will notice this type of problem, but such a staff can be hard to find if you're dealing with a small company that has one person running the network as a part-time job. The ISP may not even notice problems until someone complains.
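
When a page loads only partway, it helps to know which host serves the pieces that are missing. The sketch below groups failed component URLs by host so that a pattern (for example, everything from one content host failing) stands out. The function name and the `example.com` and `example.net` hosts are hypothetical, and the sketch assumes you have already recorded which components loaded.

```python
from urllib.parse import urlsplit


def failures_by_host(results):
    """Group failed component URLs by the host that serves them.

    `results` maps each component URL to True (loaded) or False (failed).
    """
    failed = {}
    for url, loaded in results.items():
        if not loaded:
            host = urlsplit(url).hostname
            failed.setdefault(host, []).append(url)
    return failed


# Hypothetical observations from one visitor's browser.
observed = {
    "http://www.example.com/index.html": True,
    "http://www.example.com/logo.gif": True,
    "http://ads.example.net/banner1.gif": False,
    "http://ads.example.net/banner2.gif": False,
}

print(failures_by_host(observed))
```

If every failure traces back to a single host, the route between the visitor and that host is a sensible place to start looking, although the grouping alone cannot say which ISP along the path is at fault.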


BAD CODE


Bad code is probably the most common cause of hard-to-pinpoint problems. Sometimes these problems don't arise until much later, or they occur only under stress from too much traffic, such as saturation of the SQL server or breakdowns in the connection between the Web site and the SQL server.
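
Because such problems may appear only under stress, one rough way to flush them out is to time the same piece of code at increasing levels of concurrency. The sketch below is a simplified illustration, not a real load-testing tool: `time_under_load` and its parameters are hypothetical names, and `handler` stands in for whatever page-generating or database code you want to exercise.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def time_under_load(handler, requests, workers):
    """Call `handler` `requests` times across `workers` threads.

    Returns the list of per-call durations in seconds.
    """
    def timed_call(_):
        start = time.perf_counter()
        handler()
        return time.perf_counter() - start

    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(timed_call, range(requests)))
```

Comparing the slowest duration with one worker against the slowest with, say, fifty gives a first hint of whether the code degrades as traffic rises.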


How do you identify and resolve these types of problems? For starters, if your Web hosting provider is a large national company, don't count on it for help. You'll need a company that can dedicate itself to resolving the exact nature of the problem. This requires someone who's willing to drop what he or she is doing and actually put some time into it. Your best bet is to find a "smart" and quick Web hosting provider that's very familiar with network issues as well as running a Web server.