ONLamp.com

Analyzing Web Logs with AWStats

Some significant problems are inherent in tracking visitors and their visits with web log analysis software such as AWStats.

  • You cannot tell that two people connecting at different points in the day from a shared home PC (or internet cafe) are two unique visitors, not one.
  • You cannot know that somebody who connects from both home and work is one unique visitor, not two.
  • Some ISPs (internet service providers) assign a new IP to each request, so if you view three pages over the span of a few minutes, you will appear as three distinct visitors.
  • An ISP may reassign an IP address to several users over the course of a day. Assume that Giacomo connects to the internet through his dial-up modem at 7:35 a.m. and disconnects after a few minutes. His hostname, dialup-062.libero.it, is now free. At 8:10 a.m., Patrizia connects with her modem, and her provider assigns her the same hostname, dialup-062.libero.it. If she visits a site, is she the same visitor in the same visit (session) as before?

    The commonly accepted convention is that a visit has ended once 30 minutes pass with no further activity from the visitor. Her visit would therefore count as a new session, but you have no way of knowing that she is a different person from Giacomo. When Giacomo connects later in the day, he will most likely do so from the office, so even if he had a fixed IP address at home, he will have a different host address at the office and will appear as a different visitor from the Giacomo who visited at 7:35 a.m.

  • Users in large companies often access the internet through a proxy, which in effect aggregates thousands of users into one apparent visitor.

Despite these limitations in heuristic approaches, visitors and sessions (individual visits) remain useful indicators of overall user behavior and trends.

Table 4. Visits and unique visitors
Visitor No.   Visits (sessions)   Unique visitors
1             2                   1
2             1                   1
3             12                  1
Total: 3      15                  3
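
The counting behind a table such as Table 4 can be sketched in a few lines of Python. This is only an illustration of the 30-minute convention described earlier, not AWStats's own code: the function name count_visits, the sample hosts, and the dates are all hypothetical, and a "visitor" here is simply a distinct host, with every limitation listed above.

```python
from collections import defaultdict
from datetime import datetime, timedelta

# Assumption: a visit ends after 30 minutes without activity from the host.
SESSION_TIMEOUT = timedelta(minutes=30)

def count_visits(hits):
    """Return {host: number of visits} for an iterable of (host, timestamp) pairs."""
    by_host = defaultdict(list)
    for host, stamp in hits:
        by_host[host].append(stamp)

    visits = {}
    for host, stamps in by_host.items():
        stamps.sort()
        count = 1  # the first hit always opens a visit
        for previous, current in zip(stamps, stamps[1:]):
            # A gap longer than the timeout starts a new visit.
            if current - previous > SESSION_TIMEOUT:
                count += 1
        visits[host] = count
    return visits

# Hypothetical hits: Giacomo at 7:35 and 7:38, Patrizia at 8:10 on the same hostname.
hits = [
    ("dialup-062.libero.it", datetime(2005, 1, 10, 7, 35)),
    ("dialup-062.libero.it", datetime(2005, 1, 10, 7, 38)),
    ("dialup-062.libero.it", datetime(2005, 1, 10, 8, 10)),
    ("proxy.example.com", datetime(2005, 1, 10, 9, 0)),
]

visits = count_visits(hits)
unique_visitors = len(visits)        # distinct hosts, not distinct people
total_visits = sum(visits.values())
print(visits, unique_visitors, total_visits)
```

With this sample data the dial-up hostname is counted as one unique visitor with two visits, even though two different people were behind it.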

Bandwidth consumption

Bandwidth consumption is of interest to technical staff, as there is usually an economic cost associated with its use. On a more granular level, unusually large individual files can point to performance problems, especially for dial-up users.

Bandwidth
The total size of the files sent from the web server to the end user. This excludes the HTTP headers of served objects, HTTP request headers from users, and the bytes needed by the underlying network protocols.
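
As a rough illustration of this definition, the sketch below sums the size field of a few log lines. It assumes the server writes Apache-style Common or Combined Log Format, where the size field counts response bytes without headers and a dash means no body was sent. The sample lines, addresses, and the function name total_bandwidth are invented for the example.

```python
import re

# Matches the quoted request, the status code, and the size field of a
# Common/Combined Log Format line. The size is "-" when no body was sent.
LOG_LINE = re.compile(r'"[^"]*" (?P<status>\d{3}) (?P<size>\d+|-)')

def total_bandwidth(lines):
    """Sum the response sizes (in bytes) found in an iterable of log lines."""
    total = 0
    for line in lines:
        match = LOG_LINE.search(line)
        if match and match.group("size") != "-":
            total += int(match.group("size"))
    return total

# Hypothetical log lines; the last one is a 304 response with no body.
sample = [
    '192.0.2.10 - - [10/Jan/2005:07:35:12 +0100] "GET /index.html HTTP/1.1" 200 5120',
    '192.0.2.10 - - [10/Jan/2005:07:35:13 +0100] "GET /logo.png HTTP/1.1" 200 20480',
    '192.0.2.11 - - [10/Jan/2005:08:10:45 +0100] "GET /index.html HTTP/1.1" 304 -',
]

bandwidth = total_bandwidth(sample)
print(bandwidth)  # 25600
```

Lines that do not match the pattern are skipped rather than treated as errors, which keeps the sketch short but would hide malformed entries in a real log.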

The final part of this series will look at the reports we generated, using the definitions above to identify business and technical metrics to watch.

Sean Carlos is president of Antezeta, an internet consultancy focusing on Merit-Based™ search engine optimization, search engine marketing, web analytics, and web site usability.

