OJNI
An Analysis of Web Site Usage

Rod Ward

Introduction

The Internet in general, and the World Wide Web in particular, is becoming an increasingly important source of information related to health for professionals and students, reflecting wider use amongst all sections of society (DofH 1999, section 2.6).

Nursing &
Health Care Resources on the Net is a web site that attempts to provide an index of over 1500 websites, mailing lists and newsgroups relevant to nurses and other health professionals. It has been running since 1993. At that time, an extensive search of the Internet found only 4 nursing-related sites and a couple of mailing lists, with no central index. During 1994 my own students and close colleagues used the site, with each providing about 10 visits per week. As people started to find out about it, they suggested further sites (and mailing lists and newsgroups) to be included, and the site grew. The site now lists over 1500 net resources for nurses and other health professionals and took over 100,000 hits in 1998.

The site has recently been upgraded using JavaScript to remove the frames-based design and, it is hoped, to improve ease of navigation and download times. It has a high graphical content, with links to web sites marked with their logo as well as a short description. The site is unashamedly an eclectic collection of resources, intended to help novice net users from the healthcare community begin to explore some of the resources available.

Research questions

The purpose of
this analysis was to identify who was accessing the site (for example, from which part of the world), when they accessed it, what features they used, and trends in use over the last 5 years.

Noonan (1997) identified several questions to ask when trying to find out about the people who use a web site:

· Where do people go online?
· How long do they stay on a particular site or page?
· What software are they using?
· Where do they get Internet access?

and highlighted the importance of users' behaviour patterns for web site designers: “If
you know what people like to see online, you can tailor your service to meet
their needs. If the logs reveal that visitors keep looking for a certain type of
material, maybe it should be provided. If the logs reveal that folks spend a
minute and are off to something more interesting, maybe it's time to think about
adding something like a searchable database or quiz to keep them a while longer.
Folks "surf" the net for the same reason they switch channels, they
are looking for a reason to stop.”

She summarised the reasons for using some sort of log file as being: “You
want to know what's popular on your pages (and what isn't), who uses your system
and how they use it, and how you can improve the service.”

Once one accepts that it is necessary to track the users of a web site, there are a variety of different tools that can be used, all of which have advantages and disadvantages. None of them are very accurate. Linder (1998) summarised this: “It's
been said, rightly, that trying to draw conclusions from web server statistics
is like trying to nail Jell-o to the wall. Is
that a fault with the software? The hardware? The operation of the system? No.
The inaccuracy of the numbers is simply a by-product of the way the Web
functions. Even the most technically advanced sites have only a general idea of
the amount and nature of the traffic on their servers.”

Linder (1998) goes further: “Most
large commercial sites such as America Online, Prodigy, and CompuServe, use
large "caches" on the machines
they use to service web requests. That means that once a user looks up one of
our pages, it hangs around in memory at that site in case someone else wants to
look at it. That way, things run faster - if person "A" looks at our
genealogy page, and two minutes later, person "B" wants to look at it,
person "B" gets it right away because it is still in memory from
person "A", and the machine doesn't have to get it from us again. What
effect does this have? It has the effect of reducing the number of reported "hits" we receive from those sites. It's entirely possible that we only register one "hit" for every ten times someone at, for example, CompuServe, looks at the genealogy page. No, there is no way whatsoever to determine exactly what effect this has on our numbers - it could be a factor of two, it could be orders of magnitude. Probably somewhere in the middle.”

Methods

Several different counters, trackers, and log files have been used to measure the page accesses or hits and to collect a variety of data about the user. These work in different ways. Log files are created by the server, which counts how many requests are received for a file. Counters and trackers require some HTML code to be placed on a page. When a request for this page is received from any machine connected to the Internet, an additional message is sent to the machine acting as the server for the counter or tracker. Counters will then increment by one. Trackers will, in addition, record some details about the machine requesting the page, e.g. its IP address (the identification of the machine requesting the page, which identifies the country or network being used), how the request was generated (e.g. a link from another page or a search engine) and the set-up of the user (e.g. browser being used, screen resolution etc.).

Each
method of measuring accesses and user/machine details has advantages and disadvantages, and the accuracy of each is influenced by the state of the Internet connection, its placement on a web page and the configuration of the user's system.

The terms hit, visit and access are used interchangeably in this context to indicate one request being received for a particular page. Alternative data about page impressions or volume of data transferred can provide additional information but were not used in this study.

The primary
sources of this data have been the Nedstat (http://www.nedstat.net) and Extreme (http://www.extreme-dm.com) counters, which are both available free but require the owner of the web site to insert some HTML code into the site to make them work.

If users are accessing the site from a network that uses a cache (e.g. a university or commercial Internet Service Provider), their page views will not be recorded because the page is served from a copy held on the local network. It is therefore likely, in those cases, that the total number of hits considerably exceeds the numbers recorded by the trackers.

Results

The site itself was one of the first for nursing on the World Wide Web. It was started in 1994 and received about 10,000 hits in the first 2 years (1994-1996). The site took over 100,000 hits during 1998. Access statistics were examined to see where these were coming from, at what times of day and on which days of the week, and what browsers and operating systems were being used, in order to build up a user profile.

Hits per day

The number of hits on the site varies with the day of the week, showing a visible drop in hits at weekends, as can be seen in the example in Table 1.
Table 1
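
As an illustration of how a weekday pattern of this kind could be derived from a server log file, the following Python sketch tallies hits by day of the week. It is not the method used by the Nedstat or Extreme counters. The log lines, host names and page names are hypothetical, and the Common Log Format layout is an assumption about how the server writes its log.

```python
import re
from collections import Counter
from datetime import datetime

# Hypothetical log lines in Common Log Format; not data from this study.
SAMPLE_LOG = """\
host1.example.ac.uk - - [01/Jun/1998:09:15:02 +0000] "GET /index.html HTTP/1.0" 200 5120
host2.example.com - - [01/Jun/1998:14:40:10 +0000] "GET /index.html HTTP/1.0" 200 5120
host3.example.edu - - [02/Jun/1998:15:05:33 +0000] "GET /nursing-uk.html HTTP/1.0" 200 7300
host1.example.ac.uk - - [06/Jun/1998:11:00:00 +0000] "GET /index.html HTTP/1.0" 200 5120
"""

# The timestamp is the text between the first pair of square brackets.
TIMESTAMP = re.compile(r"\[([^\]]+)\]")

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]


def hits_by_weekday(log_text):
    """Count one hit per log line, grouped by day of the week."""
    counts = Counter()
    for line in log_text.splitlines():
        match = TIMESTAMP.search(line)
        if match is None:
            continue  # skip lines without a recognisable timestamp
        when = datetime.strptime(match.group(1), "%d/%b/%Y:%H:%M:%S %z")
        counts[WEEKDAYS[when.weekday()]] += 1
    return counts


def weekend_share(counts):
    """Fraction of all hits that fell on a Saturday or Sunday."""
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return (counts["Saturday"] + counts["Sunday"]) / total


if __name__ == "__main__":
    counts = hits_by_weekday(SAMPLE_LOG)
    for day in WEEKDAYS:
        print(day, counts[day])
    print("weekend share:", weekend_share(counts))
```

Because each log line is counted as one hit, this matches the definition of a hit used in this study (one request received for a page), but it would also count requests for images unless those lines were filtered out first.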
Table 2 shows the hits coming to the site by hour of the day in Greenwich Mean Time (GMT) between February 1998 and July 1999. The peak level of hits occurred during 15.00-15.59 and the lowest level during 07.00-07.59.
Table 2
The hits for different pages show variations, for instance in times of the day, e.g. nursing-UK (Table 3) and nursing-North-America (Table 4).

Table 3 shows the hits coming to the site by hour of the day in Greenwich Mean Time (GMT) from the UK between January 1998 and July 1999. The peak level of hits occurred during 14.00-14.59 and the lowest level during 04.00-04.59.

Table 3
Table 4 shows the hits coming to the site by hour of the day in Greenwich Mean Time (GMT) from North America between January 1998 and July 1999. The peak level of hits occurred during 20.00-20.59 and the lowest level during 10.00-10.59.
Table 4
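
Because all of these hourly figures are recorded in GMT, it can help to re-express them in the visitors' local time before comparing regions. The Python sketch below shifts a 24-hour table of hits by a fixed offset. The counts in the example are placeholders rather than the figures behind Tables 2 to 4, and the offset of -5 hours (US Eastern Standard Time) is only one of several North American time zones and ignores daylight saving.

```python
def shift_hours(hits_by_gmt_hour, utc_offset_hours):
    """Re-index a 24-entry list of hourly hits from GMT to a local time zone."""
    if len(hits_by_gmt_hour) != 24:
        raise ValueError("expected one count for each of the 24 hours")
    shifted = [0] * 24
    for gmt_hour, hits in enumerate(hits_by_gmt_hour):
        local_hour = (gmt_hour + utc_offset_hours) % 24
        shifted[local_hour] += hits
    return shifted


def peak_hour(hits):
    """Hour (0-23) with the most hits; the earliest hour wins a tie."""
    return max(range(24), key=lambda hour: hits[hour])


def quietest_hour(hits):
    """Hour (0-23) with the fewest hits; the earliest hour wins a tie."""
    return min(range(24), key=lambda hour: hits[hour])


if __name__ == "__main__":
    # Placeholder counts: busiest at 20.00 GMT, quietest at 10.00 GMT.
    gmt_hits = [1] * 24
    gmt_hits[20] = 50
    gmt_hits[10] = 0

    local_hits = shift_hours(gmt_hits, -5)  # assumed offset: UTC-5
    print(peak_hour(local_hits), quietest_hour(local_hits))  # prints: 15 5
```

Under that assumed offset, a peak at 20.00-20.59 GMT corresponds to mid-afternoon local time, which is broadly comparable to the 14.00-14.59 GMT peak reported for the UK.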
The hits per month from May 1998 to July 1999 are shown in Table 5. These reveal a gradual rise during 1998 with a fall around the Christmas period, then a large rise in January 1999.

Table 5
The visitors to the site come from all over the world, as shown in Table 6. The largest number was from UK (.uk) domains, followed by Network (.net), Commercial (.com) and unknown domains, which are impossible to classify. The next largest group was from the USA, primarily educational institutions (.edu), and then Australia (.au) and Canada (.ca). These are divided by continent in Table 7.

Table 6 Domains/Countries from which hits are coming - 13 May 98-16 July 99 (Source Extreme)

+ over 80 other countries/domains with fewer than 100 (0.1%) hits.

Table 7 Continents from which hits are coming - 13 May 98-16 July 99 (Source Extreme)
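
A breakdown by domain of this kind can be approximated by looking at the last part of each visitor's host name. The Python sketch below is one way to do this; it is not how the Extreme tracker works internally. The host names are hypothetical, the label table covers only the domains named above, and treating a bare IP address as "Unknown" is an assumption about why some visitors could not be classified.

```python
from collections import Counter

# Labels for the domains named in the text; anything else is grouped as "Other".
DOMAIN_LABELS = {
    "uk": "United Kingdom",
    "net": "Network",
    "com": "Commercial",
    "edu": "US Educational",
    "au": "Australia",
    "ca": "Canada",
}


def classify_host(host):
    """Return a label for a host name based on its top-level domain."""
    last_part = host.strip().rsplit(".", 1)[-1].lower()
    # An empty name, or a numeric last part (a bare IP address), cannot be classified.
    if not last_part or last_part.isdigit():
        return "Unknown"
    return DOMAIN_LABELS.get(last_part, "Other")


def tally_domains(hosts):
    """Count hits per label, one hit per host entry."""
    return Counter(classify_host(host) for host in hosts)


if __name__ == "__main__":
    # Hypothetical visitor host names, for illustration only.
    sample_hosts = [
        "pc12.example.ac.uk",
        "dialup7.example.net",
        "library.example.edu",
        "192.0.2.17",
        "host.example.de",
    ]
    for label, hits in tally_domains(sample_hosts).most_common():
        print(label, hits)
```

Note that .net and .com addresses say nothing about the country of the visitor, which is one reason a continent table built from domains can only be approximate.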
The way in which users arrive at the site showed that most are clicking on a link on another website, rather than using a search engine, as shown in Table 8. Very few of the users recorded were arriving via email or usenet newsgroup messages.

The most popular search engines used are shown in Table 9. A total of 1457 different keywords were used; the most popular are shown in Table 10.

Over 2500 other web sites have links to this one. The sites providing the most referrals, in the form of hyperlinks pointing to this site, are shown in Table 11.

Table 8 Referral Sources 13 May 98-16 July 99 (Source Extreme)

Table 9 Most popular search engines 13 May 98-16 July 99 (Source Extreme)

+ 12 others with less than 1.00%

Table 10 Keywords used in search engines 13 May 98-16 July 99 (Source Extreme)

Table 11 Top referring pages 13 May 98-16 July 99 (Source Extreme)
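
To illustrate how referral sources and search keywords might be separated, the Python sketch below classifies a referring URL as a search engine or an ordinary website and, for search engines, extracts the words typed by the user. This is an assumption about the general technique, not a description of the Extreme tracker. The search engine host names and their query parameter names are hypothetical.

```python
from urllib.parse import parse_qs, urlsplit

# Hypothetical search engines and the query parameter each uses for keywords.
SEARCH_ENGINE_PARAMS = {
    "search.example.com": "q",
    "find.example.net": "query",
}


def classify_referrer(referrer):
    """Return (source_type, keywords) for a referring URL.

    source_type is "search engine", "website", "other" or "unknown".
    keywords is a list of lower-case words, empty unless a search was found.
    """
    if not referrer:
        return ("unknown", [])
    parts = urlsplit(referrer)
    host = parts.hostname or ""
    if host in SEARCH_ENGINE_PARAMS:
        values = parse_qs(parts.query).get(SEARCH_ENGINE_PARAMS[host], [])
        keywords = " ".join(values).lower().split()
        return ("search engine", keywords)
    if parts.scheme in ("http", "https") and host:
        return ("website", [])
    return ("other", [])


if __name__ == "__main__":
    examples = [
        "http://search.example.com/find?q=Nursing+informatics",
        "http://www.example.ac.uk/links.html",
        "",
    ]
    for url in examples:
        print(repr(url), classify_referrer(url))
```

Counting the returned source types would give a table like Table 8, and counting the keywords across all search engine referrals would give one like Table 10.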