The Webalizer is web log analysis software that generates HTML reports from web server access and usage logs. It is one of the most commonly used web server administration tools and was started by Bradford L. Barrett in 1997. Commonly reported statistics include hits, visits, referrers, visitors' countries, and the amount of data downloaded. These statistics can be viewed graphically and broken down by time frame, such as by hour, day, or month.
# yum install webalizer
Create a configuration file for every vhost.
# cd /etc
# cp webalizer.conf webalizer-www.example.com.conf
# vi webalizer-www.example.com.conf
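With many vhosts, the copy step can be scripted. The sketch below is one possible approach, assuming every vhost follows the www.example.com path layout used on this page. make_vhost_conf is a hypothetical helper name, not something shipped with Webalizer. It only sets the four per-vhost keywords (LogFile, OutputDir, HistoryName, HostName); everything else still has to be edited by hand.

```shell
# Sketch: generate one Webalizer config per vhost from a template.
# make_vhost_conf is a hypothetical helper, not part of Webalizer.
# Usage: make_vhost_conf TEMPLATE VHOST CONF_DIR
make_vhost_conf() {
    local template="$1" vhost="$2" conf_dir="$3"
    local conf="$conf_dir/webalizer-$vhost.conf"
    # Directory layout assumed from the example paths on this page.
    local usage="/srv/www/vhosts/$vhost/httpsdocs/usage/$vhost"

    # Remove active per-vhost keywords from the template (commented-out
    # examples stay), then append the values for this vhost.
    sed -E '/^[[:space:]]*(LogFile|OutputDir|HistoryName|HostName)[[:space:]]/d' \
        "$template" > "$conf" || return 1
    {
        echo "LogFile     /var/log/httpd/$vhost-access_log"
        echo "OutputDir   $usage"
        echo "HistoryName $usage/webalizer.hist"
        echo "HostName    $vhost"
    } >> "$conf"
}

# Example (as root):
#   for v in www.example.com mail.example.com; do
#       make_vhost_conf /etc/webalizer.conf "$v" /etc
#   done
```

The appended lines end up at the bottom of the file, so check the result before the first run.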
The items below should be added or changed.
# vi webalizer-www.example.com.conf

LogFile         /var/log/httpd/www.example.com-access_log
OutputDir       /srv/www/vhosts/www.example.com/httpsdocs/usage/www.example.com
HistoryName     /srv/www/vhosts/www.example.com/httpsdocs/usage/www.example.com/webalizer.hist
HostName        www.example.com

PageType        htm*
PageType        cgi
PageType        php
PageType        shtml
#PageType       phtml
#PageType       php3
#PageType       pl
PageType        xml

DNSCache        /srv/www/webalizer/dns_cache.db

# HTMLPre defines HTML code to insert at the very beginning of the
# file. Default is the DOCTYPE line shown below. Max line length
# is 80 characters, so use multiple HTMLPre lines if you need more.
HTMLPre <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">

# HTMLHead defines HTML code to insert within the <HEAD></HEAD>
# block, immediately after the <TITLE> line. Maximum line length
# is 80 characters, so use multiple lines if needed.
HTMLHead <meta name="author" content="The Webalizer">
HTMLHead <link rel="shortcut icon" href="/favicon.ico">
HTMLHead <style type="text/css"> body {font-family:Verdana, arial, helvetica} </style>

# HTMLBody defines the HTML code to be inserted, starting with the
# <BODY> tag. If not specified, the default is shown below. If
# used, you MUST include your own <BODY> tag as the first line.
# Maximum line length is 80 char, use multiple lines if needed.
HTMLBody <body>

# HTMLPost defines the HTML code to insert immediately before the
# first <HR> on the document, which is just after the title and
# "summary period"-"Generated on:" lines. If anything, this should
# be used to clean up in case an image was inserted with HTMLBody.
# As with HTMLHead, you can define as many of these as you want and
# they will be inserted in the output stream in order of appearance.
# Max string size is 80 characters. Use multiple lines if you need to.
#HTMLPost <br clear="all">

# HTMLTail defines the HTML code to insert at the bottom of each
# HTML document, usually to include a link back to your home
# page or insert a small graphic. It is inserted as a table
# data element (ie: <TD> your code here </TD>) and is right
# aligned with the page. Max string size is 80 characters.
HTMLTail <img src="/pictures/msfree.png" alt="100% Micro$oft free!">

# HTMLEnd defines the HTML code to add at the very end of the
# generated files. It defaults to what is shown below. If
# used, you MUST specify the </BODY> and </HTML> closing tags
# as the last lines. Max string length is 80 characters.
HTMLEnd </body>
HTMLEnd </html>

TopSites        30
TopKSites       10
TopURLs         400
TopKURLs        10
TopReferrers    50
TopAgents       15
TopCountries    200
TopEntry        100
TopExit         10
TopSearch       250
TopUsers        20

# Your own site should be hidden
HideSite        localhost
HideSite        2001:985:395:1:021e:2aff:fe49:522c

# Your own site gives most referrals
HideReferrer    www.example.com
HideReferrer    example.com
HideReferrer    localhost
HideReferrer    192.168.1.11
HideReferrer    2001:985:395:1:021e:2aff:fe49:522c

# Usually you want to hide these
HideURL         *.gif
HideURL         *.GIF
HideURL         *.jpg
HideURL         *.JPG
HideURL         *.png
HideURL         *.PNG
HideURL         *.bmp
HideURL         *.BMP
HideURL         *.ra
HideURL         *.css
HideURL         *.txt
HideURL         *.ico
HideURL         *.js
HideURL         *.swf

# The following is a great way to get an overall total
# for browsers, and not display all the detail records.
# (You should use MangleAgent to refine further...)
# The order is important.
GroupAgent      IE              Micro$oft Internet Exploder
GroupAgent      Firefox         Firefox
GroupAgent      Edge            Edge
GroupAgent      Chrome          Chrome
GroupAgent      Safari          Safari
GroupAgent      Lynx            Lynx
GroupAgent      *bot*           Webcrawlers

# The SearchEngine keywords allow specification of search engines and
# their query strings on the URL. These are used to locate and report
# what search strings are used to find your site. The first word is
# a substring to match in the referrer field that identifies the search
# engine, and the second is the URL variable used by that search engine
# to define its search terms.
SearchEngine    yahoo.com       p=
SearchEngine    altavista.com   q=
SearchEngine    google.         q=
SearchEngine    eureka.com      q=
SearchEngine    lycos.com       query=
SearchEngine    hotbot.com      MT=
SearchEngine    msn.            MT=
SearchEngine    infoseek.com    qt=
SearchEngine    webcrawler      searchText=
SearchEngine    excite          search=
SearchEngine    netscape.com    search=
SearchEngine    mamma.com       query=
SearchEngine    alltheweb.com   query=
SearchEngine    northernlight.  qr=
SearchEngine    ziggo.          q=
SearchEngine    zoeken.nl       q=
SearchEngine    ilse.nl         search_for=
SearchEngine    vindex.nl       search_for=
SearchEngine    yandex.         q=
SearchEngine    bing.           q=
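The SearchEngine rule described in the config comments (a substring matched against the referrer, then a URL variable that holds the search terms) can be illustrated with a small shell function. This is only a sketch of the rule as described there. search_terms is a hypothetical name, and Webalizer's own parser may treat encoding and edge cases differently.

```shell
# Sketch of the SearchEngine rule: substring match on the referrer,
# then look up the query variable. Not Webalizer's actual code.
# Usage: search_terms REFERRER ENGINE_SUBSTRING QUERY_VARIABLE
search_terms() {
    local referrer="$1" engine="$2" var="$3"
    local query pair value

    # 1. The engine substring must occur somewhere in the referrer.
    case $referrer in
        *"$engine"*) ;;
        *) return 1 ;;
    esac

    # 2. Take the query string (everything after the first '?').
    case $referrer in
        *\?*) query=${referrer#*\?} ;;
        *) return 1 ;;
    esac

    # 3. Walk the '&'-separated pairs and look for the query variable.
    while [ -n "$query" ]; do
        pair=${query%%&*}
        case $pair in
            "$var"*)
                # Print the value, decoding '+' to a space (no %XX decoding).
                value=${pair#"$var"}
                printf '%s\n' "$value" | tr '+' ' '
                return 0
                ;;
        esac
        case $query in
            *\&*) query=${query#*&} ;;
            *) query= ;;
        esac
    done
    return 1
}
```

For example, `search_terms 'https://www.google.com/search?hl=en&q=web+log+analysis' 'google.' 'q='` prints `web log analysis`, which corresponds to the `SearchEngine google. q=` line.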
# cd /etc/cron.daily/
# vi 00webalizer

#!/bin/bash
# update access statistics for the web site
/usr/bin/webalizer -c /etc/webalizer-server1.example.com.conf
/usr/bin/webalizer -c /etc/webalizer-www.example.com.conf
/usr/bin/webalizer -c /etc/webalizer-mail.example.com.conf
/usr/bin/webalizer -c /etc/webalizer-blog.example.com.conf
/usr/bin/webalizer -c /etc/webalizer-cloud.example.com.conf
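Instead of listing each vhost, the daily job could loop over all config files. This is a minimal sketch, assuming every per-vhost file is named /etc/webalizer-*.conf as on this page. run_all_vhosts, WEBALIZER and CONF_DIR are names made up for this example.

```shell
# Sketch: run Webalizer once for every per-vhost config file.
# WEBALIZER and CONF_DIR can be overridden, e.g. WEBALIZER=echo for a dry run.
WEBALIZER=${WEBALIZER:-/usr/bin/webalizer}
CONF_DIR=${CONF_DIR:-/etc}

run_all_vhosts() {
    local conf status=0
    for conf in "$CONF_DIR"/webalizer-*.conf; do
        [ -e "$conf" ] || continue           # the glob matched nothing
        "$WEBALIZER" -c "$conf" || status=1  # keep going if one vhost fails
    done
    return "$status"
}

# In /etc/cron.daily/00webalizer the script would end with:
#   run_all_vhosts
```

A new vhost is then picked up as soon as its config file exists. Scripts in /etc/cron.daily must be executable (chmod +x 00webalizer), otherwise they are skipped.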
To create monthly statistics, run webalize with the month in yymm form, for example webalize 2011 (yy = 20, mm = 11).
This requires a separate log file for each month, named like www.example.com-access_log-2011.
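This page does not say how the month files are produced. One possible way, assuming Apache's common or combined log format and years 2000 to 2099, is to filter the full access log on the timestamp. make_month_file is a hypothetical helper, not part of Webalizer.

```shell
# Sketch: copy one month of entries from an access log into LOGFILE-yymm.
# Assumes timestamps of the form [10/Nov/2020:13:55:36 +0100].
# Usage: make_month_file LOGFILE YYMM
make_month_file() {
    local log="$1" yymm="$2"
    local year month

    # Validate the argument: two digits for the year, 01-12 for the month.
    case $yymm in
        [0-9][0-9]0[1-9] | [0-9][0-9]1[0-2]) ;;
        *) echo "Use yymm, for example 2011." >&2; return 2 ;;
    esac

    year="20${yymm%??}"   # assumption: years 2000-2099
    case ${yymm#??} in
        01) month=Jan ;; 02) month=Feb ;; 03) month=Mar ;; 04) month=Apr ;;
        05) month=May ;; 06) month=Jun ;; 07) month=Jul ;; 08) month=Aug ;;
        09) month=Sep ;; 10) month=Oct ;; 11) month=Nov ;; 12) month=Dec ;;
    esac

    # Keep only lines whose timestamp falls in that month, e.g. [01/Nov/2020:
    # grep exits 1 (and the month file stays empty) when nothing matches.
    grep -E "\[[0-9]{2}/$month/$year:" "$log" > "$log-$yymm"
}
```

For example, `make_month_file /var/log/httpd/www.example.com-access_log 2011` would write www.example.com-access_log-2011 next to the original log.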
# vi /usr/bin/webalize

#!/bin/bash
# update access statistics for the web site, one month at a time
# usage: webalize yymm
if [ -z "$1" ]; then
    echo "No parameter input. Use yymm." >&2
    exit 1
fi

# the month files are given relative to the log directory
cd /var/log/httpd || exit 1
/usr/bin/webalizer -c /etc/webalizer-server1.example.com.conf server4-access_log-"$1"
/usr/bin/webalizer -c /etc/webalizer-www.example.com.conf www.example.com-access_log-"$1"
/usr/bin/webalizer -c /etc/webalizer-mail.example.com.conf mail.example.com-access_log-"$1"
/usr/bin/webalizer -c /etc/webalizer-blog.example.com.conf blog.example.com-access_log-"$1"
/usr/bin/webalizer -c /etc/webalizer-cloud.example.com.conf cloud.example.com-access_log-"$1"