Log Parser: The coolest tool Microsoft has ever released!
Introduction
Log Parser 2.x is, in my skewed perspective, the coolest tool Microsoft has ever released. It easily ranks up there with the Winternals/Sysinternals tools (Filemon, Regmon, Process Explorer, etc.). The tool is a basic Swiss Army knife for log files: an administrator can tear through file after file and strip out all the interesting data. In my world, I focus on two major inputs – IIS log files & the Event Viewer.
During our Web Administration Summit (10 countries in 4 weeks in Europe), we showed off the coolness that comes from this tool. Beyond that, we used some simple slides & powerful scripts to show how easily an administrator can tear through Event Viewer logs & W3C logs, including a sample 200-megabyte log file that demonstrated the speed and power of Log Parser.
Log Parser Basics
The purpose of this blog isn’t to “teach” anyone Log Parser but instead to make people aware of it (if they somehow are not). The basics are pretty straightforward and include the following:
Flavors:
Log Parser comes in two flavors, command-line & component-based (a COM object DLL), depending on whether you prefer direct interaction or automation. The COM object lets you use the tool in web-based applications or with ease via script, such as VBScript or JScript.
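As a rough sketch of the component flavor (assuming LogParser.dll has been registered so the MSUtil.LogQuery COM object is available, and that ex*.log names real IIS W3C log files), a few lines of VBScript can run a query and walk the records:

' Sketch: query the top 10 URLs through the Log Parser COM object.
' Assumes LogParser.dll is registered (regsvr32 LogParser.dll).
Set oQuery = CreateObject("MSUtil.LogQuery")
Set oInput = CreateObject("MSUtil.LogQuery.IISW3CInputFormat")
strSQL = "SELECT TOP 10 cs-uri-stem, COUNT(*) AS Hits " & _
         "FROM ex*.log GROUP BY cs-uri-stem ORDER BY Hits DESC"
Set oRecords = oQuery.Execute(strSQL, oInput)
Do While Not oRecords.atEnd
    Set oRecord = oRecords.getRecord()
    WScript.Echo oRecord.getValue("cs-uri-stem") & vbTab & oRecord.getValue("Hits")
    oRecords.moveNext
Loop
oRecords.close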
Input Formats:
I couldn’t begin to list all the input formats that are supported by Log Parser. The list scrolls beyond the screen in the help file, so I suggest you visit it when time permits. However, from an IIS-centric mindset, let me tell you some of the things that I use it for:
- Conversion: The ability to convert one log file format to another. This is useful if you are using, say, Centralized Binary Logging (Windows Server 2003 only) and would like a readable format such as W3C. This can be done easily using Log Parser.
- IIS Log Files: No matter the log file type, you can indicate it using -i:<file format> and Log Parser will pick that up and understand it (including built-in conveniences such as understanding that <1> means the /w3svc/1 log files); see the sample command after this list.
- Event Viewer: This is very useful for locating error events thrown by the key IIS services and processes, such as W3SVC and w3wp.exe.
- Network Monitor: You got it — it can parse those big ol’ Netmon traces to locate specific traffic you are interested in, and only that traffic.
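For example, here is the sample command mentioned above: a quick query against the first web site’s logs by site number, assuming the IISW3C format and default log locations:

logparser "SELECT TOP 10 cs-uri-stem, COUNT(*) AS Hits FROM <1> GROUP BY cs-uri-stem ORDER BY Hits DESC" -i:IISW3C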
Output Formats:
- NAT: This is the default (native) format, but not necessarily the best. It simply returns the results to the screen, which is not always useful, as the data might be hard to read or, more often, TOO BIG.
- Datagrid: This is nice for formatting data in an easy-to-read format, though it is again still slightly limited. I use it often, as I am not in the world of automation that many of you are, so forgive my passion for it. However, if you do choose to use it, then you should learn the -rtp switch, which tells Log Parser exactly how many rows to print when extracting the data. This can be done using -rtp:-1 (all rows, no matter how many) or -rtp:10 (print only 10 rows at a time); see the sample command after this list.
- SQL: Yep, you can easily take data from the log files and feed it into SQL Server. This is very useful, allowing administrators to hand-pick (via SQL commands) only the data they are interested in rather than the entire file.
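Here is the datagrid sample mentioned above, a minimal sketch assuming IISW3C logs for site 1:

logparser "SELECT c-ip, cs-uri-stem FROM <1>" -i:IISW3C -o:DATAGRID -rtp:10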
Sample Scripts
- Convert Centralized Binary Log (CBL) files to W3C log files
In some high-traffic environments, the use of binary logging can help the performance of the web server. The pitfalls of binary logging are the missing extended properties, such as referrer, and the fact that it doesn’t fit every environment. However, the biggest issue most administrators have with binary logging is that they *can’t* read the files using their favorite text editor. Enter Log Parser…
logparser "SELECT * INTO <name of new W3C text file> FROM <name of binary file>" -i:BIN -o:W3C
- Top browsers accessing your Websites
If you are interested in understanding which browsers are most used to access your sites, then you can use this script. Based solely on the User-Agent log property, it counts the hits and then sorts by total hits.
Select top 50
to_int(mul(100.0,PropCount(*))) as Percent,
count(*) as TotalHits,
cs(User-Agent) as Browser
from [LogFileName]
group by Browser
order by Totalhits desc
- Daily Bandwidth
This is a nice script to determine exactly how much bandwidth your site is using. If you pay for bandwidth, or do not have an unlimited supply, this is very nice to know.
Select
To_String(To_timestamp(date, time), 'MM-dd') As Day,
Div(Sum(cs-bytes),1024) As Incoming(K),
Div(Sum(sc-bytes),1024) As Outgoing(K)
Into BandwidthByDay.gif
From [LogFileName]
Group By Day
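Because the query outputs INTO BandwidthByDay.gif, it has to be run with the CHART output format, which in Log Parser 2.x requires the Office Web Components to be installed. A sketch, assuming the query is saved in a file named BandwidthByDay.sql:

logparser file:BandwidthByDay.sql -i:IISW3C -o:CHART -chartType:ColumnClustered -chartTitle:"Daily Bandwidth"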
- Hits by Hour
A per-hour view of traffic might help you understand better how your entire application set is used by customers. This is useful in many ways, such as capacity planning or preparation for expansion. There are many uses beyond those I mention, but these are a couple that come to mind.
SELECT QUANTIZE(TO_LOCALTIME(TO_TIMESTAMP(date,time)),3600) As Hour,
cs-uri-stem As Url,
COUNT(*) As Hits
FROM [LogFileName]
WHERE cs-uri-stem LIKE '%index.htm%'
GROUP BY Url, Hour
ORDER By Url
- Request by URI
Page down after page down… is a boring exercise. This is the life of a web administrator who chooses to manually determine what traffic is occurring on their website, or worse, pays for services to do it. This can be avoided with Log Parser, which is free and can do it very quickly using its familiar SQL-like query language.
SELECT top 80
QUANTIZE(TO_TIMESTAMP(date, time), 3600) as Hour,
TO_LOWERCASE(STRCAT('/',EXTRACT_TOKEN(cs-uri-stem,1,'/'))) as URI,
COUNT(*) AS RequestsPerHour,
SUM(sc-bytes) AS TotBytesSent,
AVG(sc-bytes) AS AvgBytesSent,
Max(sc-bytes) AS MaxBytesSent,
ADD(1,DIV(Avg(time-taken),1000)) AS AvgTime,
ADD(1,DIV(MAX(time-taken),1000)) AS MaxTime
FROM [LogFileName]
GROUP BY Hour, URI
Having RequestsPerHour > 10
ORDER BY RequestsPerHour ASC
- Top 10 Images by Size
I love this one because it allows you to find those users who have their digital camera set to top resolution combined with a lack of skill in editing, cropping, or re-sizing. Find these bandwidth hogs and size ‘em down…
Select
Top 10
StrCat(Extract_Path(TO_Lowercase(cs-uri-stem)),'/') AS RequestedPath,
Extract_filename(To_Lowercase(cs-uri-stem)) As RequestedFile,
Count(*) AS Hits,
Max(time-taken) As MaxTime,
Avg(time-taken) As AvgTime,
Max(sc-bytes) As BytesSent
From [LogFileName]
Where
(Extract_Extension(To_Lowercase(cs-uri-stem)) IN ('gif';'jpg';'png'))
AND
(sc-status = 200)
Group By To_Lowercase(cs-uri-stem)
Order By BytesSent, Hits, MaxTime DESC
- Top 10 URLs for a website, with some interesting supporting data
Select
TOP 10
STRCAT(EXTRACT_PATH(cs-uri-stem),'/') AS RequestPath,
EXTRACT_FILENAME(cs-uri-stem) AS RequestedFile,
COUNT(*) AS TotalHits,
Max(time-taken) AS MaxTime,
AVG(time-taken) AS AvgTime,
AVG(sc-bytes) AS AvgBytesSent
INTO chartTop10Url.gif
FROM [LogFileName]
GROUP BY cs-uri-stem
ORDER BY TotalHits DESC
- Top 20 clients using your website
Select
Top 20 c-ip AS Client,
Count(*) AS Hits
INTO Chart.gif
FROM [LogFileName]
GROUP BY c-ip
ORDER BY Hits Desc
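Since this query outputs INTO Chart.gif, it also needs the CHART output format. A sketch, with Top20Clients.sql as a hypothetical file holding the query above:

logparser file:Top20Clients.sql -i:IISW3C -o:CHART -chartType:PieExploded -groupSize:640x480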
- Referrer Broken Links (i.e., another site refers to your content, though your content has moved, etc.)
SELECT DISTINCT cs(Referer) as Referer,
cs-uri-stem as Url
INTO ReferBrokenLinks.html
FROM [LogFileName]
WHERE cs(Referer) IS NOT NULL
AND sc-status = 404
AND (sc-substatus IS NULL OR sc-substatus=0)
Here is the source for the .tpl (template) file that Log Parser uses to format the data in ReferBrokenLinks.html:
<LPHEADER>
<HTML><HEAD><TITLE>Pages having broken links to this site</TITLE></HEAD>
<BODY BGCOLOR="#EFEFFF">
<TABLE BORDER="1" CELLPADDING="2" CELLSPACING="2">
<TR>
<TH COLSPAN="2" ALIGN="CENTER">Pages with Broken Links To This Site</TH>
</TR>
<TR>
<TH ALIGN="LEFT">Referring Page</TH>
<TH ALIGN="LEFT">Broken Link</TH>
</TR>
</LPHEADER>
<LPBODY>
<TR>
<TD><A HREF="%Referer%">%Referer%</A></TD>
<TD>%Url%</TD>
</TR>
</LPBODY>
<LPFOOTER>
</TABLE>
</BODY>
</HTML>
</LPFOOTER>
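To produce ReferBrokenLinks.html from the query and template above, the command looks something like this (ReferBrokenLinks.sql being a hypothetical file holding the query):

logparser file:ReferBrokenLinks.sql -i:IISW3C -o:TPL -tpl:ReferBrokenLinks.tpl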
- Status codes returned by your site
SELECT sc-status As Status,
COUNT(*) As Number
INTO StatusCodes.gif
FROM <2>
GROUP BY Status
ORDER BY Status
- Time-taken for virtual directories
SELECT CASE EXTRACT_TOKEN(cs-uri-stem, 2, '/')
WHEN NULL THEN '/'
ELSE EXTRACT_PREFIX(cs-uri-stem, 1, '/')
END As VDir,
QUANTIZE(time-taken, 100) As TimeTaken,
COUNT(*) As NumHits
FROM [LogFileName]
GROUP BY TimeTaken, VDir
ORDER BY TimeTaken, VDir DESC
- Grep the Event Viewer for W3SVC (IIS) log entries and color-code them as Error, Warning, or Information
SELECT TimeGenerated,
EventTypeName,
Strings,
Message,
CASE EventTypeName
WHEN 'Error event' THEN 'RED'
WHEN 'Warning event' THEN 'YELLOW'
WHEN 'Information event' THEN 'WHITE'
ELSE 'BLUE'
END As Color
INTO file.htm
FROM System
WHERE SourceName = 'W3SVC'
- The associated .tpl file used for file.htm in the above example
<LPHEADER>
<HTML>
<HEAD>
<STYLE>
TD { font-family: Arial };
TH { font-family: Arial };
</STYLE></HEAD>
<BODY>
<TABLE BORDERCOLOR="BLACK" BORDER="1" CELLPADDING="2" CELLSPACING="2">
<TR>
<TH COLSPAN=4 BGCOLOR="BLACK"><FONT COLOR=WHITE>New W3SVC Messages in System Event Log</FONT></TH>
</TR>
<TR>
<TH ALIGN=LEFT BGCOLOR="#C0C0C0">Time Generated</TH>
<TH ALIGN=LEFT BGCOLOR="#C0C0C0">Event Type</TH>
<TH ALIGN=LEFT BGCOLOR="#C0C0C0">Strings</TH>
<TH ALIGN=LEFT BGCOLOR="#C0C0C0">Message</TH>
</TR>
</LPHEADER>
<LPBODY>
<TR bgCOLOR="%Color%">
<TD>%TimeGenerated%</TD>
<TD>%EventTypeName%</TD>
<TD>%Strings%</TD>
<TD>%Message%</TD>
</TR>
</LPBODY>
<LPFOOTER>
</TABLE>
</BODY>
</HTML>
</LPFOOTER>
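To generate file.htm from the event query and this template, something like the following should do it (W3SVCEvents.sql being a hypothetical file holding the query above):

logparser file:W3SVCEvents.sql -i:EVT -o:TPL -tpl:W3SVCEvents.tpl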
- Upload to SQL Server using Log Parser for SQL Reporting Services
logparser "SELECT * INTO LogsTable FROM ex*.log WHERE TO_LOWERCASE(EXTRACT_EXTENSION(cs-uri-stem)) NOT IN ('gif';'jpg';'png') AND sc-status <> 404" -i:IISW3C -o:SQL -database:LogsDatabase
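A couple of SQL output parameters worth knowing from the help file: -server points at a remote machine, and -createTable:ON has Log Parser create LogsTable if it doesn’t already exist. A sketch, with MySqlServer as a placeholder server name:

logparser "SELECT * INTO LogsTable FROM ex*.log" -i:IISW3C -o:SQL -server:MySqlServer -database:LogsDatabase -createTable:ON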
Summary
Log Parser is probably one of the only tools on the planet where the possibilities are almost “endless.” The tool is like a prized Kentucky Derby contender that has all the makings of a champion — all it needs is a little luck. That luck is you, the user: it will take you as far as you want it to, and I hope that these samples (courtesy of myself & Alexis on the Roadshow in Europe) prove useful.
Great Resources:
Official Log Parser Site (non-MS)
Microsoft Log Parser Toolkit by Syngress Publishing
Data Mining with Log Parser 2.x and IIS (Webcast)