
6. Squid Log Files

The logs are a valuable source of information about Squid workloads and performance. The logs record not only access information, but also system configuration errors and resource consumption (e.g., memory, disk space).

6.1 access.log

The access.log file has two formats: ``native'' and ``common.'' The Common Logfile Format is used by numerous HTTP servers. It consists of the following seven fields:

        remotehost rfc931 authuser [date] "method URL" status bytes

The native format is different for different major versions of Squid. For Squid-1.0 it is:

        time elapsed remotehost code/status/peerstatus bytes method URL

For Squid-1.1, the information from the hierarchy.log was moved into access.log. The format is:

        time elapsed remotehost code/status bytes method URL rfc931 peerstatus/peerhost
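
As an illustration of the Squid-1.1 native format above, here is a minimal Python sketch that splits one log line into named fields. The names `AccessEntry` and `parse_access_line` and the sample line are made up for this example, and the sketch assumes that fields are separated by whitespace and that the URL contains no spaces.

```python
from typing import NamedTuple


class AccessEntry(NamedTuple):
    """One Squid-1.1 native access.log entry (hypothetical helper type)."""
    timestamp: float   # Unix time with millisecond resolution
    elapsed_ms: int    # elapsed time in milliseconds
    remotehost: str
    code: str          # cache result code, e.g. TCP_MISS
    status: str        # HTTP status code, kept as text ("000" for ICP)
    nbytes: int        # bytes delivered to the client
    method: str
    url: str
    rfc931: str
    peerstatus: str
    peerhost: str


def parse_access_line(line: str) -> AccessEntry:
    """Split a native-format line into its nine whitespace-separated fields."""
    fields = line.split()
    if len(fields) != 9:
        raise ValueError(f"expected 9 fields, got {len(fields)}")
    ts, elapsed, host, code_status, nbytes, method, url, ident, peer = fields
    # code/status and peerstatus/peerhost are each joined by a slash.
    code, _, status = code_status.partition("/")
    peerstatus, _, peerhost = peer.partition("/")
    return AccessEntry(float(ts), int(elapsed), host, code, status,
                       int(nbytes), method, url, ident, peerstatus, peerhost)


# Made-up sample line, for illustration only.
sample = ("873214800.123 456 10.0.0.1 TCP_MISS/200 1234 "
          "GET http://example.com/ - DIRECT/example.com")
entry = parse_access_line(sample)
print(entry.code, entry.status, entry.peerstatus)
```

Keeping the status as text preserves the leading zeros of the "000" value logged for UDP requests.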

6.2 hierarchy.log

This logfile exists for Squid-1.0 only. The format is:

        [date] URL peerstatus peerhost

6.3 store.log

The store.log consists of the following fields:

    time       The time this entry was logged.  The value is the
               raw Unix time plus milliseconds.

    action     One of RELEASE, SWAPIN, or SWAPOUT.
               RELEASE means the object has been removed from the cache.
               SWAPOUT means the object has been saved to disk.
               SWAPIN  means the object existed on disk and has been
                       swapped into memory.

    status     The HTTP reply code.

    The following three fields are timestamps parsed from the HTTP
    reply headers.  All are expressed in Unix time.  A missing header
    is represented with -2 and an unparsable header is represented as -1.

    datehdr    The value of the HTTP Date: reply header.

    lastmod    The value of the HTTP Last-Modified: reply header.

    expires    The value of the HTTP Expires: reply header.

    type       The HTTP Content-Type reply header.

    expect-len The value of the HTTP Content-Length reply header.
               Zero if Content-Length was missing.

    real-len   The number of bytes of content actually read.  If the
               expect-len is non-zero, and not equal to the real-len,
               the object will be released from the cache.

    method     HTTP request method

    key        The cache key.  Often this is simply the URL.  Cache objects
               which never become public will have cache keys that include
               a unique integer sequence number, the request method, and
               then the URL.
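
The special values described above can be interpreted with a few lines of Python. This is only a sketch: the function names `describe_header_time` and `will_be_released` are hypothetical, and the values are assumed to have already been read from the log as integers.

```python
import time


def describe_header_time(value: int) -> str:
    """Render a datehdr, lastmod, or expires value in readable form."""
    if value == -2:
        return "missing"        # header was not present in the reply
    if value == -1:
        return "unparsable"     # header was present but could not be parsed
    return time.strftime("%Y-%m-%d %H:%M:%S", time.gmtime(value)) + " UTC"


def will_be_released(expect_len: int, real_len: int) -> bool:
    """True when expect-len is non-zero and differs from real-len."""
    return expect_len != 0 and expect_len != real_len


print(describe_header_time(-2))
print(describe_header_time(0))
print(will_be_released(100, 50))
```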

6.4 Field Definitions

These are the definitions for the various log format components:

remotehost

The IP address of the client host. In Squid-1.1, if the log_fqdn option is enabled, full hostnames will be logged when available.

rfc931

The username associated with the client connection, determined from an Ident (RFC 931) server running on the client host. By default Ident lookups are not made, but may be enabled with the ident_lookup option.

authuser

Always NULL ("-") for Squid logs.

method

GET, HEAD, POST, etc. for HTTP requests. ICP_QUERY for ICP requests.

URL

The requested URL.

code

The ``cache result'' of the request. This describes if the request was a cache hit or miss, and if the object was refreshed. See the full list of cache result codes.

status

HTTP status code: 200 for successful actions, 000 for UDP requests, 302 for redirects, 500 for server errors, etc. See the HTTP status codes for a complete list.

bytes

The number of bytes delivered to the client.

peerstatus

A status code that explains how the request was forwarded, either to your peer (neighbor) caches, or directly to the origin server.

peerhost

The host to which the request was forwarded.

time

Unix timestamp (seconds since Jan 1, 1970) with millisecond resolution. You can convert Unix time to something more readable with this short perl script:

        #!/usr/bin/perl -p
        s/^\d+\.\d+/localtime $&/e;

date

HTTP date format: [dd/mmm/yyyy:hh:mm:ss TZ-offset]

elapsed

The time elapsed (milliseconds) during the client connection. For HTTP requests, this is the time between the accept() and close() system calls for the TCP socket. For ICP requests, this represents the time between scheduling the reply message for sending and actually sending it.
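
For readers who prefer Python, the timestamp conversion described under the time field can be sketched as follows. This is an approximation of the Perl approach rather than a replacement for it: the function name `humanize_timestamp` is hypothetical, and the sketch prints UTC instead of local time so that its output does not depend on the machine's time zone.

```python
import re
from datetime import datetime, timezone

# A leading Unix timestamp with a fractional part, e.g. "873214800.123".
_LEADING_TS = re.compile(r"^\d+\.\d+")


def humanize_timestamp(line: str) -> str:
    """Replace a leading Unix timestamp with a readable UTC date."""
    def repl(match):
        moment = datetime.fromtimestamp(float(match.group(0)), tz=timezone.utc)
        return moment.strftime("%Y-%m-%d %H:%M:%S UTC")
    return _LEADING_TS.sub(repl, line, count=1)


# Lines that do not start with a timestamp are returned unchanged.
print(humanize_timestamp("86400.500 12 rest of the line"))
```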

6.5 Cache Result Codes

TCP_ codes

Note, TCP_ refers to requests on the HTTP port (3128).

TCP_HIT

A valid copy of the requested object was in the cache.

TCP_MEM_HIT

A valid copy of the requested object was in the cache, AND it was in memory so it did not have to be read from disk.

TCP_NEGATIVE_HIT

The request was for a negatively-cached object. Negative-caching refers to caching certain types of errors, such as "404 Not Found." The amount of time these errors are cached is controlled with the negative_ttl configuration parameter.

TCP_MISS

The requested object was not in the cache.

TCP_REFRESH_HIT

The object was in the cache, but STALE. An If-Modified-Since request was made and a "304 Not Modified" reply was received.

TCP_REF_FAIL_HIT

The object was in the cache, but STALE. The request to validate the object failed, so the old (stale) object was returned.

TCP_REFRESH_MISS

The object was in the cache, but STALE. An If-Modified-Since request was made and the reply contained new content.

TCP_CLIENT_REFRESH

The client issued a request with the "no-cache" pragma.

TCP_IMS_HIT

The client issued an If-Modified-Since request and the object was in the cache and still fresh.

TCP_IMS_MISS

The client issued an If-Modified-Since request for a stale object.

TCP_SWAPFAIL

The object was believed to be in the cache, but could not be accessed.

TCP_DENIED

Access was denied for this request.

UDP_ codes

"UDP_" refers to requests on the ICP port (3130).

UDP_HIT

A valid copy of the requested object was in the cache.

UDP_HIT_OBJ

Same as UDP_HIT, but the object data was small enough to be sent in the UDP reply packet. Saves the following TCP request.

UDP_MISS

The requested object was not in the cache.

UDP_DENIED

Access was denied for this request.

UDP_INVALID

An invalid request was received.

UDP_RELOADING

The ICP request was "refused" because the cache is busy reloading its metadata.

ERR_ codes

"ERR_" refers to various types of errors for HTTP requests. For example:

ERR_CLIENT_ABORT

The client aborted its request.

ERR_NO_CLIENTS

There are no clients requesting this URL any more.

ERR_READ_ERROR

There was a read(2) error while retrieving this object.

ERR_CONNECT_FAIL

Squid failed to connect to the server for this request.
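
To show how the result codes in this section might be summarized, the following Python sketch groups codes by prefix and computes a rough hit ratio. The names `code_family` and `tcp_hit_ratio` are hypothetical, and counting every TCP_ code that ends in "HIT" as a hit is a simplifying assumption made for this example, not a definition taken from Squid.

```python
def code_family(code: str) -> str:
    """Return "TCP", "UDP", or "ERR" from the code's prefix, else "OTHER"."""
    for prefix in ("TCP_", "UDP_", "ERR_"):
        if code.startswith(prefix):
            return prefix[:-1]
    return "OTHER"


def tcp_hit_ratio(codes) -> float:
    """Fraction of TCP_ codes whose name ends in "HIT" (assumed to mean a hit)."""
    tcp = [c for c in codes if c.startswith("TCP_")]
    if not tcp:
        return 0.0
    hits = sum(1 for c in tcp if c.endswith("HIT"))
    return hits / len(tcp)


# Made-up list of codes, as they might be collected from access.log.
codes = ["TCP_HIT", "TCP_MISS", "UDP_HIT", "TCP_MEM_HIT", "TCP_DENIED"]
print(code_family("ERR_CLIENT_ABORT"), tcp_hit_ratio(codes))
```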

6.6 Peer Status Codes

Hierarchy Data Tags

DIRECT

The object has been requested from the origin server.

FIREWALL_IP_DIRECT

The object has been requested from the origin server because the origin host IP address is inside your firewall.

FIRST_PARENT_MISS

The object has been requested from the parent cache with the fastest weighted round trip time.

FIRST_UP_PARENT

The object has been requested from the first available parent in your list.

LOCAL_IP_DIRECT

The object has been requested from the origin server because the origin host IP address matched your 'local_ip' list.

SIBLING_HIT

The object was requested from a sibling cache which replied with a UDP_HIT.

NO_DIRECT_FAIL

The object could not be requested because of firewall restrictions and no parent caches were available.

NO_PARENT_DIRECT

The object was requested from the origin server because no parent caches exist for the URL.

PARENT_HIT

The object was requested from a parent cache which replied with a UDP_HIT.

SINGLE_PARENT

The object was requested from the only parent cache appropriate for this URL.

SOURCE_FASTEST

The object was requested from the origin server because the 'source_ping' reply arrived first.

PARENT_UDP_HIT_OBJ

The object was received in a UDP_HIT_OBJ reply from a parent cache.

SIBLING_UDP_HIT_OBJ

The object was received in a UDP_HIT_OBJ reply from a sibling cache.

PASSTHROUGH_PARENT

The neighbor or proxy defined in the config option 'passthrough_proxy' was used.

SSL_PARENT_MISS

The neighbor or proxy defined in the config option 'ssl_proxy' was used.

DEFAULT_PARENT

No ICP queries were sent to any parent caches. This parent was chosen because it was marked as 'default' in the config file.

ROUNDROBIN_PARENT

No ICP queries were received from any parent caches. This parent was chosen because it was marked as 'round-robin' in the config file and it had the lowest round-robin use count.

CLOSEST_PARENT_MISS

This parent was selected because it included the lowest RTT measurement to the origin server. This only appears when query_icmp is enabled in the config file.

CLOSEST_DIRECT

The object was fetched directly from the origin server because this cache measured a lower RTT than any of the parent caches.

Almost any of these may be preceded by 'TIMEOUT_' if the two-second (default) timeout occurs waiting for all ICP replies to arrive from neighbors.
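
When processing the peerstatus field in a script, it can help to separate the optional timeout prefix from the tag itself. A minimal Python sketch, with the hypothetical function name `split_peerstatus`:

```python
def split_peerstatus(tag: str):
    """Return (timed_out, base_tag) for a peer status tag."""
    prefix = "TIMEOUT_"
    if tag.startswith(prefix):
        return True, tag[len(prefix):]
    return False, tag


print(split_peerstatus("TIMEOUT_DIRECT"))
print(split_peerstatus("PARENT_HIT"))
```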

6.7 HTTP status codes

These are taken from RFC 2068.

100  Continue
101  Switching Protocols
200  OK
201  Created
202  Accepted
203  Non-Authoritative Information
204  No Content
205  Reset Content
206  Partial Content
300  Multiple Choices
301  Moved Permanently
302  Moved Temporarily
303  See Other
304  Not Modified
305  Use Proxy
400  Bad Request
401  Unauthorized
402  Payment Required
403  Forbidden
404  Not Found
405  Method Not Allowed
406  Not Acceptable
407  Proxy Authentication Required
408  Request Time-out
409  Conflict
410  Gone
411  Length Required
412  Precondition Failed
413  Request Entity Too Large
414  Request-URI Too Large
415  Unsupported Media Type
500  Internal Server Error
501  Not Implemented
502  Bad Gateway
503  Service Unavailable
504  Gateway Time-out
505  HTTP Version not supported

6.8 cache/log (Squid-1.x)

This file has a rather unfortunate name. It is also often called the swap log. It is a record of every cache object written to disk, and Squid reads it at startup to ``reload'' the cache. If you remove this file when Squid is NOT running, you will effectively wipe out your cache contents. If you remove this file while Squid IS running, you can easily recreate it. The safest way is simply to shut down the running process:

        % squid -k shutdown

This will disrupt service, but at least you will have your swap log back. Alternatively, you can tell Squid to rotate its log files. This also causes a clean swap log to be written.

        % squid -k rotate

For Squid-1.1, there are six fields:

  1. fileno: The swap file number holding the object data. This is mapped to a pathname on your filesystem.
  2. timestamp: This is the time when the object was last verified to be current. The time is a hexadecimal representation of Unix time.
  3. expires: This is the value of the Expires header in the HTTP reply. If an Expires header was not present, this will be -2 or fffffffe. If the Expires header was present, but invalid (unparsable), this will be -1 or ffffffff.
  4. lastmod: Value of the HTTP reply Last-Modified header. If missing it will be -2, if invalid it will be -1.
  5. size: Size of the object, including headers.
  6. url: The URL naming this object.
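
The hexadecimal time fields described above, including the special values fffffffe and ffffffff, can be decoded as signed 32-bit integers. The following Python sketch assumes the field has already been extracted from the line as text; the function name `decode_hex_time` is hypothetical.

```python
def decode_hex_time(field: str) -> int:
    """Decode a hexadecimal time field as a signed 32-bit integer."""
    value = int(field, 16)
    # Values with the top bit set are negative in two's complement,
    # so fffffffe becomes -2 and ffffffff becomes -1.
    if value >= 0x80000000:
        value -= 0x100000000
    return value


print(decode_hex_time("fffffffe"))
print(decode_hex_time("ffffffff"))
```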

6.9 swap.state (Squid-2.x)

In Squid-2, the swap log file is called swap.state. It is a binary file that includes MD5 checksums and StoreEntry fields. Please see the Programmers Guide for information on the contents and format of that file.

If you remove swap.state while Squid is running, simply send Squid the signal to rotate its log files:

        % squid -k rotate

Alternatively, you can tell Squid to shut down and it will rewrite this file before it exits.

If you remove swap.state while Squid is not running, you will not lose your entire cache. In this case, Squid will scan all of the cache directories and read each swap file to rebuild the cache. This can take a very long time, so you'll have to be patient.

By default the swap.state file is stored in the top-level of each cache_dir. You can move the logs to a different location with the cache_swap_log option.

6.10 Which log files can I delete safely?

You should never delete access.log, store.log, cache.log, or swap.state while Squid is running. With Unix, you can delete a file while a process still has it open. However, the filesystem space is not reclaimed until the process closes the file.

If you accidentally delete swap.state while Squid is running, you can recover it by following the instructions in the previous sections. If you delete the others while Squid is running, you cannot recover them.

The correct way to maintain your log files is with Squid's ``rotate'' feature. You should rotate your log files at least once per day. The current log files are closed and then renamed with numeric extensions (.0, .1, etc). If you want to, you can write your own scripts to archive or remove the old log files. If not, Squid will only keep up to logfile_rotate versions of each log file. The logfile rotation procedure also writes a clean swap.state file, but it does not leave numbered versions of the old files.

To rotate Squid's logs, simply use this command:

        squid -k rotate

For example, use this cron entry to rotate the logs at midnight:

        0 0 * * * /usr/local/squid/bin/squid -k rotate
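
If you write your own archiving script, a first step is to find the rotated versions of a log by their numeric extensions. The Python sketch below works on a plain list of file names; the function names `rotated_versions` and `versions_to_archive` and the idea of keeping the newest few versions in place are assumptions made for illustration, not part of Squid.

```python
import re


def rotated_versions(filenames, base="access.log"):
    """Return names of the form base.N, ordered by the number N."""
    pattern = re.compile(re.escape(base) + r"\.(\d+)$")
    found = []
    for name in filenames:
        match = pattern.match(name)
        if match:
            found.append((int(match.group(1)), name))
    return [name for _, name in sorted(found)]


def versions_to_archive(filenames, base, keep):
    """Return rotated versions beyond the first `keep` (lowest-numbered) ones."""
    return rotated_versions(filenames, base)[keep:]


# Made-up directory listing, for illustration only.
names = ["access.log", "access.log.10", "access.log.2",
         "store.log.0", "access.log.0"]
print(rotated_versions(names))
print(versions_to_archive(names, "access.log", 2))
```

Sorting on the numeric value rather than the text keeps .10 after .2.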

6.11 How can I disable Squid's log files?

To disable access.log:

        cache_access_log /dev/null

To disable store.log:

        cache_store_log none

It is a bad idea to disable the cache.log because this file contains many important status and debugging messages. However, if you really want to, you can:

        cache_log /dev/null

6.12 My log files get very big!

You need to rotate your log files with a cron job. For example:

        0 0 * * * /usr/local/squid/bin/squid -k rotate

6.13 Why do I get ERR_NO_CLIENTS_BIG_OBJ messages so often?

This message means that the requested object was in ``Delete Behind'' mode and the user aborted the transfer. An object will go into ``Delete Behind'' mode if

6.14 What does ERR_LIFETIME_EXP mean?

This means that a timeout occurred while the object was being transferred. Most likely the retrieval of this object was very slow (or it stalled before finishing) and the user aborted the request. However, depending on your settings for quick_abort, Squid may have continued to try retrieving the object. Squid imposes a maximum amount of time on all open sockets, so after some amount of time the stalled request was aborted and logged with an ERR_LIFETIME_EXP message.

6.15 Retrieving ``lost'' files from the cache

I've been asked to retrieve an object which was accidentally destroyed at the source for recovery. So, how do I figure out where the things are so I can copy them out and strip off the headers?

The following method applies only to the Squid-1.1 versions:

Use grep to find the named object (URL) in the cache/log file. The first field in this file is an integer file number.

Then, find the file fileno-to-pathname.pl in the ``scripts'' directory of the Squid source distribution. The usage is:

        perl fileno-to-pathname.pl [-c squid.conf]

File numbers are read on stdin, and pathnames are printed on stdout.

