[ale] Dealing with really big log files....

Michael B. Trausch mike at trausch.us
Sun Mar 22 18:06:38 EDT 2009


On Sun, 22 Mar 2009 12:55:19 -0400
Jeff Hubbs <jeffrey.hubbs at gmail.com> wrote:

> A 114GiB log file certainly will compress like mad, either via gzip
> or bzip2 - the former is faster to compute; the latter generally
> gives smaller output.  Once you've done that and pulled over the
> compressed copy for local use, use rsync -z to keep your local copy
> synced to the server's. 
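
For reference, Jeff's rsync step would look something like this (the
host and path here are made up for illustration):

$ rsync -avz server:/var/log/reallyBigFile.gz .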

For a file so large, I'd compress it first and then break it into
chunks.  Something like the following:

$ pigz reallyBigFile
$ split -d -a3 -b2G reallyBigFile.gz prefixForSplitFiles
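
(split's -d flag asks for numeric suffixes, -a3 makes those suffixes
three digits wide, and -b2G caps each chunk at 2GiB.)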

(pigz is a gzip compressor that will use multiple CPUs.  On my system,
a quad-core, I let it create the default 8 worker threads.  It is
*very* fast.  Obviously, if you only have a single-core system to work
on, there won't be much advantage over regular gzip.  OTOH, if you're
on a system that can run many concurrent threads, you'll get a hell of
a speedup.)
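
If you'd rather pick the thread count yourself instead of taking the
default, pigz accepts a -p flag; e.g., to use four threads:

$ pigz -p 4 reallyBigFile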

That ought to make it easy enough to move around.  One caveat: the
chunks are just slices of a single gzip stream, so you can't zcat a
middle segment by itself; to search the log, cat the chunks back
together in order and pipe the result through zcat.
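
Something like the following, assuming the chunk names sort correctly
(they will with -d -a3) and substituting whatever pattern you're
actually after:

$ cat prefixForSplitFiles* | zcat | grep 'pattern'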

	--- Mike