Posts Tagged ‘logfiles’

Replacing sed

I’ve spoken about processing logfiles with perl previously. Occasionally though, I still reach for sed.

Say I have a logfile that looks like this:

[ <timestamp> ] : somefunc()
[ <timestamp> ] : interesting line 1
[ <timestamp> ] : interesting line 2
... 1000s of lines
[ <timestamp> ] : somefunc()
[ <timestamp> ] : interesting line 1
[ <timestamp> ] : interesting line 2
... 1000s of lines

Picking out lines following a pattern is easy with sed: p prints the current line (the pattern space) and n replaces it with the next line of input.

$ < log.txt sed -n '/somefunc()/ {p;n;p;n;p}'
[ <timestamp> ] : somefunc()
[ <timestamp> ] : interesting line 1
[ <timestamp> ] : interesting line 2
[ <timestamp> ] : somefunc()
[ <timestamp> ] : interesting line 1
[ <timestamp> ] : interesting line 2

My first attempt to replace that with perl looks a bit ugly:

< log.txt \
perl -ne 'if (/somefunc\(\)/) {print; print scalar(<>) for (1..2)}'

I’m not that happy with the module I came up with to hide the messiness either.

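package Logfiles;

use strict;
use warnings;

require Exporter;
our @ISA = qw(Exporter);
our @EXPORT_OK = qw(process);

use Carp;

# process { BLOCK } $regex, [ offsets ]
# If $_ matches $regex, call BLOCK on the matching line (offset 0) and on
# each following line whose offset is listed. Offsets must be ascending.
sub process (&$;$)
{
    my ($sub, $regex, $lines) = @_;
    return unless /$regex/;
    if (! $lines) {
        $sub->($_);
    } else {
        croak "process() arg 3 not ref ARRAY" unless ref($lines) eq 'ARRAY';
        my $line = 0;
        my @lines = @$lines;
        while (@lines) {
            if ($lines[0] == $line) {
                $sub->($_);
                shift @lines;
            }
            ++$line;
            # Stop once every offset is handled or the offsets are out of order.
            last if (not @lines) or $line > $lines[0];
            # Stop at end of input rather than handing undef to the callback.
            defined($_ = <>) or last;
        }
    }
}

1;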

But at least my typical spelunking looks a little cleaner now.

< log.txt perl -MLogfiles=process -ne \
    'process { print } qr/somefunc\(\)/, [0..2]'

Any suggestions on how to improve this (without reverting to sed)?


The Perl Flip Flop Operator

Mike Taylor dismisses Perl with a pithy reference to a section of its excellent documentation [1]. For some reason, I misremembered that he was complaining about the flip-flop operator rather than context in general.

So, I’ve come to defend the flip-flop operator, and the opposition hasn’t turned up! Oh well, never mind.

In scalar context, “..” returns a boolean value. The operator is bistable, like a flip-flop, and emulates the line-range (comma) operator of sed, awk, and various editors. Each “..” operator maintains its own boolean state. It is false as long as its left operand is false. Once the left operand is true, the range operator stays true until the right operand is true, AFTER which the range operator becomes false again.

Scanning Logfiles

So, say your logfile looks something like this:

... 100,000 lines ...
10:22:25.279 The first interesting line
... 30 more interesting lines ...
10:22:25.772 Another interesting line
10:22:25.772 The last interesting line
10:22:25.779 And then this line isn't interesting any more
... 100,000 lines ...

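The range includes the line that matches the right operand, and the last interesting timestamp appears on more than one line, so I use the timestamp of the first uninteresting line as the end. That leaves one uninteresting line at the bottom, which you can strip with head -n-1 (the negative count needs GNU head).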

And that is it. Pretty easy eh?

jared@localhost $ cat flip-flop.muse \
> | perl -ne 'print if /^10:22:25.279/ .. /^10:22:25.779/' \
> | head -n-1 \
> | mail jared
10:22:25.279 The first interesting line
... 30 more interesting lines ...
10:22:25.772 Another interesting line
10:22:25.772 The last interesting line

Notes:

In a regex I often leave a period (.) unescaped when matching a literal period, as long as the looser match doesn’t matter, as it doesn’t here.

And for anyone thinking useless use of cat… it’s deliberate.


1. With impressive inconsistency, he later on says that Perl is a contender to be His Favourite Language, which is why the alternative title for this post was Why Mike Taylor is not my Favourite Blogger.

Just kidding, Mike.

