I’ve spoken about processing logfiles with perl previously. Occasionally though, I still reach for sed.
Say I have a logfile that looks like this:
[ <timestamp> ] : somefunc()
[ <timestamp> ] : interesting line 1
[ <timestamp> ] : interesting line 2
... 1000s of lines
[ <timestamp> ] : somefunc()
[ <timestamp> ] : interesting line 1
[ <timestamp> ] : interesting line 2
... 1000s of lines
Picking out lines following a pattern is easy with sed: p prints the current line and n moves on to the next line of input.
$ < log.txt sed -n '/somefunc()/ {p;n;p;n;p}'
[ <timestamp> ] : somefunc()
[ <timestamp> ] : interesting line 1
[ <timestamp> ] : interesting line 2
[ <timestamp> ] : somefunc()
[ <timestamp> ] : interesting line 1
[ <timestamp> ] : interesting line 2
My first attempt to replace that with perl looks a bit ugly
< log.txt \
  perl -ne 'if (/somefunc\(\)/) {print; print scalar(<>) for (1..2)}'
I’m not that happy with the module I came up with to hide the messiness either.
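package Logfiles;

require Exporter;
our @ISA       = qw(Exporter);
our @EXPORT_OK = qw(process);

use Carp;

# process { BLOCK } $regex, [ offsets ]
# If $_ matches $regex, call BLOCK on the lines at the given offsets
# (0 is the matching line). With no offsets, call BLOCK on the match only.
sub process (&$;$) {
    my ($sub, $regex, $lines) = @_;
    $lines ||= 0;
    return unless /$regex/;
    if (! $lines) {
        $sub->($_);
    }
    else {
        croak "process() arg 3 not ref ARRAY" unless ref($lines) eq 'ARRAY';
        my $line  = 0;
        my @lines = @$lines;    # offsets, assumed to be in ascending order
        while (1) {
            if ($lines[0] == $line) {
                $sub->($_);
                shift @lines;
            }
            ++$line;
            # Test for an empty list first so we never compare against undef.
            last if ((not @lines) or $line > $lines[0]);
            $_ = <>;
        }
    }
}

1;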
But at least my typical spelunking looks a little cleaner now.
< log.txt perl -MLogfiles=process -ne \
  'process { print } qr/somefunc\(\)/, [0..2]'
Any suggestions on how to improve this (without reverting to sed)?
perl -ne 'print if $a-- || ($a=/somefunc\(\)/*2)'
What I was trying to say in my previous comment (before WordPress “smartened” my code) was: $lines ||= 0; serves no point, since the very next thing you do with $lines is check it for falsehood. How about:
package S; # for "Sed"
use base 'Exporter';
our @EXPORT = ('p');
sub p # print ARG lines from stdin, starting with current
{
my $nlines = shift || return;
while (--$nlines) { print; $_ = <>; }
print;
}
Then at the command line:
perl -MS -ne '/somefunc\(\)/ && p(3)'
Crud. And the second line of the while block in the above function was supposed to be $_ = <>. Ah, HTML….
Hi folks,
Sorry about the WordPress code mangling feature. If it makes it any better, it annoys the heck out of me too.
@Mark – neat golfing, and definitely better than my initial attempt 🙂 I’m aiming for clearer rather than terser code (although terse is still important to compete with sed). Not sure if I got it.
@Sue – I’ve tried to add back the missing angle-brackets. Let me know if I’ve missed any.
Ah yes, $var ||= 0 is a reflex to avoid undefined warnings when I have warnings turned on.
And I definitely like the way you’re going with S. My thinking was that sometimes the very next lines aren’t the interesting ones. Maybe you want the 2nd and 3rd lines following a particular match. It could be that doesn’t happen much in practice mind you.