Feeds:
Posts
Comments

Posts Tagged ‘perl’

I wouldn’t get into google (their interviews don’t select for my brand of genius). But that’s fine as I wouldn’t be able to write perl there anyway.

I was trying to think of a few reasons why Google would have chosen Python over Perl as their primary scripting language. A number of the ones I came up with are less relevant than they used to be.

Python integrates better with C++

Boost.Python promises seamless interoperability between the two languages. It has been a while since I’ve been tempted to embed a scripting language in my C++ apps so I’m not sure what the counterpart is in Perl if any.

Python "threads"

Back in the day, these were better than the Perl offering. More recently, on Unix at least; with Coro, AnyEvent, Twisted, etc. it’s a wash.

Windows Implementation

A few years ago, the various Perl Windows implementations were not very high quality. Strawberry Perl has changed all that though. It’s awesome.

Jython et el.

Was Jython around when Google was choosing Python? I guess it was.

Perl (5) hasn’t escaped its VM as well as Python and Ruby. Having said that, the Jythons and JRubys of the world are always second class citizens compared to the original C VM as no-one can be bothered to rewrite all the C extensions.

Easier to avoid making a mess as a secondary language?

Experts in either language can write clean, efficient code. But what about folks who are expert in C++ and Java and use either Perl or Python on the side and only occasionally?

I think such usage for a large system will result in unwieldy code in either language, unless the large system is curated by experts, but for small to medium systems in the absence of experts, python may have the edge.

Rabid anti-Perlism

Even smart folks I know seem to think Perl is lacking in some way in comparison to Python. I haven’t managed to get them to enumerate any reasons though so I figure they are speaking from a position of ignorance. This could have been the case for the folks making the decisions for google too or, more likely, they may have considered some of the other reasons on my list.

Read Full Post »

Of the languages I know, emacs-lisp is unusual in that when you need to access the field of a structure, you need to know the name of the type1.

(defstruct person age name)
(defvar dave (make-person))
(setf (person-age dave) 20) ;; getter specifies the type!
(setf (person-name dave) "David Jones")
(message (person-name dave)) ;; -- David Jones

Notice I can’t say (age dave) here, it has to be (person-age ...).

The underlying reason for this is that structures are really vectors. (person-age object) translates into (aref object 1) which should hopefully be pretty fast. Another structure doesn’t need to keep its age member at an offset of 1, so I can’t say it is (generic-structure-age object).

In basic perl objects, because they are really just a hash reference, I don’t need to specify the type. When it is needed, the VM figures it out at runtime.

package Person;

sub new
{
    bless {}, $_[0];
}

package main;

my $dave = Person->new(); # Type specified here
$dave{'age'} = 20;        # but not here
$dave{'name'} = 'David Jones';

print $dave{'name'}, "\n";
$ perl t.pl
David Jones

In many statically typed languages, the fact that the compiler knows what the type is enables it to generate efficient code. I still don’t need to specify it explicitly.

#include 
#include 

using namespace std;

struct person
{
    int age;
    string name;
};

int main()
{
    person dave;

    dave.age = 20;             // No type explicitly specified here
    dave.name = "David Jones"; // or here

    cout << "Name: " << dave.name << "\n";
}
$ g++ t.cpp
$ ./a.exe
Name: David Jones
$

For objects created using the new version of defstruct* I’ve defunned a get-field and set-field that don’t require the type to be specified. Error checking is, as usual, elided

(defsubst get-index (object field)
  (cdr (assoc field (symbol-value (aref object 1)))))

(defun get-field (object field)
  (aref object (get-index object field)))

(defun set-field (object field value)
  (setf (aref object (get-index object field)) value))

(get-field dave 'name)
(set-field dave 'name "Simon Smith")

Although by this time we’re probably both wondering why I didn’t just use a hash like all the (other) scripting languages.


1. To be more accurate, you need to tell the computer what the type is. Obviously, if you want to do something useful with an object, you generally need to know what type it is anyway.

Read Full Post »

Following Nilson’s comment, I’ve updated my perl and python method call benchmarks to (hopefully) test what I actually intended to test.

I must admit, I’m not nearly as interested in method call times as I am in vanilla subroutine call times which could be why I made such a hash of it in the first place.

And I haven’t updated the iteration method as I’m not interested in loop speed (hence why I measured the base case) and that would make pointless extra work for me to do.

Read Full Post »

More Subroutine Benchmarking

Some folks requested a few more benchmarks. In this case, I’m happy to oblige. Perl does not come off well in these benchmarks.

Perl Code

package class;

sub new
{
    return bless {}, $_[0];
}

sub f1
{
}

sub f2
{
    my ($self, $x, $y) = @_;
    return ($x, $y);
}

sub f2a
{
    my $self = shift;
    my $x = shift;
    my $y = shift;
    return ($x, $y);
}

package main;

my $obj = class->new();

for ($i = 0; $i < 10_000_000; ++$i) {
    my ($x, $y) = $obj->f2(1, 2);
}

Python Code

class myClass:

    def f1(self):
        pass

    def f2(self, a, b):
        return a, b

x = myClass()

for i in xrange (1, 10000000):
    a,b = x.f2('hello', 'world')

Perl Results

$ perl -v

This is perl, v5.10.1 (*) built for i686-linux-thread-multi

$ time perl ./func.pl # f1()

real    0m5.052s
user    0m5.044s
sys     0m0.008s

$ time perl ./func.pl # f2(1, 2)

real    0m11.598s
user    0m11.585s
sys     0m0.000s

real    0m10.838s
user    0m10.833s
sys     0m0.000s

$ time perl ./func.pl # f2a(1, 2)

real    0m10.740s
user    0m10.713s
sys     0m0.004s

real    0m12.014s
user    0m11.993s
sys     0m0.012s

$ time perl ./func.pl # f2a('hello', 'world')

real    0m16.524s
user    0m16.505s
sys     0m0.008s

real    0m16.521s
user    0m16.489s
sys     0m0.000s

Python Results

$ time python ./func.py # f1()

real    0m3.840s
user    0m3.828s
sys     0m0.004s

$ time python ./func.py # f2(1, 2)

real    0m4.546s
user    0m4.504s
sys     0m0.040s

real    0m5.887s
user    0m5.860s
sys     0m0.016s

$ time python ./func.py # f2('hello', 'world')

real    0m4.548s
user    0m4.540s
sys     0m0.008s

real    0m5.907s
user    0m5.904s
sys     0m0.004s

Read Full Post »

For a long time now I have suspected that calling perl subroutines is slow. And I couldn’t figure out from the language shootout which benchmark tested subroutine calls (did it use to be Ackermann?) so I made up my own benchmark.

Perl in comparison to its closest rival – CPython.

I tested a few things:

First, a loop that does nothing, to see how much is loop overhead
Next a zero parameter function
Then a function called with two integers
And finally, declaring those two integers inline.

I’m assuming that an optimiser doesn’t come along and remove code that does nothing. From the results, it seems like a safe assumption.

And, I’m not running benchmarks multiple times or with many iterations because, frankly, I don’t care that much. I just want to get an idea as to how Perl stacks up.

Python Code

def f1():
    pass

def f2(a, b):
    pass # return a, b

for i in xrange (1, 10000000):
    pass
    # f1
    # f2(1, 2)
    # x,y=1,2

Python Results

 $ python -V
Python 2.6.5

$ time python ./func.py # (pass)

real    0m0.722s
user    0m0.720s
sys     0m0.004s

$ time python ./func.py # (x,y = 1,2)

real    0m2.030s
user    0m2.024s
sys     0m0.004s

$ time python ./func.py # (f1)

real    0m2.265s
user    0m2.244s
sys     0m0.012s

$ time python ./func.py # (f2 - pass)

real    0m2.885s
user    0m2.880s
sys     0m0.004s

$ time python ./func.py # (f2 - return a, b)

real    0m3.190s
user    0m3.144s
sys     0m0.012s

Perl Code

sub f1
{
}

sub f2
{
    my ($x, $y) = @_;
}

for ($i = 0; $i < 10_000_000; ++$i) {
    1;
    # f1();
    # f2(1, 2);
    # my ($x, $y) = (1, 2);
}

Perl Results

$ time perl ./func.pl # (1)

real    0m0.893s
user    0m0.888s
sys     0m0.004s

$ time perl ./func.pl # (f1)

real    0m2.932s
user    0m2.924s
sys     0m0.004s

$ time perl ./func.pl # (f2)

real    0m5.607s
user    0m5.580s
sys     0m0.004s

$ time perl ./func.pl # (my ($x, $y) ...)

real    0m2.687s
user    0m2.672s
sys     0m0.008s

Conclusions

A few things jump out at me.

1. 10 million subroutine calls take around a couple of seconds. Would reducing call speed actually affect any real program much?

2. Declaring the variables and assigning the values takes almost as long as the empty function call.

3. Python function calls are faster than perl function calls but it’s not by enough to worry about.


My next post should hopefully clarify why I was thinking about this.

Read Full Post »

ARCv2

So I’m perhaps 1% of the way to a poorly thought out middleware component like CORBA1. No, it’s more light-weight, maybe a messaging layer, sorry I mean wire-level protocol specification implementation such as AMQP.

And then I think (like hundreds have probably thought before me), you know, this would be more useful if it had authentication. After all, I don’t want just anyone to be able to send kill signals to any processes. That would be like everyone being root. Which terrifies me.

Don’t invent your own authentication mechanism

And the golden rule about authentication, as far as I can work out, is don’t invent your own authentication mechanism. You’ll get it wrong and leave gaping vulnerabilities for the bad guys to have their wicked way with you. That is, if anyone besides you ever uses your code. And besides, I don’t want to waste any of my 1500 lines on coming up with Yet Another Broken Authentication System.

A quick trip to CPAN

Then I’m looking through the Authen::XXX modules on CPAN and none of them behave in exactly the way I want. But somehow I find a perl server that includes authentication and perhaps does everything I want and I should definitely put it on my list of things to look into even though I’m having a lot of fun with AnyEvent right now.

But by the time I come to look again, I can’t find it. And I’ve complained about documentation before, but Emacs really does deserve it, and I know of no system or language that is better documented than Perl. But I guess the classification problem is a bit tricky to overcome.

ARCv2

Anyway, long story short, I found it.

authenticated perl server -http

An Authenticating Perl Server

The first link (warning PDF) is a paper about using Authen::SASL in client/servers and it mentions ARCv2 which sounds like what I’m looking for.

The first thing to do is find out if it does what I want. The second is to check if it works on Windows.


1. Ambiguity left in deliberately

Read Full Post »

This is part 1 in my occasional AnyEvent series

I’ve been playing around with Plack and Twiggy recently and that motivated me to take a look at AnyEvent, the eventing library that Twiggy is built upon.

Now, it seems to me that AnyEvent is useful in a similar problem-space to POE (or Coro, on Unix at least). POE has more documentation but AnyEvent can actually use a couple of Event Loops that were implemented in C: EV and libevent which is the backbone of the hugely successful memcached. Sounds good to me.

As usual, first things first. How do you make a simple TCP server? I took a look at how Twiggy does it – the code is available in Twiggy::Server. I won’t need everything in there of course.

The Preamble

I like the way that Twiggy sets a constant called DEBUG from an environment variable. Now I can call the script like this: $ SERVER_DEBUG=1 ./ae.pl and get debugging output.

#!/usr/bin/perl

use 5.010;

use strict;
use warnings;

package Server;

use constant CTRL_D => 4;
use constant DEBUG => $ENV{SERVER_DEBUG};

use IO::Handle;
use Socket qw(IPPROTO_TCP TCP_NODELAY);
use Errno qw(EAGAIN EINTR);

use AnyEvent;
use AnyEvent::Socket;
use AnyEvent::Util qw(WSAEWOULDBLOCK);

Basic Perl Objects

The watch variables only watch while they are in scope. If we have a simple object, we can dump them into the underlying blessed hash reference to keep ‘em in scope. I’m lamenting the non-standardness of Moose here, but that’s another post.

sub new
{
    my $class = shift;
    my $self = {};
    bless $self, $class;
    return $self;
}

Bottom-up or Top-down I’m never quite sure how to present my code. Hmmm…

AnyEvent::tcp_server

So, we’re assuming here that a server object will be called (with my $server Server->new()=) and then we will call $server->start_listen(...).

The example tcp_server call in the AnyEvent::Socket documentation rather unhelpfully demonstrates closing the socket the moment it has connected with a The internet is full, $host:$port. Go away! message. Oh well, maybe my examples are equally flawed. Thank goodness for Twiggy.

sub start_listen
{
    my ($self, $host, $port) = @_;

    $self->{server} = tcp_server($host,
                                 $port,
                                 $self->_accept_handler(),
                                 \&prepare_handler);
}

prepare_handler just logs a basic message on start-up. The accept handler returns a closure to maintain access to $self. The closure sets the socket options and then creates a watcher using watch_socket.

sub prepare_handler
{
    my ($fh, $host, $port) = @_;
    DEBUG && warn "Listening on $host:$port\n";
}

sub _accept_handler
{
    my $self = shift;

    return sub {
        my ($sock, $peer_host, $peer_port) = @_;

        DEBUG && warn "$sock Accepted connection from $peer_host:$peer_port\n";
        return unless $sock;

        setsockopt($sock, IPPROTO_TCP, TCP_NODELAY, 1)
            or die "setsockopt(TCP_NODELAY) failed: $!";
        $sock->autoflush(1);

        # my $socket = IO::Socket::INET->new_from_fd($sock, 'r+');
        # $socket->autoflush(1);
        # $socket->blocking(0);

        $self->watch_socket($sock);
    };
}

AnyEvent IO watcher

The watcher is setup to echo whatever it received, back to the sender. If it receives EOF (sent when a telnet client hits CTRL-D), then it terminates the connection.

Now, I didn’t manage to get this working immediately. If the watcher goes out of scope, it doesn’t end up watching anything. And I originally omitted the undef $headers_io_watcher statements. As the closure wasn’t referring to the watcher variable, it went out of scope immediately. Adding them added a reference which stopped that happening. Nice, if a little subtle.

sub watch_socket
{
    my ($self, $sock) = @_;

    my $headers_io_watcher;

    $headers_io_watcher = AE::io $sock, 0, sub {
        while (defined(my $line = <$sock>)) {
            $line =~ s/\r?\n$//;
            say "Received: [$line] " . length($line) . ' ' . ord($line);

            if (length($line) == 1 and ord($line) == CTRL_D) {
                print $sock "Received EOF.  Closing connection...\r\n";
                undef $headers_io_watcher;
            } else {
                print $sock "You sent [$line]\r\n";
            }
        }

        if ($! and $! != EAGAIN && $! != EINTR && $! != WSAEWOULDBLOCK ) {
            undef $headers_io_watcher;
            die $!;
        } elsif (!$!) {
            undef $headers_io_watcher;
            die "client disconnected";
        }
    };
}

main()

package main;

my $host = undef;
my $port = 12345;

my $server = Server->new();
$server->start_listen($host, $port);

AE::cv->recv();

And here is the result.

$ telnet localhost 12345
Trying ::1...
Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.
hello
You sent [hello]
Received EOF.  Closing connection...
Connection closed by foreign host.

Read Full Post »

Older Posts »

Follow

Get every new post delivered to your Inbox.