
Some time ago, I wrote about batch processing text by running it through an external process such as Perl. Emacs Lisp itself also has a lot of functionality for manipulating text, so the same kind of job can be done without leaving the editor.

The Problem

I have a file like this:

John James,Admin,other data,...
Dave Jones,Sales,...
Lisa Sims,IT,...

I want to convert it into the following [1]:

AND name IN ("Dave Jones", "John James", "Lisa Sims")
AND dept IN ("Admin", "IT", "Sales")

The Solution

First of all, I need a helper function that converts a Lisp list of strings into a single string of quoted, comma-separated values.

(defun make-csv (seq)
  (mapconcat (lambda (e) (format "\"%s\"" e)) seq ", "))
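As a quick check of the helper, evaluating it on a small list (with M-: or in the *scratch* buffer, say) shows the quoting and the separator. The definition is repeated here only so the snippet can be evaluated on its own; the sample names are taken from the file above.

```elisp
;; Repeated from above so this snippet can be evaluated standalone.
(defun make-csv (seq)
  (mapconcat (lambda (e) (format "\"%s\"" e)) seq ", "))

;; Each element is wrapped in double quotes and joined with ", ".
(make-csv '("Dave Jones" "John James"))
;; => "\"Dave Jones\", \"John James\""

;; An empty list yields an empty string.
(make-csv '())
;; => ""
```

Note that the helper does not escape double quotes that appear inside an element, so it is only suitable for data you trust.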

And then I can iterate over the text with re-search-forward, collecting the matched strings. At the end, I’ll output the collected strings in a SQL clause fragment.

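;; NB: this name shadows Emacs's built-in `process-lines'; rename it
;; if you keep the definition loaded.
(defun process-lines (&optional begin end)
  "Collect names and departments from the region and insert SQL clauses."
  (interactive "r")
  (goto-char begin)
  (let (names depts)
    ;; Anchor at the start of a line and exclude newlines so that only
    ;; the first two fields of each line are captured.
    (while (re-search-forward "^\\([^,\n]+\\),\\([^,\n]+\\)" end t)
      (push (match-string 1) names)
      (push (match-string 2) depts))
    ;; Insert after the region rather than in the middle of the last line.
    (goto-char end)
    (insert (format (concat "\n"
                            "AND name IN (%s)\n"
                            "AND dept IN (%s)\n")
                    (make-csv (sort names #'string-lessp))
                    (make-csv (sort depts #'string-lessp))))))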
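To experiment with the collecting loop outside an interactive session, here is a minimal sketch that runs the same kind of search over a string in a temporary buffer. The function name my-collect-fields is hypothetical (it is not part of the post), and the pattern used here is a variant anchored to the start of each line, which is an assumption about what the data looks like: one record per line, with the name and department in the first two fields.

```elisp
;; Hypothetical helper (not from the post): collect the first two
;; comma-separated fields of every line in TEXT.
(defun my-collect-fields (text)
  "Return (NAMES . DEPTS) for the lines in TEXT, each list sorted."
  (with-temp-buffer
    (insert text)
    (goto-char (point-min))
    (let (names depts)
      ;; ^ anchors each match to the start of a line, and excluding
      ;; \n keeps a field from running onto the next line.
      (while (re-search-forward "^\\([^,\n]+\\),\\([^,\n]+\\)" nil t)
        (push (match-string 1) names)
        (push (match-string 2) depts))
      (cons (sort names #'string-lessp)
            (sort depts #'string-lessp)))))

(my-collect-fields
 "John James,Admin,other data\nDave Jones,Sales,x\nLisa Sims,IT,y\n")
;; => (("Dave Jones" "John James" "Lisa Sims") "Admin" "IT" "Sales")
```

Returning the two lists instead of inserting text makes the loop easy to check on its own; the formatting step can then be applied to the result separately.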

1. Okay, you got me, I don’t really want to convert it into this. But for the purpose of the example, this will do. Exercise for the reader – how can I convert it into SQL that will efficiently extract just the lines I want?
