form: codework pseudohaiku

February 1, 2011


Jan. 30, 2011: selected from the output of Perl scripts that identify overlapping substrings in a list of words taken from lyrics by Joy Division, Nine Inch Nails, and Suicidal Tendencies.

So a while back I noted where Florian Cramer quoted a line by Mez Breeze:

“::Art.hro][botic][scopic N.][in][ten][dos][tions::”

Now the thing I like about that is it just looks cool, like a Woodring Jiva, or a Sagrada Familia / Watts Towers-style building. Well, sideways. But then I thought: there ought to be three of them! Kinda like haiku! Machine-generated, of course!

So I define Form 6559f09a-681f-4ec7-a7fb-c0529730cbcc (codework pseudohaiku), which is made up of three ordered sets of strings, where each string can be combined with one or both of its neighbors to produce a word. The first and last sets have 5 strings, the middle one has 7. The strings are separated by punctuation, and optionally the sets have punctuation at their beginning and end. Bonus points for generating it from a corpus.
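
The definition above can be sketched as a small checker. This is only my reading of the form, not a reference implementation: `split_line`, `candidate_words`, `is_valid_line`, and `is_pseudohaiku` are hypothetical names, the vocabulary is whatever word set you pass in, and treating every non-letter as a separator is an assumption.

```python
import re

def split_line(line):
    """Split a line such as 'ne][on][ly][can][tor' into its strings.
    Assumption: any run of non-letters counts as separating punctuation."""
    return [p.lower() for p in re.split(r"[^A-Za-z]+", line) if p]

def candidate_words(strings, i):
    """Everything string i can spell with one or both of its neighbors."""
    out = []
    if i > 0:
        out.append(strings[i - 1] + strings[i])
    if i < len(strings) - 1:
        out.append(strings[i] + strings[i + 1])
    if 0 < i < len(strings) - 1:
        out.append(strings[i - 1] + strings[i] + strings[i + 1])
    return out

def is_valid_line(strings, length, vocabulary):
    """True if the line has `length` strings and each one makes a word."""
    if len(strings) != length:
        return False
    return all(
        any(word in vocabulary for word in candidate_words(strings, i))
        for i in range(len(strings))
    )

def is_pseudohaiku(lines, vocabulary):
    """Three lines of 5, 7, and 5 strings, each line valid."""
    return len(lines) == 3 and all(
        is_valid_line(split_line(line), n, vocabulary)
        for line, n in zip(lines, (5, 7, 5))
    )
```

For example, `is_valid_line(split_line("ne][on][ly][can][tor"), 5, {"neon", "only", "lycan", "cantor"})` checks a single 5-string line against a four-word vocabulary.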

I started out hand-authoring one (i.e. just thinking up words and consulting a dictionary) to get a feel for the form:

ne][on][ly][can][tor...

So, you see, the first line is made up of strings that combine to make the words “neon”, then “only”, then “lycan”, then “cantor”. This is an example of a “strict” pseudohaiku, where each string with two neighbors is used to make exactly two words. Compare that to the “normal” pseudohaiku example below, where the string “rrib” has two neighbors but is only used to make one word, “terrible”.
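
The strict/normal distinction can be sketched in a few lines. I'm assuming here that "strict" boils down to every adjacent pair of strings spelling a word, and that "terrible" splits as te + rrib + le; `is_strict` and `word_count` are hypothetical helper names.

```python
def is_strict(strings, vocabulary):
    """Assumed reading of 'strict': every adjacent pair spells a word,
    so each string with two neighbors makes one word on each side."""
    return all(a + b in vocabulary for a, b in zip(strings, strings[1:]))

def word_count(strings, i, vocabulary):
    """How many vocabulary words string i forms with its neighbors,
    counting left pair, right pair, and the three-string span."""
    candidates = []
    if i > 0:
        candidates.append(strings[i - 1] + strings[i])
    if i < len(strings) - 1:
        candidates.append(strings[i] + strings[i + 1])
    if 0 < i < len(strings) - 1:
        candidates.append(strings[i - 1] + strings[i] + strings[i + 1])
    return sum(1 for word in candidates if word in vocabulary)

vocabulary = {"neon", "only", "lycan", "cantor", "terrible"}
strict_line = ["ne", "on", "ly", "can", "tor"]
normal_fragment = ["te", "rrib", "le"]  # assumed split of "terrible"

print(is_strict(strict_line, vocabulary))
print(is_strict(normal_fragment, vocabulary))
print(word_count(normal_fragment, 1, vocabulary))
```

With this vocabulary the first line passes the strict check, while the fragment around "rrib" does not: "rrib" only contributes to the single three-string word.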


The pseudohaiku immediately above uses words from Shakespeare’s sonnets combined with the list of high-arousal words I found earlier. The pseudohaiku below is just from Shakespeare’s sonnets.


I use a Perl script to read a set of words and identify sequences that will make up an appropriate line. I run the script a couple of times, then select from the output and add punctuation.
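
To give a rough idea of that kind of search, here is a minimal Python sketch. It is not the Perl script itself: it only finds "strict" lines, it forbids a string from repeating within a line (my assumption), and `build_edges` and `strict_lines` are hypothetical names. Every word is cut at every position; a cut into `a` + `b` means string `b` may follow string `a`.

```python
from collections import defaultdict

def build_edges(words):
    """Map each prefix to the suffixes that complete it into a word."""
    edges = defaultdict(set)
    for word in words:
        for k in range(1, len(word)):
            edges[word[:k]].add(word[k:])
    return edges

def strict_lines(words, length):
    """Yield every chain of `length` strings in which each adjacent
    pair spells a word from `words` (depth-first search)."""
    edges = build_edges(words)

    def extend(path):
        if len(path) == length:
            yield list(path)
            return
        for nxt in sorted(edges.get(path[-1], ())):
            if nxt not in path:  # assumption: no repeated strings
                yield from extend(path + [nxt])

    for start in sorted(edges):
        yield from extend([start])

for line in strict_lines(["neon", "only", "lycan", "cantor"], 5):
    print("][".join(line))
```

On a real corpus this produces a lot of candidates, which is where the selecting by hand comes in.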

Some thoughts:

  • An interesting thing about forms like codework pseudohaiku is that they constrain what might otherwise be an infinite search space. So there are a finite number of pseudohaiku that could be generated from a given corpus; I could conceivably identify them all using exhaustive search.
  • For a given corpus of limited size, I could probably generate a graph representation of all possible pseudohaiku (and maybe even write a JanusNode generator).
  • Or I could constrain the form further by doing part-of-speech tagging and mandating a certain sequence of part-of-speech classes per line.
  • Or I could constrain the form even further by mandating that the word sequences be bigram pairs.
  • Or I could generate from a continually updated corpus from the web: this would make the search space infinite again, but would allow a program that perpetually searches for novel pseudohaiku and posts them online as it finds them.
  • eRoGK7 has sworn to get me the 6x6x6 corpus; I could generate pseudohaiku from that. Beware!
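
The finite-search-space and graph ideas from the first two bullets can be sketched together. Assuming the "strict" variant only, and allowing a string to repeat within a line, the graph has the strings as nodes and an edge from `a` to `b` whenever `a` + `b` is a word; the number of possible lines is then the number of walks of a given length, which can be counted without listing them. `build_graph` and `count_lines` are hypothetical names.

```python
from collections import defaultdict

def build_graph(words):
    """Edge a -> b whenever the concatenation a + b is a word."""
    graph = defaultdict(set)
    for word in words:
        for k in range(1, len(word)):
            graph[word[:k]].add(word[k:])
    return graph

def count_lines(graph, length):
    """Count walks that visit `length` strings (repeats allowed)."""
    nodes = set(graph) | {b for targets in graph.values() for b in targets}
    # ways[s] = number of walks of the current size that end at string s
    ways = {s: 1 for s in nodes}
    for _ in range(length - 1):
        nxt = defaultdict(int)
        for a, targets in graph.items():
            for b in targets:
                nxt[b] += ways.get(a, 0)
        ways = nxt
    return sum(ways.values())

graph = build_graph(["neon", "only", "lycan", "cantor"])
fives = count_lines(graph, 5)
sevens = count_lines(graph, 7)
# Lines are chosen independently, so 5-7-5 combinations multiply.
print(fives, sevens, fives * sevens * fives)
```

This counts string sequences only; the punctuation, which I add by hand, is left out.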

So much fun, so little time…
