In an endnote to [The Virginia Woolf Poems], Jackson [Mac Low] explained the “diastic” or “spelling-through” technique he had used in writing the poems. The process began with a striking phrase from Virginia Woolf’s The Waves: “ridiculous in Piccadilly.” He reread the novel, looking for the first word that, like “ridiculous,” began with an r; then read the next word following that had (like “ridiculous”) i as its second letter; then the next whose third letter was d; and so on until he had “spelled through” the whole phrase. (There were other rules for line breaks, punctuation, and so on.) The resulting text would be made entirely out of Woolf’s words but would have none of the usual English syntax.
– Charles O. Hartman, “Virtual Muse: Experiments in Computer Poetry”
I don’t know much about the details of the algorithm (like the “rules for line breaks, punctuation, and so on”), so I parameterized a lot of stuff.
- the direction the text is read through – can be forward, backward, or in a random order (shuffled)
- when going through the characters in the seed text, you can look at all characters, or just vowels, or vowels plus approximants (l, r, w, y)
- when matching characters in input text words with characters in the seed text, you can ignore case (upper- vs lower-) or not
- you can add a newline after every word in the seed text. You can also add a newline every time the algorithm fails to find a matching word for a seed character (i.e. after cycling through the Input Text once without a match)
- you can decide to “cycle through” the input text looking for words (i.e. start at the beginning of the input text once you reach the end), or just read through once
- you can append to the output text (for multiple read-throughs with multiple Input Texts) or clear all the TextAreas after every reading
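Gathered up, the knobs above might look something like this. This is just an illustrative sketch; all the names here are my own invention, not the actual implementation:

```python
from dataclasses import dataclass

@dataclass
class DiasticConfig:
    # hypothetical parameter names for the options listed above
    direction: str = "forward"         # "forward", "backward", or "shuffled"
    char_class: str = "all"            # "all", "vowels", or "vowels+approximants"
    ignore_case: bool = True           # match 'A' against 'a'
    newline_per_seed_word: bool = True     # newline after each seed word
    newline_on_failure: bool = False       # newline after a full fruitless pass
    cycle_input: bool = True           # wrap to the start of the input text
    append_output: bool = False        # keep output across read-throughs
```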
The core loop, roughly:

    for each character at position p of the current seed-text word:
        for each word w in the input text, in least-recently-read order:
            if w's character at position p equals the seed text's character at position p:
                output w and move on to the next seed character
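Here’s a minimal Python sketch of that loop under my reading of the algorithm: matching is case-insensitive, position p is counted within the current seed word, and the line-break and punctuation rules are omitted. Function and variable names are mine:

```python
def diastic(seed, input_words, cycle=True):
    """Spell through `seed` using words drawn from `input_words`.

    For each character of each seed word, scan forward through the
    input for the next word whose character at the same within-word
    position matches (ignoring case).
    """
    out = []
    i = 0                    # running read position in the input word list
    n = len(input_words)
    for word in seed.split():
        for p, ch in enumerate(word):
            # scan at most one full pass of the input for a match
            for _ in range(n):
                if not cycle and i >= n:
                    return out       # read through once, then stop
                w = input_words[i % n]
                i += 1
                if len(w) > p and w[p].lower() == ch.lower():
                    out.append(w)
                    break
    return out
```

The `cycle` flag corresponds to the read-through-once option in the parameter list above.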
Plus, I wanted to see what the diastic algorithm is all about. Basically you’re building a unigram language model and selecting words from it based on surface features of a seed text. The surface features are fairly arbitrary: character n of an output word must be the same as character n of the corresponding seed word. It seems like the distribution of interpretations allowed by the poems the algorithm produces is mostly constrained by the lexical features of the input text. The seed text doesn’t seem to contribute much semantically or syntactically; basically it just guarantees that each word in the output has at least one character identical, and in the same position, to a word in the seed text.
Of course, I can use this to mess around a little. Let’s say you use a Seed Text like:
then on Ode to a Nightingale you get the output:
youth eyed beyond Away
on Poesy Though throne
upon But fruit thou
So you can kinda control the sounds of the output text. But I wasn’t sure how else this could be used, which is why I added a parameter to match only by vowels, or by vowels and approximants (i.e. the algorithm ignores the consonants in the seed text). For a vowels-only seed text on the default inputs you get:
Moon winding incense eglantine
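The vowels-only option, as I think of it, just filters which seed positions the matcher is allowed to look at; consonant positions in the seed are skipped entirely. A sketch of that filter (the helper and constant names are mine):

```python
VOWELS = set("aeiou")
APPROXIMANTS = set("lrwy")

def positions_to_match(word, charset=VOWELS):
    """Return the (position, character) pairs of `word` that the
    matcher should consider; positions whose character is not in
    `charset` are simply skipped."""
    return [(p, c) for p, c in enumerate(word.lower()) if c in charset]
```

So for a seed word like “moon”, only the two o-positions constrain the output; passing `VOWELS | APPROXIMANTS` widens the class to include l, r, w, y.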
I also wondered whether doing the selection by reading through the text from beginning to end would add some kind of discourse structure. This is complicated by the fact that the algorithm cycles through the input, so I implemented a parameter that turns off the cycling and just ‘reads through’ once. Without cycling, on the default inputs you get:
And fly dull
leaves mid soft names
become death tread down
clown song lands
Along the same lines of trying to eke out a discourse pattern, I parameterized the ability to consider (or not) upper/lower-case differences. The idea was that maybe you’d have output verses beginning with capitalized words.
I’d need to run a lot more data to see how these various parameters affect the output. I’m still trying to figure out a good experiment (exploratory or confirmatory) for this sort of thing – I’m thinking I could generate a number of poems on a number of sources, evaluate them via questionnaire and short-essay on Amazon Mechanical Turk, then look for significantly different questionnaire values and/or word distributions.
I guess what I like about the diastic algorithm is that it is deterministic, not random. Though I spoiled this by adding a parameter that permits testing the words in a random order. This brings it one step closer to straight random selection from a unigram model (which is itself equivalent to the Dada word-level cut-up technique.)
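For comparison, the limiting case (straight random selection from the unigram model, i.e. the Dada word-level cut-up) is only a few lines. This is my own sketch, not anything from the diastic implementation:

```python
import random

def dada_cutup(text, k=10, seed=None):
    """Word-level cut-up: draw k words uniformly from the bag of
    words in `text`, i.e. sample from its empirical unigram
    distribution. `seed` makes the draw reproducible."""
    rng = random.Random(seed)
    words = text.split()
    return [rng.choice(words) for _ in range(k)]
```

Every word here is selected independently of both its neighbors and any seed text, which is exactly what the diastic constraint (and determinism) otherwise prevents.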
What I don’t like about the diastic algorithm is that the character-position features which determine sampling from the Input Text have little or no semantic or syntactic value; at most they show some phonemic match, distributed over several words, but this is noisy due to English’s phonemic/graphemic mismatch.
I like the whole idea of using a text to build a model, then using a query text to guide generation. It’s analogous to a dialogue system, where you have an input utterance and a knowledge base to draw an answer from. (Of course, dialogue research now acknowledges there’s a lot more going on than that, like nonverbal behavior, disfluencies, etc.) It’s also analogous to question-answering / information-extraction approaches, which I know less about. Jackson Mac Low didn’t have a CS or math background, so it’s not surprising he used a simple character-matching algorithm. I really need to look into QA/IE approaches; it seems like I might be able to use them to extract semantic or discourse features, which could then be used to generate poems whose ranges of readings seem more intentionally inspired.
| default Input text from: | default Seed text from: |
| --- | --- |
| Ode to a Nightingale | Suicidal Thoughts |
| John Keats | Biggie Smalls |
| “last of the great Romantic English poets” | “the Greatest Of All Time” |
| dead of tuberculosis at age 25 | dead of gunshot wounds at age 24 |
“the most incongruous ideas in the most uncouth language”