Thoughts on NaNoGenMo pt. 2
Continuing my overview of National Novel Generation Month, in which novel-length texts were computationally generated…
jiko’s project “Gen Austen” (hah!) produced several novels derived from Jane Austen (or in one case, Austen-related fanfic).
One uses trigrams with some part-of-speech tagging, one uses a numerological approach, two use dialogue-replacement algorithms (replacing the dialogue of one novel with the dialogue of another), and one is passed through an anagram generator to produce, basically, a list of anagrams. jiko also provides a list of resources he’s worked on (or just finds useful?).
Nick’s novel World Clock consists of a series of template-generated paragraphs, each describing a time in a location, a person in a place, and an action performed by that person.
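The dialogue-replacement idea is easy enough to sketch. What follows is my guess at the general shape, not jiko's code: the `replace_dialogue` helper and the `DIALOGUE` pattern are names I made up, and I'm assuming dialogue is marked with straight double quotes and that the donor text's lines get reused in a loop if the host has more dialogue than the donor.

```ruby
# Hypothetical sketch: swap each quoted span in `host` for the next quoted
# span found in `donor`. Assumes straight double quotes mark dialogue.
DIALOGUE = /"[^"]*"/

def replace_dialogue(host, donor)
  donor_lines = donor.scan(DIALOGUE)
  return host if donor_lines.empty? # nothing to swap in

  index = -1
  host.gsub(DIALOGUE) do
    index += 1
    # Cycle through the donor's dialogue if the host has more lines.
    donor_lines[index % donor_lines.length]
  end
end

host  = 'She said, "Hello." He replied, "Goodbye."'
donor = '"It is a truth," said he. "Universally acknowledged."'
puts replace_dialogue(host, donor)
```

The narration of the host novel stays put and only the speech changes, which is presumably where the fun comes from.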
For an added bit of class, the script outputs TeX, for easy pdf-ing.
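A template generator along these lines might look like the sketch below. To be clear, none of this is taken from World Clock: the word lists, the sentence template, the `clock_paragraph` and `to_tex` helpers, and the bare-bones TeX wrapper are all invented for illustration.

```ruby
# Hypothetical word lists; the real project's vocabulary is not reproduced here.
PLACES  = %w[Lisbon Nairobi Osaka].freeze
PEOPLE  = ["a baker", "a student", "a clockmaker"].freeze
ACTIONS = ["reads a letter", "looks out the window", "hums a tune"].freeze

# Fill one template: a time in a location, a person in a place, an action.
def clock_paragraph(rng)
  time   = format("%02d:%02d", rng.rand(24), rng.rand(60))
  place  = PLACES.sample(random: rng)
  person = PEOPLE.sample(random: rng)
  action = ACTIONS.sample(random: rng)
  "It is #{time} in #{place}. In a small room, #{person} #{action}."
end

# Wrap the paragraphs in a minimal LaTeX document (blank lines separate paragraphs).
def to_tex(paragraphs)
  "\\documentclass{book}\n\\begin{document}\n\n#{paragraphs.join("\n\n")}\n\n\\end{document}\n"
end

rng = Random.new(2013) # fixed seed so repeated runs produce the same text
puts to_tex(Array.new(3) { clock_paragraph(rng) })
```

Scale the paragraph count up and you have a novel-length file ready for a TeX toolchain.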
Next up is catseye, who developed a story engine that also powered dariusk’s entry.
catseye’s novel Dial S for Swallows discusses the interactions of two characters, an object, and a location. It reminds me a bit of the time I copy-pasted the output of a MUD (a text “multi-user dungeon”, kids) repeatedly until it was novel-length. Skimming the source code, it looks like it is indeed a high-level simulator running several agents through a world representation. Another thing worth looking through. catseye also has several very interesting thoughts on the process that are worth further study.
Looks like elib wrote a Twitter scraper that collated all the lines it found that began with “The way that…” Not bad. Works best as conceptual writing.
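The collation step, at least, is tiny. Here is a sketch that skips the scraping entirely and assumes the tweets are already in hand as an array of strings; `collect_lines` is a name I made up, not something from elib's code.

```ruby
# Hypothetical helper: keep every line (across all texts) that starts with
# the given prefix, after trimming surrounding whitespace.
def collect_lines(texts, prefix = "The way that")
  texts.flat_map(&:lines)
       .map(&:strip)
       .select { |line| line.start_with?(prefix) }
end

tweets = [
  "The way that cats sleep.\nNot this one.",
  "  The way that rain falls  ",
  "I like the way that works"
]
puts collect_lines(tweets)
```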
Looks like MichaelPaulukonis did several NLP-based transformations on texts, including named entity recognition swaps between texts. It looks like something more is going on, but it’s not clear what; need to look through the final text and the source code a bit more.
ianrenton is apparently using a spam-generation technique called Bayesian Poisoning (which adds common non-spam words to spam in an attempt to have it classified as non-spam, thereby rendering the classifier unreliable). It’s a great idea, since it’s likely to add to a text the kind of words you’d expect to see there (i.e. not spam-like). ianrenton produced a text using fanfic from fanfiction.net as a basis. I haven’t looked too closely through the source code, but it seems to work by collating sentences from different stories before performing any Bayesian Poisoning techniques.
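To make the poisoning idea concrete, here is a toy sketch, emphatically not ianrenton's code. The per-word probabilities in `SPAM_PROB` are numbers I invented, and `spam_probability` and `poison` are hypothetical helpers. The scorer combines word probabilities naive-Bayes style in log-odds space; padding a spammy message with innocuous words drags its score toward "not spam".

```ruby
# Invented values for P(spam | word); unseen words are treated as neutral (0.5).
SPAM_PROB = {
  "viagra" => 0.99, "free" => 0.90, "winner" => 0.95,
  "meeting" => 0.05, "thanks" => 0.10, "report" => 0.08
}.freeze

# Sum each word's log-odds of being spam, then convert back to a probability.
def spam_probability(text)
  log_odds = text.downcase.scan(/[a-z]+/).sum(0.0) do |word|
    prob = SPAM_PROB.fetch(word, 0.5)
    Math.log(prob / (1.0 - prob))
  end
  1.0 / (1.0 + Math.exp(-log_odds))
end

# "Poison" a message by appending words the filter considers harmless.
def poison(text, ham_words)
  "#{text} #{ham_words.join(' ')}"
end

spam     = "Free viagra winner"
poisoned = poison(spam, %w[meeting thanks report meeting report])
puts spam_probability(spam)
puts spam_probability(poisoned)
```

The second score comes out far lower than the first, which is the whole trick: the padding words are exactly the ordinary vocabulary you would want in a generated text anyway.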
lilinx is pulling all sentences with the word hit/hurt from a French translation of Homer’s Iliad, and reversing their order. My French isn’t very good, but it’s a great idea. Script, notes, and output.
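The filter-and-reverse step could be sketched like this. It is not lilinx's script: `reversed_matches` is a hypothetical helper, the sentence splitting is a naive split on end punctuation, and I'm using English keywords in the demo because I don't know which French words the original matched on.

```ruby
# Hypothetical helper: keep sentences containing a word that starts with any
# keyword, then return them in reverse order of appearance.
def reversed_matches(text, keywords)
  return [] if keywords.empty?

  sentences = text.split(/(?<=[.!?])\s+/) # naive sentence splitter
  pattern   = /\b(?:#{keywords.map { |w| Regexp.escape(w) }.join('|')})/i
  sentences.select { |sentence| sentence.match?(pattern) }.reverse
end

text = "Ajax hit the shield. The ships waited. Hector was hurt. Night fell."
puts reversed_matches(text, %w[hit hurt])
```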
Then there’s ikarth’s “Gutenberg shuffle” approach. Described as: “Take a collection of texts from Gutenberg, generate a list of all of the names you find. Swap the names around. Shuffle the sentences. Shuffle and redeal for every paragraph.” Here’s the code, which allegedly uses OpenNLP, and the resulting novel.
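Going only by that description, a rough Ruby sketch might look like the following. It is not ikarth's code: I'm assuming the list of names has already been extracted (the real project reportedly leans on OpenNLP for that), I'm reading "redeal for every paragraph" as drawing a fresh name mapping per paragraph, and `gutenberg_shuffle` is a name I made up.

```ruby
# Hypothetical sketch: for each paragraph, redeal which name stands in for
# which, shuffle the sentences, and apply the name swap.
def gutenberg_shuffle(paragraphs, names, rng)
  name_pattern = Regexp.union(names) # matches any of the known names
  paragraphs.map do |paragraph|
    mapping   = names.zip(names.shuffle(random: rng)).to_h # original => stand-in
    sentences = paragraph.split(/(?<=[.!?])\s+/)           # naive sentence splitter
    sentences.shuffle(random: rng)
             .map { |sentence| sentence.gsub(name_pattern, mapping) }
             .join(" ")
  end
end

paragraphs = [
  "Alice met Bob. Bob smiled. Carol left.",
  "Carol wrote to Alice. Alice replied."
]
puts gutenberg_shuffle(paragraphs, %w[Alice Bob Carol], Random.new(1))
```

Plain string matching like this will trip over names that contain other names, so a real version would want the tagger's token boundaries.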
Strand’s “simple solution” is
puts "Buffalo buffalo Buffalo buffalo buffalo buffalo Buffalo buffalo. " * 6250
ah ha ha ha ha…. more conceptualism. or is it flarf? or just good ol’ fashioned lulz? I like the sentence, but… I dunno…
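The number 6250 isn't arbitrary, by the way: the sentence is eight words long, and 8 × 6250 lands exactly on NaNoGenMo's 50,000-word minimum. A quick check (the `word_count` helper is mine, and it counts whitespace-separated tokens):

```ruby
# Count whitespace-separated tokens in a text.
def word_count(text)
  text.split.length
end

sentence = "Buffalo buffalo Buffalo buffalo buffalo buffalo Buffalo buffalo. "
puts word_count(sentence)        # 8
puts word_count(sentence * 6250) # 50000
```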
jonkagstrom’s Abscission is deep. Apparently, the approach is to part-of-speech tag Kafka’s Metamorphosis, then modify each paragraph according to a series of transformations directed by WordNet. Pretty awesome on a number of levels. Here’s the output and the code.