Break Bear Presenteth: POS Sonnet line templates
Injury everywhere to it, but her to short-number wretchcd,
Beyond beck pretty enough out their decrees love advance,
Kind sober sure pilgrimage which speak my place thou,
Either maladies, rise where-through, effectually raven damask
To conspire these mask which down quenched elsewhere!
Warm saucy flown, within the break bear presenteth,
Why over-plus famoused i wherever it built her wonder,
It break builded graves remov hast, dull yet plain,
Up look’ paper to living deaths?
Worth itself may tell wanton her whom owe barrenly being.
Ere ersways we though before me save through half gor,
My return, no mind to her reason,
Thereby will i touch to tis why them feel achieve disdaineth —
Towards monsters,’ll reckon no tame influence —
Purpos so a silver nor advantage toward raiment,
August 23, unsupervised template generation from POS tagging, source text: Shakespeare’s Sonnets, generator: JanusNode.
Hi guys! Sorry I haven’t posted in a while. Mood swings, you know.
Anyways, the poem above is something I’ve wanted to do ever since I came across JanusNode. I had a part-of-speech (POS) tagged file of Shakespeare’s Sonnets from when I was doing POS-based n-grams with epogees. What that means is that I took the Project Gutenberg copy of the sonnets and put it through the Stanford part-of-speech tagger to tag each word with its part of speech. Anyways, each line of the POS-tagged file looked like this:
From/IN fairest/JJS creatures/NNS we/PRP desire/VBP increase/NN ,/,
where tags like IN and JJS are POS tags (preposition and superlative adjective). So I just wrote a Java program to do a couple of things. First, it looked through the whole file and collected every word for each POS class. For example, here are all of the words in Shakespeare’s Sonnets that have the JJS class tag:
(Note that “lest” is not actually a superlative adjective! Still, the POS accuracy is pretty good overall…) Then it wrote them to a JanusNode “BrainFood” file. For example, there’s a file called “e_shkpos_JJS” (edde_shakespearePartOfSpeech_TAG) that contains the words above.
Next, it extracted the sequence of POS tags for each line. In the example above, this would be:
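Here's a rough sketch of what that first word-collecting pass might look like. This is not the original program, just a minimal reconstruction assuming the `word/TAG` token format shown above; the class name `PosWordLists` and method name `collect` are made up for illustration. It stops at building the per-tag word lists, since the exact layout of a BrainFood file isn't described here.

```java
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical helper (not the original code): groups words by POS tag.
class PosWordLists {

    // Takes lines of "word/TAG" tokens and returns tag -> distinct words.
    static Map<String, Set<String>> collect(List<String> taggedLines) {
        Map<String, Set<String>> wordsByTag = new TreeMap<>();
        for (String line : taggedLines) {
            for (String token : line.trim().split("\\s+")) {
                // Split on the LAST slash so a token like ",/," still parses.
                int slash = token.lastIndexOf('/');
                if (slash <= 0) {
                    continue; // empty or malformed token, skip it
                }
                String word = token.substring(0, slash);
                String tag = token.substring(slash + 1);
                wordsByTag.computeIfAbsent(tag, t -> new TreeSet<>()).add(word);
            }
        }
        return wordsByTag;
    }

    public static void main(String[] args) {
        List<String> lines = List.of(
            "From/IN fairest/JJS creatures/NNS we/PRP desire/VBP increase/NN ,/,");
        // Each entry here would become one word-list file such as e_shkpos_JJS.
        collect(lines).forEach((tag, words) -> System.out.println(tag + " -> " + words));
    }
}
```

Run over the whole tagged file, each map entry corresponds to one of the per-tag word lists.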
IN JJS NNS PRP VBP NN ,
After it had this, it would translate the sequence into a line of JanusNode code such as:
100 <CapitalizeNext() 100>e_shkpos_IN 100 e_shkpos_JJS 100 e_shkpos_NNS 100 e_shkpos_PRP 100 e_shkpos_VBP 100 e_shkpos_NN 100 e_shkpos_COMMA 100 return 100
(You can ignore the “100”s, they just mean that there’s a 100% chance this will happen if the line/word is encountered.) So basically the line just means: pick a word from each of the BrainFood files that contains the appropriate list of POS-tagged words.
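The tag-extraction and translation steps could be sketched roughly like this. Again, this is a reconstruction and not the original program: the output string just mirrors the one example line shown above, so don't take it as a reference for JanusNode syntax. The `COMMA` name comes from that example; the `PERIOD` and `COLON` names for other punctuation tags are my guesses, and the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not the original code): tagged line -> template line.
class TemplateLine {
    static final String PREFIX = "e_shkpos_";

    // Pulls just the tags out of a line of "word/TAG" tokens.
    static List<String> tagSequence(String taggedLine) {
        List<String> tags = new ArrayList<>();
        for (String token : taggedLine.trim().split("\\s+")) {
            int slash = token.lastIndexOf('/');
            if (slash >= 0) {
                tags.add(token.substring(slash + 1));
            }
        }
        return tags;
    }

    // Maps a tag to a word-list name. Only "," -> COMMA is taken from the
    // example in the post; the other punctuation names are assumptions.
    static String fileNameFor(String tag) {
        switch (tag) {
            case ",": return PREFIX + "COMMA";
            case ".": return PREFIX + "PERIOD"; // assumed name
            case ":": return PREFIX + "COLON";  // assumed name
            default:  return PREFIX + tag;
        }
    }

    // Builds a template line shaped like the single example shown in the post.
    static String toTemplate(List<String> tags) {
        StringBuilder sb = new StringBuilder("100 <CapitalizeNext() 100>");
        for (String tag : tags) {
            sb.append(fileNameFor(tag)).append(" 100 ");
        }
        return sb.append("return 100").toString();
    }

    public static void main(String[] args) {
        String line = "From/IN fairest/JJS creatures/NNS we/PRP desire/VBP increase/NN ,/,";
        List<String> tags = tagSequence(line);
        System.out.println(String.join(" ", tags)); // prints: IN JJS NNS PRP VBP NN ,
        System.out.println(toTemplate(tags));
    }
}
```

For the sample line, the second `println` reproduces the JanusNode line quoted above.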
So this was done to each line of the Shakespearean sonnets, producing a file with 2154 lines of JanusNode commands. Each time the file is run, one of those lines is selected at random and used to produce a line of poetry.
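JanusNode does the actual picking, but the effect can be imitated in plain Java to see what's going on. The sketch below is an approximation under my own assumptions, not JanusNode's real behavior: it picks one tag sequence at random, picks one random word per tag, glues punctuation onto the previous word, and capitalizes the first letter. `TemplateFiller`, `generateLine`, and the tiny word lists in `main` are all made up for illustration.

```java
import java.util.List;
import java.util.Map;
import java.util.Random;

// Hypothetical stand-in for what the generator does with the template file.
class TemplateFiller {

    static String generateLine(List<List<String>> templates,
                               Map<String, List<String>> wordsByTag,
                               Random rng) {
        // Pick one line template (a tag sequence) at random.
        List<String> tags = templates.get(rng.nextInt(templates.size()));
        StringBuilder line = new StringBuilder();
        for (String tag : tags) {
            List<String> candidates = wordsByTag.get(tag);
            if (candidates == null || candidates.isEmpty()) {
                continue; // no words known for this tag, skip the slot
            }
            String word = candidates.get(rng.nextInt(candidates.size()));
            // Tokens that don't start with a letter or digit (",", "'ll")
            // are attached to the previous word without a space.
            boolean attach = !Character.isLetterOrDigit(word.charAt(0));
            if (line.length() > 0 && !attach) {
                line.append(' ');
            }
            line.append(word);
        }
        if (line.length() > 0) {
            line.setCharAt(0, Character.toUpperCase(line.charAt(0)));
        }
        return line.toString();
    }

    public static void main(String[] args) {
        // Toy data only; the real lists hold every word the tagger saw per tag.
        Map<String, List<String>> words = Map.of(
            "IN", List.of("from", "beyond", "towards"),
            "JJS", List.of("fairest", "sweetest"),
            "NNS", List.of("creatures", "graves", "monsters"),
            ",", List.of(","));
        List<List<String>> templates = List.of(
            List.of("IN", "JJS", "NNS", ","),
            List.of("JJS", "NNS", "IN", "NNS", ","));
        Random rng = new Random();
        for (int i = 0; i < 3; i++) {
            System.out.println(generateLine(templates, words, rng));
        }
    }
}
```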
And there you have it, a quick part-of-speech template-based generator. The way it handles apostrophes and capitalization could be cleaned up a bit. But this will do for the moment, because, you know, the poetry it produces isn’t very good. This is something I found when using POS-based n-gram models: a lot of the coherence you get from word bigrams is lost because instead of using a noun/verb combination (for example) that the original author chose at some point, you’re using potentially any noun and any verb that the original author used, even if that author would never put those words next to each other.
But anyways, if I’d parsed the original text instead of just POS-ing it, I could have had a quick-n-dirty grammar-based generator; that would be the next step if I was interested in continuing this, which I don’t think I am. So anyways, if you want to have your JanusNode poems use Shakespeare’s adjectives or something, you can download the JanusNode files I wrote. OK, later.
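To put a rough number on that loss of coherence, one could count how many adjacent word pairs for a given tag pair actually occur in the text versus how many a template is free to produce. This is just an illustrative sketch I'm adding, not something that was run on the sonnets; `PairCoverage` is a made-up name and the two sample lines in `main` are toy input.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration: attested word pairs vs. pairs a template allows.
class PairCoverage {

    // Returns {attested, possible} for adjacent (firstTag, secondTag) words.
    static long[] coverage(List<String> taggedLines, String firstTag, String secondTag) {
        Set<String> firstWords = new HashSet<>();
        Set<String> secondWords = new HashSet<>();
        Set<String> attestedPairs = new HashSet<>();
        for (String line : taggedLines) {
            String prevWord = null;
            String prevTag = null;
            for (String token : line.trim().split("\\s+")) {
                int slash = token.lastIndexOf('/');
                if (slash <= 0) {
                    continue; // empty or malformed token
                }
                String word = token.substring(0, slash).toLowerCase();
                String tag = token.substring(slash + 1);
                if (tag.equals(firstTag)) firstWords.add(word);
                if (tag.equals(secondTag)) secondWords.add(word);
                // A pair the author actually wrote next to each other.
                if (firstTag.equals(prevTag) && tag.equals(secondTag)) {
                    attestedPairs.add(prevWord + " " + word);
                }
                prevWord = word;
                prevTag = tag;
            }
        }
        // A template can combine any first-tag word with any second-tag word.
        long possible = (long) firstWords.size() * secondWords.size();
        return new long[] { attestedPairs.size(), possible };
    }

    public static void main(String[] args) {
        List<String> lines = List.of(
            "we/PRP desire/VBP increase/NN",
            "thou/PRP art/VBP fair/JJ");
        long[] c = coverage(lines, "PRP", "VBP");
        System.out.println(c[0] + " attested pairs, " + c[1] + " possible pairs");
    }
}
```

With only two toy lines the gap is small (2 attested against 4 possible), but it grows with the product of the two word-list sizes, which is why a template can pair words the author never put together.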