oh trap yes
uh uh uh
Respect damn uh
oh Oh Oh
uh uh uh uh uh uh uh uh uh uh
I guess I know how to parse… I took this NLP class that covered it and I got an A-, and I vaguely remember working through some parsing algorithm… When I was in junior high, we did some sentence diagramming, and I remember thinking it was kind of fun. Parsing is the sort of thing that was never really that important to me, though. It seemed like it didn’t deal enough with the questions of user modeling and interaction that I thought were the most interesting parts of meaningful communication.
But look, one of the reasons I like this computer poetry business is that it makes me see the value of things I didn’t see previously. So I’ve started playing around with the Stanford Parser and englishPCFG.ser.gz, which I gather is some kind of probabilistic context-free grammar built on a general English corpus. I don’t know why I chose this particular parser – I guess cause it’s in Java. But I’ve got to admit, I don’t really know that much about parsing.
I mean look at that parse right above. Is that correct? I guess it could be. I dunno… I like how it plunges. It’s pretty. But I wouldn’t feel compelled to argue any one interpretation. I figure if it’s wrong, someone will fix the state-of-the-art in parsers, and then it’ll be all right. But that person won’t be me, because I don’t really care. Data is always noisy, and I don’t really care about the details.
Anyways, for the poem at the top I ran Biggie’s lyrics through the parser and got a bunch of trees. Then I extracted rules from the trees. The nodes just above the leaves are basically part-of-speech tags. The Interjection tags (abbreviated UH) are what make up this week’s poem. Some of them are clearly wrong (like “critically acclaimed”) and some are questionable (like “respect”) but I like the category. Method 49be238c-cbbe-4a8f-a04c-134955039bb4 (poem from part-of-speech-tagged words on a given corpus). Part of Biggie’s genius is his use of vocalizations and interjections, and it’s great to see it survive the noise of transcription and parsing. It’s missing the “WHAAAAAT”s, tho. Looks like those are mostly stuck in WDT or WP (wh-determiner or wh-pronoun, apparently?).
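For the curious, here is roughly what that extraction step looks like, as a minimal Python sketch and not the code I actually ran. It assumes the parser's trees have been dumped as Penn Treebank-style bracketed strings where every node has a label. The function names (`parse_tree`, `extract_rules`, `tagged_words`, `words_with_tag`) and the sample tree are made up for illustration; the sample is not real parser output.

```python
from collections import Counter


def parse_tree(text):
    """Turn one bracketed tree string into nested (label, children) tuples.

    Leaves (the words) stay plain strings. Assumes every "(" is
    immediately followed by a node label.
    """
    tokens = text.replace("(", " ( ").replace(")", " ) ").split()
    pos = 0

    def read():
        nonlocal pos
        if tokens[pos] == "(":
            pos += 1                      # consume "("
            label = tokens[pos]
            pos += 1
            children = []
            while tokens[pos] != ")":
                children.append(read())
            pos += 1                      # consume ")"
            return (label, children)
        word = tokens[pos]                # a bare token is a word
        pos += 1
        return word

    return read()


def extract_rules(tree, rules=None):
    """Count every rewrite rule (parent label -> child labels or word)."""
    rules = Counter() if rules is None else rules
    label, children = tree
    rhs = tuple(c if isinstance(c, str) else c[0] for c in children)
    rules[(label, rhs)] += 1
    for child in children:
        if not isinstance(child, str):
            extract_rules(child, rules)
    return rules


def tagged_words(tree):
    """List (tag, word) pairs in sentence order."""
    label, children = tree
    if len(children) == 1 and isinstance(children[0], str):
        return [(label, children[0])]     # a part-of-speech node over a word
    pairs = []
    for child in children:
        if not isinstance(child, str):
            pairs.extend(tagged_words(child))
    return pairs


def words_with_tag(tree, tag):
    """Keep only the words carrying the given part-of-speech tag."""
    return [word for t, word in tagged_words(tree) if t == tag]


# Hypothetical tree, invented for this example.
sample = parse_tree("(ROOT (S (INTJ (UH oh)) (NP (NN trap)) (INTJ (UH yes))))")
print(words_with_tag(sample, "UH"))       # ['oh', 'yes']
```

Run `words_with_tag` over every tree in a corpus with `"UH"` and you get the raw material for a poem like the one at the top; swap in `"WDT"` or `"WP"` to go hunting for the missing “WHAAAAAT”s.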
I’m guessing that any poems I generate from a learned grammar are going to have the same kinds of problems that part-of-speech-based generated poems have: reduced coherence due to adjacent pairs that the original author would never have put together themselves. I need to add feature structures or something. I have to admit, it doesn’t sound that interesting. But anyway, if nothing else, it’s a great way to re-read some of Biggie’s lyrics. There’s a different kind of rhythm to reading it on trees.
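To make the coherence worry concrete, here is a small Python sketch of generating a line from counted rules. It is an illustration under assumptions, not a finished method: the rule counts in `RULE_COUNTS` are invented, and `build_grammar` and `generate` are hypothetical names. Each symbol is expanded by picking one of its rules with probability proportional to its count, and each sibling is expanded without looking at its neighbors, which is exactly where the odd adjacent pairs come from.

```python
import random
from collections import defaultdict

# Hypothetical rule counts, invented for illustration:
# (parent, children) -> how often the rule was seen.
RULE_COUNTS = {
    ("S", ("INTJ", "INTJ")): 2,
    ("S", ("INTJ",)): 1,
    ("INTJ", ("UH",)): 3,
    ("INTJ", ("UH", "UH")): 1,
    ("UH", ("uh",)): 5,
    ("UH", ("oh",)): 2,
    ("UH", ("yes",)): 1,
}


def build_grammar(rule_counts):
    """Group the rules by parent symbol: parent -> [(children, count), ...]."""
    grammar = defaultdict(list)
    for (lhs, rhs), count in rule_counts.items():
        grammar[lhs].append((rhs, count))
    return dict(grammar)


def generate(grammar, symbol, rng):
    """Expand a symbol into a list of words by sampling rules by count."""
    if symbol not in grammar:
        return [symbol]                   # no rule for it, so it is a word
    options = grammar[symbol]
    rhs = rng.choices(
        [children for children, _ in options],
        weights=[count for _, count in options],
    )[0]
    words = []
    for child in rhs:                     # siblings are expanded independently
        words.extend(generate(grammar, child, rng))
    return words


grammar = build_grammar(RULE_COUNTS)
rng = random.Random()                     # pass a seed for repeatable lines
print(" ".join(generate(grammar, "S", rng)))
```

This toy grammar has no recursive rules, so generation always stops. A grammar learned from real lyrics would have rules like NP inside NP, and would need a depth limit or similar guard.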