
work in progress: charNGram

April 12, 2011

I know there are plenty of character n-gram generators out there (Travesty, Dissociated Press, JanusNode, etc.), but there’s something enjoyable about coding one yourself. Among other things, I think it’s pretty clear that you can skip explicitly building a language model in memory; like Dissociated Press, you can just scan the source text itself for each context as you go. At some point I should write a proof of the equivalence of the two approaches.

kn ich flomalveeane thin,
Whe pay s ld tharde whenl,
Wheecheluselauset’dothallseinkn’sselothrkn eav

Looks like a mix of Sanskrit and German.
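For anyone curious what "just use the text" might look like, here is a minimal sketch in Python. It is not the charNGram code itself, only an illustration under my own assumptions: the function name `generate` and the parameters `order`, `length`, and `rng` are all made up for this example. For each step it searches the source text for every occurrence of the last `order` characters and picks one of the characters that follow.

```python
import random


def generate(text, order=3, length=200, rng=None):
    """Generate up to `length` characters, using `text` itself as the model.

    Hypothetical helper for illustration; no n-gram table is ever built.
    """
    rng = rng or random.Random()
    if len(text) <= order:
        raise ValueError("text must be longer than the n-gram order")

    # Seed with a random window that has at least one character after it.
    start = rng.randrange(len(text) - order)
    out = text[start:start + order]

    while len(out) < length:
        context = out[len(out) - order:]

        # Collect the character following every occurrence of the context.
        followers = []
        pos = text.find(context)
        while pos != -1:
            nxt = pos + order
            if nxt < len(text):
                followers.append(text[nxt])
            pos = text.find(context, pos + 1)

        if not followers:
            # The context only occurs at the very end of the text.
            break
        out += rng.choice(followers)

    return out[:length]


if __name__ == "__main__":
    sample = "the quick brown fox jumps over the lazy dog and the quick cat"
    print(generate(sample, order=2, length=60))
```

Picking uniformly among the occurrences of a context gives each following character a probability equal to its relative frequency after that context, which is the same distribution a precomputed frequency table would produce. The cost is a full scan of the text per generated character.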

2 Comments
  1. April 15, 2011 10:58 am

    It’d be a cool thing to visualize too!

    *assignment idea for CS class? …*

    • April 18, 2011 12:41 am

      > It’d be a cool thing to visualize too!
      > *assignment idea for CS class? …*

      yeah… easy to kludge together a generator and spew something out… modeling is what really teaches you. tracing the data structures and algorithms.
