Norvig's spelling checker makes use of two probability models. The first model describes how likely a particular word is in a given language. For example, in English, "the" is more likely than "therapeutic." The other model describes how likely a given mistake is when typing some word (e.g., "tha" is more likely than "thm" when trying to type "the"). Wow, that was fun. (Note: sarchasm) If we weren't forced to do this in C++ it really would have been fun... I love Lisp.
So the first step in implementing this is to build a hash table k which maps each correctly spelled word to its probability. This is our language model. Next we need the error model, which we approximate with edit distance: we find the set of all correctly spelled words which are one or two "edits" away from the word w that we are given to check. The lookup then goes in order. If w is in k, then we're done. Otherwise, if the set of known words one edit away from w is not empty, take the one with the highest probability. If that set is empty, check the set of known words two edits away, and again return the word with the highest probability. Finally, if all else fails, return w.
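Here is a rough sketch of those steps. It is in Python (the language of Norvig's original essay) rather than the C++ we had to use, just to keep it short. The function names (build_model, edits1, known, correct) are made up for illustration, and the four kinds of edit (delete, transpose, replace, insert) over a lowercase ASCII alphabet are an assumption borrowed from Norvig's essay, since the description above doesn't pin them down.

```python
from collections import Counter
import string


def build_model(words):
    """Language model: map each known word to its relative frequency (the table k)."""
    counts = Counter(words)
    total = sum(counts.values())
    return {word: n / total for word, n in counts.items()}


def edits1(word, alphabet=string.ascii_lowercase):
    """All strings exactly one edit away from word (assumed edit kinds, see above)."""
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes = [a + b[1:] for a, b in splits if b]
    transposes = [a + b[1] + b[0] + b[2:] for a, b in splits if len(b) > 1]
    replaces = [a + c + b[1:] for a, b in splits if b for c in alphabet]
    inserts = [a + c + b for a, b in splits for c in alphabet]
    return set(deletes + transposes + replaces + inserts)


def known(candidates, k):
    """Keep only the candidates that are correctly spelled, i.e. present in k."""
    return {w for w in candidates if w in k}


def correct(w, k):
    """Return w if known, else the most probable known word 1 edit away, then 2."""
    if w in k:
        return w
    one_away = known(edits1(w), k)
    if one_away:
        return max(one_away, key=k.get)
    two_away = known((e2 for e1 in edits1(w) for e2 in edits1(e1)), k)
    if two_away:
        return max(two_away, key=k.get)
    return w  # all else failed


# Tiny made-up corpus, just to exercise each branch.
k = build_model("the the the then cat".split())
print(correct("tha", k))   # one edit away from "the"
print(correct("kaat", k))  # two edits away from "cat"
print(correct("zzzzzz", k))  # nothing close, so it comes back unchanged
```

Note that this treats every one-edit candidate as equally likely and lets the language model break the tie, so it is a crude stand-in for a real error model. A real corpus would also be much larger than five words.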
Monday, February 18, 2008
A quick sketch of a spell checker
Judge orders DNS records of wikileaks.org removed
According to the law department at UC Berkeley, Judge White was appointed by President Bush to the federal bench in 2002. Here are details from uscourts.gov.
Here is a list of alternate DNS names which, reportedly, can reach Wikileaks. Please go read this post before following the https links. Make sure you read the bottom few paragraphs (they appear after the URL list).
http://wikileaks.la/
https://secure.wikileaks.la/
http://home.e.co.za/
https://secure.home.e.co.za/
http://joburg.e.co.za/
https://secure.joburg.e.co.za/
http://new.alain.co.za/
https://secure.new.alain.co.za/
http://wikileaks.be/
https://secure.wikileaks.be/
http://stockholm.divx.se/
https://secure.stockholm.divx.se/
http://jwdc.org/
https://secure.jwdc.org/
http://ljsf.org/
https://secure.ljsf.org/
http://freedomsbell.org/
https://secure.freedomsbell.org/
http://freedomspen.org/
https://secure.freedomspen.org/
http://libertypen.org/
https://secure.libertypen.org/
http://sunshinepress.org/
https://secure.sunshinepress.org/
http://new.1.vg/
https://secure.new.1.vg/
http://zurich.base-v.ch/
https://secure.zurich.base-v.ch/
http://bratislava.iypt.sk/
https://secure.bratislava.iypt.sk/
http://new.iypt.sk/
https://secure.new.iypt.sk/
http://wikileaks.org.uk/
https://secure.wikileaks.org.uk/
http://new.ilex.cl/
https://secure.new.ilex.cl/
http://wikileaks.tl/
https://secure.wikileaks.tl/
http://freedomsbell.com/
https://secure.freedomsbell.com/
http://wikileaks.in/
https://secure.wikileaks.in/
http://bucharest.roxi.ro/
https://secure.bucharest.roxi.ro/
http://wikileaks.es/
https://secure.wikileaks.es/
http://wikileaks.ws/
https://secure.wikileaks.ws/
http://riga.ax.lt/
https://secure.riga.ax.lt/
http://special.k.vu/
https://secure.special.k.vu/
http://wikileaks.cx/
https://secure.wikileaks.cx/
I love Paul Graham
"Graffiti happens at the intersection of ambition and incompetence: people want to make their mark on the world, but have no other way to do it than literally making a mark on the world."
That is from his essay on trolls.
I didn't find that essay as interesting as his average essay, but I did like his discussion of the Six Principles for Making New Things.
Tuesday, February 12, 2008
Common Lisp Tutorial
Hey there!
So, I come across many interesting web articles, posts, stories, what-have-you. Usually I spam my friends with the links to these interesting tidbits, but I think I should post them here and let my friends (and you) choose when and what to see.
Hopefully you enjoy!