Links for November 13, 2016

The Simpsons by the Data - Todd W. Schneider

The idea behind [term frequency–inverse document frequency] is to find words or phrases that occur frequently within a single document, but rarely within the overall corpus. To use a specific example from The Simpsons, the phrase “dental plan” appears 19 times in Last Exit to Springfield, but only once throughout the rest of the show, and sure enough the tf–idf algorithm identifies “dental plan” as the most relevant phrase from that episode.... Beyond “dental plan”, there are fan-favorites including “kwyjibo”, “down the well”, “monorail”, “I didn’t do it”, and “Dr. Zaius”, though to be fair, there are also some less iconic results.
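As a rough illustration of the idea in that excerpt, here is a minimal tf–idf sketch in Python. It is not Schneider's actual code: the toy corpus, the episode names, and the `tf_idf` helper are all made up for this example, and it assumes the common weighting of term frequency times log(N / document frequency), where real implementations often add smoothing.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Score terms per document.

    docs maps a document name to a list of terms (words or phrases).
    Returns a dict mapping each name to {term: tf-idf score}.
    Hypothetical helper, using tf * log(N / df) with no smoothing.
    """
    n_docs = len(docs)

    # Document frequency: in how many documents does each term appear?
    df = Counter()
    for terms in docs.values():
        df.update(set(terms))

    scores = {}
    for name, terms in docs.items():
        counts = Counter(terms)
        total = len(terms)
        scores[name] = {
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        }
    return scores


# Made-up corpus: each "episode" is a short list of phrases.
episodes = {
    "episode_a": ["dental plan", "dental plan", "dental plan", "the union"],
    "episode_b": ["the union", "monorail", "monorail"],
    "episode_c": ["the union", "dental plan", "down the well"],
}

scores = tf_idf(episodes)
top = max(scores["episode_a"], key=scores["episode_a"].get)
print(top)
```

A phrase that shows up in every document ("the union" here) gets an inverse document frequency of log(1) = 0, so it drops out no matter how often it is repeated, while a phrase concentrated in one document rises to the top.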

The Original Emoji Set Has Been Added to The Museum of Modern Art’s Collection

For the revolutionary “i-mode” mobile Internet software NTT DOCOMO was developing, a more compelling interface was needed. Shigetaka Kurita, who was a member of the i-mode development team, proposed a better way to incorporate images in the limited visual space available on cell phone screens. Released in 1999, Kurita’s 176 emoji (picture characters) were instantly successful and copied by rival companies in Japan.


Mining this collection, we extracted over 4,500,000 animated GIFs (1,600,000 unique images) and then used the filenames and directory path text to build a best-effort “full text” search engine. Each GIF also links back to the original Geocities page on which it was embedded (and some of these pages are even more awesome than the GIFs).
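One plausible way to search on nothing but filenames and directory paths is a small inverted index over path tokens. The sketch below is an assumption about the general approach, not the project's real implementation; the paths and the `tokenize`, `build_index`, and `search` helpers are invented for illustration.

```python
import re
from collections import defaultdict


def tokenize(text):
    # Lowercase, then split on any run of non-alphanumeric characters,
    # so "/under_construction.gif" yields "under", "construction", "gif".
    return [t for t in re.split(r"[^a-z0-9]+", text.lower()) if t]


def build_index(paths):
    # Map each token to the set of paths whose text contains it.
    index = defaultdict(set)
    for path in paths:
        for token in tokenize(path):
            index[token].add(path)
    return index


def search(index, query):
    # Return paths that contain every token in the query.
    tokens = tokenize(query)
    if not tokens:
        return set()
    results = set(index.get(tokens[0], set()))
    for token in tokens[1:]:
        results &= index.get(token, set())
    return results


# Hypothetical paths, made up for this example.
paths = [
    "/Area51/Vault/1234/images/dancing_baby.gif",
    "/Hollywood/Hills/5678/under_construction.gif",
    "/Area51/Zone/9012/construction_sign.gif",
]

index = build_index(paths)
print(sorted(search(index, "construction")))
```

Because the only text available is whatever the page author typed into a filename or folder name, a search like this is "best-effort" in exactly the sense the excerpt describes: a GIF named `image1.gif` is effectively invisible.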

Radiophonic Workshop Dr. Who Theme - Yamaha CS-80, ARP Odyssey, Roland SVC-350