GIGO: words unreadable aloud
Mishrogo Weedapeval
 

 

  Wednesday 17 September 2003
Aoccdrnig to rscheearch ...

This paragraph (or others like it) has been all over the blogosphere:

Aoccdrnig to rscheearch at an Elingsh uinervtisy, it deosn't mttaer in waht oredr the ltteers in a wrod are, the olny iprmoetnt tihng is taht the frist and lsat ltteer is at the rghit pclae. The rset can be a toatl mses and you can sitll raed it wouthit a porbelm. Tihs is bcuseae we do not raed ervey lteter by it slef but the wrod as a wlohe.

(Note: it doesn't work on email addresses.)

Nerdy followup:

This starts to break down when the words get longer. I saw this in the news a few days ago, and wrote this little Python program:

import random, sys, re

def scramble ( str ):
    if len(str) <= 3:
        return str
    mid = list(str[1:-1])
    random.shuffle(mid)
    return str[0] + ''.join(mid) + str[-1]

repat = re.compile( '(\W+)' )

def scr_file ( fd ):
    for ln in fd:
        wlist = repat.split( ln )
        resl = []
        for wm in wlist:
            wr = scramble( wm )
            resl.append( wr )
        # print wlist
        # print resl
        print ''.join(resl)[:-1]

scr_file( sys.stdin )

... so that I could run this experiment on some texts of my own choosing. Familiar texts like this one are pretty readable, though a few words (e.g., "destructive") don't work that well for me:

"We hlod teshe trtuhs to be slef-enevidt, taht all men are crateed eaqul, taht tehy are edwoend by tiehr Ctoarer wtih caertin uanabllneie rhgits, taht anomg tshee are life, liberty and the puisurt of hseapipns. Taht to srcuee tshee rthgis, grnvneemtos are iieutsttnd among men, diivenrg tehir just pewros from the cnsoent of the grveoned. That weenhevr any form of gmrennvoet becmeos dtteiurscve to these edns, it is the rghit of the plopee to aeltr or to aosilbh it, and to itsinttue new goemnrnevt, lyiang its fniotuaodn on scuh perpinclis and oniirnazgg its powers in such from, as to them shall seem most lkliey to ecfeft thier stafey and hnsapeips."
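For anyone who wants to repeat the experiment today: the program above is written for the Python of 2003 (it uses the `print` statement). Here is a rough Python 3 sketch of the same idea. The names `scramble_text` and `WORD_SPLIT` and the optional `rng` parameter are my own additions for illustration, not part of the original program, and unlike the original this version never shuffles the separators themselves.

```python
import random
import re

# Capturing parentheses make re.split keep the separators in its result.
WORD_SPLIT = re.compile(r"(\W+)")


def scramble(word, rng=random):
    """Shuffle the interior letters of word; first and last stay in place."""
    if len(word) <= 3:
        return word  # nothing to shuffle in words this short
    mid = list(word[1:-1])
    rng.shuffle(mid)
    return word[0] + "".join(mid) + word[-1]


def scramble_text(text, rng=random):
    """Scramble every word in text, passing separators through untouched."""
    parts = WORD_SPLIT.split(text)
    # With one capturing group, even indices hold words and odd indices
    # hold the separators that \W+ matched.
    return "".join(
        scramble(part, rng) if i % 2 == 0 else part
        for i, part in enumerate(parts)
    )
```

To process a file the way the original does, something like `for line in sys.stdin: print(scramble_text(line.rstrip("\n")))` should work. Passing `rng=random.Random(42)` (any seed) makes a run repeatable.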

Anagrams affect this phenomenon, too. When I re-read this passage: "Taht to srcuee tshee rthgis," my brain supplied "rescue" before it found "secure" ... perhaps that was affected by the current administration's attempts to abrogate those rights.
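The "rescue"/"secure" mix-up is easy to check mechanically. The sketch below uses two helper names I made up for this purpose (`same_letters` and `could_unscramble_to`): it tests whether two words are anagrams, and whether a candidate also obeys the first-and-last-letter rule.

```python
def same_letters(a, b):
    """True if a and b are anagrams of each other (ignoring case)."""
    return sorted(a.lower()) == sorted(b.lower())


def could_unscramble_to(scrambled, candidate):
    """True if candidate is an anagram AND shares first and last letters."""
    if not scrambled or not candidate:
        return False
    return (
        same_letters(scrambled, candidate)
        and scrambled[0].lower() == candidate[0].lower()
        and scrambled[-1].lower() == candidate[-1].lower()
    )


for word in ("secure", "rescue"):
    print(word, same_letters("srcuee", word), could_unscramble_to("srcuee", word))
# prints:
# secure True True
# rescue True False
```

So both words are anagrams of "srcuee", but only "secure" keeps the first and last letters in place. My brain apparently skipped that second check.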

It gets harder to read scrambled text when the text is unfamiliar or contains a lot of long words. Here's the documentation for one of the functions ("re.split") that the Python program above uses:

siplt( praettn, srting[, mxlispat = 0])

Slipt sirntg by the ocrnurceecs of pertatn. If cptiarnug peehtnresas are uesd in pteartn, then the txet of all grpuos in the pttrean are aslo rtreuend as part of the rsneutlig lsit. If mpisalxt is neonzro, at msot mplsxiat sptlis ouccr, and the remedniar of the sinrtg is renterud as the fanil elemnet of the lsit. (Imipnbttiiolacy ntoe: in the oniaigrl Pthoyn 1.5 resalee, mixpslat was ironegd. This has been fexid in laetr rleasees.)

I suspect that most people have trouble with "capturing parentheses" and with "Incompatibility".
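In case the scrambled documentation was too much: here is a small Python 3 illustration of what "capturing parentheses" and `maxsplit` do in `re.split`. The sample string and variable names are just mine.

```python
import re

line = "to be, or not"

# No capturing group: the separators are thrown away.
plain = re.split(r"\W+", line)
print(plain)      # ['to', 'be', 'or', 'not']

# Capturing parentheses: each separator comes back as its own list item,
# which is what lets a scrambler put the punctuation back afterwards.
captured = re.split(r"(\W+)", line)
print(captured)   # ['to', ' ', 'be', ', ', 'or', ' ', 'not']

# maxsplit=2: at most two splits, and the rest of the string is the last item.
limited = re.split(r"(\W+)", line, maxsplit=2)
print(limited)    # ['to', ' ', 'be', ', ', 'or not']
```

Because the separators are kept, joining the pieces of the captured split with `''.join(...)` gives back the original line exactly.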

Now, try running your spelling checker/corrector on this entry :-)

Bonus tangent: I suspect that this is related to the "chunking" of information that our minds do. I'll explain and relate this to my keyboard layout at another time.
11:58:22 AM



© Copyright 2007 Doug Landauer.
Last update: 07/2/6; 12:34:23.
