Sunday 23 August 2015

Captain Pugwash

Weekend: light relief. Had a peek at metafilter.com yesterday. It's all about poor interactions between what people want to write and how computers read it. It sent me down two parallel rabbit holes: one in my mind/memory and the other clicking out and further out on the interweb. Staying close to nerdihood, in the comments there is a link to a classic xkcd cartoon about Little Bobby Tables. You don't have to know the code to appreciate the joy of one clever woman whacking a foolish bureaucracy. There are some things that you must not write when dealing with computers. When Speedo was installing software on my enormous national bioinformatics computer in 1994 he decided that he should tidy up by deleting some of the old intermediate files. Thinking he was in one limited part of the system he typed "rm * -R" and then muttered "this is taking rather a long time" before realising the computer was recursively [-R] removing [rm] all the files [*] of all the directories because he'd forgotten that a previous command [cd /] had delivered him to the very top of the directory structure. Oooops. Luckily we had a backup of the system as it was the night before.
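For the curious, here is one way Speedo's tidy-up could have been made harder to get wrong. It is only a minimal sketch in Python, not what we actually ran in 1994: the function names is_inside and clear_directory are my own inventions, and the idea is simply to refuse to delete anything unless the target directory sits under a directory you named in advance.

```python
from pathlib import Path
import shutil


def is_inside(target: Path, allowed_root: Path) -> bool:
    """True if target is allowed_root itself or somewhere beneath it."""
    target = target.resolve()
    allowed_root = allowed_root.resolve()
    return allowed_root in (target, *target.parents)


def clear_directory(target: Path, allowed_root: Path) -> None:
    """Delete the contents of target, but only if it lies under allowed_root."""
    if not is_inside(target, allowed_root):
        raise ValueError(f"refusing to delete: {target} is outside {allowed_root}")
    for child in target.iterdir():
        if child.is_dir() and not child.is_symlink():
            shutil.rmtree(child)  # recursive removal, like rm -R
        else:
            child.unlink()  # plain files and symlinks
```

Called as clear_directory(Path.cwd(), Path("/scratch/intermediate")), where /scratch/intermediate is a made-up location, this raises a ValueError instead of eating the whole disk if a stray "cd /" has left you at the top of the tree.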

Some errors at the human-computer interface are annoying but not fatal, as I found in 1992 when I was writing a program for public distribution in the narrow world of bioinformatics that existed before the WWW took off in the mid-1990s. My code was cleverly menu-driven. On starting, the user would be presented with a numbered list of options and invited "Enter a number between 1 and 7". That seemed to work fine while I was debugging but not so well when I asked The Lads to stress test it. One of them entered "l" [el] instead of "1" [one] and the program collapsed with a "system stack error". Easily done, see how similar l1l1l1l they are? I had to write an additional few lines of code dealing with alphabetic input: "That input invalid: enter a NUMBER between 1 and 7". You see similar exasperated comments, often in red and usually in very small print, when you leave the zip code out of a web-form.
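My original code is long gone and was not written in Python, but the fix amounts to something like the sketch below. The names parse_menu_choice and ask_for_choice are hypothetical; the point is only that anything which is not a whole number in range gets a polite telling-off rather than a stack error.

```python
def parse_menu_choice(raw, low=1, high=7):
    """Return the chosen option as an int, or None if the input is not valid."""
    try:
        choice = int(raw.strip())
    except ValueError:
        # Covers "l" [el], empty input, "3.5" and other non-integers.
        return None
    return choice if low <= choice <= high else None


def ask_for_choice(low=1, high=7, read=input):
    """Keep asking until the user supplies a valid option."""
    while True:
        raw = read(f"Enter a number between {low} and {high}: ")
        choice = parse_menu_choice(raw, low, high)
        if choice is not None:
            return choice
        print(f"That input invalid: enter a NUMBER between {low} and {high}")
```

The read parameter is there so that The Lads, or a test script, can feed in awkward answers without anyone having to type them.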

Getting computers to process subtle differences which are obvious to any six-year-old is a pervasive modern problem. Our postman can deliver letters with all sorts of weird and wonderful variations on our address but the robot that will replace him as soon as driverless cars become normal will just blow fuses. Here is a nice list of fatal assumptions to make when writing software to process personal information. Is Joe Blowe of 121 Mimosa Drive the same as Mr Joseph Blow, 12 Mimaso Drive? Probably; but how do you instruct a computer of the fact? Google guesses the intent of typos really well, Wikipedia just gives up. My promotion prospects depend to a certain extent on the number of scientific papers I've published. So it's good that I am cited both for my 1991 paper in Molecular & General Genetics v. 230 p. 288 and my 1991 paper, with the same title, in Molecular & General Genetics v. 23 p. 288. Two hits because some careless person made a typo in his list of references and a couple of culpably careless people cited my paper without having read it: they just read the intermediate paper and copied its [incorrect] reference. There is a LOT of that lazy-arsed science about.
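One crude way to tell a computer that Joe and Joseph are "probably" the same chap is to score how similar the two strings are and accept anything above a cut-off. The sketch below uses SequenceMatcher from Python's standard difflib module. The normalise step, the list of titles to ignore and the 0.7 threshold are all my own assumptions, picked for illustration rather than tuned on real addresses, and it is certainly not how Google does it.

```python
import re
from difflib import SequenceMatcher

# Assumed list of honorifics to ignore; real data would need many more.
TITLES = {"mr", "mrs", "ms", "dr"}


def normalise(record: str) -> str:
    """Lower-case, strip punctuation and drop titles such as 'Mr'."""
    words = re.sub(r"[^a-z0-9 ]", " ", record.lower()).split()
    return " ".join(w for w in words if w not in TITLES)


def similarity(a: str, b: str) -> float:
    """Score from 0.0 (nothing in common) to 1.0 (identical after tidying)."""
    return SequenceMatcher(None, normalise(a), normalise(b)).ratio()


def probably_same(a: str, b: str, threshold: float = 0.7) -> bool:
    """Guess whether two records describe the same person."""
    return similarity(a, b) >= threshold


joe = "Joe Blowe, 121 Mimosa Drive"
joseph = "Mr Joseph Blow, 12 Mimaso Drive"
print(probably_same(joe, joseph), round(similarity(joe, joseph), 2))
```

A single threshold will produce both false matches and false misses, which is rather the point of the list of fatal assumptions: a score like this is a hint for a human to check, not a verdict.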

I've written about the prudishness of Google protecting us from things that Masters Brin and Page think are naughty. Apparently this is known generally as the Scunthorpe problem: businesses in that town just north of Flixborough were unsearchable by Google because of the 2nd to 5th letters in the name. In the early 1990s, all the Polytechnics in the country became "Universities" in a fatuous central government policy decision. Everyone knows which institutes of UK higher learning are real Universities and which are Polys masquerading as such. It was a jamboree for letterhead printers, publicists and designers, because the institutes could re-brand with the re-name. I had just left the [real] University of Newcastle upon Tyne across the park from Newcastle Polytechnic. When the latter was up for renaming, the governing body were all for City University of Newcastle upon Tyne: expressing a Bluff Northern, Urban, Can-Do effectiveness. That was scotched when it emerged what the most likely domain-name to differentiate it from the existing www.ncl.ac.uk would be (for the answer, see the initial capitals of the proposed name in the previous sentence). They settled for Northumbria University, which is going from strength to strength but has a dreadful, busy, and hard to crack website.
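The Scunthorpe problem is easy to reproduce. The sketch below is my own illustration, not anybody's real filter: BLOCKLIST holds a deliberately mild stand-in word, and the two function names are hypothetical. The naive version looks for the blocked word anywhere inside the text, while the more careful version only matches it as a whole word.

```python
import re

# Mild stand-in for whatever a real filter would consider naughty.
BLOCKLIST = {"hell"}


def naive_is_blocked(text: str) -> bool:
    """Substring test: flags any text that merely contains a blocked word."""
    lowered = text.lower()
    return any(word in lowered for word in BLOCKLIST)


def word_is_blocked(text: str) -> bool:
    """Whole-word test: the blocked word must stand alone between word boundaries."""
    return any(
        re.search(rf"\b{re.escape(word)}\b", text, flags=re.IGNORECASE)
        for word in BLOCKLIST
    )


innocent = "Mitchell & Sons Shellfish"
print(naive_is_blocked(innocent))  # the substring filter flags an innocent name
print(word_is_blocked(innocent))   # the whole-word filter lets it through
```

Matching whole words rescues the innocent, but it is trivially dodged by anyone who runs words together or swaps a letter for a digit, so it trades one kind of error for another.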

That led me off to those widely-circulated unintentionally inappropriate domain names including the Florida fishing tackle shop http://masterbaitonline.com/. An Italian energy utility is no longer using http://powergenitalia.com/ which has been acquired as a dating jumpstation. These sources of hilarity ante-date the WWW by many decades. Eeee, when I were a nipper in a sailor-suit there was a series of 4-minute short TV cartoons called Captain Pugwash, originally made with cardboard cut-outs, about a ship full of pirates and their inconsequential adventures. Contra the urban legend, there were no such characters as Master Bates, Seaman Staines or Roger the Cabin-boy; the assertion went to court in 1991. However, either I'm deaf or "Find me the ship's dictionary Master Bates" does appear at 25s in the first episode of the 1970s re-run of the series.
