Wednesday 13 April 2022

Everyone does a little

Well, my current Borrowbox ear-fodder is The Crowd and the Cosmos: Adventures in the Zooniverse by Chris Lintott. It features penguins! I was describing it to Dau.II on the phonio last week and she said it sounded real heavy. Must be the way I tell it. It's not, I said, it reads like a thriller. Which was gilding the galaxy a bit but, despite being all about processing [eye-watering quantities of] data, it is very easy on the ear.

Lintott and Kevin Schawinski co-founded Galaxy Zoo in 2007 as a way of getting on top of the data deluge in astrophysics. For reasons . . . they needed to shoe-horn galaxies into two different bins: spiral or elliptical. There are nearly 1 million images of these distant & enormous objects in the Sloan Digital Sky Survey (SDSS). That's a lot. It would be easy if every case for classification were clearly one sort or the other: why, then a computer program could do the whole task in a couple of minutes. Problem was that, in 2007, there was no digital solution that could reliably handle the edge[-on] cases. Not all spiral galaxies appear like a catherine-wheel from Earth . . . because the Earth is not the centre of the cosmic merry-go-round. With enough committed volunteers, the project was able to call each galaxy dozens of times, so that the spirellipticity of each object could be determined to whatever degree of 'certainty' was required by the particular question being asked. Everyone agreed that the L image above was 'spiral'. Edge cases could be discarded or referred to expert adjudication. And presumably, the assessment process could be periodically audited by the same experts to ensure that the volunteers weren't having a jape at the project's expense and had been effectively briefed and instructed.
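For the coders in the audience, here is a minimal sketch of how dozens of calls on one galaxy might be boiled down to a verdict. It is my own toy, not Galaxy Zoo's actual pipeline: the function name classify and the 0.8 threshold are assumptions made up for illustration.

```python
from collections import Counter

def classify(votes, threshold=0.8):
    """Reduce a list of volunteer calls for ONE galaxy to a verdict.

    Returns (label, fraction), where fraction is the share of votes
    won by the most popular label. If that share falls below the
    (assumed) threshold, the galaxy is flagged "uncertain" so it can
    be discarded or referred to an expert.
    """
    if not votes:
        raise ValueError("need at least one vote")
    label, n = Counter(votes).most_common(1)[0]
    fraction = n / len(votes)
    if fraction >= threshold:
        return label, fraction
    return "uncertain", fraction

# A clear-cut catherine-wheel versus an edge-on puzzler.
print(classify(["spiral"] * 9 + ["elliptical"]))
print(classify(["spiral"] * 6 + ["elliptical"] * 4))
```

Raising or lowering the threshold is the knob that sets the degree of 'certainty' for a given question: a strict study keeps only near-unanimous galaxies, a looser one accepts more of the muddle.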

And it cuts both ways. Just as some galaxies are easy to classify, some classifiers turned out to be more reliable than others. The data processing stream could therefore weight the calls of Effectives based on just how effective they were. You could imagine that adults would be better than children at this task OR you could assume that kids, having been digital all their lives, are more accurate than their parents. But the internal consistency checks allow the Galaxy Zoo Centraal to discover their best contributors by anonymous registration number rather than by age or zip code.
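Weighting by reliability can be sketched in a few lines too. Again this is a toy under stated assumptions, not the project's real scheme: I score each volunteer by how often they agree with the plain majority, then let that score count as the weight of their vote. The names volunteer_weights and weighted_call, and the (volunteer, galaxy, label) tuples, are invented for the example.

```python
from collections import Counter, defaultdict

def volunteer_weights(calls):
    """calls: list of (volunteer_id, galaxy_id, label) tuples.

    Weight = fraction of a volunteer's calls that match the
    unweighted majority label for the same galaxy (an assumed,
    deliberately simple consistency check).
    """
    by_galaxy = defaultdict(list)
    for _, galaxy, label in calls:
        by_galaxy[galaxy].append(label)
    consensus = {g: Counter(labels).most_common(1)[0][0]
                 for g, labels in by_galaxy.items()}

    agree, total = Counter(), Counter()
    for volunteer, galaxy, label in calls:
        total[volunteer] += 1
        agree[volunteer] += (label == consensus[galaxy])
    return {v: agree[v] / total[v] for v in total}

def weighted_call(calls, galaxy, weights):
    """Pick the label with the largest summed weight for one galaxy."""
    tally = defaultdict(float)
    for volunteer, g, label in calls:
        if g == galaxy:
            tally[label] += weights[volunteer]
    return max(tally, key=tally.get)

calls = [
    ("A", "g1", "spiral"), ("B", "g1", "spiral"), ("C", "g1", "elliptical"),
    ("A", "g2", "elliptical"), ("B", "g2", "elliptical"), ("C", "g2", "spiral"),
    ("A", "g3", "spiral"), ("B", "g3", "spiral"), ("C", "g3", "spiral"),
]
weights = volunteer_weights(calls)
print(weights)
print(weighted_call(calls, "g1", weights))
```

Note that only an anonymous ID goes in: nothing about age or zip code is needed to find out who is reliable.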

I'm totally institutionalised! I could do Galaxy Zoo: learn the rules, and then keep clicking until bored. Hanny van Arkel, a teacher and guitar-picker from Nederland, went a step above and beyond by asking WTF is that peculiar smudge of stardust?! near [clearly spiral!] Galaxy IC 2497. It turned out to be super interesting, "'n heel nieuw ding" [a whole new thing], as an example of star formation in the making 650 million light years from where we sleep. It could have been named Hanny's Dingetje but was dubbed Hanny's Voorwerp (HsV), and other similar clouds of ionized stellar matter are all called voorwerpjes. The thing about software, and institutionalised folks, is that they are only as good as the training set. Humans like Ms van Arkel can see beyond the tramlines of the task and be truly creative. Tedx!

