I am going to write about the consequence of errors today. This idea was triggered by a book I’m currently reading: ‘The Chemistry of Tears’ by Peter Carey. It is an intriguing tale that weaves together two stories separated by 150 years. Within one of the stories is a vignette describing the death at sea of a character’s wife and children. The deaths were the sad result of their ship’s captain relying on error-ridden charts issued by the British Admiralty. The captain, believing the way forward was clear, unwittingly steered the ship onto rocks and it sank.
In Peter Carey’s very good novel, the unfortunate sufferer of this tragedy becomes obsessed with these faulty Admiralty charts. He painstakingly composes errata sheets for the Admiralty to send out. Unfortunately, he subsequently discovers that more errors crept in during the copying process. In fact, he finds that a total of 3,700 incorrect errata sheets were distributed.
This tiny tale-within-a-tale caused me to stop reading as I tried to remember a news article I had read a few years ago about erroneous nautical charts surviving into the digital age. As my mind worked away, I recalled a name: Sandy Island.
Sandy Island, the consequence of errors
Sandy Island was a phantom island that existed on nautical maps for over a century following its ‘discovery’ by the whaling ship Velocity in 1876. The island was located near the French territory of New Caledonia, to the east of Australia. It appeared on a German map in 1881 and an Admiralty map in 1895. It was also included in the World Vector Shoreline Database, from which many of today’s standard navigation mapping tools have been derived. And perhaps most infamously, it was visible in Google Maps and Google Earth until 2012.
The most intriguing aspect of this story is one that emphasises how data inaccuracies can snowball. Many scientists were skeptical that Sandy Island was a phantom. They pointed to satellite surveys that indicated the presence of land where Sandy Island was thought to be located. They therefore felt that Sandy Island must exist.
However, it eventually emerged that these satellite surveys made use of a ‘mask’ to differentiate between land and water. As a result, the surveys did not assess the area of Sandy Island, because the mask had already classified that area as land. So the result, as far as Sandy Island was concerned, was built into the survey from the very beginning. And what data set was used to create the land mask? The World Vector Shoreline Database mentioned above! These surveys illustrate how data errors can propagate through seemingly independent systems with significant consequences.
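To make the circularity concrete, here is a toy sketch in Python. It is not how any real satellite survey is implemented: the grid, the names shoreline_db, actual_surface, build_land_mask and survey are all hypothetical, chosen only to show how a survey that skips masked cells ends up repeating whatever the mask's source already claimed.

```python
# Toy illustration only: every name and value here is hypothetical.

# True means "land" according to the shoreline database.
# The phantom island is recorded at cell (1, 1).
shoreline_db = [
    [False, False, False],
    [False, True, False],
    [False, False, False],
]

# What a ship sailing through the area would actually find: open water.
actual_surface = [
    ["water", "water", "water"],
    ["water", "water", "water"],
    ["water", "water", "water"],
]


def build_land_mask(db):
    """Derive the land mask directly from the shoreline database."""
    return [row[:] for row in db]


def survey(actual, land_mask):
    """Measure only the cells the mask says are water.

    Masked cells are never examined; they are simply reported as land.
    """
    result = []
    for r, row in enumerate(actual):
        out_row = []
        for c, surface in enumerate(row):
            if land_mask[r][c]:
                out_row.append("land")  # copied from the mask, not measured
            else:
                out_row.append(surface)  # genuinely observed
        result.append(out_row)
    return result


survey_result = survey(actual_surface, build_land_mask(shoreline_db))
print(survey_result[1][1])   # what the survey reports for the phantom cell
print(actual_surface[1][1])  # what is really there
```

In this sketch the survey reports land at the phantom cell even though the underlying surface is water, because that cell was never measured. Anyone citing the survey as independent confirmation of the database is really just citing the database twice.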
Our custom apps
It may disappoint you to learn that we probably can’t solve the problem of missing islands. But we can solve a multitude of other problems instead.