No one was really shocked when, in the summer of 2013, The Guardian and the Washington Post, with the help of Edward Snowden, exposed the extent of US government spying on internet users all around the world. The story was less a big shock and more of a global ‘but of course!’ moment. We all had our suspicions, but the Snowden episode meant that we could all talk about these activities openly without seeming like conspiracy theorists and, much more importantly, it meant that world leaders couldn’t treat journalists asking questions about these activities like conspiracy theorists. It was an important event because at a May 2013 press conference, Barack Obama could dismiss questions about government spying with a wave and a sneer, yet two months later, and forever more, he had to answer, and explain.
I wasn’t really bothered at all by the revelations, to be honest. I could understand why some people were outraged at the confirmation of a long-suspected, persistent violation of their privacy, yet to me this was just a part of the modern internet world. In order to live here, you have to give something away. Most things we use online every day are free, yet companies like Twitter, Facebook and Google are all worth billions. It doesn’t take a genius to work out that in a situation like this, you (the user) are the product that these companies sell. They gather as many users as possible together on their platform, and then charge companies vast sums of money to advertise to us. It’s a simple two-sided market, as used by newspapers, terrestrial TV stations and credit card companies.
The ‘theft’ of personal information, by Google or the government or both, is a constant theme in mainstream media. It gets the blood boiling, as we can all think immediately of our own online presence, and how comfortable we would be with others looking over it without our permission. How dare they use the text from that status I just wrote to recommend an ad to me! How dare they spy on my WhatsApp messages! How dare they publish my Facebook photo online! All of these are common complaints in the world we now live in, but these statements and concerns all miss the point. The end of privacy shall not be a result of stolen personal data. It is going to be a result of the seemingly harmless, impersonal data that some people, many of whom are completely unknown to us, freely give away.
The Perfect Fit
This depressing thought first came to me during the summer, after I had been using Google Fit for a few days. For those unfamiliar with Google Fit, it is an app that tells (or lies to) its users that fitness comes from taking a certain number of steps per day. It does this by continuously running in the background of your phone: you only open it to check your progress. It knows automatically when you are running, or if you are cycling. It knows immediately if you stop moving, and this is all reflected on a nice pie chart you get at the end of the day. On clicking for more detail on any individual activity, it also shows the duration of each bout of walking, along with a little map of where this all took place. After a few days, I was using it to check, once and for all, how long exactly it took me to do the weekly shopping at Billa, how long I spent at the office, how long I stayed at the gym. Google Fit is really just a personal record of your day. If you scroll through your history and see that one day it took you longer than usual to get to the bus stop, you click for more detail and realise you made a detour to the shop on the way.
I had been reading up on data analysis a lot at the time, and reckoned that if I allowed the app to track me for even a few weeks, it would know me very, very well, and would probably be able to guess, with great accuracy, where exactly I would be going once I left my house on any given day. A simple algorithm could run, and be constantly updating, depending on whether I turned left or right at my building entrance, whether I walked past the bus stop, whether I was running, cycling or walking. If I walked out my door early on a weekday, it would assume I was going to work. If I walked out the door on a Saturday before 6pm (when shops close here in Vienna), it would assume I was going to the shop. Based on my history, it would know I prefer Billa, yet if I turned right out my door instead of left, it would update automatically and assume I was heading for Spar. If I walked past Spar, it would update again to predict my destination given my previous wanderings, and so on.
Eventually, it will be right. And the more I use it, the more accurate it will get. Even if my behaviour becomes more erratic and unpredictable, it will all be fed back into the ever-expanding database on my comings and goings, and hence used to predict my behaviour in the future. Which is, of course, everything one could want in a fitness app.
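To make the idea concrete, here is a minimal sketch of such an updating guesser. It is only an illustration under my own assumptions, not anything Google Fit is known to run: the class name `DestinationPredictor`, the cue labels and the trip history are all invented. It simply counts where past trips ended up, keeps only the trips that began with the same observations seen so far, and picks the most frequent destination.

```python
from collections import Counter


class DestinationPredictor:
    """Toy frequency-based guesser. Every name here is hypothetical."""

    def __init__(self):
        # Each past trip is stored as (cues, destination), where cues is
        # the ordered tuple of things observed on the way.
        self.trips = []

    def record(self, cues, destination):
        """Feed a finished trip back into the history."""
        self.trips.append((tuple(cues), destination))

    def predict(self, cues_so_far):
        """Guess the destination from the cues observed so far."""
        cues_so_far = tuple(cues_so_far)
        n = len(cues_so_far)
        # Keep only past trips that started with the same cues.
        matches = Counter(
            dest for cues, dest in self.trips if cues[:n] == cues_so_far
        )
        if not matches:
            return None  # nothing comparable in the history yet
        return matches.most_common(1)[0][0]


# Invented history: mostly Billa on Saturdays, sometimes Spar, once the gym.
predictor = DestinationPredictor()
for _ in range(3):
    predictor.record(["saturday", "left"], "Billa")
for _ in range(2):
    predictor.record(["saturday", "right"], "Spar")
predictor.record(["saturday", "right", "passed_spar"], "gym")

print(predictor.predict(["saturday"]))
print(predictor.predict(["saturday", "right"]))
print(predictor.predict(["saturday", "right", "passed_spar"]))
```

With this made-up history, the guess moves from Billa to Spar to the gym as each new cue arrives, and every call to `record` gives later guesses more to work with. A real system would presumably use something far more sophisticated than raw counts, but the feedback loop is the same.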
Now I know what many of you out there are thinking right now: why would you ever give them permission to track your every move like that? I would never accept such conditions; I don’t even turn on my smartphone’s GPS functionality. Well, here’s the kicker: they can do it to you, and they don’t even need you to agree to it.
We all like to think of ourselves as special and unique, and in truth we all are, inside our own heads. No one looks like you, no one thinks like you, no one feels like you, no one can live each day exactly like you, and in this sense we all are unique and special. This holds only at the individual level, however. If you gather a large mass of people together, certain patterns and similarities emerge that were not immediately apparent when observing just a few people. Think of the clothes that you are wearing, for example. More than likely, at least one of these articles was made using the standard sizes of S, M, L, XL. This now-universal sizing system was only formulated at the end of the 19th century, and before it emerged, mass production of clothing was impossible because producers thought they had to tailor-make every item of clothing for each individual, which was prohibitively expensive. A statistician observed that when a large enough sample of the population is measured, the vast majority of the market can be served by simply mass-producing each item of clothing in the four standard sizes of S, M, L, XL. Thus the modern clothing industry was born, and as a result, you don’t need to go to a tailor every time you want to buy clothes. The simple gathering and organising of select, appropriate pieces of information led to a global industry providing a fitted product to millions of people all over the world, despite coming into physical contact with very, very few of them.
Taking this back to the Google Fit case, they don’t need you on board, simply because they have enough idiots like me giving them all-access passes to every portion of our lives. Google promise, of course, that all your personal data is safe and secure, and that this personal information is for you and for you alone, in order to track your step goal. What Google also do, however, buried deep in the user-agreement fine print, is create an identical, yet anonymous, profile for you on their servers, and aggregate this data across all of their users (a process similar to this is currently used in Google AdSense to recommend advertisements to you based on your current search history). This data will be organised automatically and applied to the predictive algorithm I described previously. If you are an Android user, Google will know certain things about you, even if you have not opted to have your movements tracked. Simply from knowing your age and sex, it can compare you to the hundreds of thousands of other users like you for whom it has (anonymised) all-access data, and predict your movements based on this. If you have ever done a personality quiz online, it is a similar process, except that here a computer is doing the calculations, using a practically infinite amount of data, and therefore instead of putting you into one of six personality types, it could be one out of 60,000. They don’t need you to complete a quiz to get the results either: anything they can grab about you from your online presence will narrow your profile down to fit into one of their boxes. Everywhere you go, everything you do, will only provide more data for the algorithm to use in the future to predict your behaviour and that of others who are statistically similar to you.
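The box-fitting step can be sketched in a few lines. This is a toy illustration under my own assumptions and makes no claim about how Google actually does it: the functions `age_band`, `build_cohorts` and `predict_for_untracked`, the field names, and the three tracked users are all invented. It pools the anonymised habits of tracked users by age band and sex, then guesses the habits of someone who never opted in from the box they fall into.

```python
from collections import Counter, defaultdict


def age_band(age, width=10):
    """Bucket an age into the start of its band, e.g. 35 -> 30."""
    return (age // width) * width


def build_cohorts(tracked_users):
    """Pool the destinations of tracked users by (age band, sex)."""
    cohorts = defaultdict(Counter)
    for user in tracked_users:
        key = (age_band(user["age"]), user["sex"])
        cohorts[key].update(user["saturday_destinations"])
    return cohorts


def predict_for_untracked(cohorts, age, sex):
    """Guess a Saturday destination for someone who was never tracked."""
    counts = cohorts.get((age_band(age), sex))
    if not counts:
        return None  # no statistically similar users on file
    return counts.most_common(1)[0][0]


# Invented, anonymised data from users who did opt in.
tracked = [
    {"age": 31, "sex": "m",
     "saturday_destinations": ["supermarket", "gym", "supermarket"]},
    {"age": 38, "sex": "m",
     "saturday_destinations": ["supermarket", "park"]},
    {"age": 64, "sex": "f",
     "saturday_destinations": ["market", "market", "cafe"]},
]

cohorts = build_cohorts(tracked)
print(predict_for_untracked(cohorts, 35, "m"))
print(predict_for_untracked(cohorts, 67, "f"))
```

Two attributes and three users give only a couple of crude boxes. The point of the paragraph above is that every extra attribute and every extra tracked user makes the boxes both more numerous and more accurate.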
Algorithm is the Dancer
This isn’t a conspiracy theory, and it isn’t data theft. It isn’t even an invasion of privacy, any more than the clothing industry’s use of global size standardisation is. It’s simply the application of standard statistical and mathematical methods to modern computing capabilities, along with society’s current attitudes towards privacy. What I have described here is where current technology is going, and it really isn’t very difficult to see. An algorithm is simply a mathematical process repeated continuously, ever towards perfection. Computers are simply machines for performing and reporting calculations like these ever faster. A significant proportion of us are giving enough information to the databases of these combined systems to jeopardise the privacy of everyone. We can feel betrayed by governments all we want as a result of the Snowden revelations, yet all that clandestine spying gives anyone is what we did in the past: what photos we posted, what private messages we sent. What will be possible within a few years, without any theft, without any consent from you, with little personal information from you whatsoever, is the surprisingly accurate prediction of what you will do on any given day once you leave your home. If you haven’t been reading about “big data” over the past year or so, then consider this an introduction.