I recently finished reading "Weapons of Math Destruction" by Cathy O'Neil, and continuing my quest to not only read more - but to also actually digest the content - here's another book review!
Cathy O'Neil sounds like a fascinating person, with quite a surprising career path. Upon leaving academia she found herself working as a Quantitative Analyst at a hedge fund, and from this privileged position she was able to witness the effects of the 2008 financial crisis first hand. She subsequently changed allegiances and even found herself part of the Occupy movement.
Needless to say, as someone who considers mass data collection not only horribly intrusive but also greatly detrimental to society, I was quite excited to dive into this book. Very early on though it was marred by quite a few baseball analogies, and as a Brit I'm very clueless when it comes to baseball... so it wasn't the best start.
Yet that's ok, one can forgive the odd localised reference if the actual thesis of a piece is solid. However, I pretty quickly came to question how much of the book was objective and neutral observation, and how much was - ironically - pushed by O'Neil's own confirmation bias.
One example stood out. O'Neil seems to use a paper by Michael Mueller-Smith (an Economics Professor at Michigan) to imply that longer prison sentences directly lead to recidivism, and that this is due to the negative effect a longer sentence has on an individual's employment opportunities.
Mueller-Smith has indeed observed that longer prison sentences tend to result in poorer employment outcomes, and I don't dispute that observation.
The first issue, though, is the citation for this claim: it isn't the paper itself, nor is it even written by Mueller-Smith. The reference leads to an article on Quartz, and this article doesn't appear to provide direct access to the primary source either.
Secondly, it's obvious that longer prison sentences correlate with poorer employment outcomes: the length of the prison sentence is merely a proxy for the severity of a crime. Those who commit more serious crimes are going to be seen as more of a risk than those who commit petty ones, hence reducing their employment opportunities - regardless of sentence length.
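To make the proxy argument concrete, here's a minimal Python sketch. The records are entirely made up by me for illustration (they are not taken from Mueller-Smith's paper or from the book), and the names `records`, `employment_rate` and `rate_by_sentence` are my own. The data is constructed so that crime severity alone drives employment; the point is only to show that a gap between short and long sentences can then appear in the pooled numbers even though sentence length makes no difference within each severity group.

```python
# Entirely hypothetical records: (crime_severity, sentence_years, employed_after_release).
# Built so that severity, not sentence length, determines the employment rate.
records = [
    ("petty", 1, True), ("petty", 1, True), ("petty", 1, False),
    ("petty", 2, True), ("petty", 2, True), ("petty", 2, False),
    ("serious", 5, True), ("serious", 5, False), ("serious", 5, False),
    ("serious", 8, True), ("serious", 8, False), ("serious", 8, False),
]


def employment_rate(rows):
    """Fraction of the given records where the person found employment."""
    return sum(1 for _, _, employed in rows if employed) / len(rows)


def rate_by_sentence(rows, threshold):
    """Employment rates for sentences up to `threshold` years, and for longer ones."""
    shorter = [r for r in rows if r[1] <= threshold]
    longer = [r for r in rows if r[1] > threshold]
    return employment_rate(shorter), employment_rate(longer)


# Pooled view: shorter sentences appear to go with better employment outcomes.
print("pooled:", rate_by_sentence(records, 2))

# Stratified view: within a single severity level, sentence length changes nothing.
petty = [r for r in records if r[0] == "petty"]
serious = [r for r in records if r[0] == "serious"]
print("petty only:", rate_by_sentence(petty, 1))
print("serious only:", rate_by_sentence(serious, 5))
```

In this toy data the pooled comparison shows a clear gap, while the two within-group comparisons show none. That says nothing about what the real study found or controlled for; it only shows why a correlation with sentence length can't, by itself, tell you that the sentence length is the cause.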
Sadly, this example left me doubting quite a few of the more political arguments, and questioning the trustworthiness of many of O'Neil's assertions.
The real message
The premise of the book seemed to centre around today's Big Data society, and the way in which statistical models are employed to determine various aspects of our lives: be it the outcome of a job interview, whether we're deemed acceptable for a loan application, or what content we see when we browse the internet.
The core message is that statistical models are being employed to reinforce many of the inequalities in society, and some of the more disturbing examples highlight this message well. The predatory tactics of pay-day loan lenders and for-profit colleges targeting the desperate and poor are but two of the more depressing trends.
As someone who bases many of their core values around the concept of social mobility - not to mention someone who has been personally disillusioned when working on data-driven products [1] - many of the author's points resonate with me. However, many other points seemed to simply ignore some of the realities of modern society.
I found it particularly peculiar that someone who watched the 2008 financial crisis unfold - one that (on a micro-scale) was largely caused by irresponsible lending - would advocate curtailing the data that can be used when credit scoring. Using all of the available data is not only important to ensure that the lender can minimise their liabilities, but it also actively helps to prevent the vulnerable from finding themselves in debt.
There was a common theme throughout the book whereby it felt that O'Neil was advocating the removal of data if it provided the wrong output; something that seems entirely wrong. If specific demographics are hurt by these algorithms, then perhaps the societal issues behind the inference should be investigated? After all - the data isn't lying, it's merely describing a social issue that needs to be confronted.
In effect, O'Neil appeared to want to discard any data that led to answers she didn't agree with, almost expressing a desire for blissful ignorance whereby real issues are ignored because they're inconvenient.
Despite this dubious attitude, it's difficult to disagree with the narrative that these algorithms are simply producing heuristics that in turn exaggerate and reinforce existing inequalities. This is true, and whilst I find O'Neil's objections to specific arguments to be naive - if not borderline dangerous - most of the examples are indeed cautionary tales that underline topics that need to be confronted and heavily regulated.
Especially interesting was her criticism of Facebook, and the issue of their data collection and lack of transparency - an issue I've long tried to evangelise to anyone who will listen! We're heading towards a dystopian time where powers like Google or Facebook are able to not just control our emotions, but also what information we see.
I think it's no surprise that we find ourselves living in a divided society, one whereby people struggle to empathise and understand the views of others. When people rely upon the likes of Google and Facebook for information, they become unable to challenge their own perceptions and views - after all, Google and Facebook are in the business of providing reinforcement to what people already believe. The content provided to users is actively tailored to the views of that specific user, ensuring a higher likelihood that the user will engage with it, and subsequently generate revenue. This is but one area where O'Neil hits the nail on the head.
I finished the book feeling quite pessimistic; it paints the picture of a world whereby the potential for an individual is limited by a set of key stats - many of which are preordained. It's clear that if we wish to promote the concept of social mobility then we need to confront the way in which decisions are made that directly affect the opportunities afforded to an individual, and in today's world many of those opportunities are decided not by people, but by simplistic algorithms and statistical models. Can we do better though?
[1] One major issue arose when I was working on an absence management platform; it gave HR departments the power to pre-empt and chastise - often ill - members of staff. The data collected could've been immensely powerful if used to provide insights into a staff member's overall wellbeing, encouraging early intervention and support. Unfortunately these ideas were simply ignored, as they didn't fit in with the perceived business requirements.