Week of June 26, 2017

Weaponized Data Edition

1. Weaponized Data and American Inequality: Last week I linked to a paper finding minimal effects from minimum wage increases, unaware that an explosion of debate on this issue was about to occur. If you follow these things at all, you know that last Friday a paper on Seattle's minimum wage increase was released finding no job losses or cuts in hours. On Monday, a different paper was released finding large losses for households with minimum wage jobs. There's a whole lot out there now on the two papers so I'm not going to rehash those arguments (if you need to catch up, try this or this or this or just scroll through Twitter). I want to focus on the backstory of why the two papers were released so close to each other, because it's important for the future of research and policy-making. As detailed here, what appears to have happened is that researchers at UW shared an early draft of their paper (using tax data that is rarely available in minimum wage studies) with the Seattle mayor's office. The mayor's office didn't like the conclusions, so it asked a different set of researchers to write their own paper--and release it just before the planned release date of the UW paper. While I have no special insight into the exact details of what happened, the prospect that the report is accurate disturbs me a great deal. It's a blatant step toward what the author of the Seattle Weekly piece calls "weaponized data." Be afraid for evidence-based policy. Very afraid.

In other American inequality news on topics that yield strong confirmation bias reactions, Justin Fox reports on new work suggesting that occupational licensing actually crowded-in historically disadvantaged workers--seemingly the transparent rules of licensing reduced formal and informal discrimination that kept these groups underemployed. That's a very plausible story to me, though I generally also buy the anti-licensure arguments.

There's also new work on school vouchers, from Indiana, finding short-term declines in test scores, but later (over four years) gains. It's worth noting how claims for vouchers have down-shifted to "no harm and some students gain." But keeping on the weaponized data theme, the paper is not publicly available and was only obtained by ChalkBeat through public records requests. Apparently the study authors don't think it should be public until it's peer-reviewed, which illustrates the difference in norms in sociology and economics.

2. Our Algorithmic Overlords: Also a few weeks ago I linked to a story about how to tell if borrowers on online lending platforms were going to default, and to the book, Everybody Lies, from which it came. I said I was going to read the book and I started this week--and was immediately dismayed. The opening of the book discusses what search data--particularly searches on pornography websites--can tell us about Americans' hidden desires. You can see a summary in this deeply disappointing Vox piece (isn't Vox supposed to be better at thinking critically about this stuff?). There is no discussion of how such data might be biased or inaccurate, how a site's interface may interact with what people search for, or why we should believe that search data closely corresponds to "real life." In other words, it's an object lesson in the dangers of using data and algorithms without understanding the data or the people, social structures and institutions that generate it. So of course it's a best seller. Suffice it to say that I have radically revised down my faith in any of the book's conclusions.

In other data-generating processes of uncertain usefulness news, Google will stop showing ads inside Gmail based on scans of email content (illustrating the sucker's game that is attention, I had no idea they were still doing this; I hadn't noticed an ad in years). The nominal reason is to combat corporate hesitance to adopt Gmail and Google's suite of web apps. As someone in my Twitter feed noted, the real reason is that Google already has better information for targeting ads at you than your email provides.

3. Development Economics: David Evans at the World Bank had to teach middle schoolers (6th to 8th grade, also known as Hell on Earth) what development economics is, in 20 minutes. How did he do? I mean, other than not handing out copies of Experimental Conversations.

On a more serious note, here's an interesting new study on the persistence of gains from agricultural extension programs after those programs end. It's notable both for the cool design and for the positive results. I'm always happy to see results that suggest there's hope for getting poor households to adopt productivity-enhancing technology, whether they are farmers or retailers.

4. Household (and Drug Lord) Finance: Stay with me on this one, we'll get to the drug lords. One of the ongoing things I worry about in household finance is that it involves people, and people make bad decisions so predictably. Case in point--a major mistake that people make is chasing investment returns via "hot" stock-pickers. There have been major gains in this area as low-cost index funds have grown enormously. But now comes the Quincy Jones Streaming Music Blah Blah Blah Index fund. Yes, you read that right. It's a new kind of index fund that is an arbitrary set of stocks marketed with the name of someone famous nominally connected to the stocks in the index. Yes, the cynicism of the people creating products like this is annoying and frustrating. But the real problem of household finance is the people who will buy them. The enemy is us. Sigh.

In a variety of forums I've complained about over-zealous regulation of remittance providers under Anti-Money Laundering efforts. There's generally been little evidence that this is a significant danger. But last week 11 people were arrested at Atlanta remittance shops for laundering $40 million of illegal drug proceeds and sending them on to Mexico. Here's the kicker: the people doing the laundering were the Anti-Money Laundering staff at the remittance shops. The problem is the people.

And finally, a quick report on planned M-Pesa outages of up to 12 hours. It will be interesting to watch for effects on customer behavior.

5. Evidence-Based X: Returning to the opening theme, how should (hopefully not weaponized) data be used for making policy decisions? Andrew Gelman has a short post on "clinical significance" and "statistical significance" that should inspire long thoughts. Here's your clickbait: "Forget the hypotheses and the p-values entirely." I should note that Andrew has a new book that I'm going to get to much sooner now that I know I can just skim Everybody Lies: Teaching Statistics: A Bag of Tricks

A reddit group put together a map about the data in maps, illustrating where data is missing. Source: @maxcroser and reddit

