How corporations got all your data

Hello and welcome to another episode of The Weeds. I'm Jonquilyn Hill. This week, we're taking some time to work on some upcoming episodes we're really excited about. So we wanted to share a conversation our colleagues had on The Gray Area. It's about corporations and how they have so much access to our data. It's a real deep dive, not just into how tech companies are getting our information now, but how companies have had access to us for centuries. We hope you enjoy. I'll pass it over to Sean Illing. If you spend a bunch of time on the internet, you're often asked by giant corporations to do something pretty extraordinary. We have updated our privacy policy. Please read the following and click Agree to continue using our service. We collect data that identifies you, such as your name, address. You're asked to sign a contract. It's long, full of obscure legal language. You don't really understand it and you definitely don't read it. You just wanna check some sports scores or message with friends. And if you wanna do those things, you gotta click I agree. Or it's game over, you can't use it. Your transactions, the precise location data of your mobile device, certain metadata recording the settings of your camera. What you're agreeing to is never clear, but you vaguely know you're allowing some corporation somewhere to harvest and sell your data. And you might even be okay with the idea that that data will be used to profile you. Just clicking on an advertisement. Or the user segment or category into which you as a user fall. For example: female, 20 to 49 years old, interested in sneakers. Model or device type, operating system, and... So, you click accept. How did we get to this strange place? I'm Sean Illing, and this is The Gray Area. My guest today is Matthew L. Jones. He's a professor of the history of science and technology at Columbia, and the co-author of a new book written with the data scientist Chris Wiggins called How Data Happened. It's the story of how we wound up here in a world where tech giants have access to a constant stream of data about us. For Jones and Wiggins, the opaque data-driven algorithms that have more and more influence today are actually the result of a campaign that goes back centuries. It's the fulfillment of a broader effort to reduce the world and people to controllable, predictable machines. Which is why they see the history of data as the history of power. Matthew Jones, welcome to the show. Hey, thanks for having me, Sean. This is a big book, man. A pretty ambitious book. What the hell was the elevator pitch? How did we get to where we are now? We were sitting with a bunch of students, and the students were like, how is it that this happened? And we said, well, we actually need to go back to the early 19th century and figure out, how is it that our law changed? How is it that technology changed? How is it that we allowed ourselves to collect so much data on people? And then what happened with the way we could analyze that in the last 25 years? We want to tell that story. Yeah, that's interesting. You talk about this in the beginning of the book, how the book sort of grew out of this course that you were teaching at Columbia with your co-author about the history of data. And the students, you found, wanted a lot more than just the history. They wanted to understand the relationship between data and power. Is that actually the right starting point here?
Do we need to first approach data as an instrument of power and not as some neutral tool or some technology for helping us map reality? So the problem with thinking about it as a neutral tool is that it never is. There's always a point at which one has to make sort of fundamental decisions about what and who to record. And so from its very earliest period, data really is about who has authority to make different kinds of decisions, decisions about the state, decisions about the organization of health, decisions about the organization of schools. And so one of the things we wanted to equip students with is this sense of how to think about the way technologies enable different kinds of power from the very start. So one of the ways that we like to do this with students is we have them open up a data set on the computer and ask them questions: Where did this come from? Who made these kinds of decisions? Like if it's data on animal colic, all of a sudden you have quantitative values for what are hard clinical decisions by veterinarians. And it's a choice about who has the decision-making authority over those kinds of choices. When that is scaled to billions of users, then you've got some real concerns about what it is that data is recording, who gets to use it, and who is empowered by it. So that's why our analytic from the very start involves power as well as what data enables. When we're talking about power, the question is what power? Whose power? And I think it may help to let you unpack a distinction that you make in the book between state power, corporate power, and people power. Yeah, so that three-part distinction is really intended to be an easy way to think about some of the major players that we are dealing with right now and have been dealing with for some time. And state power, especially in the US context, involves not just the federal government and not just state governments, but municipalities and other sorts of regulators and localities. The broad range of governmental institutions that both regulate the use of power, but also create the conditions under which other entities are able to do things. So the way that regulation allows corporate forms to exist. That is an enabling sort of thing. Corporate power, I think, is relatively more straightforward, except that when we think about large corporations, it's easy to think of them as sort of homogeneous entities that make good or bad decisions from the top down, when they're really complicated, sprawling beasts that we need to understand. People power is something that we use to capture the broad array of activities that individual people can do, both as consumers, it's often used in that narrow consumer sense, or as voters, or most powerfully in all sorts of diverse coalitions, including civil society coalitions, coalitions with labor unions. You know, what might from a sort of purity-of-politics standpoint involve odd coalitions. Coalitions between, say, large corporations with interests in certain aspects of privacy on the internet and things like the ACLU and other organizations. So we wanted to give people a sense of the diversity of powers without saying, oh, this is gonna be the easy solution, a panacea to our problems, because that's not a very honest appraisal of how we're going to effect any change. No doubt about that. Why start this history of data at the end of the 18th century or really the beginning of the 19th century? I mean, this is when the word statistics enters the English language.
But is this also the period in which the concept of data as we understand it now was really born? There are two reasons. One is that there had just been enormous success in a whole series of new mathematical tools for understanding the celestial world, the stars and these sorts of things. And so they were at a moment of intense cultural prestige. And for a variety of people, it then became enormously tempting to think: What would happen if we could apply these to humankind? Could we have such advancement and learn more about everything from the nature of people to things like crops and geology and this kind of thing? The second facet is that it was a moment, a sort of turning point, in the systematic collection of information, often in quantified form, that just accelerates and really takes off in the 19th century and increasingly becomes a state capacity. So the US Constitution is actually quite meaningful here because the census is included as a fundamental task of the federal government and really marks a turning point. So it's a moment in which there's an aspiration to use new kinds of mathematical tools to get control and understand the world, combined with an ever-increasing collection of quantitative data on peoples and things. Right, and this is something I really wanna emphasize. What happens around this time is that it's not that we just have a new method of understanding social reality. To me, it's almost an entirely new way of thinking about the world, where this instinct to reduce the natural world to a predictable, controllable machine spills into what we would now call the social sciences. That is really the fundamental shift we're talking about here, right? Absolutely, and it was understood as incredibly dangerous by those who wanted to protect more traditional ways of understanding what we might think of at the time as social, religious, political life, and to replace those with pretty dramatically different tools and bases of knowledge, that is, collections of quantified data, usually, and mathematical tools for analyzing them. And when it did so, it allowed you to describe things in a very different way. The flip side of that new form of description, which only becomes more complex up to our own day, is the question of what that means for what we ought to do, how we ought to organize society. So it was both an authority in terms of how you describe the world, how do we know human beings? Well, not by writing novels. And how do we tell them what to do? Not necessarily by writing philosophical theses, but by creating systems of governance or systems of organization of a corporation, which depend on precisely those forms of numerical description. So it ends up transforming life in dramatic ways that often we don't even realize. It becomes not just a tool of describing and accounting, it becomes a tool for prescribing. Absolutely. And that's a huge difference. Yeah, so that's what we want to understand. And we experience it all the time without knowing it. Like every high school student who's gonna be taking the SAT and having facets of their future determined, that's part of this long history of collecting lots of data and then saying, look, this is a so-called objective sorting mechanism that allows us to organize our society. It's just one of countless examples, long before our current moment of machine learning and AI, of the way that happens. World War II and the early post-war period are pretty important chapters in this history. Yeah.
And I think that most people would be aware of the stories about Alan Turing and the legendary code breaking that helped win the war for the Allies. Shout out Benedict Cumberbatch for that. But you and your co-author look beyond that and really regard this as a moment, a period in which data leaps into the corporate world in a new and profound way. Tell me about what happens here. So in the wake of World War II, you have a proliferation of really increasingly large systems that can accumulate data on everything, like every flight that people are taking, and accumulate that and have it there for analysis. The same thing is happening in the government. And alongside that is a push to say, what can we do with all this data? How are we going to make it useful to, say, marketing departments or finance departments or other facets of the corporate enterprise? On the one hand, there's ever greater desire to collect data, and then a push to come up with new kinds of tools to do so. And as I said, this happens in parallel in the corporate world and the government world. And in fact, agencies like the NSA are helping to pay for the development of things like hard disk drives, which are going to become central to this sort of enterprise. But it also ends up having ramifications for what we consider to be the legitimacy of the recording and analysis of that data. Older rules around privacy are sort of unclear in all of this domain. And the collection of data begins and really gets going without a lot of reflection about the privacy implications of that. As the tools develop, it becomes more and more an issue of concern, particularly over the course of the '60s, as people in the United States become, let's say, more skeptical of government claims around Vietnam, but not exclusively around Vietnam. So is this a case, and this will be a recurring theme, I'm sure, throughout, of technology sort of outpacing our laws and our society's ability to adapt and keep pace with all of these changes that are happening in the scientific and the technological realms? Yeah, so people will constantly and rightly say that, but it's always conjoined to something else. Between technology and law there's a misfit, and that is absolutely true. But there are a lot of people who have very strong reasons for telling us what the technology says the law needs to be. And that's one of the key things that we need to question. Almost anyone who says that the technology requires the law to be changed to, say, allow a new form of wiretapping or to overcome sort of older-style privacy protections is probably either trying to build up some sort of new capacities or sell you something. And so we need to recognize at once that law and technology are rarely in lockstep, and that many of the people proffering answers to that are doing so with very particular sets of reasons in mind. And we need to be skeptical of how easy that movement is. What kind of reasons do you have in mind there? Because you said it with a nefarious tone. Well, so there is a little bit of nefariousness there. I don't want to feed conspiracy theories, but around 2000, the US government and agencies like the FBI and the NSA really felt constrained by the existing laws around wiretapping and what they could look into. And so they started producing policy papers that said the same thing that you were hearing in the corporate world. The law is out of step with the technology. What we need to do is simply update the law.
And so they provided, time and again, to senators and representatives examples of updating the law, which they said would simply allow them to do the same kinds of things they did with older technologies. And they weren't able to get this through in the late '90s, but right after 9/11 and the passing of the Patriot Act, they got a bunch of transformations that they claimed simply to be updating the law. But it was an update that, for example, took the authority to record, say, the phone numbers dialed on a single phone plugged into a wall and transformed it into the authority to collect the phone numbers everyone is dialing across the United States. And that, in fact, happened and has been subsequently contested. But it was an example of someone making a very clever move around the idea that law needs to be updated, which is true, and doing so in a way that is far from obvious. There was one more period of the 20th century that I did want to flag. And it's the period a little bit before World War II, when what we think of now as the public relations or the marketing industry was really born. And it turns out this is also an important event in the history of data. I think of the 20th century as the century in which power was forced, for the most part, to accept that if you were going to control people, you couldn't really do it by force. You had to do it by manufacturing consent, to use a famous phrase. And all that really meant is that you had to guide behavior by manipulating public opinion. And the founder of public relations is this guy named Edward Bernays, who you talk about in the book quite a bit. Tell me a little bit about why him and why you say, and now I'm quoting you, his vision was a century ahead of its time. Yeah, so we began with him because he was such a clear expositor of the need, both of corporations and of governments, to use what he very openly called propaganda, which didn't quite have the pejorative taste it has for us. This was a necessary component. He was dealing with what he felt to be sort of fundamental needs of both the corporate sector and the government sector. But it was against the backdrop of something that might seem kind of familiar: an incredibly fractured information realm in which there were radical, to our minds incredibly untruthful, campaigns about things like the beginning of the New Deal and other sorts of things. So there was a sense that Bernays embodied, that one needed to be able to shape these information domains, and that the tools that you would build to shape information domains could allow you to do all kinds of things. And one needed only to look over at what was happening initially in fascist Italy and then in Nazi Germany to see just how powerful propaganda was going to be. And to not avail oneself of that was a fundamental mistake. And so it's easy to pin some blame on Bernays, but on the other hand, he had a sort of clarity about what was coming. Now what happens, unfortunately, is we see the sort of conspiratorial side of it, like the pharmaceutical companies get totally into selling drugs after the huge success of antibiotics that transforms the medical world. And they do lots of nefarious things. On the other hand, what's lost in a lot of people's toolbox is a sense that persuasion is a fundamental task that lots of people are going to have to draw on, and to denude yourself of that capacity, and the capacity to see it at work, is profoundly dangerous. 100%.
A fun historical fact that we just cannot circumnavigate here is that Bernays was Sigmund Freud's nephew. Right. You can make of that what you will, audience. But I love that you mentioned persuasion, right? Because Bernays was right, in my opinion, to notice the importance of persuasion in an open democratic society. People have to rely on external sources to learn about a world that's too big for them to understand firsthand. So there will always be a war to shape our perceptions and, by extension, our understanding of reality, our opinions and beliefs about reality. But now we have data and algorithms to optimize persuasive messaging. And that's a whole new world. Yeah, it's absolutely transformative in terms of the way that it allows for the granular understanding of different sorts of people, and then the shaping of their preferences over time. And even if all of the online advertising doesn't work as promised, it does transform people. And it changes the cultural worlds in which they operate and certainly changes the informational worlds in which they operate. And the line from Bernays to the modern attention economy is pretty straight and clear. And there's a quote in your book from Herbert Simon, and you can say who that is, that explains a lot, or certainly tees this up. So I'm just going to read it to you and then you can take it. Now I'm quoting: In an information-rich world, the wealth of information means a scarcity of whatever it is that information consumes. What information consumes is rather obvious: it consumes the attention of its recipients. And what we have today that we didn't have in Bernays' time is obviously computers and the internet and an overabundance, a superabundance, of information. Why is that such a game changer in terms of the power of data in our lives? Because there was a tremendous problem. Now, the idea that there's too much information to know is a kind of old, old trope. And it's always led people to invent ways to summarize information or allow us to get access to the information we need. Like the index in a book might seem like nothing, but it's a radical innovation of the past that allows you access. This problem became all the more marked with the explosion of information that became available over the course of the 20th century, and then especially with the explosion of the internet from the mid-'90s on. And the problem that Simon saw, and Simon is a very interesting figure. He's at once one of the most important people in the development of artificial intelligence, and he's a Nobel Prize winner in economics, interested precisely in the limits of humans to reason. Like often an economist will have a vision that people are perfectly rational and have access to all information. Neither of those things is true at all. Simon was profoundly interested in people who had limited access to information. Even if they had all the information, they could only attend to a part of it. And they had limited time to think it through. So Simon saw very clearly that this was going to require new kinds of technical solutions. And those technical solutions, many people thought, were going to be simply about finding data. It's hard to imagine, but the moment right before Google came online, for those of us old enough to remember such things, there was a sense that search had completely failed, that there was no way that we were going to figure out the way through the universe of the internet. And then Google, through a tremendous technical innovation, came online.
But something was missing from a story that tells you it's only about information. It's as if we're the kind of beings that operate only on pure, good information. We are far more complicated than that, obviously. And one facet of that is how that information, right or wrong, gets marshaled in more or less persuasive ways. So the attention economy, it often seemed like it was going to be about whether I'm going to attend to the right kind of information. And if only everyone got to the same sets of facts about the economy, then in a kind of technocratic way, we would all come to consensus. Now, if anything, that's not what happened. What turned out is that information not about the things that we wanted to look up, but information about the people looking those things up, and information about what people thought was authoritative, ended up allowing the creation of intense profiles of nearly every single person, almost everyone with a cell phone. Today, that information becomes leveraged to provide answers to the question of the limits of attention: What do you provide people? Coupled with the accumulation of information on different sorts of people, systems were devised that would leverage the pictures we had of people and provide them more of the information, not necessarily the information that was most factual, not necessarily the information that we think they need, but the information that in some sense would most engage them. And so the metric was not correctness, it was engagement. ♪♪ We're all carrying around these little machines designed to collect our data and surface ads. How has this changed the way we see the world? That's coming up after a quick break. ♪♪ This episode is sponsored by Ramp. When you spend your time deep in the weeds, things can sometimes feel overwhelming. That goes for pressing policy issues and for business budgeting, too. You might need help to get yourself organized, and with Ramp, you can easily track all your business expenses. Ramp is a corporate card and expense management software designed to help you save time and rein in expenses. Ramp can help you control spending, save time, and automate busy work. From corporate cards and expense management to bill payments and accounting integrations, Ramp has helped over 10,000 businesses save a total of over $270 million to date. Plus, Ramp can give your finance teams unprecedented control and insight into company spending. You'll be able to issue cards to every employee with limits and restrictions and automated expense reporting, so you can stop wasting time at the end of every month. You can get $250 when you join Ramp if you go to ramp.com slash weeds. That's ramp, R-A-M-P, dot com slash weeds. ♪♪ Support for this episode comes from Mint Mobile. Listen, you're a smart person, probably. I mean, if you're listening to a policy podcast for fun, it's likely you're a fan of doing all your research and feeling confident in your choices. So, you might be a little skeptical about a plan offered at just $15 a month for your phone bill. But by going online only and eliminating the traditional costs of retail, Mint Mobile is able to provide their wireless service at an affordable price without hidden fees. And because families come in all shapes and sizes, at Mint Mobile, family plans start at just two people. You can use your own phone with any Mint Mobile plan and keep your same phone number, along with all your existing contacts.
All plans come with unlimited talk and text and high-speed data delivered on the nation's largest 5G network. To get your new wireless plan for just $15 a month and get the plan shipped to your door for free, you can go to mintmobile.com slash weeds. That's mintmobile.com slash weeds. You can cut your wireless bill to $15 a month at mintmobile.com slash weeds. ♪♪ ♪♪ You know, what we're really kind of dancing around here is this notion of surveillance capitalism, which is hard to define but is important to understand nevertheless. And you put a question to the reader in the book that I think might help. So I'm going to put it to you here and let you answer it. Great. You ask: What does advertising, in its recent machine-learning-optimized form, mean for the way that we all construct reality from the world we perceive? What does it mean that our primary source of truth, delivered to us in the palms of our hands, is funded by and optimized for the surveillance ad model? What does that mean, Matt? So when we look at the causes of what allows us to have the information that each of us has at hand, it's easy to put the onus on individuals and say, look, people are looking in the wrong kinds of places. But that mistakes the way in which a large set of corporations have been able to use technologies to better understand us, where we are understood as a long list of actions, of purchasing things, of looking at videos, of reading tweets, of watching television shows, and to come up with a picture of us. And then that picture gets conjoined to the question of how it can be used to encourage people to buy more, or to watch more of a certain kind of video, or to engage in certain kinds of reading of news stories online. So it's a picture that understands the conditions under which each person lives in a particular information environment, structured and mediated by devices, these days primarily our phones, but also our smart TVs and computers, and that leads them to a particular way of understanding the world, a way of understanding their social and economic condition, a way of understanding politics. So the term surveillance capitalism, which was really powerfully introduced by Shoshana Zuboff, is a really important analytical tool. The one caveat I would raise about it is that it suggests that there's something absolutely dramatically different from the capitalism that came before. But there are a lot of aspects of this that pre-exist it: the advertising economy was very much central, interested in profiling people and controlling their information worlds. Right, and what I would say is new, and this is Zuboff's point, is that merely by living so much of our lives on these virtual platforms, we're generating all this data for companies, much more than they ever thought they needed, and this data was just lying around like excess waste for a while until they realized it had enormous predictive value. And then they started selling it to other private actors, and then boom, that's the basis of online advertising. And it amounts to social engineering in practice, and most crucially, it was done without asking our permission, hence the phrase surveillance capitalism. Right, I just wanted to note that. Yeah, and I think the story of it never asking our permission is a really important one.
Huge. It's a case where there are lots of people who will justify this never asking our permission. They'll say this is precisely what allowed the United States to become the most dynamic economy of the '80s, '90s, and 2000s: that we didn't have heavy-handed regulation that prevented the use and sale, the constant selling, of information on US consumers. That began with credit agencies, and it predates the internet considerably, but it set the conditions, and the legal world of not asking, of not notifying, of not thinking about how we constrain these things, that we are still fighting against today. And right now we're in a really interesting moment, because there is a great interest in privacy legislation. We've been there before and it's not happened, not in a full-throated way. You know, the fascinating, that's the wrong word, let me go with depressing. The depressing thing is that it didn't have to be this way. You're super clear about this in the book when you talk about the rise of Google: initially, in those early days, there wasn't any real thought of monetizing the algorithmic infrastructure through advertising. They could have gone another way, as you mention: subscriptions or affiliation fees, sponsored links, whatever. But ads won out. And let's just say that's not been great for the world. Yeah, and often people will say, well, none of these other things worked. We don't know that. The huge success recently of the New York Times subscription service really is a sign that news didn't have to become all driven by the ad model, but until it succeeded, a lot of people said there was no way anyone would pay for news. There really was an earlier moment in which there was a real possibility that we could figure out different ways of organizing things, which would have meant different ways of organizing law around how journalists are going to get paid. We still don't know this, right? No, because the current model is not succeeding. But also, how do we help people direct their attention to the things they want? And as you said, the initial innovations around Google were extraordinary, because what it did is it leveraged the entire internet, and what people thought was important on the internet, in order to help drive that. Now, it was never neutral, and it was never going to be neutral. But it was differently problematic than one that is driven first and foremost by optimizing for advertising. So the big question for these massive tech companies, once they were married to the ad model, was: How do we get people to click on ads? That's the whole game, right? And the best data science we have has been marshaled in service of this goal. And I suppose your recurring point throughout the book is that this is the latest iteration of a long-running story about data being used to prop up power, in this case corporate power, but it's being propped up, and this is me now, at the expense of our minds, really, and certainly at the expense of our social and political stability. Yeah, and so one thing about any kind of process of using data to analyze: there are always some sorts of end goals built into it. What is it that you care to do with it? We might be able to find really dramatically different ways of curing cancer, or, as we have now been able to do, produce devices that can understand spoken language to such an extent that it really enables millions of people to have interactions and to work with devices. Or they might not, and that is absolutely true.
But if those tools are mostly focused on a particular small set of what they would call optimization metrics, like getting people to watch YouTube, then you have really dramatic and negative effects, and we're seeing those, and exactly how negative they are is going to remain for the historians of the future to really tell. But one of the reasons we don't just sort of say everything is a disaster, all data is bad, is because you need only think of an example, like in the early '60s, when the FDA, the US FDA, got the power to regulate drugs, and this was a good thing, and it did so using some of the data tools that we talked about earlier in the book. It was able to push back against manufacturers claiming that drugs were efficacious. So a lot of the tools we're talking about, if put to different ends, can work for dramatically different purposes. Unfortunately, that is too rarely the case. Who is it you quote in the book? I think it's Jeffrey Hammerbacher, an engineer who worked at Facebook, and he's quoted as saying, the best minds of my generation are thinking about how to make people click on ads. It made me think about how you had all of this scientific talent from the best schools in the country being plucked in order to engineer destructive financial products, say, that ended up wiping out firefighters' pensions or whatever. I mean, we have just this obscene misuse of human ingenuity and intelligence for the most insidious ends. It's just making money and capturing people's attention. It's really depressing. Yeah, and it can be sort of heartbreaking. You think there are now more people than ever before who are highly educated in the use of these technologies, which could be put to all kinds of use. But just as most people going to law school, even if they're very publicly spirited, for pretty understandable reasons end up not doing, say, public service law, the same thing applies to all kinds of people who get into this remarkable world of data analysis and machine learning. We have been picking on the corporations, rightly so, and the same thing happens in the government, but also think in terms of the incentive systems in modern scientific circles that lead to certain kinds of focuses and staying away from other kinds of focuses. And that has macro effects, just like you're talking about. I mean, this whole story about the emergence of the attention economy is a powerful example of how technological changes are simply happening without any democratic input or accountability. And they're driven entirely by commercial motives. And the result is that we're all part of probably the greatest experiment in social engineering in human history. And I think it's been immeasurably bad for us as individuals and as a society. These technologies are designed to capture and hold our attention as much as possible. And what captures and holds our attention, unfortunately, is outrage and spectacle. The instinct for diversion, for entertainment, is very, very powerful. I succumb to it every day. And these companies know that. And it's how they keep us plugged in. And keeping us plugged in is how they keep generating all this data, which ultimately makes them more wealthy and more powerful. It's a hell of a circle we're in here. Yeah, it's a hell of a circle. And one of the problems is it's also seen as almost inevitable. Yes. The stories that are being told, and this is why we want to tell a different story, are: oh, well, are you going to resist technology?
That's not at all the right story, because technology doesn't have just one line of development. And it doesn't exist absent all these other things. Absent a particular permissive regulatory framework, you never have the development of this particular way of using data. You could still have remarkable new machine learning technologies. And so the more we tell a narrative that says, oh, it's technology driving it, politics is backwards, existing social groups are backwards, the more we disempower ourselves from thinking about how we could turn that technology toward the things that we collectively care about and that we individually care about. And so to understand that technological choices are not autonomous is really to understand that things could be otherwise. And then to begin thinking about, well, how on earth are we going to do that, given that so many of the interests involved are among the most powerful entities, not just today, but in all of human history. ♪♪ What's our best hope for putting limits on the power of these tech companies? I'll ask Matthew after one last short break. ♪♪ ♪♪ Hey, this is Noah. I'm the host of Vox's science podcast, Unexplainable. It's a show all about scientific questions that are still unanswered and everything we can learn by diving into the unknown. One of my favorite things about my job is hearing directly from our audience about why they listen. Here's one email I really love. This listener said: I find myself inspired to go study science, learn, and conduct my own research one day. It instills in me dreams for my future of being one of the scientists podcasts like yours call, or being someone who discovers a long-held mystery. I can't explain how much this means to me. Honestly, reading an email like that, I just feel so lucky to be part of a show that makes this kind of impact and has these kinds of wonderful listeners who take their time to tell us what the show means to them. And I'm really glad Vox makes Unexplainable free and accessible to everyone. Part of what makes this possible is the Vox Contributions program. In April, we're aiming to add 1,500 contributors to celebrate Vox's ninth anniversary. You can go to vox.com slash give today to help us reach our goal. There's also a link to give in the show notes. Thanks so much for all your support. ♪♪♪ Over the last few years, a big idea has taken root. Trees might be talking to each other. Some call it the wood-wide web. The wood-wide web. The wood-wide web. And it's all happening underground. Underground, there is this other world. A world of infinite biological pathways that connect trees and allow them to communicate, and allow the forest to behave as though it's a single organism. It's a beautiful image, but is it real? Lots of people potentially are sort of getting this, so much of a fantasy about how forests work rather than the true picture. It's a welcome debate. We need to lean in and figure out what is actually happening. This week on Unexplainable, the story of the talking trees and the pushback. Follow Unexplainable wherever you listen for new episodes every Wednesday. ♪♪♪ ♪♪♪ We talked a lot about this ongoing relationship between corporate power, state power, and people power. I mean, is state power, in your opinion, the best check on corporate power? I don't know that it's the best. It is an essential one. And it's essential at multiple levels. And so in the United States, it's clear that it's going to matter at the level of municipalities.
It's going to matter at state and county levels and then at the federal level. And why do we say this? Because there has been both success and loss in dealing with municipalities wanting to use essentially snake-oil things that are supposed to help, say, figure out where people are being shot, and that are incredibly problematic. And states, and here this echoes a sort of older conservative talking point, are seedbeds for innovative policymaking. And so in the privacy realm, after the European Union, California and then Illinois have really been pioneering the creation of legislation that is less easy for large corporations to perform regulatory capture on. And then ultimately, the federal government, I think, is going to have to be involved. And it's had some really important successes in privacy regulation in really key domains, domains around higher education and, probably to a lesser extent, health care. So I think those are all important. They're not adequate. And it is the case that there is a danger that federal regulation or any kind of state regulation can get in the way of interesting new technologies. Any kind of regulation shapes the environment in which corporations can exist. Like the idea of a joint-stock corporation is not a truth of nature. It's an innovation in human history, and a kind of peculiar one. But it shapes the ability to build certain kinds of things. And the same thing happens in terms of technical regulation about who's a provider and who's a publisher on the internet. So you can't think of it as an either-or. Is the state adequate? No, particularly because one thing about these new machine learning systems, these new data-driven systems, is that they are so capital-intensive. To run something like the chat models that are very much in the news, you can't do that with your home computer. My university does not have the resources to produce those kinds of models. Only a small set of very large corporations has that capacity. So we're always going to be dealing with incredibly large corporate institutions. This is the point you make in the book. Because of the persuasive slash coercive power of data and algorithms, corporations and the state will always have a big structural advantage over individual citizens. And I know you say that we have to make data compatible with democracy, we have to make it serve democratic ends. I don't know what that looks like in a world with these disparities in resources and power. I don't know if the answer is just old-school antitrust laws or if we can look to sort of how Europe is dealing with the internet as a model. I don't know. The technologies we're talking about have helped engineer the situation. But what scares me is the extent to which the public has been fractured and atomized. A fractured population is a much more controllable population, because the barriers to mass mobilization are so extreme. In so many ways, it feels like the modern world is a deliberately engineered collective action problem. I think that's a beautiful way of putting it. One of the reasons we don't just provide a sort of cookie-cutter solution is this: How are we going to get over that sort of massive collective action problem? And I think it is going to involve actually occasionally taking sort of a really kind of cynical approach to the different interests of different actors. For example, Apple pushes itself very heavily in the privacy sphere, and it's easy to say, well, of course they're doing this for corporate interest.
They are, but that also means that they can become an ally on various sorts of fronts. Privacy is a very strange beast right now in that it involves parties pretty far to the right and pretty far to the left. That's a coalition that doesn't usually work and that many people will be unhappy with. I think a lot of the solidarity to come will involve all kinds of those sorts of coalitions. Now it probably does involve antitrust, to the extent that the number of extremely large and powerful corporations is so small that it gives us many fewer levers for any kind of vision of how cooperation is supposed to work to prevent, as the economists would say, negative externalities, as we're thinking about how we build coalitions that over the long term are going to reinforce the set of values that we think are necessary to substantiate the idea of a democratic society. I very much agree with something that you write near the end of the book about how we're transfixed by the sci-fi nightmares of Terminator robots or world-destroying superintelligence. And I'm not saying those are non-concerns. I still don't know how concerned I am. I just know that the reality is that the Terminator scenario remains sci-fi, but there are live, existing technologies that are already upending our social order right now. You mentioned ChatGPT. How much do AI and ChatGPT worry you? And if they don't, is there anything about the potential future of AI and data and algorithms that does, something that's not quite here yet, that you think might be around the bend? I think you don't want to totally give up the possibility that there might be some kind of singularity or robot apocalypse or something. It is such a figuration through sci-fi. But I do think attending to that means not attending to the way that these whole sets of technologies are already here. And one of the consistent narratives that people have shown, and that comes from a wide variety of scholars making the point that you're citing, is that many of the technologies that are most potentially oppressive and most limiting are first tried out on those populations having the least amount of power. And what happens, and what is happening with something like ChatGPT, is that they come to infringe on ever more powerful elements of society, with jobs that have been seen as immune to automation since the Industrial Revolution and whatnot. And I think tools like ChatGPT are likely to effect changes in my domain of the university, and in journalism; we're going to see them. But they're likely less to be these kinds of world-transformative events where an evil AI takes over, and more things that structurally look a lot like people losing jobs to factories and automation, and, more than that, the shifting of competencies to other sorts of people. One of the things that we haven't talked about is that almost all of the so-called AI, the machine learning of today, depends on vast amounts of human judgment, and more and more it depends on the low-paid human judgment of people dispersed around the world. It looks a lot like corporate supply chains everywhere. And so it's those kinds of more granular effects I think we need to attend to. And we need to be wary of thinking about the political and economic dangers as exclusively about some turning point in which an evil overlord starts killing us or putting us into the Matrix. It's not that we shouldn't worry about that, but the amount of our attention that we should divert to that should be rather small.
Do you think we can ever trust these corporate powers to make changes, to design and redesign their technologies such that they do serve just social ends? Could they even do that if they wanted to? I mean, this is a problem as old as politics, the fact that we don't agree on what constitutes just social ends to begin with. So how can we expect Facebook or Twitter to? I think trust in some sense is probably the wrong idiom. We're never going to trust them, and we ought not. What we ought to do is think about how we can devise things such that they serve some of the ends we need. So I'll give you an example from early in telecommunications. It's not obvious that everyone should have a connection to a telephone. And in fact, it's not very economically feasible. And yet it was required of AT&T. Now, that they didn't do out of the goodness of their own heart. And yet it did fundamentally transform life; the same thing with rural electrification. It's not that you trusted the electric companies, and certainly we don't right now; we don't trust them in the provision of these services. But it created a very different infrastructural world. So I give that example because it's not about blind trust. It's about creating the conditions in which they do more of that which, through various kinds of political processes, we deem to be most central. Of course, the problem all along, as we've been discussing, is just that our political processes are so poisoned that coming to collective decisions and then implementing them is incredibly difficult at this moment. And this is why I really wanted to linger on the anti-democratic nature of surveillance capitalism. I mean, democracy by design is not supposed to give us these final, authoritative answers on what is good or right or just. It is simply an ongoing conversation between citizens about what ought to be done. And the fact that these things are happening beyond the reach of the public is the problem. And that has to change first. Yeah, that's right. And I think one thing to think about there is that the democratic process, of course, always happens in some information environment. And there is no neutral information environment. Yeah. So one of the prods can be to think about how the information environment might look different. And the answers to that, I think, are going to involve technical and legal and social transformations. And it's not going to be some grand revolutionary moment. I think it's enormously important for us to reclaim our right to control our data, and to do so not simply through processes where we as individual people need to, like, opt in constantly or click on all sorts of things. We need both legal transformations that empower citizens, and we need technology where individual opting in is not the whole answer. One of the key moments in the history of the internet is the decision about cookies, which are these ways of profiling us that get built into browsers quite early: the defaults, the technical defaults, are to just enable them. And any time you make a default in favor of an absence of privacy, you're making a societal decision about the absence of privacy. So I think we need to make both large-scale legal transformations around privacy, but we also need to make all kinds of technical choices to enable privacy, including technical choices that make governments very unhappy. So that's not an easy answer. But I don't think easy answers are where we are.
The book is How Data Happened: A History from the Age of Reason to the Age of Algorithms. Matthew Jones, this was fun. Thanks so much for coming in today. Thank you for having me, Sean. This was wonderful. ♪♪ This episode was produced by Erikk Geannikis and engineered by Patrick Boyd. Alex Overington wrote the theme music, with additional engineering help from Brandon McFarland. The Weeds is produced by Sofi LaLonde, our editorial director is A.M. Hall, and I'm your host, Jonquilyn Hill. The Weeds is part of the Vox Media Podcast Network. ♪♪