## Ken Steiglitz: Garage Rock and the Unknowable

Here is the second post in a series by The Discrete Charm of the Machine author Ken Steiglitz. You can access the first post here.

I sat down to draft The Discrete Charm of the Machine with the goal of explaining, without math, how we arrived at today’s digital world. It is a quasi-chronological story; I take what I need, when I need it, from the space of ideas. I start at the simplest point, describing why noise is a constant threat to information and how using discrete values (usually zeros and ones) affords protection and a permanence not possible with information in analog (continuous) form. From there I sketch the important ideas of digital signal processing (for sound and pictures), coding theory (for nearly error-free communication), complexity theory (for computation), and so on—a fine arc, I think, from the boomy and very analog console radios of my childhood to my elegant little internet radio.

Yet the path through the book is not quite so breezy and trouble-free. In the final three chapters we encounter three mysteries, each progressively more fundamental and thorny. I hope your curiosity and sense of wonder will be piqued; there are ample references to further reading. Here are the problems in a nutshell:

1. Is it no harder to find a solution to a problem than to merely check a solution? (Does P = NP?) This question comes up in studying the relative difficulty of solving problems with a computing machine. It is a mathematical question, and is still unresolved after almost 40 years of attack by computer scientists.
As I discuss in the book, there are plenty of reasons to believe that P is not equal to NP and most computer scientists come down on that side. But … no one knows for sure.
2. Are the digital computers we use today as powerful—in a practical sense—as any we can build in this universe (the extended Church-Turing thesis)? This is a physics question, and for that reason is fundamentally different from the P=NP question. Its answer depends on how the universe works.
The thesis is intimately tied to the problem of building machines that are essentially more powerful than today’s digital computers—the human brain is one popular candidate. The question runs deep: some believe there is magic to be found beyond the world of zeros and ones.
3. Can a machine be conscious? Philosopher David Chalmers calls this the hard problem, and considers it “the biggest mystery.” It is not a question of mathematics, nor of physics, but of philosophy and cognitive science.
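The first question can be made concrete with a small sketch. Subset sum is a classic NP problem: given a set of integers, is there a subset that adds up to a target? Checking a proposed subset takes only a quick pass over it, but the only known general ways to find one amount to search whose cost grows exponentially with the input size. (The code below is an illustration of this asymmetry, not an example from the book; the function names are my own.)

```python
from itertools import combinations

def check(numbers, target, candidate):
    """Verify a claimed solution quickly: is candidate a subset
    of numbers that sums to target? This is the 'easy to check' side."""
    return set(candidate) <= set(numbers) and sum(candidate) == target

def find(numbers, target):
    """Find a solution by brute force: try every subset, smallest first.
    The number of subsets doubles with each added element, so this
    is the 'hard to find' side. Returns a solution or None."""
    for r in range(len(numbers) + 1):
        for subset in combinations(numbers, r):
            if sum(subset) == target:
                return list(subset)
    return None
```

If P = NP, there would be some clever method that finds solutions roughly as fast as `check` verifies them; the consensus guess, as the post says, is that no such method exists, but no one has proved it.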

I want to emphasize that this third question is not merely the modern equivalent of asking how many angels could dance on the point of a pin. The answer has the most serious consequences for us humans: it determines how we should treat our android creations, the inevitable products of our present rush to artificial intelligence. If machines are capable of suffering we have a moral responsibility to treat them compassionately.

My first reaction to the third question is that it is unanswerable. How can we know about the subjective mental life of anyone (or any thing) but ourselves? Philosopher Owen Flanagan called those who take this position mysterians, after the proto-punk band ? and the Mysterians. Michael Shermer joins this camp in his Scientific American column of July 1, 2018. I discuss the difficulty in the final chapter and remain agnostic—although I am hard-pressed even to imagine what form an answer would take.

I suggest, however, a pragmatic way around the big third question: Rather than risk harm, give the machines the benefit of the doubt. It is after all what we do for our fellow humans.

Ken Steiglitz is professor emeritus of computer science and senior scholar at Princeton University. His books include The Discrete Charm of the Machine, Combinatorial Optimization, A Digital Signal Processing Primer, and Snipers, Shills, and Sharks. He lives in Princeton, New Jersey.

## Matthew Salganik: Invisibilia, the Fragile Families Challenge, and Bit by Bit

This week’s episode of Invisibilia featured my research on the Fragile Families Challenge. The Challenge is a scientific mass collaboration that combines predictive modeling, causal inference, and in-depth interviews to yield insights that can improve the lives of disadvantaged children in the United States. Like many research projects, the Fragile Families Challenge emerged from a complex mix of inspirations. But, for me personally, a big part of the Fragile Families Challenge grew out of writing my new book Bit by Bit: Social Research in the Digital Age. In this post, I’ll describe how Bit by Bit helped give birth to the Fragile Families Challenge.

Bit by Bit is about social research in the age of big data. It is for social scientists who want to do more data science, data scientists who want to do more social science, and anyone interested in the combination of these two fields. Rather than being organized around specific data sources or machine learning methods, Bit by Bit progresses through four broad research designs: observing behavior, asking questions, running experiments, and creating mass collaboration. Each of these approaches requires a different relationship between researchers and participants, and each enables us to learn different things.

As I was working on Bit by Bit, many people seemed genuinely excited about most of the book—except the chapter on mass collaboration. When I talked about this chapter with colleagues and friends, I was often greeted with skepticism (or worse). Many of them felt that mass collaboration simply had no place in social research. In fact, at my book manuscript workshop—which was made up of people whom I deeply respected—the general consensus seemed to be that I should drop this chapter from Bit by Bit. But I felt strongly that it should be included, in part because it enabled researchers to do new and different kinds of things. The more time I spent defending the idea of mass collaboration for social research, the more I became convinced that it was really interesting, important, and exciting. So, once I finished up the manuscript for Bit by Bit, I set my sights on designing the mass collaboration that became the Fragile Families Challenge.

The Fragile Families Challenge, described in more detail at the project website and blog, should be seen as part of the larger landscape of mass collaboration research. Perhaps the best-known example of a mass collaboration solving a big intellectual problem is Wikipedia, where volunteers created a fantastic encyclopedia that is available to everyone.

Collaboration in research is nothing new, of course. What is new, however, is that the digital age enables collaboration with a much larger and more diverse set of people: the billions of people around the world with Internet access. I expect that these new mass collaborations will yield amazing results not just because of the number of people involved but also because of their diverse skills and perspectives. How can we incorporate everyone with an Internet connection into our research process? What could you do with 100 research assistants? What about 100,000 skilled collaborators?

As I write in Bit by Bit, I think it is helpful to roughly distinguish between three types of mass collaboration projects: human computation, open call, and distributed data collection.

Human computation projects are ideally suited for easy-task-big-scale problems, such as labeling a million images. These are projects that in the past might have been performed by undergraduate research assistants. Contributions to human computation projects don’t require specialized skills, and the final output is typically an average of all of the contributions. A classic example of a human computation project is Galaxy Zoo, where a hundred thousand volunteers helped astronomers classify a million galaxies.

Open call projects, on the other hand, are more suited for problems where you are looking for novel answers to clearly formulated questions. In the past, these are projects that might have involved asking colleagues. Contributions to open call projects come from people who may have specialized skills, and the final output is usually the best contribution. A classic example of an open call is the Netflix Prize, where thousands of scientists and hackers worked to develop new algorithms to predict customers’ ratings of movies.

Finally, distributed data collection projects are ideally suited for large-scale data collection. These are projects that in the past might have been performed by undergraduate research assistants or survey research companies. Contributions to distributed data collection projects typically come from people who have access to locations that researchers do not, and the final product is a simple collection of the contributions. A classic example of a distributed data collection is eBird, in which hundreds of thousands of volunteers contribute reports about birds they see.

Given this way of organizing things, you can think of the Fragile Families Challenge as an open call project, and when designing the Challenge, I drew inspiration from the other open call projects that I wrote about such as the Netflix Prize, Foldit, and Peer-to-Patent.

If you’d like to learn more about how mass collaboration can be used in social research, I’d recommend reading Chapter 5 of Bit by Bit or watching this talk I gave at Stanford in the Human-Computer Interaction Seminar. If you’d like to learn more about the Fragile Families Challenge, which is ongoing, I’d recommend our project website and blog.  Finally, if you are interested in social science in the age of big data, I’d recommend reading all of Bit by Bit: Social Research in the Digital Age.

Matthew J. Salganik is professor of sociology at Princeton University, where he is also affiliated with the Center for Information Technology Policy and the Center for Statistics and Machine Learning. His research has been funded by Microsoft, Facebook, and Google, and has been featured on NPR and in such publications as the New Yorker, the New York Times, and the Wall Street Journal.

## Matthew J. Salganik on Bit by Bit: Social Research in the Digital Age

In just the past several years, we have witnessed the birth and rapid spread of social media, mobile phones, and numerous other digital marvels. In addition to changing how we live, these tools enable us to collect and process data about human behavior on a scale never before imaginable, offering entirely new approaches to core questions about social behavior. Bit by Bit is the key to unlocking these powerful methods—a landmark book that will fundamentally change how the next generation of social scientists and data scientists explores the world around us. Matthew Salganik has provided an invaluable resource for social scientists who want to harness the research potential of big data and a must-read for data scientists interested in applying the lessons of social science to tomorrow’s technologies. Read on to learn more about the ideas in Bit by Bit.

Your book begins with a story about something that happened to you in graduate school. Can you talk a bit about that? How did that lead to the book?

Who is this book for?

This book is for social scientists who want to do more data science, data scientists who want to do more social science, and anyone interested in the hybrid of these two fields. I spend time with both social scientists and data scientists, and this book is my attempt to bring the ideas from the communities together in a way that avoids the jargon of either community.

In your talks, I’ve heard that you compare data science to a urinal.  What’s that about?

Well, I compare data science to a very specific, very special urinal: Fountain by the great French artist Marcel Duchamp. To create Fountain, Duchamp had a flash of creativity where he took something that was created for one purpose—going to the bathroom—and turned it into a piece of art. But most artists don’t work that way. For example, Michelangelo didn’t repurpose. When he wanted to create a statue of David, he didn’t look for a piece of marble that kind of looked like David: he spent three years laboring to create his masterpiece. David is not a readymade; it is a custommade.

These two styles—readymades and custommades—roughly map onto styles that can be employed for social research in the digital age. My book has examples of data scientists cleverly repurposing big data sources that were originally created by companies and governments. In other examples, however, social scientists start with a specific question and then use the tools of the digital age to create the data needed to answer that question. When done well, both of these styles can be incredibly powerful. Therefore, I expect that social research in the digital age will involve both readymades and custommades; it will involve both Duchamps and Michelangelos.

Bit by Bit devotes a lot of attention to ethics. Why?

The book provides many examples of how researchers can use the capabilities of the digital age to conduct exciting and important research. But, in my experience, researchers who wish to take advantage of these new opportunities will confront difficult ethical decisions. In the digital age, researchers—often in collaboration with companies and governments—have increasing power over the lives of participants. By power, I mean the ability to do things to people without their consent or even awareness. For example, researchers can now observe the behavior of millions of people, and researchers can also enroll millions of people in massive experiments. As the power of researchers is increasing, there has not been an equivalent increase in clarity about how that power should be used. In fact, researchers must decide how to exercise their power based on inconsistent and overlapping rules, laws, and norms. This combination of powerful capabilities and vague guidelines can force even well-meaning researchers to grapple with difficult decisions. In the book, I try to provide principles that can help researchers—whether they are in universities, governments, or companies—balance these issues and move forward in a responsible way.

Your book went through an unusual Open Review process in addition to peer review. Tell me about that.

That’s right. This book is about social research in the digital age, so I also wanted to publish it in a digital age way. As soon as I submitted the book manuscript for peer review, I also posted it online for an Open Review during which anyone in the world could read it and annotate it. During this Open Review process dozens of people left hundreds of annotations, and I combined these annotations with the feedback from peer review to produce a final manuscript. I was really happy with the annotations that I received, and they really helped me improve the book.

The Open Review process also allowed us to collect valuable data. Just as the New York Times is tracking which stories get read and for how long, we could see which parts of the book were being read, how people arrived at the book, and which parts of the book were causing people to stop reading.

Finally, the Open Review process helped us get the ideas in the book in front of the largest possible audience. During Open Review, we had readers from all over the world, and we even had a few course adoptions. Also, in addition to posting the manuscript in English, we machine translated it into more than 100 languages, and we saw that these other languages increased our traffic by about 20%.

Was putting your book through Open Review scary?

No, it was exhilarating. Our back-end analytics allowed me to see that people from around the world were reading it, and I loved the feedback that I received. Of course, I didn’t agree with all the annotations, but they were offered in a helpful spirit, and, as I said, many of them really improved the book.

Actually, the thing that is really scary to me is putting out a physical book that can’t be changed anymore. I wanted to get as much feedback as possible before the really scary thing happened.

And now you’ve made it easy for other authors to put their manuscripts through Open Review?

Absolutely. With a grant from the Sloan Foundation, we’ve released the Open Review Toolkit. It is open source software that enables authors and publishers to convert book manuscripts into a website that can be used for Open Review. And, as I said, during Open Review, you can receive valuable feedback to help improve your manuscript, feedback that is complementary to the feedback from peer review. During Open Review, you can also collect valuable data to help launch your book. Furthermore, all of these good things are happening at the same time that you are increasing access to scientific research, which is a core value of many authors and academic publishers.

Matthew J. Salganik is professor of sociology at Princeton University, where he is also affiliated with the Center for Information Technology Policy and the Center for Statistics and Machine Learning. His research has been funded by Microsoft, Facebook, and Google, and has been featured on NPR and in such publications as the New Yorker, the New York Times, and the Wall Street Journal.

## From “Brexit” to “dumpster fire”: Benjamin Peters on why digital keywords matter

In the digital age, words are increasingly important, with some taking on entirely different meanings in the digital world. Benjamin Peters’ new book, Digital Keywords: A Vocabulary of Information Society & Culture, presents modern humans as linguistic creatures whose cultural, economic, political, and social relations are inseparable from these “keywords.” Recently, Peters took the time to answer some questions about the book:

Why digital keywords? Why now?

BP: “Brexit” and “Trumpmentum.”

What are these but marked keywords that—together with, say, the trendy new phrase “dumpster fire”—trigger anxieties very much alive today? What work do such words do?

40 years ago, in 1976, the Welsh literary critic Raymond Williams published his classic Keywords: A Vocabulary of Culture and Society, establishing a critical and ongoing project for taking seriously the work of over 100 words in postindustrial Britain. This book, taking Williams as its (all too) timely inspiration, seeks to refresh the keywords project for English-language information societies and cultures worldwide.

This book seeks to change the conversation about the digital revolution of language at hand. The real world may not be made out of language but our access to it surely is. Modern humans are linguistic creatures: our cultural, economic, political, social, and other relations cannot be separated from the work our words do. And as everyone who has ever put pencil to paper knows, our words do not always oblige. This is especially true in the age of search. Digital keywords are both indispensable and tricky. They are ferociously important and often bite back.

Digital Keywords also seeks to offer a teachably different approach to “digital keywords” than the one currently championed, as a simple Google search will reveal, by the meddling reach of search engine optimization (SEO). No older than the OJ Simpson trial and valued at no less than $65 billion (about the economy of Nebraska), the SEO industry is arguably the dominant approach to taking keywords seriously online at the moment: and yet reason strains at the massive capital flows that, say, the term “insurance” alone commands. SEO, with its shady markets of pay-per-click advertising and results manipulation, cannot be the best approach to working with digital keywords.

How else might we begin (again)?

I’m hooked. So which keywords does the book take up? And what makes those words key?

BP: Let me answer that in reverse. As editor I figured I had a choice: I could either start by choosing the words I thought were key for the information age and then find people to write about them, or I could invite the best contributors to the project and then let them choose their keywords. As it happens, this volume does both. On the one hand, the appendix lists well over 200 candidate keywords—from access to zoom—and we’ll be soliciting other keywords to that growing list on the scholarly blog Culture Digitally this July.

On the other hand, the 25 words featured in this book are “key” simply because the scholars who populate this book demonstrate that they are. That may sound tautological, but I actually uphold it as the high standard in keyword scholarship: a word is key because it does meaningful social work in our lives. It is the task of each essay to prove such work. The reader too is invited to take up Williams’ search for themselves and to test these essays accordingly: do they convince that these terms, once understood, are somehow tectonic to the modern information society and culture—and why or why not? Which words would you add—and why?

Fair enough. Can you give us a sample of what the authors claim about their keywords?

BP: Sure thing. The freely available extended introduction critically frames the project as a first step toward a grammar for understanding terministic technologies; it also summarizes each essay and draws critical connections between them, so I won’t do any of that here. Since the book itself is organized alphabetically by keyword, I’ll list the essays alphabetically by author last name. Rosemary Avance critically reclaims community online and off, Saugata Bhaduri risks the collective action baked into gaming, Sandra Braman tackles Williams’ keyword flow in information systems, Gabriella Coleman decrypts hackers and their crafts, Jeffrey Drouin takes on document surrogates in copy cultures, Christina Dunbar-Hester critically appraises the gender in computing geeks, Adam Fish reflects on what mirror is doing in data mirroring, Hope Forsyth grounds the online forum in ancient Rome, Bernard Geoghegan telegraphs back the origins of modern information, Tarleton Gillespie demystifies the omnipresent algorithm, Katherine D. Harris unpacks the digital archive, Nicholas A. John rethinks sharing cultures online, Christopher Kelty unearths root causes and consequences of participation, Rasmus Kleis Nielsen separates democracy from digital technologies, John Durham Peters seeds an outpouring of the cloud in cloud computing, Steven Schrag reworks memory and its mental and mechanical discontents, Stephanie Ricker Schulte repossesses personalization, Limor Shifman reanimates the meme online, Julia Sonnevend theorizes events beyond media, Jonathan Sterne and I, separately, deconstruct the analog and digital binary, Thomas Streeter pluralizes the internet, Ted Striphas rereads culture alongside technology after Williams, Fred Turner goes Puritan on the Silicon Valley prototype, and Guobin Yang launches the book with the de-radicalizing of activism online.

Who is the audience for this book? Who are you writing for?

BP: Students, scholars, and general interest readers interested in the weighty role of language in the age of search in particular and the current information age in general. Ideally, each essay will prove plain and short enough (average length 3000 words) to sustain the attention of the distracted undergraduate, substantial enough to enrich the graduate student, and pointed enough to provoke constructive criticism from the most experienced scholar. Of course this ideal will not hold uniformly across this or any other volume, but this group of contributors, I must say, delivers on the whole, and that is enough for this editor.

I’m also excited to note that later this year Princeton University Press also plans to release for free download my teaching notes for this book. These notes aim to offer, in an easily editable format, enough material to teach the book as the main course text for a semester-long undergraduate or graduate course in media and communication studies. We hope this will benefit courses worldwide. Meanwhile, the scholarly blog Culture Digitally maintains, with Princeton University Press’ generous support, early drafts of a fair share of the published essays here.

Benjamin Peters is assistant professor of communication at the University of Tulsa in Tulsa, Oklahoma. He is also affiliated faculty at the Information Society Project at Yale Law School.