Harvard Symposium on Applied Computational Text Analysis - Abstracts

Chris Bail
Exposure to Opposing Views can Increase Political Polarization: Evidence from a Large-Scale Field Experiment on Social Media

There is mounting concern that social media sites contribute to political polarization by creating "echo chambers" that insulate people from opposing views about current events. We surveyed a large sample of Democrats and Republicans who visit Twitter at least three times each week about a range of social policy issues. One week later, we randomly assigned respondents to a treatment condition in which they were offered financial incentives to follow a Twitter bot for one month that exposed them to messages produced by elected officials, organizations, and other opinion leaders with opposing political ideologies. Respondents were re-surveyed at the end of the month to measure the effect of this treatment, and at regular intervals throughout the study period to monitor treatment compliance. Using a combination of techniques for causal inference and text analysis, we find that Republicans who followed a liberal Twitter bot became substantially more conservative post-treatment, and Democrats who followed a conservative Twitter bot became slightly more liberal post-treatment. These findings have important implications for the interdisciplinary literature on political polarization as well as the emerging field of computational social science.

Bart Bonikowski
Measuring Frames and Identities: What Computational Text Analysis Can Teach Us about Radical Politics

Scholarly and journalistic accounts of the recent successes of radical-right parties and candidates in Europe and the United States tend to conflate three concepts: populism, nationalism, and authoritarianism. The resulting lack of analytical clarity has hindered accounts of the causes and consequences of the radical turn in contemporary politics. In my ongoing work, I analytically separate the three constitutive elements of the radical right, theorize the relationship between them, and empirically examine temporal patterns in their supply and demand, that is, politicians’ discursive strategies and the corresponding public attitudes. The result is a theory of structural resonance, whereby a confluence of structural shocks has enabled outsider political actors to mobilize public fears and anxieties into powerful resentments against ethnic, racial, and religious minority groups. While a variety of methods lend themselves to this inquiry, computational text analysis is particularly useful for examining variation in populist and nationalist frames on the supply side, and fluctuations in the salience of nationhood on the demand side. In this presentation, I illustrate the findings yielded by this empirical approach and outline a methodological strategy for improving the measurement of populism and nationalism using large-scale textual corpora.

Taylor Brown
Qualities and Inequalities: Gendered Valuation in the Contemporary Art Market

Do women make art with different characteristics than men, or is women’s art valued differently for the same characteristics? Treating artworks as cultural texts, wherein qualities such as color, style, medium, and subject matter are their content, I implement machine learning classification on over 250,000 artworks produced by over 26,000 artists. I do so to assess how well an algorithm, provided with 1,000 salient qualities of an artwork, can be trained to correctly attribute that work to either a female or male artist. The results of this analysis provide the best empirical estimate to date of whether women make art with different characteristics than men, and what those distinguishing characteristics may be. I then use these results to conduct a matched-pair regression of artworks by female and male artists with similar creative profiles, predicting the outcome of gallery listing price. Preliminary results suggest not only that women and men make art with different characteristics, but also that their work is valued differently for the same characteristics, with men’s work being valued significantly higher across five mediums: painting, sculpture, textile, photography, and other. These findings are further explored through computational analysis of over 73,000 art criticism texts. Implications for future work on gender inequality, creative markets, and computational social science are discussed.

Paul DiMaggio, Clark Bernier, Charles Heckscher & David Mimno
Interaction Ritual Threads: Does Collins' IRC Theory Work On-Line?

Since the publication of Randall Collins’s Interaction Ritual Chains, much interaction has moved from face-to-face to online settings. We draw on Bakhtin’s theory of speech genres to adapt key concepts from IRC theory to the online world. Using more than 40,000 postings from two intranet discussions at a global corporation, we find that IRC theory helps predict which posts contribute to robust conversations by eliciting responses. As in face-to-face interaction rituals, shared topical focus and adherence to temporal rhythms both contribute to success. Thus IRC theory proves a valuable resource in understanding communication online.

Amir Goldberg
Duality in Diversity: Cultural Heterogeneity, Language, and Firm Performance

Existing literature often understands cultural diversity in organizations as presenting a trade-off between task coordination and creative problem-solving. This work assumes that diversity arises primarily through cultural differences between individuals. Drawing on the toolkit theory of culture, we propose that diversity can also exist within persons. We argue that interpersonal heterogeneity undermines coordination and portends worsening firm profitability, while intrapersonal heterogeneity facilitates creativity and leads to patenting success and positive market valuations. To evaluate these propositions, we use unsupervised learning to identify cultural content in employee reviews of nearly 500 publicly traded firms on a leading company review website and develop novel, time-varying measures of cultural heterogeneity. We demonstrate that a diversity of cultural beliefs in an organization does not necessarily impose a trade-off between operational efficiency and creativity.

Gary King
How the media activate public expression and influence national agendas

This talk reports on the results of the first large-scale randomized news media experiment. We demonstrate that even small news media outlets can cause large numbers of Americans to take public stands on specific issues, join national policy conversations, and express themselves publicly (all key components of democratic politics) more often than they would otherwise. After recruiting 48 mostly small media outlets, and working with them over 5 years, we chose groups of these outlets to write and publish articles on subjects we approved, on dates we randomly assigned. We estimate the causal effect on proximal measures, such as website pageviews and Twitter discussion of the articles’ specific subjects, and distal ones, such as national conversation in broad policy areas. Our intervention increased discussion in each broad policy area by a substantial ≈62.7% (relative to a day’s volume), accounting for 13,166 additional posts over the treatment week, with similarly large effects across population subgroups. We also discuss the normative implications of these findings for individual journalists, the national ecosystem of media outlets, and democratic politics. This talk is based on work recently published in Science with Benjamin Schneer and Ariel White; for more information, see GaryKing.org/media.

Carly Knight
A Good Trust: Moral Discourse and the Rhetoric of Corporate Legitimation

An essential feature of American corporate capitalism is that corporations are considered to be natural market actors. In this project, I examine the rhetorical process by which corporations transformed from unnatural and threatening “creatures of the state” into legitimated, naturalized, and taken-for-granted market actors. Institutional approaches argue that new organizational forms become legitimated by complying, either symbolically or in fact, with pre-existing cultural norms. By contrast, I argue that the legitimation of big business required a wholesale reconceptualization in the nature of the corporate entity. Using a mixed method approach involving text analysis and targeted deep reading, I follow the changing language used to talk about the corporation in order to understand its legitimation. Drawing upon emerging work in morals and markets, my key insight is that corporate legitimation followed a process I term “moral bifurcation,” where moral argumentation legitimated an organizational form by drawing moral distinctions among particular organizational actors.

Ryan Light
Interviewer Effects and Historical Data: Race and the “Born in Slavery” Collection

Survey methodologists have long attended to the estimation of interviewer effects, but few researchers have focused on estimating interviewer effects in qualitative, historical data. The “Born in Slavery” data, collected from 1936 to 1938, are controversial due to the potential for race-based interviewer effects, as a majority of those interviewing formerly enslaved Americans were white. To more formally model race-based interviewer effects, I use text networks for document classification in addition to other text features. Findings indicate that the “Born in Slavery” corpus is, indeed, structured by race and that white interviewers are more likely to transcribe positive descriptions of life in the antebellum South.

Laura Nelson
Finding Simple Patterns in Complex Movements

Computational methods enable scholars to identify simple patterns that can help explain the social world, without sacrificing an understanding of the full diversity and complexity of social systems. Using two empirical examples – political cultures in feminist movements and the rise of lifestyle politics in the environmental movement – I employ computational network, clustering, and text analysis techniques to find simple yet insightful patterns in these two diverse and complex movements. This approach – finding patterns while embracing diversity – provides a more detailed yet comprehensive understanding of social movements than previously possible, and in a way that is methodologically rigorous and reproducible.

Alix Rule
Actor-oriented descriptions over the "longue durée"

Proposition: automated text analysis helps us craft what Clifford Geertz called “actor-oriented descriptions.” The descriptions generated by abstracting bodies of text are, at best, close but not deep. They can’t compete with ethnographic writing for richness, but they may allow us to capture what could never be observed by a single individual, thus opening up new empirical possibilities. I give examples from my recent work with JP Cointet: relying on neural embedding techniques, we model the organization of the social world visible in the coverage of The New York Times from 1900 to the present. We can navigate this model historically, compiling collections of ledes (events) that represent a given social context from distinct, consistent standpoints, over many decades. By abstracting these document collections, we achieve three forms of historical (i.e., diachronic) description, each with precedent in the work of historians. What’s the point? Consistent descriptions of context, I argue, promise new purchase on action that unfolds over the longue durée.