Uniformity and Diversity in the US Press, 1841-1884: A Computational Analysis of Space, Time and Content


Israel Science Foundation 1790/22 (2023-2026).


Dr. Zef Segal, The School of Media Studies, The College of Management Academic Studies.
Prof. Menahem Blondheim, The College of Management Academic Studies.

Our study uses computational tools to analyze the characteristics of the US journalistic network that operated between 1841 and 1884. Its expected scientific contribution lies in two areas: methodology and communication history. On the methodological level, the study will advance the use of computational tools on the large corpora of US newspapers stored at the Library of Congress. We will apply several advanced digital approaches and techniques to identify and trace the structure and contents of America’s journalistic networks. Among other methods, we will use tools developed for plagiarism detection to locate reuse of texts in nineteenth-century periodicals, as well as GIS mapping techniques to detect the direction, pace, and rhythm of news flows. We will also use LDA (Latent Dirichlet Allocation) topic modeling to detect temporal and regional clusters of topical content, in order to identify trends shared across multiple periodicals and geographical regions.
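As an illustration of the text-reuse idea, the following is a minimal sketch and not the project's actual pipeline. It assumes a common plagiarism-detection technique: each text is broken into overlapping word n-grams ("shingles"), and pairs of texts are scored by the Jaccard overlap of their shingle sets. The function names, the sample snippets, and the values of `n` and `threshold` are all hypothetical and chosen only for demonstration.

```python
import re
from itertools import combinations


def shingles(text, n=3):
    """Return the set of overlapping word n-grams in a text (lowercased)."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets; 0.0 if either set is empty."""
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def find_reuse(docs, n=3, threshold=0.2):
    """Return (name_1, name_2, score) for pairs whose overlap meets the threshold.

    `docs` maps a document name to its text. Pairs are sorted by
    descending score.
    """
    sets = {name: shingles(text, n) for name, text in docs.items()}
    pairs = []
    for x, y in combinations(sorted(sets), 2):
        score = jaccard(sets[x], sets[y])
        if score >= threshold:
            pairs.append((x, y, score))
    return sorted(pairs, key=lambda p: -p[2])


# Invented snippets, not taken from the corpus.
SAMPLE = {
    "paper_a": "The steamer arrived at New York this morning bringing "
               "news from Liverpool to the tenth instant",
    "paper_b": "By telegraph: the steamer arrived at New York this morning "
               "bringing news from Liverpool",
    "paper_c": "Cotton prices were steady in the New Orleans market "
               "throughout the week",
}

if __name__ == "__main__":
    for name_1, name_2, score in find_reuse(SAMPLE):
        print(f"{name_1} ~ {name_2}: {score:.3f}")
```

Comparing every pair of documents, as this sketch does, would not scale to a corpus of this size. A real system would likely need an indexing or hashing step to limit the comparisons, and would also have to tolerate OCR errors in the scanned pages.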

Consequently, this research will identify two primary types of networks: first, the flow of news along with its "viral" contents; and second, a weighted content-driven network, expressing changing ideologies and affinities. We will interpret the historical meaning of our findings in two interrelated contexts. The first is the development of the US communication environment in the wake of the transportation revolution and the emergence of technological and institutional innovations, including the telegraph and postal systems, as well as microeconomic and regulatory changes. The second and broader level of analysis will examine the interrelations between the structure of America’s journalistic output and parallel political, ideological, and social developments. In particular, we will look for links between the changing newspaper network and processes of national integration and diversity in the decades leading up to the Civil War, during the war itself, through Reconstruction, and into the Gilded Age.

Corpus size: 102,146 issues of 94 dailies across 44 years (1841-1884)

Figure 1: A map of the editorial locations of the newspapers in the corpus. The map was created with Tableau Public.