Vulnerable voices: using topic modeling to analyze newspaper coverage of climate change in 26 non-Annex I countries (2010–2020)

News media influence how climate change is represented, understood, and discussed in the public sphere. To date, media and climate change research has primarily focused on Annex I countries, or treated non-Annex I countries as a homogenous bloc, despite the global nature of climate change and its geographically uneven impacts. This study uses a mixed-method approach, combining machine learning (topic modeling), econometrics, and qualitative analyses, to investigate newspaper coverage of climate change in 26 non-Annex I countries. We compiled a dataset of 95 216 news articles (dated between 2010 and 2020, from 50 sources) in 26 lower-middle and upper-middle income non-Annex I countries. In line with previous research, we find that the most common topics are international governance of climate change, the economics of energy transitions, and the impacts of climate change. Advancing current understanding, we also demonstrate heterogeneity in coverage between non-Annex I countries and find that a country’s vulnerability to climate change is positively associated with the diversity of topics (based on an article-level entropy index) portrayed by its domestic news media outlets.


Introduction
Human-induced climate change severely threatens ecosystems (IPCC 2022) and human communities (Watts et al 2021). Media coverage critically influences how information reaches people, and how communities respond to climate impacts. In this study, we examine two key research questions: (1) what are the dominant themes of media coverage in lower-middle and upper-middle income non-Annex I countries? And (2) how do differences in the vulnerability of non-Annex I countries to climate change shape relevant news media coverage? The term 'non-Annex I' countries is derived from the text of the Kyoto Protocol where 35 countries from Europe and North America agreed on targets and timetables for emissions reductions; while these countries were listed in 'Annex I' of the treaty, other countries were referred to as 'non-Annex I' parties (Glantz 2003).
Focusing on climate media coverage in non-Annex I countries is crucial for two key reasons. First and foremost is the relative lack of scholarly attention to these countries. For example, in a systematic cross-national literature review, Comfort and Park (2018) found that nearly three-quarters of their articles examined media and climate coverage in the United States (US) and wealthy European Union (EU) nations. This is likely due to Annex I countries' historical and ongoing role in emitting greenhouse gases (IPCC 2022, Friedrich et al 2023) and reliable scholarly access to data archives (Boykoff et al 2023). Scholars have thus repeatedly called for more research on climate change media coverage in non-Annex I countries (e.g. Schäfer and Schlichting 2014, Ghosh and Boykoff 2019, Boykoff et al 2021).
Second, scholars often group and study non-Annex I countries as a homogenous bloc. For example, cross-national comparative research often compares Western and non-Western countries (e.g. Painter and Ashe 2012, Schmidt et al 2013, Broadbent et al 2016, Engesser and Brüggemann 2016, Brüggemann and Engesser 2017, Gurwitt et al 2017, Vu et al 2019, Painter et al 2020, Schäfer and Painter 2020, Hase et al 2021, Ejaz et al 2022). This work has undoubtedly led to several insights. For example, recent studies by Vu et al (2019) and Hase et al (2021) showcase differences in media frames between wealthier and poorer countries (e.g. as a matter of politics and science vs. impacts on humans and their daily lives respectively). Other studies, such as Ejaz et al (2022), have also demonstrated areas of increasing global consensus such as public attitudes toward climate change. Yet, such comparative research also has limitations. It does not necessarily account for the heterogeneity among non-Annex I countries in journalistic cultures and context (Finlay 2012, Ajaero and Anorue 2018, Hase et al 2021). The unique social, economic, and political conditions of countries complicate the interpretation of findings (Olausson and Berglez 2014) due to 'the often missing "functional equivalence" of measurements' between countries (Wirth and Kolb cited in Schmidt et al 2013, p 1234).
Our paper addresses these dual challenges by comparing an expansive set of economically similar non-Annex I print media over 11 years (2010–2020). We systematically build a dataset that focuses on highly vulnerable countries (Schäfer and Schlichting 2014), as indicated by their 'susceptibility to harm… exposure, sensitivity, and adaptive capacity' (Ford et al 2018, p 194). Our data sample consists of 95 216 articles from 50 news sources in 26 non-Annex I countries: Bangladesh, Botswana, China, Egypt, Ghana, India, Indonesia, Jordan, Kenya, Lebanon, Malaysia, Namibia, Nepal, Nigeria, Pakistan, Philippines, South Africa, Sri Lanka, Tanzania, Thailand, Tunisia, Ukraine, Uzbekistan, Vietnam, Zambia, and Zimbabwe (see supplementary materials, table S1). We selected these countries as they are among the most vulnerable to climate change (Chen et al 2015). By focusing on examining differences within a group of highly vulnerable, non-Annex I countries we build upon and extend insights from existing single-country studies in areas such as Bangladesh (Miah et al 2011), Botswana (Faimau
Focusing exclusively on vulnerable countries also allows us to systematically examine the interrelated nature of economic, social, and environmental challenges caused by climate change (Gasper et al 2011). The worst impacts of climate change are experienced unevenly, with populations in more vulnerable countries facing numerous inequalities (Thomas et al 2019, IPCC 2022, Ngcamu 2023). As the intensity and frequency of extreme climate and weather events increase, the co-occurrence of climate impacts on vulnerable populations also rises (AghaKouchak et al 2020, Ebi et al 2021, IPCC 2022). Extreme weather, water and food shortages, and disruption in critical services, among other climate impacts, can increase the risk for marginalized populations (Cuartas et al 2023). However, what remains empirically unexamined is whether and how the media in vulnerable countries portray the interrelatedness of such issues. We posit that, ceteris paribus, the news media from more vulnerable countries should report several types of climate impacts and consequences. Moreover, if the interrelated nature of climate change impacts is accurately reflected in the media, a focal news article in a comparatively more vulnerable country should also span (i.e. discuss) more topics rather than focusing its attention on a single topic (see methods for article 'topic diversity' computation). Formally, we test the following hypothesis at the news article level:
Hypothesis 1: There will be a positive association between country-level climate change vulnerability and the diversity of climate topics covered in relevant news media articles.

Sample selection and data collection
We selected lower- and upper-middle-income countries using World Bank (2022) income classification data, as they have traditionally lacked significant resources to address climate change. We did not select low-income countries due to limited data access throughout the study period (Okoliko and de Wit 2020  Schäfer and Painter 2020, Thackeray et al 2020, Hase et al 2021). From these lists, we sampled the highest circulating sources that also included access to the entire print article. We focused on English-language media, which have been viewed as international drivers of public discourse and agenda-setting (Sonwalkar 2002, Khatri et al 2016). Using English-language print sources also enabled us to manually and qualitatively cross-check the findings (see topic modeling section below). Figure S1 and table S1 of the supplementary materials provide more details on our sample.

Topic modeling
We used Latent Dirichlet Allocation (LDA) to uncover the major themes in our dataset. LDA is a probabilistic modeling technique for discovering the 'latent' topics in a large and unstructured collection of documents (Blei et al 2003, Griffiths and Steyvers 2004, Blei 2012), such as news articles, journals, blogs, and annual reports, among others. LDA assumes that a document is generated in a process that includes hidden distributions (e.g. the topic structures). Specifically, there are two distributions to be inferred. The first is the word distribution within a topic; the second is each document's topic distribution. The algorithm treats each word in a document (i.e. a news article in our study) as created through two generative probabilistic steps: (1) a topic is chosen among the set of topics with a priori distribution, and (2) a word is chosen from the selected topic, which is a distribution over a set of words. The topic distribution for each news article and the distribution of the words are estimated based on the set of documents. We manually read the top 100 articles (based on model-assigned topic probabilities) from each topic to verify the model's validity and cross-validate the topic labels. More details on the model's implementation and tuning metrics are in the supplementary materials (see methods S1, table S2, and figure S2).
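The two generative steps described above can be illustrated with a minimal sketch. It only simulates how LDA assumes a document is produced; it does not perform inference and is not the implementation used in this study. The function name, the two toy topics, and all probabilities are hypothetical values chosen for illustration.

```python
import random


def generate_document(doc_topic_probs, topic_word_probs, n_words, rng):
    """Simulate LDA's assumed generative process for one document."""
    topic_ids = list(range(len(doc_topic_probs)))
    words = []
    for _ in range(n_words):
        # Step 1: choose a topic from this document's topic distribution.
        topic = rng.choices(topic_ids, weights=doc_topic_probs, k=1)[0]
        # Step 2: choose a word from the chosen topic's word distribution.
        vocab = list(topic_word_probs[topic].keys())
        weights = list(topic_word_probs[topic].values())
        words.append(rng.choices(vocab, weights=weights, k=1)[0])
    return words


# Hypothetical toy topics (word -> probability); not taken from the paper.
topic_word_probs = [
    {"treaty": 0.5, "summit": 0.4, "flood": 0.1},   # a "governance"-like topic
    {"flood": 0.6, "drought": 0.3, "summit": 0.1},  # an "impacts"-like topic
]

rng = random.Random(0)
doc = generate_document([0.7, 0.3], topic_word_probs, n_words=8, rng=rng)
single_topic_doc = generate_document([1.0, 0.0], topic_word_probs, n_words=50, rng=rng)
```

Fitting LDA runs this logic in reverse: given only the observed words, it estimates the per-document topic distributions and the per-topic word distributions.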

Multivariate regression analyses
After identifying the main topics, we examined the relationship between a country's vulnerability to climate change and the diversity of topics discussed by its media. We joined our data to annual measures of a country's vulnerability to climate change from the Notre Dame Global Adaptation Initiative (e.g. , where p represents the probability that a given news article belongs to a topic, and i is one of the 11 topics in our LDA model (also see articles S1-S2 in the supplementary materials). We then executed multivariate regression analyses using (1) ordinary least squares estimates at the news article level and (2) a fixed effects specification with an unbalanced panel, organized by newspaper source and the month that an article was published as the time identifier (132 time periods, i.e. 11 years × 12 months). We also included controls at both the news article (word count, sentiment, average sentence length, and % of unique words) and country (readiness to adapt to climate change (Chen et al 2015), population, gross domestic product (GDP) per capita, and human development index (HDI)) levels. Standard errors were clustered at the news source level.
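As a hedged illustration of the article-level topic diversity measures named in this paper (Shannon entropy and a reverse-coded Herfindahl-Hirschmann index), the sketch below computes both from one article's topic probabilities. The natural logarithm and the reverse coding as one minus the concentration index are assumptions, since the exact formula is not recoverable from the text here; the function names are hypothetical.

```python
import math


def shannon_entropy(topic_probs):
    """H = -sum_i p_i * ln(p_i); zero-probability topics contribute nothing."""
    return -sum(p * math.log(p) for p in topic_probs if p > 0)


def reverse_hhi(topic_probs):
    """Reverse-coded concentration: 1 - sum_i p_i^2 (assumed coding)."""
    return 1.0 - sum(p * p for p in topic_probs)


n_topics = 11  # number of topics in the LDA model described in the text

# An article spread evenly over all topics has maximal diversity.
uniform = [1.0 / n_topics] * n_topics
# An article devoted entirely to one topic has zero diversity.
focused = [1.0] + [0.0] * (n_topics - 1)

max_entropy = shannon_entropy(uniform)
min_entropy = shannon_entropy(focused)
```

Under these assumptions, entropy ranges from 0 (one topic) to ln 11 (all 11 topics equally likely), so higher values indicate an article that spans more topics.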

Major topics covered by the media in non-Annex I countries
Figure 1 (panel (A)) shows the average probability (as a percentage) of the occurrence of a topic in a news article. The three most commonly occurring topics are international governance & development (14.45%), the economics of energy transitions (10.86%), and the impacts of climate change (10.70%). These three topics account for over one-third (36%) of the sample topic distribution. The overall distribution is relatively uniform, albeit with a high level of variability as shown by the error bars. Exemplar excerpts in the supplementary materials (see table S3) qualitatively illustrate these topics.
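The per-topic averages and error bars described above can be sketched as a simple summary over the article-level topic distributions. This is an illustration only: the toy input is hypothetical, and the use of the population standard deviation is an assumption, as the text does not say which estimator was used.

```python
from statistics import mean, pstdev


def topic_summary(article_topic_probs):
    """Return a (mean %, standard deviation %) pair for each topic."""
    n_topics = len(article_topic_probs[0])
    summary = []
    for t in range(n_topics):
        column = [100.0 * probs[t] for probs in article_topic_probs]
        summary.append((mean(column), pstdev(column)))
    return summary


# Hypothetical toy input: three articles, three topics (each row sums to 1).
toy = [
    [0.6, 0.3, 0.1],
    [0.2, 0.5, 0.3],
    [0.1, 0.1, 0.8],
]
summary = topic_summary(toy)

# Arithmetic check on the three leading shares reported in the text.
top_three_share = 14.45 + 10.86 + 10.70
```

The last line reproduces the reported figure: the three leading topics sum to roughly 36% of the average topic distribution.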
Figure 1 (panel (B)) shows an intertopic distance map, with two distinct clusters. The first 'socioeconomic' cluster (in green) focuses on energy economics, agriculture, and activism & collective action. The second 'sociopolitical' cluster (in blue) focuses on international governance & development, national governance, and education-related events about climate change. Discussions about climate science (impacts of climate change) co-occur more often with socioeconomic than with sociopolitical topics. Four topics are spatially isolated (national political elites, culture, US politics, and health), indicative of their more peripheral and/or discrete (rather than interconnected) media coverage.

Geographic differences in media coverage among non-Annex I countries
Countries in table 1 are organized by five geographic regions, using World Bank definitions. Topics in table 1 are organized from left to right (same order as figure 1), and the three most prevalent topics are highlighted in grey. One-way ANOVA tests at both the country and region level showed that mean differences were statistically significant. We observed that the media in Sub-Saharan Africa pays a significantly higher level of attention to agriculture relative to other regions. The media coverage in South Asia, which has the most vulnerable countries in our sample (standardized average = 0.83; last column of table 1), also has a relatively unique focus. It emphasizes the impacts of climate change, educational events, and national political elites (likely due to the comparatively low focus in India on international governance).
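For readers unfamiliar with the one-way ANOVA F statistic mentioned above, the following minimal sketch computes it from grouped values (for example, one topic's article-level probabilities grouped by region). The function name and the toy numbers are hypothetical; the study's actual computation may differ in software and detail.

```python
def one_way_anova_f(groups):
    """F = (between-group mean square) / (within-group mean square)."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    group_means = [sum(g) / len(g) for g in groups]
    # Variation of group means around the grand mean, weighted by group size.
    ss_between = sum(
        len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means)
    )
    # Variation of observations around their own group mean.
    ss_within = sum(
        (x - m) ** 2 for g, m in zip(groups, group_means) for x in g
    )
    return (ss_between / (k - 1)) / (ss_within / (n_total - k))


# Hypothetical toy data: three groups of three observations each.
toy_groups = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]]
f_stat = one_way_anova_f(toy_groups)
```

A large F value indicates that the group means differ by more than the within-group spread would suggest; its significance is judged against the F distribution with (k − 1, N − k) degrees of freedom.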
The inter-topic distance map in figure 2 uses the country-level topic averages in table 1. We observe that the coverage in South Asia and East Asia is more homogenous relative to the Middle East or Sub-Saharan Africa (i.e. the South Asian and East Asian countries form smaller clusters). Interestingly, coverage in South Asia is more similar to that in Sub-Saharan Africa, relative to that in East Asia & Pacific. Note that the eight most vulnerable countries in our sample were Bangladesh, Pakistan, Kenya, Tanzania, Zimbabwe, India, Nepal, and Nigeria (all in the highest quartile of the ND-GAIN global index, >75% relative to the rest of the world). We observe that they cluster together (shown in red), indicative of an association between media coverage and country climate change vulnerability. We also note that South Africa and Vietnam appear as 'outliers' (i.e. distant from other countries in their region). This is consistent with their disproportionately high focus on topics such as energy economics and collective action respectively (see table 1).

Relationship between climate change vulnerability and topic diversity
The average vulnerability for each country (see last column of table 1) is moderately positively correlated with the diversity of topics in an article (r = 0.44). In table 2, we examined this correlation more systematically through multivariate regressions. Models 1 and 2 use the entire data sample (N = 95 216). A 1 standard deviation increase in country climate change vulnerability is associated with a ∼0.25 standard deviation increase in the topic diversity of the average news article in that country (β model 1 = 0.27, p = 0.01; β model 2 = 0.23, p = 0.01). In Models 3 and 4, we repeated the analyses after collapsing (averaging) the data and creating an unbalanced panel (N = 5155) at the news source-observation month level. Again, we find a positive relationship, statistically significant in Model 3 and marginally so in Model 4 (β model 3 = 0.31, p = 0.01; β model 4 = 0.20, p = 0.06), between a country's climate change vulnerability and the diversity of topics discussed by its media. We carried out several sensitivity analyses (e.g. removing the bottom 15 sources, which had fewer than 500 articles, excluding India, restricting data from India to one or two sources, excluding 'outlier countries' such as Ukraine, Uzbekistan, and South Africa, omitting highly correlated (with country vulnerability) control variables such as GDP per capita and/or HDI, using a 1 year time lag in the models) and found robust results. We thus find a positive association between country-level climate change vulnerability and the diversity of climate topics covered in relevant news articles (see article S1 and article S2 in the supplementary materials for illustrative examples).
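The collapsing step used for Models 3 and 4 can be sketched as averaging article-level diversity scores within each news source and publication month. This is a minimal illustration under stated assumptions: the tuple layout, function names, and toy records are hypothetical, and months are indexed 1 to 132 from January 2010 to December 2020.

```python
from collections import defaultdict

START_YEAR = 2010  # first year of the study period


def month_index(year, month):
    """Map a calendar month to a panel time period (1..132 for 2010-2020)."""
    return (year - START_YEAR) * 12 + month


def collapse_to_panel(articles):
    """Average topic diversity per (source, month) cell.

    articles: iterable of (source, year, month, diversity) tuples.
    Cells with no articles are simply absent, so the panel is unbalanced.
    """
    cells = defaultdict(list)
    for source, year, month, diversity in articles:
        cells[(source, month_index(year, month))].append(diversity)
    return {key: sum(vals) / len(vals) for key, vals in cells.items()}


# Hypothetical toy records; not drawn from the study's data.
toy_articles = [
    ("Source A", 2010, 1, 1.0),
    ("Source A", 2010, 1, 2.0),
    ("Source A", 2020, 12, 1.5),
    ("Source B", 2015, 6, 0.5),
]
panel = collapse_to_panel(toy_articles)
```

Because sources do not publish relevant articles in every month, the number of populated cells is smaller than sources × 132, which is why the panel is described as unbalanced.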
We also carried out a supplemental analysis (see supplementary materials, table S4) where we regressed the probability of occurrence of each topic, as separate dependent variables, on country vulnerability (instead of computing a single entropy measure). A country's climate change vulnerability is significantly negatively associated with a focus on international governance (β = −0.43, p = 0.01), and US politics (β = −0.49, p = 0.03). It is significantly positively associated with coverage of national governance (β = 1.5, p = 0.004) and national elites (β = 0.62, p = 0.04). Thus, we find tentative evidence that an increase in a country's vulnerability to climate change is associated with a shift in media coverage towards domestic (or internal) issues vis-à-vis external matters. We provide exemplar articles to illustrate this pattern (see supplementary materials articles S3-S6).
Table 1. Geographic variability in topic coverage. Averages were computed using the probabilistic topic distribution for each news article. Mean comparisons are provided by country and region as grouping units. One-way ANOVA tests were conducted to calculate F statistics, and show significant group mean differences. The three highest values for each row (i.e., the top three most common themes) are highlighted in grey. The average vulnerability for each country is provided in the last column.

Discussion
Scholars have called for climate communication strategies that fit the unique context and 'localized climate impacts' (Ejaz and Najam 2023) of individual nations as well as nations outside Annex I countries (Vu et al 2019, Nguyen et al 2020), given the significant variance in predictors of the broader public's risk perception and awareness of climate change (Lee et al 2015, Hase et al 2021). Despite facing the worst impacts of our changing climate, non-Annex I countries are largely under-represented in the climate change and media literature (Schäfer and Schlichting 2014, Painter and Schäfer 2018, Bohr 2020). Our cross-national study addresses a significant research gap by focusing on the differences between economically similar, highly vulnerable non-Annex I nations.

Dominant themes and regional differences in media coverage in non-Annex I countries
Our study supports and extends findings of recent analyses of media reporting on climate change. For example, Hase et al (2021) found that Global South media coverage focuses on the societal dimension of climate change, including increased reporting on economics, climate politics, and human impacts. Vu et al (2019) found that poorer countries report more on 'international relations and the natural aspects of climate change' (p 7). Using a much larger sample of countries from non-Annex I countries, we found that the three most discussed themes were international governance & development, the economics of energy transitions, and the impacts of climate change. We also find significant cross-regional differences (e.g. between South Asia, Sub-Saharan Africa, and East Asia). The importance of international politics and the impacts of climate change, such as sea level rise and flooding, are thus highlighted across all studies. These findings are also consistent with the fact that many non-Annex I countries often depend upon resources from Annex I countries to mitigate and adapt to the consequences of climate change (IPCC 2022). Interestingly, the findings of both our study and Vu et al (2019) differ slightly from those of Hase et al (2021), who did not find an increase in extreme weather events in their analysis. One potential reason for these differences could be the vastly different sample sizes; both Vu et al (2019)
Table 2. Multivariate regressions between country vulnerability and the diversity of topics covered by the news media. Topic diversity is measured using the Shannon entropy measure (Models 1 and 3) and a reverse-coded Herfindahl-Hirschmann concentration index (Models 2 and 4). Models 1 and 2 use the complete article-level data set, while Models 3 and 4 use an unbalanced panel created at the newspaper-month level. Results are robust to using or omitting a one-year time lag between the dependent and independent variables in all models.

Relationship between country climate change vulnerability and diversity of topics covered by the media
We also show that there is a significant positive association between countries' vulnerability to climate change and the diversity of topics discussed in their media coverage (table 2). Many studies discuss the interrelated nature of climate change impacts; our study systematically quantifies this phenomenon in non-Annex I countries. Our study thus adds nuance to the literature by highlighting the localized impacts of climate change across the Global South (Ejaz et al 2022). We also highlight the need for more partnerships and knowledge transfer between South Asia and Sub-Saharan Africa, as these regions share similar climate coverage and climatic challenges, relative to East Asia & Pacific. Most importantly, our findings demonstrate that in the most vulnerable countries, the media explicitly reports on the complexity and multifaceted nature of climate change. By speaking to the social, economic, scientific, and political aspects of climate change, news coverage in the most vulnerable countries is highly sophisticated. Our findings call out a Western bias and colonial lens by refuting a recurring narrative in the climate media literature that journalism in the Global South is less robust due to resource constraints. The Western bias is further reflected by the dearth of scholarship on the media coverage of climate change in less developed countries. For example, Okoliko and de Wit (2020) show that media coverage of climate change has been studied in only nine African countries. Our study shows that, even while developing countries may have fewer resources or training to inform climate coverage (Ajaero and Anorue 2018), reporting in the most vulnerable countries is highly nuanced and distinctive from common Western conceptualizations of climate change (Olausson and Berglez 2014).
Intuitively, these findings make sense: as the most vulnerable countries experience the urgency and worst impacts of our changing climate, the co-occurrence and interconnectedness of climate issues are foregrounded and reflected in media reporting through the diversity of topics covered in their articles. In the most vulnerable countries, climate change information is not siloed, but addressed systemically. This also tracks the objectives of developmental journalism as vulnerable countries work on developmental, interventional, and educational objectives in the context of lived climate impacts (Kalyango et al 2017). This contrasts with global media coverage, which still struggles to make explicit connections between two topics, such as climate change and health (Romanello et al 2022). Our findings emphasize the opportunity for the Global North to learn from media coverage of climate change in the Global South where the problem is most acute.

Study limitations
First, engagement with print media is declining in many countries (Newman et al 2022). That said, audiences for print media remain particularly relevant in the countries in our sample (Schmidt et al 2013, Wahyuni 2017, Comfort et al 2020, Painter et al 2020). Single-country studies have also found that print media content is highly correlated with online versions of newspapers (Dhiman 2022). Moreover, print media is still an influential source of information for local elites and has been found to impact voting choices (Prat and Strömberg 2013). Second, our study only used English-language sources. This is problematic because of the vast circulations of vernacular languages across many countries in our sample, such as India (Audit Bureau of Circulations 2023), China (World Association of Newspapers 2023), Malaysia, and Thailand (Newman et al 2022). However, we purposefully chose English-only sources because: (1) English-language sources are essential for reaching policy-making elites and agenda setting (Sonwalkar 2002), (2) English-language newspapers include articles from many experts and representatives of funding agencies (Khatri et al 2016), and (3) we could manually and qualitatively verify the topic model findings. Third, we acknowledge that the countries we compare have different media landscapes and political structures, among other unobservable confounding factors. We tried to mitigate such bias by comparing non-Annex I countries with similar economic profiles and using fixed-effects model specifications (see table 2). Fourth, there are notable limitations of topic modeling, such as the inability of the method to understand contextual specifics (Brookes and McEnery 2019). We thus complemented the LDA analysis with a set of qualitative analyses (e.g. Hase et al 2021) to reach 'substantive interpretability' (see methods and supplementary materials, table S3).

Conclusion
The historic and ongoing scarcity of climate communication research in the Global South has understandably been highlighted as an area of scholarly concern (Wright et al 2019, Okoliko and de Wit 2020). Moreover, treating the Global South as a homogenous bloc is reductionist and limits explanatory power. Our study is the first to categorize the themes used by the media to cover climate change in 50 sources within 26 economically similar non-Annex I countries, and to identify a positive relationship between a country's vulnerability to climate change and the diversity of topics covered by its news media. Our findings help expand heterogeneous understandings of Global South climate change news coverage and advance ongoing investigations into relationships between climate change and discourses in the world's most vulnerable countries. Future research can also use the methodology presented in this paper across other forms of digital media, such as social media.
Our study can also serve as an important bridge, helping the media in the Global North learn from their counterparts in non-Annex I countries. As climate threats increase globally, the ability of the media to articulately reflect the overlapping and interrelated nature of climate threats is critical. Ultimately, it is our hope that a more nuanced portrayal of climate change in the global media as a systemic problem (Lehtonen et al 2018) can inform and improve the sophistication of local, national, and international mitigation and adaptation efforts.

Figure 1. Distribution of topics and inter-topic distance map. Panel (A): the sorted histogram shows the heterogeneity in coverage (i.e. across the sample (average from 2010 to 2020)). Standard deviations are shown as positive error bars. Panel (B): for the inter-topic distance map, the first two principal components are plotted using multidimensional scaling. The bubble size represents the marginal topic distribution, while the distance on the plot represents the 'proximity' (i.e. overlap) between topics. Clusters of thematically related topics are color-coded and annotated.

Figure 2. Inter-topic distance map (countries). For this distance map, we used the average topic distribution in each country to compute their distances. The shapes are used to demarcate the five World Bank regions in our sample. The eight most vulnerable countries in our sample (those with a vulnerability value in the >75th percentile of the global ND-GAIN index) are shown in red. The two outlier countries (South Africa and Vietnam) are shown in green.

References
Suresh V, Madhavan C E V and Murthy M N N 2010 On finding the natural number of topics with Latent Dirichlet Allocation: some observations Advances in Knowledge Discovery and Data Mining ed M J Zaki, J X Yu, B Ravindran and V Pudi (Springer) pp 391-402
Batta H E, Ashong A C and Bashir A S 2013 Press coverage of climate change issues in Nigeria and implications for public participation opportunities J. Sustain. Dev. 6 56
Billett S 2010 Dividing climate change: global warming in the Indian mass media Clim. Change 99 1-16
Blei D M 2012 Probabilistic topic models Commun. ACM 55 77-84
Blei D M, Ng A Y and Jordan M I 2003 Latent Dirichlet Allocation J. Mach. Learn. Res. 3 993-1022
Bohr J 2020 Reporting on climate change: a computational analysis of US newspapers and sources of bias, 1997-2017 Glob. Environ. Change 61 102038
Boykoff M T 2010 Indian media representations of climate change in a threatened journalistic ecosystem Clim. Change 99 17-25
Boykoff M T, Daly M and McAllister L 2021 The beat goes on? Print media coverage of anthropogenic climate change over the past three decades Glob. Environ. Change 71 102412
Boykoff M et al 2023 World newspaper coverage of climate change or global warming, 2004-2023 Media and Climate Change Observatory Data Sets (Cooperative Institute for Research in Environmental Sciences, University of Colorado) (https://doi.org/10.25810/4c3b-b819)
Broadbent J, Sonnett J, Botetzagias I, Carson M, Carvalho A, Chien Y J and Zhengyi S 2016 Conflicting climate change frames in a global field of media discourse Socius 2 2378023116670660
Brookes G and McEnery T 2019 The utility of topic modelling for discourse studies: a critical evaluation Discourse Stud. 21 3-21
Brüggemann M and Engesser S 2017 Beyond false balance: how interpretive journalism shapes media coverage of climate change Glob. Environ. Change 42 58-67
Chattopadhyay S 2019 Development journalism Int. Encyclopedia J. Stud. 1-8
Chen C, Noble I, Hellmann J, Coffee J, Murillo M and Chawla N 2015 University of Notre Dame global adaptation index country index technical report University of Notre Dame Global Adaptation Index (Notre Dame University) (available at: https://gain.nd.edu/assets/254377/nd_gain_technical_document_2017.pdf)
Comfort S E and Park Y E 2018 On the field of environmental communication: a systematic review of the peer-reviewed literature Environ. Commun. 12 862-75
Comfort S E, Tandoc E and Gruszczynski M 2020 Who is heard in climate change journalism? Sourcing patterns in climate change news in China, India, Singapore, and Thailand Clim. Change 158 327-43
Cuartas J et al 2023 Climate change is a threat multiplier for violence against children Child Abuse Negl. 106430
Das J 2020 The struggle for climate justice: three Indian news media coverage of climate change Environ. Commun. 14 126-40
Deveaud R, San Juan E and Bellot P 2014 Accurate and effective latent concept modeling for ad hoc information retrieval Doc. Numérique 17 61-84
Dhiman B 2022 A comparative study of content in print and online newspapers in India SSRN Electron. J. (available at: www.researchgate.net/publication/364621749_A_Comparatively_study_of_content_in_Print_and_Online_Newspaper_in_India)
DiMaggio P, Nag M and Blei D 2013 Exploiting affinities between topic modeling and the sociological perspective on culture: application to newspaper coverage of U.S. government arts funding Poetics 41 570-606
Ebi K L, Vanos J, Baldwin J W, Bell J E, Hondula D M, Errett N A and Berry P 2021 Extreme weather and climate change: population health and health system implications Annu.