Spatially detailed agricultural and food trade between China and the United States

The United States and China are key nations in global agricultural and food trade. They share a complex bilateral agri-food trade network in which disruptions could have a global ripple effect. Yet, we do not understand the spatially resolved connections in the bilateral US–China agri-food trade. In this study, we estimate the bilateral agri-food trade between Chinese provinces and U.S. states and counties. First, we estimate bilateral imports and exports of agri-food commodities for provinces and states. Second, we model link-level connections between provinces and states/counties. To do this, we develop a novel algorithm that integrates a variety of national and international databases for the year 2017, including trade data from the US Census Bureau, the US Freight Analysis Framework database, and Multi-Regional Input-Output tables for China. We then adapt the food flow model for inter-county agri-food movements within the US to estimate bilateral trade through port counties. We estimate 2,954 and 162,922 link-level connections at the state-province and county-province resolution, respectively, and identify core nodes in the bilateral agri-food trade network. Our results provide a spatially detailed mapping of the US–China bilateral agri-food trade, which may enable future research and inform decision-makers.


Introduction
The United States and China are key nations in global agricultural and food trade [1,2]. Bilateral agrifood trade between the US and China impacts their domestic economies, natural resources, and environment [3][4][5], with spillovers to other countries [6,7]. Yet, we do not understand the spatially resolved agrifood trade between these two countries. Estimates of high-resolution agri-food trade between the US and China would enable assessments of spatiotemporal risks, critical infrastructure, and environmental footprints. For this reason, the goal of this study is to estimate the trade of agricultural and food commodities at a high spatial resolution between the US and China, with the identification of core locations.
Both the United States and China are important nations in the global food system. Both countries are major producers, consumers, and trade powers in agri-food commodities [8]. The US is a major producer of soybean and corn [9], and is the world's largest exporter and second-largest importer of crop and livestock products [10]. The US is also a key nation for global processed food trade. In fact, the US is the top exporter of processed food commodities with an average of 16.19% of market share between 1980 and 2012 [11]. It is also the top importing nation for processed foods with 13% of market share in 2018 [12]. In the last two decades, China has undergone tremendous economic growth, resulting in improved living standards [13] and increased demand for food [14]. This increased food demand is largely met by importing from global markets, making China the largest importer of agricultural products [10,15].
US exports to East Asia have increased over the last few decades (150% from 1996 to 2021 [16]), with exports to China dramatically rising since China joined the World Trade Organization in 2001 [17].
In 2017, China accounted for 14.2% of US agricultural exports ($138.4 billion) and 57% of US soybean exports ($21 billion). In 2018, China implemented retaliatory tariffs on US agriculture, which dramatically reduced Chinese imports of US agricultural commodities, most notably soybeans. The trade dispute between the US and China had significant environmental repercussions [18,19], including unintended increases in nitrogen and phosphorus pollution and irrigation water use in the US as farmers shifted from soybeans to other crops [20]. However, agricultural and soy exports rebounded following the trade dispute between the two countries [21].
Mapping the high-resolution agri-food trade network between the US and China would inform supply chain security, infrastructure investment, and environmental footprint assessments. Interruptions within this bilateral trade network could be consequential to both countries, such that spatially resolved risk mapping may help to alleviate potential disruptions [22]. Climate change will alter domestic agricultural supply chains in both countries [23,24], making it essential to understand current infrastructure investment requirements. There is a rich literature on the environmental footprints of trade within and between the US and China [25][26][27], which could be refined in future work with spatially resolved trade information between the two countries.
Recent research has highlighted the importance of identifying connections between sub-national production and international trade [28]. For example, sub-national production in Brazil has been linked with the country of final consumption for beef [29] and soy [28,30]. Sub-national production data linked to trade can improve our quantification of supply chain sustainability, such as for embodied water [31][32][33]. Our study aims to contribute to this literature by developing what is, to our knowledge, the first large-scale estimation of high-resolution agricultural trade between sub-national regions of both the receiving and sending countries. The scope of our paper is to develop a modeling framework to estimate bilateral trade links between sub-national locations in two major trading partners, and to identify the major trade links, locations, and core nodes.
International trade data between the US and China is available from the COMTRADE database [34]. Data on sub-national food flows are available within both the US and China. The Freight Analysis Framework (FAF) database within the US provides information on the movement of commodities between the 132 FAF zones of the US, which are primarily Metropolitan Statistical Areas and States [35]. The FAF database also provides information on the import and export of FAF zones with 8 world regions. The FAF database has been used in prior work to develop the food flow model (FFM) to estimate high-resolution food flows between counties in the US [36,37]. The US Census provides bilateral import and export data between states in the US and other countries [38]. The multi-regional input-output (MRIO) database in China provides information on commodity transfers between provinces, including imports and exports between provinces and the Rest of the World [39,40]. A major hurdle to integrating these disparate databases is the fact that they utilize different coding conventions.
The goal of this study is to spatially resolve the trade of agricultural and food commodities between the United States and China. Doing this requires the development of a novel methodology to overcome data challenges and inconsistencies across a variety of government databases and coding systems. The research questions that guide this study are: (1) How much agri-food trade occurs between China and the United States? (2) What are the major agri-food trade links between Chinese provinces and U.S. states? (3) What are the major agri-food trade links between Chinese provinces and U.S. counties? (4) What are the core nodes in the US-China agri-food trade network? Our methods are detailed in section 2. We present findings that address our research questions in section 3. We discuss our approach in section 4 and conclude in section 5.

Methods
We develop a data-fusion approach to estimate agrifood trade between China and the United States as shown in figure 1. We integrate a variety of government databases (detailed in section 2.1). First, we determine the total amount of agrifood bilateral trade between the two countries (section 2.2). Second, we estimate link-level agrifood trade between Chinese provinces and U.S. states (section 2.3). Third, we further resolve links to estimate bilateral trade between Chinese provinces and U.S. counties (section 2.4). Finally, we use our estimates to determine the core locations in the US-China agri-food trade network (section 2.5).
All 50 states (and the District of Columbia) and 3135 counties are included in the analysis for the United States. There are 8 counties that are not included in the study due to data limitations; see [37] for a discussion. Our study includes 31 provinces for China. The supporting information (SI) provides a detailed list of the counties, states, and provinces included in this study.

Data
A variety of government databases are used as input to this study. Table 1 lists all of the data and models used in this study for the year 2017, as this is the most recent year with all of the required data inputs. We utilize US Census Bureau data [38], which provides bilateral state and port-level import and export data for the US with China. We use FAF data, which indicates imports/exports of FAF zones in the US with East Asia.
Figure 1. Schematic of the data-intensive methodology developed to estimate high-resolution agricultural and food trade between China and the United States. A variety of government databases are combined to model connections between counties in the United States and provinces in China. The multi-regional input-output (MRIO) database is used to assess trade to/from Chinese provinces. The Food Flow Model (FFM) is used to determine flows to/from port counties within the United States.
We adapt the improved FFM [37] to estimate county-to-port links. The original FFM [36] estimated all inter-county connections within the US. Here, we only require the transport of agricultural and food commodities within the US that is exchanged with China. For this reason, we restrict the FFM to downscale international trade with Asia to international port counties. This is done by selecting the international FAF spreadsheet and restricting the model to FAF imports/exports with Asia. We then partition the fraction of these flows that are traded with China.
A variety of commodity coding systems and resolutions is used across the data. The US Census uses the Harmonized System (HS) coding system, whereas the FAF database uses Standard Classification of Transported Goods (SCTG) codes. To enable interoperability between these datasets, we developed an SCTG-HS commodity crosswalk for SCTG categories 01 to 07, which are the agricultural and food commodities. Our SCTG-HS crosswalk is a table that assigns each HS commodity to a corresponding SCTG category. The Multi-Regional Input-Output (MRIO) database [39] classifies trade into 42 sectors (for a complete list of MRIO sectors, see the SI). In this analysis, we extract Sector P1 (Agriculture, forestry, animal husbandry, and fishery products and services) and Sector P6 (Food and tobacco). Since the MRIO dataset has a different and coarser commodity classification, we create another crosswalk table to assign MRIO commodity categories to SCTG categories. In this way, our final study is organized around the SCTG commodity coding system, which is provided in table 2. In this analysis, we fill all missing values with 0.
Table 2. SCTG agri-food commodity categories, including: 03 Agricultural products (except for animal feed, cereal grains, and forage products); 04 Animal feed, eggs, honey, and other products of animal origin; 05 Meat, poultry, fish, seafood, and their preparations; 06 Milled grain products and preparations, and bakery products; 07 Other prepared foodstuffs, fats, and oils.

Import/export by state and province
In this section, we estimate the bilateral state and province trade for agri-food commodities corresponding to SCTG 01-07 through a deterministic method that combines existing data sources. First, we obtain the US Census trade data at the state level, which is provided in the HS coding for commodities. The US Census provides trade information for Total Export Value and Total Import Value in dollar value, but not in mass. However, the US Census dataset also provides the variables Vessel Total Exports SWT, Vessel Total Imports SWT, Air Total Exports SWT and Air Total Imports SWT, where SWT stands for shipping weight in kilograms. We use these four variables to calculate Total Export and Total Import in mass (kg) as:
$$TE_{s,c} = VTE_{s,c} + ATE_{s,c} \quad (1)$$
$$TI_{s,c} = VTI_{s,c} + ATI_{s,c} \quad (2)$$
where TE stands for Total Export (kg), TI for Total Import (kg), VTE for Vessel Total Exports SWT, VTI for Vessel Total Imports SWT, ATI for Air Total Imports SWT and ATE for Air Total Exports SWT. Subscript s represents a state and c stands for a commodity in the HS commodity coding system. After obtaining Total Export and Import in mass, we use the commodity crosswalk to assign HS commodities to SCTG codes (see the SI for the SCTG-HS crosswalk). We then sum the exports and imports of agri-food commodities within an SCTG group:
$$TE_{s,sctg} = \sum_{c \in sctg} TE_{s,c}$$
$$TI_{s,sctg} = \sum_{c \in sctg} TI_{s,c}$$
where sctg stands for an SCTG commodity group, s stands for a given US state and c for a specific HS commodity. TE and TI stand for Total Export (kg) and Total Import (kg), respectively. We aggregate individual commodities c traded by each state s to SCTG groups. In this way, we obtain estimates of SCTG agri-food trade between US states and China in units of both mass and dollar value (USD).
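The state-level aggregation described above can be illustrated with a minimal Python sketch. The record layout, the field names (`hs`, `vessel_kg`, `air_kg`), and the three crosswalk entries are assumptions made for illustration only; they are not the actual US Census schema or the SCTG-HS crosswalk provided in the SI.

```python
from collections import defaultdict

# Hypothetical HS -> SCTG crosswalk entries (illustrative only; the real
# crosswalk is the table given in the SI).
HS_TO_SCTG = {"1201": "03", "1005": "02", "0203": "05"}


def state_sctg_mass(records, crosswalk):
    """Sum vessel + air shipping weight (kg) per (state, SCTG group)."""
    totals = defaultdict(float)
    for rec in records:
        sctg = crosswalk.get(rec["hs"])
        if sctg is None:
            # Commodity is outside SCTG 01-07, so it is not agri-food.
            continue
        # Missing weights are treated as 0, as stated in the data section.
        mass = (rec.get("vessel_kg") or 0.0) + (rec.get("air_kg") or 0.0)
        totals[(rec["state"], sctg)] += mass
    return dict(totals)


# Made-up export records for two states.
records = [
    {"state": "IL", "hs": "1201", "vessel_kg": 900.0, "air_kg": 10.0},
    {"state": "IL", "hs": "1005", "vessel_kg": 500.0, "air_kg": None},
    {"state": "IL", "hs": "1201", "vessel_kg": 90.0, "air_kg": 0.0},
    {"state": "LA", "hs": "0203", "vessel_kg": 40.0, "air_kg": 2.0},
    {"state": "LA", "hs": "8703", "vessel_kg": 999.0, "air_kg": 0.0},
]
export_mass = state_sctg_mass(records, HS_TO_SCTG)
print(export_mass)
```

The same function can be applied to import records to obtain the import totals; only the input weights change.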
To estimate province-level trade with the US, we utilize the MRIO dataset. The MRIO data contains information on province-scale imports and exports to the Rest of the World in units of Chinese Yuan (¥). As mentioned in section 2.1, we extract Sector P1 (Agriculture, forestry, animal husbandry, and fishery products and services) and Sector P6 (Food and tobacco) data in this study.
We first convert MRIO trade data from its native units of ¥ into units of mass and USD. To convert MRIO data from ¥ to USD, we multiply all entries by the 2017 average annual conversion ratio between the two currencies (which is 0.148 from [41]). Representing a single data entry in the MRIO dataset as M, this is:
$$M^{\text{USD}}_{sec,p} = M^{\text{¥}}_{sec,p} \times cf$$
where cf is the 2017 average conversion factor, sec stands for a specific Sector (P1 or P6) and p stands for a particular province.
The values are filtered for only Sectors P1 and P6 to obtain the export and import of each province. US Census data is used to calculate the ratio of imports and exports between the US and China. This ratio is then applied to province-level import and export with the Rest of the World to partition trade with the US. Next, we map the MRIO trade data into SCTG commodity codes. To achieve this, we follow a four-step process.
(a) Aggregate the state-level data across all states to obtain import or export values for each SCTG category.
(b) Use the crosswalk (see supplementary information) to designate MRIO sectors to the Census data.
(c) Obtain the imports and exports for sectors P1 and P6 and merge them with the Census data.
(d) Calculate the ratio of SCTG categories as a fraction of the respective MRIO sectors.
This is formulated mathematically below:
$$CF_{i,sctg} = \frac{CVI_{sctg}}{MCT_{sec}}$$
$$CF_{e,sctg} = \frac{CVE_{sctg}}{MCT_{sec}} \quad (8)$$
where CF stands for the commodity fraction of the SCTG category in the respective MRIO commodity category. CVE and CVI represent the total US Census commodity export and import value (in USD) for the specific SCTG commodity, and MCT is the MRIO commodity total for a particular MRIO commodity category. sctg implies a specific SCTG category (e.g. 01), s stands for a given US state (e.g. Illinois), e and i stand for export and import respectively, and sec implies a specific MRIO sector (e.g. Food and Tobacco). This process allows us to extract the commodity fraction of each SCTG commodity type in China's trade with the US. We take the cleaned province-level MRIO data and merge in the ratios calculated in the previous step. Finally, we multiply these ratios by the corresponding province-level values to obtain an estimate of trade value (in USD) in a province for each SCTG category. This is shown mathematically below:
$$ME_{p,sctg} = ME_{p,sec} \times CF_{e,sctg}$$
$$MI_{p,sctg} = MI_{p,sec} \times CF_{i,sctg}$$
where ME and MI represent the MRIO export and import value (in USD) respectively for a given province, and CF stands for the commodity fraction of the SCTG category in the respective MRIO commodity category. Also, sctg implies a specific SCTG category (e.g. 01), p stands for a Chinese province (e.g. Hunan), and sec implies a specific MRIO sector (e.g. Food and Tobacco). This gives corresponding SCTG trade values for every sector value of a given Chinese province, based on the MRIO-SCTG crosswalk.
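A minimal Python sketch of the province-level steps above (currency conversion, partitioning by the US share, and splitting sector totals into SCTG categories) is given below. The sector-to-SCTG mapping, the US share of 0.5, and all numbers other than the 0.148 conversion factor are made-up assumptions for illustration; the actual MRIO-SCTG crosswalk is in the SI.

```python
YUAN_TO_USD_2017 = 0.148  # 2017 average conversion factor quoted in the text

# Hypothetical, truncated MRIO-sector -> SCTG crosswalk (illustrative only).
SECTOR_TO_SCTG = {"P1": ["02", "03"], "P6": ["05", "07"]}


def commodity_fractions(census_value_by_sctg, sector_to_sctg):
    """Share of each SCTG category within its MRIO sector (steps a-d)."""
    fractions = {}
    for sector, sctgs in sector_to_sctg.items():
        sector_total = sum(census_value_by_sctg.get(s, 0.0) for s in sctgs)
        for s in sctgs:
            value = census_value_by_sctg.get(s, 0.0)
            fractions[s] = value / sector_total if sector_total else 0.0
    return fractions


def province_sctg_values(mrio_yuan, us_share, fractions, sector_to_sctg):
    """Split a province's sector totals (yuan, Rest of World) into
    SCTG-level values (USD, trade with the US only)."""
    out = {}
    for sector, yuan in mrio_yuan.items():
        usd_with_us = yuan * YUAN_TO_USD_2017 * us_share
        for s in sector_to_sctg[sector]:
            out[s] = usd_with_us * fractions[s]
    return out


# Made-up national Census values (USD) per SCTG category.
census = {"02": 30.0, "03": 70.0, "05": 20.0, "07": 60.0}
fractions = commodity_fractions(census, SECTOR_TO_SCTG)

# Made-up province exports to the Rest of the World (yuan) and US share.
values = province_sctg_values({"P1": 1000.0, "P6": 2000.0}, 0.5,
                              fractions, SECTOR_TO_SCTG)
print(fractions)
print(values)
```

Because the fractions within each sector sum to one, the SCTG values of a province add back up to its converted and partitioned sector totals.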
To obtain a similar estimate of trade mass (in kilograms), we use the state-level standardized data (US Census data) to obtain the proportion of trade mass with respect to the trade value for every SCTG category. This is expressed mathematically as:
$$MProp_{sctg} = \frac{\text{Trade Mass}_{sctg}\ (\text{kg})}{\text{Trade Value}_{sctg}\ (\text{USD})}$$
$$MME_{p,sctg} = ME_{p,sctg} \times MProp_{sctg}$$
$$MMI_{p,sctg} = MI_{p,sctg} \times MProp_{sctg}$$
where MProp stands for the proportion of Trade in mass (Trade Mass (kg)) corresponding to trade in value (Trade Value (USD)), and MME and MMI represent the MRIO export and import in kilograms. Also, p stands for a given Chinese province and sctg for a specific SCTG commodity category. We multiply these ratios with trade values for respective SCTG types to obtain the trade mass.
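The value-to-mass conversion can be sketched in Python as follows. Pooling mass and value over all states before taking the ratio is an assumption of this sketch, and all numbers are made up.

```python
def mass_per_usd(state_mass_kg, state_value_usd):
    """kg per USD for each SCTG category, pooled over all states
    (the pooling is an assumption of this sketch)."""
    mass_total, value_total = {}, {}
    for (_state, sctg), kg in state_mass_kg.items():
        mass_total[sctg] = mass_total.get(sctg, 0.0) + kg
    for (_state, sctg), usd in state_value_usd.items():
        value_total[sctg] = value_total.get(sctg, 0.0) + usd
    return {sctg: mass_total.get(sctg, 0.0) / usd
            for sctg, usd in value_total.items() if usd > 0}


def province_mass(province_value_usd, ratios):
    """Convert province-level SCTG trade values (USD) to mass (kg)."""
    return {sctg: usd * ratios.get(sctg, 0.0)
            for sctg, usd in province_value_usd.items()}


# Made-up state-level Census data keyed by (state, SCTG).
mass = {("IL", "03"): 3000.0, ("LA", "03"): 1000.0, ("CA", "07"): 100.0}
value = {("IL", "03"): 1500.0, ("LA", "03"): 500.0, ("CA", "07"): 200.0}
ratios = mass_per_usd(mass, value)

# Made-up province-level SCTG trade values in USD.
province_kg = province_mass({"03": 50.0, "07": 40.0}, ratios)
print(ratios, province_kg)
```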

Estimating links between provinces and states
To estimate link-level connections between US states and Chinese provinces, we utilize the standardized datasets (where commodities are classified in SCTG 01-07, and have trade mass and trade value in kilograms and US Dollars, respectively) for both countries at the sub-national level (section 2.2). Then, we calculate the location fraction of provinces, grouped by each SCTG. We obtain the location fraction of a province for a given SCTG category by dividing the trade amount (export or import) by the total trade (export or import) value for that commodity:
$$LF_{e,p,sctg} = \frac{EV_{p,sctg}}{CT_{e,sctg}}$$
$$LF_{i,p,sctg} = \frac{IV_{p,sctg}}{CT_{i,sctg}}$$
where LF is the location fraction calculated for both imports and exports. EV and IV represent export and import values of a province for a given SCTG commodity sctg respectively, and CT is the total value for a given trade type and commodity. Variables e and i indicate exports from China and imports to China respectively, sctg stands for a specific SCTG commodity category, and p indicates a particular province. Finally, for every commodity, we use location fraction and US state values to obtain the corresponding link-level connections with Chinese provinces. Because a state's exports to China arrive as provincial imports, state exports are allocated with the import location fractions, and state imports with the export location fractions. The following equations delineate the process:
$$BE_{s,p,sctg} = SEV_{s,sctg} \times LF_{i,p,sctg}$$
$$BI_{s,p,sctg} = SIV_{s,sctg} \times LF_{e,p,sctg}$$
where SEV and SIV are State Export Values and State Import Values, respectively. BE and BI represent bilateral export and import values between a given state and province for a given SCTG commodity respectively, and LF is the location fraction calculated for both imports and exports. Also, s stands for a given US state and sctg for a specific SCTG commodity category, e and i indicate exports from China and imports to China, and p indicates a particular province. We list all variable symbols, definitions, and sources in table 3.
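The location-fraction step can be illustrated with a short Python sketch for a single SCTG commodity. It assumes that a state's exports to China are spread over provinces in proportion to each province's share of imports; this pairing, and every number below, is an assumption of the sketch.

```python
def location_fractions(province_values):
    """Each province's share of the national total for one SCTG category."""
    total = sum(province_values.values())
    return {p: (v / total if total else 0.0)
            for p, v in province_values.items()}


def bilateral_links(state_values, fractions):
    """Allocate each state's trade across provinces by location fraction."""
    return {(s, p): v * f
            for s, v in state_values.items()
            for p, f in fractions.items()}


# Made-up province-level imports of one SCTG commodity.
province_imports = {"Guangdong": 50.0, "Fujian": 30.0, "Shandong": 20.0}
lf_import = location_fractions(province_imports)

# Made-up state-level exports of the same commodity to China.
state_exports = {"Louisiana": 200.0, "Washington": 100.0}
links = bilateral_links(state_exports, lf_import)
print(links[("Louisiana", "Guangdong")])
```

Since the location fractions sum to one, the links of each state add back up to that state's total trade with China, so the state-level totals are preserved by construction.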

Estimating links between provinces and counties
We use the FFM to estimate county-level agri-food trade with China. The FFM was developed to estimate county-level food flows within the US. The original model mapped county-level food flows between all county pairs [36,37]. In this study, we adapt this model to estimate county-level food flows specifically for import and export with China.
To do this, we use the FFM to downscale FAF level imports and exports between Eastern Asia and the US per SCTG commodity (see table 2 for the list of SCTG commodities). FAF-level imports and exports between Eastern Asia and the US data is obtained by using only the international FAF spreadsheet. We estimate 2017 food flows between provinces and counties under two main assumptions: (i) Counties with international ports are the only ports of entry/exit for international trade, consistent with FAF-level data [35]. (ii) Commodities are perfectly mixed once they arrive at international ports and in trade to/from ports.
We adapt the FFM to capture international trade with China by constraining international trade to pass through port counties. Here, we use the canonical FFM, which employs a logit/probit technique for link estimation and a Gamma mixture model to estimate food flows between counties. However, we limit the input data of our model such that, for imports from China, domestic origins are only the counties that have international ports. For exports to China, we limit domestic destinations to counties with international ports (see SI for the complete list of counties with international ports). We nevertheless include all 3135 counties within the U.S. for the redistribution of food commodities across the U.S., both before international export and after international import. Figure 2 illustrates the multi-step nature of our modeled estimates.
The FAF database provides information on international trade between FAF zones and eight world regions, including Eastern Asia (see the SI for the list of world regions provided in FAF). This provides key information on the total trade amount (kg) per commodity between Eastern Asia and FAF-zones in the U.S. We disaggregate the total trade to/from Eastern Asia into trade solely with China through the use of the perfect mixing assumption. To do so, we compute the percentage of trade amount (kg) China contributed over the total trade amount of Eastern Asia countries per commodity for the year 2017 by UN COMTRADE [34]. This ratio is applied to our estimated county-level food flows to disaggregate China's contribution to food flows between port counties and interior counties (see SI for perfect mixture percentages per commodity for import and export separately). This process is repeated separately for imports and exports.
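The disaggregation from Eastern Asia to China can be sketched in Python as below. The country totals and the county-level flow are made-up values, and the dictionary layout is an assumption of the sketch rather than the COMTRADE or FAF file format.

```python
def china_share(comtrade_kg_by_country):
    """Fraction of Eastern Asia trade mass (one commodity, one direction)
    that is attributable to China."""
    total = sum(comtrade_kg_by_country.values())
    return comtrade_kg_by_country.get("China", 0.0) / total if total else 0.0


def china_flows(county_flows_kg, share):
    """Scale county-to-port flows with Eastern Asia down to China only,
    under the perfect mixing assumption."""
    return {link: kg * share for link, kg in county_flows_kg.items()}


# Made-up 2017 trade mass (kg) with Eastern Asia countries, one commodity.
east_asia = {"China": 60.0, "Japan": 30.0, "Republic of Korea": 10.0}
share = china_share(east_asia)

# Made-up FFM estimate: interior county -> port county, Eastern Asia total.
flows = {("Champaign, IL", "Los Angeles, CA"): 500.0}
china_only = china_flows(flows, share)
print(share, china_only)
```

The share is computed once per commodity and per direction of trade, then applied uniformly to every county-level link for that commodity.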

Core nodes in US-China agri-food trade network
We identify the core nodes in the US-China bilateral agri-food trade network following the approach in [1]. We do this per SCTG commodity for imports and exports separately. For each SCTG commodity, we create a binary network where nodes represent US counties and China provinces, and links represent whether a trade exists or not based on the estimated food flows. Here, we looked at the binarized version of the network since we want to identify the locations (both counties and provinces) that are critical for the connectivity structure of the US-China trade network. We include all 3135 US counties and 31 Chinese provinces as network nodes, where 228 of the US counties have international ports.
There are 719 643 potential links in the bilateral trade network when we consider a network consisting of an internal county (either a source or destination of commodity trade), a county with a port, a province with a port, and the internal province (either a source or destination of commodity trade). The density of the county-level network is computed as the ratio of the estimated number of links over the potential number of links.
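As a small worked example of the density calculation, the Python sketch below uses the county-province link counts reported in the results. The function name is hypothetical.

```python
def network_density(estimated_links, potential_links):
    """Ratio of estimated links to potential links."""
    if potential_links <= 0:
        raise ValueError("potential_links must be positive")
    return estimated_links / potential_links


POTENTIAL_COUNTY_LINKS = 719_643

# Link counts for the county-province network as reported in the results.
county_density = network_density(162_922, POTENTIAL_COUNTY_LINKS)
export_density = network_density(137_375, POTENTIAL_COUNTY_LINKS)
import_density = network_density(112_179, POTENTIAL_COUNTY_LINKS)

print(round(county_density, 2),
      round(export_density, 3),
      round(import_density, 3))
```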
To determine the core nodes, we adopt TOPSIS, a multi-criteria decision analysis technique. TOPSIS is a commonly used approach to determine the importance of network components [42,43] as it ranks the components based on their performance on pre-determined criteria [44]. We use node betweenness centrality and degree [45] as criteria to assess the core counties. Degree centrality is the number of connections each node has, and betweenness centrality is the fraction of all shortest paths between pairs of nodes that pass through a given node. High degree centrality represents a node that is well-connected with the rest of the network [46]. Similarly, if a node has high betweenness centrality, then it is one of the main bridges connecting the origin and destination points [47]. Nodes with both high degree and betweenness centrality are defined to be core nodes, which represent the most central and connected nodes in the bilateral trade network [1].
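The ranking idea can be illustrated with a self-contained Python sketch on a toy binary network. It computes degree and unnormalised betweenness (using Brandes' algorithm) and then applies a standard TOPSIS closeness score. The equal criterion weights, the vector normalisation, and the five-node network are assumptions of the sketch, not settings reported in this study.

```python
from collections import deque
from math import sqrt


def betweenness(adj):
    """Brandes' algorithm for an unweighted, undirected graph (unnormalised)."""
    bc = {v: 0.0 for v in adj}
    for s in adj:
        stack, preds = [], {v: [] for v in adj}
        sigma = {v: 0 for v in adj}
        dist = {v: -1 for v in adj}
        sigma[s], dist[s] = 1, 0
        queue = deque([s])
        while queue:  # breadth-first search from s
            v = queue.popleft()
            stack.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = {v: 0.0 for v in adj}
        while stack:  # accumulate dependencies in reverse BFS order
            w = stack.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                bc[w] += delta[w]
    return {v: b / 2 for v, b in bc.items()}  # each pair was counted twice


def topsis(scores, weights):
    """Closeness to the ideal solution; all criteria are treated as benefits."""
    nodes, k = list(scores), len(weights)
    norms = [sqrt(sum(scores[n][j] ** 2 for n in nodes)) or 1.0 for j in range(k)]
    w = {n: [weights[j] * scores[n][j] / norms[j] for j in range(k)] for n in nodes}
    best = [max(w[n][j] for n in nodes) for j in range(k)]
    worst = [min(w[n][j] for n in nodes) for j in range(k)]
    closeness = {}
    for n in nodes:
        d_best = sqrt(sum((w[n][j] - best[j]) ** 2 for j in range(k)))
        d_worst = sqrt(sum((w[n][j] - worst[j]) ** 2 for j in range(k)))
        total = d_best + d_worst
        closeness[n] = d_worst / total if total else 0.0
    return closeness


# Toy network: two interior counties and two provinces linked via one port county.
adj = {"county_a": ["port"], "county_b": ["port"],
       "port": ["county_a", "county_b", "prov_x", "prov_y"],
       "prov_x": ["port"], "prov_y": ["port"]}
bc = betweenness(adj)
criteria = {v: [len(adj[v]), bc[v]] for v in adj}  # [degree, betweenness]
closeness = topsis(criteria, weights=[0.5, 0.5])   # equal weights (assumed)
core = max(closeness, key=closeness.get)
print(core, closeness)
```

In this toy case every shortest path between the other nodes passes through the port county, so it dominates both criteria and receives the highest closeness score, mirroring the role of port counties as transit hubs.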

Results
In this section, we address our research questions. We present findings on US-China agri-food bilateral trade at both national and sub-national scales, and highlight key commodities, trade links, and core nodes. Our results here are based on trade in units of mass (kg); the SI contains additional results based on trade in units of US Dollars ($). Figure 3 shows US exports and imports with China at the national scale for agri-food commodities. Agri-food commodities are categorized in SCTG groups (see table 2) and reported in units of both mass (in million tons) and value (in billion USD). The US generally is a net exporter of agri-food to China. The US is a major exporter of SCTG 03 to China, with an export mass of 32.58 million tons and a value of $14.15 billion. This is consistent with the fact that the US is a major exporter of soybean to China, and soy is contained within SCTG 03. The US is a net importer of processed food from China, as can be seen for SCTG 06 and 07. The US imports the most SCTG 07, with a mass of 1.38 million tons and a value of $2.06 billion.

How much agri-food trade occurs between China and the United States?
Figures 4-7 map imports/exports for provinces and states for each SCTG (i.e. SCTG 01-07). The top ten importing and exporting states and provinces by mass are listed in table 4. Louisiana, Washington, Illinois, Texas, and Ohio are the five largest exporters of agri-food to China, accounting for 72% of agri-food exports by mass. California, New Jersey, New York, Illinois, and Texas are the top five importing states, responsible for 62% of all US imports from China. For China, Shandong, Yunnan, Hubei, Guangdong, and Guangxi are the biggest exporting provinces, making up about 49% of all exports. Guangdong, Fujian, Jiangsu, Shandong, and Jilin are the top importing provinces, with 42% of all agri-food imports.

What are the major agri-food trade links between Chinese provinces and U.S. states?
Answering this question requires examining link-level interactions between states and provinces. We estimate that there are 2954 undirected links between states and provinces (2618 directed links for US imports; 2704 directed links for US exports) out of a potential 3348 links. This means that there is a high level of interaction between states and provinces, with a network density of 0.88. Density broken down by direction of trade is 0.81 for total exports and 0.78 for total imports (see SI for the network density categorized by each SCTG commodity type). It is important to note that we calculate network density such that each direction of trade provides a separate link. For example, if Beijing province exports SCTG 01 to Illinois, that is a unique trade link, and so is the export of the same commodity from Illinois to Beijing. Figure 8 maps link-level trade between states and provinces for all agri-food commodities (i.e. the sum of SCTGs 01-07) in mass (see the SI for a version in dollar value ($)). Blue lines show exports from US states; red lines represent exports from Chinese provinces. The four largest export links are from Louisiana to Guangdong, Fujian, Shandong, and Jiangsu. Washington is also a major exporter to Guangdong, Fujian, Shandong, Jiangsu, and Jilin. The largest US import links are to California from Yunnan, Hubei, and Shandong, with New Jersey another important location for imports from Yunnan, Hubei, and Shandong. These link-level results are consistent with our state-level estimates in section 3.1, where we established that Louisiana, Washington, and Illinois are the top exporters, and California and New Jersey are the top importers.

What are the major agri-food trade links between Chinese provinces and U.S. counties?
Figure 9 shows link-level trade between counties and provinces for aggregated agri-food commodities (i.e. the sum of SCTGs 01-07). We estimate that there are 162 922 undirected links for the county-province network (137 375 directed links for US exports; 112 179 directed links for US imports) out of 719 643 potential links. This means that the county-province network has a density of 0.23. Density broken down by direction of trade is 0.191 for total exports and 0.156 for total imports (see SI for the network density categorized by each SCTG commodity type). The county-province network is thus less connected than the state-province network, as we would expect, due to finer spatial granularity. Yet, it is still a well-connected network, highlighting the tight coupling in agri-food trade between the US and China. Figure 9 shows a similar spatial mapping to figure 8 (see the SI for a similar map in units of dollar value ($)). The largest county-level exports to China are those of SCTG 03 from Louisiana counties, namely Terrebonne Parish, East Baton Rouge Parish, and New Orleans, to the Chinese provinces of Guangdong, Fujian, Shandong, and Jilin. For US imports from China, we observe a mix of SCTG 03, 05, and 07 commodities. Shandong to Los Angeles County, CA and Yunnan and Hubei to Middlesex County, NJ trading SCTG 03 and SCTG 07, respectively, are the largest US import links in mass (see table 6).
Figure 3. Barplots of US exports and imports with China in mass (A), (B) and US Dollars (C), (D) by Standard Classification of Transported Goods (SCTG) agri-food category. SCTG 03 'Agricultural products', which contains soy, is the largest export from the US to China, whereas SCTG 07 'Other prepared foodstuffs' is the largest import to the US from China.

What are the core nodes in the US-China agri-food trade network?
The ten core locations for import and export across agri-food commodities are listed in table 7 (see SI for the top 10 core locations broken down by SCTG commodities as well as a detailed explanation of the core node analysis). As we expect, the counties with international ports are identified as the core locations, since they are responsible for collecting and disseminating trade between the two countries. Here, port counties are the transit hubs that facilitate the international trade of goods through air, land, and sea.
Note that our core node analysis includes the complete US-China bilateral trade network. This means that international trade between counties and provinces is included, as well as our inter-county estimates to ports in the US, and inter-province connections within China as given by the MRIO data. However, Chinese provinces are not identified to be core to the network. We believe this is primarily because spatially refined food flow information is not available within China to the same detail that it is within the US. This means that we do not have the same degree of topological detail within China as we do within the US. We have normalized the node degree centrality of all network nodes to account for this difference in resolution. Chinese provinces do appear in the top twenty list of core nodes, just not within the top ten. This highlights the importance of spatial detail in food trade information for resolving the core locations.
Core nodes have domestic significance in terms of regional employment, export gains [48], and have an international relevance as a medium of connecting supply chains across countries [49]. However, they are sensitive to disruptions which could have major consequences for both importing and exporting nations in the bilateral network [50,51]. Identification of such centralized nodes can better inform the scale of the potential impact of any future disruptions, help to mitigate logistical and economical losses for both countries, and assess network resiliency [52,53]. Also, since these core counties (port counties) are transit hubs, catering to high trade demands, recognizing such hubs in the network is integral to prioritizing infrastructure investments in these locations for an uninterrupted operation.

Discussion
Here, we discuss the limitations of our model, and highlight opportunities for future research.

Limitations of modeling approach
The methodological framework that we employed in this study is data-driven and relies on a variety of input databases. The data-driven nature of our approach is both a limitation and an advantage. It is a limitation because we do not explicitly include mechanisms that would enable us to run experiments with the model. Our data-driven approach also means that input data availability limits the time frame of our study. This is the reason that our study is restricted to the year 2017, for example. All estimates in this study are for the year 2017, which is before the retaliatory tariffs were implemented by China on US agriculture. This means that our results represent a snapshot in time prior to these US-China trade tensions.
We show in figure 10 that our model captures the animal feed supply chain between the two countries, in which soy is grown in the US and supplied to animal feed operations in China. There is reasonable spatial correspondence between soy production in the US (from USDA data) and our model estimates of SCTG 03 exports to China (e.g. compare figure 10(A) with figure 10(B), although it is not perfect, largely because we map soy production by SCTG 03 exports. There is good spatial agreement between the locations with intensive pig production in China (taken from [54]) and our model estimates of province-level imports of SCTG 03. We thus notice that major soy producing locations also tend to be major exporters, while pig fattening locations in China are likewise major importers. Figure 10 also highlights a major limitation of our study, in that it is restricted by the commodity resolution of the SCTG groups (e.g. see table 2). This means that we do not explicitly model the trade of soy, for example, but, instead estimate imports/exports of SCTG 03 'agricultural products' , the commodity category which contains soy.
The main shortcoming of our study is the absence of ground truth data to validate our high-resolution trade estimates. Yet, our methodological framework is constrained by a variety of real-world measures to ensure that our estimates are bounded by reality. Another limitation of the analysis is the mirror differences in the data sources e.g. there might be a difference in reporting for exports from US to China and China imports from the US. Here, we utilize the US Census data as the standard dataset for the conversion of MRIO sector-based commodity data into SCTG commodity data due to accessibility and credibility concerns. Also, as MRIO reports data in a coarser commodity category than SCTG commodity categories, the crosswalk only provides an estimate of the corresponding SCTG equivalent share which might be different from the actual records.
Our approach relies on some key assumptions. Namely, we assume that goods are well mixed once they arrive at a port. This means that they are sent out in the same proportion that they are received. In other words, if 30% of SCTG 03 received at Los Angeles County, CA was from Champaign County, IL then we assume that exports from Los Angeles County, CA to provinces in China are still in the 30% proportion from Champaign County, IL (see section 2.4). Similarly, we determine the proportion of trade by SCTG commodity that is from the US to China (out of the East Asia region reported by the FAF database) using a national ratio to partition this trade. The same is true for estimating exports from provinces to the US as a fraction of the Rest of World reported by the MRIO database. Further, we assume that the national average shares of SCTG commodities within MRIO sectors (using SCTG-MRIO crosswalk) applies equally to all provinces. Finally, we assume that the location fractions of provinces apply equally to the trade with all US states. These limitations and assumptions of our study are important to note when using our estimates.
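The well-mixed port assumption can be illustrated with a minimal sketch. This is a toy example under stated assumptions, not the model code used in the study: the function name is hypothetical, the quantities are made up, and only the 30% share for Champaign County, IL follows the example in the text.

```python
def allocate_port_exports(inflows_to_port, exports_from_port):
    """Apportion each port-to-province export among origin counties.

    Well-mixed assumption: every outbound shipment from the port has the
    same origin composition as the total inflow received at the port.
    """
    total_in = sum(inflows_to_port.values())
    if total_in <= 0:
        raise ValueError("port has no inflow to apportion")
    # Share of the port's inflow that came from each origin county.
    shares = {origin: qty / total_in for origin, qty in inflows_to_port.items()}
    # Apply the same origin shares to every destination province.
    return {
        (origin, province): share * volume
        for origin, share in shares.items()
        for province, volume in exports_from_port.items()
    }


# Hypothetical SCTG 03 quantities (arbitrary units) at one port county.
inflows = {"Champaign County, IL": 30.0, "Other origins": 70.0}
exports = {"Guangdong": 50.0, "Jiangsu": 20.0}

links = allocate_port_exports(inflows, exports)
champaign_to_guangdong = links[("Champaign County, IL", "Guangdong")]
total_allocated = sum(links.values())
```

Because the origin shares sum to one, the county-province links add up to the port's total exports to China, so the allocation preserves mass. The inflow and export totals need not be equal, since a port may also serve other destinations.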

Future research
The main advantage of our model is the provision of high-resolution US-China agri-food trade estimates. We provide a schematic of what the output of our model looks like in figure 2, for several important locations. Figure 2 illustrates that we map both imports and exports at high spatial resolution between the two countries. In particular, we restrict our trade estimates to move to/from port counties in the US, highlighting the two-step nature of our food flow estimates within the US. This figure is a simple illustration of two trade links, while we provide many more link-level estimates that future researchers and decision-makers could use to inform their analyses. Decision-makers may want to augment our estimates with additional local data that they possess.
We used the FFM to estimate flows between counties in the US, but adapted it to specifically estimate exports to and imports from China. Our adaptation of the FFM highlights that it can be used to partition both domestic and international supply chain information within the US in future work. Moreover, this study could serve as a useful blueprint for other geographic regions and goods. In particular, other countries that have sub-national supply chain data would make viable candidates for applying our methodology. For example, future work may integrate our approach to map the US-Brazil supply chain, using the TRASE database [30]. Application to other supply chains with important sustainability and national security implications, such as rare earths, energy commodities, or forestry products, would be particularly valuable.
The spatially detailed trade data that we have generated in this study (and make freely available with the paper) could be paired with footprint estimates to quantify embodied resources (e.g. water, carbon, etc.) in future work. However, it is important to note that our flow database would primarily help to resolve transport footprints, which are considerably smaller than production footprints [27], but could still be large and comparable to the emissions of some nations [55]. Another important avenue of future research is to determine any vulnerabilities that may exist in this bilateral agri-food trade network. Since trade between the US and China has global repercussions, it is important to identify opportunities to reduce risk in their joint supply chains. As such, our work could also be used to inform supply chain security, such as identifying locations to explore further for infrastructure investment.
Our results are provided at two spatial resolutions within the US: states and counties. However, the statistical information needed for our model is only available for provinces within China. Future research on multi-scale agri-food supply chains in China could build on this work. In particular, it would be beneficial for future studies to identify the movement of agricultural and food commodities through the ports of China, as we were able to do within the US. Similarly, further efforts to identify the chokepoints in agri-food supply chains within both countries, such as processing facilities, transit hubs, and storage depots, would provide the insight necessary to evaluate their sustainability and resilience.

Conclusion
In this study, we mapped bilateral agri-food trade between the US and China at the finest spatial resolution to date. To do this, we developed a novel data-driven model that allowed interoperability between different national datasets. We generated state-province and county-province links that help in identifying the most concentrated food flows and major subnational locations in the bilateral trade network. Louisiana, Washington, and Illinois are the major agri-food exporters, while Guangdong, Fujian, and Jiangsu are the major importers. We also identified the core export locations in the bilateral agri-food trade network, namely San Bernardino County, CA; Los Angeles County, CA; Ventura County, CA; Cook County, IL; and DuPage County, IL. These core nodes are transit hubs that are critical to the seamless functioning of the US-China bilateral trade network, as any disruption could cause major logistical and economic losses for both countries. We notice that Chinese provinces are not part of the top 10 core nodes in the network. We suggest that this is primarily due to the unavailability of high-resolution food flow information within China. This highlights the importance of spatially detailed data for the identification of critical nodes in a trade network.
Our study contributes to a more comprehensive understanding of agricultural and food trade between the US and China. Future research can explore supply chain sustainability, security, and investment between the two countries using the database that we have provided with the paper. Notably, the framework that we developed may prove to be a useful blueprint for modeling the supply chains of other geographic regions and goods.

Data availability statement
The data that support the findings of this study are openly available at the following URL: https://doi.org/10.13012/B2IDB-3649756_V1.
or The MITRE Corporation. We would also like to gratefully acknowledge the publicly available datasets that enabled this work provided in table 3.