Spatial and temporal distribution of HCHO and its pollution sources based on satellite remote sensing: a case study of the Yangtze River Economic Belt

In recent years, with accelerating industrialization and urban expansion, air pollution, including formaldehyde (HCHO) pollution, has become increasingly serious. To study HCHO pollution in the Yangtze River Economic Belt (YEB), the spatial and temporal evolution of atmospheric HCHO and its influencing factors during 2012–2021 were analyzed using data from the Ozone Monitoring Instrument (OMI). The results showed that the HCHO column concentration over the YEB was unevenly distributed, with high values concentrated in Anhui, Jiangsu, Yunnan and Hubei provinces. Over the 10 years, the HCHO column concentration across the YEB varied between 10.28 × 10¹⁵ and 17.19 × 10¹⁵ molec cm⁻². The annual mean was lowest in 2015 (13.16 × 10¹⁵ molec cm⁻²) and peaked in 2018 (14.93 × 10¹⁵ molec cm⁻²). Among natural sources, the normalized difference vegetation index (NDVI) and leaf area index (LAI) had the greater influence on YEB HCHO, with correlation coefficients ranging from −0.91 to 0.97 and from −0.90 to 0.95, respectively. HCHO was positively correlated with mean annual temperature (MAT) over 93% of the area. The contribution of areas with high-intensity human activity to HCHO cannot be underestimated, and industrial and civil sources have a large influence on HCHO. In addition, potential sources of HCHO in Shanghai include local emissions, trans-regional sources, northwestern air masses and ocean airflow.


Introduction
Formaldehyde (HCHO) is one of the main contributors to photochemical pollution. It is also listed as a first-class carcinogen and ranks second on China's list of toxic chemicals, accounting for more than 50% of the risk of cancer. Long-term exposure to high concentrations of HCHO can cause respiratory diseases, memory loss and nervous system failure [1]. Its sources can be divided into anthropogenic and natural sources. Anthropogenic sources are mainly energy consumption, chemical production and motor vehicle exhaust. Vegetation and precipitation are among the natural sources, and plants are the largest emission source of biogenic volatile organic compounds (BVOCs). Under the combined influence of temperature, photosynthetically active radiation (PAR) and water vapor, BVOCs produce HCHO through photochemical reactions [2].
In recent years, satellite remote sensing has become an effective method for observing and analyzing regional atmospheric HCHO. Following GOME and SCIAMACHY, the Ozone Monitoring Instrument (OMI), a newer-generation sensor, has been favored by scholars for the high accuracy, stability and reliability of its HCHO monitoring [3,4]. Kuttippurath et al noted that HCHO is highly harmful to human health. Using satellite measurements of the HCHO column concentration over India, they found that the increase in atmospheric HCHO during the COVID-19 lockdown was related to pyrogenic and biogenic sources in addition to anthropogenic sources [5]. Bai et al estimated the emission fluxes of isoprene and BVOCs in a subtropical plantation from satellite HCHO data and the quantitative relationship between BVOCs and HCHO [6]. Syuichi et al and Hyo-Jung Lee et al used OMI satellite data to study the ratio of HCHO to NO₂ and its long-term trend in East Asia (China, South Korea and Japan), and found that anthropogenic emissions strongly affected the O₃ sensitivity regime [7,8].
The Yangtze River Economic Belt (YEB) has a highly integrated system of city agglomerations that hosts 30% of the country's petrochemical industry and 40% of its cement industry. Its pollutant emission intensity per unit area is 1.3 to 2.1 times higher than the national average, and an unreasonable energy and industrial structure has led to increasingly serious air pollution in the YEB [9]. As a result, the Chengdu-Chongqing urban agglomeration, the urban agglomeration in the middle reaches of the Yangtze River and the Yangtze River Delta urban agglomeration have all become key regions for air pollution prevention and control in China. Recent studies have shown that GDP is the largest contributor to carbon dioxide (CO₂) emissions in the YEB [10]. Zhou et al analyzed PM₂.₅ in the three major YEB urban agglomerations and found that particulate matter pollution in the middle reaches of the Yangtze River urban agglomeration is severe and related to precipitation and temperature [11]. Volatile organic compounds (VOCs) emitted from the YEB account for 44% of the national total, industrial and traffic sources together contribute more than 80% of regional VOC emissions, and non-methane VOCs (NMVOCs) in the YEB have become a research focus. In 2012, the Chinese Ministry of Environmental Protection released the 12th Five-Year Plan for Air Pollution Prevention and Control in Key Regions. The plan emphasized that atmospheric pollutants had become a bottleneck limiting regional socioeconomic development and that the atmospheric environment needed further improvement, which prompted extensive public discussion. Most scholars' reports on the atmospheric environment of the YEB have focused on ozone, PM₂.₅ or CO₂ [12,13].
In contrast, few studies based on satellite retrievals have reported on the spatial and temporal pollution characteristics of HCHO in the YEB or on the relationship between human activities, including NMVOC emissions, and HCHO in the region.
For these reasons, this paper explores the spatial and temporal pollution characteristics and source-sink relationships of HCHO in the YEB from 2012 to 2021, based on OMI HCHO column concentration data combined with meteorological data and statistical yearbook information. The results are of strategic importance for establishing joint prevention and control of regional air pollution.
The YEB (97°21′E to 123°10′E, 21°08′N to 35°08′N) spans the eastern, central and western regions of China (figure 1) and covers 11 provincial-level regions (9 provinces and 2 municipalities). It is home to 42.8% of the country's population and is the largest inland industrial belt and manufacturing base in the world, with many large petrochemical, chemical and iron and steel enterprises. Its industrial structure is dominated by heavy industry, its energy consumption is dominated by coal, and its urbanization rate exceeds 60%. However, urbanization and economic development also bring huge environmental pressure, increasing the contribution of anthropogenic volatile organic compounds (AVOCs) and further aggravating air pollution.

Datasets
The daily remote sensing data of HCHO, O₃ and NO₂ (L2 v003) were obtained from the Ozone Monitoring Instrument (OMI) on board the Aura satellite. OMI has an average spectral resolution of 0.45 nm and covers the globe once a day, with an overpass time of about 13:38 local time. Its data products include HCHO, O₃, NO₂ and absorbing aerosols. The pollutant concentration data (HCHO, O₃, NO₂) selected in this paper have been screened for cloud interference and are stored in HDF5 format. They are retrieved using differential optical absorption spectroscopy (DOAS), combined with a radiative transfer model and the IMAGES global chemical transport model [14].
The Multi-resolution Emission Inventory for China (MEIC) (http://meicmodel.org/about.html) is an anthropogenic emission inventory model of air pollutants and greenhouse gases developed on a cloud computing platform. The inventory provides annual data for five emission sectors (industrial, civil, transportation, power and agricultural sources), which can be used to analyze the causes of air pollution and to support research on air quality control [15]. Because data from 2018 onward were not available, annual NMVOC data for each sub-sector (industry, transportation, residential and power) from 2012 to 2017 were selected in this paper.
Vegetation data (normalized difference vegetation index (NDVI) and leaf area index (LAI)) and land use data were provided by the Data Center for Resources and Environment, Chinese Academy of Sciences (http://www.resdc.cn). Meteorological data (mean annual temperature (MAT) and mean annual water content (MAWC)) were provided by the National Oceanic and Atmospheric Administration (https://psl.noaa.gov/). Economic data on human activities were taken from the National Bureau of Statistics of the People's Republic of China (http://www.stats.gov.cn/).

Calculation of data
The Pearson correlation coefficient, proposed by the British mathematician Karl Pearson in his work on regression statistics, indicates the degree of linear correlation between two variables. The greater the absolute value of the coefficient, the stronger the linear correlation between the two variables. It has been widely used in atmospheric science, teaching and research, medical biology and other fields [16].
In this study, the Pearson correlation coefficient was used to analyze the correlation between HCHO and NDVI, LAI, MAT, MAWC and other atmospheric pollutants (O₃ and NO₂). It is calculated with Formula (1):

$$R_{xy} = \frac{\sum_{i=1}^{N}(x_i-\bar{x})(y_i-\bar{y})}{\sqrt{\sum_{i=1}^{N}(x_i-\bar{x})^{2}\,\sum_{i=1}^{N}(y_i-\bar{y})^{2}}} \qquad (1)$$

where $R_{xy}$ is the correlation coefficient between x and y, with a value range of [−1, 1]; $x_i$ is the mean HCHO column concentration in year i; $\bar{x}$ is the multi-year mean of HCHO; $y_i$ is the annual mean of the vegetation data, meteorological data or other air pollutant data in year i; $\bar{y}$ is the multi-year mean of those data; and N is the number of samples.

The potential source contribution function (PSCF) can roughly identify the potential source areas of air pollutants. It is the ratio of the residence time of trajectories whose receptor concentration exceeds a set threshold to the residence time of all trajectories in a grid cell. MeteoInfo software was used to create a grid covering the longitude and latitude range of the study area, with grid cells of 1° × 1°. The PSCF value is calculated with Formula (2):

$$PSCF_{ij} = \frac{m_{ij}}{n_{ij}} \qquad (2)$$

where $n_{ij}$ is the number of trajectories or trajectory nodes passing through grid cell (i, j), and $m_{ij}$ is the number of those trajectories or trajectory nodes for which the pollutant concentration at the corresponding receptor point exceeds the set concentration level. Since the PSCF value is a probability and the residence time of each trajectory in a grid cell varies, the PSCF value may be uncertain, so it is weighted by interval to reduce the error [17,18]. The weight function $W_{ij}$ is defined in Formula (3).
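As a minimal sketch of Formulas (1) and (2), the two quantities can be computed with NumPy as below. This is an illustration rather than the processing chain used in the paper: the function names, the endpoint-based counting and the use of `numpy.histogram2d` for gridding are assumptions, and the weight function of Formula (3) is not applied because its thresholds are not reproduced here.

```python
import numpy as np

def pearson_r(x, y):
    """Formula (1): Pearson correlation of two equally long annual series."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    dx = x - x.mean()
    dy = y - y.mean()
    return float((dx * dy).sum() / np.sqrt((dx ** 2).sum() * (dy ** 2).sum()))

def pscf(lats, lons, conc, threshold, lat_edges, lon_edges):
    """Formula (2): PSCF_ij = m_ij / n_ij on a regular latitude/longitude grid.

    lats, lons : coordinates of trajectory nodes (hypothetical input layout)
    conc       : receptor concentration associated with each node
    threshold  : concentration level above which a node counts as polluted
    Cells that no trajectory node crosses are returned as NaN.
    """
    lats = np.asarray(lats, dtype=float)
    lons = np.asarray(lons, dtype=float)
    conc = np.asarray(conc, dtype=float)
    bins = [lat_edges, lon_edges]
    # n_ij: all nodes per cell; m_ij: nodes linked to above-threshold concentrations
    n, _, _ = np.histogram2d(lats, lons, bins=bins)
    polluted = conc > threshold
    m, _, _ = np.histogram2d(lats[polluted], lons[polluted], bins=bins)
    with np.errstate(invalid="ignore", divide="ignore"):
        return np.where(n > 0, m / n, np.nan)
```

With 1° × 1° cells, `lat_edges` and `lon_edges` would be integer-degree arrays spanning the study area; a weighted PSCF would multiply the result cell by cell by the weights of Formula (3).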

Interannual spatial distribution of tropospheric HCHO
To illustrate the spatial distribution of the YEB HCHO column concentration more intuitively, the 2012–2021 HCHO column concentrations were divided into seven levels spanning the interval between the minimum concentration of 10.28 × 10¹⁵ molec cm⁻² and the maximum concentration of 17.19 × 10¹⁵ molec cm⁻² (figure 2).
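The seven-level classification can be sketched as follows. The text gives only the minimum, the maximum and the number of levels, so equal-width intervals are an assumption here, and the function name `hcho_level` is hypothetical.

```python
import numpy as np

# Range and number of levels quoted in the text (units: 10^15 molec cm^-2).
VMIN, VMAX, N_LEVELS = 10.28, 17.19, 7

# Assumption: the seven levels are equal-width bins between VMIN and VMAX.
EDGES = np.linspace(VMIN, VMAX, N_LEVELS + 1)

def hcho_level(column):
    """Map HCHO column values to levels 1..7 (1 = lowest, 7 = highest)."""
    column = np.asarray(column, dtype=float)
    # Only the six interior edges are used, so values at or beyond the
    # outer limits fall into level 1 or level 7 respectively.
    return np.digitize(column, EDGES[1:-1]) + 1
```

Under this assumption the 2015 mean of 13.16 falls in the third level and the 2018 mean of 14.93 in the fifth level, which matches the levels described for those years.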
In 2012, the concentration decreased from the southeast to the northwest and was mainly at the third and fourth levels. High-value areas were concentrated in Anhui and Hubei provinces, generally at the fifth and sixth levels. Low-value areas were mainly distributed in Chongqing and Sichuan, and the concentration was lowest around Ganzi Tibetan Autonomous Prefecture, which appeared as a first-level area. In 2013, HCHO decreased as a whole, and more second- and third-level regions appeared in Sichuan, Jiangxi and central China. The environmental pollution control standard newly issued in 2013 required strict control of environmental pollutant concentrations, and the HCHO concentration fell significantly. In 2014, HCHO increased significantly and was mainly at the fifth and sixth levels. High-value areas were mainly located in Anhui, Hubei and eastern Yunnan, and the HCHO concentration peaked in northwest Anhui. Biomass burning, especially straw burning, accounted for a large proportion of emissions in Anhui, and transport from the northwest also had some impact [19]. From 2015 to 2017, the HCHO column concentration showed a general downward trend and was mainly at the third and fourth levels. In 2015, the overall HCHO column concentration fell to its lowest level, with a mean of 13.16 × 10¹⁵ molec cm⁻², and third-level regions were more widespread than in any other year. In 2015, the Law of the People's Republic of China on the Prevention and Control of Air Pollution was issued to prevent and control pollution from coal burning, industry, dust and other energy-related sources, which significantly reduced air pollutant concentrations nationwide. In 2016, high-value areas expanded, mainly in Yunnan, Anhui and Jiangsu.
In 2016, the annual growth rates of automobile ownership in Yunnan, Anhui and Jiangsu were 13.63%, 5.86% and 1.45%, respectively, and the resulting increase in vehicle exhaust emissions raised the HCHO concentration to some extent. In 2017, the HCHO concentration decreased notably, especially in Chongqing, Guizhou, Jiangxi and Hunan. In 2017, a year in which many environmental protection policies were introduced, environmental standards covered urban air quality, air pollution control at thermal power plants, volatile organic compound (VOC) control and other areas, leading to a decrease in HCHO and other atmospheric pollutants. In 2018 and 2019, the HCHO column concentration first increased and then decreased, but remained high overall. The mean value in 2018 was 14.93 × 10¹⁵ molec cm⁻², the peak of the past 10 years, with a growth rate of 10.54%, and the fifth- and sixth-level regions accounted for about 77% of the area. In 2019, a slight decline appeared: the extent of the sixth-level areas narrowed, and most areas were at the fifth level. In 2018 and 2019, the YEB's GDP accounted for 44.76% and 46.6% of the national total, respectively, indicating rapid economic development. Industrial energy consumption also increased, with its rate of change rising from −0.3% in 2017 to 2% in 2019. At the same time, the number of cars increased by 8% and 7%, respectively, which raised the YEB HCHO column concentration to some extent. The HCHO column concentration continued to decrease in 2020, and the extent of the third and fourth levels expanded, which was related to the COVID-19 lockdown in 2020.
Due to extensive restrictions on travel, commercial operations and interpersonal contact, emissions of air pollutants, including HCHO, decreased significantly in 2020. In February 2020, air pollutant emissions were 19%–36% lower than in February 2019; NMVOC emissions decreased by 4%–31% and NOx emissions by 7%–36% [20]. The HCHO concentration increased slightly in 2021. In general, the HCHO concentration in northwestern Sichuan and in the central and coastal areas of the YEB has been low over the past 10 years. Terrain explains the low concentrations in Ganzi Tibetan Autonomous Prefecture of Sichuan and in the central part of the YEB. In the coastal area, flat terrain and sea-land breezes prevent HCHO from accumulating.

Interannual temporal variation of tropospheric HCHO
The mean HCHO column concentration of the YEB is 13.98 × 10¹⁵ molec cm⁻². The mean in the Yangtze River Delta (YRD) is 14.16 × 10¹⁵ molec cm⁻². Owing to its more developed economy, large population and high energy consumption, the YRD has a slightly higher HCHO concentration than the YEB as a whole [21]. The mean tropospheric HCHO column concentration over China is 10.21 × 10¹⁵ molec cm⁻² [22], about 27% lower than the YEB value (equivalently, the YEB mean is about 37% higher than the national mean).
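The comparison between the three regional means depends on which value is taken as the reference. The short sketch below, using only the means quoted above and a hypothetical helper `relative_difference`, shows both directions of the percentage.

```python
def relative_difference(value, reference):
    """Fractional difference of `value` relative to `reference`."""
    return (value - reference) / reference

# Mean HCHO column concentrations quoted in the text (10^15 molec cm^-2).
YEB_MEAN, YRD_MEAN, CHINA_MEAN = 13.98, 14.16, 10.21

yeb_vs_china = relative_difference(YEB_MEAN, CHINA_MEAN)   # YEB relative to China
china_vs_yeb = relative_difference(CHINA_MEAN, YEB_MEAN)   # China relative to YEB
yrd_vs_yeb = relative_difference(YRD_MEAN, YEB_MEAN)       # YRD relative to YEB
```

Relative to the national mean, the YEB value is roughly 37% higher; relative to the YEB mean, the national value is roughly 27% lower. The YRD exceeds the YEB by a little over 1%.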
The interannual change of the YEB HCHO column concentration from 2012 to 2021 is shown in figure 3. Over the decade, the interannual minimum and maximum of the HCHO column concentration were 11.94 × 10¹⁵ molec cm⁻² and 15.84 × 10¹⁵ molec cm⁻². From 2012 to 2014, the YEB HCHO column concentration first decreased and then increased. From 2014 to 2018, it followed a W-shaped trend. 2014 and 2018 were the two peaks, with HCHO concentrations notably higher than in the preceding years. In these two peak years, the YEB's economy developed rapidly and its industrial output value increased. In 2014, the number of industrial enterprises above designated size was 182,573, car ownership increased by 10.35% compared with the previous year, and the completed housing area increased by 11.77%. In 2018, the total output value of the YEB reached about 40.3 trillion CNY, accounting for 44.76% of the country's total; GDP grew by 7.7%, and industry above designated size grew by 6.9%. The significant increase in the HCHO column concentration was also due to high temperature, solar radiation and other natural factors. In the past 10 years, the HCHO column concentration was lowest in 2015, at 13.16 × 10¹⁵ molec cm⁻², and its rate of change was also the largest, down 10% from the previous year. In 2015, the emission source inventory of air pollutants was released, VOC pollution sources in the fossil fuel industry were investigated, and ultra-low emission and energy-saving retrofits of coal-fired power plants were comprehensively implemented. Air pollutant emissions were effectively controlled, and the HCHO column concentration fell significantly. From 2018 to 2021, with 2020 as the turning point, the HCHO column concentration first declined and then increased, with a general downward trend.
In 2020, continued efforts to improve air quality, together with the impact of COVID-19 and the implementation of a series of production restrictions, reduced total air pollutant emissions and the number of pollution days. The HCHO concentration rose in 2021, which may be related to the recovery of China's industrial economy [23]. In addition, the proportion of the YEB area at each HCHO level over the past 10 years was calculated according to the classification shown in figure 2. The levels can be ranked as follows: fourth > fifth > third > sixth > second > seventh > first. The fourth and fifth levels together accounted for more than 66% of the total area, while the first and sixth levels accounted for less than 2% of the area. The proportion of the fifth level expanded in 2014. In general, the HCHO column concentration varied considerably over the decade, and the area proportion of each level changed from year to year, but the fourth and fifth levels remained dominant.

Discussion
Areas with high HCHO concentration were mainly located in Yunnan and the northeast of the YEB, including Anhui, Jiangsu, Hubei and western Zhejiang (figure 4(a)). Comparison with the land use map (figure 4(b)) showed that areas with high HCHO concentration all had high vegetation coverage or intensive human activity. Subtropical forests are an important source of BVOC emissions in China. Isoprene released by vegetation is oxidized through photochemical reactions, increasing the HCHO column concentration [24]. In Anhui, Jiangsu, Hubei and northern Zhejiang, cultivated land is widespread, human activities are intensive and the HCHO column concentration is high. On this basis, this paper discusses the contributions of natural and human factors to the HCHO column concentration.

Influence of vegetation and meteorological factors on HCHO
To study the HCHO column concentration in terms of BVOCs, it is necessary to consider vegetation coverage and leaf growth. Woody plants produce more BVOCs than shrubs and herbs [25]. In addition, the contribution of leaf development characteristics to the isoprene emission rate differs between growth environments [26]. Barkley et al introduced two factors, the leaf area index and the enhanced vegetation index, when studying changes in HCHO over South America, and found that changes in vegetation corresponded with decreases in the HCHO column concentration [27]. Generally speaking, isoprene emissions increase two- to fourfold for a temperature rise of 10 °C. If a plant is moved to a low-temperature environment, its leaf stomata close and emissions fall [28,29]. Water vapor contributes to the greenhouse effect while also providing growth conditions for vegetation, thus affecting the conversion rate of VOCs and other HCHO precursors [30]. Vegetation condition and the atmospheric environment therefore have a large impact on the HCHO column concentration. The Pearson correlation coefficient was used in this study to discuss the contribution of natural factors to the HCHO column concentration.
The YEB lies mainly in the monsoon climate zone, where rainfall and heat coincide. Besides high vegetation coverage, MAT and MAWC often act together on atmospheric HCHO, making the HCHO column concentration more sensitive to atmospheric conditions. Figure 5(a) shows the spatial distribution of the correlation between YEB HCHO and NDVI. The correlation coefficient ranges from −0.91 to 0.97. Positive correlations occurred in most areas except Yunnan Province and accounted for 72% of the total area, with stronger positive correlations in the provinces of the middle and lower Yangtze River plain. Negative correlations accounted for less than one third of the total area, mostly in Yunnan, southern Sichuan, Guizhou and neighboring areas, which may be related to precipitation. In addition, at the higher altitudes of northern Yunnan and southern Sichuan, poor light and heat conditions inhibited vegetation growth, and rivers and lakes showed strong negative correlations. The correlation between HCHO and LAI is shown in figure 5(b). The correlation coefficient ranges from −0.90 to 0.95, and the west of the YEB mainly exhibits negative correlation. There are also some negatively correlated areas in northern Anhui and Jiangsu and over water bodies. Most of the eastern provinces exhibit positive correlation; the positively correlated area is four times the negatively correlated area and accounts for 80% of the total area of the YEB. The spatial pattern of positive and negative correlations for LAI is consistent with that for NDVI (figure 5(a)). In general, areas with extensive vegetation coverage contribute more to local HCHO, especially in summer, when leaf greening is extensive [31].
MAT and HCHO were positively correlated over 93% of the area, with a notably strong positive correlation in the middle and lower reaches of the Yangtze River. The west of the YEB mainly exhibits medium correlation, and only southern Sichuan and a small part of western Yunnan show weak negative correlation (figure 5(c)). The YEB is mostly located in the subtropics, where temperatures are high. On the one hand, high temperature increases the collision frequency between gas molecules, so HCHO forms more readily under photochemical conditions. On the other hand, high temperature increases the activity of biosynthetic enzymes in plants, and plants emit large amounts of isoprene through their leaf stomata, raising the HCHO concentration in the atmosphere [32]. Therefore, against the background of high national temperatures in 2018 and 2019 (the national average temperature was 0.5 °C and 0.79 °C higher than in previous years, respectively), the HCHO concentration of the YEB also increased significantly. Ju et al reached the same conclusion when studying the characteristics of HCHO pollution in Jiangsu, Zhejiang and Shanghai [33]. Figure 5(d) shows the correlation between HCHO and MAWC. The correlation coefficient ranges from −0.50 to 0.81, and the positively correlated area accounts for 82% of the total area. The east of the YEB, mostly the lower reaches of the Yangtze River, shows a strong positive correlation, while central Sichuan and southern Guizhou show a medium correlation. The study area has abundant annual precipitation, and the resulting increase in atmospheric water content promotes the decomposition of formaldehyde polymers. In addition, the east of the YEB is economically developed and emits more organic compounds, causing a significant increase in the atmospheric HCHO column concentration.
The negatively correlated area accounts for 18% of the total area and is mainly distributed in the southwest, concentrated in Yunnan, southern Sichuan and parts of southwest Guizhou. These areas receive heavy precipitation, and rainstorms reduce the atmospheric HCHO column concentration through wet deposition. An increase in annual precipitation is also unfavorable for vegetation growth. This is consistent with the conclusion drawn by He in a study of the relationship between precipitation and the vegetation ecosystem in Yunnan [34].
In addition to the above indicators, plant indicators such as tree crown, stomatal conductance and leaf age, and meteorological indicators such as wind direction, wind speed and precipitation are often used to measure regional HCHO column concentration changes, which is of great significance to the study of atmospheric formaldehyde pollution.
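The correlation maps and area percentages discussed in this section can be approximated pixel by pixel. The sketch below assumes that the annual HCHO field and the driving factor (NDVI, LAI, MAT or MAWC) are stored as arrays of shape (years, rows, cols) on the same grid, and that all pixels have equal area; both are simplifying assumptions, and the function names are hypothetical.

```python
import numpy as np

def correlation_map(hcho, factor):
    """Pearson r along the time axis for arrays shaped (years, rows, cols)."""
    hcho = np.asarray(hcho, dtype=float)
    factor = np.asarray(factor, dtype=float)
    dh = hcho - hcho.mean(axis=0)
    df = factor - factor.mean(axis=0)
    num = (dh * df).sum(axis=0)
    den = np.sqrt((dh ** 2).sum(axis=0) * (df ** 2).sum(axis=0))
    # Pixels with no temporal variation have an undefined coefficient.
    with np.errstate(invalid="ignore", divide="ignore"):
        return np.where(den > 0, num / den, np.nan)

def positive_fraction(r):
    """Share of valid pixels with a positive coefficient (equal-area pixels assumed)."""
    valid = ~np.isnan(r)
    return float((r[valid] > 0).sum() / valid.sum())
```

On a latitude/longitude grid spanning about 14° of latitude, weighting each pixel by the cosine of its latitude would give a more faithful area share than the plain pixel count used here.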

Correlation analysis between HCHO, O₃ and NO₂
The HCHO concentration reflects changes in VOC activity. NO₂ is one of the most stable species in the NOx family. VOCs and NOx react together through complex photochemical reactions to generate secondary pollutants such as O₃ [35,36]. With rapid urbanization, air pollution problems have become increasingly prominent and have shifted from single-pollutant to compound pollution. This paper discusses the relationship among HCHO, O₃ and NO₂ in the YEB and further examines the contribution of HCHO to air pollution there. Figure 6(a) shows that areas with high HCHO concentration were mainly distributed in Anhui, Hubei, Jiangsu, Zhejiang, Sichuan and Yunnan provinces, accounting for about half of the area of the YEB. Forest vegetation coverage is high in Sichuan and Yunnan, whereas the high HCHO column concentration in Jiangsu, Zhejiang, Anhui and other places is related to human activities. By contrast, the HCHO column concentration in coastal areas was low, probably because frequent typhoons, concentrated rainstorms and frequent frontal rain in this area scavenge HCHO. O₃ tended to be higher in the north and lower in the south. High-concentration areas were mainly located in the north of the YEB, concentrated in Jiangsu, Shanghai, Anhui and northern Hubei (figure 6(b)). NO₂ was more spatially concentrated than O₃, and its mean value was lower. The NO₂ level was higher in the east and lower in the west (figure 6(c)). In general, the concentrations of HCHO, O₃ and NO₂ were high in the east of the YEB, a major VOC emission area with intensive human activities and highly concentrated energy use and industrial activity.
To understand the contribution of HCHO to O₃, this paper also compared the spatial correlation maps of O₃ with HCHO and with NO₂ (figures 7(a), (b)). Figure 7(a) shows the spatial correlation between HCHO and O₃, which ranges from −0.55 to 0.88. The positively correlated area accounts for 88% of the total area. Except for negative correlations in Yunnan, western Sichuan, northern Anhui, Jiangsu and Shanghai, positive correlation was dominant in the other provinces. These results indicate that O₃ in the positively correlated regions is more susceptible to HCHO than O₃ in western Sichuan, Yunnan and Jiangsu. HCHO, a major VOC component, is also the most abundant carbonyl compound in the atmosphere. Free radicals generated by its photolysis drive O₃ production, so the O₃ concentration increases in areas with high HCHO column concentration. Biogenic emissions from vegetation are also estimated to increase the O₃ content in the Chengdu-Chongqing area [37]. The correlation between O₃ and NO₂ ranged from −0.70 to 0.88, and strong positive correlations were mainly distributed in the west and northeast of the YEB, such as Sichuan, Yunnan, Anhui and Jiangsu. The negatively correlated area accounted for 51% of the total area (figure 7(b)). With accelerating industrialization, about 95% of NOx emissions are converted into NO₂ in a photochemical environment involving O₃. At the same time, NO₂ readily generates O₃ through photochemical reactions. Therefore, there is a strong positive correlation between O₃ and NO₂ in the eastern YEB, and O₃ in these regions is more sensitive to NO₂.
By contrast, western Sichuan, Yunnan, Guizhou and other areas are vast and sparsely populated, with relatively weak human activity, relatively few stationary industrial sources and mobile vehicle sources, low NOx emissions and low O₃ concentrations [38]. Moreover, the chemistry linking O₃ and NOx is nonlinear. In certain environments, the generation of O₃ consumes NO₂. In addition, when the NO₂ concentration is low, the weakened titration effect also leads to an increase in O₃ [39,40]. This may explain the negative correlation between O₃ and NO₂ in the central YEB.
In general, the correlation patterns in figures 7(a) and (b) are broadly opposite. The difference is most obvious in Hunan, Jiangxi, Guizhou, Zhejiang and other provinces, which are VOC-controlled areas. Related studies also showed that the proportion of reactive biogenic VOCs in these areas was large and increased toward the south [41]. In the west and east of the YEB, O₃ is more sensitive to NO₂.

Relationship between NMVOC and HCHO
NMVOCs have a wide range of sources and play an important role in the formation of HCHO and O₃ [42]. Because accurate information on agricultural sources and NMVOC data for 2018–2021 were lacking, this paper uses industrial, civil, traffic and power sources to explain the impact of NMVOCs on YEB HCHO.
In recent years, emissions from the YEB have been at a high level. Figure 8 shows the NMVOC emission proportion by sub-sector from 2012 to 2017. The total NMVOC emission of the YEB showed an overall upward trend: 9.072 million tons in 2013, 9.619 million tons in 2014 (an annual growth rate of 6%) and 9.636 million tons in 2017. As figure 8 shows, industrial and civil sources were the main contributors to total YEB emissions. Industrial sources accounted for the largest share of total NMVOC emissions, exceeding 70% and increasing. The share of civil sources decreased from 27% in 2012 to 20% in 2017 but still made an important contribution to HCHO in the study area. The sub-sectors rank by NMVOC emission share as: industrial source > civil source > traffic source > power source. In addition, the interannual trends of the YEB HCHO column concentration and of emissions from civil, industrial, transportation and power sources from 2012 to 2017 show that, except for civil sources, all sectors followed basically the same upward trend, although the increase in transportation and power sources was relatively slow. Emissions from industrial sources increased from 6.409 million tons in 2012 to 7.521 million tons in 2017. Over these six years, the interannual variation of industrial sources was basically consistent with that of the YEB HCHO column concentration, indicating that industrial sources were the main driver of the variation in the YEB HCHO column concentration.
Studies have shown that industrial energy production is the main source of formaldehyde emissions; for example, oxidation of olefins from the petrochemical industry can release a large amount of formaldehyde [43]. Civil sources, on the other hand, have decreased year by year, with an average annual reduction rate of 4%. In 2015, they fell by 122,000 tons compared with the previous year, a reduction rate of 5.3%. The decreasing trend may be related to biomass fuel transformation [44]. In addition, the contribution of transportation and power sources to formaldehyde in the study area cannot be ignored. YEB contains many large cities, and its transportation system consists mainly of marine, land and air transportation networks and logistics systems. Vehicle exhaust is not only a major source of formaldehyde but also accounts for 20% of isoprene in the atmosphere [45], while thermal power plants also contribute to formaldehyde generation. It has since been documented that during the COVID-19 period (2019-2022), socio-economic activity decreased dramatically, with reductions in industrial production, traffic and fossil fuel use. In February 2020 alone, the industrial and transport sectors (transport emissions are small, but transport activity fell sharply) contributed 66%, 88%, 70%, 90% and 62% of the drop in SO₂, NOx, CO, NMVOCs and PM₂.₅ emissions, respectively [20]. Therefore, the industrial energy structure should be optimized, the adoption of new energy vehicles effectively promoted, and thermal power generation actively controlled.

Influence of human activity intensity on HCHO
The double accumulation curve can be used to test the consistency between two parameters and their changes. If the curve deviates to the left, the impact has increased; if it deviates to the right, the impact has decreased. The size of the deviation reflects the level of impact [46].
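The construction of such a curve can be sketched in a few lines. The example below is a minimal illustration, not the procedure used in this paper: the two annual series are hypothetical values, the break year is chosen by hand, and the helper names are assumptions. It accumulates both series and compares the slope of the cumulative curve before and after the break. A constant slope means the relationship is stable; a change in slope is what appears as a deviation of the curve, and whether it reads as "left" or "right" depends on which variable is plotted on which axis.

```python
from itertools import accumulate


def double_cumulative(x, y):
    """Return paired cumulative sums of two equally long annual series."""
    if len(x) != len(y):
        raise ValueError("series must have the same length")
    return list(accumulate(x)), list(accumulate(y))


def slope(xs, ys):
    """Least-squares slope of ys against xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((a - mean_x) ** 2 for a in xs)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(xs, ys))
    return sxy / sxx


# Hypothetical annual values, for illustration only (not data from the paper).
driver = [10.0, 10.0, 10.0, 10.0, 10.0, 10.0]  # e.g. a socio-economic indicator
hcho = [2.0, 2.0, 2.0, 3.0, 3.0, 3.0]          # e.g. HCHO column, arbitrary units

cum_driver, cum_hcho = double_cumulative(driver, hcho)

# Compare the slope of the cumulative curve before and after a candidate break.
split = 3
slope_before = slope(cum_driver[:split], cum_hcho[:split])
slope_after = slope(cum_driver[split - 1:], cum_hcho[split - 1:])

print(f"slope before break: {slope_before:.2f}")
print(f"slope after break:  {slope_after:.2f}")
```

In this made-up series, HCHO accumulates faster per unit of the driver after the break, so the later slope is larger than the earlier one.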
In this paper, the influence of social and economic factors on the HCHO column concentration is further clarified through the double accumulation curve. The relationship between the HCHO column concentration and the cumulative population is shown in figure 9(a). The relationship between the cumulative population and the cumulative HCHO column concentration was basically linear, indicating that population has no significant impact on the HCHO column concentration. Figure 9(b) shows the cumulative relationship between the HCHO column concentration and the completed building area. The two accumulation curves roughly overlapped, indicating that the completed building area has no significant influence on the HCHO column concentration. As shown in figures 9(c) and (d), respectively, the cumulative power curve and the cumulative GDP curve deviated to the left, indicating that increases in power consumption and GDP are positively correlated with the HCHO column concentration. The deviation for electric power is the largest, indicating that its influence on HCHO is also the largest, followed by GDP. Therefore, for YEB, the contribution of social and economic factors to the HCHO column concentration can be ranked as: electricity > GDP > completed building area > population. The power industry ranks first among the six industries with the most serious air pollution emissions [47]. Electricity consumption in YEB is mainly industrial and residential. It directly or indirectly increases industrial energy consumption and exhaust emissions, especially air pollution from thermal power plants and excessive emissions of nitrogen oxides (NOx), carbon oxides (CO, CO₂) and dust, which promote the generation of HCHO. GDP represents the development of the social economy.
GDP growth not only increases vehicle ownership but also raises energy consumption. For example, chemical production processes, automobile exhaust and incomplete combustion of fuels all increase the atmospheric formaldehyde content. As the population grows, the number of houses and buildings also increases, and household and building decoration materials affect atmospheric formaldehyde through their production and through indoor ventilation [48]. Fan et al also demonstrated the effect of human activities on the HCHO column concentration in eastern China using the double accumulation curve, further supporting the reliability of the contribution of human sources to HCHO [49].

Analysis of potential source area of HCHO
The WPSCF value can be used to identify the potential source areas that cause high concentrations of pollutants and their chemical components. The larger the WPSCF value, the greater the impact of that area on pollutant concentrations in the study area [50].
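As an illustration of how such values are typically obtained, the sketch below computes a weighted potential source contribution function on a grid. For each cell, the PSCF is the share of trajectory endpoints that belong to trajectories arriving at the receptor during polluted conditions (m_ij / n_ij), and the WPSCF multiplies it by a weight that down-weights cells with few endpoints. The weighting thresholds, the function name and the sample data are assumptions for illustration; they are not taken from this paper, and published studies use differing weight schemes.

```python
from collections import Counter


def wpscf(endpoints, threshold):
    """Compute a weighted PSCF value for each grid cell.

    endpoints: iterable of (cell, concentration) pairs, one per trajectory
               endpoint; concentration is the value observed at the receptor
               when that trajectory arrived.
    threshold: concentration above which a trajectory counts as polluted.
    """
    n = Counter()  # all endpoints per cell (n_ij)
    m = Counter()  # endpoints of polluted trajectories per cell (m_ij)
    for cell, conc in endpoints:
        n[cell] += 1
        if conc > threshold:
            m[cell] += 1
    if not n:
        return {}

    n_avg = sum(n.values()) / len(n)

    def weight(count):
        # Assumed weighting scheme; thresholds differ between studies.
        if count > 3 * n_avg:
            return 1.0
        if count > 1.5 * n_avg:
            return 0.7
        if count > n_avg:
            return 0.4
        return 0.2

    return {cell: weight(n[cell]) * m[cell] / n[cell] for cell in n}


# Hypothetical endpoints: (grid cell, receptor concentration in arbitrary units).
cell_a, cell_b, cell_c = (121, 31), (118, 34), (125, 30)
sample = (
    [(cell_a, 20.0)] * 6 + [(cell_a, 5.0)] * 2   # 8 endpoints, 6 polluted
    + [(cell_b, 20.0)] * 2                       # 2 endpoints, both polluted
    + [(cell_c, 5.0)] * 2                        # 2 endpoints, none polluted
)

result = wpscf(sample, threshold=10.0)
for cell, value in sorted(result.items()):
    print(cell, round(value, 3))
```

In this made-up sample, the sparsely sampled cell with only polluted endpoints has a raw PSCF of 1 but receives a low weight, so it ranks below the well-sampled cell. That is the purpose of the weighting.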
Shanghai (121.48°E, 31.22°N) is selected as the research object in this paper. The functional index of downtown Shanghai is far higher than that of other cities in YEB. It is the core of YEB's social development, and its driving role and service radiation capacity cannot be underestimated. The 72 h backward trajectory data at a height of 500 m above the ground from March 2020 to February 2021 were selected for calculation. The resulting distribution of WPSCF values is shown in figure 10. The characteristics of the potential sources of formaldehyde in Shanghai differed clearly between seasons.
In spring, the areas with high WPSCF values are mainly distributed in Shanghai, Zhejiang and Jiangsu, with Shanghai and its neighboring cities showing the highest values. The WPSCF value in most areas of Anhui, Shandong, eastern Jiangxi and Hebei exceeds 0.7 (figure 10(a)). These areas have extensive agricultural cultivation, and agricultural activities such as spring ploughing and fertilization have a considerable impact on formaldehyde. There is also a major source area of formaldehyde in the sea area east of Shanghai, which may be caused by dust and pollutants being transported to the Yellow Sea area with the spring monsoon and then back to Shanghai [51]. Because of their distance, northern Hunan and eastern Guangdong contribute only weakly. In summer, the potential source areas of Shanghai are mainly located along the southeast coast and its sea area and cover a large range, mainly Shanghai, Zhejiang, Fujian and central Guangdong (figure 10(b)). In autumn, a large area of high WPSCF values is distributed over the Yellow Sea and its coastal areas (figure 10(c)), indicating that ocean airflow makes a great contribution to Shanghai. In addition, Shandong and other places have sporadic high-value areas, which may be related to the long-distance southward transport of pollutants downwind. Compared with other seasons, the track of strong source areas in winter is significantly longer and runs from northwest to southeast (figure 10(d)), because cold air moving from the northwest carries man-made pollutants from the mainland to the south. In winter, the area of high WPSCF values is the smallest and most concentrated, mainly in Shanghai, eastern Jiangsu and northern Zhejiang.
In general, Shanghai, a typical area with high-intensity human activity, is not only an emission source of formaldehyde and other pollutants but also a gathering place for compound air pollutants, so it acts as both a source and a sink. Taking formaldehyde as an example, the dynamic evolution of the HCHO concentration is mainly influenced by local emissions in Shanghai (acting as a source) and is also affected, in different seasons of the year, by cross-regional potential sources, the northwest air mass and the ocean airflow (acting as a sink). The regions with a high contribution rate to formaldehyde are characterized by intensive human activities, developed industries and high anthropogenic VOC emissions. In addition, high WPSCF values occur in the coastal areas of the Yangtze River Delta all year round, which may be related to the developed shipping industry and requires further study.

Conclusions
In terms of spatial distribution, the overall distribution of HCHO concentration in YEB is uneven, with high value areas concentrated in northeastern YEB and Yunnan, and low value areas distributed in western Sichuan, central YEB and coastal areas.
In terms of time series, the average concentration of HCHO in YEB is 13.98 × 10¹⁵ molec cm⁻², with the highest concentration of 14.93 × 10¹⁵ molec cm⁻² in 2018.
Among the natural sources, the influence of vegetation on the HCHO of YEB is more obvious. The correlation coefficients of HCHO with NDVI and LAI were −0.91 ∼ 0.97 and −0.9 ∼ 0.95, respectively, and MAT showed a mainly positive correlation with HCHO in YEB. The positive correlation between O₃ and HCHO was mainly distributed in the central part of YEB, negative correlation areas were concentrated in the northeastern and western parts of YEB, and the correlation between O₃ and NO₂ showed the opposite spatial distribution pattern.
Among the anthropogenic sources, industrial and residential sources are the main sources of HCHO column concentrations in YEB. Although the civil sources are decreasing year by year, the high emissions of NMVOCs still have an important impact on HCHO. The double accumulation curve further clarifies the degree of contribution of human activities to YEB HCHO: electricity consumption > GDP > completed building area > population.
The potential source area of HCHO in Shanghai is affected by local emissions, trans-regional transport, the northwest air mass and ocean airflow. In spring, the areas with high WPSCF values are mainly distributed in Shanghai, Zhejiang and Jiangsu, especially in Shanghai and its neighboring cities. In summer, Shanghai is mainly affected by the oceanic easterly monsoon. In autumn, a large area of high WPSCF values is distributed over the Yellow Sea and its coastal areas, indicating that ocean airflow makes a great contribution to Shanghai. In winter, the track of the strong source region is clearly longer and runs from northwest to southeast, which is caused by cold air moving from the northwest and carrying pollutants from the mainland southward. In addition, the high WPSCF values in the Yangtze River Delta may be related to the developed shipping industry.

Data availability statement
All data that support the findings of this study are included within the article (and any supplementary files).