High-dimensional Statistical Analysis and Its Application to an ALMA Map of NGC 253

In astronomy, if we denote the dimension of data as d and the number of samples as n, we often find a case with n ≪ d. Traditionally, such a situation is regarded as ill-posed, and there was no choice but to discard most of the information in data dimensions to let d < n. The data with n ≪ d is referred to as the high-dimensional low sample size (HDLSS). To deal with HDLSS problems, a method called high-dimensional statistics has rapidly developed in the last decade. In this work, we first introduce high-dimensional statistical analysis to the astronomical community. We apply two representative methods in the high-dimensional statistical analysis methods, noise-reduction principal component analysis (NRPCA) and automatic sparse principal component analysis (A-SPCA), to a spectroscopic map of a nearby archetype starburst galaxy NGC 253 taken by the Atacama Large Millimeter/submillimeter Array (ALMA). The ALMA map is an example of a typical HDLSS data set. First, we analyzed the original data, including the Doppler shift due to the systemic rotation. High-dimensional PCA can precisely describe the spatial structure of the rotation. We then applied to the Doppler-shift corrected data to analyze more subtle spectral features. NRPCA and R-SPCA were able to quantify the very complicated characteristics of the ALMA spectra. Particularly, we were able to extract information on the global outflow from the center of NGC 253. This method can also be applied not only to spectroscopic survey data, but also to any type of data with a small sample size and large dimension.


Introduction
In the past decade, the term big data has been widely recognized and observed in almost all fields of scientific research, and of course also in astronomy.Current astronomical instruments provide us with overwhelmingly large data, providing quantitative information on their physical conditions.The most typical big data in astronomy is the product of large surveys, such as the Sloan Digital Sky Survey (SDSS),10 the Dark Energy Spectroscopic Instrument Survey,11 Gaia, 12 the Panoramic Survey Telescope And Rapid Response System surveys,13 among many others.
We should note, however, that there is another type of big data in astronomy.Detailed observations are often very timeconsuming, and it is not easy to map objects to obtain a large number of independently sampled measurements.Such a situation is frequently found in integral field spectroscopy (e.g., UVES at the Very Large Telescope,14 SAURON on the William Herschel Telescope,15 SDSS MaNGA data, 16 among others.Another typical example of this type is spectroscopic mapping surveys by radio interferometers (e.g., the Atacama Large Millimeter/submillimeter Array (ALMA)) or by satellite instruments at X-ray, ultraviolet, or mid/far-infrared wavelengths.In such a case, if we denote the dimension in the wavelength (or frequency) with d and the number of samples with n, we often find that n = d.
Classical statistical analysis implicitly assumes a sample size n is much larger than data dimension d (n ?d).As we mentioned, this is not true for many cases in scientific research.Traditionally, in astrophysics, such a situation is regarded as an ill-posed problem, and there was no choice but to discard most of the information in the wavelength direction to let d < n.This was more or less the same for other research fields before 2010.Many researchers have simply given up further analysis, or even did not find a need to deal with such problems before the advent of the big data era.For example, Hand et al. (1994) published a well-known compilation of statistical data, containing more than 500 samples, but most of them satisfied the classical condition of the assumption d < n.However, in the late 1990s, data with d ?n suddenly started to appear along with the development of information science.One of the epoch-making examples of such kind of research was that of Golub et al. (1999).They used the gene expression monitoring by DNA microarrays applied to human acute leukemias as a test case.The dimension of the DNA microarray was d = 7129 and the sample size n = 72.Statistics in the 1990s could not guarantee the accuracy of the estimation in such a case since it was based on d < n.
Various fundamental problems were known for data with the condition of n = d: 1.The sample covariance matrix is usually numerically unstable or does not exist, and classical multivariate analysis cannot work.2. We meet various types of the curse of dimensionality.

Calculation costs are often excessively large.
A method to handle such a problem has been desired since then.In order to study such type of data with a very high dimension and small sample data, we had to push back the boundaries of the traditional technique of multivariate statistical analysis.This is because traditional methods are not able to capture the characteristics of the high-dimensional data space, and hence, not able to make use of the rich information that the data intrinsically contained.
The situation changed drastically in the 2000s.A turning point was introduced in the field of probability theory and theoretical physics based on the random matrix theory.In these studies, the asymptotic behavior of eigenvalues of the sample covariance matrix in the limit as d → ∞ was explored under Gaussian or Gaussian-type assumptions on the population distribution when d and n increase at the same rate, i.e., n/d → c > 0 (c: const.)(e.g., Johnstone 2001;Baik et al. 2005;Baik & Silverstein 2006;Paul 2007).In our current interest, however, the assumptions n/d → c > 0 and Gaussian(-type) are actually unrealistically restrictive.A seminal paper by Hall et al. (2005) introduced a new concept, high-dimensional low sample size (HDLSS) asymptotics 17 The limits they studied were taken as the dimension of the data d → ∞, while the sample size n was fixed.At that time, this was a revolutionary departure from the traditional asymptotics based on n → ∞ with d fixed.Hall et al. (2005) and Ahn et al. (2007) discussed an important representation of the HDLSS data, but still under the assumption of Gaussian(-type).In contrast, Yata & Aoshima (2012) developed a new theory by eliminating this assumption and discovered a completely different type of representation when the Gaussian assumption is not satisfied (see Section 2 for the details).These works were the foundation of research in mathematical statistics on very high-dimensional data analysis, and evolved into an important part of the current trend in the area of research, big data science.This new insight shed light on various areas of study such as genomics, medical research, neuroscience, and image and shape analysis.Then, in their sequence of papers, M. Aoshima, K. Yata, A. Ishii and collaborators further developed a new framework of statistical methodology to tackle this type of problem (e.g., Aoshima & Yata 2011;Yata & Aoshima 2012, 2013;Aoshima & Yata 2014, 2015;Ishii et al. 2016;Aoshima & Yata 2019, and references therein).Readers who are interested in the mathematical background of high-dimensional statistical analysis itself are guided to some reviews, (e.g., Aoshima & Yata 2017;Aoshima et al. 2018).
Despite the great developments in other research fields, this new methodology of statistics seemed to be completely unknown in astronomy.High-dimensional data such as ALMA data are regarded as a sum of informative signals and noise, but we cannot separate them quantitatively without prior information.Supposing that we know the frequencies of emission lines is a kind of such prior information that has been traditionally used in traditional astronomy, but this is not what we prefer.While some frequency regions are considered to contain no information, we proceeded to extract information not by the traditional procedure, but by an objective statistical method without prior information.High-dimensional statistical analysis is one of such objective methods.
Thus, in this work, we introduce the method of highdimensional statistical analysis for HDLSS data in astronomy.In order to examine the performance of the high-dimensional methods to astronomical data, we apply it to a spectroscopic map of the central region of NGC 253.NGC 253 is a nearby prototypical starburst galaxy (Rieke et al. 1980), and has been observed in various wavelengths.We apply high-dimensional principal component analysis (PCA) to the ALMA spectral map (Ando et al. 2017, hereafter A17).A17 presented an 8 × 5 pc resolution view of the central ∼200 pc region of NGC 253, based on ALMA Band 7 observations.As we introduce in detail, the ALMA maps can be regarded as being typically HDLSS.The spectrum of NGC 253 is very rich in molecular lines (e.g., A17; Martín et al. 2021).Another focus of this work is to classify the spectra full of spectral line features by PCA.The evolution of galaxies is mainly driven by star formation, a transition from interstellar medium (ISM) to stars.Various phases of the ISM are related, and the evolution of the ISM is key to completing the understanding of the galaxy's evolution.Spectroscopic observations are of vital importance to extract and interpret the information of matter in galaxies (e.g., Martín et al. 2021).Then, this classification is potentially a powerful tool for the study of galaxy evolution if it turns out to be useful.
This paper is organized as follows.In Section 2, we introduce the concept of HDLSS data and high-dimensional statistical analysis, with an emphasis on high-dimensional PCA.There, we present some important theorems that characterize the surprising features of HDLSS data, without detailed mathematical proof.Then, in Section 3, we move on to the actual analysis of astronomical data.As already mentioned, we used an ALMA map of NGC 253.We summarize the data properties in this section.Our data analysis is twofold: first, we examine if the high-dimensional statistical analysis is really useful for the analysis of spectroscopic mapping data in Section 4. After confirming its excellent performance in this type of analysis, we further go to a deeper analysis of the NGC 253 molecular lines in Section 5. Section 6 is devoted to the conclusions of this work.Some technical details related to both theoretical and practical methods in this work are presented in the appendices.Appendices A and B describe the properties of the dual sample covariance and its geometric representation.In Appendix C, we present the methodology for non-Gaussian data that was not discussed in the main text.A comparison between the results of traditional and highdimensional PCAs is shown in Appendix D. The method to correct the Doppler shift is presented in Appendix E.

High-dimensional Statistical Analysis
The HDLSS data have a sample size n and data dimension d with n = d.In this work, we mean the newly developed statistical methodology to deal with HDLSS data by highdimensional statistical analysis.

Unusual Behavior of High-dimensional Statistics
As a pedagogical example, we demonstrate that typical unusual properties only appear in high-dimensional statistical analysis, known as strong inconsistency.We consider a sample mean x , Suppose that x i s are drawn from a d-dimensional distribution with a mean μ.Under the usual condition often supposed in traditional statistics, d/n → 0, we have where the superscript P stands for the convergence in probability. 18Equation (2) is referred to as the consistency of the sample average.However, in high-dimensional statistics, d/ n → ∞, and shockingly we have This is the so-called strong inconsistency in the HDLSS framework.What Equation (3) means is that the noise in the observed data drastically increases with the increase of the dimension d.Traditional statistical techniques suppose that the noise is negligible compared to the data information.Hence, traditional statistics can never handle HDLSS data.
In order to have a more concrete image of such counterintuitive phenomena, we examine the behavior of this huge noise in high dimensions.For this, however, some mathematical background is necessary.In the following, we introduce the background mathematical concepts.We do not try to introduce them strictly to try to give rigorous mathematical proof, even if the following sections seem to be rather mathematically inclined.

Dual Sample Covariance
Consider a parent distribution of dimension d, with a mean vector μ = 0 and a covariance matrix S > ˜Õ d (i.e., all the eigenvalues are positive).We can adopt this constraint for the sake of simplicity.Let λ i (i = 1,...,d) be the eigenvalues of S ˜, and sort them out as Then, we can make an eigenvalue decomposition of is an orthogonal matrix of the corresponding eigenvectors.Note that these are all outrageously huge matrices.Now we draw a set of n samples from this parent population (d > n), x 1 , x 2 , ..., x n .Suppose that we describe this set of data by a and x i (i = 1, ..., n) are independently and identically distributed (i.i.d.).We should note that the i.i.d.condition is not always satisfied in practice.However, this condition is often required for the rigorous construction of the mathematical methodology.The sample covariance matrix (d × d) is expressed as . 8 Here, we define the dual sample covariance (n × n), A schematic description of a sample covariance matrix S and its dual matrix SD is shown in Figure 1.If we consider the sample eigenvalues of SD , its n eigenvalues are common with the first n eigenvalues of S. See Appendix A for the merit of SD .An important piece of the initial exploration of HDLSS asymptotics in Hall et al. (2005) was the revelation of a surprisingly rigid geometric structure that naturally emerges from the asymptotics in an increasingly random setting, referred to as the geometric representation by them.Recall the eigenvalue decomposition of the covariance matrix Then, Z is a (d × n) sphered data matrix from a distribution with the identity covariance matrix.Here, we write , where  and  denote the expectation and variance of a random variable, and Ĩn is the n-dimensional identity matrix.We assume that the fourth moments of each variable in Z are uniformly bounded.Note that if X is Gaussian, the z ij s are i.i.d.standard normal random variables.We also define the eigenvalue decomposition of SD by Here, u i denotes a unit eigenvector corresponding to l ˆi.See Appendix B for the details of the geometric representations of SD .
Let ĥi be the ith eigenvector of S. Note that ĥi can be calculated by

High-dimensional PCA
We adopt the same setting as the previous sections for the data distribution: we suppose a random variable with a mean μ = 0 and a covariance matrix S > ˜0 d .We assume the following model, referred to as the generalized spiked model, for the eigenvalue distribution of S ˜d, where a i (> 0), c i (> 0), and α i (α 1 ... α m > 0) are unknown constants that preserve the order λ 1 ...
See Appendix C for details of the consistency when Z does not satisfy Condition (16).
Theorem 2.1 gives the condition that the estimation by traditional PCA would have consistency.The condition described by d → ∞ and n → ∞ in (1) of these theorems is a mild condition in the sense that one can choose n free from d or that n may be much smaller than d such as = n d log .However, in (2), n is heavily dependent on d.Hence, we should be very careful to apply conventional PCA to the HDLSS data.In order to overcome this fundamental problem of conventional PCA, Yata & Aoshima (2010a) and Yata & Aoshima (2012, hereafter YA12) proposed two new PCAs for HDLSS data.In this work, we focus on one of them, the noise-reduction (NR) methodology (YA12).

NRPCA: A New Method for HDLSS Data
Here, we introduce high-dimensional PCA.First, we informally present the concept of high-dimensional PCA, which is schematically described in Figure 2. Because of the enormous noise that concentrates on the sphere in the data space, the intrinsically important characteristics of the data are completely buried under the noise sphere and obscured.However, it is not hopeless to estimate the characteristics even under such a situation, thanks to the fact that the noise behavior around the sphere can be well analyzed and essentially subtracted.The method to concretely realize this idea is the NR methodology proposed by YA12.
We try to explain it in a more rigorous manner.YA12 proposed the NR methodology derived from the geometric representation (Equation (B2)).First, we decompose SD as å å The second term represents the noise.Then, from Theorem B.1, under Condition (16), we obtain the following geometric representation for the noise: i.e., the noise part becomes deterministic under the HDLSS condition.This result enables us to remove the noise part and 19 One may be skeptical as to whether such a restrictive model can really describe the general properties of the eigenvalue distribution of real data.However, as we see in Sections 4 and 5, the generalized spiked model well represents the actual eigenvalues. 20In this context, consistency stands for the situation as follows: consider an estimator t(n) (n: sample size) is a consistent estimator for a parameter θ if and only if, for all ò > 0, we have q An estimation of m is given by Aoshima & Yata (2018).We have the following theorem.
Theorem 2.2 YA12 II.When the components of Z satisfy Condition (16), it holds that for  i m, If we compare Theorem 2.2 with Theorem 2.1, we see that the consistency holds for l s i  under milder conditions than l (under Condition (16)). 21ext, we apply the NR methodology to the principal component , and examine its performance as the estimator of the PC direction showed the following theorem.
We also consider the PC scores.The ith PC score of x i is given by l Then, we have yet another important theorem.
Theorem 2.4 YA12 V. Assume that l ( )  i m i has multiplicity one.Then, under the same conditions as Theorem 2.3, All these theorems hold for the cases with nonzero mean, by replacing the setting with zero-centered variables.Thus, now we have demonstrated that the NR methodology yields high performance for the estimation of the PC eigenvalues (contributions), PC vectors, and PC scores for HDLSS data.In the following discussion, the presented eigenvalues are obtained by NRPCA.

Automatic Sparse Principal Component Analysis
In this section, we introduce (thresholded) sparse principal component analysis (SPCA) methods for high-dimensional data.SPCA can be summarized as follows: let for all i.Given a sequence of threshold values ζ > 0, define the thresholded entries as However, the estimator heavily depends on a choice of ζ and does not hold consistency properties for HDLSS data.Recently, Yata & Aoshima (2024) proposed an SPCA method by using the NR method as follows: for the PC direction by the NR method, for the sake of simplicity.For a given constant ω ä (0, 1], we consider PC directions whose cumulative contribution ratio is greater than or equal to ω.There exists a unique integer . 27 We define that  automatic sparse principal component analysis (A-SPCA).They showed some consistency properties of w h i  in the HDLSS data and investigated the performance in actual data analyses.Here, A-SPCA uses the same eigenvalues estimated by NRPCA.In this paper, we set ω = 1/2. 22Intuitively, A-SPCA can extract the most important components of the eigenvectors obtained by NRPCA.Namely, these components practically control the PCs.

Data
Our target galaxy, NGC 253, is a nearby barred spiral galaxy at a distance of 3.5 Mpc, i.e., 1″ corresponds to 17 pc (Rekola et al. 2005).Morphological type Sc is assigned.It has a systemic heliocentric velocity of 243 ± 2 km s −1 (z = 0.000811: Kamphuis et al. 2015), mainly due to the cosmological expansion.This galaxy is one of the brightest sources of far-infrared emission except for the Magellanic Clouds and has been extensively studied at many wavelengths.It is known as a pure starburst, whose star formation rate (SFR) in the central molecular zone is estimated to be SFR ∼ 2 M e [yr −1 ] ( Rieke et al. 1980;Keto et al. 1999;Bendo et al. 2015).Massive, dense, and warm ISM (e.g., Sakamoto et al. 2011) was identified in the central region.Super star clusters are identified at the optical and near-infrared bands (Fernández-Ontiveros et al. 2009).An intense outflow is also observed (Matsubayashi et al. 2009;Bolatto et al. 2013).The super star clusters and outflow are also associated with the starburst activity.

ALMA Map of NGC 253
The data used in this work is a spectroscopic map of a nearby archetype starburst galaxy NGC 253, obtained by A17, at Band 7.  The A17 original map consists of 864 × 864 points in the spatial domain (R.A and decl.), and 2248 components along the frequency domain (in units of hertz) in total.However, since the pixel scale was smaller than the synthesized beam size of the map 0″45 × 0″3 (oversampled), each spatial grid in the map is not independent of each other.To overcome this issue, we regridded the map in the spatial dimension.The equivalent circular beam size corresponding to the elliptical beam of 0″ 45 × 0″3 is 0″37.We used this effective beam size to regrid the image map.The resulting map with independent grids consists of 230 × 230 pixels.We then should cut out low-signal regions.For this, instead of estimating the signal-to-noise ratio for some lines, we integrated the intensity over the whole range of frequency and calculated the total intensity I [Jy beam −1 ] and chose the regions with I > 1.0, based on the rms level.We show this mask region, together with the significantly bright regions in Figure 3.The number of remaining pixels n is 231.

Doppler Shifts due to the Systemic Hubble Velocity and Rotation
The central region of NGC 253 is strongly inclined (i ∼ 78°).This causes a fairly coherent rotation pattern on the spectroscopic map through the Doppler shift.We made a Gaussian fit to three prominent emission lines HCN(4-3), HNC(4-3), and CS(7-6) in the spectra to estimate the line center wavelengths.The map of the velocity field is presented in Figure 4. We determined the amount of the shift by averaging the shifts obtained from the three lines.Then, we shifted the frequencies with respect to the systemic velocity of NGC 253 caused by the Hubble velocity.After this correction, we have a map without the systematic effect of the coherent rotation and we can analyze more subtle properties of the submillimeter emission lines.
We demonstrate an example of the emission lines in Figure 22 in Appendix E. We have frequency gaps in our ALMA data cube.If we try to shift the spectrum according to the velocity, the boundaries of the spectra are also shifted.We simply adopted frequency ranges common to all the pixels.The final size of the data is 231 along the spatial position, and 1971 in spectral dispersion after all corrections.

Spectral Map as HDLSS Data
The resulting map of NGC 253 has spatial dimension 231 × spectral dimension 1971, i.e., n = 231 and d = 1971, and clearly n = d.Though it is not as extreme as DNA microarray data (typically n ∼ 10-100 and d ∼ 10 4 -10 5 ), our ALMA map is typical HDLSS data. 23Problems from the astrophysical side are that a very large amount of information is contained in the spectra, and a large variety of spectral lines are observed compared to n.This situation is not special in the high-dimensional analysis.
However, the application of high-dimensional methods to spectroscopic data is not completely straightforward.Spectral lines are broadened and/or shifted by a turbulent motion and other physical mechanisms in the ISM.Then, the information carried by each spectral resolution unit is not independent. 24Such a complication has never been examined in other fields of science about HDLSS data.However, since it is commonly seen in spectroscopic data in any field of astronomy, before going to the detailed analysis, we first examine if the highdimensional statistical analysis really works for the astronomical spectroscopic map, with line shifts and broadening.

Preparatory Analysis
As we have already mentioned above, there are some potential problems specific to astronomical spectroscopic data, which are different from typical examples of HDLSS data in genomics and other fields.The most fundamental issue is that when we regard spectroscopic data as vector data = , the information on the neighboring frequency (or wavelength) index i and i + 1 (i = 1, ..., d − 1) is not independent.Namely, we often have internal motion in the ISM of a galaxy that leads to a Doppler broadening along the frequency axis.We also find a systemic rotation of the system over the whole observed region of an object (Section 3).In order to examine if high-dimensional statistical analysis would work for such type of astronomical spectroscopy data, we made a preparatory analysis of the original spectral mapping data of NGC 253 with high-dimensional PCA.By the term original, we mean that we started this analysis without correcting the Doppler shift on the map due to the internal systemic rotation.Namely, we blindly applied high-dimensional PCA methods to the data and examined what feature would be detected.For this, we sort the two-dimensional coordinates along the R.A.direction and then the decl.direction, and renumbered them so that we can regard the data as a one-dimensional vector.Though we can apply PCA directly to a two-dimensional image, we adopted this procedure throughout this paper, as a first simplistic analysis.

Eigenspectra
In the following part of this work, we mainly focus on the A-SPCA results.However, since NRPCA deals with all the spectral features without extracting only the most important ones, we also present the eigenspectra (eigenvectors constructed from the observed spectra by PCA) from the NRPCA.The upper panel of Figure 5 shows the eigenspectra constructed by NRPCA from the ALMA spectral map of NGC 253. 25 We observe that specific features are commonly found in similar frequency ranges of the spectra for all the PCs.This suggests that some spectral features contain richer information on the whole spectra than other spectral regions.
A-SPCA specifies and localizes the most important components that correspond to these particular features.The lower panel of Figure 5 shows the eigenspectra obtained by A-SPCA.As seen in Figure 5, A-SPCA only leaves the values if the features are important to determine the spectra, and forces the other components to be zero.Hence, the A-SPCA eigenspectra are not used for the reconstruction of the spectra with all features, and NRPCA is used for this purpose.We make use of this outstanding advantage of A-SPCA to specify the controlling spectral features of PCA in the following analysis.

Distribution of the Contribution of PCs
First, we show the eigenvalue distribution of the PCs sorted with their contribution obtained by NRPCA in Figure 6.First, several eigenvalues are prominently larger than the rest.This behavior is referred to as spiked in the context of high-dimensional statistical analysis.Recall that the eigenvalues of the PCA represent the contribution of each PC, namely, the PCA decomposes the whole covariance matrix into contributions from each PC.Since the obtained eigenvalue distribution shows this spiked structure, we see that high-dimensional PCA detected some characteristics of the ALMA spectral map of NGC 253.This guarantees that the current data can be safely regarded as typical HDLSS, and that the assumptions of high-dimensional PCA apply.
Astrophysically, Figure 6 demonstrates a very promising possibility of the high-dimensional statistical analysis of the spectral mapping data.A17 identified more than 30 molecular/radical lines on the map (see their Figure 1), as well as some broad or continuum features.Even if we restrict our interest to the lines, it is implausible that we classify them only by physical intuition.Traditionally, for example, a line ratio has been used to extract physical information from spectral data.However, brute-force trials may not work if we try to diagnose many lines simultaneously because the number of line ratios would increase prohibitively with a number of features, known as a combinatorial explosion.In clear contrast, Figure 6 implies that the whole information of the complicated spectral map can be represented by the first several PCs.This means that the complicated spectral features including more than 30 lines are controlled by several bases represented by the PCs.This advantage itself is a well-known benefit to using PCA in general as a dimensionality reduction method (e.g., Galaz & de Lapparent 1998;Ronen et al. 1999;Wang et al. 2011;Pace et al. 2019;Portillo et al. 2020, among many others), but all these previous applications were on integrated 1D spectra with a sufficiently large number of n, namely, not HDLSS data.We stress that the analysis in this work is substantially different from these works: the current method can be applied to a spectral map, which is a typical HDLSS data set and traditional PCA would not have worked due to the huge noise sphere, as mentioned in Section 2.
Since we confirmed that the methodology of high-dimensional statistical analysis worked well for the astrophysical 23 Note that > = n d log 7.59 when d = 1971 and n = 231, so that the NR methodology works well for the HDLSS data.See footnote 21. 24 The existence of continuum radiation makes the situation even more complicated, but we do not try to examine this problem in the current work. 25We evaluated the advantage of NRPCA compared to traditional PCA through the eigenvalue estimation.This is presented in Appendix D.
spectral amp data, we safely proceeded to further analysis by high-dimensional PCA.

Physical Meaning of the First Two PCs
The next step is to examine what PC1 and PC2 represent.Figure 7 shows the scatter of PC1 and PC2.The most striking feature is the butterfly-like pattern symmetric with respect to the x-axis.It reminds us of the coherent rotation of the central region of NGC 253.We reconstructed the two-dimensional structure to map PC1 and PC2 in Figure 8.
It is clear that PC1 represents the intensity of lines, and PC2 represents the symmetric and coherent pattern of the center of NGC 253.We observe a nice agreement between the lower panel of Figure 8 with the directly estimated velocity field of the same region in Figure 4.
To have a deeper look at the relation, we present the correlation between PCs and observed properties in Figure 9.The upper panel of Figure 9 is a scatter plot between PC1 and the total intensity integrated over the whole frequency range for each pixel.We indeed observe a good correlation between them (Figure 9).The correlation is not linear because PCA only extracts the linear features of the data, while the relation to the physical quantities would not be necessarily linear.We also examined the relation between the peculiar velocity and PC2 in the right panel of Figure 9.As we already presented in a twodimensional map, PC2 approximately delineates the velocity field in Figure 9.However, we see some outliers where the PC2 and the peculiar velocity are not coherent.These regions may be affected by some local phenomena, for example, a strong outflow.We will revisit this issue later with a detailed analysis.

Spectral Features Corresponding to PCs: Original Data
It is interesting to examine what features are concretely responsible for PCs.Actually, A-SPCA can specify the responsible features to characterize the PCs.We show the responsible spectral features in Figure 10.The spectra are integrated over the whole analyzed regions for both figures.In Figure 10, stars represent the features responsible for PC1, and triangles are for PC2.We do not show higher-order contributions for clarity.PC1-related features are assigned to the HCN (4-3) and HNC(4-3) lines, particularly close to the central part of the lines.PC2-related features are related to HCN(4-3), but in contrast to PC1, they are connected to the wing part of the line.Since the spectra are integrated over the analyzed regions, the wing is actually made by the Doppler shifts.This is clearly consistent with the fact that the PC2 map (Figure 8) agrees well with the Doppler velocity field of the NGC 253 map (Figure 4).
The performance of A-SPCA to the ALMA spectroscopic map as HDLSS data is certainly guaranteed by this preparatory analysis.We can safely proceed to the detailed analysis of the molecular lines, as in the next section.

Results and Discussion
After examining the performance of the high-dimensional statistical analysis in Section 4, now we can proceed to a further physical analysis.We apply high-dimensional PCA to the Doppler-corrected map of NGC 253, exactly in the same manner as we did in Section 4. Note that, though we use the same notation for PC2, now PC2 is the value obtained from the Doppler-corrected map and different from those in Section 4 from NRPCA.We should also note that since we applied NRPCA and A-SPCA to the Doppler-corrected map, all the contributions of PCs are affected by the correction.
The same as the preparatory analysis, we first show the eigenspectra of the Doppler-corrected spectral map in Figure 11.Compared with Figure 5, the meaning of the shape of each eigenspectrum becomes clearer.We stress that since we have already subtracted the global shift of the whole spectra, it does not represent the systemic velocity of the position anymore.Thus, the velocity structure is always the first-order deviation from the systemic rotation pattern, and appears as a broadening or asymmetric distortion of the line profiles.Another important difference is that an important spectral line is identified in PC5 (see PC5 in the bottom panel in Figure 11), which was not specified in the original data.This is because the dominant effect of the systemic rotation was successfully removed from the map, and more subtle features appear more prominent in the Doppler-corrected map.Now very clearly PC2 represents the wing of the HCN line and some other lines, indicating that it reflects a very local expansion of the gas.PC3 and PC4 describe the blueward distortion of the lines, caused by a gas motion toward us faster than the expansion velocity.Though PC5 is less prominent, but similarly to PC3 and PC4, it represents the redward distortion of the lines.To examine the effect of the choice of ω, we also present the eigenspectra with ω = 0.25 and ω = 0.75 in Figure 12.
We present the distribution of the contribution of PCs in Figure 13.Now the contribution of new PC2 is much smaller than old PC2 in Figure 6.This is also because of the successful removal of the systemic rotation.Instead, we have a smaller but nonnegligible contribution of a new PC2.Now PC2 would describe much more subtle and local features of the spectral line map of NGC 253, which is consistent with the implications from the eigenspectra.This is more drastically represented on the scatter plot of PCs.We demonstrate the relations between PCs in Figures 14-16.Figure 14 shows the two-dimensional distribution of PC1 and PC2.The butterfly-like pattern seen in Figure 7 completely disappeared.Figures 15 and 16 are the relations between PC1 and PC3, and PC2 and PC3, respectively.Again we do not find any symmetric pattern in Figures 15 and 16.From these figures, we do not have an obvious symmetric pattern in the intrinsic spectral features in the map.Both in 2D and 3D, we observe a continuous distribution among PCs.

Map of PCs of the NGC 253 Spectra
We reconstructed the two-dimensional structure to map PC1, PC2, and PC3 in Figure 17.Again, PC1 represents the intensity of lines.In contrast to Figure 8, new PC2 shows a more complicated, localized pattern in NGC 253.The map of PC3 also shows smaller-scale structures.Recalling that PC2 suggests the existence of small-scale outflow.According to A17, the spatial scale (diameter) of the most prominent    region with large PC2 is ∼10 pc, namely, it corresponds to a small-scale phenomenon, indeed.
Particularly interesting is the region with negative PC3 in the bottom panel of Figure 17.The eigenspectra show that PC3 and partially PC4 describe a local but larger-scale flow toward us, which appears in the blueshift of the lines.Bolatto et al. (2013) reported a global molecular outflow from the center of the radio continuum.More recently, Walter et al. (2017) and Krieger et al. (2019) performed detailed ALMA observations of this region.They made a detailed analysis and named this outflow the SW streamer.This outflow is going out toward us, and the position of the root of the SW streamer precisely agrees with the region with negative PC3 in Figure 17 (Krieger et al. 2019, see their Figure 1).The anomalous region specified by the peculiarity in velocity in Figure 18 also agrees well with the root of the SW streamer.In addition to the molecular observations, an outflow is also observed in Hα and other optical emission lines by Fabry-Perot spectroscopy (Matsubayashi et al. 2009).Though this delineates the flow of ionized gas, the position, shape, and direction of the outflow agree fairly well with the molecular outflow.All these clues indicate that high-dimensional PCA efficiently extracts an outflow phenomenon from an ALMA spectroscopic map, in a purely objective way.

Spectral Features of the Doppler-corrected Data
Corresponding to the PCs In Figure 19, stars and triangles represent PC1-and PC2related responsible spectral features similarly to Figure 10, and filled circles are PC3-related.First, we observe a global shift of lines compared to Figure 10.This is, of course, due to the recession velocity.Since the boundary between ALMA bands is fixed at the observer's frame, some parts of the spectra go out of the boundary after the Doppler shift correction.As a result, we lose significant parts of the spectra in Figure 19.After the Doppler correction, PC1 describes the intensity of HCN(4-3) and HNC(4-3) lines more specifically, which is reflected in the fact that responsible features concentrate on the peak of these lines.PC2 is assigned to the part of the profile slightly far from the line center for HCN(4-3).However, it is related to the lower-frequency side of the HNC(4-3) line, though it is difficult to see clearly because this side is cut out by the correction.PC3 further delineates the side of the higherfrequency side of the HCN(4-3) line, while it has some overlap with PC2-related features on the lower-frequency side.These properties were already suggested by eigenspectra (Figure 11), and we confirm them by the detailed information from A-SPCA.Thus, even if the spectra look formidably complicated, high-dimensional PCA methods like NRPCA and A-SPCA can disentangle the information and pick up the controlling spectral features from the spectroscopic map.Rather surprisingly, the detailed profile of HCN and HNC lines govern almost all spectral characteristics of the ALMA map of NGC 253, including the outflow phenomenon.Further investigation on detailed astrophysical processes from the ALMA map of NGC 253 is left to our future work.

Conclusion
The evolution of galaxies is mainly driven by star formation, a transition from ISM to stars.Various phases of the ISM are related, and the evolution of the ISM is key to completing the understanding of the galaxy's evolution.Spectroscopic observations are of vital importance to extract and interpret the information of matter in galaxies.
Though spectroscopic mapping and integral field spectroscopy are fundamentally important to reveal ISM physics, detailed observations are often very time-consuming, and it is not easy to map objects to obtain many independently sampled measurements.Consequently, we often find a situation where the dimension of data d is much larger than the sample size n, i.e., n = d.This is referred to as HDLSS.The HDLSS data are different from typical big data in astronomy (usually n is huge), but d is extremely large.In traditional studies in astronomy, the HDLSS problem has been regarded as being ill-posed and simply given up for analysis.However, at the end of the last millennium, a strong need for an analysis method for such data arose (e.g., Golub et al. 1999).Then, during the subsequent decade, a new statistical method to tackle HDLSS problems was significantly developed, and is known as the highdimensional statistical analysis (e.g., Aoshima & Yata 2017;Aoshima et al. 2018).This paper is organized twofold.First, we introduced the general framework of high-dimensional statistical methods, and then we applied the methodology to the ALMA spectroscopic map data.
For the HDLSS data, the full information on the sample covariance is expressed by a huge d × d covariance matrix, which is implausible to handle in various aspects.This specific difficulty of HDLSS data is often referred to as the curse of dimensionality.High-dimensional statistical analysis overcomes this problem by introducing the dual representation of the sample covariance matrix, which is a much smaller, n × n dimension square matrix.The remarkable property of the dual matrix is that it contains practically the same information on the first n eigenvalues of the original sample covariance matrix.Nevertheless, the dual sample covariance matrix is small enough to handle without the curse of dimensionality.In addition to this most difficult issue, many unusual behaviors are known for HDLSS data, such as strong inconsistency, spherical concentration for Gaussian variables, and axis concentration for non-Gaussian variables.High-dimensional statistics is the framework to make use of these specific aspects of HDLSS data and construct a high-dimensional counterpart of traditional methods.
After introducing the concept of HDLSS data and the corresponding methodology, we applied the high-dimensional version of PCA on the NGC253 ALMA spectral map (A17).ALMA mapping data are typically HDLSS in general, and as for the data in this work, n = 231 and d = 1971, clearly n = d even though it is less extreme than the case in genomics, for example.The ALMA spectra are very rich in various molecular lines, and there is a very large variety in the molecular line intensities (A17).
We first applied NRPCA and A-SPCA, one of the latest methodologies in high-dimensional statistical analysis, to the original ALMA map of NGC 253.We found that highdimensional PCA worked very well, and it could describe the global trend of the spectra only by the first two PCs (the contribution of the first and second PCs is >50%).PC1 approximately represents the total intensity of the spectra, and PC2 describes the Doppler shifts due to the systemic rotation of the observed region.Each PC consists of ∼5-20 elements, much fewer than d.We note that practically the number of the controlling factors reduces further since these contributing elements are part of the same spectral features.The controlling features were the HCN(4-3) and HNC(4-3) rotational lines.This result strongly guarantees that the high-dimensional statistical analysis methodology works appropriately for the spectral map as HDLSS data.
Then, we applied high-dimensional PCA to the Dopplercorrected ALMA map.The resulting eigenspectra of the NRPCA correspond to more specific spectral features, especially to the core or wings of the emission lines.While PC1 again represents the total intensity of the spectral map, PC2 appears to delineate a 10 pc scale outflow phenomenon of molecular gas.More impressive is that PC3 and PC4 represent a larger-scale coherent flow causing a blueshift of emission lines, and PC5 the redshift.The region of the ALMA map where PC3 is negative concentrates around the position of the  radio continuum of NGC 253.This map then implies that the blueshift reflects the global larger-scale outflow from the radio center of the galaxy moving toward us.To examine this, we compared the region with the observed molecular outflow reported by Krieger et al. (2019).We find that the negative PC3 region agrees well with the root of the molecular outflow, named the SW streamer.This was also confirmed by the anomalous region of the velocity field in the Doppleruncorrected map.Further, this region also gives a close agreement with the outflow of ionized gas (Matsubayashi et al. 2009).All these suggest that the PCA indeed extracted the dynamical features from the ALMA spectroscopic map, typical HDLSS data.
We stress that this result is not obtained by choosing a handful of features by hand, but by making use of the full information on the high-dimensional data.Spectroscopic maps are typical HDLSS data, and we can apply highdimensional statistical analysis to various unsolved problems in astrophysics.As mentioned in Section 1, the condition also applies to spectroscopic mapping with satellite instruments, mainly due to the strong limitation of observation time and mission lifetime.The high-dimensional methodology will shed light on the spectral map of future space telescopes.For example, such surveys of nearby galaxies with SPICA have been discussed in detail.Though SPICA has been canceled, this methodology will remain useful for a future and/or possible replacement project.High-dimensional statistical analysis can also be applied to any type of data with a small sample size and large dimension, e.g., spectroscopy of rare objects.that is, the surface concentration of the unit n-sphere.From the geometric representation by Equation (B4), we observe that the If we generate a sample n = 3 from this distribution, we obtain a sample distribution as displayed in Figure 20.For a low d, we have a sample distribution not very far from our intuition, i.e., the sample data concentrate on the origin and have a certain dispersion around it.However, for a higher d, the sample data concentrate on a sphere with a radius of ( ) d n 1 2 (left panel of Figure 20).This clearly shows the situation of high-dimensional data, namely, the intrinsic feature of the data is completely overwhelmed by a tremendous disturbance from the huge noise sphere.
The behavior of the sample is even more counterintuitive for a non-Gaussian case.If we generate a sample from a ddimensional t-distribution t d (0, I d , ν), with a mean of μ = 0, a covariance matrix I d, and degrees of freedom ν = 5, we observe a very particular behavior (right panel of Figure 20).All the data concentrate on one of the axes for the high-d cases.This is referred to as the axis concentration, typically seen for a non-Gaussian density distribution in high dimensions.
Thus, we understood the mathematical reason why we have the counterintuitive behaviors of the HDLSS data in Figure 20.This also gives the important basis of the central method we apply in this work.

Figure 1 .
Figure 1.Schematic description of the relation between a sample covariance matrix S and its dual matrix SD .The top panel describes the sample covariance matrix, and the bottom panel describes the dual matrix, respectively.
λ d , and m is an unknown positive fixed integer.19 m stands for the number of spiked eigenvalues in the sense of quadratic contribution ratio.The details of the spiked model including the generalized spiked model are considered and given inYata & Aoshima (2013).The eigenvalues λ i (i m) describe the intrinsic information of the data, and λ i (i m + 1) represents the noise.Note that ., the noise satisfies the sphericity condition (Equation (B1)).We set the sample covariance matrix = -  ˜˜S n XX 1 as above.Here, we define n(d) as the size of a sample depending on d.2.2.1.Limitation of Traditional PCAYata & Aoshima (2009) investigated the properties of traditional PCA when it is applied to HDLSS data, and demonstrated the condition so that the estimation by traditional PCA would have the consistency in the form of n(d) = d γ 20 for a positive constant γ and the limitation of traditional PCA.They have given the following two theorems on the eigenvalues, depending on whether the condition would satisfy Theorem 2.3 YA12 III.Assume l ( )  i m i has multiplicity one.Under conditions in Theorem 2.2, and the components of Z satisfy Condition (16), it holds that

Figure 2 .
Figure 2. A schematic description of high-dimensional PCA.Even if we have significant important features because of the existence of a huge noise sphere, they are completely embedded in the observed data.High-dimensional PCA can subtract the noise sphere and enable the extraction of the desired features.
They obtained the 340-365 GHz (λ ; 0.85 mm) spectra covering a total frequency range of 11 GHz, compiling the results of two projects in the ALMA Cycle 2 observations: 2013.1.00735.S (PI: Nakanishi) and 2013.1.00099.S (PI: Mangum).The field of view of ALMA at these wavelengths is 16″-17″.The observations cover the frequency range of 340.2-343.4GHz and 352.5-355.7 GHz for 2013.1.00735.S, and 350.6-352.4GHz and 362.2-365.2GHz are covered by 2013.1.00099.S.The uncertainty of the flux calibration in ALMA Band 7 observations is ∼10% according to the ALMA Cycle 2 Proposers Guide.Data reduction is conducted with Common Astronomy Software Applications (CASA; McMullin et al. 2007), versions 4.2.1 and 4.2.2.The image of the central region of NGC 253 was created by using the task clean.The synthesized beam size of the final images is 0″45 × 0″3, which corresponds to 8 × 5 pc at the distance to NGC 253.With a velocity resolution of 5.0 km s −1 , the rms noise levels are 1.0-2.4mJy beam −1 .Various emission lines are detected in the spectra (see Figure2 and

Figure 3 .
Figure 3.The bright regions of NGC 253 map cut out by the mask.Left: the mask region map.White regions have significant intensity signals.Center: the cutout region with significantly bright emission.Right: the bird's view of the signal.

Figure 5 .
Figure 5.The eigenspectra corresponding to the first through the fifth PCs constructed from the original ALMA spectral map of NGC 253.Upper five rows: eigenspectra obtained by NRPCA.Lower five rows: same as upper rows, but obtained by A-SPCA.Only the controlling features have nonzero values.

5. 1 .
Application of High-dimensional PCA to the Doppler Shift-corrected Map of NGC 253

Figure 6 .
Figure 6.Normalized eigenvalues (contributions) obtained from the analysis of the ALMA spectroscopic map of NGC 253 by NRPCA.

Figure 8 .
Figure 8.The two-dimensional structure to map PC1 and PC2 obtained by A-SPCA.The left panel shows the map of PC1, and the right panel describes the map of PC2.

Figure 9 .
Figure 9.The scatter plot between PCs and observed physical properties.Left: the correlation between PC1 and the integrated intensity over the whole frequency range I [Jy beam −1 ] on the ALMA map of NGC 253.Right: the correlation between PC2 and the peculiar velocity on the ALMA map.

Figure 7 .
Figure 7.The distribution of PC1 and PC2 of the ALMA map of NGC 253 by A-SPCA.

Figure 10 .
Figure 10.Responsible features to characterize PCs from A-SPCA for the ALMA map of NGC 253.Stars represent the features responsible for PC1, and triangles are for PC2.

Figure 11 .
Figure 11.The eigenspectra corresponding to the first through the fifth PCs constructed from the Doppler-corrected ALMA spectral map of NGC 253.The upper five rows present the eigenspectra obtained by NRPCA, and the lower five rows show that obtained by A-SPCA.

Figure 12 .
Figure 12.The eigenspectra corresponding to the first through the fifth PCs constructed from the Doppler-corrected ALMA spectral map of NGC 253.The upper five rows present the eigenspectra obtained by A-SPCA with 0.25, and the lower five rows show that obtained by A-SPCA with 0.75.

Figure 13 .
Figure 13.Normalized eigenvalues (contributions) obtained from the analysis of the ALMA spectroscopic map of NGC 253 by NRPCA, after correcting the effect of the Doppler shifts of the systemic rotation.

Figure 14 .
Figure14.The two-dimensional structures of the distribution of PC1 and PC2 of the NGC 253 obtained by A-SPCA, after correcting the effect of the Doppler shifts of the systemic Bottom: same as the top panel, but with PC1 and PC3.Note that the PC2 in this figure is different from that in Figure8.

Figure 18 .
Figure 18.The region with velocity peculiarity in Figure 9.

Figure 19 .where
Figure 19.Responsible features to characterize PCs from the A-SPCA for the ALMA map of NGC 253, after the Doppler shift correction due to the systemic rotation.Stars and triangles represent PC1-and PC2-related responsible spectral features similar to those in Figure 10, while filled circles are PC3 related.

Figure 20 .
Figure 20.A sample distribution drawn from two typical probability density distributions.Left: a sample generated from a d-dimensional Gaussian distribution with a mean of μ = 0 and a covariance matrix S = ˜Ĩ d d ( Ĩd : a d-dimensional identity matrix).Right: a sample generated from a d-dimensional t-distribution with t d (0, I d , ν), with a mean of μ = 0, a covariance matrix I d , and a degrees of freedom ν = 5.
Table 1 of A17).The number of spectroscopic resolution unit (sampling number) is 2248.