Simulating The Impact of High-Speed Train Line System Development on Urban Growth by Using Cellular Automata: Study Case Japan

A major question in urban planning is how urban areas expand. The development of transportation infrastructure such as a high-speed railway system is likely to have considerable impacts on urban growth, especially in areas that have access to it. Understanding how this development affects urban growth is a considerable advantage in formulating spatial plans and tackling potential problems. This research attempts to understand how urban areas expand and to analyse the relationship between that expansion and the development of high-speed railway systems, using an Artificial Neural Network Cellular Automata (ANN-CA) approach through the Modules for Land Use Change Evaluation (MOLUSCE) plugin in QGIS. The approach helped us to understand the urban expansion pattern and to analyse the key component that links the expansion with the infrastructure: accessibility. This research also demonstrates that ANN-CA is a promising approach for researchers to understand and analyse spatial areas in the way intended: spatially.


Introduction
One of the major and most alluring topics in urban planning is understanding urban growth. Understanding the determinants of urban growth and the consequences of their interaction with other factors is important for managing cities and realizing the idea of planning them for the better. Many studies have argued that transport infrastructure, and the movement it enables, plays a structuring role in shaping cities over time and is a major component of urban growth [1] [2] [3]. It has also been suggested that infrastructure, urbanization, and travel behaviour are likely to be related [4]. Improvements to transportation infrastructure give better access and better opportunities for activity, and open new horizons for human activity in urban areas. Conversely, urban growth attracts more activities and generates demand, which makes the availability of movement infrastructure more important.
Urban growth brings not only possibilities but also problems when it is not responded to properly. These problems may take many forms, such as congestion, overpopulation, and urban sprawl [5]. Countries where development and urbanization are rapid will face difficulties in coping with the demand of managing this growth. Japan, as one of the countries considered to have rapid urban growth, is likely to face this problem in the future. The introduction of new railway systems (high-speed trains and light rail) has opened up and advanced mobility, and has further affected their direct surroundings [6]. The railway system has been one of the backbones of transportation infrastructure in Japan for more than two decades. As Japan was one of the first countries to introduce a high-speed railway system, technological development is pursued further to make mobility more efficient. However, the need to understand its impact on the surroundings is unfortunately often neglected.
Japan introduced its high-speed railway system in 1964 and has operated it since, pursuing its advancement as one of its priorities in planning urban development. Recently, the Japanese government has planned to develop more connecting lines using these railway systems. The development of this railway system will inevitably exert substantial impacts on surrounding urban land use, both spatially and quantitatively [7] [8]. For example, a new railway system requires new stations, which in turn stimulate growth in activities in the areas where the stations are built. This growth can be induced by many factors, such as growing accessibility to other activities (giving access to broader working opportunities), the attraction of new investment to the area, and other possibilities. The new development will also induce a new urban corridor, defined as a connection of major urbanized areas linked by the developed transportation network [9].
Understanding how the development of transportation infrastructure induces mobility in urban areas highlights the importance of predicting the consequent spatio-temporal dynamics of land use when new connectivity is introduced. However, previous studies on the impacts of developing new railway lines or systems mainly focus on the impacts of the investment on land and property values [8] [10]. Other studies have analysed the impacts on the labour market, crime rate, and public discourse [11] [12]. Regrettably, less effort has been made to simulate or predict the spatial impacts, typically on urban growth and land use changes associated with the development of railway systems. On this premise, it is necessary to attempt to simulate and predict the association between urban growth and railway system connectivity over time. This attempt uses the cellular automata (CA) modelling approach, which has grown in popularity in the last few years for predicting urban land use and land cover changes over the years. We hope this attempt will lead to discussion that assists in understanding the urban growth pattern and its association with railway transportation system connectivity.

Data and Methodology
In this study, the objective is to understand the relationship between the high-speed railway system and urban growth, and the impact of the former on the latter, by observing changes in urban land cover in a study area in Japan; specifically, the study area is located in Nagano, Yamanashi, and Shizuoka prefectures. These study areas were selected based on:
a) Urbanization status: these prefectures belong to the medium-level urban category, in which urban growth can be measured more distinctly than in areas considered metropolitan or town areas.
b) Exposure to the high-speed railway network: each of the chosen prefectures is crossed either by the existing high-speed railway network or by the planned network that will be developed in the near future.
The existing station that supports the high-speed railway system (one of the major stations) is located in Shizuoka prefecture, while the planned future station will be located in the Nagano-Yamanashi area. This station will be one of the major variables for comparing urban growth between the area served by the high-speed network and the area not served by it. The data used are periodical, consisting of land cover for the study area from 1976 to 2016 with a time span of 10 years for each period.

Assessing the Impacts of High-Speed Railway Systems on surrounding Urban Land Use
Observing changes in land use or land cover images is helpful in monitoring the changes that happen in a specific area over several time periods. This study uses land cover data for the years 1987, 2006, and 2016. The land cover data were gathered from the Ministry of Land, Infrastructure, Transport and Tourism (MLIT) of Japan. The data currently available are mesh land cover data with a maximum spatial resolution of 100 m, and need to be adjusted accordingly to serve as the base for this research. For this study, this resolution is deemed sufficient to observe land cover changes, as the study area is quite large. To assess the impacts of the high-speed railway system on surrounding urban land use, we observe a buffer of 15 km in radius around the station served by the high-speed railway system (hereafter called the Shinkansen line). By focusing on a specific area and comparing the area within the radius of a Shinkansen-served station (Shizuoka) with one without (Yamanashi), we expect to observe the difference in urban growth speed with the same spatial variables considered.
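As a rough illustration of the station buffer described above, the sketch below marks the raster cells whose centres lie within a given radius of a station cell. The grid size, cell size, radius, and station position are hypothetical values chosen for readability, and the function name `buffer_mask` is not part of any GIS package; in practice the buffer would be generated with the GIS software itself.

```python
import math

def buffer_mask(n_rows, n_cols, station_rc, radius_m, cell_size_m):
    """Return a 2D list of booleans: True where the cell centre lies
    within radius_m of the station cell centre (planar distance)."""
    s_row, s_col = station_rc
    mask = []
    for r in range(n_rows):
        row = []
        for c in range(n_cols):
            dist = math.hypot((r - s_row) * cell_size_m,
                              (c - s_col) * cell_size_m)
            row.append(dist <= radius_m)
        mask.append(row)
    return mask

# Hypothetical 7x7 grid of 100 m cells with a 250 m buffer (illustration only;
# the study itself uses a 15 km radius around Shizuoka and Kofu stations).
mask = buffer_mask(7, 7, station_rc=(3, 3), radius_m=250, cell_size_m=100)
cells_inside = sum(sum(row) for row in mask)
```

Only the cells flagged `True` would then enter the land use comparison between the two station areas.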
The centre of the buffer is Shizuoka station for the area in Fig. 2 and Kofu station for the area in Fig. 3. A land use transition matrix is used to detect the changes inside each buffer area, which are then compared with each other. The change trend of each land use type over different time periods is observed using the following equation:

$$K = \frac{U_{t+1} - U_t}{U_t} \times \frac{1}{T} \times 100\% \quad (1)$$

where K denotes the land use/cover change rate during a certain period, $U_{t+1}$ and $U_t$ denote the area of one land use type at time periods t+1 and t respectively, and T denotes the length of the period in years.
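The transition matrix and the change rate K can be sketched in a few lines. This is a minimal illustration, not the MOLUSCE implementation: the class labels and the two small maps are hypothetical, and `change_rate` assumes the commonly used form in which the relative change in area is divided by the period length T and expressed as a percentage.

```python
from collections import Counter

def transition_matrix(before, after, classes):
    """Count cells moving from each class (rows) to each class (columns)."""
    counts = Counter(zip(before, after))
    return [[counts[(a, b)] for b in classes] for a in classes]

def change_rate(area_start, area_end, years):
    """Annual change rate K in percent (assumed form):
    (area_end - area_start) / area_start / years * 100."""
    return (area_end - area_start) / area_start / years * 100.0

# Hypothetical flattened land cover maps inside one buffer (illustration only).
classes = ["agri", "urban"]
map_t0 = ["agri", "agri", "agri", "urban", "urban", "agri"]
map_t1 = ["agri", "urban", "urban", "urban", "urban", "agri"]

matrix = transition_matrix(map_t0, map_t1, classes)  # rows: from, columns: to
urban_t0 = map_t0.count("urban")
urban_t1 = map_t1.count("urban")
k_urban = change_rate(urban_t0, urban_t1, years=10)
```

In this toy case two agricultural cells convert to urban, so the urban area doubles over ten years and K for the urban class is 10% per year.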

Simulation of Land Use Changes utilizing Cellular Automata Approach
Simulation and prediction of urban land use growth have commonly relied on statistical approaches. However, predicting urban growth spatially remains a tough task. To deal with this problem, some studies observe the change in urban land cover spatially by training an artificial neural network cellular automaton (ANN-CA), which can be described as an iterative integration of an artificial neural network and cellular automata [5]. ANN-CA is a useful tool for modelling and predicting urban growth in terms of land use/cover changes [5]. In each iteration, it simulates the spatial dynamics of land use based on the conversion probability generated by the network. To better understand this approach, we first need to understand the cellular automata model. A cellular automaton (CA) is a discrete dynamic system in which space is divided into regular cells and time proceeds in discrete stages [13]. Every cell is assigned one state, which is updated following local rules given the time, its own circumstances, and the state of its neighbours at that time [13]. Refer to Fig. 4 and Fig. 5.
Cellular automata work by learning the changes in cell state between time t = n and t = n+1, which describe not only the shift in cell states but also the rules introduced in the model. As a simplified example, if the cell at time period n is one pixel or dot, and the cell at time period n+1 is two dots extending to the left side and lower side of the initial dot, then the state at time period n+2 will be three dots on the left, lower left, and lower sides of the initial dot. The process is illustrated in Fig. 6.
The example depicted in Fig. 6 is simplified: no transition rules are introduced and only one variable or cell state is considered at a time. To be precise, a cellular automata process needs at least five components:
a) Cells: each cell is a pixel in a rasterized image and is assigned one state.
b) States: the state is the category that a cell belongs to. For example, in land use or land cover, the state can be a land cover category such as agriculture, wasteland, urban area, and so on.
c) Neighbourhood: the neighbourhood is the set of states of the cells adjacent to a given cell. For example, a cell in state a may change to another state more easily if its neighbouring cells are in state b than if they are in state c.
d) Transition rules: the transition rules are the basis of calculation in the model. The rules can be as simple as the neighbourhood example or can introduce another spatial variable. For example, in Fig. 5, the change can differ for each cell depending on whether there is a line, the size of the neighbouring area, and the distance to the spatial variable.
e) Time step: the time step is the parameter to which the changes are bound. For the model to learn the changes, it needs to be given a change at t+1 so that it can determine the direction in which the change is going.
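The five components can be made concrete with a toy automaton. The sketch below is not the rule set used in this study; it assumes two states (0 = non-urban, 1 = urban), a 3x3 Moore neighbourhood, and a hypothetical transition rule in which a non-urban cell becomes urban once at least two of its neighbours are urban.

```python
def count_urban_neighbours(grid, r, c):
    """Number of urban (1) cells in the 3x3 Moore neighbourhood of (r, c)."""
    total = 0
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            if dr == 0 and dc == 0:
                continue  # skip the cell itself
            rr, cc = r + dr, c + dc
            if 0 <= rr < len(grid) and 0 <= cc < len(grid[0]):
                total += grid[rr][cc]
    return total

def ca_step(grid, min_neighbours=2):
    """One time step: a non-urban cell (0) becomes urban (1) when at least
    min_neighbours neighbours are urban. Urban cells stay urban."""
    new_grid = []
    for r, row in enumerate(grid):
        new_row = []
        for c, state in enumerate(row):
            grows = count_urban_neighbours(grid, r, c) >= min_neighbours
            new_row.append(1 if state == 1 or grows else 0)
        new_grid.append(new_row)
    return new_grid

# Cells and states at time t (hypothetical 4x4 raster).
grid_t0 = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]
grid_t1 = ca_step(grid_t0)  # states at time t+1
```

All cells are updated from the same snapshot of the previous step, which is what makes the update synchronous. In the ANN-CA approach the hand-written rule is replaced by conversion probabilities learned from observed land cover maps.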
Given these requirements, cellular automata can be used to understand, learn, and predict future changes in an image. By converting and assessing land use or land cover according to these requirements, CA becomes a useful and increasingly relied-upon way to predict future urban changes.
The prediction using the cellular automata (CA) approach is done with the MOLUSCE (Modules for Land Use Change Evaluation) plugin in QGIS, which makes it possible to analyse land use change and to predict future urban land use growth by implementing an artificial neural network (ANN). In general, the ANN approach works by learning the conversion probability from the input neurons (cells) to the output neurons. The simulation model outputs a conversion probability for each cell, which is then used to identify the change of state for each cell in the predicted simulation map.

Results and Discussion
The analysis consists of two steps. The first step focuses on understanding the impact of the Shinkansen railway by comparing the speed of urban growth in two areas: Shizuoka city and Kofu city. The second step attempts to simulate the growth with the high-speed railway system (Shinkansen line) introduced as a spatial variable.

Assessment of the Impacts of High-speed Railway System (Shinkansen)
The analysis began by translating the urban land use and several spatial variables into raster images so that they could be used as input to the cellular automata model. The required data fall into two categories: spatial base data as input data, and spatial variable data. The spatial base data are the land use data whose changes we want to understand and predict, and the spatial variable data are the introduced rules needed to explain the changes between states. Before moving to the analysis result, it is important to understand the difference between the two data categories.

Spatial Input (Land Use).
Spatial input is the land use or land cover data that serves as the base for observation. In this study, the canvas uses the land cover in mesh areas 5238 and 5338 according to the mesh coding provided by MLIT Japan, which cover parts of Yamanashi and Shizuoka prefectures. The available data range from 1976 to 2016 with varying gaps between years. For ease of understanding and to meet the requirements of the cellular automata approach, the data need to consist of at least three time frames with similar intervals. Therefore, the data used in this analysis are the spatial maps for 1997, 2006, and 2016, at intervals of roughly 10 years. With these data, we aim to predict the land use for 2026, the year in which the new Shinkansen line is initially expected to operate. The spatial data canvas used can be seen in Fig. 6.
The spatial input needs to be in the form of raster image data to be usable in MOLUSCE. The land cover classification was also translated into five categories for smoother simulation:

a) Agricultural Land (Red)
b) Forest and Wasteland (Orange)
c) Building Urban Area (Yellow)
d) Open and Green Area (Green)
e) Water body (Blue)
These categories also act as the states assigned to the cells. For this iteration, the cell size is 0.00625 degrees, or roughly 50x50 metres per pixel. The cell size contributes to the variability of the prediction of future changes: the smaller the cell (pixel), the better. However, because the initial data are meshes with 100x100 m information, the detail used within this range is still feasible for predicting a detailed result. This is shown in the second step.

Spatial Variable.
Spatial variables serve as rules for predicting the growth. A spatial variable needs to have the same geographical reference, extent, and cell size as the input to be processed. Spatial variables are the variables that we expect to explain the changing phenomenon spatially. This does not mean that more spatial variables are always better, because a variable is useful only when its correlation gives a direction that can be used to predict future changes. For this research, the spatial variables initially used are listed in Table 1 and shown in Fig. 7. These spatial variables were all used initially for detecting the changes. However, in the second step, for prediction, some were eliminated to obtain a better simulation.
In assessing the impacts of the high-speed train, we need to narrow the scope of the analysis to a specific part of the area. As the variable of interest is the high-speed railway system, or Shinkansen railway, in Japan, we focused on one area with a station that acts as a Shinkansen stop and one with a planned station. We therefore analysed the areas of Shizuoka city and Kofu city by generating a buffer of around 15 km in radius centred on the main station. The changes within that radius are then observed and compared to assess the impact of the high-speed railway system on urban growth.
The results presented in Table 2 and Table 3 show the percentage of each land use for each period; the increases and decreases in percentage are depicted more clearly in Fig. 8 and Fig. 9.
Based on the results shown in the tables and figures generated from the MOLUSCE modules, there are significant changes in land cover, indicated especially by the urban area percentage. Let us focus first on the changes in urban land cover. The initial urban land cover percentages in the Shizuoka and Yamanashi areas are 9.949% and 8.165% respectively. The final urban percentages are 15.087% and 15.312% respectively, giving differences of 5.138% and 7.079% and average differences between consecutive observation years of 1.713% and 2.360% for each area. Observing the changes, urban growth in the Shizuoka area increases more steadily between periods, while the Yamanashi area shows a sharp increase in the 2016-2026 period. To analyse this phenomenon further, we compare the changes using the formula introduced earlier (K) to understand the urban growth change as a whole. To observe the magnitude of the land use change rate, the percentage of change is taken as an absolute, non-negative value, because we intend to observe the magnitude of the changes.
The K values for the different periods depicted in Fig. 10 show that the Shizuoka area has steady upward changes in each period, meaning that the changes in land use area happen constantly in each 10-year period. Meanwhile, the K value in the Yamanashi area rises sharply in the first period and stabilizes slightly in the second period. This suggests that urban growth in Shizuoka might already have become stagnant in the first period, with development having reached its peak, whereas in the Yamanashi area the changes might have only just begun in the same period. The results show that urban growth in the Yamanashi area has slightly more variance than in the Shizuoka area. However, the first Shinkansen line crossing Shizuoka began operation in 1964, so it is possible that the rapid changes in urban land cover happened long before the period covered by the data in this analysis. Nevertheless, referring to Fig. 8 and Fig. 9 in turn, specifically to urban and agricultural land use, we can observe that the Shizuoka area reached its peak urbanization period in 2016, as shown by the intersection of the two lines, while the Yamanashi area will reach its peak in the 2026 period. This might suggest that urban growth proceeds through specific periods of urbanization. It also supports the hypothesis that growth in the Yamanashi area appears more significant because urban growth in Shizuoka had already peaked several periods earlier.

Simulation of Land Use Change (Predicting 2026 Land cover)
In the second step, we attempted to predict and simulate urban growth for both study areas; the output data were also used to generate the first-step analysis. Using the ANN-CA approach provided by the MOLUSCE modules in QGIS, we tried to predict the urban cover in 2026.
Having ascertained that there is significant urban growth in both study areas by observing land use in 1976, 2006, and 2016, we intended to predict these changes and plot them spatially. This approach is likely to be more helpful and effective for understanding the pattern of urban growth in the observed area. By understanding the growth pattern and simulating the changes, the output might be beneficial for consideration in urban planning and development.
Predicting the urban land cover for 2026 can be divided into five steps: evaluating the correlation of spatial variables; training the artificial neural network (ANN); simulating the first period image; validating the output; and predicting the future image. The detailed process is described in this section.

Evaluating Correlation.
A spatial variable used as a rule may or may not have a significant correlation with urban growth and/or with the other spatial variables. Therefore, before training the artificial neural network (ANN) used as the basis for predicting future urban land cover, we need to check the significance of each variable.
We began by inputting all of the variables into the modules and then checking their correlation with Pearson's correlation formula. Table 4 shows the R values used to judge the relevance of the spatial variables. As we wanted to observe the changes in land use, we needed to filter out the spatial variables without a significant R value (eliminating the variables with R values close to 0) so that the prediction could be observed.
We initially included all the variables in the formulation and found that filtering the spatial variables in the first step is of major importance.
First, including more variables with R values close to 0 leads to insignificant simulation results. This happens because the ANN treats such a variable, which is supposed to explain urban growth, as indicating that no changes happen between the time periods. Conversely, including more variables with high significance leads to more significant changes in the prediction, and to a higher kappa value later. The significance of the kappa value is introduced in the following steps.
Second, the sign of the value (negative or positive) does not specifically act as a buffer for prediction. For example, a negative correlation does not mean that the state of a cell is less likely to change to another state. To assign a specific buffer or mid-point for better prediction, it is recommended to use a fuzzy method to assign weights to each variable derived with the Euclidean Distance function.
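The screening by Pearson's R can be sketched as follows. This is an illustration rather than the plugin's own routine: the variable names, the sample values, and the cut-off of 0.3 are hypothetical, and the filter uses the absolute value of R so that only variables close to 0 are dropped.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical per-cell values: 1 = cell became urban, 0 = it did not.
became_urban = [1, 1, 1, 0, 0]
variables = {
    "dist_rail": [100, 200, 300, 400, 500],  # metres to nearest railway
    "noise": [3, 1, 3, 1, 3],                # a weakly related variable
}

min_abs_r = 0.3  # hypothetical cut-off for "close to 0"
r_values = {name: pearson_r(vals, became_urban) for name, vals in variables.items()}
kept = [name for name, r in r_values.items() if abs(r) >= min_abs_r]
```

Here `dist_rail` has a strong negative R and is kept, while `noise` has an R near 0 and is dropped.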
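One simple way to turn a distance raster into a fuzzy weight is a linear membership function. The sketch below is an assumption for illustration only, since the text does not specify which fuzzy function is intended; the name `fuzzy_near` and the 1000 m limit are hypothetical.

```python
def fuzzy_near(distance, max_distance):
    """Linear fuzzy membership for 'near': 1.0 at distance 0,
    falling to 0.0 at max_distance and beyond."""
    if distance <= 0:
        return 1.0
    if distance >= max_distance:
        return 0.0
    return 1.0 - distance / max_distance

# Hypothetical distances (metres) from cells to the nearest railway.
distances = [0, 250, 500, 2000]
weights = [fuzzy_near(d, max_distance=1000) for d in distances]
```

The resulting weights lie between 0 and 1, so variables measured in different units can be compared on the same scale.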
After observing the variance and correlation and considering the output for each combination of variables, the variables used in this research were filtered down to: DEM; distance to nearest railway; and distance to nearest water body. The filtering was done by analysing the validation percentage and the kappa score.
The ANN consists of three layers: an input layer, a hidden layer, and an output layer [5]. These three layers contain specific neurons that respectively represent different spatial variables related to urban land use dynamics. The spatial variables entered in the following step serve as the inputs to the first layer of the ANN to determine the direction of land use conversion. According to Lin [5], the input can be written as:

$$X(k,t) = [x_1(k,t), x_2(k,t), \ldots, x_n(k,t)]^T \quad (2)$$

where $x_i(k,t)$ is the ith attribute value for cell k at time t, and T denotes transposition. The signal received by the hidden layer can then be calculated with the following equation:

$$net_j(k,t) = \sum_i w_{i,j}\, x_i(k,t) \quad (3)$$

where $net_j(k,t)$ is the signal transmitted from cell k at time t to the hidden layer's jth neuron, and $w_{i,j}$ is the parameter between the input and hidden layers.
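The weighted sum received by the hidden layer can be traced with a small forward pass. This sketch is not the network trained in MOLUSCE: the weights and attribute values are hypothetical, and the sigmoid activation on the hidden and output layers is an assumption made here to keep the outputs between 0 and 1.

```python
import math

def sigmoid(z):
    """Logistic activation (assumed; squashes any value into (0, 1))."""
    return 1.0 / (1.0 + math.exp(-z))

def hidden_signals(x, w_in_hidden):
    """Signal to each hidden neuron j: sum over i of w[i][j] * x[i]."""
    n_hidden = len(w_in_hidden[0])
    return [sum(w_in_hidden[i][j] * x[i] for i in range(len(x)))
            for j in range(n_hidden)]

def conversion_scores(x, w_in_hidden, w_hidden_out):
    """Forward pass: inputs -> hidden activations -> one score per land use type."""
    hidden = [sigmoid(net) for net in hidden_signals(x, w_in_hidden)]
    n_out = len(w_hidden_out[0])
    return [sigmoid(sum(w_hidden_out[j][l] * hidden[j] for j in range(len(hidden))))
            for l in range(n_out)]

# Hypothetical normalised attributes of one cell:
# [elevation (DEM), distance to railway, distance to water body]
x_cell = [0.2, 0.1, 0.6]
w_in_hidden = [[0.5, -1.0], [-2.0, 0.3], [0.1, 0.4]]  # 3 inputs x 2 hidden neurons
w_hidden_out = [[1.2, -0.7], [-0.4, 0.9]]             # 2 hidden x 2 output types
scores = conversion_scores(x_cell, w_in_hidden, w_hidden_out)
```

Each entry of `scores` plays the role of a conversion probability for one candidate land use type; training adjusts the two weight matrices so that these scores match the observed changes.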
In general, only a small proportion of cells undergo changes in a given period. Because of that, the CA model involves many iterations to determine whether a cell should be converted, using a certain threshold. This step trains the neural network to decide whether each cell should be converted or stay as it is. The procedure is represented as follows:

$$S_k^{t+1} = \begin{cases} l, & p(k,t,l) > P_{threshold} \\ S_k^{t}, & \text{otherwise} \end{cases} \quad (4)$$

where $S_k^{t}$ and $S_k^{t+1}$ denote the land use type of cell k at time t and t+1, $p(k,t,l)$ is the conversion probability from the current land use type to type l for cell k at time t, and $P_{threshold}$ is the threshold value determined by the total number of urbanized cells derived from the last observed land cover images. For the detailed equations, please refer to [5].
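The threshold rule can be illustrated with a single-cell update. The function below is a simplified reading of the procedure, assuming that the candidate type is the one with the highest conversion probability; the probabilities and threshold values are hypothetical.

```python
def update_cell(current_state, probabilities, threshold):
    """Return the state of a cell at t+1.

    probabilities maps each land use type to its conversion probability.
    The cell converts to the most probable type only if that type differs
    from the current one and its probability exceeds the threshold."""
    best_state = max(probabilities, key=probabilities.get)
    if best_state != current_state and probabilities[best_state] > threshold:
        return best_state
    return current_state

# Hypothetical conversion probabilities for one agricultural cell.
probs = {"agri": 0.2, "urban": 0.8}
converted = update_cell("agri", probs, threshold=0.6)  # probability above threshold
unchanged = update_cell("agri", probs, threshold=0.9)  # probability below threshold
```

Raising the threshold lets fewer cells convert per iteration, which is how the number of converted cells can be tied to the amount of urbanization observed in the reference maps.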
To interpret the neural network learning curve, three behaviours need to be distinguished: underfit, overfit, and good fit. Underfit is indicated by flat lines with a big gap between the training and validation lines, meaning the model is unable to learn the training dataset. Overfit is indicated by lines that come closer to each other but show big spikes at some points, meaning the model has learned too much from the training information. A good fit is where the training and validation lines move together with a stable gap between them. By this description, the neural network can be considered a good fit, with an overall validation error of 0.025. The overall accuracy is the percentage of correctly classified instances out of all instances. For the validation kappa, the higher the coefficient, the more accurate the classification and the more acceptable the model. As the kappa value, learning curve, and overall accuracy are all within acceptable ranges, we move forward with this result. As observed in the validation graph, with the spatial variables stated, the model predicts land use with an overall kappa of 0.76334, which is in the acceptable range (values above 0.7 are considered good), and 85.516% agreement with the actual data. This result gives confidence in using the prediction model to simulate the land use of the study area in 2026.
The simulated 2026 land use image obtained by the model shows significant growth in urban area (yellow) in both study areas. However, the growth is more distinct in the Yamanashi area, where there are clear shifts from agricultural to urban land use (red to yellow), even though the amount of urban area is still higher in the Shizuoka area.
This simulation on the spatial canvas gives a better understanding of how urban growth happens. A comparison of the two areas suggests that the Shizuoka area may only grow upward because of the sea, whereas growth in the Yamanashi area spreads out in more directions. This might also explain why the rate of growth in the Yamanashi area appears higher than in the Shizuoka area.
Furthermore, we can discuss the correlation values of the spatial variables used in the model. As shown in Table 4, the distance to railway variable has a slightly higher correlation than the other variables. This might indicate that the spatial variables explaining urban growth are most closely related to accessibility to the railway, regardless of the status of the railway. However, as the scope of the study covers only a focused area rather than a broad one (for example from Tokyo to Nagoya along the Shinkansen line), further analysis is needed to better check the impact of the high-speed railway network variable.
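Overall accuracy and kappa can both be computed from a reference map and a simulated map. The sketch below uses the standard Cohen's kappa on hypothetical class codes; whether this matches the exact kappa variant reported by the plugin is an assumption.

```python
from collections import Counter

def overall_accuracy(reference, simulated):
    """Share of cells whose simulated class equals the reference class."""
    matches = sum(1 for r, s in zip(reference, simulated) if r == s)
    return matches / len(reference)

def cohen_kappa(reference, simulated):
    """Cohen's kappa: agreement corrected for agreement expected by chance."""
    n = len(reference)
    p_observed = overall_accuracy(reference, simulated)
    ref_counts = Counter(reference)
    sim_counts = Counter(simulated)
    p_expected = sum(ref_counts[c] * sim_counts[c] for c in ref_counts) / (n * n)
    if p_expected == 1.0:
        return 1.0  # both maps contain a single identical class
    return (p_observed - p_expected) / (1.0 - p_expected)

# Hypothetical flattened maps (1 = urban, 0 = non-urban).
reference_map = [1, 1, 0, 0]
simulated_map = [1, 0, 0, 0]
accuracy = overall_accuracy(reference_map, simulated_map)
kappa = cohen_kappa(reference_map, simulated_map)
```

In this toy case three of four cells agree (accuracy 0.75), but half of that agreement is expected by chance, so kappa is 0.5. This is why kappa is reported alongside the raw accuracy.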

Conclusion
The introduction of a new transportation network, especially on a massive scale such as a high-speed railway system, will inevitably have an impact on urban growth. This impact can be observed not only statistically and theoretically but also spatially. By analysing the impact of high-speed railway development on urban growth and comparing a location that has access to the infrastructure with one that does not yet have access, we can deduce that the impact is most likely related to access to the railway lines, the current development status of the area, and the geographical challenges of each area.
The study has shown that it is possible to predict and simulate urban growth spatially with the ANN-CA model approach using the MOLUSCE modules, and the result was used to analyse the impact of each spatial variable in the area in relation to high-speed railway system development. However, this study can be enhanced by introducing more samples of the study data from several past periods (preferably before and after the development of the high-speed railway system) to further support the current result.

Figure 1. Study case area of the two prefectural areas in Central Japan.

Figure 2. Study case area of Shizuoka City, Shizuoka Prefecture, Central Japan.

Figure 3. Study case area of Kofu City, Yamanashi Prefecture, Central Japan.

Figure 5. Example of Cellular Automata Rules. Source: Leao, 2004 [14]

Figure 6. Simplified example of Cellular Automata Generation.

Figure 8. Spatial Variable Utilized. (a) Slope and topography, (b) urban area, (c) distance to nearest station, (d) distance to nearest railway, (e) distance to nearest road network, and (f) distance to nearest water body. Black areas indicate the closest distance.

Figure 9. Change of Land cover percentage in Shizuoka area.

Figure 10. Change of Land cover percentage in Yamanashi area.

Figure 11. Land use change rate over different periods.

Figure 15. Simulation Image of Urban Land Use in 2026. Shizuoka area (a) and Yamanashi area (b).

Table 1. Utilized Spatial Variable(s)
Distance from a cell to nearest station: Euclidean Distance function in ArcGIS
Distance from a cell to nearest railway: Euclidean Distance function in ArcGIS
Distance from a cell to nearest road network: Euclidean Distance function in ArcGIS
Distance from a cell to nearest water body: Euclidean Distance function in ArcGIS

Table 2. Percentage of Land cover for each specified year in Shizuoka area.

Table 3. Percentage of Land cover for each specified year in Yamanashi area.

Table 4. Correlation value for Spatial Variable using