Turning the existing building stock into a resource mine: proposal for a new method to develop building stock models

The construction sector is facing an important challenge to reduce its resource consumption. A promising strategy is to reduce the need for virgin resources by using the existing building stock as a resource mine. Various insights are needed to enable this. It should be clear how many materials are in the stock, when these will become available and to what extent these can be reclaimed in an environmentally and economically viable way. For this purpose a spatio-temporal building stock model is being developed and tested on the city of Leuven, Belgium. In a next step it will be assessed how these flows can be reclaimed in an environmentally and economically viable way. This paper provides a review of the methods used for building stock modelling and proposes improvements to the bottom-up archetype scaling method. Building parameters relevant to material reuse are introduced and a new methodology for upscaling is presented, using two data analysis techniques: a clustering algorithm and an artificial neural network.


Introduction
The construction industry is responsible for over 40% of the total European energy and resource use [1] and 25-30% of all generated waste [2]. The European Commission (EC) has therefore labelled construction and demolition waste (C&DW) a priority waste stream [3] and put a directive into place to increase its reuse and recycling [4]. Waste management is also one of the areas in its EU Action Plan for the Circular Economy [5]. In Belgium, there already is an effective C&DW recycling scheme, keeping over 90% off landfills [6,7]. This however mainly consists of energy intensive recycling processes such as melting of metals and grinding of mineral waste (e.g. bricks, concrete, ceramics) into aggregates. The latter is especially wasteful as it degrades high-grade materials into low-grade variants [8]. According to the NGO Rotor DC, reuse of deconstructed building components is limited to about 1% of all material. Deconstruction (rather than demolition) is a logistically, economically and legally complex process [9,10] and hence only a few companies offer this service. To scale the current initiatives from the small markets to a more significant part of the construction industry in a sustainable way, knowledge about the material flows resulting from deconstruction and refurbishment is needed. This requires insights in the material composition of the existing building stock, the demolition and refurbishment activities and related material flows and the environmental benefits related to potential recycling or reuse paths. This is a complex topic with highly disaggregated data that need to be connected to identify opportunities for a circular economy. Insights in the current state of the art regarding the main methodological aspects (environmental modelling of the building stock, urban mining and geographical data analysis) are elaborated in the subsequent section. Based on these insights a new modelling approach is presented in section 3. In a final section conclusions are drawn and the further outlook is described.

Building stock modelling and urban environmental impact
Two main approaches can be distinguished in the field of building stock modelling: top-down and bottom-up approaches. Top-down approaches, such as input/output modelling and urban metabolism studies [11,12], provide global insight into the systems at work, but are unable to uncover the mechanisms of their subsystems (essentially keeping the city as a "black box") [13]. Bottom-up approaches [13,14,15] subdivide a city into its various components, which are assessed separately and then aggregated to obtain the full picture. In the literature, two bottom-up approaches can be identified to assess building stocks: the archetypes approach and the building-by-building approach [14]. The former makes use of a limited set of archetypes, each representing a subset of the building stock. This gains the benefits of a detailed analysis of these archetypes at the cost of a simplified depiction of the heterogeneous stock. The building-by-building approach models each building individually, but usually means a lower level of detail and a lower scale level (i.e. neighbourhoods) because of practical limitations.
Geographical Information Systems (GIS) provide additional insights by making the data spatially explicit. GIS makes it possible to assess local opportunities and impacts [13,16] and to refine the bottom-up model through a more accurate inventory of the characteristics of all buildings [14,17].
Whereas earlier studies on urban systems were limited in scope (to material flow analysis or greenhouse gas (GHG) emissions), more recent studies have adopted a more holistic approach using life cycle assessment (LCA) [11,12]. The LCA method accounts for a wide range of environmental impacts (including GHGs, human toxicity and land use) and considers all life cycle stages (production, use and end-of-life), thereby avoiding burden shifting [18].

Materiality of building stock / urban mining
As resources are becoming scarcer and environmental burdens from the production of construction materials are increasing, attention is turning towards the existing building stock as a resource mine [19,20]. Insight into the materials available in the building stock and the related environmental implications has become important. Mastrucci et al. [21] attempted to assess the resources in an urban area through a GIS-enhanced archetype model, derived from the European TABULA project on building typologies [22]. An LCA was made for the end-of-life stage, comparing current construction waste treatment with a higher recycling fraction for concrete. The study did not include an assessment of upscaled recycling opportunities. Other studies focussed on dynamic life cycle inventory (LCI) in LCA and on temporal changes in the building stock, identifying the release of materials and volume of recycling from building demolition [23], but without considering waste flows from refurbishments and without assessing environmental impacts.
The approach is also used to assess changes in the stock due to maintenance and refurbishments [24], or due to shifts in energy supply or climate change [25]. The ongoing BBSM project includes a limited material flow model for Brussels as part of their research in urban mining, taking into account three renovation scenarios [26]. More detailed studies on the environmental impact of demolition practice and recycling potential are available at the building level [8] without providing insights in urban stocks. Innovative recovery methods for C&DW were for example investigated in HISER and an integrated approach to a circular building sector was strived for in BAMB [27,28]. A holistic building stock model incorporating circular principles should include these developing techniques.

Data analysis in GIS
The amount and complexity of building stock and GIS data require sophisticated techniques for data analysis [29]. In that regard cluster analysis, a type of data mining, has been successfully applied to partition and analyse a heterogeneous building stock regarding its energetic characteristics [30]. While data mining is used to uncover existing patterns from data, machine learning, which is a branch of artificial intelligence (AI) and a different method of data analysis, goes beyond this and can be used to build predictive models. Applications relevant for urban mining are still scarce, but the method has recently been applied to predict the energy performance of a building stock [31]. This study used artificial neural networks (ANN), a popular method of machine learning. The benefits of this nonparametric approach over statistical prediction models (e.g. as used by Mastrucci et al. for energy use predictions [32]) are a better representation of complex relations, the ability of the model to self-correct when more data becomes available and a lower sensitivity to researcher-induced bias by preprogrammed rules [29].

Goal of the new building stock model
The aim of the spatio-temporal model of the building stock is to gain insight into which building materials can be reclaimed at what time and how this is possible in an environmentally sound way. The model should hence make it possible to predict future material flows and to anticipate the resulting opportunities and challenges concerning their reuse and impacts. As the model should simultaneously assess material flows coming from the existing stock and material needs for refurbishments and new buildings, it should go beyond identifying quantities of flows and facilitate a high-level recovery of materials. In order to achieve this goal a bottom-up model is proposed, validated by top-down data and combined with LCA for the environmental assessment. This is further described in the subsequent section.

Case Study
The method is developed using the city of Leuven (BE) as case study. Leuven has become an internationally recognised frontrunner in the sustainability debate and in this context, various data have been collected during the past decade which are valuable inputs for this research. Moreover, the mobilisation of citizens and enterprises alongside research institutions in Leuven 2030 shows potential for the aggregation of more data and application of the study results. The datasets that will be integrated for the research are listed in Table 1, but might be further extended with additional data sources during the research.

Categorization of buildings
To model the building stock, a set of representative buildings for both the existing stock and new buildings has been defined. In a next step, an inventory of the quantity of (reusable) materials will be made for each of these representative buildings.

3.3.1. Selection of representative building types for the stock.
The various representative buildings had to cover variations in construction period, as this influences the materials used, and variations in building types, as this influences the amount of materials in a building. A literature review on archetypes of buildings identified two useful studies: the TABULA project [22] and the SuFiQuaD project [33], which provided a good basis for the characterization of the building stock. Both studies defined archetypes for the Belgian context, categorizing the stock by construction period and building typology. The archetypes have been described in detail with the aim of assessing energy use (TABULA) or environmental impact and renovation strategies (SuFiQuaD).
To define the set of representative buildings, a statistical analysis of the building stock in Leuven was made. This analysis supported the use of the five construction periods (before 1945, 1945-1970, 1971-1990, 1991-2005 and after 2005) and four main typologies (terraced, detached, semi-detached, multi-family buildings) used by both earlier mentioned studies. The results of the statistical analysis are presented in Figure 1. The four main typologies represent a more or less even share of the housing stock. The figure reveals that while terraced houses were historically more popular, this has shifted to detached houses and multi-family dwellings. For the aim of our research, the roof type of the stock was analysed to investigate if a further division of the archetypes distinguishing roof types is needed.
The analysis of the data shows that the number of flat-roofed buildings is negligible, except in the most recent construction period and among multi-family dwellings as a whole. To account for these, the set of archetypes has been expanded from 20 to 28.

Material inventory of selected types.
To assess which of the materials in the archetypes can be reused, parameters are defined that determine the reusability [34], the environmental impact and economic value of materials based on a literature study and expertise of the authors. Examples of parameters are the material age, fixation method, location in the building (accessibility), embodied environmental impact and remaining technical service life. These parameters will be applied to the material inventory of each of the archetypes to determine the amount of reusable materials.

Clustering algorithm
In parallel to the manual categorization under paragraph 3.3.1, a clustering algorithm has been applied to attempt the same. A k-means clustering has been selected, where four parameters are considered: construction year, total floor area, share of exposed façades and share of pitched roofs. The algorithm defines a set number k of clusters of buildings based on their similarity, as determined in the four-dimensional space defined by the considered parameters. It results in clusters of a relatively equal internal variance, which makes it a good fit for the data extrapolation. The results are shown in Figure 2 and accompanied by a semantic interpretation of these results in Table 2. The number of clusters was determined empirically. Because of the low number of flat-roofed single-family dwellings, pitched-roofed multi-family dwellings and generally post-2005 buildings, the algorithm did not recognize these as separate clusters. Because of this, the clustering was reduced from k = 28 to k = 16. It should be noted that there is now one post-2005 cluster (cluster 16) and that clusters 14 and 15 are similar except for their roof type.
The 16 clusters determined by the algorithm are almost completely in line with the suggested partition of construction periods, with the exception of clusters 3 and 8. Given the lack of formal delineation between these partitions, this is remarkable.
The distinction is less clear for the typologies. It should be noted that these categories in the second graph are taken from the classification by the city of Leuven. At first it would seem that the parameters of floor space, ratio of exposed façade and ratio of pitched roof do not characterize the typologies well, but the final graph adds an important nuance. It shows that floor space is an important characteristic. In fact, in the case of cluster 11, the algorithm arguably has done a better job of identifying the multi-family buildings, as buildings with a floor space above 750m² are very likely to be an apartment building instead of a single-family typology.
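The clustering step can be illustrated with a minimal sketch of the standard k-means iteration in plain Python. The four features follow the parameters named above. The toy buildings, the min-max scaling, the fixed seed and the choice of k = 3 are assumptions made purely for illustration; they are not the Leuven data, nor necessarily the implementation used in this study.

```python
import random

def min_max_scale(rows):
    """Rescale every column to [0, 1] so that no single feature dominates the distance."""
    cols = list(zip(*rows))
    lo = [min(c) for c in cols]
    hi = [max(c) for c in cols]
    return [
        [(v - l) / (h - l) if h > l else 0.0 for v, l, h in zip(row, lo, hi)]
        for row in rows
    ]

def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def k_means(points, k, iterations=100, seed=0):
    """Alternate assignment and centroid update until the labels stop changing."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # initial centroids: k random buildings
    labels = []
    for _ in range(iterations):
        new_labels = [min(range(k), key=lambda c: sq_dist(p, centroids[c])) for p in points]
        if new_labels == labels:
            break  # converged
        labels = new_labels
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster becomes empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids

# Hypothetical buildings:
# (construction year, floor area in m2, share of exposed facades, share of pitched roof)
buildings = [
    (1925, 140, 0.5, 1.0),
    (1930, 150, 0.5, 1.0),
    (1985, 220, 1.0, 1.0),
    (1990, 240, 1.0, 1.0),
    (2010, 1200, 0.75, 0.0),
    (2012, 1500, 0.75, 0.0),
]
labels, centroids = k_means(min_max_scale(buildings), k=3)
print(labels)
```

Because k-means depends on its random initialisation, the resulting partition can differ between seeds; in practice several restarts are usually compared.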

Future outlook
In a next step of the research, the inventory from the representative buildings will be upscaled to the entire urban building stock based on GIS data. In existing studies the upscaling is most often done by linking an archetype to each single building in the stock based on its characteristics. All data is then scaled linearly based on the archetype and a weighting factor, e.g. floor area. Our study proposes a new approach using data analysis algorithms. Compared to the traditional expert-based models, which use hard-coded rules, data analysis algorithms by design allow for more complex and flexible models, being able to self-adjust their criteria when more data samples are added, or when the algorithm is applied to an entirely new data set. More specifically our proposal is to support the extrapolation of the single building data through two machine learning methods: a k-nearest neighbours classification and an artificial neural network (ANN).
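For reference, the conventional linear upscaling mentioned above can be sketched as follows. The archetype names, material intensities and floor areas are hypothetical placeholders, not values from this study.

```python
# Hypothetical material intensities per archetype, in kg per m2 of floor area.
ARCHETYPE_INTENSITY = {
    "terraced_pre1945": {"brick": 900.0, "timber": 60.0},
    "detached_1971_1990": {"brick": 700.0, "concrete": 500.0},
}

def scale_linearly(stock):
    """Sum archetype intensities over the stock, weighted by each building's floor area."""
    totals = {}
    for archetype, floor_area_m2 in stock:
        for material, kg_per_m2 in ARCHETYPE_INTENSITY[archetype].items():
            totals[material] = totals.get(material, 0.0) + kg_per_m2 * floor_area_m2
    return totals

# Each building is (assigned archetype, floor area in m2); values are illustrative only.
stock = [
    ("terraced_pre1945", 120.0),
    ("terraced_pre1945", 150.0),
    ("detached_1971_1990", 200.0),
]
totals = scale_linearly(stock)
print(totals)
```

Every building inherits the exact composition of its archetype here, which is the rigidity the data-driven methods aim to relax.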

K-nearest neighbours classification (kNN)
A kNN is loosely related to the k-means clustering algorithm used under 3.4, but instead of defining clusters within a set, this method works in the opposite direction: it classifies elements into categories based on the category of similar elements. This similarity is calculated in the same way as in k-means. This way it will categorize buildings under the most similar archetype. If the archetypes were to match the centroids of the earlier defined k-means clusters, the kNN classification would return categories identical to the 16 clusters. If the archetypes do not match these centroids or a different number of archetypes is selected, the kNN adjusts, demonstrating its flexibility.
After the classification, the data obtained for the archetypes under section 3.3.2 will be extrapolated to all buildings based on their category and weighting factors derived from the available GIS data.
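A minimal sketch of such a classification is given below, assuming features that have already been scaled to a common range. The labelled samples and archetype names are hypothetical. With k = 1 and the cluster centroids as the only labelled samples, the function reduces to the assignment step of k-means, which is the equivalence noted above.

```python
from collections import Counter
import math

def knn_classify(sample, labelled, k=3):
    """Return the majority label among the k labelled points closest to `sample`."""
    nearest = sorted(labelled, key=lambda item: math.dist(sample, item[0]))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical labelled buildings: (scaled features, archetype).
# Features: construction year, floor area, share of exposed facades, share of pitched roof.
labelled = [
    ((0.05, 0.10, 0.50, 1.0), "terraced_pre1945"),
    ((0.10, 0.12, 0.50, 1.0), "terraced_pre1945"),
    ((0.70, 0.20, 1.00, 1.0), "detached_1971_1990"),
    ((0.75, 0.22, 1.00, 1.0), "detached_1971_1990"),
    ((0.95, 0.90, 0.75, 0.0), "multi_family_post2005"),
]

old_house = knn_classify((0.08, 0.11, 0.50, 1.0), labelled, k=3)
new_block = knn_classify((0.90, 0.80, 0.75, 0.1), labelled, k=1)
print(old_house, new_block)
```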

Artificial neural network (ANN)
Besides the clustering approach, a second approach, i.e. ANN, will be used for the upscaling of the material inventory of the representative buildings to the full building stock. An ANN is a computing system which can learn to perform a task (e.g. extrapolate data) without using a pre-programmed instruction set [35]. There are several learning paradigms which can be employed depending on the kind of task the ANN is expected to perform. In this case, a supervised learning paradigm is selected which uses a set of training data (i.e. the cases as basis for the archetypes) to predict values for a new set of data (i.e. the remainder of the building stock). For this, inputs (known data like floor area), intermediary nodes and outputs (unknown data like present materials) form a network connected through synapses (Figure 3). Each connection carries a weight, which is typically initialised randomly, and each node applies a non-linear transformation to the weighted sum of its incoming signals to obtain an output value. Based on the feedback from the training data, these weights are then iteratively adjusted in order to obtain a good approximation of the actual output signal.
Figure 3. Concept for the proposed ANN. Data that are available for all buildings serve as input to extrapolate the unknown parameters. The network is trained on known data from the archetypes.
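The supervised learning scheme described in this section can be sketched with a very small network in plain Python. The architecture (one hidden layer of four tanh nodes, a linear output), the learning rate, the number of epochs and the toy training data are all assumptions for illustration; the network actually proposed in this study may differ in every one of these respects.

```python
import math
import random

def train_mlp(samples, hidden=4, epochs=2000, lr=0.05, seed=0):
    """Train a one-hidden-layer network (tanh nodes, linear output) by gradient descent."""
    rng = random.Random(seed)
    n_in = len(samples[0][0])
    # Weights start at small random values; training adjusts them iteratively.
    w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(hidden)]
    b1 = [0.0] * hidden
    w2 = [rng.uniform(-0.5, 0.5) for _ in range(hidden)]
    b2 = 0.0

    def forward(x):
        h = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b) for row, b in zip(w1, b1)]
        return h, sum(w * hi for w, hi in zip(w2, h)) + b2

    def mse():
        return sum((forward(x)[1] - y) ** 2 for x, y in samples) / len(samples)

    initial_loss = mse()
    for _ in range(epochs):
        for x, y in samples:
            h, out = forward(x)
            err = out - y  # feedback from the training data
            for j in range(hidden):
                # Gradient for hidden node j, computed before w2[j] is updated.
                grad_h = err * w2[j] * (1.0 - h[j] ** 2)
                w2[j] -= lr * err * h[j]
                b1[j] -= lr * grad_h
                for i in range(n_in):
                    w1[j][i] -= lr * grad_h * x[i]
            b2 -= lr * err
    return (lambda x: forward(x)[1]), initial_loss, mse()

# Hypothetical training cases: (scaled floor area, scaled construction year) -> scaled brick mass.
samples = [
    ((0.1, 0.0), 0.15),
    ((0.2, 0.1), 0.25),
    ((0.4, 0.5), 0.40),
    ((0.6, 0.6), 0.55),
    ((0.8, 0.9), 0.70),
    ((1.0, 1.0), 0.85),
]
predict, initial_loss, final_loss = train_mlp(samples)
print(initial_loss, final_loss, predict((0.5, 0.5)))
```

With so few training cases the network can fit the data closely without generalising, which is the data requirement discussed under validation.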

Interpretation of models
In the models obtained through the cluster analysis and ANN, buildings are not a priori categorised in a set of predefined archetypes, but are rather brought into relation to an expandable number of samples through their known variables. In this way, the unknown variables, i.e. quantities for the building elements, will be expressed as a sum of fractions of the inventories of representative buildings. The accuracy of these estimations can be improved through systematic interpretative workflows. The required steps will be specific to the resulting application. Relevant and feasible distinctions will be different for resource management on regional or urban scale, or even the scale of districts or building sites.
It is important to separate these interpretative steps from the development of the models themselves to assure flexibility and scalability and minimize bias.
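To make the idea of expressing unknown quantities as a sum of fractions of representative inventories concrete, the sketch below blends archetype inventories with normalised weights. How the fractions are obtained is not specified here; the inverse-distance weighting, the feature vectors and the intensities are assumptions chosen only for illustration.

```python
import math

def blend_inventories(features, archetypes, eps=1e-9):
    """Estimate an inventory as a weighted sum of archetype inventories.

    The fractions are inverse-distance weights normalised to sum to 1 (an assumption).
    """
    raw = {
        name: 1.0 / (math.dist(features, a["features"]) + eps)
        for name, a in archetypes.items()
    }
    total = sum(raw.values())
    fractions = {name: w / total for name, w in raw.items()}
    estimate = {}
    for name, frac in fractions.items():
        for material, kg_per_m2 in archetypes[name]["inventory"].items():
            estimate[material] = estimate.get(material, 0.0) + frac * kg_per_m2
    return fractions, estimate

# Hypothetical archetypes with scaled features and intensities in kg per m2.
archetypes = {
    "A": {"features": (0.0, 0.0), "inventory": {"brick": 800.0}},
    "B": {"features": (1.0, 0.0), "inventory": {"brick": 400.0}},
}
# A building halfway between both archetypes receives equal fractions of each.
fractions, estimate = blend_inventories((0.5, 0.0), archetypes)
print(fractions, estimate)
```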

Validation of the results of the various approaches
The results obtained through the scaling and interpretation will be validated by comparing the predictions with the reality of the building stock through test samples. Based on the outcomes of this comparison it might appear that the models require further improvements. Through an iterative process the models will be improved and more case studies will be added until the models perform well enough.
Besides validation through comparison with a random sample of buildings, the predictions of the models will also be validated by comparing these with top-down statistical data. This comparison might also reveal that one method is preferred over the other: the ANN might prove impractical in that it requires too much training data to become reliable or on the contrary demonstrates an important level of detail in its predictions. The cluster model could prove robust, but to what level it becomes inflexible for other applications because of programmed rules, remains to be seen.
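Both validation routes can be expressed as simple error measures. The sketch below assumes the mean absolute percentage error for the sample comparison and a relative deviation for the top-down comparison; the choice of metrics and all numbers are illustrative assumptions, not results.

```python
def mean_absolute_percentage_error(predicted, observed):
    """Average relative deviation between predictions and surveyed test samples.

    Observed values must be non-zero.
    """
    return sum(abs(p - o) / o for p, o in zip(predicted, observed)) / len(observed)

def relative_total_deviation(predicted, top_down_total):
    """Deviation of the summed bottom-up predictions from a top-down statistic."""
    return (sum(predicted) - top_down_total) / top_down_total

# Hypothetical brick masses in tonnes for three test buildings.
predicted_t = [95.0, 210.0, 48.0]
surveyed_t = [100.0, 200.0, 50.0]

mape = mean_absolute_percentage_error(predicted_t, surveyed_t)
total_dev = relative_total_deviation(predicted_t, top_down_total=350.0)
print(mape, total_dev)
```

Computing the same measures for each upscaling model on the same test samples would allow a like-for-like comparison between them.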

Scenario modelling.
A temporal resource flow model, incorporating renewal rates of buildings and building components, makes it possible to assess the influence of different future scenarios of expected changes in the building stock on the amount of (reusable) materials that will become available over time. These scenarios could for example represent increasing insulation levels, accelerated renovation rates or a gradual change towards the use of other construction methods (e.g. timber frame construction). The scenarios themselves will be defined based on expected future situations, covering 'business-as-usual', 'targeted refurbishments', 'zero-energy standard' and 'reduced life cycle environmental impact'.
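A strongly simplified version of such a temporal model is sketched below: a stock that shrinks at a fixed yearly demolition rate, of which a fixed share is reusable. The stock size and all rates are hypothetical, except that the 1% reusable share in the first scenario echoes the reuse level cited in the introduction. Inflows from new construction and component-level renewal are deliberately left out.

```python
def simulate_outflow(stock_t, demolition_rate, reusable_share, years):
    """Return the yearly reusable outflow (tonnes) from a stock shrinking at a fixed rate."""
    outflows = []
    for _ in range(years):
        released = stock_t * demolition_rate  # material released this year
        outflows.append(released * reusable_share)
        stock_t -= released  # remaining stock for the next year
    return outflows

# Hypothetical scenario parameters (fractions per year).
SCENARIOS = {
    "business-as-usual": {"demolition_rate": 0.005, "reusable_share": 0.01},
    "targeted refurbishments": {"demolition_rate": 0.010, "reusable_share": 0.05},
}

results = {
    name: simulate_outflow(1_000_000.0, years=3, **params)
    for name, params in SCENARIOS.items()
}
for name, outflows in results.items():
    print(name, [round(v, 2) for v in outflows])
```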

Quantification of building material flows.
In a more straightforward application of the building stock models, near-future macro or meso-scale flows can be anticipated and the material stock on building sites can be estimated.

Conclusion
In the context of using the existing building stock as a resource mine, insights are needed regarding the amount of materials in the stock, their location and when these will become available. Such insights can be derived from building stock models. For this paper a spatio-temporal building stock model is being developed. The model builds upon previous research on stock modelling using a bottom-up approach. Based on a literature review, this approach was preferred over a top-down approach as the level of granularity of the latter is too low for the aim of our research. The proposed approach takes as its starting point existing approaches in which building typologies are upscaled to model the full stock. In contrast to existing studies that use a manual categorization of the buildings, a clustering algorithm is used to manage the large amounts of data efficiently. The approach was tested on the city of Leuven to cluster the various buildings in the stock, and the method proved adequate for this use case as it aids in interpreting the large amounts of data and the complex interrelations between them.
Modelling the building stock remains a challenging task as reliable information can't be obtained for all buildings. The application of our clustering analysis exposed that there can be important semantic differences to data as was the case for the typological classification. The interpretation of intermediary forms, e.g. those between an apartment block and a terraced house, can lead to very different results depending on how they are treated in the scaling from the archetypes to the rest of the stock. In the next step of this research, a better scaling method will be pursued through two machine learning techniques: a k-nearest neighbour algorithm and an artificial neural network.