Quantitative Analysis of the Educational Infrastructure in Colombia Through the Use of a Georeferencing Software and Analytic Hierarchy Process

National budget distribution policies show an increasing trend of investment in educational infrastructure. This makes it necessary to identify the territories with the greatest number of facilities (such as schools, colleges, universities and libraries) and those lacking this type of infrastructure, in order to know where government intervention may be required. This work is not intended to judge the qualitative state of the national infrastructure. It focuses on the quantitative status of the infrastructure of Colombia’s educational sector, identifying the territories with more facilities, such as schools, colleges, universities and public libraries. To do this, a quantitative index is created to assess whether the coverage of educational infrastructure at departmental level is sufficient, taking into account not only the number of facilities but also the population and the area of influence of each one. The study is framed within a project of the University of the Andes called “Visible Infrastructure” (Infraestructura Visible). The index is obtained through an analytic hierarchy process (AHP) and, subsequently, a linear equation that reflects the variables investigated. The index is validated through correlations and regressions against social, economic and cultural indicators determined by official entities. All the information on which the analysis is based is official and public. With the end of the armed conflict, it is necessary to focus the planning of public policies on closing the social gaps that affect the most vulnerable population.


Introduction
The distribution policies of the national budget show an increasing trend of investment in educational infrastructure, making it necessary to identify the departments with the greatest number of facilities, such as colleges, higher education centres and libraries, and the most neglected ones, in order to know where more government intervention is required. First, a general context is given for the national and international economy and the state of investment in educational infrastructure, in order to show the relationship between a country's economic growth and its investment in education. Then, the value of geospatial analysis software is highlighted, along with the process carried out for the analysis. Finally, the results, conclusions and recommendations are presented.
Ultimately, an index is provided that reflects whether the coverage of the educational infrastructure at departmental level is sufficient, taking into account the number of schools, the population and the area, with the final purpose of categorizing each department. It is essential to consider demographic behaviour, because it is a means of determining whether the current infrastructure is sufficient. According to the World Bank, Colombia's current population is 48.2 million inhabitants, of whom 24.3% are children between 0 and 14 years old [2].

Colombia's infrastructure investment: how has the GDP been invested in infrastructure?
According to the World Bank, investment in education has been growing in the last few years; since 2011 it has oscillated between 15.5 and 17 percent of total GDP [2].
In addition, based on the information provided by Infraestructura Visible, Figure 3 shows the quantity of educational facilities for private and public schools, public libraries and higher education institutions. It reveals better education coverage in the territories mentioned before, so a deeper analysis is required.

International Context
In order to compare Colombia with other countries, one developing country (Peru) and two developed countries (Germany and the United States) were selected, with the aim of showing the relationship between gross domestic product (GDP), investment in education and, in turn, non-attendance at school. Taking into account the GDP of each country, a relationship between GDP and the non-schooling rate can be observed in Germany: when GDP increases, the non-schooling rate decreases. However, the same does not happen in the United States, where the relationship is reversed [4].
In Colombia, by contrast, the non-schooling rate is increasing despite the efforts of the government, which has been investing increasingly in educational infrastructure. Among the countries compared, Colombia has the highest non-schooling rate [4]. When GDP decreased, investment in educational infrastructure also decreased. Although that investment has reached a high percentage of GDP in comparison with developed countries, it has not been enough, because the non-schooling rate keeps increasing. With this context in mind, it was necessary to carry out an analysis at national level with the information provided by Infraestructura Visible, in order to understand how Colombia is investing in infrastructure and what the state of education is at department and municipality level.

Georeferencing as an analysis tool
It is necessary to highlight the importance of using a geographic information system, such as ArcGIS, in order to carry out a comprehensive analysis.

Objective
The main objective of using this software is to perform an integrated geospatial analysis at the department and municipality level in Colombia, taking into account the quantity of facilities, such as higher education institutions, schools and public libraries, along with the available demographic information. The purpose is to identify which departments have the best or the worst educational infrastructure at a quantitative level.
This paper is intended to present a visual and concise analysis, providing the reader with official information given only by the entities in charge. A further objective is to integrate and georeference the information available in the Infraestructura Visible database, in order to perform a complete analysis of Colombia's educational infrastructure and present it visually. It is worth highlighting that, because of the nature of the database, only a quantitative analysis can be performed, not a qualitative one.

Limitations
Relying only on official information brings some limitations. First, the information obtained from the different sources, before it was integrated into the main database, was unorganized, incomplete or, in some cases, wrongly georeferenced. In general, a lot of information was missing or inconsistent. The databases of the different institutions were uneven, which is a source of error in the analysis. Completing the information was difficult because some addresses did not exist in Google Maps, which made it harder to georeference some points.
Lastly, in order to load the information into the georeferencing system, the ArcGIS for Office program had to be used, which made it possible to convert addresses into coordinates. The following steps show the formatting required so that the program could process the addresses.

I. The municipality and department names had to have the first letter capitalized and the following letters in lowercase. For example: Ibagué, Tolima.
II. The addresses had to follow the same format: the first letter capitalized and the following letters in lowercase.

When addresses were incomplete or misspelled, the coordinates could not be found. That made it necessary to locate some points manually, which extended the process more than expected.
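The capitalization rule described in steps I and II can be sketched in Python as follows. This is a minimal illustration and not the procedure used in the study: the function names `normalize_place_name` and `format_location` are hypothetical, and the rule is applied literally, so only the first letter of the whole name is capitalized.

```python
def normalize_place_name(name: str) -> str:
    """Capitalize the first letter of a name and lowercase the rest."""
    cleaned = " ".join(name.split())  # collapse stray whitespace
    return cleaned[:1].upper() + cleaned[1:].lower()


def format_location(municipality: str, department: str) -> str:
    """Build a 'Municipality, Department' string in the required format."""
    return f"{normalize_place_name(municipality)}, {normalize_place_name(department)}"


# Example with the municipality and department named in the text.
print(format_location("IBAGUÉ", "tolima"))
```

Names that fail to geocode after this cleanup would still need the manual location step mentioned above.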

Methodology
According to an article from the Universidad Autónoma de México, "education is one of the factors that influence the progress and improvement of people and societies", because it is decisive for "reaching higher levels of social wealth and economic growth; leveling economic and social inequities; fostering social mobility; accessing higher employment rates; elevating the cultural conditions of the population; expanding opportunities for young people; promoting science, technology and innovation" [5]. It is therefore fair to say that education is one of the pillars of a more developed society.
However, the Organization for Economic Co-operation and Development (OECD) reports that Latin America has the lowest number of students who exceed the world average of educational quality. The same study states that, in Colombia, 43% of students have low academic performance. Also, as shown before, Colombia has a high non-schooling rate, which surpasses 7.5% [4].
Because of the importance of education, and given how dispersed the data on educational infrastructure are, this information was integrated to identify its coverage at departmental level. The proposed methodology was organized in six steps:
a. How was the information obtained? The information used came from the Infraestructura Visible database. Because this paper focuses on education, only public and private schools, colleges, universities and public libraries were taken into account. The process started by gathering information from different entities, such as the Ministry of Education, the departmental governments and the National Administrative Department of Statistics (DANE).
b. Educational infrastructure index (or education index): Once the number of educational centers was obtained, the relative importance of the different types of infrastructure was established through an analytic hierarchy process (AHP). The final result was an educational index for each department, shown later.
c. Departmental analysis: This analysis was based on the amount of infrastructure normalized with respect to population, area and population density by department. It shows graphically the scale of the indexes built with the AHP methodology.
d. Municipal analysis: Once the departments were classified by the index, an analysis was performed on the municipalities of the departments with the highest and the lowest educational index. As at departmental level, the education focus points were shown graphically.
e. Index correlation: In order to validate the constructed index, a series of correlations with indexes from other institutions was performed.
f. Illiteracy: A final comparison was performed between the indicators used for verification in the previous step and the illiteracy rate. The objective was to verify the correlation between education and infrastructure. If the indexes were significant in a linear regression with the illiteracy rate as the dependent variable, the hypothesis could be confirmed.
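As an illustration of the normalization mentioned in step c, the following Python sketch expresses facility counts per 100,000 inhabitants and per 1,000 km², and rescales them to a common 0 to 1 range. The scaling constants, the min-max rescaling, the `Department` class and all figures are assumptions made for the example; they are placeholders, not data or formulas from the study.

```python
from dataclasses import dataclass


@dataclass
class Department:
    name: str
    facilities: int   # number of educational facilities of one type
    population: int   # inhabitants
    area_km2: float   # surface area in square kilometres


def per_100k_inhabitants(d: Department) -> float:
    """Facilities per 100,000 inhabitants."""
    return d.facilities * 100_000 / d.population


def per_1000_km2(d: Department) -> float:
    """Facilities per 1,000 square kilometres."""
    return d.facilities * 1_000 / d.area_km2


def min_max_scale(values):
    """Rescale values linearly so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Placeholder figures for three hypothetical departments (NOT study data).
departments = [
    Department("Dept A", facilities=500, population=2_000_000, area_km2=25_000.0),
    Department("Dept B", facilities=60, population=100_000, area_km2=50_000.0),
    Department("Dept C", facilities=200, population=1_000_000, area_km2=10_000.0),
]

by_population = [per_100k_inhabitants(d) for d in departments]
by_area = [per_1000_km2(d) for d in departments]
scaled_by_population = min_max_scale(by_population)
scaled_by_area = min_max_scale(by_area)
```

In this made-up example the sparsely populated "Dept B" ranks first by population but last by area, which is why the choice of normalization matters for the index.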

Available information
The information available at Infraestructura Visible was divided into four sections:

A. Demographic Information:
The georeferencing attribute table includes the population and the area in square kilometres by municipality. With that information, the following map was generated.

B. Schools:
The attribute table on which georeferencing was performed has the name, location (department, municipality and address) and contact details (e-mail, phone number and fax) of each school. Map 2 shows the schools over the population density layer. The majority of schools are located in the zones with the highest density. It is worth noting, however, that there is at least one school in every municipality.

C. Higher education institutions:
The attribute table on which georeferencing was performed has the name, location (department, municipality and address) and contact details (e-mail, phone number and fax) of each center. Map 3 shows the institutions over the population density layer. These higher education centers are concentrated in the more developed Colombian departments, such as Antioquia, Valle del Cauca and Cundinamarca. By contrast, some departments have no center at all.

D. Libraries:
The information available for libraries includes the name, location (department, municipality and address) and contact details (e-mail and phone number) of each one. As in the previous cases, the majority of institutions are located in the zones with the highest population density. The map shows a lack of libraries in some departments. Because the map only shows public libraries, it is important to note that other libraries, such as those in schools or universities, are omitted.

Quantitative educational infrastructure index
The index of educational infrastructure was constructed with the AHP, which was divided into two stages:
1. In the first stage, the different institutions (schools, libraries and higher education centers) were valued. Because this valuation is subjective, a survey of 20 professionals (engineers, economists and architects) was carried out in order to measure the importance and impact assigned to each element. The survey asked two questions:
A. Based on your experience, which of these three types of infrastructure has a greater impact at educational level in a department?
- Libraries
- Schools
- Higher education centers
For this question, the professionals were asked to fill in a matrix in which they rated the importance of each institution compared with the other two. The following matrix shows the average ratings from the survey:
1.27 0.55 1
A value lower than 1 reflects a lower weight for the educational center being compared, a value greater than 1 reflects a higher weight, and a value equal to 1 reflects an equivalent weight for both. For example, schools are three times more important than libraries and 1.8 times more important than higher education centers in terms of infrastructure support in a department.
B. When you have to perform a national analysis of higher education infrastructure, how would you normalize the number of institutions for each territory?
- By area
- By population
The result is shown below. The majority of the people surveyed agreed that the most relevant factor when normalizing is the population.
2. The second stage consisted of calculating the weights with the AHP method, based on the survey results. For this, six variables were defined, each expressing one type of educational institution normalized by population or by area. The results are shown below:
It can be observed that Bogotá leads in terms of educational infrastructure, with an index of 0.35. Similarly, the most developed departments take the first eight places. As expected, departments located in the south-east of the country show an extremely low index compared with the departments shown in Figure 7. This could be due to the lack of facilities.
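The two AHP stages can be sketched in Python as follows. This is a hedged illustration under stated assumptions, not the authors' computation. The ratios 1.8 and 3.0 come from the text; reading the surviving value 1.27 as "higher education centers versus libraries" is an assumption, as is filling the rest of the matrix with reciprocals. The priority vector uses the row geometric mean method, a common AHP approximation, and the 0.7/0.3 split between population and area normalization is a placeholder, since the text only says that population was preferred.

```python
import math

# Criteria order: schools, higher education centers, libraries.
# 1.8 and 3.0 are stated in the text; 1.27 is an ASSUMED reading of a
# surviving matrix value; the lower triangle is ASSUMED reciprocal.
PAIRWISE = [
    [1.0,     1.8,      3.0],
    [1 / 1.8, 1.0,      1.27],
    [1 / 3.0, 1 / 1.27, 1.0],
]


def ahp_weights(matrix):
    """Priority vector from row geometric means, normalized to sum to 1."""
    n = len(matrix)
    geo_means = [math.prod(row) ** (1.0 / n) for row in matrix]
    total = sum(geo_means)
    return [g / total for g in geo_means]


def consistency_ratio(matrix, weights, random_index=0.58):
    """Saaty's consistency ratio; 0.58 is the random index for n = 3."""
    n = len(matrix)
    # Estimate the principal eigenvalue as the mean of (A w)_i / w_i.
    ratios = [
        sum(matrix[i][j] * weights[j] for j in range(n)) / weights[i]
        for i in range(n)
    ]
    lambda_max = sum(ratios) / n
    consistency_index = (lambda_max - n) / (n - 1)
    return consistency_index / random_index


def composite_weights(institution_weights, normalization_weights):
    """Six weights: each institution weight split across both normalizations."""
    return [wi * wn for wi in institution_weights for wn in normalization_weights]


def linear_index(variables, weights):
    """Weighted linear combination of already normalized variables."""
    return sum(w * x for w, x in zip(weights, variables))


weights = ahp_weights(PAIRWISE)
cr = consistency_ratio(PAIRWISE, weights)
# ASSUMPTION: placeholder split between population and area normalization.
six_weights = composite_weights(weights, [0.7, 0.3])
```

A consistency ratio below 0.1 is the threshold conventionally taken to mean that the pairwise judgments are acceptably coherent. A department's index would then be `linear_index(values, six_weights)`, with `values` holding its six normalized variables in the same order.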

Departmental analysis
With the ArcGIS software, it was possible to map the educational infrastructure indexes obtained. The result can be seen in Map 5.

Map 5. Educational Index
As mentioned before, the lowest indexes are located in the south-east of the country, while the highest ones are located either in the center or in the most developed departments.

Municipal analysis
This analysis was performed for two departments, the one with the highest index and the one with the lowest index.

i. Cundinamarca:
Since Bogotá has the best index, Cundinamarca was selected as the most developed department in terms of educational infrastructure. To perform the analysis, the georeferencing software was used, which made it possible to map the education focus and

ii. Vaupés:
Vaupés, on the other hand, had the worst educational infrastructure index in the country. The same methodology was applied, and the results are shown in Map 7. It is worth highlighting that not a single municipality has a higher education center. However, the infrastructure is focused on schools, since each municipality has at least one.

Colombia's heat map
The map for all the departments is shown below. It shows in more detail the education coverage at national level. In the map, the green zones indicate the highest coverage of education, while the red zones indicate low coverage.

Index validation
In order to test the validity of the quantitative index of educational infrastructure, a series of tests was carried out. On the one hand, economic indicators such as GDP per capita and the quantity of exports and imports were taken from the Ministry of Commerce, Industry and Tourism. On the other hand, education indicators were also taken, such as the number of municipalities, schools, higher education institutions and libraries by department and, as a special index, the illiteracy rate.
Similarly, a competitiveness indicator was obtained from the Economic Commission for Latin America and the Caribbean (ECLAC). The scale provided by this indicator is divided into five factors, which are explained below:
I. Strength of the economy: This examines the availability of resources, the level of skill development and departmental economic achievements vis-à-vis macroeconomic, structural and demographic constraints [6].
II. Infrastructure: This evaluates the availability, quality and efficiency of the infrastructure, understood as the set of permanent facilities that support the needs of production, communication and welfare [6].
III. Human capital: This examines the aggregate availability of knowledge, skills, competencies and social attributes, as well as its production through education and health systems [6].
IV. Science, technology and innovation: This assesses the level of skills development, achievement and the availability of resources of both academic and productive innovation systems based on science and technology [6].
V. Institutions, management and public finances: This examines the territorial management focused on the strengthening of departmental public finances and, with it, the degree of autonomy [6].

The competitiveness of each department can be seen in a graph that compares the position of each one in the index of educational infrastructure with its position on the competitiveness scale:

Figure 9. Indicators vs Competitiveness
The evidence shows a close relationship between the quantitative index of educational infrastructure and the levels of competitiveness obtained from ECLAC. Comparison with the other indicators also showed a strong correlation, except with exports per capita, number of municipalities and libraries. The last exception may be due to the low weight that libraries have in the index.
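Because the comparison above is made between positions in two rankings, one way to quantify it is a Spearman rank correlation, sketched below in Python. The paper does not state which correlation coefficient was used, so the choice of Spearman is an assumption, and the two lists are placeholder values for five hypothetical departments, not study data.

```python
def ranks(values):
    """Rank positions (1 = highest value); ties share the average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    result = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the group while the next value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        average_rank = (pos + end) / 2 + 1
        for k in range(pos, end + 1):
            result[order[k]] = average_rank
        pos = end + 1
    return result


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank positions."""
    return pearson(ranks(x), ranks(y))


# Placeholder values (NOT study data): index and competitiveness score.
education_index = [0.30, 0.20, 0.12, 0.05, 0.02]
competitiveness = [98.0, 75.0, 60.0, 62.0, 20.0]
rho = spearman(education_index, competitiveness)
```

Values of `rho` close to 1 mean that departments occupy similar positions in both rankings, which is the pattern described for Figure 9.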

Relationship with illiteracy
Despite the government's efforts to invest in educational infrastructure, illiteracy has not been eradicated. In order to understand this, it is important to test the relevance of the index presented above, and of the other indexes mentioned, to the illiteracy rate. To do this, the national illiteracy rate is first shown graphically. In Map 9, the white zones represent areas with no information available, and darker green zones indicate a higher illiteracy rate.

Map 9. National Illiteracy
It is worth highlighting that the index most strongly correlated with the illiteracy rate is Human Capital, which according to ECLAC "evaluates the aggregate availability of knowledge, skills, competencies, training and personal and social attributes, as well as their production and protection through education and health systems, related to the potential capacity to perform productive work to generate economic value" [6].

Figure 10. Education Index vs Illiteracy
In addition, several linear regressions were run. In each one, the independent variable was a single indicator and the dependent variable was the illiteracy rate. It is important to note that the values in the table correspond to the p-value of each coefficient. Values below 0.05 therefore indicate a statistically significant relationship between the illiteracy rate and the indicator, which by itself does not establish causation. The table shows that the indexes with the strongest relationship with the illiteracy rate are the ones provided by ECLAC.
Likewise, the quantitative index of educational infrastructure is significantly related to the illiteracy rate at the 90% confidence level. The similarity between these values is shown in Figure 10, from which it can be said that, for most departments with information, the amount of infrastructure reflects the level of illiteracy. This suggests that the number of education centres influences the share of the population that has access to education in the country.
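A single-indicator regression of this kind can be sketched in Python as follows. The slope and intercept are ordinary least squares estimates. For the p-value, an exact permutation test is used here in place of the t-test on which regression p-values are usually based, so that the sketch needs only the standard library; it is not the procedure reported in the paper. The data are placeholder values for six hypothetical departments.

```python
from itertools import permutations


def fit_line(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def permutation_p_value(x, y):
    """Two-sided exact permutation p-value for the slope (small n only)."""
    observed = abs(fit_line(x, y)[0])
    count = total = 0
    # Enumerate every reordering of y; feasible only for a handful of points.
    for shuffled in permutations(y):
        total += 1
        if abs(fit_line(x, shuffled)[0]) >= observed - 1e-12:
            count += 1
    return count / total


# Placeholder values (NOT study data): index and illiteracy rate in percent.
index = [0.30, 0.22, 0.15, 0.10, 0.06, 0.02]
illiteracy = [2.0, 3.5, 5.0, 6.5, 9.0, 12.0]

slope, intercept = fit_line(index, illiteracy)
p_value = permutation_p_value(index, illiteracy)
significant_at_5_percent = p_value < 0.05
```

With real data covering all departments, the full enumeration would be too slow, and a random sample of permutations or a standard t-test from a statistics package would be the practical choice.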

Conclusions and recommendations
I. Despite the efforts of the government, investment in educational infrastructure has not stopped the increase in the non-schooling rate.
II. The evidence shows a high concentration of population and infrastructure in the country's capital, Bogotá. The index confirms that this city is the most developed one in terms of educational infrastructure.
III. There is a significant lack of educational infrastructure in the south-east of the country, to the point where some territories have no higher education institutions.
IV. Cundinamarca, being in the center of the country, is the most developed department, also thanks to Bogotá and its neighbouring municipalities.
V. Developed departments such as Cundinamarca, Antioquia and Valle del Cauca have a higher index mostly because of the score of their capitals.
VI. The lack of information in some regions limits the analysis.
VII. Based on the correlations and the individual linear regressions, it can be affirmed that the educational indexes previously explained are directly related to the dependent variable, illiteracy.
VIII. Some of the indexes initially considered are not significant when related to illiteracy, for example "exports per capita" and "number of municipalities".
IX. The most significant index with respect to illiteracy is "human capital".
X. It is necessary to relate the educational index to a social indicator such as illiteracy in order to verify its validity.
XI. It is imperative to keep following up on the database in order to complete it and to make analyses of educational quality possible.
XII. To establish the state of educational infrastructure, it is necessary to take into account not only the quantity of facilities but also their quality. For example: teachers per student, international ranking of schools according to PISA tests, resources provided to students and facilities, among others.