A Review on Environmental DNA (eDNA) Metabarcoding Markers for Wildlife Monitoring Research

Environmental DNA (eDNA) monitoring uses traceable genetic material in the environment to detect the presence of organisms in a given area, and it is gaining popularity as an alternative to traditional monitoring methods. The selection of genetic markers is therefore crucial for identifying species in wildlife monitoring. This paper reviews several DNA markers that are appropriate and reliable for detecting organisms from environmental samples. We performed a systematic literature search of the SCOPUS database to review the molecular markers used in eDNA studies. This study focuses on the importance of selecting markers that can be used with next-generation sequencing (NGS) for biodiversity monitoring. Cytochrome C oxidase Subunit I (COI) is noted as the most widely used marker in metabarcoding research for the detection of targeted species.


Introduction
In the 21st century, Earth's biodiversity shows a declining trend, which has become a major crisis and a challenge for environmentalists working to protect ecosystems, especially in the identification of organisms. Traditionally, conservation methods for assessing the distribution of wildlife relied solely on morphological identification of species through visual surveys and manual counting of individuals. Sanger sequencing, introduced by Sanger et al. [1] in the 1970s, together with the development of polymerase chain reaction (PCR), has helped to identify the DNA of species by amplifying the genetic region of interest [2]. Amplification of DNA using PCR aids species identification for either one species or multiple species simultaneously, depending on the genetic marker used. Genetic markers, or DNA markers, are identifiable DNA sequences found in the genome that can be used to track the inheritance of the organism [3].
However, to recover the DNA of thousands of specimens from various environmental samples, a DNA metabarcoding approach using high-throughput sequencing (HTS) or next-generation sequencing (NGS) may give valuable insight for quantifying biodiversity [4]. NGS has the potential to generate from several hundred thousand to multiple millions of sequencing reads for the identification of species. Around the 2000s, NGS evolved rapidly with the release of various platforms, starting with the Titanium pyrosequencer (Roche) and followed by Illumina technology [5]. These platforms have been used in order [6]. To overcome the problem of monitoring biodiversity at a large scale, DNA is extracted from environmental samples and the species in an ecosystem are identified using NGS technologies [4]. DNA from the environment contains genetic material from the organisms living in that habitat, and it can be found and collected easily [7] for later analysis of the environment. With thousands of samples, NGS allows the obtained sequences to be compared against a growing standard reference library of known organisms, so that the taxa present in an environmental sample can be identified precisely [8].
Regardless of the platform used, marker development is also important for eDNA metabarcoding. eDNA assays work best when the amplified region is relatively short, effective and specific for the target species [9], because eDNA can degrade in a short time [10]. Different primers and regions give distinct results in coverage and resolution, and they can introduce bias between taxa when describing the communities in an environment [11]. Hence, this review focuses on the markers suitable for wildlife monitoring according to the taxon studied in eDNA research. The findings can serve as a future reference when selecting potential markers for biodiversity management and monitoring.

Materials and methods
Data mining on the selection of markers in eDNA studies was done using the SCOPUS database. Articles were searched by indexed title, abstract and keywords, using "eDNA" as the main keyword and adding keywords for the markers used to detect eDNA in wildlife monitoring, namely "COI", "16S", "Cytb", "D-loop", "18S", "NADH" and "12S". These markers were selected because most previous studies exploited regions of mitochondrial DNA and nuclear DNA to detect eDNA among wildlife. In addition, the keyword "wildlife" was included in order to identify papers suitable for this review. The wildlife papers selected cover mammal, invertebrate, bird, reptile and amphibian groups, as presented in Figure 1.
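The keyword combinations described above can be sketched as one query string per marker. This is only an illustration: the exact query syntax used for the search is not stated here, and the use of the Scopus `TITLE-ABS-KEY` field code with `AND` between terms, as well as the names `MARKERS` and `build_query`, are assumptions made for this sketch.

```python
# Marker keywords listed in the search strategy.
MARKERS = ["COI", "16S", "Cytb", "D-loop", "18S", "NADH", "12S"]


def build_query(marker, main_keyword="eDNA", scope_keyword="wildlife"):
    """Build one title/abstract/keyword query for a single marker.

    Assumption: terms are combined with AND inside TITLE-ABS-KEY(...);
    the original search may have been phrased differently.
    """
    terms = [main_keyword, marker, scope_keyword]
    joined = " AND ".join(f'"{term}"' for term in terms)
    return f"TITLE-ABS-KEY({joined})"


# One query per marker, keyed by marker name.
queries = {marker: build_query(marker) for marker in MARKERS}

for marker, query in queries.items():
    print(marker, "->", query)
```

Running one query per marker, rather than a single combined query, keeps the article counts separable by marker, which is how the results are tabulated.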

Results
In the systematic review, the database search results on eDNA were sorted by the type of marker used and by animal taxon. A total of 150 of the 254 research articles related to eDNA of wildlife were included (Table 1). Figure 1 shows the literature count for all studies of eDNA in wildlife from the SCOPUS database. Suitable peer-reviewed articles were chosen based on information related to wildlife monitoring, including the detection of invasive species and diseases among wildlife.
The results show that most publications on eDNA relate to fishes and invertebrates rather than other taxa. The Cytochrome C oxidase Subunit I (COI) marker is the most widely used genetic marker, appearing in 45 of the 150 articles across all taxa. By comparison, the number of wildlife monitoring studies using the other markers is far lower than for COI.
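The counts reported above can be turned into proportions with a few lines of arithmetic. The sketch below uses only the three figures given in the text (254 articles retrieved, 150 included, 45 using COI); the helper name `share` is hypothetical.

```python
def share(part, whole):
    """Return part/whole as a percentage."""
    return 100.0 * part / whole


retrieved = 254   # articles related to eDNA of wildlife found in the search
included = 150    # articles included in the review
coi_articles = 45 # included articles that used the COI marker

print(f"Included: {share(included, retrieved):.1f}% of retrieved articles")
print(f"COI: {share(coi_articles, included):.1f}% of included articles")
```

So COI accounts for 30% of the included articles, and roughly 59% of the retrieved articles passed the inclusion step.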

Discussion
COI serves as the standard DNA barcode marker for animals. It has extensive reference libraries, such as the Barcode of Life Data Systems (BOLD) and GenBank (NCBI), as well as taxonomic discriminatory power and predictable sequence variation [12]. Previously, most researchers used the COI marker because it can discern between closely related animals [13], which makes it usable in eDNA metabarcoding. However, the systematic literature review shows that, for some specific taxonomic groups targeted in environmental samples, the COI marker still requires considerable research, as its use has yet to be fully explored [14]. Besides, COI mini-barcode primers remain limited, as the 130 bp COI region of Ficetola et al. [15] is not sufficiently conserved to cover a broad range of taxa [16,17]. This is supported by Horton et al. [18], who report that DNA metabarcoding cannot efficiently use protein-coding genes such as COI because interspecific genetic variation impedes the use of universal primers.
Ribosomal RNA genes (12S, 16S and 18S) are also strongly suggested for DNA metabarcoding, as rRNA genes contain highly conserved regions and evolve more slowly than the mitochondrial genome as a whole [19], which is important for providing a species-specific signature sequence [20,21]. 12S and 16S markers can amplify highly variable mitochondrial sequences from various animal species, but 12S rRNA might yield different results from 16S rRNA because of the conserved regions at the 3' ends of both genes [22]. The short hypervariable regions of 12S and 16S are the most commonly used genetic markers for identifying a wide range of animals in environmental samples [23,24], especially for easily degraded samples [25]. For the 18S rRNA region, the reference database is still considered the limiting factor, because references are insufficient for the taxonomic identification of many species [26,27]. Cytb and D-loop may be the most used mtDNA markers for the identification of vertebrates, especially in conventional PCR [28], but these markers remain less explored in the eDNA field. Therefore, multi-species approaches using the Cytb and D-loop regions are needed in the future to test their accuracy for metabarcoding, which is highly dependent on marker choice. Ultimately, no single DNA marker has all the features needed to serve as a standard metabarcoding marker with extensive reference databases. The choice of marker region for metabarcoding can therefore be said to depend on the animal taxa involved. Furthermore, shorter eDNA marker regions are a suitable alternative, but they are still unable to provide broad species-level resolution.
Hence, a standard universal primer for mini-barcode amplification of degraded DNA from the environment [30] is difficult to design [29], as the preferred amplicon length for Illumina or Ion Torrent platforms is only around 100-200 bp [31].
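The amplicon-length constraint can be checked in silico before a primer pair is tested in the laboratory. The following is a minimal sketch, assuming exact primer matches with no degenerate bases or mismatches. The primer and template sequences are made-up toy strings, not published primers, and the function names are hypothetical. The 100-200 bp window is taken from the text above.

```python
# Translation table for complementing unambiguous DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an unambiguous DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def amplicon_length(template, forward, reverse):
    """Length of the amplicon (primers included), or None if a primer
    has no exact match on the template strand."""
    start = template.find(forward)
    if start == -1:
        return None
    # The reverse primer anneals to the opposite strand, so on the
    # template strand we look for its reverse complement downstream.
    reverse_site = reverse_complement(reverse)
    site = template.find(reverse_site, start + len(forward))
    if site == -1:
        return None
    return site + len(reverse_site) - start


def fits_short_read_window(length, low=100, high=200):
    """True if the amplicon length falls within the preferred window."""
    return length is not None and low <= length <= high


# Toy example: hypothetical primers flanking a 120 bp insert.
forward_primer = "ACGTACGTAC"
reverse_primer = "TTGCAAGGCC"
template = ("GG" + forward_primer + "A" * 120
            + reverse_complement(reverse_primer) + "CC")

length = amplicon_length(template, forward_primer, reverse_primer)
print(length, fits_short_read_window(length))
```

A real evaluation would run such a check against many reference sequences per taxon and allow for primer mismatches, which this sketch deliberately leaves out.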

Conclusion
The data from this systematic survey can help researchers select markers for eDNA metabarcoding. COI is mostly used for invertebrates and fish, while the rRNA markers (12S, 16S and 18S) are also used across other taxa in eDNA studies. However, markers should be validated, as the potential and limitations of each marker vary according to the targeted species. Future studies should focus on strengthening DNA databases and experimenting with new short primers for eDNA metabarcoding in order to unlock the immense potential of eDNA research in wildlife monitoring.