PaperThe following article is Open access

Machine learning classification of permeable conducting spheres in air and seawater using electromagnetic pulses

, , and

Published 5 August 2024 © 2024 The Author(s). Published by IOP Publishing Ltd
, , Citation Ryan Thomas et al 2024 Meas. Sci. Technol. 35 116106DOI 10.1088/1361-6501/ad678a

0957-0233/35/11/116106

Abstract

This paper presents machine learning classification on simulated data of permeable conducting spheres in air and seawater irradiated by low frequency electromagnetic pulses. Classification accuracy greater than 90% was achieved. The simulated data were generated using an analytical model of a magnetic dipole in air and seawater placed 1.5–3.5 m above the center of the sphere in 50 cm increments. The spheres had radii of 40 cm and 50 cm and were of permeable materials, such as steel, and non-permeable materials, such as aluminum. A series RL circuit was analytically modeled as the transmitter coil, and an RLC circuit as the receiver coil. Additive white Gaussian noise was added to the simulated data to test the robustness of the machine learning algorithms to noise. Multiple machine learning algorithms were used for classification including a perceptron and multiclass logistic regression, which are linear models, and a neural network, 1D convolutional neural network (CNN), and 2D CNN, which are nonlinear models. Feature maps are plotted for the CNNs and provide explainability of the salient parts of the time signature and spectrogram data used for classification. The pulses investigated, which expand the literature, include a two-sided decaying exponential, Heaviside step-off, triangular, Gaussian, rectangular, modulated Gaussian, raised cosine, and rectangular down-chirp. Propagation effects, including dispersion and frequency dependent attenuation, are encapsulated by the analytical model, which was verified using finite element modeling. The results in this paper show that machine learning methods are a viable alternative to inversion of electromagnetic induction (EMI) data for metallic sphere classification, with the advantage of real-time classification without the use of a physics-based model. The nonlinear machine learning algorithms used in this work were able to accurately classify metallic spheres in seawater even with significant pulse distortion caused by dispersion and frequency dependent attenuation. This paper presents the first effort towards the use of machine learning to classify metallic objects in seawater based on EMI sensing.

Export citation and abstractBibTeXRIS

Original content from this work may be used under the terms of the Creative Commons Attribution 4.0 license. Any further distribution of this work must maintain attribution to the author(s) and the title of the work, journal citation and DOI.

1. Introduction

Metallic object density is growing rapidly in both land and ocean environments. On land, this includes electronic devices, scrap metal, and discarded waste [1]. In the ocean, this includes anchors, shipping containers and shipwrecks [2]. Detection and classification of these metallic objects on land and in the sea will aid pollution removal. For example, seawater detection and classification of lost shipping containers will enable the recovery of lost goods. Metallic object detection may also assist with search and rescue of missing submersibles, some of which may have humans onboard.

Sensor technologies have been used to detect metallic objects in seawater for several decades, which includes side scan sonar, synthetic aperture sonar (SAS), lidar, magnetometers, and EMI systems. Side scan sonar has a small aperture and uses high frequencies to achieve high resolution [3], which is useful for proud (sitting on the seabed) objects but cannot penetrate the seabed and therefore is unable to detect buried metallic objects [4]. SAS uses low frequencies and has high resolution due to a synthetic aperture that is much longer than the physical aperture [5]. Therefore, SAS can penetrate the seabed and localize buried metallic objects, an example being the Buried Object Scanning Sonar [6] that was able to detect a 36-cm-diameter stainless steel sphere filled with silicone oil buried just below the seabed surface. While SAS can detect some buried objects, it is not used to classify an object's material composition. Lidar of green-blue frequencies can penetrate seawater [7] but struggles with turbidity [8], which significantly decreases its range. Lidar cannot penetrate the seabed due to the skin effect and therefore is unsuitable for detection of buried metallic objects. Magnetometers are passive and may be used to detect metallic objects both proud and buried [9] but can only detect permeable objects such as steel, and are unable to detect non-permeable objects such as aluminum. EMI sensors have the advantage over other underwater sensors of being able to classify both steel (permeable) and aluminum (non-permeable) objects, both proud and buried [10]. Although EMI sensing has some advantages over other sensors, the unique nature of electromagnetic wave propagation in seawater presents difficulties.

Seawater is a dispersive medium and severely attenuates high frequency electromagnetic waves. This inhibits the ability to map, with high fidelity, complex contours of objects at range as typically seen with ground penetrating radar [11]. This motivates the use of low frequency electromagnetic waves in this paper due to greater propagation distance in seawater. Low frequency electromagnetic waves have the additional advantage of deeply penetrating metallic objects.

Metallic object detection of buried objects using EMI methods in terrestrial environments often use an 'air' approximation as conductivity effects are negligible. This assumption reduces the modeling complexity of metallic objects buried in the ground. However, it is incorrect to assume seawater may be approximated to have the electromagnetic properties of air. The conductivity of seawater means the skin effect and dispersion significantly alter the electromagnetic scattered response of a metallic object, increasing difficulty of classification. In this paper, these effects are incorporated in a seawater model to accurately account for electromagnetic-environment interactions.

EMI metallic object classification is typically done via inversion of the magnetic polarizability tensor (MPT) [12], but is computationally onerous [13] and therefore limits or prohibits real-time object detection and classification. Alternatively, machine learning methods, although taking considerable time to train a model, offer real-time metallic object detection and classification [14]. This is important for underwater robotic sensors that often have limited communication bandwidth with any surface-based or shore-based control system.

Machine learning algorithms have become widespread in object classification [15] since the introduction of AlexNet (a type of CNN) in 2012 [16]. CNNs have several advantages over alternative machine learning algorithms. CNNs are capable of automated feature extraction, which removes the need for expert human operators, thereby improving productivity by reducing costs. Additionally, the features do not have to be known in advance. Rather, the convolution process is able to find the most important features used for classification by automatically optimizing filter weights [17]. CNNs have the disadvantage of often requiring millions of weights whereby identifying the significance of weights is challenging, although plotting feature maps offers some insight [18]. Often data sets must be large to obtain high classification accuracy. Despite the difficulty in obtaining large data sets, good progress has recently been made in using machine learning for metallic object classification using EMI sensing.

Research in EMI sensing systems has primarily been conducted in the terrestrial environment, with limited investigation in seawater. Of the research conducted in seawater, most researchers use a Heaviside step-off voltage in the transmitter coil [10, 19]. A step-off voltage is used as metallic objects have characteristic eddy current decay times [20, 21], which are a unique signature and may be used for object classification [22]. The step-off voltage, while widely used, does have some limitations. It has a fixed frequency spectrum and undefined energy, which means it lacks the flexibility required to adjust the pulse according to the properties of the environment and the metallic object. There is capacity for investigation of optimal waveform types other than a step-off voltage, which have controllable frequency spectra and defined energy. Additionally, to date there is currently no research on machine learning on EMI systems in seawater.

Although much research has been done on land-based EMI classification there are gaps in the literature, especially for seawater-based systems. This paper expands the literature through the following points:

  • (1)  
    Investigating the propagation of electromagnetic pulses in seawater. Most of the pulses covered in this paper have been used in air and soil but not in seawater. In the literature, the pulses used in seawater are the Heaviside step-off, Gaussian and modulated Gaussian for detection. This paper expands the literature by investigating the use of the two-sided decaying exponential, Heaviside step-off, triangular, Gaussian, rectangular, modulated Gaussian, raised cosine, and rectangular down-chirp pulses in seawater for metallic object classification.
  • (2)  
    Metallic object classification using EMI methods has primarily focused on the inversion of the MPT, which requires a physics-based model. While a widely utilized technique, inversion has the disadvantage of being computationally onerous and limits real-time classification. This paper uses raw EMI data and machine learning without the use of a physics-based model. This has the advantage of metallic object classification in real-time.
  • (3)  
    This paper uses the early time signal ( 0.1 ms) in addition to the intermediate time signal for classification. Other work typically excludes the early time signal and uses the intermediate and late time signal. The early time signal may improve classification accuracy as it contains high frequency effects that are typically associated with the shape of the metallic object being irradiated.
  • (4)  
    There is currently no research in the literature on the use of machine learning to classify metallic objects in seawater based on EMI sensing, which is the focus of this paper.

The paper is structured as follows. Section 2 contains the literature review. Section 3 introduces the theory of a point magnetic dipole above a permeable conducting sphere in an infinite conducting medium, RL and RLC circuits, pulse energy, electromagnetic pulse propagation in seawater, the numerical experiment, the Ansys® [23] software program model, the data set, and the structure of the machine learning algorithms. Section 4 contains the experimental results and discussion and section 5 concludes the work and provides recommendations.

2. Literature review

The detection of metallic objects using electromagnetic pulses is dependent on the object response function, the environment and pulse characteristics. The object response function is the scattered magnetic field produced by induced eddy currents in the metallic object and has a real part called the 'in-phase' and an imaginary part called the 'quadrature'. These are a function of the object's size, shape, conductivity and permeability [24] and are used for classification purposes [25, 26]. The environment surrounding the metallic object also influences the scattering response with significant differences between air and seawater. A magnetic dipole near a metallic sphere in air induces an eddy current response (ECR). In seawater there is also a current channeling response (CCR) [27, 28] due to perturbation of the induced currents flowing in the seawater when they encounter the sphere, which creates an electric dipole. The environment also changes pulse propagation, resulting in pulse distortion due to dispersion and frequency dependent attenuation. These effects make the choice of pulse type important, with the energy and frequency spectrum of a pulse influencing propagation distance and distortion. A pulse with frequency spectra side lobes extending to infinity, such as a rectangular pulse, undergoes pulse distortion moreso than a pulse with no side lobes, such as a Gaussian pulse, and this impacts detection and classification accuracy. Much of the research using electromagnetic pulses for metallic object detection has focused on the Heaviside step-off pulse, with limited research using other pulses including rectangular, triangular, sinc, Gaussian, modulated Gaussian, and linear up-chirp. These previous studies were primarily conducted in air and low conductivity soils.

2.1. Pulse types used in EMI sensing

The Heaviside step-off pulse is an idealized pulse that, in practice, is achieved through alternative methods such as a voltage step-on and linear ramp-down pulse [29] as realized in a Geonics EM63 [30] sensor. The voltage step-on results in an exponential ramp-up in current due to the inductance of the RL circuit transmitter coil. The Geonics EM63 typically ignores early time data and measures the time domain response from 0.18 ms onward [29]. A Heaviside step-off pulse and the resulting decay of the magnetic field of a conducting object has been used extensively for classification of metallic objects [21, 31, 32]. Classification is typically done via the inversion of the MPT from decay data [33]. The step-off pulse is widely used in analytical analysis and includes the step-off current response of a conducting permeable sphere in a dipolar field [34] with applications in geophysics such as detecting massive sulfide ore bodies.

The switching on and off of a current in a loop, which is a rectangular pulse and is known as the time domain electromagnetic method [35], has been used for detection of metallic spheroid-like objects in soil [33]. The time domain EMI response of aluminum and steel spheres below a receiver coil and transmitter coil and excited by a rectangular waveform was investigated in [36]. In other work, the pulsed induction method, which uses a rectangular voltage pulse, was used to detect buried subsea pipelines and cables [37].

An EMI sensing system driven by a triangular current wave was suggested by West [38] and has a theoretical advantage of emphasizing highly conducting targets buried in a less conductive environment. A continuous triangular waveform was used to detect metallic objects in the time domain [39] and clearly distinguished between ferrous and non-ferrous objects. The triangular waveform is theoretically equivalent to an integration of the voltage measured by a conventional EMI system that utilizes a rapid current turn-off in a transmitter loop [40], which measures the target step response (derivative of a triangle pulse) rather than the more common impulse response (derivative of a step-off pulse) [41]. Use of the triangular waveform results in much smaller early-time voltages induced in the receiver loop, thereby reducing the dynamic range demands of the receiver.

The raised cosine pulse has been suggested as a bandwidth and power efficient modulation technique for satellite communications where narrow signal spectra with power concentrated within a given bandwidth help reduce intersymbol interference [42]. Intersymbol interference occurs when one pulse distorts in time and interferes with neighboring pulses, increasing noise [43]. Electromagnetic wave propagation in seawater has a non-linear, frequency dependent phase velocity [44] that may result in intersymbol interference, and the raised cosine pulse is investigated in this paper as a means to reduce this unwanted effect. A raised cosine pulse becomes a sinc pulse, with a corresponding brick-wall filter in the frequency domain [45], when the roll-off factor is equal to zero. A sinc function was used as a driving voltage in an eddy current metal detector to classify metallic spheres [46, 47]. However, an ideal sinc pulse is not realizable in practice as the required brick-wall filter has infinite impulse response in both positive and negative time directions. In this work the sinc pulse (unrealizable in practice) used in [46] was replaced by a raised-cosine pulse (realizable in practice).

A linear up-chirp from 1 to 45 kHz was used to distinguish between spheres of different metals and sizes in air [48]. The spheres were ferromagnetic steel UNI 100Cr6 (diameters 10, 15 and 25 mm), stainless steel AISI 420 (diameters 10, 20 and 25 mm) and bronze (diameters 12 and 20 mm). The spheres were distinguishable through use of a polar plot of amplitude and phase. An analytical solution of the response function of an homogeneous sphere and a finite element numerical simulation was derived in [49]. The frequency response using a 1–45 kHz up-chirp was computed for a steel and copper sphere of 20 mm diameter and a significant difference between the spheres was observed on a polar plot. A 3–30 kHz up-chirp of 10 ms duration was used to drive an induction balance metal detector, and discrimination was possible between aluminum and AISI52100 alloy objects using a polar plot of chirp phase and amplitude [50]. Chirped systems are also used for buried object detection in ground penetrating radar systems [51], where increased dynamic range and accurate control of the frequency spectrum of electromagnetic waves is required [52].

A Gaussian pulse current excitation of a horizontal electric dipole on the interface of air and fresh water was investigated in [53]. The generated electric and magnetic fields travel upward into the air, downwards into the fresh water, and horizontally along the boundary as a surface wave. This work was expanded by replacing the fresh water with seawater and determined that a metal cylinder in seawater could be detected with a Gaussian pulse [54]. It was found that a Gaussian pulse was well suited to propagation in seawater, where exponential attenuation increases with frequency, due to the absence of frequency side lobes extending to infinity. A modulated Gaussian pulse with full width at half maximum of 1 s and a carrier frequency of fb = 25 Hz was also investigated and it was found through numerical simulation that a metal cylinder in seawater may be detected. Gaussian pulses have also been used in ground penetrating radar applications with broadband frequency content [52] and are known as impulse radar systems. A high power, ultra-wideband Gaussian pulse transmitter was used for sensing and imaging of buried objects [55]. However, the Gaussian pulses used in ground penetrating radar are typically high frequency and therefore are unsuitable for the seawater environment.

Two-sided decaying exponential has had limited use as an electromagnetic sensing pulse but has been used to describe natural phenomena including intermittent fluctuations in the boundary of a magnetically confined plasma [56], the pulse shape required for the energy of a laser pulse to be fully absorbed by an optical resonator [57], and the temporal profile of pulses in bright gamma-ray bursts [58].

All above pulse types, used with machine learning, are potentially able to detect and classify metallic objects in seawater and will be compared in this work.

2.2. Electromagnetic pulse propagation in seawater

As mentioned, the environment surrounding a metallic object influences the propagation of an electromagnetic pulse, and therefore influences the ability of a remote electromagnetic sensing system to detect and classify a metallic object. Understanding of pulse distortion in seawater is important for determining the optimal pulse for detection and classification purposes. There is a long history of analysis, both theoretical and experimental, of electromagnetic pulse propagation in dispersive and attenuating media. In 1914 the theory of electromagnetic pulse propagation in a linear, causally dispersive medium was presented by Sommerfeld [59] and Brillouin [60]. A Heaviside step-on signal modulated by a constant carrier frequency propagating in a single resonance Lorentz [61] medium resulted in pulse distortion. The propagation of a transient electromagnetic pulse was studied theoretically in 1953 by Wait [62], where a magnetic dipole in seawater was energized by a step function current.

2.3. Statistical classification of metallic objects in seawater using EMI sensing

Detection of metallic objects in seawater using EMI sensing has its beginning in analytical modeling. An analytical model of the electromagnetic scattering response of a permeable conducting sphere in an infinite conductor near a point magnetic dipole was first formulated several decades ago [63, 64], and originally had applications in geophysics for the detection of buried ore bodies [65]. This work was extended in [27], which presented the mathematics of the EMI response of perfectly conducting spheres and spheroids, and a nonconducting target, immersed in a weakly conducting medium such as seawater. It was shown that in seawater there is both an ECR and CCR. Although useful, this work was limited to fixed shapes.

The case of an arbitrarily shaped metallic object in infinite seawater was treated numerically using The Method of Auxiliary Sources [19], and researchers deduced EMI detection techniques used in air would hold in seawater. The infinite medium used in previous work, while useful, does not account for boundary transitions between differing electromagnetic materials that affects electromagnetic wave propagation. A layered model, which included the conductivity of air, seawater and the sea floor with an embedded metallic object, was investigated in [10]. Detection of a metallic object in the layered medium using EMI methods was considered feasible. In subsequent work, the diffusion of time domain EMI signals in the marine environment with layers including air-sea and sea-sediment boundaries was investigated [66, 67].

Researchers have also used finite element models to verify analytical models and experimental results. Conducting objects in air were detected and characterized using EMI and a fluxgate magnetometer [68], with experimental results found to be in good agreement with a COMSOL Multiphysics® [69] software program model. An analytical model of the magnetic flux density of a permeable spherical shell with an internal current band in Earth's magnetic field was found to be in good agreement with a finite element solution utilizing COMSOL Multiphysics® [70].

In addition to analytical models, experiments on the electromagnetic response of metallic objects in seawater have been conducted. The effects of seawater on the EMI response of a 3 inch diameter stainless steel sphere and a steel and aluminum pipe [71] using a GEM-3 [72] sensor was measured from the side of a wooden pier. CCR effects were only apparent at greater than 10 kHz. Metallic object classification was done using spectral matching of a metallic object data library collected in air. An underwater GEM-3 sensor was also used in [73] to detect a 12 inch hollow stainless steel sphere suspended in seawater.

The influence of salt water (σ = 4.8 S m−1) on the physics required for processing underwater EMI signals was conducted by [74]. The experiment involved a transmitter and receiver coil above and below, respectively, a 3 m diameter, 60 cm height pool filled with salt water and showed that conducting environments distort both the primary and secondary magnetic fields at early times corresponding to high frequencies. Signal distortion was found to be a function of separation distances between the metallic spheroid used as a target and the EMI coils. This work was extended in [75] with analysis in the very early time of the EMI response ( 0.1 ms) where salt water effects were shown to be large.

Classification, rather than detection, of metallic objects typically depends upon EMI data inversion. A nonlinear least squares method estimates an obscured object's location and MPT from EMI and sensor positional data. The MPT can then be matched to a library [22] for classification. A magnetic tensor sensor for gradient-based localization of ferrous objects in a geomagnetic field was demonstrated in [76]. The use of planar coil EMI sensors have successfully detected obscured objects using inversion modeling [77]. The inversion method has the advantage of inherently estimating location and orientation information in addition to classification. As mentioned previously, a disadvantage of MPT inversion is that it is iterative and time consuming [78], which limits its use in a real-time classification system. An alternative to inversion, which promises to enable real-time classification, is machine learning.

2.4. Machine learning classification of metallic objects using EMI sensing

Machine learning methods often require significant compute time for training of a network, however, once training has finalized classification of new metallic objects is quickly completed. This makes machine learning an alternative to physics-based EMI data inversion for the purpose of metallic object classification in a real-time system.

A 1D CNN was used on EMI data, including metallic spheres, cuboids, and cylinders, to classify time-domain MPT features [79]. Multiclass and binary 'threat or non-threat' classification of the objects achieved 98% accuracy (with zero false negatives), and was trained on simulations and tested on measurements. A 1D CNN was used to estimate the depth of metallic objects using a pulse induction metal detector [80, 81]. In [82], linear (perceptron, multiclass logistic regression) and nonlinear (neural network, 1D CNN, 2D CNN) machine learning algorithms were used to classify metallic objects using EMI data generated by a Heaviside step-off pulse. Machine learning algorithms and electromagnetic pulses including a two-sided decaying exponential, Gaussian, triangular, raised cosine, rectangular, and rectangular chirp, were used to classify metallic objects [83]. Thirty-three classification strategies based on eleven dimensionality reduction methods were used on simulated data of the time domain decay of metallic ellipsoids nearby an EMI system [84]. Accuracy of 99% for material-based and shape-based classification was achieved. A neural network was used to identify foreign metal objects near wireless power transfer systems [85]. A support vector machine was used to detect buried metallic objects [86].

Machine learning algorithms have also been applied to engineering problems. Corrosion and coating defect assessment of coal handling and preparation plants using an ensemble of deep convolutional neural networks and decision-level data fusion was presented in [87]. Compressive strength evaluation of cement-based materials in a sulfate environment using optimized deep learning technology was covered in [88]. In these studies, it was shown that an advantage of using machine learning techniques versus physical methods is improved prediction accuracy.

EMI methods typically utilize a Heaviside step-off pulse. However, other pulse types have been investigated along with machine learning classification, such as the use of a triangle waveform in [89]. Field observations were used to train a self-organizing map (a type of unsupervised artificial neural network), and target parameters including depth, azimuth, inclination, object type and weight were determined by iterative minimization of topographic error vectors. Target classification was achieved through evaluation of histograms of the estimated parameter.

In other work, non-probabilistic (support vector machines, random forests, decision trees) and probabilistic (gradient boost, multi-layer perceptron, logistic regression) machine learning algorithms were trained to classify metal objects using a dictionary of MPT spectral signatures in the frequency domain [90]. Authors found that the best performing probabilistic classifier was the gradient boost algorithm for a data set of coins and a data set of 'threat and non-threat' objects.

Table 1 contains a subset list of EMI sensing papers covering pulse types, environment, and machine learning.

Table 1. EMI sensing papers covering pulse types, environment, and machine learning.

PaperMethodPulse typeEnvironmentSensorObjectsClassificationAccuracy (%)
Das [36]Analytical and experimentRectangularAirCo-located circular coilsMetallic spheresDetection by human interpretationN/A
Svatoš [48]ExperimentLinear up-chirpAirATMID metal detectorMetallic spheresClassification by human interpretationN/A
Svatoš [46]ExperimentSincAirATMID metal detectorMetallic spheresClassification with SVM and Naive Bayes86–100
Šimić et al [79]ExperimentRectangular with 50 μs widthAirPulse induction MD VMF4Metallic spheres, prisms, rodsClassification using 1D CNN on MPT data98
Wilson et al [90]ExperimentFrequency sweepAirSimulated EMI sensorMetallic threat objectsClassification by machine learning90–100
Wan et al [84]NumericalStep-offAirSimulated EMI sensorMetallic spheroidsClassification using machine learning models60–100
Wright et al [41]ExperimentTriangularSoilALLTEM metal detectorMetallic spheroid-likeDetection by inversionN/A
Pasion [29]ExperimentStep-offSoilGeonics EM63Metallic spheroid-likeDetection by inversionN/A
Friedel et al [89]ExperimentTriangularSoilALLTEM metal detectorMetallic spheroidsClassification by unsupervised neural network95
King et al [54]AnalyticalGaussian and modulated GaussianAir-seawaterHorizontal electric dipoleMetallic cylinderDetection by human interpretationN/A
Bell et al [91]ExperimentRectangular with 25 ms widthSeawaterMultiple EMI coilsMetallic spheres, cylinders, spheroidsDetection by human interpretationN/A
Shubitidze [19]Analytical and numericalFrequency sweepSeawaterTEMTADS sensor arrayMetallic spheres and spheroid-likeDetection by inversionN/A
Billings and Song [10]Numerical and experimentRectangular with 20 ms width estimating step-offSeawaterGap LPTX-25Metallic spheroid-likeDetection by inversionN/A
San Filipo et al [71]ExperimentFrequency sweepSeawaterGEM-3Metallic spheres and rodsClassification by library spectral matching90–98

While machine learning methods have been applied to the classification of metallic objects using EMI data, there remains the challenge of understanding the salient features of the data used by a CNN. The black-box nature of CNNs may be unraveled through plotting of the weights of the convolution and pooling layers, and this was first done for 2D CNNs [92]. In [18] the input pattern that originally caused a given activation in the feature maps was displayed by mapping network activities back to the input pixel space using a deconvolutional network. Early feature map visualization was conducted on the ImageNet [93] data set, but recent research has focused on the signal processing domain. In [94], the learned features of a 2D CNN were visualized for 2D B-scans of GPR data. Feature extraction and visualization has also been applied to 1D CNNs on time domain data, including fault detection and diagnosis of industrial processes [95], spectroscopy analysis [96] and analog circuit fault diagnosis [97].

In this paper machine learning algorithms are used to classify permeable conducting spheres below a magnetic dipole. Square-integrable pulses including a two-sided decaying exponential, Gaussian, modulated Gaussian, triangular, rectangular, raised cosine, and rectangular down-chirp were investigated. Additionally, a step-on and linear down ramp pulse is investigated as a realizable Heaviside step-off pulse. An analytical model is used to generate frequency domain EMI data of metallic spheres, which was multiplied by a pulse and converted into the time domain and additive white Gaussian noise was added. The analytical model is verified against an Ansys® finite element model. The time domain electromagnetic signal is analyzed using 1D machine learning techniques, and is also converted into a spectrogram and classified using a 2D CNN. The merit of using a 2D spectrogram instead of a 1D time domain signal is the ability to use a 2D CNN with plotted feature maps offering explainability of the salient features of the signal used by the classification algorithm. The spectrogram also incorporates both frequency and time components into a single 2D structure.

3. Method

3.1. Radial magnetic dipole in seawater above a permeable conducting sphere

A diagram of a radial magnetic dipole above a permeable conducting sphere, which was used as the analytical model in this work, is shown in figure 1. A magnetic dipole with magnetic moment m is at position (x = 0, y = 0, z = h) and is oriented in the z-direction. The sphere has radius a, conductivity σ2, permeability µ2, and permittivity ε2. The infinite medium surrounding the sphere has conductivity σ1, permeability µ1, and permittivity ε1. The receiver coil is located at (r, θ, φ) where r is the radial distance from the origin, θ is the polar angle to the receiver coil, and φ is the azimuthal angle to the receiver coil.

Figure 1. Refer to the following caption and surrounding text.

Figure 1. Simulation setup used to calculate the analytical solution. Electromagnetic pulses are generated by a magnetic dipole with magnetic moment m located at height h from the center of the sphere. The sphere is solid, with radius a and has electromagnetic properties denoted by σ2 for the conductivity, µ2 for the permeability, and ε2 for the permittivity. The receiver coil detects the scattered electromagnetic pulse and is located at point (r, θ, φ) in the medium of either air or seawater with electromagnetic properties denoted by σ1 for the conductivity, µ1 for the permeability, and ε1 for the permittivity.

Standard image High-resolution image

The scattered magnetic flux density of a radial dipole above a permeable conducting sphere is [64, 65, 98]

where n is the multipole number, Pn is the Legendre polynomial, is the associated Legendre polynomial, m =  is the magnetic moment, N is the number of turns of wire loop, A is the area of the magnetic dipole wire loop and is the frequency dependent current. A sufficiently small loop, at a distance, approximates a point magnetic dipole. The magnetic flux density in the z-direction is found using the vector spherical to Cartesian coordinates transformation . The negative frequency values are given by  = .

The response function is:

where k1 is the complex wavenumber of the medium, k2 is the complex wavenumber of the sphere, jn is the spherical Bessel function, is the Riccati–Bessel function of the first kind of order n, means ,  = , is the spherical Hankel function, and means , U = , and under the quasi-static assumption  = , C2 = . Justification of the quasi-static assumption is explained in section 3.8. Equation (1) was validated in [99] where the theoretical model of March [64] was found to be in good agreement with an experiment using the GEM-3 [72] EMI sensor near a metallic sphere in seawater. In addition to studying scattering in seawater, scattering in air was also studied in this work.

3.2. Radial magnetic dipole in air above a permeable conducting sphere

The magnetic flux density for a radial magnetic dipole in air above a permeable conducting sphere is also given by equation (1). The response function may be simplified using the long wavelength approximation, which requires that displacement currents be neglected (σ2), the sphere has a radius much less than a wavelength ( 1), the dipole is located much less than a wavelength from the sphere ( 1), and the receiver coil is located much less than a wavelength from the sphere ( 1) [98]. These requirements are met if the infinite medium is air with σ1 = 0 S m−1,  = 1,  = 1.

Using the long wavelength approximation, the response function is

This simplification cannot be used in the seawater case as the wavelength of the electromagnetic wave generated by the magnetic dipole in seawater is comparable to the diameter of the permeable conducting spheres used in this work. Figure 2 shows the scattered magnetic flux density external to a D6ac steel sphere for a magnetic dipole at z = 1.5 m, R = 0.5 m, f = 100 kHz in air and seawater. The electromagnetic wavelength is much shorter in seawater compared to air at f = 100 kHz, and hence scattering in seawater is wave-like. The propagation of an electromagnetic pulse in seawater with no sphere present is used in this work to understand pulse distortion effects.

Figure 2. Refer to the following caption and surrounding text.

Figure 2. Scattered magnetic flux density, 20  of a D6ac steel sphere with radius R = 0.5 m generated by an oscillating current with frequency f = 100 kHz in a magnetic dipole at h = 1.5 m in air and seawater. The scattering in seawater is wave-like due to the significantly shorter wavelength of the electromagnetic wave in seawater (λ = 5 m) compared to air (λ = 3 km).

Standard image High-resolution image

3.3. Radial magnetic dipole in an infinite conducting medium

An electromagnetic pulse in seawater undergoes attenuation and dispersion, which is outlined in section 3.8. Studying dispersion and frequency dependent attenuation, in the absence of a metallic sphere, provides insight into the propagation distortion of an electromagnetic pulse in seawater. This distortion influences the classification accuracy of a machine learning algorithm.

The magnetic flux density in the z-direction of a magnetic dipole in an infinite conducting medium with no sphere present is [98]

Once an analytical model for electromagnetic pulse scattering is established, the properties of an electromagnetic sensing system must be understood.

3.4. Series RL circuit

The magnetic dipole in this work was modeled as an RL circuit and consists of a resistor and inductor in series. The series RL circuit driven by a voltage function has current

where is the time delayed voltage input pulse, t0 is the time delay, R is the resistance, and L is the inductance of the RL circuit. The series RL circuit is a low pass filter and is therefore useful for filtering high frequencies out of a signal. The cutoff frequency , defined as the frequency where power has reduced to half, is

The cutoff frequency of an RL circuit is

The approximate inductance L of an N-turn air core loop is [98]

where c is the radius of the winding, and b is the radius of the wire.

The resistance of the loop is

where ρ is the resistivity of the wire material.

In this work a single loop with N = 1, c = 10 mm, b = 0.5 mm, wire ρ = 26.9 nΩm was used as the transmitter coil in the RL circuit and the receiver coil in the RLC circuit. This was done to approximately match the dimensions of a transmitter coil created using Ansys®, which was used to validate the analytical model. These parameters substituted into equations (8) and (9) resulted in L = 41.79 nH and Rc = 2.16 . The small radius allows for a close approximation in Ansys®, where it is not possible to model a point magnetic dipole, to the point magnetic dipole analytical model. A cutoff frequency of 90 kHz was chosen as a compromise between minimizing pulse distortion and the frequency bandwidth required to calculate time domain signals. The total resistance of the coil was set to 23.6 mΩ to achieve this cutoff frequency, which can be achieved in practice by adding a resistor in series with the coil. The time constant of the RL circuit is 1.77 μs.

3.5. Electromagnetic pulse

A time-dependent electromagnetic pulse f(t) may be regarded as a superposition of waves of many frequencies [100]. The electrical system used to generate a pulse, such as a series RL circuit, has its own transfer function that alters the electromagnetic pulse.

A time domain pulse is obtained by convolution via the frequency domain, which involves an inverse Fourier transform of an electromagnetic pulse multiplied by the response function of the model. This method is justified as the system used to generate the pulse, and the pulse interaction with the metallic objects, respond linearly within the operating range of the experiments in this paper.

In this work each pulse had parameters set to obtain with the exception of the rectangular down-chirp pulse that had its frequency linearly decrease between , and the modulated Gaussian with  = 1.39 kHz. A total of 8 electromagnetic pulses, each with different time and frequency domain characteristics, were explored to determine the pulses with the best sphere classification accuracy.

A mathematical description of the pulses used in this work may be found in the appendix. The parameters of the pulses used in this work are shown in table 2. After determining the pulse types, energy must be considered and set equal amongst pulse types to avoid biasing the data collection process.

Table 2. Electromagnetic pulse parameters.

PulseParameters
2SDEVm = 386.1 V, τ = 47.3 μs
Step-offVm = 186.6 V, Ta = 20 ms, Tb = 50 μs
TriangularVm = 267.8 V, d = 212.6 μs
GaussianVm = 254.8 V, τ = 104.0 μs
RectangularVm = 186.6 V, d = 147.6 μs
Modu. GaussianVm = 246.15 V, τ = 224.0 μs, fb = 6.7 kHz
Raised cosineVm = 220.7 V, γ = 130.4 μs, ξ = 0.8
Down-chirpVm = 75.215 V, T = 2 ms, f0 = 3 kHz, f1 = 0 Hz

3.6. Energy

The energy of an electromagnetic pulse in a transmitter coil circuit is [101]

where v(t) is the voltage, i(t) is the current, is the power, is the real part of the admittance, is the Fourier transform of v(t), and is the energy spectral density, and the admittance of the series RL circuit is .

All pulses had voltage amplitude set to ensure energy of 214.9 J, equal to the energy of a rectangular down-chirp pulse with parameters given in table 2. The rectangular down-chirp voltage 75.22 V was chosen to obtain magnetic moment m ≈ 1 m2A at the point of maximum current in the RL circuit. The width of the pulses ensures that the majority of the energy is contained below 3 kHz, with the exception of the modulated Gaussian pulse. This enables deep penetration into metallic objects and seawater, which is not possible at higher frequencies due to the skin effect. The transmitter coil has now been covered and consideration of the receiver coil is undertaken.

3.7. Magnetic induction sensor

The receiver coil in this work was modeled as a magnetic induction sensor, which may be constructed by connecting a receiving loop of wire to a preamplifier [98]. The equivalent circuit is given in figure 3. This parallel RLC circuit contains a resistor, inductor, and capacitor, and the voltage measured across Ra is a low pass filter. The voltage induced in the coil is [98]

Figure 3. Refer to the following caption and surrounding text.

Figure 3. Circuit diagram of the magnetic induction sensor used in this work, which is a receiving loop connected to a preamplifier and is modeled as a parallel RLC circuit. The inductance, resistance, and capacitance of the loop are L, Rc, and Cc. The voltage induced by magnetic flux through the loop is Vi. The combined capacitance of the preamplifier input and the wire connecting the loop and preamplifier is CL. Ra is the input resistance of the preamplifier. The loop may be tuned using the added capacitance CT.

Standard image High-resolution image

where N is the number of turns in the coil and A is the area of the wire loop.

Combining all the capacitors into an equivalent capacitor C, the ratio of the voltage at the input of the preamplifier Va to the induced voltage Vi is

In this work the receiver coil has the same parameters as the transmitter coil with L = 41.79 nH and Rc = 2.16 . Additional parameters are Ra = 7.9 mΩ and C = 80.827 μF. These parameters resulted in a cutoff frequency of  = 30 kHz, which reduced the frequency range required for an inverse fast Fourier transform (IFFT). In seawater, the attenuation of an electromagnetic wave at higher frequencies naturally reduces the frequency range required for an IFFT. In air, this is not the case as the induction limit remains constant at high frequencies. The cutoff frequency of  = 30 kHz ensured the air and seawater time domain signals could be calculated and compared using an IFFT.

The time domain voltage measured across Ra for a time delayed electromagnetic pulse in a magnetic dipole detected by a magnetic induction sensor oriented in the z-direction is

Both the transmitter and receiver coil properties have been covered. A detailed understanding of pulse propagation in seawater is necessary to understand pulse scattering between the transmitter and receiver.

3.8. Pulse propagation

Pulse propagation of an electromagnetic wave is determined by the frequency dependent wavenumber [102]

In the quasi-static regime the conductivity dominates the propagation properties of the electromagnetic wave as and therefore . In seawater, for the frequencies used in this work, the quasi-static approximation is valid.

In the quasi-static regime the approximate wavelength of an electromagnetic plane wave is

the attenuation coefficient is

the skin depth is , and the phase velocity is

As the electromagnetic wave frequency increases, the speed of the wave in the medium also increases.

Dispersion of an electromagnetic pulse occurs when pure plane waves of different wavelengths have different phase velocities, so that a wave packet of mixed wavelengths tends to spread out in space. A time domain electromagnetic pulse in seawater undergoes broadening, delay and attenuation due to the frequency dependent phase velocity and attenuation.

An electromagnetic plane wave attenuates exponentially in seawater [103], described by

where Am is the wave amplitude and l is propagation distance. The power ratios are

where is the initial power and is the power at distance l. In decibels

and dividing through by l (meters) gives the attenuation rate as 8.69 α dB/m.

Figure 4 shows (a) attenuation rate and (b) phase velocity of electromagnetic wave propagation in seawater. Attenuation increases significantly with frequency and limits any seawater based electromagnetic system to low frequencies at large distances. The frequency dependent velocity creates dispersion, and frequency dependent attenuation causes attenuation distortion, of an electromagnetic pulse in seawater, whereby the pulse shape broadens with propagation distance.

Figure 4. Refer to the following caption and surrounding text.

Figure 4. (a) Attenuation rate and (b) phase velocity of electromagnetic wave propagation in seawater. Per (a), attenuation increases significantly with frequency and limits any seawater based electromagnetic system to low frequencies at large distances. Electromagnetic pulse distortion occurs due to the frequency dependent attenuation of (a) and the frequency dependent velocity (dispersion) of (b) in seawater.

Standard image High-resolution image

Figure 5 shows pulse propagation, in terms of the magnetic flux density Bz, in air calculated using Ansys® (light blue line), in seawater calculated using equation (4) (solid line), and in seawater calculated using Ansys® (circle markers). The inset plot is the magnitude of the Fourier transform of the current in the transmitter coil, which is modeled as an RL circuit. The Ansys® setup used to calculate the magnetic flux density is described in section 3.11. There is good agreement between the analytical model and the Ansys® model for pulse propagation in seawater. The pulses are (a) two-sided decaying exponential, (b) Heaviside step-off, (c) triangular, (d) Gaussian, (e) rectangular, (f) modulated Gaussian, (g) raised cosine, (h) rectangular down-chirp.

Figure 5. Refer to the following caption and surrounding text.

Figure 5. Propagation of electromagnetic pulses of equal energy generated by an RL circuit in air and seawater, from the origin (y = 0 m, z = 0 m) to (y = 0.3 m, z = −6.7073 m) using equation (4) showing the magnetic flux density Bz. In each plot, the light blue line (Air) is the pulse in air calculated using Ansys®, the solid line (Seawater(m)) is the pulse in seawater calculated using the analytical model, and the circle markers (Seawater) plot the pulse in seawater calculated using Ansys®. The inset plot is the magnitude of the Fourier transform of the current in the RL circuit. The pulses are (a) two-sided decaying exponential, (b) Heaviside step-off, (c) triangular, (d) Gaussian, (e) rectangular, (f) modulated Gaussian, (g) raised cosine, (h) rectangular down-chirp.

Standard image High-resolution image

The propagation distance used in figure 5 is from the transmitter coil to the center of the sphere (3.5 m) plus the distance from the center of the sphere to the receiver coil (3.214 m), which equals 6.714 m. The same method was used by King to analyze the distortion of a rectangular pulse propagating in seawater [104]. Assuming that the radial magnetic dipole is located at the origin (y = 0 m, z = 0 m), the receiver coil must be located at (y = 0.3 m, z = −6.7073 m). In seawater, after propagating 6.714 m a 1 Hz electromagnetic plane wave has amplitude equal to 97.6% of its original value (caused by attenuation due to seawater while excluding spherical spreading), and a 3 kHz wave has amplitude equal to 26.5% of its original value. A 1 Hz wave travels at speed 1.74 km s−1, and a 3 kHz wave travels at speed 95.35 km s−1 in seawater.

The pulses in air and seawater undergo equal spherical spreading loss as they propagate the same distance. The pulses in air do not undergo distortion and only decrease in amplitude due to spherical spreading loss. The pulses in seawater undergo dispersion and frequency dependent attenuation, in addition to spherical spreading loss, which results in a decrease of the peak amplitude and a delay and spreading of the pulse to the right of the plots (to later times) relative to pulse propagation in air.

The frequency spectra of the pulses dictate the scattering response of metallic objects and have some differences including the Heaviside step-off having most of the energy concentrated close to 0 Hz, the rectangular pulse having side lobes extending to infinity, the raised cosine having very small side lobes, and the modulated Gaussian having frequency spectrum energy concentrated around 6.70 kHz.

A pulse with a sudden voltage change in time, such as the rectangular pulse with side lobes extending to infinity, experiences significant distortion due to dispersion and frequency dependent attenuation across a broad range of frequencies. This is evident in figure 5(e), where the rectangular pulse has distorted and appears somewhat like an ocean wave. In contrast, although the Gaussian pulse in figure 5(d) has a decreased amplitude and delay relative to the same pulse in air, the shape of the pulse still appears approximately Gaussian. It is evident from figure 5 that seawater distorts an electromagnetic pulse over small distances and must be accounted for by a machine learning classification algorithm.

Once propagation effects have been established, the metallic objects and the environment must be characterized.

3.9. Metallic objects and the infinite medium

In this work the electromagnetic properties of the seawater used as the infinite medium are , and σ1 = 4 S m−1. When the infinite medium is air , and , where µ0 is vacuum permeability, and ε0 is vacuum permittivity.

The electromagnetic properties of the objects used in this study are given in table 3. Two radius values of 40 cm and 50 cm were chosen to determine the ability of the classifier to distinguish between different sized objects with the same electromagnetic properties. Nickel and D6ac steel have distinct eddy current responses compared to 304L steel and aluminum due to their large permeability. This effect means that an electromagnetic pulse has a different response for permeable steel and non-permeable aluminum, providing an advantage over a static magnetometer that is unable to detect aluminum. All objects have different conductivity. Aluminum and 304L steel have the most similar response functions as they both have similar permeability values. Once the metallic object parameters were chosen, the properties of the analytical model were selected.

Table 3. Properties of metallic objects.

ObjectR (cm)Materialµrσ (MS m−1)
140Nickel [105]13511
240D6ac steel [106]773.6
340304L steel [107]1.114
440Aluminum [108]137
550Nickel13511
650D6ac steel773.6
750304L steel1.114
850Aluminum137

3.10. Analytical model

Equation (1), which is the analytical model used in this work, theoretically requires the calculation of an infinite number of multipoles to obtain the magnetic flux density. However, in practice, because the magnetic dipole is some distance away from the sphere only a subset of multipoles must be calculated to approximate magnetic flux density. Magnetic flux density was calculated using a multipole expansion up to n = 6, as the contribution from n > 6 terms for and were less than 0.01% of the total contribution of the sum of n terms from 1 to 6 at h = 1.5 m.

The frequency domain magnetic flux density was calculated over the range −10 MHz f MHz with 1 Hz steps, then converted into the time domain using an IFFT. The low pass filter effect of an RL circuit on the transmitter coil and RLC circuit on the receiver coil ensured that this range was sufficient to include all relevant frequencies in the calculation of magnetic flux density in the time domain. The analytical model was then verified by comparing with an Ansys® model.

3.11. Ansys® model

A 2D axisymmetric model of a current loop above a permeable conducting sphere was created using the finite element program Ansys® Maxwell 2D to verify the analytical solutions in the frequency and time domain. As the analytical solution has no φ component, a 2D axisymmetric Ansys® model is an accurate representation of the 3D analytical model. Both the analytical model and the finite element model in Ansys® are approximations of the actual solution. The analytical model is approximated by summing a subset of multipoles, rather than infinity multipoles. The finite element solution is approximated by minimizing an associated error function via the calculus of variations. The finite element model is also sensitive to mesh density and quality. Hence, the analytical model and Ansys® finite element model are not equal but may be close enough to verify that both methods provide an accurate estimate of the scattering of a permeable conducting sphere below a magnetic dipole in air and seawater.

The mesh used in the Ansys® model is shown in figure 6. The gold box shows the meshing around the coil. The green box shows the meshing at the surface of the sphere. The coil was created with a square shape of 0.1 mm width, and had a radius of 10 mm. A current excitation was assigned to the square surface of the coil. Inside the coil a dense mesh with a maximum element length of 25 μm was created, which also creates a dense mesh around the coil upon building the finite element model. This ensured accurate modeling of the magnetic flux density close to the coil known as the near field. Unlike the transmitter, which is a coil, the receiver is a point in Ansys® and returns the magnetic flux density. The material properties of the spheres were set according to table 3, and the seawater was assigned conductivity of 4 S m−1.

Figure 6. Refer to the following caption and surrounding text.

Figure 6. Ansys® Maxwell 2D mesh of a 10 mm radius coil located 3.5 m above the center of a sphere with radius R = 50 cm showing (a) all mesh components including the sphere (mesh appears black due to density of elements), the coil, and the surrounding medium (air or seawater), (b) coil mesh (gold border) and (c) sphere mesh (green border).

Standard image High-resolution image

The maximum element length in air, seawater, or the metallic sphere was dictated by the shortest wavelength in the materials of table 3, which is described by equation (15) and occurs in nickel. To be modeled correctly, the mesh element size in each material must be less than the shortest wavelength in that material. In the Ansys® model the frequency was swept from 0.1 mHz to 1 MHz. This gives smallest wavelengths (at 1 MHz) of 299.8 m in air, 1.6 m in seawater, and 82 μm in nickel. Hence, a maximum element size of 25 cm was set for the air and seawater surrounding the sphere and coil, which is much smaller than the wavelength in air and seawater.

Meshing the interior of the sphere with an element size smaller than 82 μm would create a large model with a large computation time. This was avoided by recognizing that the skin effect severely attenuates high frequency signals in metals, and a dense mesh is only needed in the outer layer of the sphere to accurately model the scattering response at high frequencies. The skin depth in nickel for a 1 MHz electromagnetic wave is 13 μm and almost all signal energy is attenuated after propagating 1 cm within the nickel sphere. The longest wavelength occurs in 304L steel and at 1 MHz is 806 μm and has a skin depth of 125 μm, well within the 1 cm thickness of the densely meshed outer layer of the sphere. Therefore the 1 cm thick outer layer of the sphere was meshed with a maximum element size of 25 μm.

At low frequencies the electromagnetic wave penetrates the 1 cm outer layer of the metallic sphere with a non-negligible amplitude and must be accounted for in the mesh of the interior of the sphere. For example, a 1 kHz electromagnetic wave in 304L steel has a wavelength of 25.5 mm and a skin depth of 4.1 mm. To correctly account for low frequency signals, the maximum mesh element size for the sphere interior was set to 5 mm, well below the wavelength of any electromagnetic wave penetrating the 1 cm thick outer shell of the sphere for the materials used in this work.

A 'balloon' boundary condition was used in the Ansys® model whereby the region outside the meshed space is infinitely large and magnetic flux lines are neither normal nor tangential to the boundary. The boundary had length of 32 m in the horizontal (y-axis), and 37.5 m in the vertical (z-axis) and was symmetrical around the coil and sphere setup.

The same Ansys® model setup with the same meshing and boundary conditions was used to model transient responses including the propagation of a pulse in seawater. In the transient case with a sphere, pulses were modeled as input current excitations in the coil according to equation (5). In the propagation case without a sphere, the sphere material was set to seawater.

Once the Ansys® model was shown to verify the analytical model, consideration was given to the data set.

3.12. Data set

The scattered electromagnetic responses were generated for 8 permeable conducting spheres in air and seawater. The magnetic dipole was placed at z-axis heights of 150 cm to 350 cm with 50 cm increments. At each vertical height 50 data sets with additive white Gaussian noise were recorded. The noise was generated by applying a base case signal-to-noise power ratio (SNR) of 1 to the rectangular down-chirp signal at 3.5 m for the D6ac steel object with 0.4 m radius. The same noise was then applied to all pulses. The noise was then changed following a base 2 logarithm scale and the process was repeated for different SNRs. This resulted in 15 noisy data sets of SNR from 2−6 to 28. As there are 8 objects, 5 distance intervals, and 6 pulse types, there are 12 000 records in total for each noise level data set. Data input into the perceptron, multiclass logistic regression, neural network, and 1D CNN was normalized and down-sampled to 4000 samples per record to ensure a reasonable compute time.

The scattered electromagnetic response data were recorded in a 1D time domain format. The 1D data were converted into 2D spectrogram data with frequency on the vertical axis, and time on the horizontal axis. An additional layer was added to the spectrogram data to enable the use of automated feature extraction by a 2D CNN. Each scattered electromagnetic response data record was 2 ms duration except the rectangular down-chirp, which was 4 ms duration. The short-time Fourier transform window was set at 240.5 μs duration with 4680 overlaps, except for the rectangular down-chirp, which had 4516 overlaps. The window was of the Hamming type. The frequency points were calculated at 120 Hz intervals between 0 Hz and 30.7 kHz. The spectrogram data were of size 256 × 256. After understanding the data set, consideration must be given to the machine learning classifiers.

3.13. Machine learning algorithms

The machine learning algorithms used in this work include both linear (perceptron, multiclass logistic regression) and nonlinear (neural network, 1D CNN, 2D CNN) varieties and may be divided into non-probabilistic (perceptron) and probabilistic (multiclass logistic regression, neural network, 1D CNN, 2D CNN) varieties. A probabilistic machine learning algorithm ensures that all outputs sum to 1 [90] and may be analyzed using a receiver operating characteristic (ROC) curve. The machine learning hyperparameters are set after choosing the machine learning classifiers.

3.14. Classification algorithm hyperparameters

The linear separability of the data was tested using the Perceptron function from the Python scikit-learn library [109]. The perceptron uses a stochastic gradient descent algorithm [110] as a solver with an L2 regularization. Other inputs were α = 0.0001, maximum epochs = 5000, and stopping criterion tolerance of 0.001.

Multiclass logistic regression classification was implemented using the LogisticRegression function from the Python scikit-learn library with maximum epochs = 5000, and stopping criterion tolerance of 0.0001. The L-BFGS [111], a limited-memory quasi-Newton code for bound-constrained optimization, was used as the solver with L2 regularization with a bias added to the decision function.

A conventional neural network, using the TensorFlow [112] library, was applied to the data set. It consisted of three hidden layers (4096, 512, 8), a learning rate of 5 × 10−5 in seawater and 1 × 10−4 in air, batch size of 32, dropout of 0.1 in seawater and 0.5 in air applied after the 512 neuron layer, rectified linear unit (ReLU) activation function on the hidden layers, a softmax function on the output layer, a sparse categorical cross-entropy loss function, Adam optimizer, and was trained for 200 epochs.

Figure 7 shows a diagram of the 1D CNN used in this work, which was created using TensorFlow. It consisted of, in order, a 1D convolution layer of 16 filters of kernel size 64 with ReLU activation, a second 1D convolution layer of 16 filters of kernel size 64 with ReLU activation, a densely connected flatten layer, and a softmax layer on 8 outputs. The convolutional layers had padding applied to ensure an input and output length of 4000 data points, which enabled 1D CNN feature maps to be plotted with identifiable features. The learning rate was set to 0.0001, batch size of 32, and the model was trained over 300 epochs for the modulated Gaussian pulse and 200 epochs for all other pulses.

Figure 7. Refer to the following caption and surrounding text.

Figure 7. Illustration of the 1D CNN model used in this work. The input to the 1D CNN is a simulated time signature, with additive white Gaussian noise, of the scattering response of a D6ac steel sphere in seawater with R = 0.5 m, receiver coil position x = 0.0 m, y = 0.3 m, z = 3.2 m, transmitter coil position h = 3.5 m generated by a Gaussian pulse. Two 1D convolutional layers are used and these have the same properties – 16 filters of kernel size 64 with padding that ensures the input and output length remains constant at 4000 samples, and a ReLU activation function. These convolutional layers are stacked together and plotted as a surface in the illustration and are known as feature maps. In the feature maps red indicates large weight assigned by the 1D CNN and blue indicates small weight. The output from the second convolutional layer is flattened and densely connected to 8 neurons with a softmax activation function and the sphere is classified using an argmax function.

Standard image High-resolution image

Figure 8 shows a diagram of the custom 2D CNN used in this work, which was created using TensorFlow. The input to the 2D CNN is a spectrogram (256 × 256 × 1). A total of 32, 3 × 3 filters were used in the convolutional layer. The 3 × 3 filter size was chosen for its small computational load while providing horizontal and vertical edge detection and smoothing [113]. A maximum pooling operation of 2 × 2 was applied in the convolutional layer. A ReLU activation function was applied to the output of each layer excluding the final layer. The model has three layers with convolution plus max pooling in each layer. The spectrograms were flattened into a single column array. This completes the feature extraction process.

Figure 8. Refer to the following caption and surrounding text.

Figure 8. Illustration of the 2D CNN model used in this work. The input is a spectrogram of the simulated scattering response, with additive white Gaussian noise, of a D6ac steel sphere in seawater with R = 0.5 m, receiver coil position x = 0.0 m, y = 0.3 m, z = 3.2 m, transmitter coil position h = 3.5 m generated by a Gaussian pulse. The input spectrogram has a 3 × 3 filter convolved over all pixels, and then max pooling of spatial size 2 × 2 is applied. The convolution using 3 × 3 and max pooling of 2 × 2 is repeated two times with ReLU activation functions on the output. The data is flattened and densely connected to 128 neurons for the seawater data, and 256 neurons for the air data, with a ReLU activation function. This is then connected to 8 neurons with a softmax function and finally an argmax function is applied to classify the sphere.

Standard image High-resolution image

The flattened array was then densely joined to a neural network of 128 neurons for the seawater data, and 256 neurons for the air data, with a ReLU activation function. This was connected to 8 neurons with a softmax function and finally an argmax function was applied to classify the spheres. The learning rate was 0.001 in seawater and 0.0003 in air. The data set was trained for 15 epochs, except for the modulated Gaussian, which was trained for 40 epochs. Dropout was randomly applied, before the softmax layer, to ten percent of the weights to avoid overfitting [114] in the seawater data, and twenty percent in the air data. These values were chosen to maximize classification accuracy while also reducing model classification variance, which was larger for the air data. The Adam optimizer was used with the exponential decay rate for the 1st moment (0.9) and 2nd moment (0.999). Batch size was 32.

A 10-fold cross-validation [115] was used for all algorithms to calculate classification accuracy and standard deviation. A 10-fold cross-validation was chosen as it is commonly used in practice [17]. Additionally, a higher number of folds, k, results in each model being trained on a larger training set and tested on a smaller test fold, which theoretically should lead to a lower prediction error as the models are trained on more of the available data. All 2000 records per pulse type and SNR were randomly shuffled and arranged so that an equal number from each metallic object were present in each 10-fold interval (200 records). This means that there were 25 records of each object per 200 records. The first test set consisted of records 1–200, and the train set was 201–2000. The next test set was from 201–400, and the train set was the combined 1–200 and 401–2000 records, and so on according to the rules of 10-fold cross-validation.

The hyperparameters of the machine learning models were optimized using a validation set. The validation set was created by randomly shuffling 2000 records per pulse type and SNR. The first 10% of the data were used for validation, and the remaining 90% was used for training. The hyperparameters were adjusted, and the model retrained, until large classification accuracy was achieved. The data set was reshuffled again once hyperparameters were determined and 10-fold cross-validation was used to create the training and test sets.

4. Results and discussion

4.1. Verification of the analytical model for the electromagnetic response of metallic objects

Figure 9 shows results generated using the analytical model and Ansys® finite element model for R = 0.5 m metallic objects at the origin, transmitter coil position x = 0.0 m, y = 0.0 m, z = 3.5 m, and receiver coil position x = 0.0 m, y = 0.3 m, z = 3.2 m. The current in a Gaussian pulse in the frequency domain is shown in figure 9(a), and has almost all energy below 10 kHz. The real and imaginary parts of the response function are shown in figures 9(b) and (c), respectively.

Figure 9. Refer to the following caption and surrounding text.

Figure 9. Electromagnetic scattering response of metallic spheres in air and seawater with radius  0.5 m, receiver coil position y = 0.3 m, z = 3.2 m, and transmitter coil position h = 3.5 m with (a) absolute value of a Gaussian pulse with τ  = 104 μs and Vm = 254.8 V, (b) real part of the steady state frequency response function calculated using equations (1) and (2) in seawater given by the solid line, equations (1) and (3) in air given by the dotted line, and the circle markers were calculated using a swept frequency Ansys® model for the seawater case, (c) imaginary part of the steady state frequency response function using the same method described in (b), (d) time domain response function calculated using an IFFT on equations (1) and (2) in seawater, with circle markers calculated using a transient Ansys® model in the seawater case, (e) time domain voltage in seawater induced on a magnetic induction sensor calculated using equation (13), (f) spectrograms of all objects in seawater with SNR = 64 and generated by the Gaussian pulse shown in (a), where the large permeability spheres made of nickel and D6ac steel have a sustained scattering response compared to the non-permeable spheres.

Standard image High-resolution image

The dotted line in figures 9(b) and (c) shows the response in air, which reaches an inductive limit and remains constant at high frequencies. The seawater case (solid line) has a large CCR response in the vicinity of several kilohertz, and approaches zero at high frequencies due to attenuation caused by the skin effect. Ansys® results are also in seawater, and are in good agreement with the analytical results.

The response function in figures 9(b) and (c) shows clearly different profiles in the frequency domain, which means the metallic objects may be distinguished between one another using electromagnetic waves across multiple frequencies. Aluminum and 304L steel have a similar response function, with relative permeability of 1 and 1.1, respectively. When frequency approaches zero, the real part of the response function also approaches zero. Although they have a similar response function, they are nevertheless differentiated by the machine learning algorithms as outlined below, likely due to conductivity of aluminum being approximately twice that of 304L steel. Both nickel and D6ac steel have a large relative permeability and have a positive response function as frequency approaches zero. Although the response functions are similar, the machine learning algorithms, as outlined below, are able to distinguish between these permeable materials. The greatest difference in response functions is between the permeable and non-permeable materials.

The magnetic flux density is shown in figure 9(d), and there is a clear difference between permeable and non-permeable metallic objects. The D6ac steel response has a lower peak than aluminum, but then overshoots the zero point and sustains its electromagnetic scattering at a greater intensity and for a longer period of time.

The magnetic induction sensor response is shown in figure 9(e), with additive white Gaussian noise of SNR = 64. These were the signals used to calculate the spectrograms shown in figure 9(f). The spectrograms show a sustained voltage response for approximately 1 ms for the highly permeable nickel and D6ac steel, whereas the non-permeable materials 304L steel and aluminum have a sustained voltage response for less than 0.25 ms.

4.2. Time signatures

Figure 10 shows the normalized receiver coil simulated time domain responses calculated using equation (13), without additive white Gaussian noise, for a D6ac steel and aluminum sphere in seawater with radius R = 0.5 m, transmitter coil positions h = 150 cm and h = 350 cm, and receiver coil position x = 0 m, y = 30 cm, z = h–30 cm for pulses (a) two-sided decaying exponential, (b) Heaviside step-off, (c) triangular, (d) Gaussian, (e) rectangular and (f) modulated Gaussian, (g) raised cosine, (h) rectangular down-chirp.

Figure 10. Refer to the following caption and surrounding text.

Figure 10. Normalized receiver coil simulated time domain responses (time signatures) calculated using equation (13), without additive white Gaussian noise, for a D6ac steel and aluminum sphere in seawater with radius R = 0.5 m, transmitter coil positions h = 150 cm and h = 350 cm, and receiver coil position x = 0 m, y = 30 cm, z = h–30 cm for pulses (a) two-sided decaying exponential, (b) Heaviside step-off, (c) triangular, (d) Gaussian, (e) rectangular and (f) modulated Gaussian, (g) raised cosine, (h) rectangular down-chirp.

Standard image High-resolution image

The plots show delay (a) shift rightward in the plot) in the signal between 150 cm and 350 cm due to greater propagation distance and dispersion of the electromagnetic pulse in seawater. In the Heaviside step-off pulse of (b), there is observable pulse distortion at 350 cm compared to 150 cm, where the pulse has become narrow around the peak. A machine learning algorithm must account for this distortion when classifying the spheres. There is an observable difference in (b) between the time domain response of D6ac steel and aluminum spheres at height h = 150 cm, where the D6ac steel response overshoots the zero-crossing point whereas the aluminum response is much closer to the zero-crossing point.

The Heaviside step-off pulse contrasts to the modulated Gaussian in (f). The modulated Gaussian has similar shape at 150 cm and 350 cm, meaning pulse distortion is not as extreme compared to the other pulses for the propagation distances used in this work. Although the pulse shape remains fairly constant, there is only a very small difference in the time signatures of the aluminum and D6ac steel spheres. This has implications for the classification accuracy of machine learning algorithms, with the results in this work showing that the modulated Gaussian has lower classification accuracy compared with other pulse types as presented in section 4.4. The choice of pulse type (with controllable frequency spectrum) matters for the classification of metallic spheres.

4.3. Spectrograms of scattered pulses

Figure 11 shows simulated scattered pulse responses of a D6ac steel sphere in seawater with radius R = 0.5 m, receiver coil position x = 0.0 m, y = 0.3 m, z = 3.2 m, transmitter coil position h = 3.5 m for pulses: (a) two-sided decaying exponential, (b) Heaviside step-off, (c) triangular, (d) Gaussian, (e) rectangular (f) modulated Gaussian, (g) raised cosine, and (h) rectangular down-chirp.

Figure 11. Refer to the following caption and surrounding text.

Figure 11. Simulated scattered pulse responses of a D6ac steel sphere in seawater with radius R = 0.5 m, receiver coil position x = 0.0 m, y = 0.3 m, z = 3.2 m, transmitter coil position h = 3.5 m. Magnetic flux density in the z-direction, Bz, in the time domain was calculated using an IFFT of equations (1) and (2) and is the solid line (top plot), and the circle markers show the Ansys® model Bz values. There is good agreement between the analytical and Ansys® models. The time domain voltage response of the magnetic induction sensor, Vz, is given in the middle plot and was calculated using equation (13). The spectrogram was calculated using the Vz signal and is shown in the bottom plot. All spectrograms were calculated using parameters fmax = 30.6 kHz, Δf = 120 Hz, Hamming window 240.5 μs wide, 4680 overlaps, except the rectangular down-chirp, which had 4516 overlaps. All spectrograms have height and width dimensions 256 × 256. Additive white Gaussian noise was applied to all pulses to obtain a SNR of 64 before generating the spectrograms.

Standard image High-resolution image

Magnetic flux density in the z-direction, Bz, in the time domain was calculated using an IFFT of equations (1) and (2) and is the solid line (top plot), and the circle markers (top plot) show the Ansys® model Bz values calculated according to section 3.11. There is good agreement between the analytical and Ansys® models for the magnetic flux density Bz.

The time domain voltage response of the magnetic induction sensor, Vz, is given in the middle plot and was calculated using equation (13).

The spectrogram was calculated using the Vz signal and is shown in the bottom plot. All spectrograms were calculated using parameters fmax = 30.6 kHz, Δf = 120 Hz, Hamming window 240.5 μs wide, 4680 overlaps, except the rectangular down-chirp, which had 4516 overlaps. All spectrograms have height and width dimensions 256 × 256. Additive white Gaussian noise was applied to all pulses to obtain a SNR of 64 before generating the spectrograms..

The spectrograms of all pulses display a sustained voltage response for D6ac steel, which is attributable to its permeability and is associated with the low frequency part of the pulse. The exception is the modulated Gaussian pulse, which does not have a sustained voltage response, due to having a carrier frequency of 6.7 kHz and therefore most of the pulse energy is in the early time part of the time signature. These spectrograms were used as the input to the 2D CNN used in this work.

4.4. Classification accuracy of machine learning algorithms

Figure 12 shows the classification accuracy and standard deviation percentage for the perceptron, multiclass linear regression (MLR), neural network (NN), 1D CNN, and 2D CNN. The data were generated using the hold-out method [84] by randomly shuffling the data set and then running the algorithms 5 times. The perceptron is the least accurate algorithm, where the rectangular pulse in seawater has the largest classification accuracy of 30.1 ± 4.6%. The low classification accuracy of the perceptron in air and seawater suggests the data are not linearly separable. The MLR outperforms the perceptron, however it too has limited classification accuracy as it is a linear algorithm and the data is not linearly separable. The neural network outperforms the linear algorithms for most pulses, suggesting that a nonlinear classifier is better suited to the data set. The 1D CNN outperforms the neural network on average across most pulses, likely attributed to the automated feature extraction of the convolutional layers. Finally, the 2D CNN is the best performing classifier. This is possibly attributed to the preprocessing method of converting the time domain data into a spectrogram, which enables the convolutional step to find features in both the time and frequency domain simultaneously. A key takeaway from these results is that a nonlinear algorithm, rather than a linear algorithm, is required to accurately classify metallic spheres in the air and seawater environment. As the 2D CNN had the largest average accuracy of all classifiers, it was chosen to do additional detailed analysis including multiple SNR, 10-fold cross-validation, and confusion matrices.

Figure 12. Refer to the following caption and surrounding text.

Figure 12. Mean classification accuracy and standard deviation (%) for SNR = 28 in air and seawater using the hold-out method with 5 repetitions.

Standard image High-resolution image

There is also a difference in classification accuracy between pulses. The Heaviside step-off pulse performs best in seawater, and is followed by the raised cosine and Gaussian pulse. The Heaviside step-off, raised cosine and Gaussian have most of their energy in the low frequency part of the spectrum and evoke a strong ECR in the spheres in both air and seawater. The modulated Gaussian pulse performs worst of all pulses. Most of the spectral energy of the modulated Gaussian pulse overlaps the CCR of the spheres in seawater, with little energy in the ECR and this likely explains the poor performance of this pulse. Based on the results in figure 12, the pulse type used to probe a sphere has an impact on the classification accuracy of a machine learning algorithm.

4.5. Classification accuracy of the 2D CNN with additive white Gaussian noise

4.5.1. Seawater.

Figure 13 shows the classification accuracy in seawater of the test data set for (a) average classification accuracy using 10-fold cross-validation data with varying SNR, (b) box and whisker plot for SNR = 28 showing overlaid scatter plot of 10-fold cross-validation results as colored circles, (c) box and whisker plot for SNR = 21 showing overlaid scatter plot of 10-fold cross-validation results as colored circles. The modulated Gaussian pulse has been excluded from figures 13(b) and (c) for illustration purposes.

Figure 13. Refer to the following caption and surrounding text.

Figure 13. Classification accuracy in seawater of the test data set for (a) average classification accuracy using 10-fold cross-validation data with varying SNR, (b) box and whisker plot for SNR = 28 showing overlaid 10-fold results as colored circles, (c) box and whisker plot for SNR = 21 showing 10-fold results overlaid as colored circles.

Standard image High-resolution image

Classification accuracy in figure 13(a) increases nonlinearly as SNR increases. The Heaviside step-off pulse has the largest mean classification accuracy at low SNR from 2−6 to 21, and high SNR from 25 to 28. As noise is introduced into the signal, the classification accuracy varies widely between pulses. At SNR = 20, all pulses have classification accuracy less than 50%. At SNR = 21, only the two-sided decaying exponential, rectangular down-chirp, and modulated Gaussian pulses have accuracy less than 50%, and there is a classification accuracy difference of greater than 10% between the Heaviside step-off and two-sided decaying exponential pulse. Evidently, pulse type plays a role in improving classification accuracy in a noisy environment.

Figure 13(b) shows that the Heaviside step-off pulse has the largest median classification accuracy of 99.0% and, with the exception of two data points, has most of its classification accuracy data points above the maximum of the other pulses. This is possibly due to limited dispersion due to a narrow spread of frequencies, and having most of the spectral energy at low frequencies around the ECR of the spheres. The median classification accuracy of all pulses, excluding the modulated Gaussian, is greater than 89.0%. The modulated Gaussian pulse has lower median classification accuracy than all other pulses at 58.8%, which is likely attributable to the limited scattering response of the spheres for a pulse with most of its energy concentrated around the CCR peaks of the spheres.

Figure 13(c) shows the high noise scenario of SNR = 21 where pulses have an approximately 50% classification accuracy median. The data has lower classification accuracy, and greater variance, than in the low noise scenario of SNR = 28. The Heaviside step-off pulse has the largest median classification accuracy of 58.7%, and the modulated Gaussian (excluded from the plot) has the lowest at 30.3%.

Figure 14(a) shows the confusion matrices of all pulses for SNR = 28 summed over 10-folds for classification in seawater. With perfect classification accuracy the diagonal elements would equal 250. The two-sided decaying exponential, triangular, Gaussian, rectangular, raised cosine and rectangular down-chirp pulses often confuse object 3 and 4 (R = 40 cm), and object 7 and 8 (R = 50 cm). These objects are 304L steel and aluminum and have almost identical relative permeability (1.1 and 1 respectively) but different conductivity (14 and 37 MS m−1 respectively) as given in table 3. Hence, these pulses are able to distinguish between objects of different size and relative permeability, but incorrectly classify some objects with the same size and similar permeability. Although the conductivity is different by almost double between 304L steel and aluminum, this feature does not provide enough information to accurately classify these objects.

Figure 14. Refer to the following caption and surrounding text.

Figure 14. Confusion matrices summed over 10-folds for classification in seawater for (a) SNR = 28 and (b) SNR = 21. With perfect classification accuracy the diagonal elements would equal 250.

Standard image High-resolution image

Although the raised cosine has a larger median classification accuracy than the Gaussian pulse, it does misclassify object 1 and 2, whereas the Gaussian pulse does not. Hence, some pulses may be more effective at classifying some objects despite performing worse in overall classification accuracy.

The Heaviside step-off pulse, unlike the other pulses, primarily misclassifies objects 1 and 2 (R = 40 cm), and object 5 and 6 (R = 50 cm). These objects are nickel and D6ac steel and have large permeabilities of 135 and 77, respectively. Like the other pulses, the Heaviside step-off pulse is able to distinguish between permeable and non-permeable spheres with large accuracy. The modulated Gaussian pulse has the lowest classification accuracy of all and it also distinguishes between permeable and non-permeable spheres with large accuracy.

Figure 14(b) shows the confusion matrices of all pulses for SNR = 21. This low SNR scenario has classification accuracy around 50%, but despite low accuracy there are some interesting data points. High relative permeability objects (1, 2, 5, 6) are rarely confused with low relative permeability objects (3, 4, 7, 8). Hence, the method presented in this work may be a good choice for distinguishing between high relative permeability steel and aluminum objects even in the presence of noise. Objects of similar material properties but different radii are also confused at low SNR with objects 1 and 2 often confused with 5 and 6, and objects 3 and 4 often confused with 7 and 8.

4.5.2. Air.

Figure 15 shows the classification accuracy in air of the test data set for (a) average classification accuracy using 10-fold cross-validation data with varying SNR, (b) box and whisker plot for SNR = 28 showing overlaid scatter plot of 10-fold cross-validation results as colored circles, (c) box and whisker plot for SNR = 21 showing overlaid scatter plot of 10-fold cross-validation results as colored circles. The modulated Gaussian pulse has been excluded from figures 15(b) and (c) for illustration purposes.

Figure 15. Refer to the following caption and surrounding text.

Figure 15. Classification accuracy in air of the test data set for (a) average classification accuracy using 10-fold cross-validation data with varying SNR, (b) box and whisker plot for SNR = 28 showing overlaid 10-fold results as colored circles, (c) box and whisker plot for SNR = 21 showing overlaid 10-fold results as colored circles.

Standard image High-resolution image

Classification accuracy in figure 15(a) shows the Gaussian pulse has the largest mean classification accuracy at SNR = 28. The raised cosine pulse has the largest mean classification accuracy between SNR of 2−2 and 25, and the modulated Gaussian is the worst classifier at all SNR values.

Figure 15(b) shows that the Gaussian pulse has the largest median classification accuracy of 96.75% at SNR = 28, which is closely followed by the two-sided decaying exponential, Heaviside step-off, and raised cosine pulses where the distribution of classification data points is similar. All of these pulses have most of their spectral energy at low frequencies around the ECR and have very small side lobes. The median classification accuracy of all pulses, excluding the modulated Gaussian, is greater than 89.0%. The modulated Gaussian pulse has lower median classification accuracy than all other pulses at 58.0%, which is likely attributable to the limited scattering response of the spheres for a pulse with most of its energy concentrated around the CCR peaks of the spheres, as was found in the seawater case.

Figure 15(c) shows the high noise scenario of SNR = 21 where pulses have an approximately 50% classification accuracy median. The raised cosine pulse has the largest median classification accuracy of 55.25%, and the modulated Gaussian (excluded from the plot) has the lowest at 28.25%.

Figure 16(a) shows the confusion matrices of all pulses for SNR = 28. The two-sided decaying exponential, Gaussian, raised cosine and rectangular down-chirp pulses often confuse object 3 and 4 (R = 40 cm), and object 7 and 8 (R = 50 cm). These objects are 304L steel and aluminum and have almost identical relative permeability (1.1 and 1 respectively) but different conductivity (14 and 37 MS m−1 respectively) as given in table 3. Hence, these pulses are able to distinguish between objects of different size and relative permeability, but incorrectly classify some objects with the same size and similar permeability, as was shown in the seawater case. Although the conductivity is different by almost double between 304L steel and aluminum, this feature does not provide enough information to accurately classify these objects. The triangular and rectangular pulses confuse 3 and 4 with each other and with 7 and 8, meaning these pulses are distinguishing between objects of different relative permeability but not different size.

Figure 16. Refer to the following caption and surrounding text.

Figure 16. Confusion matrices summed over 10-folds for classification in air for (a) SNR = 28 and (b) SNR = 21. With perfect classification accuracy the diagonal elements would equal 250.

Standard image High-resolution image

Although the Gaussian has a larger median classification accuracy than the Heaviside step-off pulse, it does misclassify object 7 with 8, whereas the Heaviside step-off pulse does not. Hence, some pulses may be more effective at classifying some objects despite performing worse in overall classification accuracy. The modulated Gaussian pulse has the lowest classification accuracy of all pulses, although it also distinguishes between permeable and non-permeable spheres with large accuracy.

Figure 16(b) shows the confusion matrices of all pulses for SNR = 21. This low SNR scenario has classification accuracy around 50%, and like the seawater case distinguishes between high relative permeability objects (1, 2, 5, 6) and low relative permeability objects (3, 4, 7, 8).

4.5.3. Seawater versus air.

Comparing the air to seawater scenarios in figure 13(a) and 15(a), it is clear that while both mediums have similar classification accuracy for high SNR, air has a much greater classification accuracy discrepancy between pulses as SNR is decreased. This means the classification accuracy is more sensitive to choice of pulse in air than in seawater. For low SNR, the classification accuracy as expected approaches the random selection value of , or 12.5% in this case, and was approached around SNR = 2−6. Both the air and seawater cases demonstrate a rapid increase in classification accuracy between SNR = 2−2 and SNR = 26. At high SNR, both mediums have similar classification accuracy of greater than 89% for all pulses excluding the modulated Gaussian pulse.

As noise is introduced into the signal, the classification accuracy varies widely between pulses. Considering the case of air in figure 15(a), the raised cosine pulse has classification accuracy of 48% at SNR = 20 and exceeds 50% at SNR = 21. The rectangular and triangular pulses, however, do not exceed 50% until SNR = 24. Evidently pulse type plays a major role in improving classification accuracy in a noisy environment.

Finally, it is worthwhile hypothesizing why classification accuracy was different for pulses in air and seawater. Dispersion effects, unique to seawater, are likely to have influenced the results. Dispersion creates a type of automated feature extraction before being transformed into a spectrogram for classification by a 2D CNN, and this effect is absent in air. Additionally, in seawater an electromagnetic pulse is spread out in time and provides more data points for a 2D CNN to learn, which is outlined in section 4.7.

4.6. ROC curves

Figure 17 shows the multiclass receiver operating characteristic curves using a micro-averaged one-vs-rest method in seawater at SNR = 28 for (a) Gaussian pulse with 2D CNN, 1D CNN, neural network (NN) and multiclass logistic regression (MLR) and (b) all pulses, excluding the modulated Gaussian, with classification using the 2D CNN. Values were calculated using the roc_curve function from the Python scikit-learn library. The 2D CNN outperforms all other classification algorithms, and the Heaviside step-off pulse outperforms all other pulses, across all classification thresholds.

Figure 17. Refer to the following caption and surrounding text.

Figure 17. Multiclass receiver operating characteristic curves using a micro-averaged one-vs-rest method in seawater at SNR = 28 for (a) Gaussian pulse with 2D CNN, 1D CNN, neural network (NN) and multiclass logistic regression (MLR) and (b) all pulses, excluding the modulated Gaussian, with classification using the 2D CNN.

Standard image High-resolution image

4.7. Feature map visualization

The inner workings of the hidden layers of a CNN may be clarified through the use of feature maps. Visualization of the neuron activation levels at the end of the convolution layers provides insight into the features that were used by the CNN for classification.

Figure 18 shows the simulated time signature input (top plot) and a subset of 1D CNN feature maps (convolutional layer 1 and 2) of a D6ac steel sphere in seawater with radius R = 0.5 m, SNR = 28, receiver coil position x = 0.0 m, y = 0.3 m, z = 3.2 m, transmitter coil position h = 3.5 m for pulses (a) two-sided decaying exponential, (b) Heaviside step-off, (c) triangular, (d) Gaussian, (e) rectangular, (f) modulated Gaussian, (g) raised cosine and (h) rectangular down-chirp. Figure 18 shows the stacked filter output from the 16 filters in convolutional layer 1 and 2 as illustrated in figure 7 for each pulse. In the feature maps red signifies large weight and blue signifies small weight in the 1D CNN. All feature maps show that the early time part of the time signature has the largest weight assigned by the 1D CNN, often at peaks, on the slopes between peaks, and around zero-crossing points.

Figure 18. Refer to the following caption and surrounding text.

Figure 18. Simulated time signature input (top plot) and a subset of 1D CNN feature maps (convolutional layer 1 and 2) of a D6ac steel sphere in seawater with radius R = 0.5 m, SNR = 28, receiver coil position x = 0.0 m, y = 0.3 m, z = 3.2 m, transmitter coil position h = 3.5 m for pulses: (a) two-sided decaying exponential, (b) Heaviside step-off, (c) triangular, (d) Gaussian, (e) rectangular, (f) modulated Gaussian, (g) raised cosine and (h) rectangular down-chirp. In the feature maps red signifies large weight and blue signifies small weight in the 1D CNN.

Standard image High-resolution image

Figure 19 shows the spectrogram input and 2D CNN feature maps of a D6ac steel sphere in seawater with radius R = 0.5 m, SNR = 28, receiver coil position x = 0.0 m, y = 0.3 m, z = 3.2 m, transmitter coil position h = 3.5 m for pulses: (a) two-sided decaying exponential, (b) Heaviside step-off, (c) triangular, (d) Gaussian, (e) rectangular, (f) modulated Gaussian, (g) raised cosine and (h) rectangular down-chirp. Figure 19 shows a subset of the 32 filters of convolutional layers 1, 2, and 3 as illustrated in figure 8 for each pulse. In the feature maps red signifies large weight and blue signifies small weight in the 2D CNN. Convolutional layer 1 and 2 (Conv. 1 and 2) shows that the 2D CNN has assigned small weight to the noise for all pulses and therefore noise is not used for sphere classification. Distinct shapes and patterns become apparent in convolutional layer 3 (Conv. 3), and there are 32 filters but only 1 of these is shown in figure 19. Of the filters not shown, most were entirely blue, signifying small weight. The other filters not shown in figure 19 were of similar shape to those shown in convolutional layer 3.

Figure 19. Refer to the following caption and surrounding text.

Figure 19. Spectrogram input and a subset of 2D CNN feature maps of a D6ac steel sphere in seawater with radius R = 0.5 m, SNR = 28, receiver coil position x = 0.0 m, y = 0.3 m, z = 3.2 m, transmitter coil position h = 3.5 m for pulses: (a) two-sided decaying exponential, (b) Heaviside step-off, (c) triangular, (d) Gaussian, (e) rectangular, (f) modulated Gaussian, (g) raised cosine and (h) rectangular down-chirp. In the feature maps red signifies large weight and blue signifies small weight in the 2D CNN.

Standard image High-resolution image

The following description pertains to convolutional layer 3. The two-sided decaying exponential has assigned large weight to the outer envelope of the pulse from high frequency at early times to low frequency at late times, and a similar pattern is observed for the rectangular pulse. The Heaviside step-off pulse has large weight assigned to the envelope at late time, as does the triangular pulse. The Gaussian has large weight assigned to the medium time scale corresponding to the tail of the pulse. The raised cosine has large weight in the early time, high frequency part of the spectrogram and has weight assigned to the peaks of the oscillating tail of the pulse at late times. The rectangular down-chirp pulse has large weight assigned to the early time, high frequency part of the spectrogram, and to the envelope. The modulated Gaussian has weight applied only to the medium time corresponding to the end of the pulse.

The 1D and 2D CNN are using different parts of the simulated input signal for classification for the same pulse. On average, the feature maps in the 2D CNN show large weights at late times compared to the 1D CNN where there are large weights at early times. This suggests that in the 2D CNN classifier case, the decay of the scattering of the spheres has greater influence on classification than in the 1D CNN classifier case. This may explain why the 2D CNN outperformed the 1D CNN in classification accuracy in this work.

4.8. Comparison with other research

Table 1 shows that researchers have achieved a greater than 90% classification accuracy of metallic objects in air and soil using machine learning methods applied to EMI data. Although achieving very high classification accuracy, the previous methods require inversion of the MPT and therefore are not suitable for a real-time system.

Šimić et al [79] used a 1D CNN to infer an object class from time-domain MPT features. A 98% accuracy was achieved for 7 metallic objects in air in a laboratory. MPT features were determined using a step-off pulse. In our work a step-off pulse achieved a 91.5 ± 3.0% classification accuracy for the 8 metallic spheres in air for the 1D CNN as shown in figure 12, a result less than that achieved by Šimić. The 2D CNN achieved a 95.0 ± 3.6% classification accuracy, which is comparable to Šimić. In this work, a 2D CNN rather than a 1D CNN was required to have comparable classification accuracy to Šimić.

In contrast to previous research, the classification methods presented in this paper are real-time compatible. The results show that machine learning methods, without the use of a physics-based model, may be used to detect metallic objects in air and seawater at greater than 90% classification accuracy.

5. Conclusion

In this paper permeable conducting spheres in air and seawater below a magnetic dipole were classified using machine learning algorithms. The low frequency electromagnetic pulses investigated included the two-sided decaying exponential, Heaviside step-off, triangular, Gaussian, rectangular, modulated Gaussian, raised cosine, and rectangular down-chirp. The use of low frequency pulses has the advantage of limited attenuation in seawater compared to the high frequency pulses typically used in ground penetrating radar.

The 2D CNN had the largest classification accuracy of all algorithms, and was able to classify the metallic objects despite pulse distortion effects. The perceptron, a linear algorithm, had low classification accuracy and hence the scattered data used in this work is not linearly separable.

Feature maps were plotted and offered explainability of the data, with the 1D CNN placing large weight on early time scattered data, and the 2D CNN placing large weight on late time scattered data.

All electromagnetic pulses were able to classify metallic objects to differing degrees. As SNR increased, classification accuracy increased for all pulses. There was a greater discrepancy in classification accuracy for different pulses in air than in seawater. At high SNR, classification accuracy was at least 89% for both air and seawater, excluding the modulated Gaussian pulse. The raised cosine was one of the best performing pulses and, given its minimum symbol interference properties, could be a candidate for sending multiple pulses in quick succession. Analysis of confusion matrices showed that high relative permeability objects (1, 2, 5, 6) are rarely confused with low relative permeability objects (3, 4, 7, 8) for low SNR. At high SNR the 2D CNN primarily confused 304L steel and aluminum of the same radius.

The machine learning methods presented in this work are real-time compatible algorithms and may be used in situ as a part of a robotic system for metallic object classification.

The possible applications of this work are multiple. Buried cables of offshore wind farms might be detected and localized using similar machine learning techniques. Underwater metallic objects, such as anchors or shipwrecks, might be found using these techniques. Classification algorithms that automate feature extraction, such as a CNN, promise to reduce human involvement in underwater metallic object detection.

5.1. Limitations and future work

This paper focused on numerical simulation of the scattering of electromagnetic pulses from permeable conducting spheres in air and seawater. Experimental data will differ from simulated data due to multiple factors including coil effects where coil size and number of turns affects the resistance, inductance, and low-pass filter characteristics of the coil. If the coil is attached to an underwater robot then noise effects due to electronics and electric motors will also be present in the electromagnetic data. Future work should focus on experimental data collection and the implementation of the machine learning classifiers used in this work. Although electromagnetic theory in seawater is well established, the use of electromagnetic pulses and machine learning for metallic object classification in seawater is yet to be investigated experimentally. The numerical study presented in this work should be validated using an EMI sensing experimental apparatus in seawater. Results could be compared to an analytical solution and thereby validate the theory used in this paper.

While a numerical simulation is useful for analyzing a great variety of parameters, it was limited to a sphere shape in this work. A sphere is a convenient shape for modeling electromagnetic wave and object interaction, however actual objects found in the ocean, such as anchors and shipwrecks, have a non-spherical shape. While an analytical model for unusual shapes is not feasible, finite element analysis software, such as Ansys® Maxwell, may be used to determine the scattering response of these objects.

In this work sphere radii of 40 cm and 50 cm were studied in order to determine the ability of the machine learning classifiers to distinguish between spheres with different sizes but of the same material. These radii have a strong scattering response at low frequencies. In future work, spheres with small radii of 5 cm and 10 cm should be investigated to determine the ability of this method to detect small metallic objects. When the radius is small, larger frequencies are required to evoke a strong scattering response from the sphere. Simultaneously, larger frequencies are more severely attenuated in seawater and hence a decrease in SNR will likely be observed. How this affects accuracy was not covered in this work and should be investigated in future work.

In this work the metallic spheres were simulated as suspended in an infinite medium of air or seawater. Realistically, a metallic object in seawater will either be buried or proud on the seabed. Either way, the presence of another material near the sphere will alter the scattering response of the sphere. Finite element analysis modeling could be used to explore different boundary conditions, such as objects in the presence of different seabed topologies. However, replacing the analytical model with a finite element analysis model substantially increases the required computational time, so this would be a significant undertaking.

Whether or not machine learning algorithms will correctly classify objects other than spheres in seawater was not covered in this paper. Speculatively, the machine learning classifier should be able to distinguish between different shapes as they will have unique response functions, and this should be explored.

The additive white Gaussian noise used in this work assumes constant power across the frequency spectrum. However, noise may be pink, or more complicated than white noise. The noise of a real magnetic induction sensor system should be characterized and used in numerical simulations as a more accurate representation of reality.

The frequency dependent attenuation of an electromagnetic wave in seawater limits the distance of classification from the sensor to the metallic object. It would be useful to understand the maximum distance of classification of metallic spheres in seawater. This is dependent on several factors, including but not limited to the transmitter and receiver used in the experiment, the attenuation and dispersion of the pulse, the parameters of the metallic object, background noise, impurities in seawater, and proximity to the seabed. Distance of classification may be maximized through the use of an energy maximized pulse, which may be determined by characterizing the transmitter-seawater-object-receiver channel and applying Parseval's theorem. This varies depending on the response function of the object. The optimal pulse, weighing both distance of detection and classification accuracy, is an open research question. There also arises the issue of classification using an energy maximized pulse. As the energy maximized pulse is unique to each metallic object, how this may be used by a machine learning algorithm is unknown as selecting a unique pulse for each object inherently biases a classification system.

Maximum classification distance is a function of the sensor used in the experiment or simulation. This research focused on magnetic induction sensors. Other sensors may provide greater detection range, and specialized features such as a strong resonant response around a particular frequency band. Alternative sensors, such as a superconducting quantum interference device (SQUID) may provide much greater sensitivity and detection range. Whether or not the noise of the surrounding electronics, and magnetic background noise, are a limiting factor remains to be seen. Future research can focus on the feasibility of using an alternative magnetic sensor for metallic object detection.

Classification is not the only output that a machine learning algorithm is capable of producing. The CNNs used in this research were classification algorithms, which required the final layer of the CNN to have a softmax function. The CNN can also be used for regression, i.e. it can predict numerical values. This can be achieved by making the final layer of the CNN a linear function. Predictions may be made about the parameters of the metallic object such as conductivity, permeability, radius, and depth.

A bespoke 2D CNN was used in this work with three convolutional layers and was specifically designed to illustrate feature maps for explainability purposes. Deep learning models such as the visual geometry group network (VGGNet) [113], inception-V3 [116], residual networks (ResNet) [117], long short-term memory (LSTM) [118], generative adversarial networks (GAN) [119], and transfer learning [120] were not investigated in this work but may result in larger classification accuracy and should be investigated in a future study.

Data availability statement

The data cannot be made publicly available upon publication because they are owned by a third party and the terms of use prevent public distribution. The data that support the findings of this study are available upon reasonable request from the authors.

Conflict of interest

The authors report there are no competing interests to declare.

Appendix: Pulse types

A.1. Two-sided decaying exponential pulse

The two-sided decaying exponential (2SDE) pulse subjects the metallic objects to a second order increasing rate, i.e. a positive second derivative with respect to time. A 2SDE pulse is given by [121]

where Vm is the maximum amplitude, , and τ is the full width at half maximum, and t is time.

The Fourier transform of the two-sided decaying exponential pulse is

which is a Lorentzian function.

To obtain the required cutoff frequency, the 2SDE pulse τ parameter must be set using

To obtain = 3 kHz, τ was set to 47.3 μs.

A.2. Modulated Gaussian pulse

A modulated Gaussian pulse has a carrier frequency with a Gaussian envelope and is given by [54]

where Vm is the amplitude, fb is the carrier frequency, and the full width at half maximum τ may be controlled by setting

A modulated Gaussian pulse has a Fourier transform that is also Gaussian and is given by

where .

To obtain a symmetric pulse with a maximum at the peak of the Gaussian envelope [104], the modulated Gaussian pulse should have its carrier frequency set according to [54]

where K is the number of oscillations from the amplitude peak to the point where the amplitude has been reduced to  = 0.367 of the peak amplitude.

The modulated Gaussian pulse was chosen with parameters τ = 224 μs and K = 1, giving  = 1.39 kHz and fb = 6.70 kHz. These parameters were chosen to have most of the frequency spectrum energy overlap with the CCR response of the metallic spheres in seawater with R = 0.5 m at receiver coil position x = 0 m, y = 0.3 m, z = 3.2 m, and h = 3.5 m, which have maximum quadrature response between 6.3 and 7.1 kHz as shown in figure 9(c). The Gaussian pulse without modulation has most of its spectrum energy overlap with the ECR response of the spheres in seawater at  100 Hz. This differentiates the Gaussian and modulated Gaussian pulses, and dictates whether they primarily excite an ECR (Gaussian) or CCR (modulated Gaussian) from the spheres.

A.3. Gaussian pulse

The Gaussian pulse is a special case of the modulated Gaussian pulse where there is no carrier frequency, fb = 0 Hz, resulting in only a Gaussian envelope. The Gaussian pulse is

where a is the same as in the modulated Gaussian pulse.

The Fourier transform of the Gaussian pulse is

To obtain the required cutoff frequency for the Gaussian pulse, the full width at half maximum must be set using

To obtain  = 3 kHz, τ was set to 104.0 μs.

A.3.1. Triangular pulse.

The triangular pulse is defined as [122]

where Vm is the amplitude, t is time, and d is the pulse width. The Fourier transform of the triangular pulse is

The triangular function in the frequency domain is defined from , however this is not realizable in practice. This is mitigated by modelling the transmitter coil as a series RL circuit, which is a low pass filter and therefore prevents sudden changes in the pulse in the time domain. The triangular pulse has broad frequency components but unlike a rectangular pulse does not have a sudden voltage step response.

To obtain the required cutoff frequency, the triangular pulse d parameter was calculated by plotting both sides of the equation

and determining the point of intersection. To get  = 3 kHz, d was set to 212.6 μs.

A.4. Raised cosine pulse

The raised cosine pulse is used in communications technology to minimize intersymbol interference by adjusting its roll-off factor. This pulse is included in the current work to investigate the feasibility of using multiple consecutive pulses with limited interference. The raised cosine pulse is given by [122]

where ξ is the roll-off factor, and is the reciprocal of the symbol-rate β. In the time domain the raised cosine pulse has a sinc term that ensures that it has zero crossings like an ideal brick-wall filter, but also has a term that decays in time, ensuring that the tails of the pulse approach zero and therefore may be generated in practice.

The Fourier transform of the raised cosine pulse is

where and .

To obtain the required cutoff frequency, the raised cosine pulse γ parameter was calculated by plotting both sides of the equation

and determining the point of intersection. For  = 3 kHz, ξ = 0.8, γ was set to 130.4 μs.

A.5. Rectangular pulse

The rectangular pulse is defined as [122]

where Vm is the amplitude of the step, t is time, and d is the width of the pulse. The Fourier transform of the rectangular pulse is

The sinc function is defined from , however this is not realizable in practice. There is a voltage rise time, and the perfect rectangular shape of the pulse cannot be achieved. Like the triangular pulse, this characteristic is mitigated by an RL circuit transmitter coil. The frequency distribution of the rectangular pulse provides a greater frequency range compared to other pulses. It also has a very sudden increase and decrease in voltage compared to other pulses and provides unique transient responses compared to other pulses.

To obtain the required cutoff frequency, the rectangular pulse d parameter was calculated by plotting both sides of the equation

and determining the point of intersection. For  = 3 kHz, d was set to 147.6 μs.

A.6. Rectangular down-chirp pulse

A chirp signal increases or decreases frequency with time. A linear down-chirp has a linearly decreasing frequency with time and is given by [123]

where Vm is the amplitude of the chirp, f0 is the starting frequency, f1 is the final frequency, T is the time to sweep from f0 to f1, and t is the time. A chirp that ends with zero volts is desirable as a sudden jump to zero volts has a corresponding broadband frequency change. This broadband frequency change is reflected in a spectrogram of the signal and may negatively affect the machine learning classification accuracy. To obtain a chirp that ends with zero phase, i.e. volts equal to zero at the end of the chirp, the end frequency may be chosen using the formula

where K is the number of oscillations.

A delayed chirp pulse is obtained by introducing a delay time t0 and multiplying the chirp by a rectangular function. The chirp is delayed to ensure the spectrogram created from the signal captures all pulse data. The delayed chirp pulse is given by

where .

The Fourier transform of the chirp pulse is

where ω0 = 2, and ω1 = . In this case, the value is not specifically chosen. Rather, f1 with zero phase, and as close to  = 3 kHz as possible, was chosen to avoid a broadband frequency response at the end of the chirp that would result from a sudden change from a non zero to zero current.

A down-chirp rather than an up-chirp was chosen as the phase velocity of an electromagnetic wave increases with frequency in the quasi-static regime, which is explained in section 3.8. The down-chirp starts with the highest frequency and ends with the lowest frequency, which propagate the fastest and slowest in seawater, respectively. This prevents high and low frequency parts of a pulse from overlapping with each other and hence improves detection and classification accuracy. Some pilot simulations completed in this work confirmed that the down-chirp outperformed the up-chirp in classification accuracy in seawater.

A.7. Heaviside step-off pulse

The Heaviside step-off pulse is an idealized pulse that, in practice, is achieved through a square-integral pulse. One means of achieving a Heaviside step-off pulse is a voltage step-on and linear ramp-down [29] as realized by a Geonics EM63 [30] EMI sensor. The width of the step-on and linear ramp-down pulse must be large to approximate the step-off pulse. The wider the pulse width the closer the pulse approximates an idealized Heaviside step-off pulse, and the narrower the pulse width the closer the pulse approximates an impulse response.

The step-on and linear ramp-down time domain pulse is

where Vm is the voltage amplitude, 2Ta is the time duration of the step-up, and Tb is the time duration of the linear ramp-down.

The Fourier transform is

In [10] it was shown that a long, rather than a short, Ta better approximates the scattering response of a metallic sphere in seawater actuated by an idealized Heaviside step-off pulse. We chose 2Ta = 40 ms and Tb = 50 μs to match the pulse used in [10]. The large pulse width of the step-on and linear ramp-down means setting equal energy with the other pulses is not feasible as the energy of the other pulses are much lower, which would result in a very small SNR and hence low classification accuracy. Instead, the step-on and linear ramp-down pulse voltage was set to Vm = 186.6 V equal to the rectangular pulse maximum voltage and resulted in a SNR comparable to the other pulses. Only the step-off is of interest so the data used was from 0.5 ms before, and 1.5 ms after, the start of the linear ramp-down of the pulse. The idealized Heaviside step-off pulse does not have an adjustable frequency spectrum and therefore this pulse was not set with a cutoff frequency.

undefined