Event encryption: rethinking privacy exposure for neuromorphic imaging

Bio-inspired neuromorphic cameras sense illumination changes on a per-pixel basis and generate spatiotemporal streaming events within microseconds in response, offering visual information with high temporal resolution over a high dynamic range. Such devices often serve in surveillance systems due to their applicability and robustness in environments with high dynamics and harsh lighting, where they can still supply clearer recordings than traditional imaging. Consequently, in privacy-relevant cases, neuromorphic cameras also expose more sensitive data and pose serious security threats. Therefore, asynchronous event streams necessitate careful encryption before transmission and usage. This work discusses several potential attack scenarios and approaches event encryption from the perspective of neuromorphic noise removal, in which we inversely introduce well-crafted noise into raw events until they are obfuscated. Our evaluations show that the encrypted events can effectively protect information from attacks of low-level visual reconstruction and high-level neuromorphic reasoning, and thus feature dependable privacy-preserving competence. The proposed solution gives impetus to the security of event data and paves the way to a highly encrypted technique for privacy-protective neuromorphic imaging.


I. INTRODUCTION
NEUROMORPHIC cameras encode per-pixel illumination changes of dynamic scenes as asynchronous events within microseconds, marking an imaging paradigm shift from conventional modalities that generate temporally sparse, low-dynamic-range images [1]. As such, these novel cameras, which feature high temporal precision and high dynamic range, can faithfully record visual information in environments with fast motion and harsh lighting. Recently, neuromorphic cameras have attracted particular interest for detection and recognition in surveillance. Thorny issues of privacy exposure inevitably arise when dealing with sensitive data. Earlier research held that such an event-based solution offers systematic guarantees for privacy and ethics because it cannot capture visible images [2]. However, advanced processing algorithms now achieve accurate recognition, detection, and reconstruction of gray-scale images from raw event streams. Besides, relative to the frame-based modality, neuromorphic imaging tends to expose more details since it can still offer blur-free recordings of dynamic objects under strong or weak illumination (as shown in Figure 1). The resulting data exposure issues are thus receiving attention from the community.
Neuromorphic privacy challenges were first raised in a recent study [3]. In this work, we discuss more attack scenarios and approach event encryption from another perspective: discrimination between events and noise. It is known that imaging slow-moving objects under low lighting typically results in a great deal of noise that can obscure true visual information, and this noise is also hard to eliminate [4]. As such, encryption can be interpreted as a process of adding well-crafted noise to raw events so that human or machine reasoning fails. Accordingly, we propose a new solution for event encryption. It recursively synthesizes noise with strong spatiotemporal correlation and a high density estimate until the information is fully obfuscated. Such a method does not lead to the loss of the original data, enabling us to conceal sensitive biometric features in encryption and enjoy high-quality visualization after decryption. Our evaluations show that the processed events can defend against attacks from visualization, reconstruction, denoising and high-level reasoning, and thus feature reliable privacy-preserving ability.
Inspired by frame-based image encryption, Du et al. [3] conducted the first investigation of this issue by incorporating two-dimensional chaotic mapping and a key updating mechanism. Compared with that counterpart, our technique enables direct encryption of four-dimensional event streams in an unsupervised manner, and also requires fewer operations for lossless decryption. Our research comprehensively studies the robustness of encrypted events to various neuromorphic analysis attacks. In addition to bringing faithful protection, the simple yet effective encryption is likely to be integrable with some existing noise filters.

II. RELATED WORK
Neuromorphic cameras record local temporal contrast with microsecond temporal precision and over a 100 dB dynamic range, and output asynchronous streams of events in response. Relative to frame-based cameras, they are more adaptive to highly dynamic scenes and less sensitive to illumination settings. Some neuromorphic cameras with an embedded active pixel sensor, such as DAVIS346 [5], can also capture intensity frames at the same spatial resolution. In recent years, researchers have been designing event-driven solutions for various applications, such as auto-focusing [6], optical flow estimation [7] and motion detection [8], or reconstructing high-quality images with the aid of event information [9].
The ongoing revolution of artificial intelligence technologies requires the support of an enormous quantity of visual data, and the resulting privacy concerns have been drawing public attention, especially in scenarios involving sensitive biometrics, such as facial expression tracking [10] and gait recognition [11]. Prior research favors the privacy-preserving capability of neuromorphic cameras over traditional imaging. Nevertheless, existing techniques (attacks) can reconstruct proper representations of events that enable accurate machine or human reasoning. For example, one can purify raw events to make them more informative [12], transform streams into grid-based representations whereby events can be interpreted in the form of images [13], or perform inference on asynchronous events alone [14]. These attacks put tremendous pressure on neuromorphic imaging from the security side.
A variety of encryption solutions have been proposed for concealing sensitive visual information. Chaos-based variants leverage a pseudo-random sequence of numbers to shuffle spatial pixels [15]. Owing to their fast parallel processing, optical techniques (e.g., nonlinear optics [16], metahologram [17]) are adopted in image cryptography. Compressive sensing [18], which is primarily used for data reduction, also contributes to low-cost security enhancements [19]. Targeting neuromorphic imaging, a recent proposal bridged traditional chaotic schemes with polarity flipping [3]. This work continues the discussion on event encryption from a new perspective: neuromorphic noise removal.

A. Preliminary of Neuromorphic Imaging
Neuromorphic cameras react to illumination changes with asynchronous streaming events [1], with each event being represented as

$$e_i = (x_i, t_i, p_i), \tag{1}$$

where $x_i$ is the spatial pixel in which an event, indexed by $i$, is triggered at a certain timestamp $t_i$. The polarity $p_i \in \{-1, +1\}$ denotes the sign of the illumination change. Since a single event carries limited information, we gather a batch of events $E$ over a time period for a more expressive representation of scenes

$$E = \{e_i\}_{i=1}^{I}, \tag{2}$$

where $I = |E|$ is the number of events, and the timestamp increases monotonically with the index. Such an imaging modality is sensitive to various sources of interference, resulting in output with a great amount of noise. Background Activity noise, which comes from junction leakage (in bright scenes) and thermal noise (in dark scenes), appears irregularly in pixels without any activity [12], [20], whereas informative events, which are triggered by the edges of moving objects, typically have strong spatiotemporal correlation and polarity continuation [1]. The nearest-neighbor filter is a simple yet effective solution to discriminate between events and noise [21], which can be formulated as

$$B(i) = \{\, e_j \mid f_x(x_i, x_j) \le T_x,\; f_t(t_i, t_j) \le T_t,\; f_p(p_i, p_j) = 0 \,\}, \tag{3}$$

where $B(i)$ is the collection of the neighbors of an event $e_i$, $f_x$, $f_t$ and $f_p$ are the distance measures associated with the three dimensions of an event, and $T_x$, $T_t$ are case-dependent thresholds. Such a filter only admits the events that are sufficiently close to $e_i$ in space-time while having the identical polarity. Besides, the number of events in a given space is normally far greater than that of noise, leading to more densely clustered regions in which events reside [22].
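To make the nearest-neighbor idea concrete, the following is a minimal Python sketch rather than the filter implementation of [21]. It assumes L1 distances for $f_x$ and $f_t$, equality of polarity for $f_p$, and a hypothetical parameter `min_support` (not part of the original formulation) that sets how many neighbors an event needs in order to be kept. The `Event` type and the threshold values are illustrative assumptions.

```python
from typing import List, NamedTuple


class Event(NamedTuple):
    x: int      # pixel column
    y: int      # pixel row
    t: float    # timestamp (assumed to be in microseconds)
    p: int      # polarity, -1 or +1


def neighbors(i: int, events: List[Event], T_x: int, T_t: float) -> List[Event]:
    """Collect B(i): events close to events[i] in space-time with identical polarity."""
    e = events[i]
    found = []
    for j, q in enumerate(events):
        if j == i:
            continue
        close_in_space = abs(q.x - e.x) + abs(q.y - e.y) <= T_x  # f_x as L1 distance
        close_in_time = abs(q.t - e.t) <= T_t                    # f_t as L1 distance
        same_polarity = q.p == e.p                               # f_p equals zero
        if close_in_space and close_in_time and same_polarity:
            found.append(q)
    return found


def nn_filter(events: List[Event], T_x: int = 1, T_t: float = 1000.0,
              min_support: int = 1) -> List[Event]:
    """Keep events with at least `min_support` neighbors (hypothetical decision rule)."""
    return [e for i, e in enumerate(events)
            if len(neighbors(i, events, T_x, T_t)) >= min_support]


events = [
    Event(10, 10, 0.0, +1),
    Event(11, 10, 200.0, +1),
    Event(10, 11, 350.0, +1),
    Event(40, 5, 300.0, -1),   # isolated event, treated as noise
]
kept = nn_filter(events)
```

In this toy stream the three clustered events support one another and survive, while the isolated event has an empty neighborhood and is discarded. The brute-force loop is quadratic in the number of events and is only meant to illustrate the criterion.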
Nevertheless, it remains challenging for noise filters to estimate correctly when imaging slow-moving objects under low lighting, where the resulting events, similar to noise, have weak spatiotemporal correlation and sparse distribution, and the noise also accounts for a high proportion of the data. In this particular case, visual information is significantly confused. As such, we ask whether event encryption can be regarded as an inverse process of event denoising, where a great amount of well-crafted noise is introduced into clean events until the information is totally obfuscated. In terms of feasibility, well-established image encryption methods are not suited to asynchronous streaming events due to their inability to handle temporal information. We call for a specific algorithm that already has sufficient theoretical support from neuromorphic noise removal. Similar to denoising, unsupervised event encryption without the assistance of auxiliary data can significantly increase practicality. Moreover, as events encode incomplete visual information of scenes, the encryption and decryption should also be lossless to avoid further signal loss. In what follows, we detail our methodology based on the above considerations.

B. Event Encryption with Synthetic Noise
By leveraging the features of events and noise, we propose an encryption algorithm (Algorithm 1) by which we synthesize noise in the given space and disrupt the polarity continuation of the original events via a pseudo-random fashion.
The algorithm processes the raw event stream $E$ into the encrypted one $\tilde{E}$. It first projects $E$ onto a two-dimensional plane $E_x$ that only has spatial information. The mask $M$ represents the pixels in which noise exists, and is initialized by a function $\delta$ that can be either static (e.g., hard-coded rules) or adaptive (e.g., neural networks) as long as $M \cap E_x = \emptyset$. For a position $x_i$ where events exist, we compute its spatial neighbors $R = \theta(x_i, M)$ in the mask, by which the synthetic noise can spatially correlate with the events. The central events $E_c$ contain a set of events triggered at the same position but at different timestamps. Then, given $R$ and $E_c$, we synthesize multiple noise events $N$ by the function $\phi$ such that any one of them has the following attributes where $p$ is randomized, $t$ relates to $t_i$ by the spatial distance between the noise and the event $e_i$ under the constraint that they still temporally correlate with each other, and $\sigma$ is a scaling factor. Equations (4) and (5) enforce strong spatiotemporal correlation between the synthetic noise and the original events. At each $x \in R$, we also allow $|N| = |E_c|$, whereby the noise shares the same density estimate with the events. The algorithm recursively fills the space with noise and terminates when all the pixels in the mask have been traversed. We elaborate more on the processing in Figure 2. Finally, a reversible mapping $\lambda$ disrupts the polarity continuation where $\oplus$ is an XOR gate, and szudzik is a pairing function that uniquely encodes a two-dimensional value into an integer [23]. Since there is a setting $M \cap E_x = \emptyset$ in the algorithm, the key to the decoding can simply be the encrypted where $\rho$ can be any well-established encryption method that can handle a list of numbers. Therefore, in the decoding phase $\tilde{E} \to E$, we can use the key to locate the pixels where true events exist and Equation (6) to retrieve the original polarity.
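As an illustration of the two ingredients named above, the sketch below pairs a pixel coordinate into a single integer with Szudzik's function and uses it to drive an XOR on the polarity bit. It is not the mapping $\lambda$ of the paper: how the paired value is turned into a key bit is an assumption made here (a SHA-256 digest of a hypothetical `secret` concatenated with the paired value), and the names `key_bit` and `flip_polarity` are hypothetical.

```python
import hashlib


def szudzik(a: int, b: int) -> int:
    """Szudzik's pairing: maps each pair of non-negative integers to a unique integer."""
    if a < 0 or b < 0:
        raise ValueError("szudzik pairing is defined for non-negative integers")
    return a * a + a + b if a >= b else a + b * b


def key_bit(x: int, y: int, secret: bytes) -> int:
    """Derive one pseudo-random bit per pixel (assumed construction, not from the paper)."""
    code = szudzik(x, y)
    digest = hashlib.sha256(secret + code.to_bytes(8, "big")).digest()
    return digest[0] & 1


def flip_polarity(x: int, y: int, p: int, secret: bytes) -> int:
    """XOR the polarity bit with the per-pixel key bit; applying it twice restores p."""
    bit = (p + 1) // 2                  # map -1 -> 0 and +1 -> 1
    mixed = bit ^ key_bit(x, y, secret)
    return 2 * mixed - 1                # map back to -1 / +1


secret = b"demo-key"                    # hypothetical key material
encrypted = flip_polarity(3, 7, +1, secret)
restored = flip_polarity(3, 7, encrypted, secret)
```

Because XOR is its own inverse, the same function serves for both encoding and decoding, which is the property that makes a polarity mapping of this kind reversible.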
Fig. 2. Illustrations of our recursive encryption algorithm. There is a two-dimensional plane with 3 × 3 pixels, where multiple events $E_c$ are triggered in the center, and the rest of the pixels are on the mask for synthetic noise.
In the 1st layer of the recursion (left), the algorithm synthesizes the noise $N_i$ ($i = 1, 2, 3, 4$) in 4 spatial neighbors horizontally/vertically adjacent to $E_c$, where $|N_i| = |E_c|$. $N_i \cup E_c$ will be the input of the 2nd layer of the recursion (right). The algorithm, which is blind to N

IV. EXPERIMENTS

A. Experimental Settings
In our experiments on Algorithm 1, the function $\delta$ computes the mask $M = S \setminus E_x$, where $S$ is the camera resolution. The measures $f_x$, $f_t$ and $f_p$ calculate the $L_1$ distance, while the spatial threshold $T_x$ is set to 1. The scaling factor $\sigma$ is configured as 0.05, and the value of the temporal threshold $T_t$ thus varies with the attributes of the events to be encrypted. The evaluations that follow are conducted based on several public datasets [14], [24]-[27]. To alleviate the influence of noise inherent in the original data on task performance, we exclude the samples where noise predominates in quantity. Raw event streams are encrypted using the algorithm with the given settings and then serve as the input of downstream applications (attacks).
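The mask setting above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' code: events are plain `(x, y, t, p)` tuples, the resolution $S$ is treated as the full set of pixel coordinates, and the helper names `spatial_projection`, `build_mask` and `mask_neighbors` are hypothetical.

```python
from typing import Iterable, Set, Tuple

Pixel = Tuple[int, int]
RawEvent = Tuple[int, int, float, int]   # (x, y, t, p)


def spatial_projection(events: Iterable[RawEvent]) -> Set[Pixel]:
    """E_x: the set of pixels where at least one event was triggered."""
    return {(x, y) for x, y, _t, _p in events}


def build_mask(width: int, height: int, occupied: Set[Pixel]) -> Set[Pixel]:
    """M = S \\ E_x: every pixel of the sensor that holds no event."""
    full = {(x, y) for x in range(width) for y in range(height)}
    return full - occupied


def mask_neighbors(pixel: Pixel, mask: Set[Pixel], T_x: int = 1) -> Set[Pixel]:
    """Mask pixels within L1 distance T_x of `pixel`, excluding the pixel itself."""
    px, py = pixel
    return {(x, y) for (x, y) in mask if 0 < abs(x - px) + abs(y - py) <= T_x}


# A 3 x 3 plane with two events at the center pixel
events = [(1, 1, 100.0, +1), (1, 1, 180.0, -1)]
E_x = spatial_projection(events)
M = build_mask(3, 3, E_x)
R = mask_neighbors((1, 1), M)
```

With $T_x = 1$ and the $L_1$ distance, the neighborhood of the center pixel consists of its four horizontally and vertically adjacent pixels, and the mask never overlaps the pixels that carry events.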

B. Attacks from Visualization and Reconstruction
Neuromorphic imaging records scene brightness changes as a set of tuples and was once considered privacy-preserving. Nevertheless, events can be visualized in a two-dimensional plane by counting or accumulation. Moreover, they can also help reconstruct high-dynamic-range images in which more sensitive details are exposed. Figure 3 shows an example based on the DAVIS 240C Datasets [24]. The raw events, which are visualized in the form of an event frame [29], offer more facial information than the underexposed image, and the associated reconstruction [28] can even deliver a high-quality image that exposes sensitive biometrics. Our encrypted data conceal all the informative features and cause the reconstruction to fail, supplying sufficient privacy protection.
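The counting and accumulation mentioned above can be sketched as follows. This is a generic illustration in Python rather than the event-frame representation of [29]; events are assumed to be `(x, y, t, p)` tuples and the function name `event_frame` is hypothetical.

```python
from typing import Iterable, List, Tuple

RawEvent = Tuple[int, int, float, int]   # (x, y, t, p)


def event_frame(events: Iterable[RawEvent], width: int, height: int,
                signed: bool = True) -> List[List[int]]:
    """Project events onto a 2-D grid.

    signed=True  accumulates polarities per pixel (accumulation).
    signed=False counts events per pixel regardless of polarity (counting).
    """
    frame = [[0] * width for _ in range(height)]
    for x, y, _t, p in events:
        frame[y][x] += p if signed else 1
    return frame


events = [(0, 0, 10.0, +1), (0, 0, 20.0, +1), (1, 0, 15.0, -1)]
accumulated = event_frame(events, width=2, height=1)
counted = event_frame(events, width=2, height=1, signed=False)
```

Either grid can be rendered as an image, which is why a raw event stream alone should not be assumed to hide the scene.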

C. Attacks from Neuromorphic Noise Removal
Neuromorphic imaging yields noise in response to interference, and denoising is often used to purify raw events to make them more informative. Our encryption approach aims to fake noise that denoising algorithms cannot suppress. In Figure 4, we study whether the encrypted events can resist attacks from existing methods, such as Zhang et al. [30], Wu et al. [31], Feng et al. [22] and the nearest-neighbor filter (NNf) [21]. For comparison, we use events filled with random noise as a second input set, and keep the signal-to-noise ratio (SNR) of the two input data sets the same (SNR = 1). The figure shows that the denoising methods can remove random noise and raise the SNR, but their effects on the encrypted events are fairly negligible, indicating that they cannot distinguish the true events from our fake noise.
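For readers who want to reproduce the random-noise baseline, the sketch below shows one way to build an input with SNR = 1 and to measure the SNR of a labeled stream. It assumes that the SNR is the ratio of the number of true events to the number of noise events, which matches the "in ratio" wording of Figure 4 but is not spelled out in the text; the uniform noise model, the labeled-tuple layout and the function names are likewise assumptions.

```python
import random
from typing import List, Tuple

RawEvent = Tuple[int, int, float, int]             # (x, y, t, p)
LabeledEvent = Tuple[int, int, float, int, bool]   # (x, y, t, p, is_signal)


def snr(stream: List[LabeledEvent]) -> float:
    """Ratio of true events to noise events (assumed definition)."""
    n_signal = sum(1 for e in stream if e[4])
    n_noise = len(stream) - n_signal
    return float("inf") if n_noise == 0 else n_signal / n_noise


def mix_with_random_noise(signal: List[RawEvent], width: int, height: int,
                          t_max: float, target_snr: float = 1.0,
                          seed: int = 0) -> List[LabeledEvent]:
    """Add uniformly random noise events so that the stream reaches target_snr."""
    rng = random.Random(seed)
    n_noise = round(len(signal) / target_snr)
    noise = [(rng.randrange(width), rng.randrange(height),
              rng.uniform(0.0, t_max), rng.choice((-1, 1)), False)
             for _ in range(n_noise)]
    labeled = [(x, y, t, p, True) for x, y, t, p in signal] + noise
    return sorted(labeled, key=lambda e: e[2])     # keep timestamps monotonic


signal = [(1, 1, 10.0, +1), (1, 2, 20.0, +1), (2, 2, 30.0, -1), (2, 3, 40.0, -1)]
mixed = mix_with_random_noise(signal, width=8, height=8, t_max=50.0)
baseline_snr = snr(mixed)
```

A denoiser can then be scored by computing `snr` on the events it retains: a value above the input SNR means noise was removed, whereas a value close to the input SNR means the filter could not separate the two populations.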

D. Attacks from High-level Neuromorphic Reasoning
One can use advanced techniques to perform high-level reasoning on event streams and then efficiently acquire privacy-relevant information. Evaluated on N-MNIST [25] and ASL-DVS [26], Table I presents the top-1 recognition accuracy on raw events, on raw events with 50% random noise, on the events encrypted by the competitor (Du et al. [3]) and on our encrypted events. The approaches, which are either grid-based [13], [29], [33] or graph-based [32], still perform well when the data are filled with random noise, but they fail at recognition on the encrypted ones. In Figure 5, we diagnose the learning process and the feature responses of a trained network. It shows that the network cannot learn any meaningful information from the encrypted events.
With similar settings on the tested samples, Table II investigates neuromorphic object detection on 1 MEGAPIXEL [14] and GEN1 [27], and Figure 6 gives the visualization. The model can correctly detect a person or a car on the raw events, but it gives wrong results when the input is encrypted. The above experiments show that our encryption can drastically alter the underlying pattern of events and make learning more difficult, and can even degrade performance further than the counterpart does.

V. LIMITATION AND DISCUSSION
While our approach exhibits satisfactory performance in the evaluations described, it is essential to recognize its constraints. The algorithm has at least quadratic time complexity, and its non-deterministic runtime, which is affected by several factors including the sample to be encrypted, the camera resolution and the noise mask used, can increase intolerably when the mask covers a large area of pixels. In addition, the proposed encryption fails to be lossless in the absence of a decryption key, posing a need for further enhancements.

VI. CONCLUSION
Neuromorphic imaging can offer clear recordings of dynamic objects and is thus particularly practical in surveillance systems. However, similar to frame-based modalities, it also suffers from thorny privacy exposure issues. To enhance the security level of event data, we develop an encryption method that synthesizes pseudo-random noise such that all the visual information is obfuscated. Evaluations show that our encrypted events can resist attacks from visualization, reconstruction, denoising and high-level reasoning, and thus have robust privacy-preserving ability. Nevertheless, in terms of execution time and energy consumption, the overhead of computing on encrypted data has not been fully investigated in this work. This emerging field still requires more attention and further research.

Fig. 1. (a) When the intensity image (upper left) suffers from heavy motion blur (upper right) or is overexposed (lower left) or underexposed (lower right), it loses significant details and cannot deliver informative visualization. (b) Neuromorphic cameras are robust to harsh illumination conditions, such that the corresponding events still have clear recordings with high-temporal-precision and high-dynamic-range states, where sensitive information is also prone to be exposed.

4 )
based on the adjacent events.

Fig. 3. Attacks from grid-based visualization of event streams. (a) The raw events from a neuromorphic camera can capture some dark details that are lost in (b) the underexposed image. (c) The CF [28] bridges (a) and (b) to reconstruct an image with high-dynamic-range states, where sensitive face features (with a zoom-in view) are recovered and exposed. (d) The encrypted events processed by our algorithm conceal all the information. (e) The failed reconstruction by the CF when only the encrypted events are used. (f) The CF uses (d) and (b) to reconstruct an image, where the face features are obscured and protected.

Fig. 4. SNR (in ratio) comparisons on denoising two kinds of event data.

Fig. 5. (a) Training accuracy as the number of epochs increases. When a network makes inference on event frames, the feature responses of (b) raw and (c) encrypted events. The sample is from Letter A of ASL-DVS.
VII. ACKNOWLEDGMENTS
This work was supported in part by the Research Grants Council of Hong Kong SAR (GRF 17201620, 17200321) and by ACCESS - AI Chip Center for Emerging Smart Systems, sponsored by InnoHK funding, Hong Kong SAR. The authors have confirmed that any identifiable participants in this study have given their consent for publication.

Fig. 6. Pedestrian detection results of (a) raw events and (b) our encrypted events. The sample is from the 1 MEGAPIXEL Dataset.

TABLE I. RECOGNITION ACCURACY ON RAW EVENTS, ON RAW EVENTS WITH 50% RANDOM NOISE, ON THE ENCRYPTED EVENTS BY THE COUNTERPART AND ON OUR ENCRYPTED EVENTS. THE SYMBOL ↓ REPRESENTS THE DROP WE ACHIEVE IN COMPARISON WITH THE COUNTERPART.