Breaking boundaries in radiology: redefining AI diagnostics via raw data ahead of reconstruction

Objective. In the realm of utilizing artificial intelligence (AI) for medical image analysis, the paradigm of 'signal-image-knowledge' has remained unchanged. However, the 'signal to image' process inevitably introduces information distortion, ultimately leading to irrecoverable biases in the 'image to knowledge' process. Our goal is to skip reconstruction and build a diagnostic model directly from the raw data (signal). Approach. This study focuses on computed tomography (CT) and its raw data (sinogram) as the research subjects. We simulate the real-world process of 'human-signal-image' using the workflow 'CT-simulated data-reconstructed CT,' and we develop a novel AI predictive model directly targeting raw data (RCTM). This model comprises orientation, spatial, and global analysis modules, embodying the fusion of local-to-global information extraction from raw data. We retrospectively selected 1994 patients with solid lung nodules and modeled the different types of data. Main results. We employed predefined radiomic features to assess the diagnostic feature differences caused by reconstruction. The results indicated that approximately 14% of the features had Spearman correlation coefficients below 0.8. These findings suggest that despite the increasing maturity of CT reconstruction algorithms, they still perturb diagnostic features. Moreover, our proposed RCTM achieved an area under the curve (AUC) of 0.863 in the diagnosis task, comprehensively outperforming models constructed from secondarily reconstructed CTs (0.840, 0.822, and 0.825). Additionally, the performance of RCTM closely resembled that of models constructed from original CT scans (0.868, 0.878, and 0.866). Significance. Diagnosis and treatment based directly on CT raw data can enhance the precision of AI models, and the 'signal-to-image' concept can be extended to other types of imaging. AI diagnostic models tailored to raw data offer the potential to disrupt the traditional paradigm of 'signal-image-knowledge', opening up new avenues for more accurate medical diagnostics.


Introduction
Since Wilhelm Röntgen's discovery of x-rays in 1895, medical imaging technology has undergone rapid development and become an integral part of modern healthcare services (Editors 2000). Regardless of imaging principles, modalities, equipment, and interpretation methods (radiologists/human intelligence, artificial intelligence [AI]), the paradigm of 'signal-image-knowledge' has remained unchanged (figure 1). In this paradigm, the 'signal to image' reconstruction step is an inherently lossy compression of the source signal. Even though obtaining visually interpretable images through reconstruction is imperative for human comprehension, operations in this process such as interpolation and suboptimal statistical weighting still lead to irrecoverable disparities between the sensor domain and the image domain (Wang et al 2020). From this perspective, owing to the limitations imposed by human visual interpretation of images, a considerable amount of potentially clinically diagnostic information remains untapped.
Many researchers have recognized this issue and have begun utilizing machine learning or deep learning to optimize the reconstruction process, aiming to reduce these discrepancies (Yin et al 2022). These studies treat the reconstruction process itself as a prediction problem, taking the sinogram of computed tomography (CT) or the k-space signal of magnetic resonance imaging (MRI) as input (Ravishankar et al 2019). While these methods indeed hold promise for accelerating acquisition, adapting to under-sampled sensor data, enhancing contrast-to-noise ratios, improving resolution, and even reducing the required contrast agent dose, it remains challenging to meet the demands for the geometric shapes, subjects, and data 'completeness' upon which reconstruction relies. AI excels at automatically discovering key features relevant to prediction targets from raw and highly interconnected data (Tutsoy and Polat 2022, Zhang et al 2023). It has demonstrated capabilities in numerous clinical applications that match or even surpass human abilities (Tutsoy and Tanrikulu 2022). The fundamental reason AI can transcend humans might be that it views images as data rather than just visual pictures, and takes a data-centric approach of extracting numerous features for analysis (Gillies et al 2016, Mu et al 2022). Therefore, some scholars have asked: why not directly build a signal-to-knowledge mapping (Chung et al 2021)?
Analyzing raw data is not a completely novel concept. Currently, utilizing deep learning techniques for the direct analysis of raw gene data has become a hot research topic (Cosentino et al 2023). The idea of bypassing the traditional image processing workflow and directly extracting knowledge from signals was first proposed in 2016 (Wang 2016). From this insight, subsequent studies have focused on the potential value of raw data analysis in the process from signal to knowledge. A simulation study demonstrated the feasibility of using neural networks to identify and estimate the centerline of blood vessels from the signal domain by inserting vessels into a phantom (De Man et al 2019). In another study, CT images were simulated back to raw data, and a convolutional neural network (CNN) was used to learn and extract effective features from the raw data, preliminarily verifying the effectiveness of signal-domain diagnosis (Gao et al 2019). However, both of the abovementioned studies have limitations: the former used non-clinical data, and the latter compared the raw data model against a CT image model, both of which had already lost information before reconstruction. Differing from the above-mentioned studies, some researchers concatenated the reconstruction network, detection network, and diagnostic network, conducting end-to-end training (Wu et al 2018). The results demonstrated the superiority of end-to-end optimization, but this study did not skip the 'signal-to-image' process, indicating that there is still room for optimization. Furthermore, we previously performed empirical studies using sinograms from 276 patients in authentic clinical scenarios, and our results showed that the integration of unprocessed raw data greatly improves the performance of CT models (He et al 2023). In summary, research dominated by CT images is still performed in the context of partial information loss, and the information distorted in the reconstruction process cannot be recovered. Currently, there is a lack of extensive research covering a complete 'ground truth-signal-image' process on a large dataset. To address this limitation, we take as a related reference the work on the MRI simulation reconstruction framework (Zhu et al 2018), and we treat CT images as the ground truth, replicating real signal acquisition and reconstruction scenarios. This approach not only simulates real clinical processes but also enhances the comprehensiveness of the experimental framework. Within this framework, our objective is to verify the suitability of raw data for AI-driven analysis by characterizing and modeling raw data.
In this study, we adopted the concept of manifold learning, employing a strategy of acquiring raw data through the simulation of real CT images. Subsequently, we performed secondary reconstruction on the raw data to simulate the process of obtaining real medical images. In this scheme, real CT images were treated as stand-ins for the human body, while the simulated raw data and secondarily reconstructed CT images were analogized to actual raw data and clinical CT images. Within this framework, our target clinical problem was to predict the malignancy or benignity of solid nodules. We introduced an AI model, termed RCTM, which directly models predictive diagnostic knowledge from raw data and lesion location. Experimental results reveal that RCTM demonstrates outstanding predictive capability, exceeding the accuracy of all models trained on secondarily reconstructed CTs and producing results on par with models constructed using real CT data. Consequently, our research disrupts the diagnostic and therapeutic paradigm based on medical imaging, opening up new possibilities for disease diagnosis.

Patients
A total of 2474 patients who underwent surgical resection for lung lesions presenting as radiologically pure-solid pulmonary nodules, 8 mm to 30 mm in size, at Shanghai Pulmonary Hospital between January 2011 and December 2014 were included in this study. Exclusion criteria comprised the following: (1) absence of low-dose CT data; (2) excessive CT slice thickness; (3) presence of multiple lesions; (4) history of malignant tumors; (5) prior receipt of neoadjuvant therapy; (6) presence of typical benign or malignant radiological indicators; (7) lesions extending beyond the simulated projection area. Ultimately, a cohort of 1997 patients with complete clinical information, pathological information, and CT images was retained for analysis and divided into training, validation, and test cohorts at a 14:3:3 ratio by random stratified sampling. The flowchart of the inclusion and exclusion criteria is presented in figure S1.
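For concreteness, a minimal sketch of the 14:3:3 (i.e. 70%/15%/15%) stratified split with scikit-learn is shown below; the `labels` array (per-patient benign/malignant status) and the random seed are illustrative assumptions, not the authors' code.

```python
# Hypothetical sketch of the 14:3:3 random stratified split (70%/15%/15%).
# `labels` holds the benign/malignant status of each of the 1997 patients.
import numpy as np
from sklearn.model_selection import train_test_split

idx = np.arange(len(labels))
train_idx, rest_idx = train_test_split(
    idx, test_size=0.30, stratify=labels, random_state=0)
val_idx, test_idx = train_test_split(
    rest_idx, test_size=0.50, stratify=labels[rest_idx], random_state=0)
```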

Preparation of CT image and raw data
For the CT images, the region of interest (ROI) was delineated with bounding-box annotations by two junior thoracic surgeons, MZ (4 years of clinical experience) and HH (6 years of clinical experience). The accuracy and consistency of these annotations were then validated by a senior thoracic surgeon (QC, with 30 years of clinical practice). All CT scans were truncated so that pixel intensity values were confined to the interval of −400 to 1600 Hounsfield units (HU); this measure counteracts potential perturbation from outliers. Subsequently, a z-score normalization, anchored in the mean and variance computed from the training cohort, was applied to all images to facilitate model training.
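A minimal sketch of this preprocessing, assuming the training-cohort mean and standard deviation have already been computed:

```python
import numpy as np

def preprocess_ct(volume_hu, train_mean, train_std):
    """Clip to [-400, 1600] HU, then z-score with training-cohort statistics."""
    clipped = np.clip(volume_hu, -400.0, 1600.0)
    return (clipped - train_mean) / train_std
```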
For the simulation, the cone-beam CT raw data (sinogram) were generated using Siddon's algorithm (Siddon 1985), which computes the intersection length of a thin ray with each voxel. The following geometric parameters were used: the arc detector array was assumed to have dimensions of 736 × 512; the size of each detector unit was 1.2858 × 1.0947 mm²; the source-to-isocenter and source-to-detector distances were 490.5999 mm and 1085.5999 mm, respectively; the number of projections per complete rotation was 360; and the maximum cone angle was set to 28.8943°. The corresponding image was then reconstructed by the FDK algorithm with a ramp kernel, which has no tunable hyperparameters, to mitigate the influence of external factors.
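As an illustration only, a geometry of this kind can be approximated with the open-source ASTRA toolbox. Note the assumptions: ASTRA's 'cone' geometry models a flat detector rather than the arc detector assumed above, the phantom is a placeholder, and voxel-size scaling is omitted, so this is a sketch of the stated parameters rather than the authors' simulation code.

```python
import numpy as np
import astra  # GPU-accelerated tomography toolbox (assumed available)

vol = np.zeros((64, 256, 256), dtype=np.float32)  # placeholder CT volume (z, y, x)
vol_geom = astra.create_vol_geom(vol.shape[1], vol.shape[2], vol.shape[0])

angles = np.linspace(0, 2 * np.pi, 360, endpoint=False)  # 360 projections per rotation
proj_geom = astra.create_proj_geom(
    'cone',
    1.2858, 1.0947,        # detector unit size (mm)
    512, 736,              # detector rows x columns
    angles,
    490.5999,              # source-to-isocenter distance (mm)
    1085.5999 - 490.5999)  # isocenter-to-detector distance (mm)

# Forward projection: simulate the cone-beam sinogram
sino_id, sinogram = astra.create_sino3d_gpu(vol, proj_geom, vol_geom)

# Secondary reconstruction with FDK (ramp filtering is the default)
rec_id = astra.data3d.create('-vol', vol_geom)
cfg = astra.astra_dict('FDK_CUDA')
cfg['ReconstructionDataId'] = rec_id
cfg['ProjectionDataId'] = sino_id
alg_id = astra.algorithm.create(cfg)
astra.algorithm.run(alg_id)
recon = astra.data3d.get(rec_id)
```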

ROI mapping of raw data and CT
The complete raw data contain a significant amount of redundant information, which burdens model training. Therefore, the key to modeling lies in extracting the lesion-area information in the signal domain and transforming it into a suitable structure. In other words, our primary approach is to use spatial voxels as anchors, extract the pixels related to their reconstruction from the projection data, and model these directly to predict diagnostic outcomes. In the experiment, the first step is to standardize the dimensions of the input data. To meet this requirement, we calculated the length, width, and height of the lesion (ROI) in millimeters (l_x, l_y, l_z) and sampled the region at a fixed spatial resolution of 32 × 32 × 32 (H × W × D) voxels. As each voxel is reconstructed from a set of projections, we introduce the frequency dimension (F) to extract the pixels contributing to this voxel from the projection data at different angles, creating a lesion matrix (F × H × W × D). It is important to note that although the matrix contains lesion information, it still lacks information on the size and spatial position of the lesion. Therefore, we took the rotation center as the origin and computed the relative position (x, y, z) of the lesion's center point, which was then concatenated with the lesion's size for modeling.
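A hedged sketch of this voxel-anchored extraction is given below. It assumes a flat detector, an ideal circular trajectory, and nearest-pixel lookup (the arc-detector geometry and any interpolation scheme would modify the details), and all function and variable names are illustrative.

```python
import numpy as np

SID, SDD = 490.5999, 1085.5999  # source-isocenter / source-detector distances (mm)
DU, DV = 1.2858, 1.0947         # detector pixel pitch (mm)

def project_point(p, theta):
    """Detector (u, v) in mm of a point p = (x, y, z) in mm, isocenter at the origin."""
    c, s = np.cos(theta), np.sin(theta)
    x, y = c * p[0] + s * p[1], -s * p[0] + c * p[1]  # rotate into the source frame
    mag = SDD / (SID + y)                             # cone-beam magnification
    return x * mag, p[2] * mag

def lesion_matrix(sino, roi_points, angles):
    """Build the F x H x W x D lesion matrix.

    sino: (F, rows, cols) sinogram; roi_points: (32, 32, 32, 3) voxel centres (mm).
    """
    out = np.zeros((len(angles), 32, 32, 32), dtype=np.float32)
    for f, th in enumerate(angles):
        for idx in np.ndindex(32, 32, 32):
            u, v = project_point(roi_points[idx], th)
            col = int(round(u / DU + sino.shape[2] / 2))  # nearest detector column
            row = int(round(v / DV + sino.shape[1] / 2))  # nearest detector row
            out[(f,) + idx] = sino[f, row, col]
    return out
```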

Overview of RCTM
The proposed RCTM takes sinogram data, i.e. the patient's CT signal data, as input and outputs a lung cancer diagnosis, as shown in figure 2. The model consists of the orientation analysis module (OAM), the spatial analysis module (SAM), and the global analysis module (GAM), which extract diagnostic features at the basic, lesion, and environment levels, respectively. All structural parameters of the RCTM are described in detail in Supplement A.

Orientation analysis module
The primary function of the orientation analysis module (OAM) is to compensate for the absence of lesion size and spatial position information in the lesion matrix. In its design, we first define the rotation center as the origin. The central position of the ROI is then computed to represent the location of the lesion relative to the origin, denoted by x, y, and z, signed and in millimeters. Additionally, we calculate the length, width, and height (l_x, l_y, l_z) of the lesion from the ROI. Concatenating the lesion's size and spatial position yields a six-dimensional feature vector (x, y, z, l_x, l_y, l_z), which serves as the input to this module. Structurally, the OAM consists of hidden layers with 64 and 32 nodes. The input vector first undergoes dimensionality expansion through the 64-node hidden layer, deepening the nonlinear fitting to achieve a higher-dimensional fused representation, and is then fused through the 32-node hidden layer to obtain the orientation feature vector.
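A minimal PyTorch sketch of the OAM as described; the ReLU activations are an assumption, since the nonlinearity is not stated here.

```python
import torch.nn as nn

class OAM(nn.Module):
    """Orientation analysis module: (x, y, z, l_x, l_y, l_z) -> orientation feature."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(6, 64), nn.ReLU(),   # dimensionality expansion (assumed ReLU)
            nn.Linear(64, 32), nn.ReLU())  # fusion to the orientation feature vector

    def forward(self, x):   # x: (batch, 6)
        return self.net(x)  # (batch, 32)
```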

Spatial analysis module
The spatial analysis module (SAM) is a crucial component of RCTM, primarily responsible for analyzing the lesion matrix extracted from the sinogram data. First, we treat the frequency dimension (F) as channels and calculate maximum and average attention over this dimension using a channel attention mechanism, then merge the two with a summation operator. The attention mechanism is a technique that makes the model focus on the most informative components of the signal, and it has proved effective in various scenarios such as sequence learning as well as localization and understanding in images (Hu et al 2018). The channel attention mechanism used in this study recalibrates the features obtained from the different angles: it learns to use global information to emphasize important angle features and suppress irrelevant ones. Specifically, it first compresses the feature maps through global average pooling and global max pooling, then reduces and restores their dimensions through encoding and decoding layers, and finally reallocates weights and merges the feature maps via the Softmax function. Subsequently, the weighted data undergo frequency-dimension fusion through a 1 × 1 convolution and feature extraction by a 3D-DenseNet with a feature pyramid network (FPN) structure (Huang et al 2017, Lin et al 2017). For the DenseNet, the number of dense layers within each block is reduced to 6 to lower the model's parameter count and ease training. The FPN was originally proposed for image object detection; it combines low-resolution, semantically strong features with high-resolution, semantically weak features through a top-down pathway and lateral connections, further addressing the issue of varying lesion feature scales. In detail, the features output by each dense block are reduced by global average pooling and resized by trilinear interpolation to maintain consistent dimensions. Finally, features from different scales are concatenated and fused via a fully connected network; after the ultimate fusion with a multi-layer perceptron, a 128-dimensional spatial feature vector is obtained.
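The frequency(channel)-attention step can be sketched as follows, under stated assumptions: a shared encode-decode bottleneck with a reduction ratio of 4 (not specified here), softmax weight allocation per branch, and summation to merge the average- and max-pooled branches.

```python
import torch
import torch.nn as nn

class FrequencyAttention(nn.Module):
    """Recalibrate per-angle (frequency) channels of the lesion matrix."""
    def __init__(self, n_freq, reduction=4):  # reduction ratio is an assumption
        super().__init__()
        self.codec = nn.Sequential(            # encoding/decoding bottleneck
            nn.Linear(n_freq, n_freq // reduction), nn.ReLU(),
            nn.Linear(n_freq // reduction, n_freq))

    def forward(self, x):  # x: (B, F, H, W, D)
        b, f = x.shape[:2]
        avg = self.codec(x.mean(dim=(2, 3, 4)))  # global average pooling branch
        mx = self.codec(x.amax(dim=(2, 3, 4)))   # global max pooling branch
        # Softmax weight allocation per branch, merged by summation
        w = torch.softmax(avg, dim=1) + torch.softmax(mx, dim=1)
        return x * w.view(b, f, 1, 1, 1)         # reweighted lesion matrix
```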

Global analysis module
To further enhance the model's performance, we trained on the complete sinogram data through the global analysis module (GAM). In the experiments, we first resampled the raw data to half their original dimensions to reduce the computational load. We then employed a 3D-DenseNet for feature map extraction, whose three-dimensional convolution kernels effectively fuse high-dimensional information from adjacent angle projections. Notably, as with the DenseNet in SAM, the backbone network also sets the number of dense layers within each dense block to 6. Following this, all obtained feature maps were serialized and input into a gated recurrent unit (GRU), which specializes in analyzing sequence data, to further integrate information from the different angle projections (Chung et al 2014). The GRU contains 256 hidden-layer nodes, and its output is taken as the global feature vector. It should be noted that the combination of CNN and GRU is not uncommon; its efficient and precise feature fusion has been demonstrated in algorithms such as R-MVSNet for deep three-dimensional reconstruction (Yao et al 2019). For the complete RCTM, we concatenated the orientation, spatial, and global feature vectors and, after encoding through a 64-node hidden layer, used a linear classification layer to determine the patient category.
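A hedged sketch of the GAM's sequence-fusion step is shown below; how the backbone's feature maps are serialized is not fully specified, so the per-angle pooling used here is an assumption.

```python
import torch.nn as nn

class GlobalFusion(nn.Module):
    """Fuse per-angle backbone features with a 256-unit GRU."""
    def __init__(self, feat_dim):
        super().__init__()
        self.gru = nn.GRU(feat_dim, 256, batch_first=True)

    def forward(self, fmaps):            # fmaps: (B, C, F, H', W') from the 3D-DenseNet
        seq = fmaps.flatten(3).mean(-1)  # pool each angle slice -> (B, C, F)
        seq = seq.permute(0, 2, 1)       # one sequence step per projection angle
        _, h = self.gru(seq)
        return h[-1]                     # (B, 256) global feature vector
```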

Compare models and training scheme
To investigate the performance of the RCTM, we further compared our framework with existing CT-based methods for the diagnosis of benign and malignant pulmonary nodules. Three methods were included for comparison: (1) MV-KBC (Xie et al 2018), (2) MTMR-Net (Liu et al 2019), and (3) MSCS (Xu et al 2020). MV-KBC is a knowledge-driven multi-view collaborative framework that acquires multi-view images of lung nodules and uses transfer learning to improve diagnostic accuracy. MTMR-Net introduced a multi-task paradigm incorporating a margin ranking loss to improve feature extraction and achieve state-of-the-art performance. MSCS utilizes a 3D-CNN to learn multi-level contextual features from multi-scale lung nodule images, improving the feature representation of lung nodules.
Given the significant parameter count of the entire model, end-to-end training risks overfitting. We therefore adopt a phased approach to construct the RCTM: (1) training of SAM: in this phase, we directly connect the features from the SAM to a linear classification layer for benign-malignant diagnosis of nodules; (2) training of OAM and SAM: we concatenate the features of the OAM with those of the SAM; keeping the SAM frozen, we train the OAM and the classification layer, and once the results approach stability, we optimize the weights of both modules jointly; (3) training of GAM: we first model the GAM independently to acquire a holistic representation of nodule information, then concatenate the features from all three modules and train only the classification layer to improve model outcomes; (4) global optimization: employing a smaller learning rate and a larger weight decay, we perform end-to-end optimization of all RCTM weights. In each step, cross-entropy and AdamW (Loshchilov and Hutter 2017) are chosen as the loss function and optimizer. Model hyperparameters are chosen via grid search, with validation cohort results serving as the criterion for model selection. A sketch of one training phase is given below.
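As an illustration of phase (2), the sketch below freezes the SAM and trains the OAM plus the classification layer with AdamW and cross-entropy, as stated; the learning rate and the module/loader names (`sam`, `oam`, `classifier`, `loader`) are assumptions.

```python
import torch
import torch.nn as nn

for p in sam.parameters():
    p.requires_grad = False  # phase 2: SAM stays frozen

optimizer = torch.optim.AdamW(
    list(oam.parameters()) + list(classifier.parameters()), lr=1e-4)  # lr assumed
loss_fn = nn.CrossEntropyLoss()

for lesion_mat, orient_vec, label in loader:
    feats = torch.cat([sam(lesion_mat), oam(orient_vec)], dim=1)  # concatenate features
    loss = loss_fn(classifier(feats), label)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```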

Quantitative evaluation of differences in the reconstruction process
Reconstruction can compromise information pivotal for clinical diagnosis. To quantify the degree to which diagnostic information is diminished, we assessed disparities in salient diagnostic features between the original and reconstructed CT images. First, features were extracted from both pre- and post-reconstruction CT images via the pyradiomics platform (Van Griethuysen et al 2017), a tool extensively adopted in radiomics. Next, we identified the features most significant for differentiating benign from malignant lung nodules in the original CT images via univariate analysis; such features carry vital diagnostic insights in radiomics investigations and are integral to AI analysis. Finally, we employed Spearman correlation analysis to ascertain the magnitude of the alterations these features underwent after reconstruction (Xiao et al 2016). A larger disparity in these features directly implies a more pronounced erosion of crucial diagnostic content during the reconstruction phase.
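A minimal sketch of this pipeline with pyradiomics and SciPy; the file paths and per-patient containers are illustrative.

```python
from radiomics import featureextractor
from scipy.stats import spearmanr

extractor = featureextractor.RadiomicsFeatureExtractor()

def radiomic_features(image_path, mask_path):
    """Extract features, dropping pyradiomics' diagnostic metadata."""
    result = extractor.execute(image_path, mask_path)
    return {k: v for k, v in result.items() if not k.startswith('diagnostics')}

# orig[i][name] / recon[i][name]: feature `name` for patient i on the
# original and secondarily reconstructed CT, respectively (illustrative).
rho, p = spearmanr([orig[i][name] for i in range(n_patients)],
                   [recon[i][name] for i in range(n_patients)])
```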

Statistical analysis
All statistical analyses were performed in Python (v3.9.16). The Mann-Whitney U test was used to compare the distribution of continuous variables across cohorts, and the chi-square test was used for categorical variables. Spearman correlation analysis was used to evaluate the correlation between features. All tests were two-sided, and P < 0.05 was considered statistically significant.
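The reported tests map directly onto SciPy, as in the minimal sketch below; the arrays and contingency table are illustrative.

```python
from scipy.stats import mannwhitneyu, chi2_contingency, spearmanr

u_stat, p_cont = mannwhitneyu(age_benign, age_malignant)  # continuous variables
chi2, p_cat, dof, _ = chi2_contingency(sex_table)         # categorical variables
rho, p_corr = spearmanr(feature_a, feature_b)             # feature correlation
```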

Clinical characteristics
Among the 1997 participants recruited, 1385 presented with pathologically malignant lung nodules, while 612 exhibited benign lung nodules. Notably, the benign nodule cohort was significantly younger (P < 0.001) and had a smaller maximum nodule diameter (P < 0.001) than the malignant cohort. However, no discernible disparities were observed between the benign and malignant groups concerning sex, smoking history, or nodule location. Comprehensive statistical outcomes are delineated in table 1.

Quantitative difference assessment through radiomic features
Before conducting the experiments, it is crucial to assess the changes in diagnostic information caused by the reconstruction process, and extracting predefined image features serves as a valuable approach. We extracted 1409 features from both the actual CT scans and the CT scans subjected to secondary reconstruction. After univariate screening, a total of 1160 features were found to be significantly correlated with benign-malignant classification (P < 0.05). We conducted cluster analysis on these features and generated a heatmap to demonstrate the potential of the filtered features to distinguish between benign and malignant cases (figure S2).
To visualize the feature variations, we calculated the means of the features extracted from the two types of CT scans and arranged them in ascending order, as depicted in figure 3(a). Notably, 16 features exhibited no mean difference, indicating that the reconstruction process had minimal impact on them. Furthermore, we divided all samples into benign and malignant groups and assessed the feature differences for each group, as illustrated in figure 3(b). Within the benign group, 65.6% of features demonstrated Spearman correlation coefficients exceeding 0.9, slightly below the 69.9% observed in the malignant group. This discrepancy might arise from the benign group encompassing benign nodules and infections with differing imaging signs. With a coefficient cutoff of 0.8, 86% of features in both the benign and malignant groups met the stability requirement. Finally, both groups had 2.8% of features with correlations below 0.6, yet all P-values indicated significant correlation (P < 0.05). In summary, the reconstruction process does influence the expression of image features relevant to diagnosis, and this presents an opportunity for enhancing model accuracy.

Diagnostic performance of RCTM
The purpose of RCTM is to explore the potential of mining information directly from the raw data and lesion location to predict benign versus malignant lung nodules, and to compare its performance against models built on the original CT and the reconstructed CT data. As shown in table 2, RCTM achieved AUCs of 0.914, 0.867, and 0.863 on the training (table S4), validation (table S5), and test cohorts, respectively. Among the models built on the original CT data, the AUCs of MV-KBC, MTMR-Net, and MSCS on the test cohort were 0.868, 0.878, and 0.866; the AUC of RCTM was only slightly lower, by about 0.003-0.015. Correspondingly, the AUCs of the three models based on the reconstructed CT data were 0.840, 0.822, and 0.825 in the test cohort. The classification performance of RCTM was higher than that of all models based on the reconstructed CT, improving the AUC by 0.023-0.041. For the selection of RCTM structural parameters, please refer to Supplement B.

Stratified analysis and ablation study
To thoroughly scrutinize the efficacy of deep learning models built on raw data, we assessed their performance across diverse subgroups. For comparison, the leading CT model (MV-KBC) based on reconstructed CT images was similarly appraised across these subgroups (table 3). The results for the training and validation cohorts are documented in tables S6 and S7. The experimental findings indicated that the RCTM model, grounded in raw data, exhibited consistent performance across the subgroups (table 4). It demonstrated enhanced efficacy particularly among older individuals, females, those with a smoking history, and nodules located in the left lung lobe. Interestingly, MV-KBC mirrored this trend concerning age, sex, and smoking history. It is salient to highlight that while RCTM and MV-KBC exhibited congruous outcomes in the older age bracket, RCTM's performance markedly outpaced MV-KBC in younger individuals (AUC, 0.827 versus 0.800). A similar trend was discernible in the male cohort (AUC, 0.839 versus 0.804). Furthermore, for nodule location, RCTM and MV-KBC demonstrated their best accuracies in the left and right lobe subgroups, respectively, suggesting potential distinctions in the features each model prioritized. In a subsequent phase, we conducted ablation studies on the proposed RCTM model, elucidating the impact of the various strategies on its final performance (table 5). Our investigations revealed that each module improves the AUC by about 0.03-0.04. Incorporating all three strategies, the final RCTM model achieved peak predictive performance across most evaluated metrics.

Discussion
In this study, we employed a combined approach of simulation and reconstruction to emulate the real-world process of image acquisition and reconstruction. Through the extraction and reconstruction of raw data, we introduced, for the first time, an independent AI model oriented towards raw data and lesion location, validated its potential value in clinical diagnosis and treatment, and showcased its potential to alter and enhance the existing paradigm of vision-based diagnostic methods. Furthermore, we quantitatively assessed the changes in imaging characteristics introduced by the reconstruction process, using predefined radiomic features to contrast the features of real CT with those of secondarily reconstructed CT. The results indicate that while the majority of radiomic features exhibit considerable similarity (Spearman correlation coefficient >0.8), 2.8% of imaging features have Spearman correlation coefficients below 0.6. This observation to some extent reflects the maturity of current CT reconstruction algorithms, yet it also highlights the inevitable impact of the reconstruction process on the expression of diagnostic information. Therefore, modeling directly from signal to knowledge holds significant research value.
Accurately classifying benign and malignant pulmonary nodules is essential, since it can help improve the chances of curing lung cancer. In this study, we investigated the feasibility of using the raw data prior to CT reconstruction to predict benign and malignant pulmonary nodules. The results show that the raw data discriminate well between malignant and benign nodules: the AUCs of the RCTM in the training, validation, and test cohorts were 0.914, 0.867, and 0.863, respectively. We further compared the performance of RCTM with CT-based and reconstructed-CT-based models; the AUC of RCTM was higher than that of all reconstructed-CT-based models and comparable to that of the CT-based models. These results demonstrate that mining information from raw data that has not been reconstructed into CT to predict benign and malignant lung nodules is not only achievable but can yield competitive results. This is a new paradigm of medically aided diagnosis, completely different from previous ones. Moreover, our subgroup analysis reveals steadfast performance of the RCTM, developed from raw data, across various subgroups. Notably, the proposed RCTM surpassed the model derived from reconstructed CT, especially within the younger and male subgroups, suggesting a potential edge for the RCTM model's application in these demographics. Furthermore, discrepancies arose between the performance of the RCTM and the model predicated on reconstructed CT in the left and right lobe subgroups. Such variance may hint that the raw-data-driven model captures nuances potentially overlooked by its reconstructed CT counterpart.
Our study has several limitations. Firstly, although our study comprehensively validates the value of diagnosing directly from raw data, evaluating this method on real raw data is still necessary; after all, real raw data include the noise generated by the equipment during acquisition. Secondly, while our study employed axial scanning for simulation, exploring a raw data model suitable for the prevalent helical scanning mode is still necessary. Thirdly, while our constructed method has shown improvement, it is not yet fully optimized. Fourthly, RCTM relies on positional information obtained from the reconstructed CT; developing a lesion detection network that works on raw data is deemed essential. In future research, we will continue developing and refining the current model to harness the full potential of raw data.
The current research on raw data still requires continuous iteration and improvement in policies and technologies from government, academic institutions, and industry; it is a long-term issue that needs to be addressed. The value of studying raw data is not limited to the improvement in accuracy gained by avoiding the reconstruction process; it also eliminates the differences in image quality caused by different manufacturers using different calibration and reconstruction algorithms. This advantage not only encourages researchers to collect larger datasets but also contributes to enhancing the generalization of AI medical models. However, providing relevant data requires academic institutions and industrial companies to sign comprehensive data sharing and usage agreements. Additionally, privacy protection is crucial for raw data analysis, and the collection and use of raw data must be protected by existing or future privacy legislation, such as the Health Insurance Portability and Accountability Act (HIPAA) in the United States (Wang et al 2022). Finally, raw data analysis does not aim to eliminate image generation, and visualization remains essential. In the future, researchers using dual-mode data analysis can not only retain visualized images but also achieve higher-precision AI diagnostic models. Regarding the technical prospects of raw data analysis, how to utilize AI techniques to construct implicit representations of images is a crucial technology that cannot be ignored. Neural radiance fields (NeRF) are a significant representative of implicit representation, having achieved remarkable success in natural image research. NeRF can overcome the spatial resolution limitations imposed by traditional voxel-based representations. Therefore, our future focus is on integrating medical imaging and applying it to diagnostic models within a scene.

Conclusion
In summary, we propose a diagnostic model for CT raw data following the 'signal-to-knowledge' approach. This model exhibits superior performance compared to CT models based on secondary reconstruction, even achieving results comparable to those of the original CT model. This implies that analyzing diagnostic knowledge directly from raw signals is feasible, and this direction can facilitate the advancement of intelligent medical imaging scanning.

Figure 1. Traditional image acquisition and analysis processes, and the new way of raw-data analysis beyond reconstruction.

Figure 2. RCTM model structure. The model consists of three modules: the orientation analysis module, the spatial analysis module, and the global analysis module.

Figure 3. Quantitative evaluation of radiomic feature deviations from reconstruction. (a) Difference of feature means between original and secondary reconstructed CT; (b) feature correlation statistics for the benign and malignant groups.

Table 1. Patient characteristics. Data are n (%) or median (IQR). For continuous and categorical variables, the Kruskal-Wallis test and chi-squared test were applied, respectively.

Table 2. Performance comparison of RCTM with existing CT-based methods in the test cohort.

Table 5. Ablation experiment results in the test cohort.