Modeling complex networks of nuclear reaction data for probing their discovery processes

Xiaohang Wang, Long Zhu and Jun Su. Modeling complex networks of nuclear reaction data for probing their discovery processes[J]. Chinese Physics C. doi: 10.1088/1674-1137/ac23d5
Received: 2021-08-30

    Corresponding author: Jun Su, sujun3@mail.sysu.edu.cn
  • Sino-French Institute of Nuclear Engineering and Technology, Sun Yat-sen University, Zhuhai 519082, China

Abstract: Hundreds of thousands of experimental nuclear reaction data sets have been systematically collected, and the collection is still growing rapidly. The data and their correlations compose a complex system, which underpins nuclear science and technology. We model the nuclear reaction data as weighted evolving networks for the purpose of data verification and validation. The networks are employed to study the growing cross-section data of the neutron-induced threshold reaction (n,2n) and the photoneutron reaction. In the networks, nodes are the historical data and the weights of the links are the relative deviations between data points. It is found that the networks exhibit small-world behavior and that their discovery processes are well described by the Heaps law. What makes the networks novel is the mapping relation between the network properties and the salient features of the database: the Heaps exponent corresponds to the exploration efficiency of the specific data set, the distribution of the edge-weights corresponds to the global uncertainty of the data set, and the mean node weight corresponds to the uncertainty of the individual data point. This new perspective on the database would be helpful for nuclear data analysis and compilation.


    I.   INTRODUCTION
    • Nuclear reaction data underpin nuclear science and technology [1, 2]. One of the most important tasks in nuclear physics is to measure nuclear reactions and reduce the data uncertainties in order to meet the accuracy requirements of fundamental research fields, such as nuclear astrophysics [3, 4], and of many application fields, such as the transmutation of nuclear waste [5] and the design of future nuclear reactors [6]. Since 1935, tens of thousands of experiments have been performed and hundreds of thousands of experimental data sets have been systematically collected to develop the experimental nuclear reaction databases, among which the most important and complete one is EXFOR [7]. In addition, evaluation and prediction using the historical data have also attracted attention, resulting in various international nuclear data libraries, such as ENDF [8], CENDL [9], JEFF [10], JENDL [11], and BROND [12].

      Among the methods for data evaluation and prediction, nuclear reaction models combined with the generalized least squares method gain prominence [13]. Successful applications of Monte Carlo evaluation methods based on Bayesian statistical inference have also been reported [14, 15]. To solve ill-posed inverse regression problems for prediction with uncertainty quantification, machine learning, a powerful tool for learning from complex big data, has been applied [16, 17]. In the past decade, investigations of advanced reactor systems have defined new needs in the accuracy and scope of nuclear reaction data [18]. To this end, substantial investments, including salaries, equipment, and working hours, are devoted to measuring new data. Meanwhile, statistical verification and validation is an effective means of quality improvement for the rapidly growing data [19].

      The growth of nuclear reaction data is actually an innovation process, which is intrinsic to human experience. In this respect, the fruitful research on complex networks [20-23] makes it possible to model and quantify innovation. In fact, the abstract process of innovation has been investigated in different domains, including linguistics [24], biology [25], economy [26], knowledge [27], and science [28]. Many empirical analyses of real-world discovery processes have demonstrated that a basic signature of the innovation process is the Heaps law [29-31], which was originally introduced to describe the number of distinct words in a text document [32]. Various models have shown that the Heaps law well describes the pace at which scientists discover concepts or users collect new items [33, 34]. In the case of science, different networks have been extracted from scholars, projects, papers, patents, ideas, and/or academic positions [28, 35]. Investigation of the key network properties offers a quantitative understanding of the interactions among scientific agents and quantitative insight into the evolution of individual scientific impact [36, 37].

      In this work, we model the discovery process of nuclear reaction data as weighted evolving networks. We aim to map the salient features of the database onto network properties so that the global uncertainty of a specific data set and the uncertainty of an individual data point can be quantified. The paper is organized as follows. In Sec. II, we describe the Bayesian Gaussian CANDECOMP/PARAFAC (BGCP) tensor decomposition model and the method to construct the networks. In Sec. III, we present the results and discussions. Finally, a summary is given in Sec. IV.

    II.   METHOD
    • The cross-section of a specific reaction channel, such as the (n,2n) reaction, depends on the charge and neutron numbers of the target and on the incident energy. Let us discretize the energy degree of freedom using the energy gap $ dE $; the data can then be represented by a third-order tensor with missing entries. Nowadays, tensor completion is widely used in image inpainting [38] and data imputation [39]. There are various non-Bayesian tensor completion techniques; for example, Liu et al. proposed an algorithm for the completion of missing values in visual data [40]. Bayesian models in the matrix completion area can be traced back to Salakhutdinov and Mnih, who built models of Bayesian matrix factorization [41]. Kolda and Bader gave a comprehensive review of tensor decomposition [42]. Xiong et al. combined tensor decomposition with Bayesian inference, considering the time dependency of each tensor [43]. Chen et al. recently proposed a Bayesian Gaussian CANDECOMP/PARAFAC (BGCP) tensor decomposition model without the time structure [44], which is adopted in this work. We denote by $ {\cal{S}}\in{\mathbb{R}}^{I\times J\times K} $ the actual value of the tensor, where the entry $ \sigma_{ijk} $ is the physical reality of the cross-section in the reaction of the target $ ^{2i+j}_{i}X_{i+j} $ at the incident energy $ E = k\cdot dE $. Here, j denotes the isospin degree of freedom, expressed as the difference between the neutron and charge numbers, $ j = N-Z $. Indeed, we do not have the values of $ \sigma_{ijk} $ but only observations with uncertainties. Considering multiple measurements, let $ \sigma_{ijk}^{(p)} $ ($ p = 1,\cdots,P_{ijk} $) represent the p-th observation of the cross-section, with $ P_{ijk} $ observations in total. A list of missing tensors $ {\widetilde{\cal{S}}} = \{ {\cal{S}}^{(1)},\ldots,{\cal{S}}^{(P)} \} $ can be used to describe all the observations, where $ P = \max(P_{ijk}) $ over all possible $ ijk $. The observations are arranged in the missing tensors according to their superscripts, such as $ \sigma_{ijk}^{(1)} $ in $ {\cal{S}}^{(1)} $, $ \sigma_{ijk}^{(2)} $ in $ {\cal{S}}^{(2)} $, and so on. Entries without corresponding observations are missing. In contrast to Ref. [44], multiple observations of the physical reality $ \sigma_{ijk} $ are considered here. In the following, we expound the resulting changes to the Bayesian framework for the parameters.

      It is assumed that the uncertainty of each observed value follows an independent Gaussian distribution,

      $ \widetilde{\sigma}_{ijk} \sim {\cal{N}} \left(\sigma_{ijk}, \tau_{\epsilon}^{-1}\right), $

      (1)

      where $ \tau_{\epsilon} $ is the precision. In real-world applications, the expectation $ \sigma_{ijk} $ is unknown and replaced with the estimated value $ \hat{\sigma}_{ijk} $, which is the entry of the estimated tensor $ {\hat{\cal{S}}} $. The CP decomposition is applied to calculate the estimation $ {\hat{\cal{S}}} $:

      $ {\hat{\cal{S}}} = \sum\limits_{l = 1}^{L}{\bf{z}}_{l}\circ {\bf{d}}_{l}\circ {\bf{e}}_{l}, $

      (2)

      where $ {\bf{z}}_{l}\in{\mathbb{R}}^{I} $, $ {\bf{d}}_{l}\in{\mathbb{R}}^{J} $, and $ {\bf{e}}_{l}\in{\mathbb{R}}^{K} $ are the l-th column vectors of the factor matrices $ {\bf{Z}}\in{\mathbb{R}}^{I\times L} $, $ {\bf{D}}\in{\mathbb{R}}^{J\times L} $, and $ {\bf{E}}\in{\mathbb{R}}^{K\times L} $, respectively. The symbol $ \circ $ represents the outer product.
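As an illustration, the CP reconstruction of Eq. (2) can be sketched in a few lines of NumPy; the tensor sizes and factor values below are hypothetical:

```python
import numpy as np

def cp_reconstruct(Z, D, E):
    """Estimated tensor S_hat = sum_l z_l outer d_l outer e_l, Eq. (2),
    where z_l, d_l, e_l are the l-th columns of Z, D, E."""
    return np.einsum('il,jl,kl->ijk', Z, D, E)

# Hypothetical small example
rng = np.random.default_rng(0)
I, J, K, L = 4, 3, 5, 2
Z = rng.normal(size=(I, L))
D = rng.normal(size=(J, L))
E = rng.normal(size=(K, L))
S_hat = cp_reconstruct(Z, D, E)

# Entry (i, j, k) equals (z_i)^T (d_j * e_k), the form used in Eq. (4)
assert np.allclose(S_hat[1, 2, 3], Z[1] @ (D[2] * E[3]))
```

The einsum contraction sums the L rank-one outer products in a single call, matching the summation over l in Eq. (2).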

      The prior distribution of the row vectors of the factor matrix Z is a multivariate Gaussian,

      $ {\bf{z}}_{i}\sim {\cal{N}} \left[{\boldsymbol{\mu}}_{i}^{(z)}, ({\bf{\Lambda}}_{i}^{(z)})^{-1}\right], $

      (3)

      where the hyper-parameter $ {\boldsymbol{\mu}}^{(z)}\in{\mathbb{R}}^{L} $ expresses the expectation, and $ {\bf{\Lambda}}^{(z)}\in{\mathbb{R}}^{L\times L} $ indicates the width of the distribution. The likelihood function can be written as,

      $ \begin{aligned}[b] {\cal{L}} ( \sigma_{ijk}^{(p)} | {\bf{z}}_{i}, {\bf{d}}_{j}, {\bf{e}}_{k}, \tau_{\epsilon} ) \propto \exp \left\{ -\frac { \tau_{\epsilon} } {2} \left[\sigma_{ijk}^{(p)} -({\bf{z}}_{i})^{T} ({\bf{d}}_{j} \circledast {\bf{e}}_{k} ) \right]^2 \right\}, \end{aligned} $

      (4)

      where $ \circledast $ is the Hadamard product. The posterior values of the hyper-parameters $ {\boldsymbol{\mu}}^{(z)} $ and $ {\bf{\Lambda}}^{(z)} $ are given as,

      $ \begin{aligned}[b] {\widehat{\bf{\Lambda}}}^{(z)}_{i} =& \tau_\epsilon ({\bf{d}}_{j} \circledast {\bf{e}}_{k}) ({\bf{d}}_{j} \circledast {\bf{e}}_{k} )^{T} +{\bf{\Lambda}}_{i}^{(z)}, \\ {\widehat{\boldsymbol{\mu}}}^{(z)}_{i} =& ({\widehat{\bf{\Lambda}}}^{(z)}_{i})^{-1} \left[ \tau_\epsilon \sigma_{ijk}^{(p)}({\bf{d}}_{j} \circledast {\bf{e}}_{k}) + {\bf{\Lambda}}_{i}^{(z)} {\boldsymbol{\mu}}_{i}^{(z)} \right]. \end{aligned} $

      (5)

      The contribution of each observation to the hyper-parameters is equivalent, independent of which missing tensor it is arranged in.
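The rank-one posterior update of Eq. (5) for a single observation can be written compactly; the following is a minimal NumPy sketch, with the dimension L and all numerical values hypothetical:

```python
import numpy as np

def posterior_z_update(mu_i, Lambda_i, d_j, e_k, sigma_obs, tau_eps):
    """One-observation update of the hyper-parameters of z_i, Eq. (5):
    Lambda_hat = tau (d*e)(d*e)^T + Lambda,
    mu_hat     = Lambda_hat^{-1} [tau sigma (d*e) + Lambda mu]."""
    v = d_j * e_k  # Hadamard product of d_j and e_k
    Lambda_hat = tau_eps * np.outer(v, v) + Lambda_i
    mu_hat = np.linalg.solve(Lambda_hat,
                             tau_eps * sigma_obs * v + Lambda_i @ mu_i)
    return mu_hat, Lambda_hat

# Sanity check: with uninformative factors (v = 0) the prior is unchanged
mu_i = np.array([1.0, 2.0])
Lambda_i = np.eye(2)
mu_hat, Lambda_hat = posterior_z_update(mu_i, Lambda_i,
                                        np.zeros(2), np.zeros(2), 0.5, 10.0)
assert np.allclose(mu_hat, mu_i) and np.allclose(Lambda_hat, Lambda_i)
```

Using `np.linalg.solve` instead of an explicit matrix inverse is a standard numerical choice; the formula itself is exactly that of Eq. (5).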

      The likelihood function of all observations is,

      $ \begin{aligned}[b] {\cal{L}} ( {\widetilde{\cal{S}}} | {\bf{Z}}, {\bf{D}}, {\bf{E}}, \tau_{\epsilon} ) \propto & \prod\limits_{p = 1}^{P} \prod\limits_{i = 1}^{I} \prod\limits_{j = 1}^{J} \prod\limits_{k = 1}^{K} (\tau_{\epsilon})^{1/2}\\& \exp \left[ -\frac{\tau_{\epsilon}}{2} b_{ijk}^{(p)} (\sigma_{ijk}^{(p)}-\hat{\sigma}_{ijk})^2 \right]. \end{aligned} $

      (6)

      where $ b_{ijk}^{(p)} $ is 1 for a measured entry and 0 for a missing entry. Placing a conjugate Γ prior over the precision $ \tau_{\epsilon} $,

      $ \tau_{\epsilon}\sim \Gamma(a_{0},b_{0}), $

      (7)

      the posterior values of the hyper-parameters $ a_{0} $ and $ b_{0} $ are given as,

      $ \begin{aligned}[b] \hat{a}_0 = &\frac{1}{2} \sum\limits_{p = 1}^{P} \sum\limits_{i = 1}^{I} \sum\limits_{j = 1}^{J} \sum\limits_{k = 1}^{K} b_{ijk}^{(p)}+a_0, \\ \hat{b}_0 =& \frac{1}{2} \sum\limits_{p = 1}^{P} \sum\limits_{i = 1}^{I} \sum\limits_{j = 1}^{J} \sum\limits_{k = 1}^{K} (\sigma_{ijk}^{(p)}-\hat{\sigma}_{ijk})^2+b_0. \end{aligned} $

      (8)

      Based on Eq. (8), each observation contributes an increase of $ \dfrac{1}{2} $ to $ \hat{a}_0 $ and of $ \dfrac{1}{2}(\sigma_{ijk}^{(p)}-\hat{\sigma}_{ijk})^2 $ to $ \hat{b}_0 $.
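The bookkeeping of Eq. (8) is straightforward to express in code; a sketch with hypothetical array shapes and toy values:

```python
import numpy as np

def gamma_posterior(a0, b0, obs, mask, S_hat):
    """Posterior Gamma hyper-parameters for the precision tau_eps, Eq. (8).
    obs:   (P, I, J, K) array of observations (any value where missing),
    mask:  binary indicator b_{ijk}^{(p)}, 1 where an entry is measured,
    S_hat: (I, J, K) estimated tensor, broadcast against each layer of obs."""
    a_hat = a0 + 0.5 * mask.sum()
    b_hat = b0 + 0.5 * np.sum(mask * (obs - S_hat) ** 2)
    return a_hat, b_hat

# Hypothetical toy data: two observation layers of a 1x1x1 "tensor"
obs = np.array([[[[1.2]]], [[[0.8]]]])
mask = np.ones_like(obs)
a_hat, b_hat = gamma_posterior(1.0, 1.0, obs, mask, np.array([[[1.0]]]))
# Each observation adds 1/2 to a_hat and half its squared residual to b_hat
assert a_hat == 2.0 and np.isclose(b_hat, 1.0 + 0.5 * (0.04 + 0.04))
```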

      Equation (5) shows that the observations with the same subscript i contribute to the posterior value of the hyper-parameter $ {\boldsymbol{\mu}}_{i}^{(z)} $. Changing the second formula in Eq. (5) to,

      $ \begin{aligned}[b] {\widehat{\boldsymbol{\mu}}}^{(z)}_{i} =& {\boldsymbol{\mu}}_{i}^{(z)} + \Delta {\boldsymbol{\mu}}_{i}^{(z)}, \\ \Delta {\boldsymbol{\mu}}_{i}^{(z)} =& ({\widehat{\bf{\Lambda}}}^{(z)}_{i})^{-1} ({\bf{d}}_{j} \circledast {\bf{e}}_{k}) \tau_\epsilon \left[ \sigma_{ijk}^{(p)} - ({\bf{d}}_{j} \circledast {\bf{e}}_{k} )^{T} {\boldsymbol{\mu}}_{i}^{(z)} \right]. \end{aligned} $

      (9)

      then the relative deviation between two observations can be defined as,

      $ \delta(\sigma_{ij_{1}k_{1}}^{(p_{1})}, \sigma_{ij_{2}k_{2}}^{(p_{2})}) = \left[ \frac{\Delta {\boldsymbol{\mu}}_{i}^{(z)}(\sigma_{ij_{1}k_{1}}^{(p_{1})}) - \Delta {\boldsymbol{\mu}}_{i}^{(z)}(\sigma_{ij_{2}k_{2}}^{(p_{2})})}{{\boldsymbol{\mu}}_{i}^{(z)}} \right]^{2}. $

      (10)

      The weighted network can be built, where the nodes are the observations and the weight of the link between two observations with the same subscript is defined as

      $ w(\sigma_{ij_{1}k_{1}}^{(p_{1})}, \sigma_{ij_{2}k_{2}}^{(p_{2})}) = \exp \left[ -\delta(\sigma_{ij_{1}k_{1}}^{(p_{1})}, \sigma_{ij_{2}k_{2}}^{(p_{2})}) \right], $

      (11)

      The cases for the subscripts j and k are similar.
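Under these definitions, the edge weight of Eqs. (10)-(11) reduces to a short function. Note one assumption in this sketch: we read the square in Eq. (10) as a sum over the L components of the hyper-parameter vectors:

```python
import numpy as np

def edge_weight(dmu_1, dmu_2, mu):
    """Weight of the link between two observations sharing a subscript,
    Eqs. (10)-(11): delta is the squared relative deviation of the
    hyper-parameter updates, and w = exp(-delta)."""
    delta = np.sum(((dmu_1 - dmu_2) / mu) ** 2)  # assumed component-wise sum
    return np.exp(-delta)

mu = np.array([1.0, 2.0])
# Identical observations give w = 1; strongly discrepant ones give w -> 0
assert np.isclose(edge_weight(np.array([0.1, 0.2]), np.array([0.1, 0.2]), mu), 1.0)
assert edge_weight(np.array([5.0, 5.0]), np.array([-5.0, -5.0]), mu) < 1e-9
```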

      Figure 1 illustrates the above method. In brief, analogous to image inpainting, the cross-section in a specific reaction channel is represented by a third-order tensor $ {\cal{S}} $. According to the CP decomposition, the tensor $ {\cal{S}} $ is expressed as the outer product of the factor matrices Z, D, and E. The prior distributions of the factor matrices are assumed to be multivariate Gaussians. With the observed cross-section data, the posterior values of the factor matrices and their distributions can be calculated using Bayesian inference and iteration. Finally, the predicted cross-sections are reconstructed from the factor matrices Z, D, and E, while the network is built with the hyper-parameters µ and $ \Delta {\boldsymbol{\mu}} $.

      Figure 1.  (color online) Illustration for visualizing the definitions of the involved quantities and their relations.

    III.   RESULTS AND DISCUSSION
    • The pioneers of (n,2n) reaction measurements were Fowler and Slye, who measured the cross-sections of the $ ^{63} {\rm{Cu}}$(n,2n)$ ^{64} {\rm{Cu}}$ reaction near the threshold [45]. In the following decades, the measurement technology and deuterium-tritium (D-T) neutron generators became widely used, resulting in a linear growth of the data points (mainly around the incident energy of 14 MeV). In 1980, a good deal of data measured between the threshold and 20 MeV using the pulsed neutron source at Bruyères-le-Châtel were published [46]. After that, thanks to the continuing investment of salaries, equipment, and working hours, the data points kept growing linearly. Up to now, 7671 cross-section data points of the (n,2n) reaction (including 98 derived data points) have been recorded in the EXFOR database. Their annual growth is shown in Fig. 2(a).

      Figure 2.  (color online) Growth of nuclear reaction data and its evolving networks. Annual growth of cross-section data of the (a) (n,2n) reaction and (b) (γ,xn) reaction in the EXFOR database. Networks generated from the (n,2n) cross-section data published before (c) 1960 and (d) 1965. The size and color of a node represent its degree $ k_{m} $.

      The discovery of the photoneutron reaction dates back to 1956, when the cross-section of the $ ^{6} {\rm{Li}}( \gamma ,{\rm{X}}){\rm{n}}$ reaction was measured by Edge [47]. Subsequently, the published data of this reaction grew rapidly. Owing to the proven technology for providing monoenergetic photon beams, the cross-section data for most of the stable nuclei had been measured within 20 years of its discovery. After 1980, only a few data sets were published because the scientific interest moved to more subdivided channels, such as (γ,n), (γ,2n), and so on. Twelve thousand data points have been collected in the EXFOR database. Their annual growth is shown in Fig. 2(b).

      These two sets of nuclear reaction data are modeled as weighted evolving networks using the BGCP approach. The nodes in the networks are the data points, and the weights of the links are computed from the relative deviations between the data points. An example of the network is displayed in Fig. 2(c), where 362 data points published before 1960 are considered. The nodes, links, and numbers of neighbours are clearly visualized. The complexity of the network increases with the node number. Another example is shown in Fig. 2(d), where 1251 nodes are included but the links are too numerous to distinguish in the figure.

      The degree $ k_{m} $ of a node m is the number of edges linking to the node (the number of neighbors of node m). As shown in Fig. 3(a), the mean degree and the node number in both networks grow approximately proportionally. This is an inevitable consequence of the discovery processes, in which an experiment usually probes new data on the basis of verifying existing data. The slope reveals the innovativeness. A decrease of the slope indicates the emergence of new technologies for detecting new isotopes or exploring a new incident energy region. A representative example is the slowdown of the mean degree $ \langle k_{m}\rangle $ in the (n,2n) network near $ N_{node} $ = 4000, which results from the extension of the neutron beam from 14 MeV down to the threshold.

      Figure 3.  (color online) Properties of the evolving small-world networks. (a) Mean degree of a node $ \langle k_{m}\rangle $, (b) mean weight of edges $ \langle w_{mn} \rangle $, (c) clustering coefficient C, (d) global efficiency E, and (e) number of novelties S of the graph as a function of the node number.

      In a weighted graph, the node strength $ s_{m} $ is defined as the sum of the edge-weights of the links incident to the node. It integrates the information on the number and the weights of the links. The mean strength in both networks also increases linearly with the node number. In contrast, a more interesting property is the mean weight of the edges in the graph, as shown in Fig. 3(b). It is found that not only the number (mean degree $ \langle k_{m}\rangle $) but also the weights (mean weight $ \langle w_{mn} \rangle $) of the links increase during the growth of both networks. For the (n,2n) reaction, however, an approximate plateau of the mean weight $ \langle w_{mn} \rangle $ appears for $ N_{node} > $ 3000.

      The weight $ w_{mn} $ describes the linking strength between two nodes, and its reciprocal $ l_{mn} = 1/w_{mn} $ naturally expresses their relative distance in the network. Once $ \{ l_{mn} \} $ is given, it can be used to calculate the matrix of the shortest path lengths $ d_{mn} $ between two generic nodes m and n. Then the so-called clustering coefficient C and the global efficiency E of the graph can be calculated [23]. The clustering coefficient C is also considered as a first approximation of the local efficiency. By using those efficiencies, both the local and global behaviors of small-world networks can be studied [20, 21]. The large C values and small E values of the early networks indicate that the early data are locally clustered [see the example in Fig. 2(c)]. This shows that the local verification and validation of the data, in which data of the same isotope and similar incident energies are compared, is effective. This local method is widely applied up to now. With the growth of the networks, the clustering coefficients C decrease, while the global efficiencies E increase. The networks generated from the latest experimental data sets have C = 0.58 and E = 0.40 for the (n,2n) network, and C = 0.65 and E = 0.45 for the (γ,xn) network. These results indicate that each region in the data-network is intermingled with the others, and hence the verification and validation of the data can be performed either locally or globally.
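For readers who wish to reproduce these quantities, the clustering coefficient and the weighted global efficiency can be computed with networkx. The toy edge list below is hypothetical; each edge carries the weight $w_{mn}$ and the distance $l_{mn} = 1/w_{mn}$:

```python
import networkx as nx

def global_efficiency_weighted(G):
    """Global efficiency E: the average of 1/d_mn over ordered node pairs,
    where d_mn is the shortest path length under distances l_mn = 1/w_mn."""
    n = G.number_of_nodes()
    lengths = dict(nx.all_pairs_dijkstra_path_length(G, weight='length'))
    inv_d = sum(1.0 / d for m, dd in lengths.items()
                for v, d in dd.items() if v != m)
    return inv_d / (n * (n - 1))

# Hypothetical toy network of three data points
G = nx.Graph()
for m, v, w in [(0, 1, 0.9), (1, 2, 0.8), (0, 2, 0.5)]:
    G.add_edge(m, v, weight=w, length=1.0 / w)

C = nx.average_clustering(G, weight='weight')  # weighted clustering coefficient
E = global_efficiency_weighted(G)
```

In this triangle every direct edge is also the shortest path, so E reduces to the mean of the edge weights; in larger graphs the Dijkstra step matters.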

      Figure 3(e) is presented with the goal of investigating the Heaps law. The Heaps law was originally introduced to describe the number of distinct words in a text document [32]. This statistical property was later observed in other real data of innovation processes by empirical analyses [29-31]. Various models have shown that the Heaps law well describes the pace at which scientists discover concepts or users collect new items [33, 34]. In this work, the third-order missing tensor represents our knowledge about the cross-sections of the nuclear reactions. The number of novelties S corresponds to the number of entries in the tensor that have been observed. The Heaps law then reads $ S \propto N_{node}^{\beta} $, where $ N_{node} $ is the node number in the network. As shown in Fig. 3(e), the Heaps law holds well in the discovery processes of nuclear reaction data, with Heaps exponents $ \beta_{(n,2n)} $ = 0.77 for the (n,2n) reaction and $ \beta_{(\gamma,xn)} $ = 0.90 for the (γ,xn) reaction. The higher value of the Heaps exponent β denotes a faster exploration of the adjacent possible in the cross-section measurements of the (γ,xn) reaction. Limited by the monoenergetic neutron sources, the exploration of innovative data (new isotopes or new incident energies) for the (n,2n) reaction is slower than that for the (γ,xn) reaction.
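A Heaps exponent of this kind can be estimated from the growth curve by a least-squares fit in log-log space; a minimal sketch with synthetic data (the prefactor 2.0 is arbitrary):

```python
import numpy as np

def heaps_exponent(S, N):
    """Fit S ∝ N^beta in log-log space and return the slope beta."""
    beta, _intercept = np.polyfit(np.log(N), np.log(S), 1)
    return beta

# Synthetic check against the (n,2n) value beta = 0.77
N = np.arange(10, 1000, dtype=float)
S = 2.0 * N ** 0.77
assert abs(heaps_exponent(S, N) - 0.77) < 1e-9
```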

      One of the objectives of generating networks in this work is data evaluation. In the traditional methods, in order to evaluate a new experimental data point, one may compare it with historical measurements, predictions in the data libraries, and calculations by theoretical models for the same reaction. However, the global and hidden uncertainties of the historical measurements may have propagated into the data libraries and theoretical models, which will mislead the evaluation so that the uncertainties remain undetected in the database. This is a positive feedback process that may last for decades before it is discovered [19]. The data-networks based on the Bayesian statistics approach reveal the relative deviations between data points. The weight $ w_{mn} $ of an edge is calculated from the posterior values of the hyper-parameter $ {\widehat{\boldsymbol{\mu}}} $ recommended by the m-th and n-th data points. The $ w_{mn} $ value is 1 if two data points are identical, but close to 0 for two data points with a large discrepancy. This definition of the relative deviation expands the data range available for direct comparison; traditionally, only data points for the same isotope and similar incident energies could be compared.

      In the evolution of the weight distribution, shown in Fig. 4, not only are new edges linked to the network, but the weights of the original edges also change owing to the appearance of new nodes. This is the universal law of weighted evolving networks deduced from the coupling of topology and weight dynamics [22]. This dynamics is naturally reproduced by the iterative computation in the Bayesian statistics-based approach. This may be very meaningful, since a few abnormal data points indicate a large uncertainty of the measuring technique, whereas a great many ‘abnormal’ data points may reveal a novel mechanism.

      Figure 4.  (color online) Distribution of the edge-weight $ w_{mn} $ in the networks.

      The distribution of the weights in the network is an appropriate quantity for evaluating the global uncertainty of the data set, while the mean node weight $ w_{m} $ = $ s_{m}/k_{m} $ can be applied to estimate the uncertainty of an individual data point. It is noted here that the mentioned uncertainty is beyond the systematic and statistical errors, which are published with the data. Even experimental data points with small published errors may be quite different from other data for the same nuclear reaction. This means that there is a global and hidden uncertainty, which comes from the limitations of the measurement technology and the theoretical knowledge.
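The estimator $w_{m} = s_{m}/k_{m}$ follows directly from the node strength and degree; a short networkx sketch with a hypothetical star-shaped subgraph around one data point:

```python
import networkx as nx

def mean_node_weight(G, m):
    """Mean node weight w_m = s_m / k_m: node strength (sum of the weights
    of incident edges) divided by the degree (number of neighbours)."""
    k = G.degree[m]
    s = sum(data['weight'] for _, _, data in G.edges(m, data=True))
    return s / k if k else 0.0

# Hypothetical star subgraph: node 0 linked to three neighbours
G = nx.Graph()
for v, w in [(1, 0.9), (2, 0.7), (3, 0.8)]:
    G.add_edge(0, v, weight=w)
assert abs(mean_node_weight(G, 0) - 0.8) < 1e-12
```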

      As shown in Fig. 4, both networks show peaks near $ w_{mn} $ = 0.9 at different periods, except that for the (γ,xn) reaction in 1970. However, different shapes are observed for the two reactions. In the case of the (n,2n) reaction, wide distributions are observed. For the (γ,xn) reaction, narrow distributions near $ w_{mn} $ = 0.9 are observed after 1973. These results correspond to the fact that the global uncertainty in the database of the (n,2n) reaction is larger than that of the (γ,xn) reaction.

      The correlations between the mean node weight $ w_{m} $ and the node degree $ k_{m} $ are shown in Fig. 5. In regions with a good deal of existing data, the data points can be mutually verified before publishing. Hence, in the $ w_{m} $ vs. $ k_{m} $ map, the nodes with a large number of neighbours concentrate in the region with large $ w_{m} $ values, while those with few neighbours are distributed over a wide region from $ w_{m} $ = 0.3 to 0.8. Figure 6 shows the distributions of the $ w_{m} $ values for the (n,2n) and (γ,xn) reactions. The uncertainty in the (n,2n) database is larger than that in the (γ,xn) database. A more visual estimator for an individual data point is the subgraph, four examples of which are shown in the embedded illustrations in Fig. 6. In the subgraph, the data point to be evaluated is shown as the center node, and its neighbours are displayed around it. The distance from the center node to a neighbour is defined as $ \ln(1/w_{mn}) $. The colour of a node indicates the number of its neighbours. The subgraph of a data point with a small $ w_{m} $ value, and hence a large uncertainty, looks like a blooming flower, since its distances to other nodes in the network are large. On the contrary, a data point with small uncertainty huddles in the network.

      Figure 5.  (color online) Correlation between mean node weight $ w_{m} $ and node degree $ k_{m} $. The colour is an eye-guide for the point-density.

      Figure 6.  (color online) Distribution of the $ w_{m} $ value for the (a) (n,2n) and (b) (γ,xn) reactions. The embedded illustrations are examples of the estimator, where the data point to be evaluated is shown as the center node and its neighbours are displayed around it. The distance from the center node to a neighbour is defined as $ \ln(1/w_{mn}) $. The colour of a node indicates the number of its neighbours.

    IV.   CONCLUSION
    • In summary, based on the Bayesian statistics-based approach, a model is developed to build networks for the discovery processes of nuclear reaction data. After discretizing the incident energy degree of freedom, the data are recorded as a list of third-order missing tensors, and the data evaluation is formulated as a problem of tensor decomposition and imputation with multiple observations per entry. To solve this problem, the Bayesian tensor decomposition approach of Chen et al. [44] is extended. Case studies of the cross-sections of the neutron-induced threshold reaction (n,2n) and the photoneutron reaction (γ,xn) are presented to build the weighted evolving networks, where the nodes are the historical data and the weights of the links are the relative deviations between the data points. It is found that the networks exhibit small-world behavior and that their dynamics is well described by the Heaps law, which has been widely observed in other real networks of innovation processes. What makes the networks novel is the mapping relation between the network properties and the salient features of the database. It is this relation that makes it possible to (i) quantify the exploration efficiency of a specific data set by the Heaps exponent, (ii) evaluate the global uncertainty of the data set by the distribution of the edge-weights, and (iii) visualize and quantify the uncertainty of an individual data point by the mean node weight.

      The network built in this work offers a new perspective on the database, which would be helpful for nuclear data analysis and compilation as well as for quality improvement of the database. Future works could focus on studying the effect of the uncertainty distribution, which is similar for data measured in the same period but changes with the development of experimental techniques. The idea of noise modelling in Ref. [48] is enlightening for feeding the extracted uncertainty back into the likelihood function.

    DECLARATION OF INTERESTS
    • The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

