GENERATIONS IN BAYESIAN NETWORKS

This paper studies some aspects of the theory of oriented graphs as applied to Bayesian networks. In some papers on the theory of Bayesian networks, the concept of a “generation of vertices” denotes a set of vertices whose parents belong to previous generations. In our opinion, the terminology for this concept has not yet been fully developed. In some cases, the concept of a “generation” makes problems in Bayesian networks easier to solve and leads to simpler algorithms. In this paper we consider the well-known example “Asia”, described in many articles and books, as well as in the technical documentation for various toolboxes. To construct this example, we used evaluation versions of AgenaRisk.


Introduction
Nowadays artificial intelligence is widely used in different fields of science. Since the early 2000s, Bayesian networks have been among the most popular tools of artificial intelligence in research. The mathematical tools needed for the correct use of Bayesian networks have been developed, and they open ample opportunities to study problems in various fields of science. However, the massive, complex computations involved in working with Bayesian networks require both good computing equipment and good software that makes work with Bayesian networks convenient.
The most popular software products for work with Bayesian Networks are BayesiaLab, AgenaRisk, Bayes Server, Netica, Hugin Expert, BayesFusion. Even though some packages are over 15 years old, there is still an issue with interaction between mathematicians working on the theoretical basis of Bayesian networks, algorithm developers and programmers, as well as researchers in applied science.
In this paper, we consider one of the theoretical directions in the theory of Bayesian networks: the partition of the vertices of a Bayesian network into generations. In our opinion, this direction is rather poorly represented in the literature; however, in some cases, partitioning the vertices into generations allows a fresh look at the general theory of Bayesian networks and makes it easier to develop algorithms for some problems.
In this paper we consider the well-known example "Asia", described in many articles and books, as well as in the technical documentation for various toolboxes. To construct this example, we used evaluation versions of AgenaRisk. For basic knowledge about Bayesian networks, the reader can consult [2, 4, 6–13]; for basic knowledge about the capabilities of AgenaRisk, see [1, 5].

Main definitions
Bayesian Networks are a handy tool for describing complex processes with uncertainties.
Below we list the basic concepts associated with Bayesian networks.
Definition 1. A Bayesian network is an acyclic oriented graph that satisfies the Markov condition. The vertices of the graph are often called nodes. Nodes represent variables that reflect the main entities of the model being developed. Some nodes of the Bayesian network may be connected by directed arcs (edges). Arcs define a probabilistic connection between the corresponding nodes. Sometimes such a relationship is causal: the cause is the node that the oriented arc leaves, and the consequence is the node that the oriented arc enters.
Definition 6. Edges are arcs without direction.
Definition 7. If two nodes are not connected by an arc, then these nodes are considered conditionally independent.
Definition 8. A skeleton of a Bayesian network is a graph obtained from a Bayesian network by replacing arcs with edges.
Definition 9. If an arc goes from the vertex A to the vertex B, then A is called the parent of B, and B is called the child vertex of the vertex A.
Definition 10. Let Y be some subset of vertices of a Bayesian network. P(Y) often denotes the set of all parents of the vertices in Y, and C(Y) often denotes the set of all children of the vertices in Y.
Definition 11. A node x and an arc e are called incident if e enters or leaves x.
Definition 12. Two nodes are called adjacent if they are incident to the same arc, i.e. if one of them is the parent of the other.
Definition 13. A route from vertex $a_0$ to vertex $a_n$ in an oriented graph (in a Bayesian network) is an alternating sequence of vertices and arcs of the form $a_0, \{a_0 a_1\}, a_1, \{a_1 a_2\}, a_2, \ldots, \{a_{n-1} a_n\}, a_n$.
Definition 14. A path is a route without repeated arcs.
Definition 15. If there is an oriented path from the vertex A to the vertex B, then A is called an ancestor of B, and B is called a descendant of A.
Definition 16. Two nodes are called connected if there is a route between them.
Definition 17. If a vertex has no ancestors, then its local probability distribution is called unconditional, otherwise conditional.
Definition 18. A topological numbering of the nodes of a Bayesian network is a numbering in which the number of any node is greater than the number of each of its parents.
Definition 19. Sometimes it is convenient to partition nodes into generations to develop Bayesian network calculation algorithms. Generations can be of two types: generations of descendants and generations of ancestors. Generations of descendants are defined as follows:
• Nodes without parents belong to generation 0 of descendants.
• Nodes whose parents all belong to generation 0 belong to generation 1 of descendants.
• Nodes not yet assigned whose parents all belong to generations 0 and 1 belong to generation 2 of descendants.
• ...
• Nodes not yet assigned whose parents all belong to generations 0, 1, 2, ..., K belong to generation K+1 of descendants.
• ...
Definition 20. Generations of ancestors are defined as follows:
• Nodes with no children belong to generation 0 of ancestors.
• Nodes whose children all belong to generation 0 belong to generation 1 of ancestors.
• Nodes not yet assigned whose children all belong to generations 0 and 1 belong to generation 2 of ancestors.
• ...
• Nodes not yet assigned whose children all belong to generations 0, 1, 2, ..., K belong to generation K+1 of ancestors.
• ...

How can we use the definition "Generation"?
We can distinguish two types of generations: generations of descendants and generations of ancestors. Generations of descendants are built starting from vertices without parents. Generations of ancestors are built starting from vertices without children. In most cases, we obtain different partitions, in which the generations contain different vertices (the number of generations is the same in both partitions, since it is determined by the longest oriented path in the graph). However, it is easy to build a graph (Bayesian network) whose partitions into generations of descendants and generations of ancestors are completely identical.
Which type of generation is appropriate for partitioning the vertices of a Bayesian network is determined by the specific task. To solve some problems, we sometimes need both partitions at once: into generations of descendants and into generations of ancestors.
The algorithm for partitioning the vertices of a Bayesian network into generations is quite simple. Let's consider both variants.

Partition into generations of descendants
First, we find and select the vertices without parents. We assign these vertices to the zero generation and mark them. Next, we look through the remaining unmarked vertices and select only those whose parents all belong to the zero generation. We obtain the first generation of descendants and mark its vertices. Again, we look through the remaining unmarked vertices and select only those whose parents all belong to the zero or the first generation. We obtain the second generation of descendants and mark the selected vertices. We continue in this way until all vertices are marked.
An example of the partition into generations is shown in Figure 1.
Generations of descendants for this Bayesian network:
• Vertices Age and VisitAsia belong to the zero generation.
• Vertices Smoker and Tuberculosis belong to the first generation.
• Vertices Cancer and Bronchitis belong to the second generation.
• The only vertex TbOrCa belongs to the third generation.
• Vertices XRay and Dyspnea belong to the fourth generation.
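The marking procedure above can be sketched in Python as follows. This is only an illustrative sketch: the arc list ASIA_ARCS is an assumption reconstructed from the generations listed in the text (the figure is not reproduced here), and the function name descendant_generations is hypothetical.

```python
# Arcs (parent, child) of the "Asia" example. ASSUMED: reconstructed from the
# generations listed in the text, not taken from the original figure.
ASIA_ARCS = [
    ("Age", "Smoker"),
    ("VisitAsia", "Tuberculosis"),
    ("Smoker", "Cancer"),
    ("Smoker", "Bronchitis"),
    ("Tuberculosis", "TbOrCa"),
    ("Cancer", "TbOrCa"),
    ("TbOrCa", "XRay"),
    ("TbOrCa", "Dyspnea"),
    ("Bronchitis", "Dyspnea"),
]


def descendant_generations(arcs):
    """Return a list of sets; element k is the k-th generation of descendants."""
    nodes = {v for arc in arcs for v in arc}
    parents = {v: set() for v in nodes}
    for parent, child in arcs:
        parents[child].add(parent)

    marked = set()
    generations = []
    while len(marked) < len(nodes):
        # Unmarked vertices whose parents are all already marked.
        current = {v for v in nodes - marked if parents[v] <= marked}
        if not current:
            raise ValueError("empty generation while unmarked vertices remain")
        generations.append(current)
        marked |= current
    return generations


for k, generation in enumerate(descendant_generations(ASIA_ARCS)):
    print(k, sorted(generation))
```

Under the assumed arcs, the printed generations agree with the list given for the example.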

Partition into generations of ancestors
First, we find and select the vertices without children. We assign these vertices to the zero generation of ancestors and mark them. Next, we look through the remaining unmarked vertices and select only those whose children all belong to the zero generation of ancestors. We obtain the first generation of ancestors and mark its vertices. Again, we look through the remaining unmarked vertices and select only those whose children all belong to the zero or the first generation of ancestors. We obtain the second generation of ancestors and mark the selected vertices. We continue in this way until all vertices are marked.
Generations of ancestors for this Bayesian network:
• Vertices XRay and Dyspnea belong to the zero generation.
• Vertices TbOrCa and Bronchitis belong to the first generation.
• Vertices Cancer and Tuberculosis belong to the second generation.
• Vertices Smoker and VisitAsia belong to the third generation.
• The only vertex Age belongs to the fourth generation.
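An equivalent way to obtain the same partition, given here only as a sketch, is a recurrence: a vertex without children has ancestor generation 0, and any other vertex has generation one greater than the largest generation among its children. The arc list ASIA_ARCS is an assumption reconstructed from the generations listed in the text, and the function name ancestor_generation_numbers is hypothetical. The sketch assumes the graph is acyclic; on a graph with a cycle the recursion would not terminate normally.

```python
from functools import lru_cache

# Arcs (parent, child) of the "Asia" example. ASSUMED: reconstructed from the
# generations listed in the text, not taken from the original figure.
ASIA_ARCS = [
    ("Age", "Smoker"),
    ("VisitAsia", "Tuberculosis"),
    ("Smoker", "Cancer"),
    ("Smoker", "Bronchitis"),
    ("Tuberculosis", "TbOrCa"),
    ("Cancer", "TbOrCa"),
    ("TbOrCa", "XRay"),
    ("TbOrCa", "Dyspnea"),
    ("Bronchitis", "Dyspnea"),
]


def ancestor_generation_numbers(arcs):
    """Map each vertex to the index of its generation of ancestors."""
    children = {}
    for parent, child in arcs:
        children.setdefault(parent, set()).add(child)
        children.setdefault(child, set())

    @lru_cache(maxsize=None)
    def generation(vertex):
        kids = children[vertex]
        if not kids:
            return 0  # no children: zero generation of ancestors
        # One more than the highest generation among the children.
        return 1 + max(generation(kid) for kid in kids)

    return {vertex: generation(vertex) for vertex in children}


numbers = ancestor_generation_numbers(ASIA_ARCS)
for vertex in sorted(numbers, key=lambda v: (numbers[v], v)):
    print(numbers[vertex], vertex)
```

The recurrence and the marking procedure give the same result because a vertex is selected at the first stage at which all of its children are already marked.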
Either partition, into generations of descendants or into generations of ancestors, can be used to prove the following theorem.
Theorem 1. If at some stage of the construction of generations a generation is empty while unmarked vertices remain, then the graph contains an oriented cycle.
Proof. Consider generations of descendants; the case of generations of ancestors is symmetric. If at some stage no unmarked vertex has all of its parents in previous generations, then every unmarked vertex has at least one unmarked parent. Starting from any unmarked vertex, we move to one of its unmarked parents, then to an unmarked parent of that vertex, and so on. As a result, we obtain an unlimited path over a limited number of nodes, so some node is repeated, that is, the path contains a cycle.
Thus, we have a fairly simple way to find out if an oriented graph has a cycle and find where this cycle is.
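The cycle search suggested by Theorem 1 can be sketched as follows. The function name find_cycle and the small test graph are hypothetical, chosen only for illustration; the use of min to pick a vertex is an arbitrary choice that makes the result deterministic and assumes the vertex names can be compared.

```python
def find_cycle(arcs):
    """Return a list of vertices forming an oriented cycle, or None."""
    nodes = {v for arc in arcs for v in arc}
    parents = {v: set() for v in nodes}
    for parent, child in arcs:
        parents[child].add(parent)

    marked = set()
    while len(marked) < len(nodes):
        current = {v for v in nodes - marked if parents[v] <= marked}
        if current:
            marked |= current
            continue
        # Empty generation: every unmarked vertex has an unmarked parent.
        # Walk from child to unmarked parent until a vertex repeats.
        walk = [min(nodes - marked)]
        position = {walk[0]: 0}
        while True:
            parent = min(parents[walk[-1]] - marked)
            if parent in position:
                # The walk goes against the arcs; reverse the repeated part
                # so that the result follows the direction of the arcs.
                return walk[position[parent]:][::-1]
            position[parent] = len(walk)
            walk.append(parent)
    return None


# Hypothetical graph: D -> A, and the cycle A -> B -> C -> A.
print(find_cycle([("A", "B"), ("B", "C"), ("C", "A"), ("D", "A")]))
print(find_cycle([("A", "B"), ("B", "C")]))
```

In the first call the vertex D forms the zero generation, the next generation is empty, and the backward walk from A recovers the cycle; in the second call all vertices are marked and no cycle is reported.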
Below are a few standard, previously proven theorems from graph theory and the theory of Bayesian networks, together with new proofs that use the concept of "Generation". In our opinion, these proofs are simpler.

Theorem 2. Two isomorphic Bayesian networks have the same partition into generations of descendants and generations of ancestors.
Proof.
1. Suppose that the vertices of the two Bayesian networks have been partitioned into generations of descendants and that the partitions differ. Then, in the generation where they differ, there are vertices that have a different number of parents in each network. This contradicts the isomorphism of the two networks.
2. Suppose that the vertices of the two Bayesian networks have been partitioned into generations of ancestors and that the partitions differ. Then, in the generation where they differ, there are vertices that have a different number of children in each network. This contradicts the isomorphism of the two networks.
The problem of verifying the isomorphism of two Bayesian networks is simplified by using the concept of "Generation", since two networks with different partitions cannot be isomorphic.
Theorem 3. A Bayesian network admits a topological numbering.
Proof.
Let the vertices of the Bayesian network be partitioned into generations of descendants.

The first method of numbering
First, we number the nodes of the zero generation in arbitrary order, then continue the numbering with the nodes of the first generation, then with the nodes of the second generation, etc.
Since the parents of each node are in previous generations, the numbering will be topological.
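The first method can be sketched as follows. The arc list ASIA_ARCS is an assumption reconstructed from the generations listed for the example, and the function name topological_numbering is hypothetical. Within a generation the order is arbitrary; the sketch sorts the names only to make the result reproducible.

```python
# Arcs (parent, child) of the "Asia" example. ASSUMED: reconstructed from the
# generations listed in the text, not taken from the original figure.
ASIA_ARCS = [
    ("Age", "Smoker"),
    ("VisitAsia", "Tuberculosis"),
    ("Smoker", "Cancer"),
    ("Smoker", "Bronchitis"),
    ("Tuberculosis", "TbOrCa"),
    ("Cancer", "TbOrCa"),
    ("TbOrCa", "XRay"),
    ("TbOrCa", "Dyspnea"),
    ("Bronchitis", "Dyspnea"),
]


def topological_numbering(arcs):
    """Number the vertices 1..L, generation of descendants by generation."""
    nodes = {v for arc in arcs for v in arc}
    parents = {v: set() for v in nodes}
    for parent, child in arcs:
        parents[child].add(parent)

    numbering = {}
    while len(numbering) < len(nodes):
        # Next generation: unnumbered vertices whose parents are all numbered.
        current = sorted(
            v for v in nodes
            if v not in numbering and all(p in numbering for p in parents[v])
        )
        if not current:
            raise ValueError("the graph contains an oriented cycle")
        for vertex in current:
            numbering[vertex] = len(numbering) + 1
    return numbering


numbering = topological_numbering(ASIA_ARCS)
for vertex, number in sorted(numbering.items(), key=lambda item: item[1]):
    print(number, vertex)
```

Every arc of the assumed graph then goes from a smaller number to a larger one, which is the property required by Definition 18.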

The second method of numbering
Consider all nodes of the zero generation. We take an arbitrary node of the zero generation and consider the set M1 consisting of this node and those of its descendants all of whose parents also belong to M1. We number the nodes of the set M1 as follows. We assign the number 1 to the node from the zero generation. Next, we number the nodes of M1 that belong to the first generation, then the nodes of M1 that belong to the second generation, etc. Then we mark all the numbered nodes.
Next, we take the next node of the zero generation and assign the next number to it. We consider the set M2 consisting of this node and those of its unmarked descendants all of whose parents are either marked or belong to M2. Next, we number the nodes of M2 that belong to the first generation, then the nodes of M2 that belong to the second generation, etc. Then we mark all the numbered nodes.
We do the same with all the remaining nodes of the zero generation. We number these nodes in the same way.
A descendant that has a parent outside the current set is thus postponed to a later set; without this restriction such a node could receive a smaller number than one of its parents. Every node is numbered at the step of the last considered of its ancestors from the zero generation. Since the parents of each node are numbered before the node itself, the numbering will be topological.
Theorem 4. For any node X and any set of nodes M that contains neither X nor descendants of X, a Bayesian network admits a topological numbering in which the number of X is greater than the number of every node of M.

Proof.
Let the vertices of the Bayesian network be partitioned into generations of descendants.
Let's slightly change the second numbering method from Theorem 3.
Let MX be the set of nodes of the considered Bayesian network consisting of the node X and its descendants. Let L be the number of nodes in the considered Bayesian network. We number the network nodes topologically by the second method described in Theorem 3.
We add the number L to the numbers of all nodes of the set MX. The continuous numbering of the Bayesian network nodes is broken. However, the number of any node remains greater than the number of each of its parents: a node outside MX has no parents in MX, and for a node of MX the numbers of its parents either increase by the same amount or do not change. Now we need to restore the continuity of the numbering.
We arrange the obtained node numbers in ascending order and number them. If we take these new numbers as the node numbers, the numbering remains topological. In addition, the numbers of the nodes of the set MX are greater than the numbers of all remaining nodes of the network, and therefore greater than the numbers of the nodes of the set M.
Theorem 5. For any node X and any set of nodes M that contains neither X nor ancestors of X, a Bayesian network admits a topological numbering in which the number of X is less than the number of every node of M.

Proof.
Let the vertices of the Bayesian network be partitioned into generations of ancestors.
Let's slightly change the numbering from Theorem 3.
Let MX be the set of nodes of the considered Bayesian network consisting of the node X and its ancestors. Let L be the number of nodes in the considered Bayesian network. We number the nodes generation by generation as in Theorem 3, but start from the highest generation of ancestors and finish with the zero generation. Since the children of each node belong to lower generations of ancestors, the number of any node is less than the number of each of its children, so the numbering is topological.
We subtract the number L from the numbers of all nodes of the set MX. The continuous numbering of the Bayesian network nodes is broken. However, the number of any node remains less than the number of each of its children: all parents of a node of MX also belong to MX, so the order inside MX is preserved, and an arc from a node of MX to a node outside MX starts at a decreased number. Now we need to restore the continuity of the numbering.
We arrange the obtained node numbers in ascending order and number them. If we take these new numbers as the node numbers, the numbering remains topological. In addition, the numbers of the nodes of the set MX are less than the numbers of all remaining nodes of the network, and therefore less than the numbers of the nodes of the set M.
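The renumbering used in the proofs of Theorems 4 and 5 can be sketched as follows. This is an illustration under stated assumptions, not a definitive implementation: the arc list ASIA_ARCS is reconstructed from the generations listed for the example, NUMBERING is one topological numbering of those nodes, and the names reachable and renumber are hypothetical. The sketch shifts X and its descendants up by L for Theorem 4, and assumes that for Theorem 5 the numbers of X and its ancestors are shifted down by L, so that they become the smallest.

```python
# Arcs (parent, child) of the "Asia" example. ASSUMED: reconstructed from the
# generations listed in the text, not taken from the original figure.
ASIA_ARCS = [
    ("Age", "Smoker"), ("VisitAsia", "Tuberculosis"),
    ("Smoker", "Cancer"), ("Smoker", "Bronchitis"),
    ("Tuberculosis", "TbOrCa"), ("Cancer", "TbOrCa"),
    ("TbOrCa", "XRay"), ("TbOrCa", "Dyspnea"), ("Bronchitis", "Dyspnea"),
]

# One topological numbering of these nodes (generation by generation).
NUMBERING = {
    "Age": 1, "VisitAsia": 2, "Smoker": 3, "Tuberculosis": 4,
    "Bronchitis": 5, "Cancer": 6, "TbOrCa": 7, "Dyspnea": 8, "XRay": 9,
}


def reachable(start, neighbours):
    """Return start together with every vertex reachable via neighbours."""
    seen, stack = {start}, [start]
    while stack:
        for vertex in neighbours.get(stack.pop(), ()):
            if vertex not in seen:
                seen.add(vertex)
                stack.append(vertex)
    return seen


def renumber(numbering, arcs, x, place_last=True):
    """Shift the numbers of the set MX by L, then restore continuity."""
    children, parents = {}, {}
    for parent, child in arcs:
        children.setdefault(parent, set()).add(child)
        parents.setdefault(child, set()).add(parent)

    size = len(numbering)  # the number L from the proofs
    if place_last:   # Theorem 4: MX is X with its descendants, shifted up
        group, shift = reachable(x, children), size
    else:            # Theorem 5: MX is X with its ancestors, shifted down
        group, shift = reachable(x, parents), -size

    shifted = {v: n + (shift if v in group else 0) for v, n in numbering.items()}
    # Number the shifted numbers in ascending order to get 1..L again.
    ordered = sorted(shifted, key=shifted.get)
    return {vertex: i for i, vertex in enumerate(ordered, start=1)}


print(renumber(NUMBERING, ASIA_ARCS, "Smoker", place_last=True))
print(renumber(NUMBERING, ASIA_ARCS, "TbOrCa", place_last=False))
```

In the first call Smoker and its descendants receive the largest numbers; in the second call TbOrCa and its ancestors receive the smallest ones. In both cases every arc of the assumed graph still goes from a smaller number to a larger one.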

Conclusion
This paper shows some possibilities of using the concept of "Generation" in proving some standard theorems of the theory of Bayesian networks. The use of this concept also simplifies some algorithms for problems in Bayesian networks, for example, searching for cycles or checking the isomorphism of two networks.
Here we have presented only a small part of the possibilities of using the concept of "Generation".