Article

A Granulation Strategy-Based Algorithm for Computing Strongly Connected Components in Parallel

School of Computer, Jiangsu University of Science and Technology, Zhenjiang 212100, China
*
Author to whom correspondence should be addressed.
Mathematics 2024, 12(11), 1723; https://doi.org/10.3390/math12111723
Submission received: 30 March 2024 / Revised: 19 May 2024 / Accepted: 27 May 2024 / Published: 31 May 2024
(This article belongs to the Topic New Advances in Granular Computing and Data Mining)

Abstract

Granular computing (GrC) is a methodology for reducing the complexity of problem solving and includes two basic aspects: granulation and granular-based computing. Strongly connected components (SCCs) are a significant subgraph structure in digraphs. In this paper, two new granulation strategies are devised to improve the efficiency of computing SCCs. Firstly, four SCC correlations between vertices are identified, which can be divided into two classes. Secondly, two granulation strategies are designed based on the two classes of SCC correlations. Thirdly, according to the characteristics of the granulation results, the computation of SCCs is parallelized. Finally, a parallel algorithm based on granulation strategy for computing the SCCs of simple digraphs, named GPSCC, is proposed. Experimental results show that GPSCC achieves higher computational efficiency than existing algorithms.
MSC:
62H30; 68T30; 05C85; 68R10

1. Introduction

Strongly connected components (SCCs) are a subgraph structure in digraphs in which every pair of vertices can reach each other. An SCC containing only one vertex is called trivial; otherwise, it is called nontrivial. Nontrivial SCCs are the main target of SCC knowledge discovery tasks.
Several serial [1,2,3,4,5,6,7,8] algorithms have been proposed for computing SCCs. Due to the inherent efficiency disadvantages of serial algorithms compared with parallel algorithms, several parallel [6,9,10,11,12,13,14,15,16,17] algorithms have also been proposed. Tarjan [1] is a well-known serial algorithm for computing SCCs. Its time complexity is O(n+m), where n is the number of vertices and m is the number of edges. However, it is not computationally efficient when dealing with large amounts of data. To address this issue, a parallel SCC algorithm based on a union-find data structure (UFSCC) [10] was proposed. It uses the union-find structure to synchronize vertex state information between processors, preventing the same SCC from being computed repeatedly. The time complexity of UFSCC is O(q·(n+m)), and its space complexity is O(q·n), where q is the number of processors. The nested depth-first search (NDFS) algorithm [13] is a linear temporal logic (LTL) model-checking algorithm. Although it is also serial, it uses a different idea from Tarjan's: the state of each vertex is modified by two search processes, once in a blue DFS and once in a red DFS. The time complexity of NDFS is O(n·log(n)). The multi-core nested depth-first search (MC-NDFS) algorithm [14] is a parallel variant of NDFS. It parallelizes the blue and red DFS using a shuffle (random) function. The time complexity of MC-NDFS is O((n/q)·log(n)), and its space complexity is O(N·log²(N)), where N is the number of threads. The forward–backward (FB) algorithm [15] is an important parallel recursive algorithm for computing SCCs. It uses breadth-first search (BFS) to decompose the digraph into subgraph slices. The time complexity of FB is O(n·(n+m)), and its space complexity is O(m + n·(maximum recursion depth)).
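For reference, Tarjan's single-pass algorithm can be sketched in a few lines. This is a generic Python rendering of the classic index/lowlink bookkeeping, not code from any of the cited implementations; the function and variable names are our own.

```python
from collections import defaultdict

def tarjan_scc(vertices, edges):
    """Tarjan's algorithm: one DFS pass; a vertex whose lowlink equals its
    discovery index is the root of an SCC, which is then popped off the stack.
    Recursive version, suitable for graphs of modest depth."""
    graph = defaultdict(list)
    for u, v in edges:
        graph[u].append(v)

    index, lowlink = {}, {}
    stack, on_stack = [], set()
    sccs, counter = [], [0]

    def strongconnect(v):
        index[v] = lowlink[v] = counter[0]
        counter[0] += 1
        stack.append(v)
        on_stack.add(v)
        for w in graph[v]:
            if w not in index:               # tree edge: recurse
                strongconnect(w)
                lowlink[v] = min(lowlink[v], lowlink[w])
            elif w in on_stack:              # edge back into the current SCC
                lowlink[v] = min(lowlink[v], index[w])
        if lowlink[v] == index[v]:           # v is an SCC root
            component = set()
            while True:
                w = stack.pop()
                on_stack.discard(w)
                component.add(w)
                if w == v:
                    break
            sccs.append(component)

    for v in vertices:
        if v not in index:
            strongconnect(v)
    return sccs
```

Each vertex and edge is visited once, which is where the O(n+m) bound comes from.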
In addition, the OWCTY-BWD-FWD (OBF) algorithm [16] decomposes a digraph into several independent subgraph slices to compute SCCs. In detail, the OWCTY function is repeatedly invoked to eliminate vertices that constitute trivial SCCs. Subsequently, the digraph is divided into subgraph slices using forward and backward reachability searches. Then, the FB algorithm is used to compute the SCCs contained in each slice. The time complexity of OBF is O(n·(n+m)), and its space complexity is O(m + n·(maximum recursion depth)). Parallel identification of SCCs with spanning trees (ISPAN) [17] is a parallel algorithm for computing SCCs. It uses simple spanning trees to identify SCCs. The time complexity of ISPAN is O((n+m)/q).
Granular computing (GrC) [18,19,20,21,22,23] is a methodology for viewing the objective world. It abstracts and divides a complex problem into simple problems. Thus, the complexity of problem solving can be reduced, and its efficiency improved, by introducing the idea of GrC [24,25,26,27,28]. GrC has been applied to uncertainty handling in decision analysis [29,30,31,32,33] and to graph-theoretic problems [34,35,36,37,38,39]. For SCC problem solving, granulation strategies based on SCC correlations have been introduced. In [7], two nontrivial SCC correlations were found, and a granulation strategy was proposed based on them. Vertex granules corresponding to each vertex were constructed using the granulation strategy. If vertex x has been confirmed to belong to a nontrivial SCC, then the vertices in the vertex granule of x can be regarded as redundant vertices for computing SCCs. This conclusion narrows the solution domain without complicated computations and thereby improves the computational efficiency of SCCs. Furthermore, two trivial SCC correlations were found by Cheng et al. [8]. Subsequently, a granulation strategy based on four SCC correlations was proposed. This granulation strategy is unidirectional; as a result, vertices satisfying these correlations are unidirectionally added to the corresponding vertex granule. Therefore, SCC correlations between vertex granules are transitive. Based on this transitivity, a series of redundant vertices can be found by searching vertex granules with BFS. Consequently, the computational efficiency of SCCs was further enhanced. The time complexities of the algorithms in [7,8] are both O(c·(n+m)), where c is the number of nontrivial SCCs.
Vertex granules were used to solve the problem of SCC discovery in a serial manner in [7,8]. However, serial methods have inherent disadvantages in terms of efficiency. In view of this, this paper carries out the following work. (1) Firstly, the SCC correlations between vertices are analyzed in depth, and four new SCC correlations are obtained, which can be divided into two classes: trivial and nontrivial SCC correlations. (2) Secondly, two new granulation strategies are designed based on the above two classes of SCC correlations. These two granulation strategies are implemented by two functions named GVT (granulate vertex by trivial correlations) and GVNT (granulate vertex by nontrivial correlations), respectively. Computations of SCCs for vertices found by GVT can be considered redundant; therefore, these vertices can be added to the vertex granule. The same is true for vertices found by GVNT. (3) Thirdly, computations of SCCs for vertices that do not satisfy either of these granulation strategies must still be performed; a parallel method for computing SCCs is realized based on the characteristics of vertex granules. (4) Finally, a parallel algorithm based on a granulation strategy, named GPSCC, is proposed to compute the SCCs of simple digraphs.
The rest of this paper is arranged as follows: Section 2 recalls related concepts concerning SCCs. Section 3 analyzes the SCC correlations between vertices. Section 4 puts forward a parallel algorithm named GPSCC for computing SCCs. Section 5 displays the experimental results and related analysis. Section 6 concludes this work.

2. Preliminaries

In this section, for the convenience of formal description, some basic concepts of SCCs are briefly introduced.
Definition 1 
([40]). A directed graph (or digraph) D = { U , E } consists of a nonempty finite set U of elements called vertices and a finite set E of ordered pairs of distinct vertices called directed edges. The size of U is denoted by | U | or n. The size of E is denoted by | E | or m.
There is a binary relation '⇝' on U called the reachability relation. For two distinct vertices x and y in U, x is reachable to y (x ⇝ y) if there is a sequence of vertices and edges leading from x to y. Then, x is called an ancestor of y, and y is called a descendant of x. If the sequence contains only one edge, then x is called a predecessor of y, and y is called a successor of x. descendants(x) is the set of vertices reachable from x, ancestors(x) is the set of vertices reachable to x, successors(x) is the set of vertices reachable from x by a single edge, and predecessors(x) is the set of vertices reachable to x by a single edge. For convenience, descendants(x) is abbreviated to des(x), ancestors(x) to anc(x), successors(x) to suc(x), and predecessors(x) to pre(x). Notably, x is always reachable to itself (x ⇝ x); thus, x ∈ des(x) and x ∈ anc(x), which also means that the set des(x) ∩ anc(x) contains x.
A loop is a directed edge that connects a vertex to itself. If a digraph contains neither loops nor multiple directed edges, it is called a simple digraph. All digraphs mentioned in this paper are simple digraphs. For a simple digraph, x ∉ suc(x) and x ∉ pre(x) can be concluded.
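The four neighborhood sets just defined are easy to materialize from an edge list. As a minimal illustration (the helper names `neighborhoods` and `reach` are our own, not part of the paper's algorithms), suc and pre come directly from the edges, while des and anc are their BFS closures:

```python
from collections import defaultdict, deque

def neighborhoods(edges):
    """Build suc(x) and pre(x) for every vertex from the edge list."""
    suc, pre = defaultdict(set), defaultdict(set)
    for x, y in edges:
        suc[x].add(y)
        pre[y].add(x)
    return suc, pre

def reach(start, adj):
    """BFS closure of `start` under `adj`: des(start) when adj = suc,
    anc(start) when adj = pre. The start vertex itself is included,
    since x is always reachable to itself."""
    seen = {start}
    queue = deque([start])
    while queue:
        v = queue.popleft()
        for w in adj[v]:
            if w not in seen:
                seen.add(w)
                queue.append(w)
    return seen
```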
Definition 2 
([1]). Let D = {U, E} be a digraph. For each pair of vertices x, y ∈ U, if x is always reachable to y (x ⇝ y) and from y (y ⇝ x), then D is strongly connected.
Proposition 1 
([6]). Let D = {U, E} be a digraph. If there are three vertices x, y, z ∈ U such that x ⇝ y and y ⇝ z, then x ⇝ z.
Definition 3 
([1]). Let D = {U, E} be a digraph. We may define an equivalence relation on U as follows: for x, y ∈ U, x and y are equivalent if x ⇝ y and y ⇝ x. Let the distinct equivalence classes under this relation be U_i, 1 ≤ i ≤ |U|. Let D_i = {U_i, E_i}, where E_i = {(x, y) ∈ E | x, y ∈ U_i}. If each D_i is strongly connected, and no D_i is a proper subgraph of a strongly connected subgraph of D, then the subgraphs D_i are referred to as the strongly connected components (SCCs) of D.
Furthermore, D_i is referred to as a trivial SCC if U_i contains only one vertex; D_i is called a nontrivial SCC if U_i contains at least two distinct vertices. For convenience, we use the set TS(D) to collect the vertex sets of trivial SCCs in D, and the set NTS(D) to collect the vertex sets of nontrivial SCCs in D. Obviously, the computation of NTS(D) is the target, rather than TS(D).
Lemma 1. 
Let D = {U, E} be a digraph. For any x ∈ U, des(x) = {x} ∪ suc(x) ∪ ⋃_{y∈suc(x)} des(y).
Proof. 
(1) If z = x, then z ∈ des(x) because x is always reachable to itself (i.e., x ∈ des(x)); if z ∈ suc(x), then z ∈ des(x) according to the definitions of successor and descendant; if z ∈ ⋃_{y∈suc(x)} des(y), then there must be y ∈ suc(x) such that z ∈ des(y). Together, the following can be found:
y ∈ suc(x) ⇒ x ⇝ y, and z ∈ des(y) ⇒ y ⇝ z. By Proposition 1, x ⇝ z, i.e., z ∈ des(x).
To conclude, if z ∈ ({x} ∪ suc(x) ∪ ⋃_{y∈suc(x)} des(y)), there must be z ∈ des(x).
(2) Conversely, z ∈ des(x) and z = x deduce z ∈ {x}.
z ∈ des(x) and z ≠ x deduce x ⇝ z. That is to say, there is a sequence of vertices and edges leading from x to z. If the sequence contains only one edge, then z ∈ suc(x). If the sequence contains at least two edges, then there must be y ∈ suc(x) such that y ⇝ z. In other words, z ∈ ⋃_{y∈suc(x)} des(y). Together, z ∈ des(x) infers z ∈ {x} ∪ suc(x) ∪ ⋃_{y∈suc(x)} des(y).
Combining (1) and (2), des(x) = {x} ∪ suc(x) ∪ ⋃_{y∈suc(x)} des(y) can be obtained. □
Lemma 2. 
Let D = {U, E} be a digraph. For any x ∈ U, anc(x) = {x} ∪ pre(x) ∪ ⋃_{y∈pre(x)} anc(y).
Proof. 
(1) If z = x, then z ∈ anc(x) because x is always reachable to itself (i.e., x ∈ anc(x)); if z ∈ pre(x), then z ∈ anc(x) according to the definitions of predecessor and ancestor; if z ∈ ⋃_{y∈pre(x)} anc(y), then there must be y ∈ pre(x) such that z ∈ anc(y). Together, the following can be obtained:
y ∈ pre(x) ⇒ y ⇝ x, and z ∈ anc(y) ⇒ z ⇝ y. By Proposition 1, z ⇝ x, i.e., z ∈ anc(x).
To conclude, if z ∈ ({x} ∪ pre(x) ∪ ⋃_{y∈pre(x)} anc(y)), then there must be z ∈ anc(x).
(2) Conversely, z ∈ anc(x) and z = x deduce z ∈ {x}.
z ∈ anc(x) and z ≠ x deduce z ⇝ x. That is to say, there is a sequence of vertices and edges leading from z to x. If the sequence contains only one edge, then z ∈ pre(x). If the sequence contains at least two edges, then there must be y ∈ pre(x) such that z ⇝ y. In other words, z ∈ ⋃_{y∈pre(x)} anc(y). Together, z ∈ anc(x) infers z ∈ {x} ∪ pre(x) ∪ ⋃_{y∈pre(x)} anc(y).
Combining (1) and (2), anc(x) = {x} ∪ pre(x) ∪ ⋃_{y∈pre(x)} anc(y) can be obtained. □
Theorem 1. 
Let D = {U, E} be a digraph. For any x ∈ U, if U_x = des(x) ∩ anc(x), then U_x is the vertex set of an SCC of D.
Proof. 
For any pair of vertices y, z ∈ U_x, we can obtain y ∈ (des(x) ∩ anc(x)) and z ∈ (des(x) ∩ anc(x)); then
y ∈ des(x) ⇒ x ⇝ y and y ∈ anc(x) ⇒ y ⇝ x; likewise, z ∈ des(x) ⇒ x ⇝ z and z ∈ anc(x) ⇒ z ⇝ x. By Proposition 1, y ⇝ z and z ⇝ y.
According to Definition 3, it can be obtained that U_x is the vertex set of an SCC of D. □
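Theorem 1 translates directly into a two-BFS procedure: a forward closure over suc and a backward closure over pre, intersected. The following is a hypothetical Python sketch (`scc_of` and `closure` are our own names), with suc and pre assumed to be dictionaries mapping each vertex to its successor and predecessor sets:

```python
from collections import deque

def scc_of(x, suc, pre):
    """Theorem 1 as a procedure: U_x = des(x) ∩ anc(x), computed with one
    forward BFS (over suc) and one backward BFS (over pre)."""
    def closure(adj):
        seen, queue = {x}, deque([x])
        while queue:
            v = queue.popleft()
            for w in adj.get(v, ()):
                if w not in seen:
                    seen.add(w)
                    queue.append(w)
        return seen
    return closure(suc) & closure(pre)
```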
Proposition 2. 
Let D = {U, E} be a digraph. If suc(x) = ∅, then U_x ∈ TS(D).
Proof. 
According to Lemma 1, des(x) = {x} ∪ suc(x) ∪ ⋃_{y∈suc(x)} des(y). If suc(x) = ∅, then des(x) = {x}. According to Lemma 2, anc(x) = {x} ∪ pre(x) ∪ ⋃_{y∈pre(x)} anc(y), so x ∈ anc(x). Using Theorem 1, U_x = des(x) ∩ anc(x) = {x}, which leads to U_x ∈ TS(D). □
Proposition 3. 
Let D = {U, E} be a digraph. If pre(x) = ∅, then U_x ∈ TS(D).
Proof. 
According to Lemma 2, anc(x) = {x} ∪ pre(x) ∪ ⋃_{y∈pre(x)} anc(y). If pre(x) = ∅, then anc(x) = {x}. According to Lemma 1, des(x) = {x} ∪ suc(x) ∪ ⋃_{y∈suc(x)} des(y), so x ∈ des(x). Using Theorem 1, U_x = des(x) ∩ anc(x) = {x}, which leads to U_x ∈ TS(D). □
Example 1. 
Let D = {U, E} be a digraph, as shown in Figure 1a, with U = {a, b, c, d, e, f, g, h, i, j} and E = {(a, b), (b, e), (b, d), (c, b), (d, f), (e, f), (e, d), (f, h), (h, d), (h, g), (i, h), (i, j), (j, i)}. The digraph D contains seven SCCs: five trivial SCCs, D_a, D_b, D_c, D_e, and D_g, and two nontrivial SCCs, D_{d,f,h} and D_{i,j}. The nontrivial SCCs in D are marked as blue regions, and the trivial SCCs as orange regions, as shown in Figure 1b.
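The decomposition claimed in this example can be checked mechanically by applying Theorem 1 (U_x = des(x) ∩ anc(x)) to each not-yet-assigned vertex. The following self-contained script (our own illustrative code, not the paper's implementation) reproduces the seven SCCs of the example digraph:

```python
from collections import defaultdict, deque

# Digraph of Example 1 (Figure 1a)
edges = [('a', 'b'), ('b', 'e'), ('b', 'd'), ('c', 'b'), ('d', 'f'),
         ('e', 'f'), ('e', 'd'), ('f', 'h'), ('h', 'd'), ('h', 'g'),
         ('i', 'h'), ('i', 'j'), ('j', 'i')]
vertices = 'abcdefghij'

suc, pre = defaultdict(set), defaultdict(set)
for x, y in edges:
    suc[x].add(y)
    pre[y].add(x)

def closure(x, adj):
    """BFS reachability closure of x under adj."""
    seen, queue = {x}, deque([x])
    while queue:
        v = queue.popleft()
        for w in adj[v]:
            if w not in seen:
                seen.add(w)
                queue.append(w)
    return seen

assigned, sccs = set(), []
for x in vertices:
    if x not in assigned:
        comp = closure(x, suc) & closure(x, pre)   # U_x = des(x) ∩ anc(x)
        sccs.append(comp)
        assigned |= comp

# Seven SCCs in total; the two nontrivial ones are {d, f, h} and {i, j}.
nontrivial = [c for c in sccs if len(c) > 1]
```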

3. SCC Correlations between Vertices

In this section, we analyze the graph concept of SCCs, and four SCC correlations are found, as shown in Theorems 2–5. If two distinct vertices x and y satisfy any of these theorems, then after computing SCCs for x, computing SCCs for y will not deduce any new nontrivial SCC. In other words, vertex y would lead to redundant computation. Theorems 2–5 can therefore be used to avoid such redundant computations.
Theorem 2. 
Let D = {U, E} be a digraph. If suc(x) satisfies ∀y (y ∈ suc(x) → U_y ∈ TS(D)), then U_x ∈ TS(D).
Proof. 
According to Theorem 1, U_x = anc(x) ∩ des(x). It is known that x ∈ (anc(x) ∩ des(x)). Let A = U_x \ {x}; then U_x = {x} ∈ TS(D) if A is judged to be empty.
Suppose that A ≠ ∅. Then A ⊆ U_x ⇒ A ⊆ (anc(x) ∩ des(x)) ⇒ A ⊆ anc(x) and A ⊆ des(x). According to Lemma 1, des(x) = {x} ∪ suc(x) ∪ ⋃_{y∈suc(x)} des(y); then A ⊆ des(x) implies A = (A ∩ {x}) ∪ (A ∩ suc(x)) ∪ (A ∩ ⋃_{y∈suc(x)} des(y)).
(1) Since A = U_x \ {x}, it is obvious that x ∉ A. Then, A ∩ {x} = ∅.
(2) Suppose A ∩ suc(x) ≠ ∅; then ∃y (y ∈ A ∧ y ∈ suc(x)). Since A = U_x \ {x}, it is obvious that A ⊆ U_x = anc(x) ∩ des(x). y ∈ A implies y ∈ (anc(x) ∩ des(x)); then y ⇝ x and x ⇝ y can be obtained. Furthermore, y ∈ suc(x) implies y ≠ x because x ∉ suc(x), which results from the fact that there is no loop in a simple digraph. That is to say, there are two distinct vertices x and y that are reachable to each other. Then, U_y = U_x ∈ NTS(D) can be obtained according to Definition 3. To sum up, ∃y (y ∈ suc(x) ∧ U_y ∈ NTS(D)) is true. However, this conflicts with the given condition ∀y (y ∈ suc(x) → U_y ∈ TS(D)). By contradiction, the hypothesis A ∩ suc(x) ≠ ∅ is false. Then, A ∩ suc(x) = ∅.
(3) Suppose A ∩ ⋃_{y∈suc(x)} des(y) ≠ ∅; then ∃y (y ∈ suc(x) ∧ A ∩ des(y) ≠ ∅). Firstly, y ∈ suc(x) deduces x ⇝ y and x ≠ y. Secondly, A ∩ des(y) ≠ ∅ ⇒ ∃z (z ∈ A ∧ z ∈ des(y)). Because A ⊆ (anc(x) ∩ des(x)), we can obtain z ∈ A ⇒ z ∈ (anc(x) ∩ des(x)) ⇒ z ⇝ x. z ∈ des(y) means y ⇝ z. Then, y ⇝ x holds on the basis of y ⇝ z and z ⇝ x. x ⇝ y, y ⇝ x, and x ≠ y show that there are two distinct vertices x and y that are reachable to each other. Then, U_y = U_x ∈ NTS(D) can be obtained according to Definition 3. To sum up, ∃y (y ∈ suc(x) ∧ U_y ∈ NTS(D)) is true, which conflicts with the given condition ∀y (y ∈ suc(x) → U_y ∈ TS(D)). By contradiction, the hypothesis A ∩ ⋃_{y∈suc(x)} des(y) ≠ ∅ is false. Then, A ∩ ⋃_{y∈suc(x)} des(y) = ∅.
Combining (1), (2), and (3), A = ∅ can be concluded. Thus, U_x = {x}, i.e., U_x ∈ TS(D). Overall, if suc(x) satisfies ∀y (y ∈ suc(x) → U_y ∈ TS(D)), then U_x ∈ TS(D). □
Theorem 3. 
Let D = {U, E} be a digraph. If pre(x) satisfies ∀y (y ∈ pre(x) → U_y ∈ TS(D)), then U_x ∈ TS(D).
Proof. 
According to Theorem 1, U_x = anc(x) ∩ des(x). It is known that x ∈ (anc(x) ∩ des(x)). Let A = U_x \ {x}; then U_x = {x} ∈ TS(D) if A = ∅.
Suppose that A ≠ ∅. Then A ⊆ U_x ⇒ A ⊆ (anc(x) ∩ des(x)) ⇒ A ⊆ anc(x) and A ⊆ des(x). According to Lemma 2, anc(x) = {x} ∪ pre(x) ∪ ⋃_{y∈pre(x)} anc(y); then A ⊆ anc(x) implies A = (A ∩ {x}) ∪ (A ∩ pre(x)) ∪ (A ∩ ⋃_{y∈pre(x)} anc(y)).
(1) Since A = U_x \ {x}, it is obvious that x ∉ A. Then, A ∩ {x} = ∅.
(2) Suppose A ∩ pre(x) ≠ ∅; then ∃y (y ∈ A ∧ y ∈ pre(x)). Since A = U_x \ {x}, it is obvious that A ⊆ (anc(x) ∩ des(x)). y ∈ A implies y ∈ (anc(x) ∩ des(x)); then y ⇝ x and x ⇝ y can be obtained. Furthermore, y ∈ pre(x) implies y ≠ x because x ∉ pre(x), which results from the fact that there is no loop in a simple digraph. That is to say, there are two distinct vertices x and y that are reachable to each other. Then, U_y = U_x ∈ NTS(D) can be obtained according to Definition 3. To sum up, ∃y (y ∈ pre(x) ∧ U_y ∈ NTS(D)) is true. However, this conflicts with the given condition ∀y (y ∈ pre(x) → U_y ∈ TS(D)). By contradiction, the hypothesis A ∩ pre(x) ≠ ∅ is false. Then, A ∩ pre(x) = ∅.
(3) Suppose A ∩ ⋃_{y∈pre(x)} anc(y) ≠ ∅; then ∃y (y ∈ pre(x) ∧ A ∩ anc(y) ≠ ∅). Firstly, y ∈ pre(x) deduces y ⇝ x and x ≠ y. Secondly, A ∩ anc(y) ≠ ∅ ⇒ ∃z (z ∈ A ∧ z ∈ anc(y)). Because A ⊆ (anc(x) ∩ des(x)), we can obtain z ∈ A ⇒ z ∈ (anc(x) ∩ des(x)) ⇒ x ⇝ z. z ∈ anc(y) means z ⇝ y. Then, x ⇝ y holds on the basis of x ⇝ z and z ⇝ y. y ⇝ x, x ⇝ y, and x ≠ y show that there are two distinct vertices x and y that are reachable to each other. Then, U_y = U_x ∈ NTS(D) can be obtained according to Definition 3. To sum up, ∃y (y ∈ pre(x) ∧ U_y ∈ NTS(D)) is true, which conflicts with the given condition ∀y (y ∈ pre(x) → U_y ∈ TS(D)). By contradiction, the hypothesis A ∩ ⋃_{y∈pre(x)} anc(y) ≠ ∅ is false. Then, A ∩ ⋃_{y∈pre(x)} anc(y) = ∅.
Combining (1), (2), and (3), A = ∅ can be concluded. Thus, U_x = {x}, i.e., U_x ∈ TS(D). Overall, if pre(x) satisfies ∀y (y ∈ pre(x) → U_y ∈ TS(D)), then U_x ∈ TS(D). □
Theorem 4. 
Let D = {U, E} be a digraph. ∀x ∈ U, if U_x ∈ NTS(D), then ∃y (y ∈ suc(x) ∧ y ∈ U_x).
Proof. 
According to Theorem 1, U_x = anc(x) ∩ des(x). If U_x ∈ NTS(D), then ∃z (z ∈ (anc(x) ∩ des(x)) ∧ x ≠ z). In other words, there are at least two distinct vertices x and z in the nontrivial SCC corresponding to U_x. Along with Definition 3, z ⇝ x and x ⇝ z can be obtained. x ⇝ z means that z ∈ des(x). According to Lemma 1, des(x) = {x} ∪ suc(x) ∪ ⋃_{y∈suc(x)} des(y). Then, z ∈ des(x) ∧ z ≠ x deduces z ∈ suc(x) or z ∈ ⋃_{y∈suc(x)} des(y).
(1) If z ∈ suc(x), then ∃z (z ∈ suc(x) ∧ z ∈ U_x). By replacing z with y, ∃y (y ∈ suc(x) ∧ y ∈ U_x) can be obtained.
(2) If z ∈ ⋃_{y∈suc(x)} des(y), then ∃y (y ∈ suc(x) ∧ z ∈ des(y)). Firstly, y ∈ suc(x) deduces y ∈ des(x) (y ≠ x), which means x ⇝ y. Secondly, z ∈ des(y) deduces y ⇝ z; along with the z ⇝ x obtained above, y ⇝ x can be concluded according to Proposition 1. x ⇝ y, y ⇝ x, and x ≠ y show that there are two distinct vertices x and y that are reachable to each other; that is, y ∈ U_x. Overall, there must be ∃y (y ∈ suc(x) ∧ y ∈ U_x).
Combining (1) and (2), if U_x ∈ NTS(D), then ∃y (y ∈ suc(x) ∧ y ∈ U_x). □
Theorem 5. 
Let D = {U, E} be a digraph. ∀x ∈ U, if U_x ∈ NTS(D), then ∃y (y ∈ pre(x) ∧ y ∈ U_x).
Proof. 
According to Theorem 1, U_x = anc(x) ∩ des(x). If U_x ∈ NTS(D), then ∃z (z ∈ (anc(x) ∩ des(x)) ∧ x ≠ z). In other words, there are at least two distinct vertices x and z in the nontrivial SCC corresponding to U_x. Along with Definition 3, z ⇝ x and x ⇝ z can be obtained. z ⇝ x means that z ∈ anc(x). According to Lemma 2, anc(x) = {x} ∪ pre(x) ∪ ⋃_{y∈pre(x)} anc(y). Then, z ∈ anc(x) ∧ z ≠ x deduces z ∈ pre(x) or z ∈ ⋃_{y∈pre(x)} anc(y).
(1) If z ∈ pre(x), then ∃z (z ∈ pre(x) ∧ z ∈ U_x). By replacing z with y, ∃y (y ∈ pre(x) ∧ y ∈ U_x) can be obtained.
(2) If z ∈ ⋃_{y∈pre(x)} anc(y), then ∃y (y ∈ pre(x) ∧ z ∈ anc(y)). Firstly, y ∈ pre(x) deduces y ∈ anc(x) (y ≠ x), which means y ⇝ x. Secondly, z ∈ anc(y) deduces z ⇝ y; along with the x ⇝ z obtained above, x ⇝ y can be concluded according to Proposition 1. x ⇝ y, y ⇝ x, and x ≠ y show that there are two distinct vertices x and y that are reachable to each other; that is, y ∈ U_x. Overall, there must be ∃y (y ∈ pre(x) ∧ y ∈ U_x).
Combining (1) and (2), if U_x ∈ NTS(D), then ∃y (y ∈ pre(x) ∧ y ∈ U_x). □

4. Proposed Algorithm

According to the characteristics of SCC correlations in Theorems 2–5, they can be divided into two classes: trivial SCC correlations and nontrivial SCC correlations. Based on the two classes of SCC correlations, two granulation strategies were designed to granulate the vertex set U of the given digraph. According to the two granulation strategies, a parallel algorithm named GPSCC was proposed for computing SCCs of simple digraphs. Next, an example was provided to demonstrate how the proposed algorithm works.

4.1. GPSCC

It is worth mentioning that the vertices satisfying Propositions 2 or 3 certainly construct a trivial SCC. Let T be the set of vertices that need to be computed for SCCs, then the vertices satisfying Proposition 2 or 3 can be deleted immediately from T.
In addition, guided by the two classes of SCC correlations described in Theorems 2–5, two granulation strategies are provided as follows. (1) The granulation strategy based on trivial SCC correlations (Theorems 2 and 3). The granulation process is implemented by a function named GVT (granulate vertex by trivial correlations), as shown in Algorithm 1. For a vertex x, if every predecessor (or every successor) of x is already known to form a trivial SCC, then x is added to the vertex granule granu_gvt. (2) The granulation strategy based on nontrivial SCC correlations (Theorems 4 and 5). The granulation process is implemented by a function named GVNT (granulate vertex by nontrivial correlations), as shown in Algorithm 2. For a vertex x, if every predecessor (or every successor) of x still belongs to T, then x is added to the vertex granule granu_gvnt.
Algorithm 1 GVT: Granulate vertex by trivial correlations
 1: function GVT(T, T0)
 2:     granu_gvt ← ∅;
 3:     for x ∈ T do
 4:         if pre(x) ⊆ T0 ∪ granu_gvt then
 5:             granu_gvt ← granu_gvt ∪ {x};
 6:         else
 7:             if suc(x) ⊆ T0 ∪ granu_gvt then
 8:                 granu_gvt ← granu_gvt ∪ {x};
 9:             end if
10:         end if
11:     end for
12:     return granu_gvt;
13: end function
Algorithm 2 GVNT: Granulate vertex by nontrivial correlations
 1: function GVNT(T)
 2:     granu_gvnt ← ∅; T_temp ← T;
 3:     while T_temp ≠ ∅ do
 4:         x ← T_temp(1);
 5:         if pre(x) ⊆ T then
 6:             granu_gvnt ← granu_gvnt ∪ {x}; T ← T \ {x}; T_temp ← T_temp \ ({x} ∪ pre(x));
 7:         else
 8:             if suc(x) ⊆ T then
 9:                 granu_gvnt ← granu_gvnt ∪ {x}; T ← T \ {x}; T_temp ← T_temp \ ({x} ∪ suc(x));
10:             else
11:                 T_temp ← T_temp \ {x}; // ensure termination when neither condition holds
12:             end if
13:         end if
14:     end while
15:     return granu_gvnt;
16: end function
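As a concrete sequential reading of Algorithms 1 and 2, the two granulation functions can be sketched in Python as follows. The helpers `gvt` and `gvnt` are our own illustrative names; pre and suc are assumed to be dictionaries mapping each vertex to its predecessor and successor sets, and vertices are processed in sorted order to give the traversal of T a fixed order:

```python
def gvt(T, T0, pre, suc):
    """Sketch of Algorithm 1 (Theorems 2 and 3): a vertex all of whose
    predecessors (or all of whose successors) are already known to form
    trivial SCCs must itself form a trivial SCC."""
    granule = set()
    trivial = set(T0)            # vertices currently known to be trivial
    for x in sorted(T):          # process T in a fixed order
        if pre[x] <= trivial or suc[x] <= trivial:
            granule.add(x)
            trivial.add(x)       # x may in turn make later vertices trivial
    return granule

def gvnt(T, pre, suc):
    """Sketch of Algorithm 2 (Theorems 4 and 5): if every predecessor (or
    every successor) of x is still in T, then x's SCC, if nontrivial, can be
    recovered from those neighbours, so computing it for x is redundant."""
    T = set(T)                   # work on a copy
    granule = set()
    pending = sorted(T)          # plays the role of T_temp
    while pending:
        x = pending.pop(0)
        if pre[x] and pre[x] <= T:
            granule.add(x); T.discard(x)
            pending = [v for v in pending if v not in pre[x]]
        elif suc[x] and suc[x] <= T:
            granule.add(x); T.discard(x)
            pending = [v for v in pending if v not in suc[x]]
    return granule
```

On the digraph of Example 1, gvt over T = {b, d, e, f, h, i, j} with T0 = {a, c, g} returns {b, e}, and gvnt over the remaining {d, f, h, i, j} returns {d, h, j}, matching the worked computation in Section 4.2.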
Notably, the vertices in granu_gvt or granu_gvnt can be deleted immediately from T. This deletion is justified as follows: the SCC corresponding to any vertex in granu_gvt or granu_gvnt is either trivial or nontrivial; the former is not the target of SCC discovery tasks, and the latter is indeed a computation target, but it can be deduced from other vertices that still belong to T.
According to Theorem 1, the computation of SCCs for vertex x can proceed independently of the computation of SCCs for vertex y. Therefore, the computations of SCCs for the vertices in the current T can proceed in parallel. Based on the above analysis, a parallel algorithm for computing SCCs based on a granulation strategy, named GPSCC, is proposed, as shown in Algorithm 3. In general, the number of processors is denoted by q. SCC discovery for a single vertex requires at most (n+m) computations. Thus, if the target digraph contains c nontrivial SCCs, the time complexity of each processor is at most ⌈c/q⌉·(n+m). In conclusion, the worst-case time complexity of GPSCC is O((c/q)·(n+m)). In addition, one-dimensional arrays are used to store the descendants (or ancestors) of vertices, and each array has at most n elements. Thus, the space consumed across the q processors is at most n·q + m, so the worst-case space complexity of GPSCC is O(n·q + m). This means that, even in the worst case, GPSCC has lower time complexity and comparable space complexity relative to the FB, OBF, and ISPAN algorithms.
Algorithm 3 GPSCC: The parallel algorithm for finding strongly connected components of simple digraphs based on granulation strategy
Input: D = {U, E}
Output: The nontrivial SCCs in D: SCCset
 1: SCCset ← ∅; T ← U; T0 ← ∅;
 2: P_0, P_1, …, P_{q−1}; // q is the number of processors
 3: for i ← 1 to |T| do
 4:     compute suc(T(i)) in processor P_{(i−1) % q};
 5:     compute pre(T(i)) in processor P_{(i−1) % q};
 6: end for
 7: for x ∈ U do
 8:     if pre(x) = ∅ or suc(x) = ∅ then
 9:         T0 ← T0 ∪ {x};
10:     end if
11: end for
12: T ← T \ T0;
13: granu_gvt ← GVT(T, T0); T ← T \ granu_gvt;
14: granu_gvnt ← GVNT(T); T ← T \ granu_gvnt;
15: for i ← 1 to |T| do
16:     compute U_{T(i)} = des(T(i)) ∩ anc(T(i)) in processor P_{(i−1) % q};
17:     add U_{T(i)} to SCCset;
18: end for
19: Output SCCset;
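Putting the pieces together, the whole GPSCC pipeline admits a compact sketch: prune by Propositions 2 and 3, granulate as in GVT and GVNT, and then compute U_x = des(x) ∩ anc(x) for the surviving vertices in parallel. The following is an illustrative Python rendering (thread-pool parallelism standing in for the paper's q processors; all names are ours), not the authors' MATLAB implementation:

```python
from collections import defaultdict, deque
from concurrent.futures import ThreadPoolExecutor

def gpscc(vertices, edges, q=2):
    """Illustrative GPSCC pipeline: prune by Propositions 2-3, granulate by
    the GVT and GVNT strategies, then compute U_x = des(x) ∩ anc(x) for the
    survivors, one task per vertex, on a pool of q workers."""
    suc, pre = defaultdict(set), defaultdict(set)
    for x, y in edges:
        suc[x].add(y)
        pre[y].add(x)

    def closure(x, adj):                    # BFS reachability
        seen, queue = {x}, deque([x])
        while queue:
            v = queue.popleft()
            for w in adj[v]:
                if w not in seen:
                    seen.add(w)
                    queue.append(w)
        return seen

    T = set(vertices)
    T0 = {x for x in T if not pre[x] or not suc[x]}   # Propositions 2 and 3
    T -= T0

    trivial = set(T0)                       # GVT (Theorems 2 and 3)
    for x in sorted(T):
        if pre[x] <= trivial or suc[x] <= trivial:
            trivial.add(x)
    T -= trivial

    pending = sorted(T)                     # GVNT (Theorems 4 and 5)
    while pending:
        x = pending.pop(0)
        if pre[x] <= T:
            T.discard(x)
            pending = [v for v in pending if v not in pre[x]]
        elif suc[x] <= T:
            T.discard(x)
            pending = [v for v in pending if v not in suc[x]]

    with ThreadPoolExecutor(max_workers=q) as pool:   # parallel Theorem 1 step
        comps = list(pool.map(lambda x: closure(x, suc) & closure(x, pre),
                              sorted(T)))
    return [c for c in comps if len(c) > 1]
```

On the digraph of Example 1 this returns the two nontrivial SCCs, one per surviving representative vertex.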

4.2. A Comparative Study

Take the digraph D in Figure 1a as an example to demonstrate the computation process of GPSCC in detail.
Example 2. 
In this example, assume q = 2 , which means there are two processors: P 0 and P 1 .
(1) Let T be the set of vertices for which SCCs need to be computed. T is initialized as T = U = {a, b, c, d, e, f, g, h, i, j}.
(2) According to Propositions 2 and 3, it can be deduced that a, c, and g form trivial SCCs, i.e., D_a, D_c, and D_g. This means U_a ∈ TS(D), U_c ∈ TS(D), and U_g ∈ TS(D). Therefore, T = T \ {a, c, g} = {b, d, e, f, h, i, j}.
(3) Invoking GVT on the current T, it can be determined that U_b and U_e must belong to TS(D); then, granu_gvt = {b, e} and T = T \ granu_gvt = {d, f, h, i, j}.
(4) Invoking GVNT on the current T, T_temp is initialized to the current T, so T_temp = {d, f, h, i, j}. For the first vertex d, since suc(d) = {f} ⊆ T, granu_gvnt = {d}, T = {f, h, i, j}, and T_temp = {h, i, j}. For vertex h, since pre(h) = {f, i} ⊆ T, granu_gvnt = {d, h}, T = {f, i, j}, and T_temp = {j}. For vertex j, since pre(j) = {i} ⊆ T, granu_gvnt = {d, h, j}, T = {f, i}, and T_temp = ∅. Finally, granu_gvnt = {d, h, j}; thus, T = T \ granu_gvnt = {f, i}.
(5) The task of computing SCCs for vertex T(i) is performed by processor P_{(i−1) % q}. This means that the computation of SCCs for vertex T(1) = f is performed by processor P_0, which deduces U_f = {d, f, h}. Then, the computation of SCCs for vertex T(2) = i is performed by processor P_1, which deduces U_i = {i, j}. Lastly, the two nontrivial SCCs D_{d,f,h} and D_{i,j} are obtained.

5. Experiments

All experiments were performed on an AMD (R5 2600) processor with 12 logical processors and 24 GB of memory. MATLAB R2017a (64-bit) was the experimental platform. The Tarjan [1], FB [15], OBF [16], and ISPAN [17] algorithms were used as comparison algorithms to verify the efficiency of the GPSCC algorithm. In total, 12 UFMC datasets [41] were used in these experiments, as shown in Table 1, where n is the number of vertices, m is the number of edges, and c is the number of nontrivial SCCs in a dataset. In addition, each graph was stored as a matrix; considering the impact of cache memory on the speedup effect, data were read and written in column-major order.
The computational efficiencies of the Tarjan, FB, OBF, ISPAN, and GPSCC algorithms are shown in Table 2, Table 3, Table 4, Table 5, Table 6 and Table 7. Columns 2–5 of each table give the execution times of the four algorithms, with the optimal values set in bold. S_t represents the speedup of FB, OBF, ISPAN, and GPSCC relative to Tarjan; in detail, S_t is computed as the ratio of the time consumed by Tarjan to that consumed by FB, OBF, ISPAN, or GPSCC. In addition, the acceleration of the speedup is computed as the change in speedup divided by the change in the number of processors.
Table 2, Table 3, Table 4, Table 5, Table 6 and Table 7 show that GPSCC achieves a higher speedup over Tarjan than the FB, OBF, and ISPAN algorithms. Furthermore, on the same dataset, the speedup of GPSCC over Tarjan increases with the number of processors. Once the number of processors is fixed, the effect of the granulation strategies depends strongly on the structure of the given digraph. The changing curves of the speedup are shown in Figure 2.
From Figure 2, three phenomena can be observed: (1) the speedup of GPSCC over Tarjan increases significantly with the number of processors, as shown in Figure 2c–g,j–l; correspondingly, the acceleration of GPSCC over Tarjan is greater than zero on average. (2) The speedup of GPSCC over Tarjan increases slowly with the number of processors, as shown in Figure 2h,i; correspondingly, the acceleration tends to zero on average. (3) The speedup of GPSCC over Tarjan decreases with the number of processors, although GPSCC remains much faster than Tarjan, as shown in Figure 2a,b; correspondingly, the acceleration is less than zero on average.
For the first phenomenon, take the EVA dataset as an example. The SCCs in this digraph are of similar scale; therefore, the time consumed to compute each SCC is similar, as shown in Figure 3a. The SCC computations can thus be distributed approximately evenly across the processors, which keeps the processor load balanced, as shown in Figure 4a.
For the second phenomenon, take the FA dataset as an example. Firstly, the number of SCCs that need to be computed is fixed. Secondly, a few large SCCs consume a large proportion of the total time, as shown in Figure 3b, which leads to processor load imbalance, as shown in Figure 4b. As a result, increasing the number of processors has little effect on the overall time consumption.
For the third phenomenon, take the DK01R and CollegeMsg datasets as examples. The scales of these digraphs and their SCCs are relatively small; therefore, the time consumed by data interaction and resource allocation may exceed that of computing the SCCs, as shown in Figure 5a,b. As the number of processors increases, the time consumed by data interaction and resource allocation also increases, which negatively impacts the overall time.
In conclusion, the structure of the digraph affects the time consumed by computing each SCC; the distribution of this time across SCCs determines whether the processor load is balanced; and the processor load balance in turn determines the efficiency of GPSCC for computing SCCs. Therefore, the speedup of GPSCC over Tarjan is strongly influenced by the digraph structure.

6. Conclusions and Future Perspectives

In this paper, firstly, four SCC correlations were identified, which can be divided into two classes: trivial and nontrivial SCC correlations. Secondly, two granulation strategies were designed based on the proposed correlations and implemented by two functions named GVT and GVNT. Based on the characteristics of the granulation results formed by GVT and GVNT, the parallel computation of SCCs was realized. Finally, a parallel algorithm named GPSCC for computing SCCs based on granulation strategy was proposed. Experiments show that GPSCC has better computational efficiency than the compared algorithms. In the future, the idea of granulating the vertex set can be applied to the discovery of other knowledge in digraphs, and the granulation strategies can be introduced into the edge set for further knowledge discovery tasks.

Author Contributions

Investigation, Resources, Data Curation, Writing—Review and Editing, T.X.; Conceptualization, Methodology, Formal analysis, Investigation, Writing—Original Draft, Writing—Review and Editing, Project administration, H.H.; Investigation, Resources, J.C.; Resources, Data Curation, Y.C.; Data Curation, J.S. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by the National Natural Science Foundation of China (Nos. 62006099, 62076111). The authors would like to thank the anonymous reviewers for their insightful and constructive comments, which greatly improved the quality of this paper.

Data Availability Statement

Data are contained within the article.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Tarjan, R. Depth-first search and linear graph algorithms. SIAM J. Comput. 1972, 1, 146–160. [Google Scholar] [CrossRef]
  2. Bernstein, A.; Gutenberg, M.; Saranurak, T. Deterministic decremental reachability, scc, and shortest paths via directed expanders and congestion balancing. In Proceedings of the 2020 IEEE 61st Annual Symposium on Foundations of Computer Science (FOCS), Durham, NC, USA, 16–19 November 2020; pp. 1123–1134. [Google Scholar]
  3. Baswana, S.; Choudhary, K.; Roditty, L. An efficient strongly connected components algorithm in the fault tolerant model. Algorithmica 2019, 81, 967–985. [Google Scholar] [CrossRef]
  4. Wan, X.; Wang, H. Efficient semi-external SCC computation. IEEE Trans. Knowl. Data Eng. 2023, 35, 3794–3807. [Google Scholar] [CrossRef]
  5. Bernstein, A.; Gutenberg, M.; Wulff-Nilsen, C. Decremental strongly connected components and single-source reachability in near-linear time. SIAM J. Comput. 2023, 52, 128–155. [Google Scholar] [CrossRef]
  6. Xu, T.; Wang, G. Finding strongly connected components of simple digraphs based on generalized rough sets theory. Knowl.-Based Syst. 2018, 149, 88–98. [Google Scholar] [CrossRef]
  7. Xu, T.; Wang, G.; Yang, J. Finding strongly connected components of simple digraphs based on granulation strategy. Int. J. Approx. Reason. 2020, 118, 64–78. [Google Scholar] [CrossRef]
  8. Cheng, F.; Xu, T.; Chen, J.; Song, J.; Yang, X. The algorithm for finding strongly connected components based on k-step search of vertex granule and rough set theory. Comput. Sci. 2022, 49, 97–107. (In Chinese) [Google Scholar]
  9. Chen, X.; Chen, C.; Shen, J.; Fang, J.; Tang, T.; Yang, C.; Wang, Z. Orchestrating parallel detection of strongly connected components on gpus. Parallel Comput. 2018, 78, 101–114. [Google Scholar] [CrossRef]
  10. Bloemen, V.; Laarman, A.; Pol, J.v. Multi-core on-the-fly scc decomposition. In Proceedings of the 21st ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, Barcelona, Spain, 12–16 March 2016; pp. 1–12. [Google Scholar]
  11. Barnat, J.; Chaloupka, J.; Pol, J.V.D. Distributed algorithms for scc decomposition. J. Log. Comput. 2011, 21, 23–44. [Google Scholar] [CrossRef]
  12. Evangelista, S.; Petrucci, L.; Youcef, S. Parallel nested depth-first searches for ltl model checking. In Automated Technology for Verification and Analysis, Proceedings of the 9th International Symposium, ATVA 2011, Taipei, Taiwan, 11–14 October 2011; Springer: Berlin/Heidelberg, Germany, 2011; pp. 381–396. [Google Scholar]
  13. Courcoubetis, C.; Vardi, M.; Wolper, P.; Yannakakis, M. Memory-efficient algorithms for the verification of temporal properties. Form. Methods Syst. Des. 1992, 1, 275–288. [Google Scholar] [CrossRef]
  14. Laarman, A.; Langerak, R.; Pol, J.; Weber, M.; Wijs, A. Multi-core nested depth-first search. In Automated Technology for Verification and Analysis, Proceedings of the 9th International Symposium, ATVA 2011, Taipei, Taiwan, 11–14 October 2011; Springer: Berlin/Heidelberg, Germany, 2011; Volume 6996, pp. 321–335. [Google Scholar]
  15. Fleischer, L.; Hendrickson, B.; Pınar, A. On identifying strongly connected components in parallel. In Parallel and Distributed Processing; Rolim, J., Ed.; Springer: Berlin/Heidelberg, Germany, 2000; pp. 505–511. [Google Scholar]
  16. Barnat, J.; Moravec, P. Parallel algorithms for finding sccs in implicitly given graphs. In Formal Methods: Applications and Technology; Brim, L., Haverkort, B., Leucker, M., van de Pol, J., Eds.; Springer: Berlin/Heidelberg, Germany, 2007; pp. 316–330. [Google Scholar]
  17. Ji, Y.; Liu, H.; Hu, Y.; Huang, H. iSpan: Parallel identification of strongly connected components with spanning trees. ACM Trans. Parallel Comput. 2022, 9, 1–27. [Google Scholar] [CrossRef]
  18. Zhang, T.; Zhang, Y.; Ma, F.; Peng, C.; Yue, D.; Pedrycz, W. Local boundary fuzzified rough k-means-based information granulation algorithm under the principle of justifiable granularity. IEEE Trans. Cybern. 2024, 54, 519–532. [Google Scholar] [CrossRef] [PubMed]
  19. Yao, Y. Three-way decision and granular computing. Int. J. Approx. Reason. 2018, 103, 107–123. [Google Scholar] [CrossRef]
  20. Cheng, Y.; Zhao, F.; Zhang, Q.; Wang, G. A survey on granular computing and its uncertainty measure from the perspective of rough set theory. Granul. Comput. 2021, 6, 3–17. [Google Scholar] [CrossRef]
  21. Wu, C.; Zhang, Q.; Yin, L.; Xie, Q.; Luo, N.; Wang, G. Data-driven interval granulation approach based on uncertainty principle for efficient classification. IEEE Trans. Fuzzy Syst. 2024, 32, 12–26. [Google Scholar] [CrossRef]
  22. Li, L.; Li, M.; Mi, J. Granular structure evaluation and selection based on justifiable granularity principle. Inf. Sci. 2024, 665, 120403. [Google Scholar] [CrossRef]
  23. Chen, L.; Zhao, L.; Xiao, Z.; Liu, Y.; Wang, J. A granular computing based classification method from algebraic granule structure. IEEE Access 2021, 9, 68118–68126. [Google Scholar] [CrossRef]
  24. Zhang, Q.; Wu, C.; Xia, S.; Zhao, F.; Gao, M.; Cheng, Y.; Wang, G. Incremental learning based on granular ball rough sets for classification in dynamic mixed-type decision system. IEEE Trans. Knowl. Data Eng. 2023, 35, 9319–9332. [Google Scholar] [CrossRef]
  25. Guo, H.; Wang, L.; Liu, X.; Pedrycz, W. Trend-based granular representation of time series and its application in clustering. IEEE Trans. Cybern. 2022, 52, 9101–9110. [Google Scholar] [CrossRef]
  26. Wang, W.; Liu, W.; Chen, H. Time-series forecasting via fuzzy-probabilistic approach with evolving clustering-based granulation. IEEE Trans. Fuzzy Syst. 2022, 30, 5324–5336. [Google Scholar] [CrossRef]
  27. Han, Z.; Pedrycz, W.; Zhao, J.; Wang, W. Hierarchical granular computing-based model and its reinforcement structural learning for construction of long-term prediction intervals. IEEE Trans. Cybern. 2022, 52, 666–676. [Google Scholar] [CrossRef] [PubMed]
  28. Aggarwal, L.; Sachdeva, S.; Goswami, P. Quantum healthcare computing using precision based granular approach. Appl. Soft Comput. 2023, 144, 110458. [Google Scholar] [CrossRef]
  29. Liang, D.; Liu, D.; Kobina, A. Three-way group decisions with decision-theoretic rough sets. Inf. Sci. 2016, 345, 46–64. [Google Scholar] [CrossRef]
  30. Rodríguez, R.; Labella, Á.; Tré, G.; Martínez, L. A large scale consensus reaching process managing group hesitation. Knowl.-Based Syst. 2018, 159, 86–97. [Google Scholar] [CrossRef]
  31. Labella, Á.; Liu, H.; Rodríguez, R.; Martínez, L. A cost consensus metric for consensus reaching processes based on a comprehensive minimum cost model. Eur. J. Oper. Res. 2020, 281, 316–331. [Google Scholar] [CrossRef]
  32. Zhang, S.; Liu, K.; Xu, T.; Yang, X.; Zhang, A. A meta-heuristic feature selection algorithm combining random sampling accelerator and ensemble using data perturbation. Appl. Intell. 2023, 53, 29781–29798. [Google Scholar] [CrossRef]
  33. Wang, B.; Liang, J.; Yao, Y. A trilevel analysis of uncertainty measures in partition-based granular computing. Artif. Intell. Rev. 2023, 56, 533–575. [Google Scholar] [CrossRef]
  34. Hua, M.; Xu, T.; Yang, X.; Chen, J.; Yang, J. A novel approach for calculating single-source shortest paths of weighted digraphs based on rough sets theory. Math. Biosci. Eng. MBE 2024, 21, 2626–2645. [Google Scholar] [CrossRef] [PubMed]
  35. Fu, S.; Wang, G.; Xia, S.; Liu, L. Deep multi-granularity graph embedding for user identity linkage across social networks. Knowl.-Based Syst. 2020, 193, 105301. [Google Scholar] [CrossRef]
  36. Yan, R.; Bao, P.; Shen, H.; Li, X. Learning node representation via motif coarsening. Knowl.-Based Syst. 2023, 278, 110821. [Google Scholar] [CrossRef]
  37. Du, X.; Yu, F. A fast algorithm for mining temporal association rules in a multi-attributed graph sequence. Expert Syst. Appl. 2022, 192, 116390. [Google Scholar] [CrossRef]
  38. Cheng, D.; Li, Y.; Xia, S.; Wang, G.; Huang, J.; Zhang, S. A fast granular-ball-based density peaks clustering algorithm for large-scale data. IEEE Trans. Neural Netw. Learn. Syst. 2023, 1–14. [Google Scholar] [CrossRef] [PubMed]
  39. Liu, S.; Liu, Y.; Yang, C.; Deng, L. Relative entropy of distance distribution based similarity measure of nodes in weighted graph data. Entropy 2022, 24, 1154. [Google Scholar] [CrossRef] [PubMed]
  40. Bang-Jensen, J.; Gutin, G. Digraphs: Theory, Algorithms and Applications; Springer Science & Business Media: Berlin/Heidelberg, Germany, 2008. [Google Scholar]
  41. Davis, T.; Hu, Y. The university of florida sparse matrix collection. ACM Trans. Math. Softw. (TOMS) 2011, 38, 1–25. [Google Scholar] [CrossRef]
Figure 1. A digraph D and its SCCs ((a) the digraph D without the SCCs marked in color; (b) the digraph D with the SCCs marked in color).
Figure 2. The changing curves of the speedup of four algorithms (FB, OBF, ISPAN, and GPSCC) compared to Tarjan on 12 datasets with different numbers of processors. (The x-axis represents the number of processors, while the y-axis represents the speedup against Tarjan).
Figure 3. The proportion of time consumed to compute each SCC (each color block represents the proportion of time consumed to compute an SCC).
Figure 4. The time consumed by each processor for computing SCCs (case with 12 processors).
Figure 5. The proportion of resource loading and computing SCC time (the blue block represents the time proportion of resource loading, and the yellow block represents the time to compute SCCs).
Table 1. Dataset information.

Number | Dataset              | n      | m       | c
1      | DK01R                | 903    | 10,863  | 3
2      | CollegeMsg           | 1899   | 20,296  | 6
3      | Kohonen              | 3772   | 12,729  | 11
4      | Poli                 | 3915   | 4180    | 38
5      | EPA                  | 4271   | 8695    | 26
6      | soc-sign-bitcoin-otc | 5851   | 35,413  | 24
7      | EVA                  | 7253   | 6724    | 10
8      | wiki-Vote            | 8297   | 103,689 | 1
9      | FA                   | 10,617 | 72,172  | 9
10     | foldoc               | 13,356 | 120,238 | 6
11     | poli-large           | 15,575 | 17,458  | 16
12     | rim                  | 22,560 | 101,495 | 1
Table 2. Comparison of the computational efficiency of the Tarjan, FB, OBF, ISPAN, and GPSCC algorithms with 2 processors.

Dataset              | Tarjan  | FB     | OBF    | ISPAN  | GPSCC  | S_t (FB) | S_t (OBF) | S_t (ISPAN) | S_t (GPSCC)
DK01R                | 2.8548  | 0.0515 | 0.0383 | 0.0352 | 0.0159 | 55.4330  | 74.3438   | 81.1023     | 179.5472
CollegeMsg           | 12.8487 | 0.1728 | 0.1371 | 0.1334 | 0.0603 | 74.3559  | 93.7177   | 96.3171     | 212.0796
Kohonen              | 0.9596  | 0.3674 | 0.2100 | 0.3328 | 0.1504 | 2.6112   | 4.5695    | 2.8834      | 6.3803
Poli                 | 0.2394  | 0.0661 | 0.0378 | 0.0323 | 0.0146 | 3.6190   | 6.3333    | 7.4118      | 16.3973
EPA                  | 0.2991  | 0.0975 | 0.0557 | 0.0425 | 0.0192 | 3.0685   | 5.3698    | 7.0376      | 15.5781
soc-sign-bitcoin-otc | 0.3351  | 0.1817 | 0.1038 | 0.1230 | 0.0556 | 1.8448   | 3.2283    | 2.7244      | 6.0270
EVA                  | 0.5727  | 0.1052 | 0.0601 | 0.0478 | 0.0216 | 5.4452   | 9.5291    | 11.9812     | 26.5139
wiki-Vote            | 0.8809  | 0.3126 | 0.1787 | 0.0834 | 0.0377 | 2.8169   | 4.9295    | 10.5624     | 23.3660
FA                   | 0.8442  | 0.2229 | 0.1274 | 0.1693 | 0.0765 | 3.7865   | 6.6264    | 4.9864      | 11.0353
foldoc               | 0.2518  | 0.3077 | 0.1759 | 0.0733 | 0.0331 | 0.8180   | 1.4315    | 3.4352      | 7.6073
poli-large           | 1.8287  | 1.3369 | 0.7640 | 0.1651 | 0.0746 | 1.3678   | 2.3936    | 11.0763     | 24.5134
rim                  | 1.8745  | 1.3390 | 0.7652 | 0.1615 | 0.0730 | 1.3998   | 2.4497    | 11.6068     | 25.6781

Columns 2–6: execution time (s); columns 7–10: speedup S_t over Tarjan. Bold font represents the optimal values.
Table 3. Comparison of the computational efficiency of the Tarjan, FB, OBF, ISPAN, and GPSCC algorithms with 4 processors.

Dataset              | Tarjan  | FB     | OBF    | ISPAN  | GPSCC  | S_t (FB) | S_t (OBF) | S_t (ISPAN) | S_t (GPSCC)
DK01R                | 2.8989  | 0.0552 | 0.0401 | 0.0353 | 0.0187 | 52.5163  | 72.2981   | 82.1218     | 155.0214
CollegeMsg           | 12.7302 | 0.1776 | 0.1579 | 0.1378 | 0.0730 | 71.6791  | 80.6219   | 92.3817     | 174.3863
Kohonen              | 0.9713  | 0.3126 | 0.1895 | 0.2766 | 0.1466 | 3.1064   | 5.1256    | 3.5116      | 6.6255
Poli                 | 0.2470  | 0.0490 | 0.0297 | 0.0230 | 0.0122 | 5.0403   | 8.3165    | 10.7391     | 20.2459
EPA                  | 0.2780  | 0.0759 | 0.0460 | 0.0311 | 0.0165 | 3.6627   | 6.0435    | 8.9389      | 16.8485
soc-sign-bitcoin-otc | 0.3606  | 0.1460 | 0.0885 | 0.1021 | 0.0541 | 2.4694   | 4.0746    | 3.5318      | 6.6654
EVA                  | 0.5729  | 0.0812 | 0.0492 | 0.0345 | 0.0183 | 7.0571   | 11.6443   | 16.6058     | 31.3036
wiki-Vote            | 0.9279  | 0.2186 | 0.1325 | 0.0662 | 0.0351 | 4.2442   | 7.0030    | 14.0166     | 26.4359
FA                   | 0.8244  | 0.1955 | 0.1185 | 0.1310 | 0.0694 | 4.2163   | 6.9570    | 6.2931      | 11.8790
foldoc               | 0.2535  | 0.1813 | 0.1099 | 0.0481 | 0.0255 | 1.3976   | 2.3066    | 5.2703      | 9.9412
poli-large           | 1.9502  | 0.8743 | 0.5299 | 0.1115 | 0.0591 | 2.2304   | 3.6803    | 17.4906     | 32.9983
rim                  | 1.9297  | 0.8601 | 0.5213 | 0.1162 | 0.0616 | 2.2435   | 3.7017    | 16.6067     | 31.3263

Columns 2–6: execution time (s); columns 7–10: speedup S_t over Tarjan. Bold font represents the optimal values.
Table 4. Comparison of the computational efficiency of the Tarjan, FB, OBF, ISPAN, and GPSCC algorithms with 6 processors.

Dataset              | Tarjan  | FB     | OBF    | ISPAN  | GPSCC  | S_t (FB) | S_t (OBF) | S_t (ISPAN) | S_t (GPSCC)
DK01R                | 2.9874  | 0.0551 | 0.0432 | 0.0333 | 0.0205 | 54.2178  | 69.1528   | 89.7117     | 145.7268
CollegeMsg           | 12.9595 | 0.1789 | 0.1561 | 0.1229 | 0.0757 | 72.4399  | 83.0205   | 105.4475    | 171.1955
Kohonen              | 1.0196  | 0.2944 | 0.1924 | 0.2496 | 0.1538 | 3.4636   | 5.2994    | 4.0849      | 6.6294
Poli                 | 0.2475  | 0.0433 | 0.0283 | 0.0185 | 0.0114 | 5.7160   | 8.7456    | 13.3784     | 21.7105
EPA                  | 0.2936  | 0.0608 | 0.0398 | 0.0239 | 0.0147 | 4.8214   | 7.3769    | 12.2845     | 19.9728
soc-sign-bitcoin-otc | 0.3567  | 0.1208 | 0.0790 | 0.0888 | 0.0547 | 2.9511   | 4.5152    | 4.0169      | 6.5210
EVA                  | 0.5822  | 0.0690 | 0.0451 | 0.0268 | 0.0165 | 8.4373   | 12.9091   | 21.7239     | 35.2848
wiki-Vote            | 0.9036  | 0.1899 | 0.1242 | 0.0573 | 0.0353 | 4.7551   | 7.2754    | 15.7696     | 25.5977
FA                   | 0.8261  | 0.1553 | 0.1016 | 0.1195 | 0.0736 | 5.3143   | 8.1309    | 6.9130      | 11.2242
foldoc               | 0.2454  | 0.1461 | 0.0955 | 0.0377 | 0.0232 | 1.6794   | 2.5696    | 6.5093      | 10.5776
poli-large           | 1.9326  | 0.6550 | 0.4281 | 0.0925 | 0.0570 | 2.9506   | 4.5144    | 20.8930     | 33.9053
rim                  | 1.9284  | 0.6520 | 0.4261 | 0.0886 | 0.0546 | 2.9580   | 4.5257    | 21.7652     | 35.3187

Columns 2–6: execution time (s); columns 7–10: speedup S_t over Tarjan. Bold font represents the optimal values.
Table 5. Comparison of the computational efficiency of the Tarjan, FB, OBF, ISPAN, and GPSCC algorithms with 8 processors.

Dataset              | Tarjan  | FB     | OBF    | ISPAN  | GPSCC  | S_t (FB) | S_t (OBF) | S_t (ISPAN) | S_t (GPSCC)
DK01R                | 2.9432  | 0.0580 | 0.0449 | 0.0283 | 0.0232 | 50.7448  | 65.5501   | 104.0000    | 126.8621
CollegeMsg           | 12.8707 | 0.1790 | 0.1652 | 0.1052 | 0.0864 | 71.9033  | 77.9098   | 122.3451    | 148.9664
Kohonen              | 1.1972  | 0.2142 | 0.1519 | 0.1742 | 0.1430 | 5.5897   | 7.8815    | 6.8726      | 8.3720
Poli                 | 0.2587  | 0.0340 | 0.0242 | 0.0129 | 0.0106 | 7.5816   | 10.6901   | 20.0543     | 24.4057
EPA                  | 0.2904  | 0.0487 | 0.0346 | 0.0157 | 0.0129 | 5.9525   | 8.3931    | 18.4968     | 22.5116
soc-sign-bitcoin-otc | 0.3456  | 0.1062 | 0.0753 | 0.0636 | 0.0522 | 3.2534   | 4.5874    | 5.4340      | 6.6251
EVA                  | 0.5779  | 0.0595 | 0.0423 | 0.0189 | 0.0155 | 9.6893   | 13.6619   | 30.5767     | 37.2839
wiki-Vote            | 0.9322  | 0.1672 | 0.1186 | 0.0395 | 0.0324 | 5.5744   | 7.8600    | 23.6000     | 28.7716
FA                   | 0.8454  | 0.1726 | 0.1224 | 0.0892 | 0.0732 | 4.8984   | 6.9069    | 9.4776      | 11.5492
foldoc               | 0.2586  | 0.1174 | 0.0833 | 0.0256 | 0.0210 | 2.2017   | 3.1044    | 10.1016     | 12.3143
poli-large           | 1.9093  | 0.5197 | 0.3686 | 0.0564 | 0.0463 | 3.6736   | 5.1799    | 33.8528     | 41.2376
rim                  | 1.9442  | 0.5150 | 0.3653 | 0.0571 | 0.0469 | 3.7746   | 5.3222    | 34.0490     | 41.4542

Columns 2–6: execution time (s); columns 7–10: speedup S_t over Tarjan. Bold font represents the optimal values.
Table 6. Comparison of the computational efficiency of the Tarjan, FB, OBF, ISPAN, and GPSCC algorithms with 10 processors.

Dataset              | Tarjan  | FB     | OBF    | ISPAN  | GPSCC  | S_t (FB) | S_t (OBF) | S_t (ISPAN) | S_t (GPSCC)
DK01R                | 2.9256  | 0.0547 | 0.0423 | 0.0281 | 0.0213 | 53.4844  | 69.1631   | 104.1139    | 137.3521
CollegeMsg           | 12.8766 | 0.1760 | 0.1544 | 0.1087 | 0.0823 | 73.1625  | 83.3977   | 118.4600    | 156.4593
Kohonen              | 1.2556  | 0.2087 | 0.1523 | 0.1915 | 0.1450 | 6.0177   | 8.2443    | 6.5567      | 8.6593
Poli                 | 0.2651  | 0.0312 | 0.0228 | 0.0124 | 0.0094 | 8.4870   | 11.6272   | 21.3790     | 28.2021
EPA                  | 0.2791  | 0.0425 | 0.0311 | 0.0152 | 0.0115 | 6.5506   | 8.9743    | 18.3618     | 24.2696
soc-sign-bitcoin-otc | 0.3584  | 0.0947 | 0.0691 | 0.0690 | 0.0522 | 3.7859   | 5.1867    | 5.1924      | 6.8659
EVA                  | 0.5638  | 0.0536 | 0.0392 | 0.0180 | 0.0136 | 10.4982  | 14.3827   | 31.3222     | 41.4559
wiki-Vote            | 0.9163  | 0.1282 | 0.0936 | 0.0428 | 0.0324 | 7.1456   | 9.7895    | 21.4089     | 28.2809
FA                   | 0.8261  | 0.1267 | 0.0925 | 0.0951 | 0.0720 | 6.5188   | 8.9308    | 8.6866      | 11.4736
foldoc               | 0.2447  | 0.0943 | 0.0688 | 0.0239 | 0.0181 | 2.5961   | 3.5567    | 10.2385     | 13.5193
poli-large           | 1.9951  | 0.4581 | 0.3344 | 0.0587 | 0.0444 | 4.3547   | 5.9660    | 33.9881     | 44.9165
rim                  | 2.0304  | 0.4524 | 0.3303 | 0.0593 | 0.0449 | 4.3549   | 5.9662    | 34.2395     | 44.9347

Columns 2–6: execution time (s); columns 7–10: speedup S_t over Tarjan. Bold font represents the optimal values.
Table 7. Comparison of the computational efficiency of the Tarjan, FB, OBF, ISPAN, and GPSCC algorithms with 12 processors.

Dataset              | Tarjan  | FB     | OBF    | ISPAN  | GPSCC  | S_t (FB) | S_t (OBF) | S_t (ISPAN) | S_t (GPSCC)
DK01R                | 2.9388  | 0.0526 | 0.0429 | 0.0242 | 0.0218 | 55.8707  | 68.5035   | 121.2296    | 134.8073
CollegeMsg           | 13.0962 | 0.1814 | 0.1636 | 0.0951 | 0.0855 | 72.1951  | 80.0501   | 137.7445    | 153.1719
Kohonen              | 1.3140  | 0.2140 | 0.1518 | 0.1634 | 0.1469 | 6.1391   | 8.6561    | 8.0439      | 8.9449
Poli                 | 0.2489  | 0.0305 | 0.0216 | 0.0101 | 0.0091 | 8.1724   | 11.5231   | 24.5968     | 27.3516
EPA                  | 0.2896  | 0.0407 | 0.0289 | 0.0129 | 0.0116 | 7.1069   | 10.0208   | 22.4510     | 24.9655
soc-sign-bitcoin-otc | 0.3870  | 0.0840 | 0.0596 | 0.0554 | 0.0498 | 4.6052   | 6.4933    | 6.9884      | 7.7711
EVA                  | 0.6099  | 0.0505 | 0.0358 | 0.0148 | 0.0133 | 12.0825  | 17.0363   | 41.2384     | 45.8571
wiki-Vote            | 0.9218  | 0.1319 | 0.0936 | 0.0341 | 0.0307 | 6.9846   | 9.8483    | 27.0019     | 30.0261
FA                   | 0.8506  | 0.1443 | 0.1023 | 0.0747 | 0.0672 | 5.8970   | 8.3148    | 11.3829     | 12.6577
foldoc               | 0.2629  | 0.0878 | 0.0622 | 0.0192 | 0.0173 | 2.9977   | 4.2267    | 13.6659     | 15.1965
poli-large           | 1.9252  | 0.4372 | 0.3101 | 0.0477 | 0.0429 | 4.4031   | 6.2083    | 40.3565     | 44.8765
rim                  | 1.9901  | 0.4266 | 0.3025 | 0.0490 | 0.0441 | 4.6658   | 6.5788    | 40.5818     | 45.1270

Columns 2–6: execution time (s); columns 7–10: speedup S_t over Tarjan. Bold font represents the optimal values.

He, H.; Xu, T.; Chen, J.; Cui, Y.; Song, J. A Granulation Strategy-Based Algorithm for Computing Strongly Connected Components in Parallel. Mathematics 2024, 12, 1723. https://doi.org/10.3390/math12111723

