5.1. Communication Errors
In this subsection, we assume that inter-node communication errors manifest themselves in two ways: (1) communication dropouts (outages) and (2) additive communication noise. Communication dropouts typically occur in SNs using digital communication; additive noise can, in this case, model quantization effects. For example, in the smart city sensor networks depicted in Figure 1, dropouts will happen relatively often because of the dynamic environment, where both physical obstacles and electronic interference can be persistent. In certain, less frequent practical situations, SNs use analog communication (e.g., when certain types of energy harvesting are employed [56]), in which case additive communication noise is dominant and dropouts appear less frequently.
The communication errors are formally introduced using the following assumptions:
(A5) The weights
in the algorithm (
5) are now randomly time-varying, according to stochastic processes given by
, where
are i.i.d. binary random sequences, such that
with probability
(
when
), and
with probability
.
(A6) Instead of receiving the corrected output of the j-th node, the i-th node receives that output corrupted by additive communication noise, modeled as a zero-mean i.i.d. random sequence with finite variance.
(A7) The dropout processes, the communication noise processes, and the measured signal are mutually independent.
Based on the above assumptions, a communication dropout at any iteration t, when node j is sending to node i, occurs with a fixed link-dependent probability, independently of the additive communication noise process and of the measured signal.
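As a concrete illustration of this link model, the following minimal sketch simulates a single link under (A5)–(A7); the success probability, noise level and nominal weight used here are illustrative placeholders, not values from the paper.

```python
# Minimal sketch of one communication link j -> i under (A5)-(A7):
# i.i.d. binary dropouts and additive zero-mean i.i.d. communication noise,
# both independent of the transmitted (corrected) output.
import numpy as np

rng = np.random.default_rng(42)

T = 10_000            # number of iterations
p_link = 0.7          # probability that a transmission from j to i succeeds
sigma_xi = 0.05       # std. dev. of the additive communication noise
gamma_nominal = 0.5   # nominal consensus weight on the link j -> i

z_j = np.sin(0.01 * np.arange(T))                  # corrected output sent by node j
link_up = rng.random(T) < p_link                   # i.i.d. binary dropout process
noise = rng.normal(0.0, sigma_xi, size=T)          # additive communication noise

received = np.where(link_up, z_j + noise, np.nan)  # what node i actually receives
gamma_t = gamma_nominal * link_up                  # randomly time-varying weight (A5)

print("empirical link availability:", link_up.mean())              # ~ p_link
print("mean error of received data:", np.nanmean(received - z_j))  # ~ 0 (zero-mean noise)
print("mean weight E[gamma(t)]    :", gamma_t.mean())              # ~ p_link * gamma_nominal
```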
Introducing compact vector notation for the parameter estimates and the aggregate noise terms, one obtains from (10) the recursion (20), in which the (now random) system matrix is obtained from the corresponding matrix in (10) by applying (A5).
Convergence properties of the recursion (20), under the additional assumptions (A5)–(A7), can be derived starting from the results of the previous subsection. Due to the mutual independence of the random variables entering (20), the expected value of the random system matrix has the same form as the original matrix, but with the weights replaced by their expected values. Also, the aggregate noise term in (20) is a martingale difference sequence (its conditional expectation, given the past, is zero). Furthermore, it can be concluded that the expected system matrix has the same spectrum as the matrix in (11): it has two zero eigenvalues, while the remaining eigenvalues have negative real parts.
Since additive noise is now present in the recursions (20), (A1) needs to be replaced with the following assumption, typical in the stochastic approximation literature (e.g., [57]):
(A1’) The step sizes are positive, tend to zero, their sum diverges, and the sum of their squares is finite.
Intuitively, (A1’) introduces diminishing gains which converge to zero slowly enough, so that the additive noise can be averaged out while asymptotic convergence to a consensus point is achieved (despite the presence of noise).
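For illustration (the exact formulation of (A1’) can be found in the cited references), a standard family of step sizes satisfying such stochastic approximation conditions is
\[
\delta(t) = \frac{c}{t^{a}}, \qquad c > 0, \quad \tfrac{1}{2} < a \le 1,
\]
since then \(\delta(t) \to 0\), \(\sum_{t \ge 1} \delta(t) = \infty\), and \(\sum_{t \ge 1} \delta(t)^{2} < \infty\).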
Similarly as in the noiseless case, let us introduce a similarity transformation defined by a matrix which includes the left eigenvectors of the system matrix corresponding to the eigenvalue at the origin. By applying this transformation to (21), and using stochastic Lyapunov stability arguments, along with the arguments typically used in analyzing stochastic approximation algorithms [13,17,58,59], the following theorem can be proved:
Theorem 3 ([13,17]). Let Assumptions (A1’), (A2)–(A7) be satisfied. Then, the vector of corrected calibration parameters generated by (21) converges, in the mean square sense and w.p.1, to a consensus point defined by two scalar random variables.

The theorem essentially states that, again, all the corrected drifts converge to the same point and all the corrected offsets converge to the same point; however, because of the additive communication noise, these points are random and depend on the noise realization. The mean values of these possible convergence points depend on the sensor parameters, on the design parameters of the algorithm, as well as on the dropout probabilities.
5.2. Measurement Noise
In this subsection, in addition to communication errors, we assume that the signal is measured with additive measurement noise. This situation is of essential importance for practical applications, since practically all existing sensors exhibit measurement errors which are typically modeled using stochastic processes [3].
Formally, we model the additive noise stochastic process using the following assumption:
(A8) Instead of the ideal sensor outputs given by (1), the sensor measurements are now contaminated by noise: each sensor output is corrupted by an additive zero-mean i.i.d. random sequence with finite variance, independent of the measured signal.
By using the noisy measurements instead of the ideal sensor outputs in the base algorithm (5), one obtains the following “noisy” version of (8), in which the measurement noise enters both the regressors and the error terms. It is important to observe here that some of the resulting noise terms are zero mean; however, one of them is not, since the same measurement noise appears both in the regressor and in the error.
Assuming again that the step sizes satisfy (A1’), one can obtain an equation analogous to (10), in which both the system matrix and the aggregate noise term contain additional contributions coming from the measurement noise.
In an analogous way as in the previous section, instead of (11), an equation is obtained for the mean of the corrected calibration parameters in which the matrix from (11) is perturbed by an additional term depending on the measurement noise variances. Because of this additional term, the sums of the rows of the resulting matrix are no longer equal to zero, so that convergence to consensus (as in Theorem 1) cannot be achieved in this case.
However, it can be seen from the structure of the recursion (24) that, if the measurement noise variances are a priori known, they can be used to modify the basic algorithm (5) so as to compensate the noise-induced bias, ensuring again asymptotic convergence to consensus; the resulting recursion is given in (25).
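The compensation principle can be illustrated on a simple two-sensor example (a generic sketch of bias removal with a known noise variance, not the paper’s recursion (25); all names and numerical values are hypothetical): the empirical second moment of a noisy regressor overestimates the true one by the noise variance, and subtracting this known variance restores a consistent estimate.

```python
# Generic illustration of bias compensation with a known measurement-noise
# variance (NOT the exact modification (25); names/values are hypothetical).
# Goal: estimate the gain ratio g relating two sensors, y2 ~ g * y1.
import numpy as np

rng = np.random.default_rng(0)
T = 200_000

x = np.zeros(T)                       # slowly varying measured signal (AR(1))
for t in range(1, T):
    x[t] = 0.95 * x[t - 1] + rng.normal()

a1, a2 = 2.0, 3.0                     # true sensor gains
sigma_n = 3.0                         # known measurement-noise std. dev. of sensor 1
y1 = a1 * x + rng.normal(0.0, sigma_n, T)   # noisy regressor
y2 = a2 * x                                 # reference output (noise-free, for clarity)

g_true = a2 / a1
g_ls = np.dot(y1, y2) / np.dot(y1, y1)                     # biased (attenuated towards zero)
g_bc = np.dot(y1, y2) / (np.dot(y1, y1) - T * sigma_n**2)  # bias-compensated

print(f"true gain ratio      : {g_true:.3f}")
print(f"naive least squares  : {g_ls:.3f}")
print(f"variance-compensated : {g_bc:.3f}")
```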
The following theorem deals with the convergence of the above modification of the basic algorithm, when the measurement noise is present together with the communication errors. The convergence points will again depend on the measurement and communication noise realizations, in a similar way as in Theorem 3.
Theorem 4 ([17]). Assume that the assumptions (A1’), (A2)–(A8) hold. Then, the estimate vector given by (25) converges to a consensus point in the mean square sense and w.p.1, where the limiting corrected gain and offset are scalar random variables.

Notice that the above theorem is based on assumption (A2): indeed, when both the measured signal and the measurement noise are i.i.d. sequences, it is not surprising that asymptotic consensus is achievable only provided the measurement noise variances are known. However, we can replace the unrealistic assumption (A2) with (A2’) (introduced in Section 4 in the noiseless case), which allows correlated signal sequences, as is almost always the case in practice. In this way, the correlation problem present in the algorithm (24) can be overcome without requiring any a priori information about the measurement noise process. The idea is to introduce instrumental variables into the basic algorithm, analogously to the way they are often used in the field of system identification, e.g., [60,61]. Instrumental variables have the basic property of being correlated with the measured signal and uncorrelated with the noise. Therefore, the instrumental variable sequence of the i-th agent has to be correlated with the local measured signal and uncorrelated with the local measurement noise. Under (A2’), a logical choice is to take a delayed sample of the measured signal as the instrumental variable. Consequently, we present in (26) a general calibration algorithm based on instrumental variables, able to cope with measurement noise.
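The instrumental-variable mechanism can be checked on a simple two-sensor example (a generic sketch under an assumed AR(1) signal model, not the algorithm (26) itself; names and values are hypothetical): a delayed noisy sample is correlated with the current signal but uncorrelated with the current i.i.d. measurement noise, so the resulting estimate is consistent without any knowledge of the noise variances.

```python
# Generic illustration of the instrumental-variable idea (NOT the exact
# algorithm (26); names/values are hypothetical). The instrument is a delayed
# noisy measurement: correlated with the signal (A2'), uncorrelated with the
# current white measurement noise.
import numpy as np

rng = np.random.default_rng(1)
T = 200_000

x = np.zeros(T)                       # correlated (AR(1)) measured signal
for t in range(1, T):
    x[t] = 0.95 * x[t - 1] + rng.normal()

a1, a2 = 2.0, 3.0
y1 = a1 * x + rng.normal(0.0, 3.0, T)     # noisy output of sensor 1
y2 = a2 * x + rng.normal(0.0, 1.0, T)     # noisy output of sensor 2

d = 1                                     # delay defining the instrument
zeta = y1[:-d]                            # instrument: delayed measurement of sensor 1
cur1, cur2 = y1[d:], y2[d:]

g_true = a2 / a1
g_ls = np.dot(cur1, cur2) / np.dot(cur1, cur1)   # biased: noise enters the regressor
g_iv = np.dot(zeta, cur2) / np.dot(zeta, cur1)   # consistent: no noise variance needed

print(f"true gain ratio      : {g_true:.3f}")
print(f"least squares        : {g_ls:.3f}")
print(f"instrumental variable: {g_iv:.3f}")
```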
Following the derivations from Section 3, one obtains from (26) the relations that explicitly involve the corrected calibration parameters and the noise terms; in the same way as in (23), these relations can be written compactly in the form (28), with the corresponding system matrix and aggregate noise term defined analogously to the previous cases.
To formulate a convergence theorem for (28), the following modification of (A4) is needed:
(A4’) The correlation between the instrumental variable and the current measurement is bounded from below by a positive constant.
This assumption implies that the mentioned correlation should be large enough. Similarly as in the case of (A4), it can be concluded that (A4’) ensures that the corresponding system matrix is Hurwitz. Similarly as in the above cases, let us introduce a similarity transformation defined by a matrix which includes the left eigenvectors of the system matrix corresponding to the zero eigenvalue. The following theorem deals with the convergence of the instrumental variable algorithm (26). The convergence point, again, depends on the noise realization.
Theorem 5 ([16]). Assume that the assumptions (A1’), (A2’), (A3), (A4’), (A5)–(A8) hold. Then, the estimate vector given by (28) converges to a consensus point in the mean square sense and w.p.1, where the limiting corrected gain and offset are scalar random variables.

5.3. Asynchronous Broadcast Gossip Communication
So far, we have shown how to deal with most of the practical challenges which emerge in real-life SNs, such as communication dropouts, additive communication noise and measurement noise. However, in all of the above-discussed algorithms we have implicitly assumed that all the nodes in the network share a common clock, based on which the recursions in (5), (25) or (26) can be implemented synchronously. Indeed, when introducing the basic algorithm, we assumed that the signal is measured at discrete-time instances t by all the nodes; these instances are also used as the time indexes of the synchronous recursions of the above algorithms. Yet, there are many practical cases of SNs for which it is impossible or impractical to function synchronously. A typical example is the case when the nodes follow certain sleeping policies in order to minimize power consumption (e.g., [3]). For example, the nodes in the SN shown in Figure 1, measuring air pollution or atmospheric conditions, may be programmed to make measurements less often during periods in which there is less traffic in the city. These types of situations are rigorously treated in the rest of this subsection.
Instead of the problem setup introduced in Section 3, assume now that the sensors measure a continuous-time signal at discrete time points, producing the sensor outputs given by (29), where the gains and offsets are the same unknown parameters as in the previous subsections; we also assume that measurement noise is present in the sensor readings.
Furthermore, since the goal is to remove the dependence on a common global clock, it is now assumed that every node has its own local clock. For the sake of compact notation and simpler derivations, a single clock, called the global virtual clock, is introduced, which ticks whenever any of the local clocks ticks. Hence, the measurement instants in (29) can be considered as the times at which the ticks of the virtual clock happen. To have a well-defined situation, it is formally assumed that the ticks of the local clocks are independent, and that the intervals between any two consecutive ticks are finite w.p.1. It is also assumed, for the sake of simpler derivations, that the unconditional probability that the k-th tick of the virtual clock comes from the j-th local clock is the same for all k. It is easy to verify that these conditions are satisfied for a typical model used in SNs, where it is assumed that the local clocks tick according to independent Poisson processes with given rates (as in, e.g., [62,63]). This case will be adopted throughout this subsection. It directly follows that, in this case, the virtual global clock ticks according to a Poisson process whose rate is the sum of the local rates.
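This timing model can be verified numerically; the following sketch (with illustrative local rates) merges independent local Poisson clocks into the virtual global clock and checks that the global tick rate equals the sum of the local rates and that each tick belongs to node j with probability proportional to its rate.

```python
# Sketch of the asynchronous timing model: independent local Poisson clocks
# with rates mu[j]; the virtual global clock ticks whenever any local clock
# ticks, i.e., it is a Poisson process with rate sum(mu). (Rates are illustrative.)
import numpy as np

rng = np.random.default_rng(7)
mu = np.array([1.0, 2.0, 0.5])      # local clock rates (illustrative)
T_sim = 10_000.0                    # simulated time horizon

ticks = []                          # (time, node index) of every local tick
for j, rate in enumerate(mu):
    t = 0.0
    while True:
        t += rng.exponential(1.0 / rate)     # exponential inter-tick times
        if t > T_sim:
            break
        ticks.append((t, j))
ticks.sort()                        # merged tick sequence = virtual global clock

times = np.array([t for t, _ in ticks])
nodes = np.array([j for _, j in ticks])

print("global tick rate     :", len(ticks) / T_sim, " (expected:", mu.sum(), ")")
print("mean inter-tick time :", np.diff(times).mean(), " (expected:", 1.0 / mu.sum(), ")")
for j, rate in enumerate(mu):
    print(f"P(tick is from node {j}):", np.mean(nodes == j),
          " (expected:", rate / mu.sum(), ")")
```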
According to the above assumptions, let us introduce notation for the ticks of the local clock of node j. The communication protocol can then be defined in the following way. At each local clock tick, a node j makes a local sensor measurement, calculates the corrected sensor output (based on the current estimates of its calibration parameters), and broadcasts it to its out-neighbors. We also assume that communication dropouts can happen, i.e., each out-neighbor receives the transmitted message with a given probability. For the sake of clarity of presentation, we do not treat additive communication noise in this subsection. It is also assumed that the communication delay is negligible, so that, practically at the same time instant, all the nodes which have received the broadcast perform a local sensor reading, calculate their corrected outputs, and update the local estimates of their calibration parameters. This procedure is repeated for every local clock tick. The index of the node whose clock has ticked at a given instant, and the subset of its out-neighbors which have received the broadcast message, are used below, together with additional compact notation for the quantities entering the recursions.
The measurement noise is treated as in the previous subsection, by using a delayed local measurement as the instrumental variable, as specified in (30), where the delay corresponds to the closest past measurement of node i. By using the same local criteria as in (3) and gradients as in (4), the following new recursion (31) for updating the calibration parameters at node i is formulated, in which: the local step size is determined by the number of parameter updates performed by node i up to iteration k (counted with the help of an indicator function), and the update is driven by the difference between the corrected output received from the broadcasting node and the corrected output of node i.
The initial conditions are adopted locally at each node. Note that, according to the problem setup, at a given iteration k only the nodes that have received the broadcast message perform the above parameter update; the estimates of the remaining nodes stay unchanged.
Computationally, the algorithm is as simple as the basic one, requiring only a few additions and multiplications per iteration. The information needed at node i consists of the local sensor measurement, the local instrumental variable, and the current output sent by an in-neighbor j. Knowledge of the global iteration index k (or of the global virtual clock) is not needed.
From the above definition of the step size it can be concluded that it depends only on the number of local clock ticks, which makes the algorithm completely decentralized.
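To summarize the protocol, the following event-driven sketch simulates the asynchronous broadcast gossip calibration at a structural level; the signal model, the step-size constants and the gradient-type update form are illustrative assumptions and do not reproduce the exact recursion (31). Each node keeps only its calibration estimates, its last local measurement (used as the instrumental variable) and its local update counter.

```python
# Structural sketch of asynchronous broadcast gossip calibration (illustrative
# assumptions; NOT the exact recursion (31)). Nodes update only when they
# receive a broadcast; step sizes depend only on the LOCAL update counter.
import numpy as np

rng = np.random.default_rng(3)
n = 5                                    # number of nodes (complete graph assumed)
alpha = rng.uniform(0.8, 1.5, n)         # true (unknown) sensor gains
beta = rng.uniform(-0.5, 0.5, n)         # true (unknown) sensor offsets
a, b = np.ones(n), np.zeros(n)           # calibration gain/offset estimates
updates = np.zeros(n)                    # local update counters
last_meas = np.zeros(n)                  # last local measurement (instrumental variable)

p_recv = 0.8                             # probability a broadcast is received
mu = np.ones(n)                          # local Poisson clock rates

def signal(t):                           # slowly varying measured signal (illustrative)
    return np.sin(0.05 * t) + 0.5 * np.sin(0.013 * t)

t = 0.0
for _ in range(50_000):
    t += rng.exponential(1.0 / mu.sum())          # virtual global clock tick
    j = rng.choice(n, p=mu / mu.sum())            # node whose local clock ticked
    yj = alpha[j] * signal(t) + beta[j] + 0.05 * rng.normal()
    zj = a[j] * yj + b[j]                         # corrected output broadcast by node j
    last_meas[j] = yj
    for i in range(n):
        if i != j and rng.random() < p_recv:      # out-neighbors receiving the broadcast
            yi = alpha[i] * signal(t) + beta[i] + 0.05 * rng.normal()
            zi = a[i] * yi + b[i]
            updates[i] += 1
            delta = 0.5 / (20.0 + updates[i])     # step size from the local counter only
            err = zj - zi
            a[i] += delta * err * last_meas[i]    # instrument: delayed local measurement
            b[i] += delta * err
            last_meas[i] = yi

# After enough ticks, the corrected gains and offsets should be approximately
# equal across nodes (the common values depend on the random realization).
print("corrected gains   a_i*alpha_i    :", np.round(a * alpha, 3))
print("corrected offsets a_i*beta_i+b_i :", np.round(a * beta + b, 3))
```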
It should also be noticed that the instrumental variables in (31) can be selected in several ways. For example, instead of the choice in (30), it can be practical to use a supplementary measurement taken by node i just after the last step of the recursion (31) has been locally performed. This scheme is not assumed in the sequel because of its much more complicated notation; all the results can be easily transferred to this case.
Similarly as in the synchronous case, we introduce compact notation for the corrected gains, the corrected offsets and the associated noise terms, which leads to the recursions (36), with the corresponding initial conditions. The recursions (36) can then be written compactly as (37), where the time-varying system matrices and noise terms reflect which nodes are active at a given iteration: the entries corresponding to the nodes that have received the broadcast message are updated, while the remaining entries are left unchanged.
Since we have formulated a slightly different problem setup than in Section 3, we introduce a new set of assumptions and denote them using the letter B:
(B1) The measured signal is a stationary random sequence, bounded w.p.1 and satisfying the ϕ-mixing condition.
(B2) Considering the time instants at which node i performs its measurements, the correlation between the current and the previous local measurement of the signal is bounded from below by a positive constant.
(B3) The communication graph has a spanning tree.
(B4) The measurement noises are zero-mean sequences of independent and bounded w.p.1 random variables, independent of the measured signal, with uniformly bounded variances.
Assumptions (B3) and (B4) are essentially the same as (A3) and (A8).
The ϕ-mixing condition (B1) represents one of the strong mixing conditions, usually satisfied for sensory signals [64,65,66].
Assumption (B2) represents an extension of the assumption (A4), adapted to the presence of the instrumental variable in (31). It guarantees persistence of excitation in the sense that the variance of the measured signal must be greater than zero (for all k, because of stationarity), so that constant signals are not allowed [13,55]. However, it also ensures sufficient correlation between the instrumental variable and the current measurement, so that, e.g., white-noise signals are also not allowed. It can be easily derived [14] that (B2) is satisfied if the autocovariance function of the measured signal is positive in a sufficiently large interval around zero. Also, if the local clock rates are adjustable, we can choose them large enough such that (B2) is always satisfied. Therefore, (B2) is, in general, not restrictive for processes having a dominant low-frequency spectrum, which is typical in practical applications of SNs.
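This can be checked numerically; the sketch below (assuming a first-order low-pass signal model and illustrative measurement rates) shows that the correlation between consecutive local samples grows with the measurement rate and remains clearly positive, while it vanishes for a white-noise signal.

```python
# Numerical check of the idea behind (B2): consecutive samples of a low-pass
# signal stay strongly correlated (the instrument is informative), while for
# white noise the correlation vanishes. Parameters are illustrative.
import numpy as np

rng = np.random.default_rng(5)

def lowpass_samples(rate, rho=0.9, T=100_000):
    """Sample x(t+dt) = rho**dt * x(t) + noise at Poisson measurement times."""
    dt = rng.exponential(1.0 / rate, T)            # random inter-measurement intervals
    x = np.zeros(T)
    for k in range(1, T):
        a = rho ** dt[k]
        x[k] = a * x[k - 1] + np.sqrt(1.0 - a**2) * rng.normal()
    return x

for rate in (0.1, 1.0, 10.0):
    x = lowpass_samples(rate)
    corr = np.corrcoef(x[:-1], x[1:])[0, 1]
    print(f"measurement rate {rate:5.1f}: corr(x_k, x_(k-1)) = {corr:.3f}")

w = rng.normal(size=100_000)                       # white-noise 'signal' for contrast
print("white noise          : corr =", round(float(np.corrcoef(w[:-1], w[1:])[0, 1]), 3))
```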
Based on the above modified problem definition, the following result was proved in ref. [14], stating that both the corrected gains and the corrected offsets converge, for all the nodes, to consensus points which depend on the realizations of the underlying stochastic processes.

Theorem 6 ([14]). Let Assumptions (B1)–(B4) be satisfied. Then, the estimate vector given by (37) converges to a consensus point in the mean square sense and w.p.1, where the limiting corrected gain and offset are random variables with bounded second moments.