Sensor Fusion
The self-validating (SEVA) sensor carries out an internal quality assessment and generates, for each measurement, standard metrics for its quality, including online uncertainty. In the broadest sense, the process of sensor fusion is the synergistic use of a set of potentially inconsistent measurements from different sources to achieve a specific task.
The basic questions posed by sensor fusion are:
 What is the model of the measurand to be tracked?
 What is the model of measurement uncertainty (here used in the broadest sense)?
 How are measurements judged to be consistent?
 What happens to measurements inconsistent with the rest (typically they are discarded)?
 How are consistent measurements fused to provide a combined best estimate of the measurand?
The research in the UTC focuses on a high-level consistency checking scheme, suitable for detecting gross failures of the individual sensor validation algorithms, and for generating a best estimate of the measurand and its uncertainty in all circumstances. The value of this approach thus lies less in its diagnostic precision in particular cases than in its suitability for routine implementation in control systems, working with any instrument complying with the SEVA standard.
Consider now the case of n SEVA measurements x_{i}, i = 1, ..., n, with their associated uncertainties u_{i}, all estimating the same single-valued measurand. This is a relatively simple problem compared with, e.g., target tracking. What is to be determined is whether any of the measurements is inconsistent with the rest and, having dealt with any such outliers, what is the combined best estimate of the measurand and its uncertainty. The MV status for the combined measurement is also to be determined, based upon the consistency of the input measurements as well as their individual MV status values. These calculations can take place within a generic Combination Block (CB).
Each measurement may be the product of a different sensor, as shown. Alternatively, all calculations can take place within a single transmitter handling multiple transducers. In either case the calculation of the combined estimate will be largely identical.
The core functions of the CB are:
 checking for consistency for a set of measurements using their uncertainties
 combining the consistent measurements and uncertainties to obtain a best estimate
Combining a set of measurements with their uncertainties
Given n measurements x_{i} and their associated uncertainties u_{i}, and assuming that they are all consistent, a combined best estimate (CBE) x^{*} of the measurand is given by

x^{*} = (∑_{i=1}^{n} x_{i}/u_{i}^{2}) / (∑_{i=1}^{n} 1/u_{i}^{2})    (1)

It follows that the uncertainty u^{*} of the CBE is:

u^{*} = (∑_{i=1}^{n} 1/u_{i}^{2})^{-1/2}    (2)
Let the combination operation be denoted by:

x^{*} = L(x_{1}, x_{2}, ..., x_{n})

An important property of L is that it is commutative and associative. For example, given variables x_{1}, x_{2} and x_{3} and defining

x_{12} = L(x_{1}, x_{2})

it is trivial to show that

L(x_{1}, x_{2}) = L(x_{2}, x_{1})

and

L(x_{12}, x_{3}) = L(x_{1}, L(x_{2}, x_{3})) = L(x_{1}, x_{2}, x_{3})

This implies that the order in which measurements are combined is not important.
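As a concrete sketch (in Python, with hypothetical names), the inverse-variance weighting of Equations 1 and 2 and the associativity of the combination operation can be checked numerically:

```python
import math

def combine(measurements):
    """Combined best estimate (CBE) and its uncertainty from a list of
    (value, uncertainty) pairs, using inverse-variance weights as in
    Equations 1 and 2."""
    weights = [1.0 / (u * u) for _, u in measurements]
    x_star = sum(w * x for w, (x, _) in zip(weights, measurements)) / sum(weights)
    u_star = 1.0 / math.sqrt(sum(weights))
    return x_star, u_star

a, b, c = (1.0, 0.5), (1.2, 0.5), (0.8, 1.0)

# Associativity: combining (a, b) first and then c gives the same result
# as combining all three at once.
direct = combine([a, b, c])
nested = combine([combine([a, b]), c])
```

Associativity holds because both the weighted sum and the total weight of an intermediate CBE are simply the sums of the corresponding terms of its inputs.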
Consistency checking: consistency vs. inconsistency
Consistent measurements agree with each other, according to a given criterion. Inconsistencies can arise for any of the following three reasons.
 Even when all of the redundant measurements are individually representative of the measurand, random fluctuations may result in mutual inconsistencies occurring from sample to sample.
 Each measurement is generated by a SEVA sensor, which should provide detailed and device-specific fault detection. It might reasonably be assumed that a commercial SEVA sensor should be able to detect between 90% and 99.9% of all occurrences of faults within itself, allowing for faults which are inherently difficult to detect and commercial design limitations due to cost/benefit tradeoffs. Thus there exists the possibility that a sensor may fail to detect a fault within itself, and accordingly generate an unrepresentative or incorrect measurement. Within a single transmitter handling multiple transducers, consistency checking may be the primary form of validating the raw data, and so inconsistencies may arise more frequently.
 A sensor is only able to measure the available value of a process parameter, rather than its ideal or true value.
Of course consistency does not necessarily imply that the measurements are correct, only that they agree with each other. However, in practice, just as the true value of the measurand is never known, the correctness of a measurement is never known.
The purpose of consistency checking is to identify incorrect measurements. Its fundamental principle rests on two assumptions:
 Incorrect measurements are relatively rare.
 Correct measurements are likely to be consistent with one another.
It follows that if one measurement is inconsistent with the rest, it is likely that it is incorrect, or, as the alternative, that it is correct and all the other measurements are incorrect. The latter is much less probable. Generalizing, the principle of majority voting is derived: if a majority of measurements are consistent they are assumed to be correct; any minority of measurements inconsistent with the majority are judged to be incorrect. If there is no majority consensus, then special action will have to be taken. To summarize, the guiding principle is that inconsistency implies incorrectness. As always, the validity of conclusions based on this principle will of course be dependent upon, amongst other factors, the number of sensors involved.
Consistency checking: the case of two measurements
The detection of inconsistencies between redundant measurements is a well-established academic field of study. Typically, however, the measurements are treated as a time series. It is less common for consideration to be given to the uncertainty interval surrounding each measurement, as its magnitude is not usually available.
According to the classical techniques of analytical redundancy, given a set of redundant measurements, one or more residual functions are created, each of which is designed to remain `close' to zero as long as the measurements are consistent. When a fault occurs, a variety of techniques may be applied to determine which sensor (or other plant component) is responsible for the inconsistency. Normally such techniques entail modelling of plant dynamic behaviour and/or sensor fault modes, which can be difficult and/or expensive. Choices must also be made about each decision-making threshold, i.e. the value which, if exceeded by a residual, indicates a significant inconsistency.
The availability of the uncertainty of each measurement provides a richer set of information to work with; on the other hand it adds to the complexity and dimensionality of the problem. We do not seek to extend the techniques of analytical redundancy to exploit uncertainty information, but rather to develop an algorithm which does not require any modelling of the measurand M beyond the simplest possible assumption that the expected value of each measurement is equal to the measurand:

E(x_{i}) = M,  i = 1, 2, ..., n    (3)
It is expected that such an algorithm underperforms a more elaborate scheme in detecting subtle inconsistencies, but that it could be more widely and readily used in applications which do not justify the expense of detailed modelling. We are not expecting to detect all undiagnosed faults in a SEVA sensor, although inconsistency may be characteristic of specific fault modes \cite{Yung1993}. Rather, the intention is simply to be able to detect faults which cause measurements to be inconsistent with their redundant peers. Moffat suggests a method of testing consistency between two measurements x_{1} and x_{2}, given their uncertainties u_{1} and u_{2}. Under the hypothesis that the measurements are correct, i.e. they are representative of the same measurand, then the function

f = x_{1} - x_{2}    (4)

should be close to zero. In other words, it is expected that f lies within its own uncertainty u_{f} = √(u_{1}^{2} + u_{2}^{2}). Equivalently,

d_{12}^{M} = |x_{1} - x_{2}| / √(u_{1}^{2} + u_{2}^{2}),

where d_{12}^{M} is the Moffat distance, satisfies the following criterion (called here the Moffat criterion):

d_{12}^{M} ≤ 1    (5)

at the usual, say, 95% probability. The Moffat consistency test can thus be seen as a simple static form of residual function. This definition of consistency is somewhat counterintuitive, in that uncertainty intervals may overlap and yet the measurements still be declared inconsistent.
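The Moffat distance and criterion are straightforward to compute. The sketch below (Python, hypothetical function names) also illustrates the counterintuitive case of overlapping but inconsistent intervals:

```python
import math

def moffat_distance(x1, u1, x2, u2):
    # d12 = |x1 - x2| / sqrt(u1^2 + u2^2)
    return abs(x1 - x2) / math.sqrt(u1 * u1 + u2 * u2)

def moffat_consistent(x1, u1, x2, u2, k=1.0):
    # Consistent when the distance is within k (k = 1 for the basic test).
    return moffat_distance(x1, u1, x2, u2) <= k

# 0 +/- 1 and 1.5 +/- 1: the intervals [-1, 1] and [0.5, 2.5] overlap,
# yet d12 = 1.5 / sqrt(2) > 1, so the pair is Moffat-inconsistent.
overlapping_but_inconsistent = not moffat_consistent(0.0, 1.0, 1.5, 1.0)
```

With the relaxed threshold k = √2, equal intervals are consistent whenever they overlap, e.g. `moffat_consistent(0, 1, 1.99, 1, k=math.sqrt(2))` holds.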
The degree of overlap required for Moffat consistency is maximized when u_{1} = u_{2}. Suppose u_{1} is kept constant and u_{2} is increased; then the degree of overlap required for consistency, as a proportion of u_{1}, decreases asymptotically to zero. It can be shown that Moffat consistency ensures that the CBE of the two measurements falls within the uncertainty intervals of each. A logical corollary is that there must be an overlap between the two uncertainty intervals and that the CBE falls within the overlap. It can also be shown that if x_{1} and x_{2} are Moffat consistent, then the CBE (n.b. with its reduced uncertainty) is also Moffat consistent with x_{1} and x_{2}.
The Type I threshold of 5% is presumably acceptable for the analysis of experimental data, the context in which Moffat conceived the test, and of course all the uncertainties themselves are expressed at the 95% probability level. However, for the purposes of online monitoring of redundant measurements in an industrial context, this probability is too high, leading to a steady stream of trivial alarms. The alarm frequency could be reduced by modifying the test as follows: use the test criterion -ku_{f} ≤ f ≤ ku_{f} to demonstrate consistency, where k is a fixed but arbitrary value which controls the probability of a Type I error. The value k = √2 has intuitive appeal, because then two uncertainty intervals of equal magnitude are declared consistent if there is any overlap between them, and the Type I error is reduced to about 0.25%. However, as u_{2} increases relative to u_{1}, say, the counterintuitive result is derived that two intervals are consistent even if they do not overlap at all, indeed even if there is a large gap between them. For example, using k = √2, all of the following pairs of uncertainty intervals are consistent, even though the latter two pairs do not overlap:
 0 ±1 and 1.99 ±1
 0 ±1 and 14 ±10
 0 ±1 and 140 ±100
It is concluded that k = 1 is the only acceptable value. There remains the concern that the 5% probability of a Type I error is too high.
Consistency checking: three or more measurements
For the principle of majority voting to be applicable, a sensor fusion system needs to have 3 or more measurements. When two measurements are found to be inconsistent with each other, majority voting cannot resolve the issue.
However, for more than two variables, Moffat's definition of consistency introduces a problem, in that the consistency criterion is not transitive. Using the expression x_{1} © x_{2} to indicate that measurement x_{1} with uncertainty u_{1} is consistent with measurement x_{2} with uncertainty u_{2}, © can be thought of as a binary relation, which is reflexive (x_{1} © x_{1}) and symmetric (x_{1} © x_{2} ⇒ x_{2} © x_{1}). Unfortunately, the relation is not transitive: x_{1} © x_{2} and x_{2} © x_{3} do not imply x_{1} © x_{3}.
Also, it has been shown that there is a 5% probability that any two correct measurements of the same measurand are not consistent. Thus, given a set of 3 or more independent measurements that need to be combined, two issues need addressing. First, the maximum subset of mutually consistent measurements must be found and declared the consistent subset. Second, the measurements outside this subset, termed outliers, must be dealt with bearing in mind that inconsistency may be due to probabilistic jitter rather than sensor error.
It can be shown that the problem of finding the maximum subset of mutually consistent measurements is equivalent to the maximum clique problem in graph theory. That is, given a set of nodes and arcs, find the maximum subset of nodes (called the clique) with the property that each node from the subset is connected to every other. If each node is a measurement and each arc is a consistency relation, then this is equivalent to the problem of measurement consistency checking.
The maximum clique problem is known to be NP-hard, so in general an exhaustive search is required to guarantee finding the solution. Consider a set of n SEVA measurements x_{i} with uncertainties u_{i}, i=1,2,...,n. A prerequisite for the search is the building of the measurement graph. The n nodes are the values x_{i}, while the existence of an arc between x_{i} and x_{j} is determined by whether they are consistent, i.e. whether x_{i} © x_{j}. Let p be the maximum clique order. The search starts by trying p=n (i.e. all measurements are consistent) and systematically works down until a clique is found or until p=1. When a clique is found the algorithm further searches for any other cliques of the same order. The algorithm is summarised below.
1. Initialisation: build the measurement graph and set the trial clique order p = n.

2. Search for maximum cliques: test each subset of order p for mutual consistency; if no clique of order p exists, decrement p and repeat; when a clique is found, continue the search for any further cliques of the same order.
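A minimal Python sketch of the exhaustive search (hypothetical names; measurements are (value, uncertainty) pairs) descends from p = n and returns all cliques of the first order at which any is found:

```python
import math
from itertools import combinations

def consistent(m1, m2):
    # Moffat criterion: |x1 - x2| <= sqrt(u1^2 + u2^2)
    (x1, u1), (x2, u2) = m1, m2
    return abs(x1 - x2) <= math.sqrt(u1 * u1 + u2 * u2)

def max_cliques(ms):
    """Exhaustive search: for p = n, n-1, ..., 1, return every subset of
    order p whose members are pairwise consistent, as sets of indices."""
    n = len(ms)
    for p in range(n, 0, -1):
        found = [set(idx) for idx in combinations(range(n), p)
                 if all(consistent(ms[i], ms[j])
                        for i, j in combinations(idx, 2))]
        if found:
            return found
    return []

# x1 (c) x2 and x2 (c) x3 hold, but x1 (c) x3 fails, so there are two
# maximum cliques of order 2.
cliques = max_cliques([(0.0, 1.0), (1.2, 1.0), (2.4, 1.0)])
```

The subset enumeration makes the worst-case cost exponential in n, which motivates the linear approximation of the next section.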
Approximation of the maximum clique by linear search
The exhaustive search for the maximum cliques can become extremely onerous as the number of measurements increases and the order of the maximum clique decreases. A method for approximating the maximum clique is proposed which uses overlapping intervals instead of the Moffat criterion to check for consistency. Because this method is linear in the number of measurements it has far less complexity than the exhaustive search. Moffat consistency is ensured within the resulting cliques by a later processing stage called uncertainty expansion, which is described in detail in the next section.
Consider again the set of n SEVA measurements x_{i} with uncertainties u_{i}, i=1,2,...,n, and let the uncertainty interval for the ith measurement, i=1,2,...,n, be (l_{i},h_{i}), where l_{i} = x_{i} - u_{i} and h_{i} = x_{i} + u_{i} are the lower and upper bound, respectively. The set of n measurements can then be described by an ordered bound list containing all l_{i} and h_{i}. Without loss of generality the x_{i} can be assumed ordered so that l_{1} < l_{2} < ... < l_{n}. Of course the h_{i} may occur in any order interleaved through the l_{i}, subject only to the constraint that h_{i} > l_{i} (and hence h_{i} > l_{k}, k = 1, 2, ..., i). The overlapping intervals are readily identified by stepping through the ordered list of bounds. The approximation of the maximum clique(s) is given by the measurements whose uncertainty intervals define the area(s) of maximum overlap. The method is illustrated below.
The bound list is in this case given by l_{1}l_{2}h_{1}l_{3}l_{4}l_{5}h_{4}h_{3}h_{5}h_{2}. The point of maximum overlap involves measurements 2, 3, 4 and 5, which are therefore considered as an approximation of the maximum clique.
The algorithm walks through the bound list in increasing order. When a lower boundary is encountered the corresponding measurement is added to the set of active measurements, whose order p is thus incremented. When an upper bound is encountered, the corresponding measurement is removed from the set of active measurements whose order is thus decremented. At each stage, if the order of the active measurement set exceeds all previous values, then the active set becomes the new maximum clique. If its order equals that of the current maximum clique then the set is stored as an additional maximum clique. The algorithm is summarised below.
1. Initialisation: sort the 2n interval bounds l_{i} and h_{i} into the ordered bound list; set the active measurement set to empty.

2. Search the bound list for maximum areas of overlap: step through the bounds in increasing order, adding a measurement at each lower bound and removing it at each upper bound, recording the active set(s) of maximum order as the approximate maximum clique(s).
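The bound-list sweep might be sketched as follows (Python, hypothetical names); ties are resolved so that a lower bound coinciding with an upper bound still counts as an overlap:

```python
def linear_max_overlap(ms):
    """Approximate the maximum clique(s) by sweeping the ordered bound list,
    tracking the active set of overlapping uncertainty intervals."""
    events = []
    for i, (x, u) in enumerate(ms):
        events.append((x - u, 0, i))  # lower bound l_i: activate i
        events.append((x + u, 1, i))  # upper bound h_i: deactivate i
    events.sort()  # at equal positions, lower bounds (kind 0) come first
    active, best = set(), [set()]
    for _, kind, i in events:
        if kind == 0:
            active.add(i)
            if len(active) > len(best[0]):
                best = [set(active)]       # new maximum order
            elif len(active) == len(best[0]):
                best.append(set(active))   # additional clique of same order
        else:
            active.discard(i)
    return best

# Intervals reproducing the bound list l1 l2 h1 l3 l4 l5 h4 h3 h5 h2 of the
# example: maximum overlap involves measurements 2, 3, 4 and 5
# (zero-based indices 1-4 here).
cliques = linear_max_overlap([(1.0, 1.0), (5.0, 4.0), (5.0, 2.0),
                              (5.0, 1.0), (6.5, 1.5)])
```

Sorting the 2n bounds costs O(n log n) and the sweep itself is linear, compared with the exponential worst case of the exhaustive search.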
Dealing with outliers
Having found or approximated a maximum clique, the obvious next step would be to use it to calculate the CBE using Equations 1 and 2 and to ignore all outliers. This approach has a number of difficulties:
 Given the probabilistic nature of the uncertainty, even if all measurements are correct representations of the measurand, there is only a 95% chance of each pair being consistent. As the number of inputs increases, the probability of all measurements being consistent reduces. For example, with 10 normally distributed measurements of equal variance and mean, there is only a 37.3% chance of all 10 sensors being mutually consistent at any given time.
 If, on average, one measurement is only marginally consistent with the rest, then sample by sample it may regularly switch between being judged consistent and inconsistent. This will generate undesirable jitter on the CBE.
 It is possible that at any given time there is more than one maximum clique. For example, with three measurements x_{1}, x_{2} and x_{3} such that x_{1} © x_{2} and x_{2} © x_{3}, while x_{1} © x_{3} is not true, then there are two maximum cliques, (x_{1}, x_{2}) and (x_{2}, x_{3}). It is not obvious which of the maximum cliques should be used to calculate the CBE.
A simple strategy can be implemented to tackle these issues. The underlying idea is that any inconsistent measurement can be `made consistent' by a sufficient increase in its own uncertainty, and that such an increase will cause a reduction in the weight of that measurement in the CBE. This approach is not based on uncertainty theory, but rather is a heuristic approach which has the desirable characteristics of smoothing over probabilistic inconsistency jitter, and providing a smooth reduction of weighting for inconsistent measurements.
In the most general case, when there is more than one maximum clique, the measurements are partitioned into two sets:
 the core set: the intersection of all the maximum cliques
 the peripheral set: the remaining measurements, i.e. those in at least one maximum clique but not in the core set, together with those outside every maximum clique
If the maximum cliques were found using exhaustive search, then the mutual Moffat consistency of the measurements inside them is ensured. However, this is not guaranteed to be the case with the linear search; thus, for the core and peripheral sets resulting from the linear search algorithm, additional consistency checking needs to be done before the CBE is computed. The maximum Moffat distance d^{M}_{max} between pairs of measurements from the core set is computed. If this value is greater than one, then at least one of the measurement pairs is inconsistent. The uncertainty u_{i} of every measurement in the core set is then increased to u′_{i} = d^{M}_{max}·u_{i}, which ensures mutual consistency.
Each measurement from the peripheral set is then considered in turn and the maximum Moffat distance to the measurements in the core set is found. If this value is greater than a specified threshold (for example 3.0), then the measurement is judged to be a true outlier and is ignored. If, however, this distance is less than the specified threshold, then its uncertainty interval is expanded as described to make it consistent with the measurements in the core set. The measurements from the peripheral set thus processed are then merged with those in the core set to obtain the CBE. This technique of uncertainty expansion reduces, but does not eliminate, the influence of the involved measurements on the CBE. In particular, if a measurement slowly drifts into inconsistency with the rest, uncertainty expansion ensures a smooth reduction of influence on the CBE before it is finally labelled as an outlier. This is illustrated in the simulations at the end of the paper.
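One possible realisation of this strategy is sketched below (Python, hypothetical names). The exact expansion rule for peripheral measurements is not specified above, so the sketch makes an assumption: each retained peripheral uncertainty is grown to the smallest value restoring d ≤ 1 against every core measurement.

```python
import math

def moffat_distance(m1, m2):
    (x1, u1), (x2, u2) = m1, m2
    return abs(x1 - x2) / math.sqrt(u1 * u1 + u2 * u2)

def expand_and_merge(core, peripheral, threshold=3.0):
    """Return the list of (value, uncertainty) pairs entering the CBE."""
    # Step 1: if any core pair has d > 1, scale every core uncertainty by
    # the maximum pairwise distance; each distance then becomes d/d_max <= 1.
    d_max = max((moffat_distance(a, b)
                 for i, a in enumerate(core) for b in core[i + 1:]),
                default=0.0)
    if d_max > 1.0:
        core = [(x, d_max * u) for x, u in core]
    # Step 2: discard true outliers; expand the rest just enough to be
    # consistent with every core measurement.
    kept = list(core)
    for x, u in peripheral:
        d = max(moffat_distance((x, u), m) for m in core)
        if d > threshold:
            continue  # true outlier: ignored
        if d > 1.0:
            # Smallest u' with |x - xc| <= sqrt(u'^2 + uc^2) for all (xc, uc).
            u = max(math.sqrt(max((x - xc) ** 2 - uc * uc, 0.0))
                    for xc, uc in core)
        kept.append((x, u))
    return kept

merged = expand_and_merge([(0.0, 1.0), (0.5, 1.0)], [(2.5, 1.0), (10.0, 1.0)])
```

Here the measurement at 2.5 is retained with an expanded uncertainty, while the one at 10 is discarded as a true outlier.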
One circumstance not covered by the above procedure is where there are multiple maximum cliques with no intersection between them. In this case the "middle clique" is found as being the maximum clique closest to the mean of the merged values for each maximum clique. The middle clique is then considered to be the core set while the peripheral set contains the remaining measurements.
The Combination Block
Given a set of n SEVA measurements (x_{i}, u_{i}, status_{i}), the task of the Combination Block is as follows.
 calculate the CBE and its uncertainty using the techniques described above. Normally, the VMV output is set equal to the CBE and the VU to its uncertainty (see below).
 assign the MV status of the output block. As a configuration option, the user can assign the minimum acceptable size of the maximum clique (for example 2 out of 3 or 6 out of 10). If this size is not reached, then the CBE is not used but instead the VMV and VU are projected from past history in the usual way, and the MV status is set to DAZZLED or, if the condition persists, BLIND. If the minimum acceptable size of clique is reached, then the MV status is set to SECURE; COMMON if the sensors are of identical type, otherwise DIVERSE. A further configuration option is the minimum number of CLEAR (or better) consistent measurements required to declare the CBE to be SECURE. If this target is not met, then the CBE is assigned the best status of the consistent measurements (i.e. CLEAR, BLURRED, DAZZLED or BLIND).
 each SEVA measurement is also assigned a consistency flag. This takes the value 1 if the measurement was found to be in the core or was made consistent with the core by uncertainty expansion, and 0 otherwise. This flag may be used (possibly after further filtering to avoid jitter) to trigger additional diagnostic testing within any SEVA sensors whose measurements were found inconsistent with the majority.
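The status logic of the Combination Block can be sketched as follows (Python; the numeric ordering of the statuses, the parameter names, and the omission of persistence/hysteresis and of the COMMON/DIVERSE sub-label are simplifying assumptions):

```python
from enum import IntEnum

class MV(IntEnum):
    # Ordered worst to best, so max() picks the best status.
    BLIND = 0
    DAZZLED = 1
    BLURRED = 2
    CLEAR = 3
    SECURE = 4

def cb_status(consistent_statuses, min_clique_size, min_clear):
    """MV status assigned to the Combination Block output, given the MV
    statuses of the consistent input measurements."""
    if len(consistent_statuses) < min_clique_size:
        return MV.DAZZLED  # BLIND if the condition persists
    n_clear = sum(1 for s in consistent_statuses if s >= MV.CLEAR)
    if n_clear >= min_clear:
        return MV.SECURE
    # Otherwise the CBE takes the best status of the consistent inputs.
    return max(consistent_statuses)
```

For example, with a 2-out-of-3 clique requirement and three CLEAR inputs required for SECURE, losing one sensor to BLURRED drops the output from SECURE to CLEAR, as in the simulations later in the paper.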
Exhaustive search vs. linear search approximation
Simulations have been carried out to compare the performance of the two methods for finding the set of mutually consistent SEVA measurements, i.e. the exhaustive search for the maximum clique and its approximation by linear search. In these first studies, fault-free behaviour is considered. It is desirable to have a match between theoretical and simulation results for the following statistics:
 mean of the CBE
 variance in mean of the CBE
 reported uncertainty of the CBE
In addition, it is desirable for the reported uncertainty to be reasonably constant, and for the incidence of reported inconsistencies to be low (as there are no true faults, just random variations).
100000 random sets of 3, 6 and 10 SEVA measurements were generated as follows:
 the true measurand value is 0
 the measurements were randomly generated from a normal distribution with a mean of zero and unit variance. This corresponds to an uncertainty of 1.96.
The theoretical value of the standard deviation of the CBE is then 1/√n, giving an uncertainty of 1.96/√n. The means and standard deviations of the reported values of the CBE and its uncertainty have been computed over the 100000 simulations and are contrasted with the corresponding expected values below.
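The 1/√n scaling is easy to check with a quick Monte Carlo (a simplified version of the study above: equal uncertainties, so the CBE reduces to the arithmetic mean, and no consistency checking is applied; the trial count is smaller than in the paper):

```python
import math
import random

random.seed(0)
n, trials = 10, 20000
cbes = []
for _ in range(trials):
    xs = [random.gauss(0.0, 1.0) for _ in range(n)]
    cbes.append(sum(xs) / n)  # equal uncertainties: the CBE is the mean

mean = sum(cbes) / trials
std = math.sqrt(sum((c - mean) ** 2 for c in cbes) / trials)
# std should be close to 1/sqrt(10) ~ 0.316, so the reported uncertainty
# of the CBE should be close to 1.96/sqrt(10) ~ 0.62.
```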
                     Exhaustive   Linear    Theoretical
                     search       search    value
3 sensors:
  Mean of CBE        0.002        0.002     0.0
  Std of CBE         0.581        0.579     0.577
  Mean of unc.       1.136        1.129     1.131
  Sets with k < n    0            0
6 sensors:
  Mean of CBE        0.002        0.002     0.0
  Std of CBE         0.411        0.410     0.408
  Mean of unc.       0.806        0.804     0.800
  Sets with k < n    0            0
10 sensors:
  Mean of CBE        0.0          0.0       0.0
  Std of CBE         0.320        0.320     0.316
  Mean of unc.       0.626        0.624     0.619
  Sets with k < n    0            0
Note: k is the number of consistent measurements after uncertainty expansion and n is the total number of measurements in the set
In this fault-free simulation, all sensor values were included in the calculation of all the CBEs through the use of the expanded uncertainty weighting technique. By contrast, without this technique, a significant percentage of sets would be found inconsistent (e.g. in the case of 10 sensors, only 37.3% of the sets were found fully consistent).
At this point it can be concluded that the exhaustive search for the maximum clique and the approximation of the maximum clique with the linear search give very similar results. Given the simplicity and computational efficiency of the linear search, it may be preferred, certainly for larger numbers of sensors (say > 5). Also, the results show a reasonable match between the expected value of the CBE uncertainty, its actual variation, and its reported uncertainty.
Simulation results
Experiments have been carried out to study the behaviour of the Combination Block when one of the SEVA sensors either signals a fault or gives an incorrect description of the measurand. The linear search method was used to generate the following results.
The experiments consisted of simulating the online behaviour of 3 SEVA sensors. Two of the SEVA sensors give a correct description of the measurand (as in the previous section), while the third SEVA sensor either signals a fault or generates an incorrect description of the measurand.
In each case, a constant true measurement value of 2 was considered. The simulated faults were as follows:
 Example 1: a saturation fault: saturation at the upper limit occurs at 125s; the fault is permanent; the SEVA sensor detects the fault and changes the MV status first to DAZZLED and then to BLIND.
 Example 2: a detected drift fault: a faulty ramping value is added to the true measurement with a slope of 0.001 units per second; the fault begins at 125s and is permanent; the SEVA sensor detects the fault and changes the MV status to BLURRED.
 Example 3: an undetected drift fault: a faulty ramping value is added to the true measurement with a slope of 0.001 units per second; the fault begins at 125s and is permanent; the SEVA sensor does not detect the fault and reports the measured value along with an MV status of CLEAR.
For the cases when the third SEVA sensor gives an incorrect description of the measurand, this was accomplished by ensuring that the VU was of the usual magnitude and the MV Status was CLEAR, while the VMV in fact suffers a drift starting at 125s, with a slope of 0.001 units per second.
The time series of the VMV, the VU and the MV status for a typical sensor exhibiting faultfree behaviour and generating a correct description of the measurand is given on the right. Below we show the outputs of the faulty sensor and the Combination Block for examples 1, 2 and 3 respectively.
In example 1 the third sensor exhibits a permanent saturation fault. Its output is characterised by the usual SEVA response, i.e.
 The VMV is projected from past history. In this case the VMV remains reasonably accurate as the process is stationary.
 The MV status changes to DAZZLED and then BLIND when it is deemed that the saturation is permanent.
 The uncertainty increases at a rate learned from past history using conventional SEVA algorithms.
The response of the Combination Block is as follows:
 The MV status of the Combination Block can only remain SECURE COMMON if a configured number of input sensors are CLEAR. In this case the number is three, so as soon as sensor 3 changes MV status the Combination Block output reverts to CLEAR. Note that hysteresis is used to prevent excessive jitter on the combination block MV status.
 The measurements are combined according to their consistency and uncertainty weightings. In both cases the measurement from the faulty sensor remains consistent, but its influence declines rapidly, weighted by the inverse square of its increasing uncertainty. This also results in the rapid increase in the uncertainty of the combined measurement from about 0.06 to 0.075 after the fault.
In Example 2 a drift fault occurs in sensor 3, but the sensor detects the fault and attempts to compensate. Thus the raw measurement value (RMV) is seen to drift off quickly, but the SEVA sensor reduces the effect of the fault by internal correction (which still leaves some marginal drift). The sensor declares the measurement BLURRED and increases its uncertainty. In these cases, the slow increase in the VU of the faulty sensor is reflected in a very marginal increase in the uncertainty of the Combination Block. Again, the change in the MV status is also accounted for in the change of MV for the Combination Block.
In Examples 1 and 2, since the fault is compensated for inside the SEVA sensor, the reported VMV is a correct representation of the true measurand. Therefore the combination block finds all 3 measurements to be consistent and uses them all to calculate the CBE. The occurrence of the fault is then reflected in the value of the VU for the combination block and in the MV status of the Combination Block output (determined by the change in MV status of the faulty sensor).
Example 3 shows the most important case: when one SEVA sensor does not give a correct representation of the measurand. The chain of events is as follows:
 An undetected drift fault begins in sensor 3 at t = 125s.
 The CBE begins to rise as long as the faulty value remains consistent with the rest.
 From t=200s to t=275s sensor 3 becomes increasingly inconsistent with the other two (i.e. its Moffat distance from their combination is between 1 and 3). Accordingly, its influence diminishes, the CBE returns towards the true value and the uncertainty increases as reliance is placed on only two instead of three measurements.
 Finally, at t=275s sensor 3 is deemed to be persistently inconsistent (Moffat distance > 3) and the MV status of the output drops to CLEAR.
The results shown illustrate that the combination block is capable of detecting and compensating for both detected and undetected faults in one of a set of SEVA sensors. The VU of the CBE is increased accordingly to account for faults and, when necessary, the faulty sensor is excluded from the calculation of the CBE. The CBE provided by the combination block in these examples remains a correct representation of the measurand, and is smooth, while the MV status is free from jitter.