Introduction

Source identification problems have been studied extensively in heat transfer (Hon and Wei 2004), groundwater transport (Atmadja and Bagtzoglou 2001), and atmospheric constituent transfer (Kovalets et al. 2011). In recent years, accidents caused by hazardous contaminant leakage and indoor epidemics of infectious diseases, such as SARS and influenza, have caused heavy damage and attracted increasing attention (Olsen et al. 2003; Li et al. 2007; Liu and You 2012). It is therefore of great importance to locate the contaminant source rapidly so that effective measures can be taken to protect occupants.

Identifying the source location may be more important than identifying the contaminant emission rate, because the contaminant concentration versus time can be calculated by computational fluid dynamics (CFD) simulation only once the source location is known. The present paper therefore focuses on the identification of the source location; the source intensity can then be obtained by linear scale-up or scale-down of the normal source strength (Zhang and Chen 2007).

Previous studies have explored several methods for identifying the source location, such as the Bayesian probability model (Sreedharan et al. 2006), the genetic algorithm (Arvelo et al. 2002), and the probability-based inverse multi-zone model (Liu and Zhai 2009). The multi-zone model was used to calculate the airflow and contaminant transport. Because this model provides only macroscopic information about contaminant transport, it is difficult to find the exact source location and intensity (Cai et al. 2012). Moreover, the Bayesian method is time-consuming and requires several sensors and thousands of monitoring data points. A simple, rapid, and economical method for identifying the contaminant source location is therefore still urgently needed.

Cai et al. (2012) developed a method to identify the location of a contaminant source with the help of CFD simulation. Good identification results were obtained with five sensors, but analyzing the large amount of monitoring data is time-consuming. The maximum correlation coefficient (MCC) method has been adopted to identify point contaminant source locations in groundwater and rivers (Sidauruk et al. 1988; Chen et al. 2011). To our knowledge, however, the application of the MCC method with a CFD simulation database to the identification of indoor contaminant source locations has not been reported.

In this paper, the minimum discrete degree (MDD) method and the maximum correlation coefficient (MCC) method are developed to identify the contaminant source location with only a single sensor. The effects of the sampling time, the sampling interval, the sensor response time, and the measurement error on the identification of the source location are analyzed. Both methods are found to identify the location of an unknown source at low cost and with rapid response.

Methods

The proposed methods (MDD and MCC) consist of two steps: the establishment of the CFD simulation database and the identification of the contaminant source location. It is assumed that the indoor airflow is steady and that only one contaminant point source exists in the room, releasing at a steady rate. A single sensor with a detection threshold of 1 ppm is placed at the center of the air outlet. The time-consuming CFD simulation database for all potential locations is constructed before the source location identification starts.

The potential contaminant sources are denoted S = {S 1, S 2, …, S N }. The corresponding CFD-simulated concentrations of the potential sample sources are stored in the simulation database array M = {α 1, α 2, …, α N }. In a contaminant release event, the concentration record of the real contaminant source, whose location is unknown and is to be identified from the sensor data, is denoted α 0. The contaminant concentration arrays of the unknown source and of all potential sample sources in the CFD simulation database are written as α i  = {c i1, c i2, …, c im }, 0 ≤ i ≤ N, 1 ≤ j ≤ m, where m = τ/Δt, m > 2, τ is the sampling time, and Δt is the sampling interval.

The procedure of identification is listed as follows:

  • Obtain the indoor steady-state flow field by CFD simulation;

  • For each potential sample source, store the simulated concentration at the monitor (sensor) position in the database M = {α 1, α 2, …, α N };

  • Store the monitored concentration of the unknown source in the array α 0;

  • Identify the unknown source location by using the MDD and MCC methods.
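The array layout used in the steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `sample_times` and `build_database` are hypothetical, the two concentration curves are invented stand-ins for CFD output at the sensor, and the choice of sampling instants Δt, 2Δt, …, τ is an assumption, since the text fixes only the number of samples m = τ/Δt.

```python
from typing import Callable, Dict, List


def sample_times(tau: float, dt: float) -> List[float]:
    """Return m = tau/dt sampling instants dt, 2*dt, ..., tau (assumed layout)."""
    m = int(round(tau / dt))
    if m <= 2:
        # The text requires more than two monitoring data points (m > 2).
        raise ValueError("need m = tau/dt > 2 samples")
    return [dt * (j + 1) for j in range(m)]


def build_database(curves: Dict[str, Callable[[float], float]],
                   tau: float, dt: float) -> Dict[str, List[float]]:
    """Sample each simulated sensor concentration curve to form the arrays alpha_i."""
    times = sample_times(tau, dt)
    return {label: [curve(t) for t in times] for label, curve in curves.items()}


# Hypothetical stand-ins for CFD-simulated sensor concentrations (ppm);
# real entries would come from the simulation database.
curves = {
    "S1": lambda t: 0.02 * t,
    "S2": lambda t: 0.001 * t * t,
}
M = build_database(curves, tau=60.0, dt=10.0)  # six samples per source
```

The monitored record α 0 of the unknown source would be stored as one more list of the same length m, so that it can be compared entry by entry with every α i in M.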

MDD method

In the MDD method, the simulation database established by CFD and the monitoring record of the sensor are used to identify the location of the unknown source by calculating the discrete degree between the two. The MDD method looks for the least concentration difference between the unknown source and the potential sources. Its performance can be improved by introducing a virtual source α p′ that represents the concentration difference between the unknown source and the potential sample source α p. The concentration difference c ij′ of the virtual source is written as follows:

$$ {c}_{ij}^{\prime }=\left|{c}_{ij}-{c}_{0 j}\right|,\quad 0 < i\le N,\ 1\le j\le m $$
(1)

The number of virtual sources equals the number of sample sources in the CFD simulation database. The database of virtual sources is set as N = {α 1′, α 2′, …, α N ′}. Taking the sample source α p of database M as an example, the identification process is illustrated as follows:

For the unknown source α 0, if α 0 → α p, there is a linear concentration relation between α p′ and α p, which is written as follows:

$$ \frac{c_{\mathrm{p}1}^{\prime }}{c_{\mathrm{p}1}}\approx \frac{c_{\mathrm{p}2}^{\prime }}{c_{\mathrm{p}2}}\approx \cdots \approx \frac{c_{\mathrm{p} m}^{\prime }}{c_{\mathrm{p} m}}\approx k $$
(2)

To quantify how closely the concentrations of α p′ and α p follow this linear relationship, a discrete degree parameter I is defined as follows:

$$ I=\sqrt{\frac{1}{m}{\displaystyle \sum_{j=1}^m{\left(\frac{c_{\mathrm{p} j}^{\prime }}{c_{\mathrm{p} j}}- k\right)}^2}} $$
(3)

where k is the mean of the concentration ratios:

$$ k=\frac{1}{m}{\displaystyle \sum_{j=1}^m\left(\frac{c_{\mathrm{p} j}^{\prime }}{c_{\mathrm{p} j}}\right)} $$
(4)

If the location of the sample source α p is close to that of α 0, the discrete degree I of α 0 and α p tends to zero, since the concentrations of α 0 and α p are linearly related. The possible source location is therefore found by searching for the minimum value of I.
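Equations (1) to (4) translate directly into a short routine. The following Python sketch is illustrative only: the names `discrete_degree` and `identify_mdd` are hypothetical, the database is assumed to map source labels to equal-length concentration lists, and skipping samples with zero simulated concentration (to avoid division by zero) is an assumption that the text does not address.

```python
import math
from typing import Dict, List


def discrete_degree(sample: List[float], unknown: List[float]) -> float:
    """Discrete degree I of one sample source against the unknown source."""
    # Eq. (1) gives c'_pj = |c_pj - c_0j|; Eq. (2) uses the ratios c'_pj / c_pj.
    # Assumption: samples with c_pj == 0 are skipped to avoid division by zero.
    ratios = [abs(c - c0) / c for c, c0 in zip(sample, unknown) if c > 0]
    if not ratios:
        return math.inf
    k = sum(ratios) / len(ratios)                                      # Eq. (4)
    return math.sqrt(sum((r - k) ** 2 for r in ratios) / len(ratios))  # Eq. (3)


def identify_mdd(database: Dict[str, List[float]], unknown: List[float]) -> str:
    """Return the label of the sample source with the minimum discrete degree."""
    scores = {label: discrete_degree(alpha, unknown)
              for label, alpha in database.items()}
    return min(scores, key=scores.get)


# Invented data: the unknown source sits at S1 and emits at half the sample rate.
database = {"S1": [1.0, 2.0, 3.0, 4.0], "S2": [1.0, 4.0, 9.0, 16.0]}
unknown = [0.5, 1.0, 1.5, 2.0]
best = identify_mdd(database, unknown)
```

In this made-up example every ratio for S1 equals 0.5, so k = 0.5 and I = 0, whereas the ratios for S2 vary and give I > 0. This mirrors the argument in the text: a source at the same location but with a different emission rate produces a constant ratio and hence a vanishing discrete degree.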

Flow convection has a significant influence on contaminant transport and on source location identification. When two adjacent sample sources (α p and α q) and the unknown source are all in a strong convection region, the pollutant is transported mainly by convection and reaches the sensor very quickly. As a result, the concentrations of the sample sources at the monitoring point are very close to each other, that is, c pj  ≈ c qj . In this case, the exact location of the contaminant source cannot be identified and only a possible area is found.

When two adjacent sample sources, such as α p and α q, lie in strong and weak convection regions, respectively, the monitored concentration of the source in the strong convection region is much larger than that of the source in the weak convection region for the same sampling time τ, that is, c pj  > c qj . The accurate identification of the unknown source α 0 then depends on the monitored concentration value c 0j . If the concentration difference between c pj and c 0j is large, α 0 may be wrongly identified as α p even if it is in fact close to α q.

If the two adjacent sample sources near the unknown source are both in a weak convection region, there is an apparent concentration difference at the monitoring point once they are separated by some distance. In this case, the location of the unknown source can be identified accurately by the MDD method. A pollutant release in a weak convection area is more dangerous because the pollutant stays longer and many workspaces lie in weak convection regions. The MDD method is suitable for identifying the location of a contaminant source in such regions.

MCC method

The MCC method combines CFD simulation with correlation coefficient analysis. The correlation coefficient is used to find the concentration relationship between the monitoring data of the unknown contaminant source and the sample sources in the CFD simulation database. For a steady indoor flow, Ma et al. (2012) analyzed the transient accessibility of contaminant source (TACS) at an arbitrary point p for the ith contaminant source at time t. When the initial and inlet concentrations are zero and only a single contaminant source is released, the TACS is defined as follows (Ma et al. 2012):

$$ {a}_{i- p}(t)=\frac{C_p(t)}{C_{e, i}} $$
(5)

where \( {C}_{e, i}=\frac{J_i}{Q} \) is the average concentration at the air outlet; J i is the emission rate of the ith source; Q is the air flow rate of the room; and C p (t) is the concentration at point p at time t. TACS is a function of the flow characteristics and the source location and is independent of the emission rate of the contaminant source. When the location of the unknown source is close to that of a sample source in the CFD simulation database, the characteristic contaminant concentrations of the two sources are linearly related. According to the correlation coefficient equation, the concentration relationship between the unknown source α 0 and the potential sample source α i in the CFD simulation database is written as follows (Chen et al. 2011):

$$ R\left({\alpha}_i,{\alpha}_0\right)=\frac{\sum_{j=1}^m\left({c}_{i j}-\overline{c_{i j}}\right)\left({c}_{0 j}-\overline{c_{0 j}}\right)}{\sqrt{\sum_{j=1}^m{\left({c}_{i j}-\overline{c_{i j}}\right)}^2}\sqrt{\sum_{j=1}^m{\left({c}_{0 j}-\overline{c_{0 j}}\right)}^2}},\quad 1\le i\le N,\ 1\le j\le m $$
(6)

where \( \overline{c_{ij}} \) and \( \overline{c_{0 j}} \) represent the mean values of the contaminant concentration arrays α i and α 0, respectively.

Since there are N potential sample sources, the correlation coefficients form an array P = {R(α 1, α 0), R(α 2, α 0), …, R(α N , α 0)}. The largest value of R(α i , α 0) corresponds to the location of the contaminant source.
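Equation (6) and the selection of the largest entry of P can be sketched as follows. The code is a minimal illustration, not the authors' implementation: `correlation` and `identify_mcc` are hypothetical names, the database is assumed to map source labels to equal-length concentration lists, and returning 0 when one series is constant (zero denominator) is an assumption.

```python
import math
from typing import Dict, List, Tuple


def correlation(sample: List[float], unknown: List[float]) -> float:
    """Correlation coefficient R(alpha_i, alpha_0) of Eq. (6)."""
    m = len(sample)
    mean_s = sum(sample) / m
    mean_u = sum(unknown) / m
    num = sum((s - mean_s) * (u - mean_u) for s, u in zip(sample, unknown))
    den = (math.sqrt(sum((s - mean_s) ** 2 for s in sample))
           * math.sqrt(sum((u - mean_u) ** 2 for u in unknown)))
    # Assumption: a constant series carries no shape information, so return 0.
    return num / den if den > 0 else 0.0


def identify_mcc(database: Dict[str, List[float]],
                 unknown: List[float]) -> Tuple[str, Dict[str, float]]:
    """Return the label with the largest R together with the whole array P."""
    P = {label: correlation(alpha, unknown) for label, alpha in database.items()}
    return max(P, key=P.get), P


# Invented data: the unknown source sits at S1 and emits at half the sample rate.
database = {"S1": [1.0, 2.0, 3.0, 4.0], "S2": [1.0, 4.0, 9.0, 16.0]}
unknown = [0.5, 1.0, 1.5, 2.0]
best, P = identify_mcc(database, unknown)
```

Because R is unchanged when either series is multiplied by a positive constant, halving the emission rate in this invented example still gives R = 1 (up to rounding) for S1, which is consistent with the statement that TACS does not depend on the emission rate.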

Case study

The effectiveness of the two proposed methods is validated with a three-dimensional case similar to that of Cai et al. (2012). The geometry of the three-dimensional model is shown in Fig. 1a. The room is 9 m long (X), 3.2 m high (Z), and 4 m wide (Y). Two air inlets (0.4 m × 0.4 m) and an air outlet (0.8 m × 0.4 m) are set in the ceiling. The supply air rate is 0.128 m³/s. Eight persons sit at fixed places in the room, and their heights are all 1 m. The height of the table in front of each person is 0.6 m. The potential virus sources at the mouth of each person are numbered S1–S8 in Fig. 1b. The emission rate of each sample source is 100 ppm/s. The locations of the potential sources are as follows: S1 (1.5, 3.3, 0.85), S2 (3.5, 3.3, 0.85), S3 (5.5, 3.3, 0.85), S4 (7.5, 3.3, 0.85), S5 (1.5, 0.7, 0.85), S6 (3.5, 0.7, 0.85), S7 (5.5, 0.7, 0.85), and S8 (7.5, 0.7, 0.85). Suppose that one contaminant source is spreading virus with an emission rate of 50 ppm/s. Only one sensor is placed at the center of the air outlet (4.5, 2, 3.5).

Fig. 1 Schematic diagrams of the meeting room: a three-dimensional sketch map; b two-dimensional plane layout (1 lamp, 2 exhaust air outlet, 3 supply air inlets, 4 cabinet, 5 person, 6 table, 7 wardrobe, S1–S8 contaminant sources, R sensor)
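For orientation, the case parameters quoted above can be collected in a small Python snippet. The variable names are illustrative, and the air change rate, the nominal time V/Q, and the emission scale factor are derived here from the stated numbers; they are not quantities reported in the text.

```python
# Room dimensions (m) and supply air rate (m^3/s) as stated in the case study.
ROOM = {"x": 9.0, "y": 4.0, "z": 3.2}
SUPPLY_RATE = 0.128

# Potential source locations (x, y, z) in metres.
SOURCES = {
    "S1": (1.5, 3.3, 0.85), "S2": (3.5, 3.3, 0.85),
    "S3": (5.5, 3.3, 0.85), "S4": (7.5, 3.3, 0.85),
    "S5": (1.5, 0.7, 0.85), "S6": (3.5, 0.7, 0.85),
    "S7": (5.5, 0.7, 0.85), "S8": (7.5, 0.7, 0.85),
}

# Derived quantities (not given in the text).
volume = ROOM["x"] * ROOM["y"] * ROOM["z"]            # room volume, m^3
air_changes_per_hour = SUPPLY_RATE * 3600.0 / volume  # about 4 per hour
nominal_time = volume / SUPPLY_RATE                   # V/Q, about 900 s

# Ratio of the unknown emission rate (50 ppm/s) to the sample rate (100 ppm/s).
emission_scale = 50.0 / 100.0

# Sanity check: every potential source lies inside the room.
all_inside = all(
    0.0 <= x <= ROOM["x"] and 0.0 <= y <= ROOM["y"] and 0.0 <= z <= ROOM["z"]
    for x, y, z in SOURCES.values()
)
```

The emission scale of 0.5 is the factor by which the unknown source's concentrations would differ from those of a co-located sample source under the linear scaling assumed in the Introduction.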

CFD is used to obtain accurate, detailed information on the flow field and on the contaminant transport of the sample sources. The RNG k − ε model is used for the turbulence calculations. After a grid independence test, the model is discretized into 1,131,716 unstructured tetrahedral cells.

Results and discussion

The effect of sampling time

The measurement threshold is 1 ppm and the monitoring time interval is 1 s. The contaminant concentration information is obtained separately for sampling times of 60, 120, 180, 240, and 300 s. Table 1 shows the identification results of the MCC method. The three positions S1, S3, and S4 are not identified when the sampling time is 60 s. When the sampling time is 120 s, only S6 fails to be identified. When the sampling time is 180, 240, or 300 s, all locations of the unknown source are identified. Although the value of R becomes small for the 300-s sampling time, the location of the unknown contaminant source is still identified. A suitable sampling time should therefore be chosen for the source location identification.

Table 1 The identification results of MCC at different sampling times

Table 2 shows the identification results of the MDD method. All the locations of the unknown contaminant sources are identified accurately even when the sampling time is only 60 s. Tables 1 and 2 show that the MDD method performs better than the MCC method.

Table 2 The identification results of MDD at different sampling times

Of the eight source locations, source S5 is the farthest from the monitoring sensor, and during identification its location is easily confused with those of S1, S4, and S8. Source S5 is therefore selected as an example to demonstrate the identification efficiency of the two methods.

The identification results of the MCC and MDD methods are presented in Fig. 2. The mean value represents the average identification value of S5 over the eight potential sample sources. The identification value at the actual location of S5 is smaller than that of the other seven sources, by about one order of magnitude. The unknown source location can thus be identified by both methods by taking the average identification value. The MDD method is also found to have better reliability and validity than the MCC method.

Fig. 2 Identification results of source S5 at different sampling times. a MCC method; b MDD method

The effect of sampling interval

For the five sampling times of 60, 120, 180, 240, and 300 s, six sampling intervals of 5, 10, 15, 20, 30, and 60 s are set at the monitoring sensor. Table 3 shows the effects of the sampling interval on the identification results of the MCC and MDD methods. The number of monitoring data points must be more than two, i.e., m > 2. Therefore, the sampling intervals of 30 and 60 s cannot be applied to the sampling times of 60 and 120 s, respectively. For the MCC method, for a given sampling time, the longer the sampling interval, the worse the identification result; for a fixed sampling interval, the longer the sampling time, the better the identification result. For the MDD method, only sources S6 and S7 fail to be identified, at the sampling interval of 20 s. The reason is that, when the sampling time is short and the sampling interval is large, few concentration data are available for the unknown source and its adjacent sample source.
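The constraint m = τ/Δt > 2 can be checked for every combination of sampling time and sampling interval with a few lines of Python. The names below are illustrative; only the two lists of values and the rule itself come from the text.

```python
SAMPLING_TIMES = [60, 120, 180, 240, 300]     # tau, s
SAMPLING_INTERVALS = [5, 10, 15, 20, 30, 60]  # delta t, s


def sample_count(tau: int, dt: int) -> int:
    """Number of monitoring data points m = tau / dt."""
    return tau // dt


def admissible(tau: int, dt: int) -> bool:
    """A combination is usable only if it yields more than two data points."""
    return sample_count(tau, dt) > 2


# Combinations ruled out by m > 2.
excluded = [(tau, dt)
            for tau in SAMPLING_TIMES
            for dt in SAMPLING_INTERVALS
            if not admissible(tau, dt)]
```

Applied literally, the rule excludes the 30-s interval at the 60-s sampling time and the 60-s interval at the 120-s sampling time (m = 2 in both cases), and also the 60-s interval at the 60-s sampling time, which would leave a single data point.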

Table 3 The effects of sampling interval on the identification results

From the above identification results, it is concluded that the sampling interval affects the MDD method less than the MCC method. Overall, the MDD method performs better than the MCC method in identifying the unknown source location. In addition, the MDD method can identify pollutant source locations with a shorter sampling time than the MCC method and can therefore be used for rapid identification.

The effect of response time of sensor

The response time (RT) is a key performance indicator of a sensor. For a given sampling interval of 10 s, three levels of RT are considered: 3 s (fast), 5 s (mid), and 10 s (slow). Table 4 shows the effects of RT on the identification results of the MCC method. For a fixed sampling time, the longer the RT, the worse the identification result. For a fixed RT, the longer the sampling time, the better the identification result. If the sampling time is long enough (for example, 300 s in our cases), all eight locations of the unknown source are identified accurately for the three RT levels.

Table 4 The effects of RT on the identification results with MCC method

Table 5 shows the effects of RT on the identification results of the MDD method for RT increasing from 3 to 10 s. Even for RT = 3 s, the MDD method cannot identify the unknown source location effectively, and only the sources at S2 and S8 are identified well. As RT increases, the identification results of the MDD method become worse. When RT = 10 s, none of the eight source locations is identified.

Table 5 The effects of RT on the identification results with MDD method

A further study is carried out to find the RT required by the MDD method. Good identification results are obtained when RT is 1 s and the sampling time is more than 240 s. When RT increases to 2 s, the sampling time must be larger than 300 s. This means that a long sampling time helps to obtain an accurate source location.

Comparing the identification results of the two methods, it is concluded that the sensor response time has a stronger influence on the MDD method than on the MCC method. For the same sampling time of 300 s, the maximum RT is 2 s for the MDD method, while the corresponding maximum RT is 10 s for the MCC method. In our cases, the required sampling time and RT are 240 and 5 s for the MCC method and 240 and 1 s for the MDD method, respectively. The results indicate that the MCC method is recommended for a slow-response sensor (RT larger than 1 s), while the MDD method is recommended for a fast-response sensor (RT less than 1 s).

The effect of sensor measurement error

Measurement error always exists in a monitoring sensor, so its effect on the identification results needs to be analyzed. Based on the above analysis, accurate identification requires a sampling time of 240 s, a sampling interval of 10 s, and an RT of 1 s for the MDD method or 5 s for the MCC method. The effect of measurement error on the performance of the two methods is evaluated under these conditions.

The MCC and MDD methods are tested with two levels of Gauss-distributed random errors. The Gauss distribution function is as follows:

$$ \varPhi (x)=\frac{1}{\sqrt{2\pi}}\int_{-\infty}^{x}{e}^{-\frac{t^2}{2}}\, dt,\quad -\infty < x<+\infty $$
(7)

The parameters of the Gauss noise for source S5 are listed in Table 6. Equation (7) is used to produce a set of random numbers, and the random error is calculated by multiplying each random number by the STDEV shown in Table 6, where STDEV is the standard deviation of the Gauss noise. The real measurement results are obtained once the random error is introduced. The mean value of the Gauss-distributed data equals the measured data without noise. δ 1 and δ 2 represent the maximum and average absolute relative errors of the perturbed data, respectively.
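The perturbation procedure described above can be sketched as follows. This is a hedged illustration: `perturb` and `relative_errors` are hypothetical names, the seed and the example values are invented, the STDEV values of Table 6 are not reproduced, and skipping zero readings when forming relative errors is an assumption.

```python
import random
from typing import List, Tuple


def perturb(measured: List[float], stdev: float, seed: int = 0) -> List[float]:
    """Add zero-mean Gauss noise: a standard normal number times STDEV."""
    rng = random.Random(seed)  # fixed seed only to make the sketch repeatable
    return [c + stdev * rng.gauss(0.0, 1.0) for c in measured]


def relative_errors(measured: List[float],
                    perturbed: List[float]) -> Tuple[float, float]:
    """Return (delta_1, delta_2): maximum and average absolute relative error."""
    # Assumption: zero readings are skipped to avoid division by zero.
    rel = [abs(p - c) / abs(c) for c, p in zip(measured, perturbed) if c != 0]
    return max(rel), sum(rel) / len(rel)


# Invented noise-free readings (ppm) and an invented STDEV.
exact = [10.0, 20.0, 30.0]
noisy = perturb(exact, stdev=0.5, seed=1)
delta_1, delta_2 = relative_errors(exact, noisy)
```

The perturbed series `noisy` would then be passed to the MCC or MDD identification in place of the noise-free record α 0.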

Table 6 The parameters of Gauss noise introduced to the monitored data at point S5

Although Gauss-type measurement errors are introduced, the unknown source locations are still identified accurately. Figure 3 shows the effects of the measurement errors on the values of R for all identified source locations. There is almost no difference in R between the low-noise and no-noise cases. In the high-noise case, however, R decreases markedly compared with the no-noise case. Even so, the contaminant source location is still identified; the decrease of R has no strong effect on source position identification by the MCC method.

Fig. 3 The R values for exact measurements and for real measurements with low- and high-level Gauss noise

Table 7 shows the identification results of the MDD method. When the Gauss noise is low, all eight contaminant source locations are identified exactly. When the Gauss noise is high, only the positions of S1 and S2 can be identified, and the I value increases greatly. The reason is that Gauss noise increases the fluctuation of the monitoring data. Because the MDD method calculates the difference between the monitoring data and the sample data, noise in the monitoring data raises the discrete degree between the two. For low Gauss noise, the I value changes little since the change in the monitoring data is minor. For high Gauss noise, large changes in the monitoring data make the discrete degree large, which leads to a high value of I and wrong identification results.

Table 7 Identification results of MDD with respect to different Gauss noises

Comparing the identification results of the two methods, it is found that the MCC method is robust to interference and accurately identifies the location of the unknown source at both the low and high Gauss noise levels considered, whereas the MDD method gives good identification results only when the Gauss noise is low.

Conclusions

The maximum correlation coefficient (MCC) and minimum discrete degree (MDD) methods are developed to identify the location of a contaminant source with a single sensor. The effects of the sampling time, the sampling interval, the sensor response time, and the sensor measurement error on the identification of the source location are analyzed in detail. The main conclusions are as follows:

  • For both methods, the longer the sampling time, the better the identification result. For a fixed sampling time, the identification results of both methods become worse as the sampling interval increases. A long sampling time is necessary for accurate identification if the sensor response time is long.

  • The effects of the sensor response time and measurement error on the accuracy of identification are smaller for the MCC method than for the MDD method. This means that the MCC method is more robust to interference than the MDD method. The MDD method has the advantage of rapid identification; however, it requires a sensor with a short response time and a small measurement error. The MCC method has no such requirement.

  • Both methods can identify the location of an unknown source when the sampling time, sampling interval, and response time are appropriate and the sensor measurement error is not very large.