Introduction

Molecular methods have been commonly used to identify enteric viruses in varied samples derived from environmental water (Ishii et al. 2014) and human feces (Thi Nguyen et al. 2015), but transmission electron microscopy (TEM) is still a standard tool for exhaustive virus detection, in which all virus particles present in a sample can be detected, and thus, it is considered as a catch-all method (Roingeard 2008). The steps of sample preparation and imaging with TEM are faster than those of some other methods, suggesting that TEM has potential applications in high-throughput screening (Doane 1980). However, the subsequent analysis of TEM images remains tedious and labor intensive. A typical analysis involves the assessment of many TEM images having large image sizes, with each image containing only a few virus particles. Image analysis requires highly technical knowledge about virus morphology and TEM. Thus, the manual analysis of TEM images is difficult and time consuming. If the image analysis could be effectively automated, the cost of virus detection using TEM would be reduced, allowing high-throughput analysis. These benefits of automating TEM analysis have created a strong demand for the development of suitable computational detection methods (Schramlová et al. 2010).

However, to date, few computational methods have been developed for virus detection in TEM images. The existing methods can be divided into template-matching and classifier-based approaches. The template-matching approaches, which include the cross-correlation method (Martin et al. 1997) and the modified cross-correlation method (Nicholson and Glaeser 2001), involve the computation of similarities between an input image and a reference image. The classifier-based methods, in contrast, involve the computational learning of the hyper-plane, which discriminates viral particles from nonviral image features, in the space of a local feature such as local binary patterns (Kylberg et al. 2011) or radial density profiles (RDPs) (Sintorn et al. 2004). RDPs, especially, are often employed for the detection and classification of viral particles.

Matuszewski et al. (1997) presented a classifier-based method for the feature extraction and segmentation of TEM images containing adenoviruses and rotaviruses. Matuszewski and Shark (2001) later extended their method for the detection and classification of adenoviruses, astroviruses, caliciviruses, and rotaviruses. In their method, every viral particle is resampled to a fixed size and transformed to a spectral image. The local features comprise the statistics of each spectral ring, and the major principal components are employed as the classification scores in the hierarchical classifiers.

Ong and Chandran (2005) developed a classification method using high-order spectral features, but their method was not equipped with a segmentation step. Sintorn et al. (2004) developed a template correlation matching method based on RDPs describing three capsids for the detection of positively stained human cytomegalovirus particles. They computed correlations to three templates, each of which describes the RDP of one of the three capsids, to obtain a correlation image. Areas with high correlations represent positive detection results.

Ryner et al. (2006) introduced a linear deformation analysis to represent the capsid variations of positively stained human cytomegalovirus. Kylberg et al. (2012) employed radial mean intensities for segmentation of an image. The method of Proença et al. (2013) detects adenoviruses by computing the entropy ratio of two rings surrounding each point of interest. The entropy ratio was also applied for the segmentation of polyomaviruses (Proença et al. 2013).

All the above-mentioned methods are based on handcrafted features that yield many false positives. To reduce the frequency of false positives, many of those methods employ multiple filters, resulting in a complicated methodology. In addition, their performance often becomes substantially worse when the dataset is changed, because the features are too specialized for the benchmarking task used in each study. These problems motivated us to develop an end-to-end machine learning method for the segmentation and detection of viral particles. To this end, this study introduces a fully convolutional neural network (FCN) approach that uses no handcrafted features. This method was created to detect viral particles from TEM images such as that shown in Fig. 1. The most prominent difference between our method and the existing methods for viral particle detection from TEM images is its automatic acquisition of the features of viral particles. Instead of handcrafting features, TEM images and manually annotated reference images were fed to a convolutional neural network to obtain effective discriminative features. Recently, convolutional neural networks have enjoyed tremendous success in a variety of domains, including pure computer science as well as medical and biological fields (Havaei et al. 2017). The joint optimization of the feature extractors and classifier employed in the present study not only eliminates the workload of manually designing features but also has the potential to automatically adapt the features for each new detection task. The objectives of the study were the development of a computational method for virus detection in TEM images and a comparison of its detection performance with existing methods.

Fig. 1

The problem settings of our study. a Input image (grayscale), b ground truth manually created by human experts (binary). The input image contains multiple viral particles, debris, and a noisy background. The contrast of the image is varied at some locations

Materials and Methods

Feline Calicivirus Dataset

Crandell-Rees feline kidney (CRFK) cells were cultured in Dulbecco’s minimal essential medium (DMEM) containing 0.2% (vol/vol) fetal bovine serum, 0.075% NaHCO3, 2 mM l-glutamine, 10 mM nonessential amino acids, 100 mg/ml penicillin, and 100 U/ml streptomycin. Cells were grown to a confluent monolayer at 37 °C with 5% CO2 in a humidifying incubator. Feline calicivirus was propagated in CRFK cells for 2 days at 37 °C, and the supernatant was collected and stored at 20 °C until use.

The dataset in our experiments was composed of 35 TEM images taken from four samples. The number of viral particles in each image ranged from 1 to 28. Each image was annotated by experts to obtain a binary image presenting the ground truth (e.g., Fig. 1b). Four samples containing purified feline calicivirus were negative-stained and imaged using TEM (Hitachi 7600, Hitachi, Japan) with an acceleration voltage of 100 kV and a magnification ratio of 200. The pixel resolution was 0.44 nm. The diameter of feline calicivirus is 35–40 nm, which appears in the TEM image with around 85-pixel diameter. The size of every TEM image used in the experiments was 3294 × 2461 pixels. Each image was resampled for reduction of the computational time for image processing and training of the neural network model.
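The stated pixel diameter follows from dividing the capsid diameter by the pixel size. The arithmetic can be sketched as follows; the function name is illustrative and is not part of the study.

```python
# Convert a physical capsid diameter to pixels at the stated resolution.
PIXEL_SIZE_NM = 0.44  # pixel resolution reported in the text


def diameter_in_pixels(diameter_nm, pixel_size_nm=PIXEL_SIZE_NM):
    """Return the apparent diameter in pixels of an object measured in nm."""
    return diameter_nm / pixel_size_nm


low = diameter_in_pixels(35.0)   # lower bound of the reported diameter
high = diameter_in_pixels(40.0)  # upper bound of the reported diameter
mid = diameter_in_pixels(37.5)   # midpoint of the reported range
print(f"{low:.1f} to {high:.1f} pixels, midpoint {mid:.1f}")
```

The midpoint of the 35–40 nm range rounds to the "around 85-pixel" diameter quoted above. These values refer to the images before resampling.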

Overview of Proposed Convolutional Neural Network

The proposed model is a variant of the FCN (Ciresan et al. 2012, 2013). Like the existing classifier-based methods, the proposed method has two phases: prediction and training. In the prediction phase, a probabilistic map, which is a prediction result, is produced from a given input TEM image. In the training phase, the parameters of the convolutional network are determined from a set of annotated TEM images. The two phases are detailed below.

Prediction Phase

In the classifier-based methods, to predict whether each pixel (x, y) in an image is on a viral particle, features are extracted and fed to a predictor, as mentioned in the previous section. Typically, the features form a vector \(\varvec{z}_{(x,y)} \text{ := }\left[ {z_{1}^{{\left( {x,y} \right)}} , \ldots ,z_{d}^{{\left( {x,y} \right)}} } \right]^{{\text{T}}}\) and, when logistic regression is chosen as the classifier, the prediction is provided through posterior probabilities given by

$$p_{{\left( {x,y} \right)}} \text{ := } \sigma \left( {\mathop \sum \limits_{i = 1}^{d} w_{i} z_{i}^{(x,y)} } \right),$$
(1)

where σ(·) is the sigmoid function defined as σ(u) = 1/(1 + exp(−u)). The d regression coefficients \(w_{1}, \ldots, w_{d}\) in (1) are often determined in advance by maximum likelihood estimation.
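As a minimal sketch of Eq. (1), the posterior for one pixel is the sigmoid of the inner product between the feature vector and the regression coefficients. The feature values and coefficients below are made up for illustration only.

```python
import math


def sigmoid(u):
    """Numerically stable sigmoid, sigma(u) = 1 / (1 + exp(-u))."""
    if u >= 0.0:
        return 1.0 / (1.0 + math.exp(-u))
    e = math.exp(u)
    return e / (1.0 + e)


def posterior(z, w):
    """Eq. (1): posterior probability from features z and coefficients w."""
    if len(z) != len(w):
        raise ValueError("z and w must have the same length d")
    return sigmoid(sum(wi * zi for wi, zi in zip(w, z)))


# Hypothetical d = 3 feature vector and coefficients (not from the study).
z = [0.2, 0.7, 0.1]
w = [1.5, -0.5, 2.0]
p = posterior(z, w)
print(round(p, 4))
```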

In the proposed method, the features of each pixel (x, y) are obtained by convolutional computation in the first two layers of the FCN, as described in Figs. 2a and S1. The features are written in the form of a 7 × 7 × 16 tensor

Fig. 2

a Network structure of the proposed FCN. Given an input image, our model uses three convolutional layers to produce a probability map of viral particles. The first convolutional layer nonlinearly transforms input image \(\varvec{I}\) to an intermediate feature map \(\varvec{Z}_{(x,y)}^{(1)}\) of size 122 × 122 × 64. The second convolutional layer transforms \(\varvec{Z}_{(x,y)}^{(1)}\) to produce a final feature map \(\varvec{Z}_{(x,y)}^{(2)}\) of size 120 × 120 × 16. From \(\varvec{Z}_{(x,y)}^{(2)}\), the last layer produces a probabilistic map \(\varvec{P}\) of size 114 × 114. b The size of each filter. The filters were designed in this study so that the final prediction for a pixel in the input image depends on a 15 × 15 subimage that covers the target particle and the surrounding region in the input image

$$\varvec{Z}_{(x,y)}^{(2)} \text{ := } \left\{ {Z_{x + i,y + j,k}^{\left( 2 \right)} \in \left[ {0, 1} \right] \big| i,j = - 3, \ldots , + 3, \;k = 1, \ldots ,16} \right\}.$$
(2)

The posterior probability of a pixel at (x, y) being on a viral particle is defined as a sigmoid value of a convolution:

$$p_{{\left( {x,y} \right)}} \text{ := }\sigma \left( {\mathop \sum \limits_{k = 1}^{16} \mathop \sum \limits_{i = - 3}^{3} \mathop \sum \limits_{j = - 3}^{3} W_{i,j,k}^{(3)} Z_{x + i,y + j,k}^{(2)} } \right).$$
(3)

The posterior model is thus parameterized with a 7 × 7 × 16 tensor

$$\varvec{W}^{(3)} = \left\{ {W_{i,j,k}^{(3)} \in {\mathbb{R}} \big| i,j = - 3, \ldots , + 3, \;k = 1, \ldots ,16} \right\}.$$
(4)

It can be seen that flattening the two tensors, \(\varvec{Z}_{(x,y)}^{(2)}\) and \(\varvec{W}^{(3)}\), immediately reduces (3) to the standard form of posterior probability (1). Then, it can be seen that the final prediction for each pixel in the input image depends on a 15 × 15 subimage (Fig. 2b).
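Both observations can be checked numerically. The sketch below uses random stand-in values for \(\varvec{W}^{(3)}\) and \(\varvec{Z}_{(x,y)}^{(2)}\) and assumes stride-1 convolutions without padding or pooling, which is consistent with the layer sizes in Fig. 2. The helper names are illustrative.

```python
import random


def logit_tensor(W3, Z2):
    """Triple sum inside Eq. (3); tensors are indexed [k][j][i]."""
    return sum(W3[k][j][i] * Z2[k][j][i]
               for k in range(len(W3))
               for j in range(len(W3[0]))
               for i in range(len(W3[0][0])))


def flatten(tensor):
    """Flatten a [k][j][i] tensor into a vector of length d."""
    return [v for plane in tensor for row in plane for v in row]


def logit_flat(w, z):
    """Single sum inside Eq. (1)."""
    return sum(wi * zi for wi, zi in zip(w, z))


def receptive_field(filter_sizes):
    """Input window width seen by one output pixel (stride 1, no dilation)."""
    return 1 + sum(k - 1 for k in filter_sizes)


rng = random.Random(0)
# Random stand-ins for the 7 x 7 x 16 tensors W^(3) and Z^(2).
W3 = [[[rng.uniform(-1, 1) for _ in range(7)] for _ in range(7)] for _ in range(16)]
Z2 = [[[rng.random() for _ in range(7)] for _ in range(7)] for _ in range(16)]

a = logit_tensor(W3, Z2)
b = logit_flat(flatten(W3), flatten(Z2))
d = len(flatten(W3))
print(d, abs(a - b) < 1e-9, receptive_field([7, 3, 7]))  # 784 True 15
```

The flattened vectors have d = 7 × 7 × 16 = 784 entries, and the three filter widths 7, 3, and 7 give a receptive field of 1 + 6 + 2 + 6 = 15 pixels.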

Given a gray-scaled 128 × 128 input image

$$I\text{ := }\left\{ {I_{x,y} \in \left[ {0, 1} \right]| x = 1, \ldots ,128,\; y = 1, \ldots ,128} \right\},$$
(5)

the feature map \(\varvec{Z}_{(x,y)}^{(2)}\) is computed in two stages. In the first stage, an intermediate feature map, which is a 3 × 3 × 64 tensor

$$\varvec{Z}_{{(x^{\prime}, y^{\prime})}}^{(1)} \text{ := }\left\{ {Z_{{x^{\prime} + i,y^{\prime} + j,h}}^{\left( 1 \right)} \in \left[ {0, 1} \right]| i,j = - 1,0, + 1,\;h = 1, \ldots ,64} \right\} ,$$
(6)

is computed. Each entry in the intermediate feature map is given by the sigmoid value of a convolution:

$$Z_{{x^{\prime},y^{\prime},h}}^{(1)} \text{ := } \sigma \left( {\mathop \sum \limits_{i = - 3}^{ + 3} \mathop \sum \limits_{j = - 3}^{ + 3} W_{i,j}^{{\left( {1,h} \right)}} I_{{x^{\prime} + i,y^{\prime} + j}} } \right).$$
(7)

Similarly, the final feature map \(\varvec{Z}_{(x,y)}^{(2)}\) is then defined using \(\varvec{Z}_{{(x^{\prime},y^{\prime})}}^{(1)}\) as

$$Z_{{x^{\prime\prime},y^{\prime\prime},k}}^{(2)} \text{ := }\sigma \left( {\mathop \sum \limits_{h = 1}^{64} \mathop \sum \limits_{i = - 1}^{ + 1} \mathop \sum \limits_{j = - 1}^{ + 1} W_{i,j,h}^{{\left( {2,k} \right)}} Z_{{x^{\prime\prime} + i, y^{\prime\prime} + j,h}}^{\left( 1 \right)} } \right).$$
(8)

Note that the final output \(p_{(x,y)}\) and the feature maps, \(\varvec{Z}_{{(x^{\prime},y^{\prime})}}^{(1)}\) and \(\varvec{Z}_{{(x^{\prime\prime},y^{\prime\prime})}}^{(2)}\), are defined using the multidimensional discrete convolution (see Fig. S1).
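The three stages in Eqs. (7), (8), and (3) can be sketched in plain Python as below. This is an illustrative sketch rather than the Keras implementation used in the study: the filters are random, the unit counts are reduced from 64 and 16 to 4 and 2 so that the demonstration runs quickly, and the filter windows are indexed from their top-left corner rather than their center, which only shifts the output coordinates.

```python
import math
import random


def sigmoid(u):
    """Numerically stable sigmoid."""
    if u >= 0.0:
        return 1.0 / (1.0 + math.exp(-u))
    e = math.exp(u)
    return e / (1.0 + e)


def conv_layer(inputs, filters):
    """Stride-1 convolution without padding, followed by a sigmoid.

    inputs:  list of C_in channels, each an H x W list of lists.
    filters: list of C_out filters, each a list of C_in k x k kernels.
    Returns C_out channels of size (H - k + 1) x (W - k + 1).
    """
    height, width = len(inputs[0]), len(inputs[0][0])
    k = len(filters[0][0])
    outputs = []
    for filt in filters:
        channel = []
        for y in range(height - k + 1):
            row = []
            for x in range(width - k + 1):
                u = 0.0
                for c, kernel in enumerate(filt):
                    for j in range(k):
                        for i in range(k):
                            u += kernel[j][i] * inputs[c][y + j][x + i]
                row.append(sigmoid(u))
            channel.append(row)
        outputs.append(channel)
    return outputs


def random_filters(c_out, c_in, k, rng):
    """Random filters indexed [c_out][c_in][k][k] (stand-ins for trained ones)."""
    return [[[[rng.uniform(-0.1, 0.1) for _ in range(k)] for _ in range(k)]
             for _ in range(c_in)] for _ in range(c_out)]


def fcn_forward(image, theta):
    """Map a grayscale image to a probability map with three layers."""
    z1 = conv_layer([image], theta[0])  # Eq. (7): 7 x 7 filters
    z2 = conv_layer(z1, theta[1])       # Eq. (8): 3 x 3 filters
    p = conv_layer(z2, theta[2])        # Eq. (3): 7 x 7 filter
    return p[0]


rng = random.Random(0)
theta = (random_filters(4, 1, 7, rng),
         random_filters(2, 4, 3, rng),
         random_filters(1, 2, 7, rng))
image = [[rng.random() for _ in range(20)] for _ in range(20)]
prob_map = fcn_forward(image, theta)
print(len(prob_map), len(prob_map[0]))  # 6 6
```

A 20 × 20 input shrinks by 14 pixels in each direction, in the same way that the 128 × 128 input of the study yields a 114 × 114 map.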

In the literature on convolutional networks (Ciresan et al. 2012), the parameters for the intermediate features, that is, for h = 1, …, 64,

$$\varvec{W}^{(1,h)} \text{ := }\big\{W_{i,j}^{{\left( {1,h} \right)}} \in {\mathbb{R}} \big| i,j = - 3, \ldots , + 3\big\}$$
(9)

and the parameters for the final features, that is, for k = 1, …, 16,

$$\varvec{W}^{(2,k)} \text{ := }\big\{W_{i,j,h}^{{\left( {2,k} \right)}} \in {\mathbb{R}} \big| i,j = - 1,0, + 1, h = 1, \ldots ,64\big\}$$
(10)

as well as the regression coefficients \(\varvec{W}^{(3)}\) are simply called filters, without distinguishing between the classifier and the feature extractors. Denoting the set of filters by

$${\varvec{\uptheta}}\text{ := }\left( {\varvec{W}^{{\left( {1,1} \right)}} , \ldots ,\varvec{W}^{{\left( {1,64} \right)}} ,\varvec{W}^{{\left( {2,1} \right)}} , \ldots ,\varvec{W}^{{\left( {2,16} \right)}} ,\varvec{W}^{\left( 3 \right)} } \right),$$
(11)

the posterior probabilities can be represented as \(p_{{\left( {x,y} \right)}} = p_{{\left( {x,y} \right)}} (\varvec{I}; {\varvec{\uptheta}})\) to clarify the dependencies on the input image \(\varvec{I}\) as well as the set of filters θ.

To obtain the posterior probabilities \(p_{(x,y)}\) in the whole input image, \(\varvec{Z}_{{(x^{\prime},y^{\prime})}}^{(1)}\) are computed for all \((x^{\prime},y^{\prime})\); \(\varvec{Z}_{(x,y)}^{(2)}\) are then computed using \(\varvec{Z}_{{(x^{\prime},y^{\prime})}}^{(1)}\); and, finally, \(p_{(x,y)}\) are computed from \(\varvec{Z}_{(x,y)}^{(2)}\). Owing to the nature of convolution, the posteriors at pixels near an edge cannot be computed. Such pixels are not analyzed in this study, and the resultant set of the posteriors, which has been referred to as the probabilistic map, is a 114 × 114 gray-scaled image:

$$\varvec{P} \text{ := } \left\{ {p_{(x,y)} \in \left[ {0, 1} \right] \big| x,y = 8, \ldots ,121} \right\}.$$
(12)

Pixels were predicted to be located within viral particles when the probabilities were over 50% in the probabilistic map of the output layer in the proposed convolutional network. This method shall be referred to as FCN hereinafter. We also considered the use of an additional step that removes particles with a radius of less than 2.5 pixels in the binarized probabilistic map to clean up small debris and noise artifacts. The FCN method including this additional step is denoted as FCN+.
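A minimal sketch of the binarization and the FCN+ size filter is given below. The text does not state how the radius of a detected region is measured; this sketch assumes 4-connected components and the radius of a circle with the same area, both of which are assumptions made for illustration.

```python
import math
from collections import deque


def binarize(prob_map, threshold=0.5):
    """Mark pixels whose posterior probability exceeds the threshold."""
    return [[1 if p > threshold else 0 for p in row] for row in prob_map]


def remove_small_particles(mask, min_radius=2.5):
    """Drop 4-connected components whose equivalent radius is below min_radius.

    The equivalent radius sqrt(area / pi) is an assumed definition.
    """
    h, w = len(mask), len(mask[0])
    seen = [[False] * w for _ in range(h)]
    cleaned = [[0] * w for _ in range(h)]
    for y0 in range(h):
        for x0 in range(w):
            if mask[y0][x0] == 0 or seen[y0][x0]:
                continue
            # Breadth-first search collects one connected component.
            queue = deque([(y0, x0)])
            seen[y0][x0] = True
            component = []
            while queue:
                y, x = queue.popleft()
                component.append((y, x))
                for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    ny, nx = y + dy, x + dx
                    if (0 <= ny < h and 0 <= nx < w
                            and mask[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            if math.sqrt(len(component) / math.pi) >= min_radius:
                for y, x in component:
                    cleaned[y][x] = 1
    return cleaned


# Toy probability map: one 36-pixel blob and one 4-pixel blob.
prob_map = [[0.1] * 12 for _ in range(12)]
for y in range(1, 7):
    for x in range(1, 7):
        prob_map[y][x] = 0.9
for y in range(9, 11):
    for x in range(9, 11):
        prob_map[y][x] = 0.8

fcn_mask = binarize(prob_map)                     # FCN
fcn_plus_mask = remove_small_particles(fcn_mask)  # FCN+
print(sum(map(sum, fcn_mask)), sum(map(sum, fcn_plus_mask)))  # 40 36
```

In this toy example the 4-pixel blob has an equivalent radius of about 1.1 pixels and is removed, whereas the 36-pixel blob is kept.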

Training Phase

To determine the value of filters θ, a set of N annotated images, \(\varvec{I}_{1} , \ldots ,\varvec{I}_{N} \in [0, 1]^{128 \times 128}\), was used and maximum likelihood estimation (MLE) was adopted. In the MLE approach, the negative log-likelihood function

$$E\left( {\varvec{\uptheta}} \right)\text{ := } - \mathop \sum \limits_{n = 1}^{N} \left( {\mathop \sum \limits_{{(x,y) \in {\mathcal{S}}_{n}^{ + } }} \log p_{(x,y)} (\varvec{I}_{n} ; {\varvec{\uptheta}}) + \mathop \sum \limits_{{(x,y) \in {\mathcal{S}}_{n}^{ - } }} \log (1 - p_{(x,y)} (\varvec{I}_{n} ; {\varvec{\uptheta}}))} \right)$$
(13)

is minimized with respect to θ, where the sets of positive and negative pixel locations in the n-th annotated image \(\varvec{I}_{n}\) for n = 1, …, N are denoted by \({\mathcal{S}}_{n}^{ + }\) and \({\mathcal{S}}_{n}^{ - }\), respectively. Positive pixels are on a viral particle and negative pixels are not. For the minimization, a stochastic gradient descent method was used. Recall that filters θ contain the model parameters of classification as well as those of feature extraction. Hence, finding the optimal filters θ is equivalent to acquiring the features and learning the classifier in a unified framework.
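The objective in Eq. (13) can be evaluated directly from probability maps and binary annotations, as in the sketch below. The clipping constant `eps` is an addition for numerical safety and is not part of Eq. (13); the toy values are made up.

```python
import math


def negative_log_likelihood(prob_maps, label_maps, eps=1e-12):
    """Eq. (13): -log p over positive pixels plus -log(1 - p) over negatives."""
    total = 0.0
    for probs, labels in zip(prob_maps, label_maps):
        for prob_row, label_row in zip(probs, labels):
            for p, t in zip(prob_row, label_row):
                p = min(max(p, eps), 1.0 - eps)  # avoid log(0)
                total -= math.log(p) if t == 1 else math.log(1.0 - p)
    return total


# One hypothetical 2 x 2 probability map with its annotation (1 = on a particle).
probs = [[[0.9, 0.2], [0.6, 0.1]]]
labels = [[[1, 0], [1, 0]]]
loss = negative_log_likelihood(probs, labels)
print(round(loss, 4))
```

Lower values indicate that the predicted probabilities agree better with the annotation.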

The structures in each acquired TEM image, such as viral particles, debris, noise, and backgrounds, have no preferred orientation. To avoid overfitting, we therefore enriched our original datasets by adding the other seven sets of annotated images, which were obtained by rotating the original images by −π/2, π, and π/2. These were included in the original dataset. The proposed neural network was implemented with Keras, a Python deep learning library, using the TensorFlow backend. A computer with an Intel® Core™ i7 central processing unit, an NVIDIA TITAN X graphics processing unit (GPU; 12 GB), and 32 GB of random access memory was used for training and testing. The step size was adjusted using the RMSProp method with a learning rate of 0.001, ρ = 0.9, ε = \(10^{-8}\), and no decay.
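For reference, the RMSProp update with the stated hyperparameters can be sketched as follows. This is a generic form of the update rule, not the Keras source code; in particular, adding ε outside the square root is an assumption, and the quadratic objective is a toy stand-in for Eq. (13).

```python
import math


def rmsprop_step(theta, grad, cache, lr=0.001, rho=0.9, eps=1e-8):
    """One RMSProp update; returns the new parameters and running averages."""
    new_theta, new_cache = [], []
    for t, g, c in zip(theta, grad, cache):
        c = rho * c + (1.0 - rho) * g * g  # running mean of squared gradients
        new_cache.append(c)
        new_theta.append(t - lr * g / (math.sqrt(c) + eps))
    return new_theta, new_cache


# Toy objective E(theta) = sum(theta_i ** 2), whose gradient is 2 * theta_i.
theta = [1.0, -2.0]
cache = [0.0, 0.0]
for _ in range(5000):
    grad = [2.0 * t for t in theta]
    theta, cache = rmsprop_step(theta, grad, cache)
print([round(t, 3) for t in theta])
```

Because each gradient is divided by its recent root mean square, the step length stays close to the learning rate regardless of the gradient scale.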

Existing Methods

The three existing methods, the cross-point method (CPM) (Martin et al. 1997), RDP (Sintorn et al. 2004; Kylberg et al. 2011), and spectral rings (SR) (Matuszewski and Shark 2001), were implemented using three Python libraries: Scikit-learn (Pedregosa et al. 2011), Scikit-image (van der Walt et al. 2014), and OpenCV (Bradski 2000), respectively. In the CPM (Martin et al. 1997), a circular window with a 15-pixel diameter was employed, and the normalized correlations to the template were computed at every location by sliding the window over the input image. The template was the average of the circular images with a viral particle at the center of the window. In the RDP method (Sintorn et al. 2004; Kylberg et al. 2011), RDP features were extracted from every viral particle in the training images using the radius set K′ = {0, 1, …, 7}, and these features were used as positive examples for machine learning. Negative examples were picked from randomly chosen locations, and the number of negative examples was set equal to that of the positive examples. A linear support vector machine was employed to obtain the classifier of RDP. The implementation of the SR method (Matuszewski and Shark 2001) was similar to that of the RDP method. The difference lay in the feature extraction: the SR features were extracted from a 15 × 15 pixel discrete Fourier transform image. For the existing methods, the score images were binarized using a threshold to obtain the predictions. The threshold for the existing methods was adapted to each testing image so that the highest detection performance was achieved, whereas the threshold of the proposed method was fixed to 0.5 as mentioned above.
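To illustrate the kind of feature used in the RDP baseline, the sketch below computes a radial profile over the radius set {0, 1, …, 7} from a 15 × 15 window. Assigning each pixel to the nearest integer radius and averaging the intensities is an assumed binning rule; the cited RDP papers may define the profile differently.

```python
import math


def radial_density_profile(window, radii=range(8)):
    """Mean intensity of the pixels at each rounded distance from the center."""
    size = len(window)
    centre = (size - 1) / 2.0
    sums = {r: 0.0 for r in radii}
    counts = {r: 0 for r in radii}
    for y in range(size):
        for x in range(size):
            r = int(round(math.hypot(x - centre, y - centre)))
            if r in sums:  # pixels beyond the largest radius are ignored
                sums[r] += window[y][x]
                counts[r] += 1
    return [sums[r] / counts[r] if counts[r] else 0.0 for r in radii]


# Synthetic 15 x 15 window containing a bright ring of radius 5.
window = [[1.0 if int(round(math.hypot(x - 7, y - 7))) == 5 else 0.0
           for x in range(15)] for y in range(15)]
profile = radial_density_profile(window)
print(profile)  # [0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0]
```

The resulting eight-dimensional vector is the kind of input that a linear classifier such as a support vector machine would receive.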

Evaluation Metrics

For the quantitative evaluation of detection performance, we computed three widely used performance measures: precision, recall, and F-score. The three performance measures depend on the definitions of true positives, false positives, and false negatives. True positives are identified as correctly detected viral particles; false positives are wrongly detected particles, and false negatives are the particles that could not be detected. Precision is the ratio of true positives to all detected particles, recall is the ratio of true positives to the true viral particles, and F-score is the harmonic mean of the precision and recall. Of 35 TEM images in the dataset, 18 were randomly picked for training and the rest were used for testing. The two proposed new methods and three existing methods were examined with three different divisions for training and testing, yielding three sets of precision, recall, and F-score for each method.
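The three measures can be computed from particle counts as sketched below. The counts are hypothetical, and the rule for matching detected regions to annotated particles is not specified in the text, so the counts are taken as given.

```python
def detection_scores(true_positives, false_positives, false_negatives):
    """Return (precision, recall, F-score) from particle counts."""
    detected = true_positives + false_positives  # all detected particles
    actual = true_positives + false_negatives    # all true viral particles
    precision = true_positives / detected if detected else 0.0
    recall = true_positives / actual if actual else 0.0
    if precision + recall == 0.0:
        return precision, recall, 0.0
    f_score = 2.0 * precision * recall / (precision + recall)
    return precision, recall, f_score


# Hypothetical counts: 18 correct detections, 2 false alarms, 6 misses.
precision, recall, f_score = detection_scores(18, 2, 6)
print(round(precision, 3), round(recall, 3), round(f_score, 3))
```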

Results and Discussion

Computational experiments were conducted to evaluate the detection performance against a dataset of TEM images containing feline calicivirus particles. The proposed neural network was compared with three existing methods: CPM (Martin et al. 1997), RDP (Sintorn et al. 2004; Kylberg et al. 2012), and SR (Matuszewski and Shark 2001).

Table 1 shows the averages and standard deviations of the values obtained for precision, recall, and F-score over three repetitions for each method. The proposed new methods achieved better recall values than did any existing method, meaning that the new methods produced fewer false negatives. In the context of viral particle detection, achieving fewer false negatives is preferable, and false positives are permissible, because exhaustive detection is important to avoid overlooking an infection. FCN+ achieved zero false positives, yet the degradation of the recall was small. It is also worth noting that the F-score of FCN+ was 0.999, demonstrating a near-perfect detection performance. The detection performance depends on the filter size and the network structure. To examine the effects of the filter size in the proposed convolutional network, the filter size of the first layer was varied among 3 × 3, 5 × 5, 7 × 7, 9 × 9, and 11 × 11 pixels. The resultant F-scores of FCN and FCN+ are presented in Fig. S2. The performance of FCN improved monotonically with increasing filter size. However, larger filter sizes narrow the region that can be analyzed because of the nature of convolutional computation. FCN+ with the 7 × 7 filter achieved the best detection performance.
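The narrowing of the analyzed region with larger first-layer filters can be quantified with a short sketch. It assumes that only the first-layer filter is changed, that the later filters stay at 3 × 3 and 7 × 7, and that all convolutions use stride 1 without padding.

```python
def probability_map_size(first_filter, input_size=128, later_filters=(3, 7)):
    """Side length of the probability map for a given first-layer filter width."""
    size = input_size - (first_filter - 1)
    for k in later_filters:
        size -= k - 1
    return size


sizes = {k: probability_map_size(k) for k in (3, 5, 7, 9, 11)}
for k, size in sizes.items():
    print(f"{k} x {k} filter -> {size} x {size} map")
```

Under these assumptions, each increase of two pixels in the filter width removes one pixel from every border of the probability map.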

Table 1 Detection performance in TEM images

In addition, the detection performance for the different numbers of units shown in Table S1 was examined. The highest performance was achieved when 64 and 16 units were employed in the first and second layers, respectively. This tendency for deeper layers to have fewer units reflects the nature of the convolutional network, which decomposes an input image into many primitive features and integrates them into high-level features in deeper layers. Typical deeper neural networks (LeCun et al. 1998; Wan et al. 2013; Krizhevsky et al. 2012) have several layers to decompose an input image into many primitives, whereas the proposed network uses only one layer for transformation to many primitives due to the shallow structure.

The probabilistic map of the input image depicted in Fig. 1a is exhibited in Fig. 3a. For the three existing methods, namely CPM (Martin et al. 1997), RDP (Sintorn et al. 2004; Kylberg et al. 2012), and SR (Matuszewski and Shark 2001), the score maps of the same input image are shown in Fig. 3b–d. As shown in Fig. 3a, the probabilistic map produced from FCN had a high contrast between the viral particles and the background. For CPM (Martin et al. 1997), the scores in the background were also suppressed, yielding few false positives, whereas the additional size filter step in FCN+ completely excluded small debris from the probabilistic map.

Fig. 3

Comparison of the detection results. a Proposed method, b cross-point method (Martin et al. 1997), c radial density profiles (Sintorn et al. 2004; Kylberg et al. 2012), d spectral rings (Matuszewski and Shark 2001)

Figure S3(a) shows five examples of TEM images containing viral particles, and the ground truth is shown in Fig. S3(b). The probabilistic map of FCN and the score maps of CPM, RDP, and SR are shown in Fig. S3(c)–(f), respectively. A single viral particle is depicted in the first row of Fig. S3. Because of its weak response, RDP could not detect the viral particle in that image. Although the response of SR was blurred, the intensities were high, leading to successful detection. The second row of Fig. S3 shows an example of several viral particles located densely in a small area. SR recognized the cluster of particles as a single object due to its poor spatial resolution. In the score maps of CPM and RDP, each viral particle appears clearly and was detected separately. In the example shown in the third row, the contrast changed locally. The deteriorated contrast yielded a low score for the virus particle by CPM, leading to a failure of detection. SR showed large responses for the viral particles; unfortunately, SR perceived these particles as being connected and could not detect them separately. Two examples of debris are shown in the last two rows of Fig. S3. Every existing method wrongly assigned a high score to the debris, and some of them incorrectly detected it as a viral particle; only the proposed methods did not produce a false positive. The artifacts in the fifth row have a shape similar to that of a true virus particle, leading the proposed FCN method to a wrong detection. However, the additional size filter step in FCN+ successfully excluded the debris.

To train deep neural networks with a huge dataset, high-performance GPUs with large amounts of memory and high computational ability are necessary. In our experiments, only 18 TEM images were available for training the convolutional network. Thanks to the simple shape of the viruses used for testing the methods, an inexpensive GPU and a short computational time were sufficient to train our convolutional network. Overfitting, which is often caused by a small training dataset, was avoided by adopting a shallow structure. The shallow structure has additional preferable properties. Our network is free from the vanishing gradient problem, which is a well-known concern in deep structures. In some recent deep networks, a rectified linear unit (ReLU) has often been employed instead of the sigmoid function to address the vanishing gradient problem. The ReLU function is not needed in our convolutional network because the network structure is shallow. It is also worth touching on two well-known techniques that have been incorporated into network structures: dropout (Wan et al. 2013; Hinton et al. 2012; Srivastava et al. 2014) and batch normalization (Ioffe and Szegedy 2015). Deeper neural networks are prone to overfitting, which can be addressed using dropout. However, dropout is not necessary for convolutional networks because they are not overly expressive compared to regular neural networks with a similar structure. It was shown that even for convolutional networks, dropout works well in fully connected layers (Wan et al. 2013; Krizhevsky et al. 2012), and Wu and Gu (2015) suggested introducing dropout to the max-pooling layers in convolutional networks. Nevertheless, those facts did not suffice to motivate us to introduce dropout into our method because our convolutional network is shallow and has no fully connected layer or max-pooling layer. Batch normalization (Ioffe and Szegedy 2015) is not attractive enough to be introduced into the current network structure either: it does not always work well and sometimes makes the generalization performance worse and the computational burden heavier. The effects of dropout and batch normalization were also investigated in the detection task to compare them with the classical stochastic gradient optimization. The results are summarized in Table S2, which suggests that dropout and batch normalization offer no improvements for our task.

The improvement in method performance is attributable to the automation of the feature design process, which represents a revolution in technology for virus detection from TEM images. In contrast, all existing techniques for this task resort to handcrafted features followed by several postprocessing steps for discriminating or filtering false positives. Our proposed approach is based on supervised learning, which requires a set of input images and their segmented labels. Therefore, this method is limited to the detection of already known viruses. However, once annotated datasets are available, the proposed method can detect virus particles accurately by learning the subtle differences between particles and the variations in the background.

In the future, we will face a more challenging detection task: the detection of multiple types of viruses in complex backgrounds. We are aware that many cases of detection of viral particles have complex backgrounds, such as a part of a specimen or debris. In such cases, it may be hard to detect viral particles accurately with shallow neural network architectures and simple pipelines. Since the majority of foodborne viruses are nonenveloped viruses with an icosahedral shape and similar diameter, which include Enterovirus, Hepatovirus, Norovirus, and Sapovirus (Bosch et al. 2008), we conjecture that exploitation of the spherical surface texture may improve the detection of such nonenveloped viruses. We believe that this paper is an important step for promoting future development and performance evaluation of virus detection methods in food and in the environment.