1 Introduction

Aquatic animal diseases pose significant challenges to the aquaculture industry, causing economic losses and compromising animal health. Therefore, precise prevention and control techniques are crucial. Extracting key events and information related to disease prevention is essential for developing effective strategies [1]. This study aims to enhance the accuracy and efficiency of event extraction using capsule network methodologies, supporting disease prevention and control efforts in aquaculture.

Event extraction is a technique used to automatically retrieve event-related information from natural language texts. It encompasses various approaches, including rule-based methods, machine learning-based methods, and deep learning-based methods [2]. Rule-based methods for event identification rely on manual rules and patterns, demanding significant manual effort and domain knowledge. Additionally, distinct rules are required for each event type, limiting their versatility. Machine learning-based methods, on the other hand, utilize algorithms to train models that can automatically recognize events. These methods necessitate annotated data for training but can adapt to various event types. Deep learning-based methods further leverage deep neural networks for automatic event information extraction from texts. These methods eliminate the need for manual feature engineering and can handle large-scale datasets, resulting in significant advancements in event extraction. Common deep learning models used include recurrent neural networks [3], long short-term memory networks [4], convolutional neural networks [5], among others. These models have proven effective in capturing complex relationships and dependencies within textual data, improving the accuracy and efficiency of event extraction.

Capsule networks have emerged as a prominent deep learning architecture in recent years, initially introduced by Hinton et al. in 2011 [6]. By incorporating dynamic routing, capsule networks offer advantages over traditional convolutional neural networks (CNNs) by considering the relationships between entities or parts in a more holistic manner. This allows capsule networks to capture richer spatial semantics and provide more comprehensive representations of the input data [7].

Capsule networks have shown promising results in various natural language processing tasks in recent years. Researchers have explored their potential in text classification, relation extraction, sentiment analysis, and other areas. For instance, Li Ranran et al. [8] introduced a hybrid model called GRU-ATT-Capsule, which incorporates capsule networks to capture the relationship features between local and global information in text, improving the performance of text classification tasks. Wu Renbiao et al. [9] combined local and global features and leveraged capsule vectors to extract deep-level sentiment features, enhancing sentiment analysis for medium and long Weibo posts. In the domain of fine-grained sentiment classification, Jiang Tao et al. [10] addressed the limitations of shallow convolutional methods for sentiment feature extraction by proposing an improved capsule network-based approach, which achieved better performance in fine-grained sentiment classification tasks. Additionally, Dong Zhe et al. [11] proposed a relation extraction model that combines adversarial training and capsule networks. These studies highlight the potential of capsule networks in enhancing various natural language processing tasks and provide insights into their effectiveness in capturing complex relationships and features within textual data. Building on these advances, this work applies capsule networks to event extraction.

Deep learning methods based on pre-trained models have also been widely used in the field of aquaculture. The BiLSTM + Attention + CRF named entity recognition model proposed by Cheng et al. [12] can effectively utilize contextual structural features, avoid the problem of semantic dilution, and achieves good recognition performance for fishery standard named entity recognition. Yang et al. [13] proposed an entity relationship extraction method based on a dual attention mechanism to address the poor results caused by overlapping relationships in the fishery standard entity relationship extraction task. Fu et al. [14] proposed a pseudo-event data augmentation method to improve the performance of event extraction tasks. Building on these efforts, this study continues the in-depth exploration of this field, adjusting and refining its research focus to obtain a more comprehensive understanding.

In summary, this study further explores the realm of aquatic animal disease prevention and control, introducing a deep learning-based approach, BTCapMB, for extracting events pertaining to this crucial domain. The proposed method integrates capsule networks to enhance the extraction of pertinent information and capture the spatial semantic relationships among diverse event entities. By harnessing the strengths of capsule networks, BTCapMB strives to enhance the overall efficacy of event extraction in this field.

2 Materials and methods

2.1 Related work

2.1.1 Data collection and preprocessing

In this study, web scraping techniques were utilized to collect and build a dataset known as DLOU-FZ, which encompasses approximately 300,000 characters of data pertaining to the prevention and control of aquatic animal diseases. The dataset was compiled by extracting information from diverse sources, including websites dedicated to aquatic animal diseases, books focusing on aquatic animal disease prevention and control, and Baidu Baike. It encompasses a wide range of prevention and control strategies for different types of aquatic animal diseases.

To maintain the semantic integrity of the lengthy texts within the DLOU-FZ dataset, a segmentation process was implemented to divide the texts into shorter segments while preserving the original formatting. Segmentation was performed on punctuation marks such as commas, enumeration commas, semicolons, colons, periods, and hyphens. Additionally, regular expressions were employed to eliminate irrelevant characters such as emoticons and ellipses from the corpus of aquatic animal diseases. Subsequently, the corpus was formatted to one character per line using the UltraEdit text editor, which facilitated further data processing and standardization. As an illustration, Table 1 demonstrates the control events associated with grass carp hemorrhagic disease.
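As a rough illustration of this pipeline, the following Python sketch splits a text on the listed punctuation marks, strips ellipses and emoticon-like debris with regular expressions, and emits one character per line. The exact punctuation set and cleaning rules are assumptions, not the precise rules used on DLOU-FZ:

```python
import re

# Assumed segmentation punctuation: commas, enumeration commas, semicolons,
# colons, periods, and hyphens/dashes.
SPLIT_PUNCT = r"[，、；：。,;:.\-—]"

def clean(text: str) -> str:
    """Remove irrelevant characters such as ellipses and simple emoticons."""
    text = re.sub(r"…+|\.{3,}", "", text)        # ellipses
    text = re.sub(r"[:;]-?[)(DP]", "", text)     # crude emoticon pattern (assumption)
    return text

def segment(text: str) -> list[str]:
    """Split a long text into short segments on punctuation."""
    return [s for s in re.split(SPLIT_PUNCT, clean(text)) if s.strip()]

# One character per line, mirroring the UltraEdit formatting step.
for seg in segment("全池泼洒二氧化氯，连用3天；同时内服维生素……"):
    for ch in seg:
        print(ch)
```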

Table 1 Grass carp hemorrhagic disease control event example

2.1.2 Labeling process

In the DLOU-FZ dataset, the BIO character-level annotation method is employed. The labels are defined as follows:

  • B indicates the beginning of an entity.

  • I indicates positions inside an entity.

  • O indicates positions outside of any entity.

For disease entities, the label “H” is used, so that “B-H” marks the beginning and “I-H” the non-initial positions of a disease entity. For trigger word recognition, the categories “YF” and “ZL” are defined for prevention and treatment events, respectively. The annotation of arguments in aquatic animal disease prevention and control events includes the following:

  • Dosage: labeled as “Y”.

  • Administration method: labeled as “F”.

  • Treatment duration: labeled as “T”.

  • Treatment frequency: labeled as “P”.

The annotation of argument roles for medications uses the pinyin initials of the first two characters of each category name as labels. The categories include:

  • Environmental modifiers: labeled as “HJ”.

  • Disinfectants: labeled as “XD”.

  • Antimicrobial agents: labeled as “KW”.

  • Insecticides and acaricides: labeled as “SC”.

  • Metabolic enhancers and tonics: labeled as “DX”.

  • Traditional Chinese medicine: labeled as “ZC”.

  • Biological products: labeled as “SW”.

  • Auxiliary drugs: labeled as “FZ”.

The definitions of trigger words and argument tags of events can be found in Table 2.

Table 2 Trigger words and argument label definitions
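To make the labeling scheme concrete, the following is a hypothetical annotation of an invented sentence; the sentence, spans, and tags are illustrative only and not taken from DLOU-FZ. Each tag combines the position marker (B/I/O) with a category label, e.g., B-H/I-H for a disease entity, B-XD/I-XD for a disinfectant argument, and B-ZL/I-ZL for a treatment trigger word:

```python
# Hypothetical BIO annotation, one character per line as in the corpus.
sentence = list("草鱼出血病用二氧化氯治疗")
tags = ["B-H", "I-H", "I-H", "I-H", "I-H",   # disease entity (grass carp hemorrhagic disease)
        "O",                                  # function word, outside any entity
        "B-XD", "I-XD", "I-XD", "I-XD",       # disinfectant argument (chlorine dioxide)
        "B-ZL", "I-ZL"]                       # treatment trigger word

for char, tag in zip(sentence, tags):
    print(f"{char}\t{tag}")
```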

After analyzing the events in the DLOU-FZ corpus and consulting experts in the aquatic field, it was found that aquatic animal disease control events are chiefly concerned with the correct use of disease-related drugs. Following this consultation, this paper categorizes the trigger words of aquatic animal disease control events into two kinds: prevention of aquatic animal diseases and treatment of aquatic animal diseases. Event arguments were classified into eight categories: environmental modifiers, disinfectants, antimicrobial agents, insecticides and acaricides, metabolic enhancers and tonics, traditional Chinese medicine, biological products, and auxiliary drugs. The argument roles include drug dosage, administration method, duration of administration, and frequency of administration. The event argument role categories, their meanings, and functions are provided in Table 3.

Table 3 Event argument role and function

Figure 1 is an example of the extraction and labeling of grass carp hemorrhagic disease prevention and control events.

Fig. 1 Grass carp hemorrhagic disease label example

2.1.3 BTCapMB event extraction model

To address the issue of recognizing long-tail event entities in the extraction of aquatic animal disease prevention and control events, the BTCapMB event extraction method is proposed. The method involves the following specific steps:

  1. Multi-model joint feature extraction: The initial features of the text are extracted using the pre-trained BERT model [15]. These features are then passed through the Text CNN [16] model to further extract local features of the text. The integration of BERT and Text CNN captures both contextual information and local patterns in the text. Additionally, to model the spatial semantic relationships between event entities, the method incorporates a capsule neural network, which performs spatial semantic information extraction.

  2. Improved BiLSTM [17] model: The Multi-BiLSTM model is employed to model the features extracted by the upper-layer network. It uses a layer-by-layer decreasing number of hidden nodes to learn features at different dimensions, enhancing the representation learning of the extracted features and capturing dependencies between different components of the events.

The framework of the BTCapMB model is depicted in Fig. 2, illustrating the integration of BERT, Text CNN, and the Multi-BiLSTM model for effective event extraction in aquatic animal disease prevention and control tasks.

Fig. 2 Model framework structure

2.1.4 Initial feature extraction layer

In this study, the BERT Chinese pre-trained model is employed to model the initial features of the DLOU-FZ dataset. Each character is represented by a 768-dimensional vector, extracted using the modeling approach shown in Eq. (1).

$$sentence = \begin{cases} w_{k} = BERT(w_{i}) \\ w_{1} \cup w_{2} \cup \cdots \cup w_{L} : L \times 768 \\ D_{w_{k}} = 768 \end{cases}$$
(1)

In the equation, BERT(w_i) represents the initialization of the current sentence using the BERT pre-trained model; the dimension of each initialized vector, D_{w_k}, is 768. L denotes the length of the sentence, and w_k represents the word vectors. The operator ∪ indicates the vertical concatenation of all word vectors in the current sentence, yielding an L × 768 matrix that maps the input sentence.
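A minimal sketch of this step using the Hugging Face transformers library is shown below; the checkpoint name and tokenizer settings are assumptions rather than the authors' exact configuration:

```python
import torch
from transformers import BertModel, BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-chinese")
bert = BertModel.from_pretrained("bert-base-chinese")

sentence = "草鱼出血病"  # "grass carp hemorrhagic disease"
inputs = tokenizer(sentence, return_tensors="pt", add_special_tokens=False)
with torch.no_grad():
    outputs = bert(**inputs)

# Stack the per-character vectors into the L x 768 matrix of Eq. (1).
features = outputs.last_hidden_state.squeeze(0)
print(features.shape)  # torch.Size([5, 768])
```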

The Text CNN model further extracts local features from the text representations learned by BERT through convolution. Equations (2) and (3) represent this process.

$$V_{i} = f\left( W_{c} \cdot h_{i:i+m-1} + b \right)$$
(2)
$$V = \left[ V_{1}, V_{2}, \ldots, V_{n} \right]$$
(3)

In the formula, \(W_{c}\) is the weight matrix, \(m\) is the width of the convolution window, \(h_{i:i+m-1}\) is the matrix formed by the \(m\) word vectors starting at the i-th position of the word vector matrix, \(b\) is the bias term, and \(f\) is the Sigmoid activation function. \(V_{i}\) is the convolutional feature value at the i-th position, and \(V\) is the set of convolutional features.
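A PyTorch sketch of this convolution is given below; the filter count and window width m are illustrative assumptions:

```python
import torch
import torch.nn as nn

class TextCNN(nn.Module):
    """Slides a window of m word vectors over the L x 768 matrix (Eqs. 2-3)."""
    def __init__(self, embed_dim=768, num_filters=256, m=3):
        super().__init__()
        self.conv = nn.Conv1d(embed_dim, num_filters, kernel_size=m,
                              padding=m // 2)          # W_c and b in Eq. (2)
        self.act = nn.Sigmoid()                        # f in Eq. (2)

    def forward(self, x):              # x: (batch, L, 768)
        x = x.transpose(1, 2)          # Conv1d expects (batch, channels, L)
        v = self.act(self.conv(x))     # V_i = f(W_c · h_{i:i+m-1} + b)
        return v.transpose(1, 2)       # V = [V_1, ..., V_n]

print(TextCNN()(torch.randn(2, 10, 768)).shape)  # torch.Size([2, 10, 256])
```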

2.1.5 Multi-BiLSTM layer

The multi-BiLSTM model is used to extract semantic features of different dimensions from the text of aquatic animal disease prevention and control events. By stacking multiple layers of BiLSTM, it achieves dimension reduction of high-dimensional features and captures more effective low-dimensional feature vectors. The final layer of BiLSTM outputs a matrix of low-dimensional features by vertically stacking the output vectors of each cell. The feature extraction process is represented by Eq. (4).

$$output_{k} = \begin{cases} current \xleftarrow{L \times 768} g\left( \theta_{k}, hdcell_{k} \mid output_{k-1} \right) \\ output_{0} = g\left( \theta_{0}, hdcell_{0} \mid sentence \right) \end{cases}$$
(4)

Here, θ_k denotes the parameters of the k-th layer network, and g(θ_k, hdcell_k | ·) is the formal representation of that layer, where hdcell_k is the number of hidden nodes in the k-th BiLSTM layer and current is the output of the current layer. The output of the last BiLSTM layer is used as the text feature, denoted output_n, where n is the number of BiLSTM layers.
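The sketch below implements this layer-by-layer scheme in PyTorch; the specific hidden sizes are assumptions chosen only to illustrate the decreasing-node design:

```python
import torch
import torch.nn as nn

class MultiBiLSTM(nn.Module):
    """Stacked BiLSTMs with layer-by-layer decreasing hidden nodes (Eq. 4)."""
    def __init__(self, input_dim=768, hidden_sizes=(384, 192, 96)):
        super().__init__()
        layers, in_dim = [], input_dim
        for hdcell in hidden_sizes:                    # hdcell_k per layer
            layers.append(nn.LSTM(in_dim, hdcell, batch_first=True,
                                  bidirectional=True))
            in_dim = 2 * hdcell                        # bidirectional output width
        self.layers = nn.ModuleList(layers)

    def forward(self, sentence):       # sentence: (batch, L, input_dim)
        out = sentence                 # output_0 is computed from the sentence
        for lstm in self.layers:       # output_k = g(θ_k, hdcell_k | output_{k-1})
            out, _ = lstm(out)
        return out                     # output_n, the low-dimensional text feature

print(MultiBiLSTM()(torch.randn(2, 10, 768)).shape)  # torch.Size([2, 10, 192])
```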

2.1.6 Capsule network layer

The Capsule Neural Network is utilized to capture the spatial semantic relationships and local features among different event entities within the current batch. The feature modeling of the input sentences by the Capsule Network is represented by Eq. (5).

$$feature\_cap = \begin{cases} BasicConv \xleftarrow{d = B \times Steps \times 768} CapsB\left( \theta_{caps-1}, \theta_{BERT} \mid sentence \right) \\ Dig\left( \theta_{caps-2}, r \mid BasicConv \right) \end{cases}$$
(5)

In the equation, CapsB represents the formal representation of the basic convolution layer. \(\theta_{caps-1}, \theta_{caps-2}, \theta_{BERT}\) denote the network parameters of the capsule layers and the BERT encoder. Dig represents the formal representation of the routing layer, and r indicates the number of routing iterations.
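A compact sketch of the two stages follows: a basic convolution produces primary capsules (CapsB), and a dynamic routing layer (Dig) runs r agreement iterations. Capsule counts and dimensions are assumptions, and the transformation matrix is shared across input capsules so variable-length sentences can be handled:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def squash(s, dim=-1):
    """Capsule squashing non-linearity: short vectors shrink toward zero."""
    norm2 = (s ** 2).sum(dim=dim, keepdim=True)
    return (norm2 / (1.0 + norm2)) * s / torch.sqrt(norm2 + 1e-8)

class PrimaryCaps(nn.Module):
    """Basic convolutional capsules (CapsB in Eq. 5) over BERT features."""
    def __init__(self, in_dim=768, caps=32, caps_dim=8):
        super().__init__()
        self.caps_dim = caps_dim
        self.conv = nn.Conv1d(in_dim, caps * caps_dim, kernel_size=3, padding=1)

    def forward(self, x):                        # x: (batch, L, in_dim)
        y = self.conv(x.transpose(1, 2)).transpose(1, 2)
        return squash(y.reshape(x.size(0), -1, self.caps_dim))

class Routing(nn.Module):
    """Dynamic routing layer (Dig in Eq. 5) with r iterations."""
    def __init__(self, in_dim=8, out_caps=16, out_dim=16, r=3):
        super().__init__()
        self.r = r
        self.W = nn.Parameter(0.01 * torch.randn(out_caps, in_dim, out_dim))

    def forward(self, u):                        # u: (batch, N, in_dim)
        u_hat = torch.einsum("bni,oid->bnod", u, self.W)  # per-capsule predictions
        b = torch.zeros(u.shape[0], u.shape[1], u_hat.shape[2], device=u.device)
        for _ in range(self.r):                  # r routing iterations
            c = F.softmax(b, dim=-1)             # coupling coefficients
            v = squash((c.unsqueeze(-1) * u_hat).sum(dim=1))
            b = b + torch.einsum("bnod,bod->bno", u_hat, v)  # agreement update
        return v                                 # (batch, out_caps, out_dim)

caps = Routing()(PrimaryCaps()(torch.randn(2, 10, 768)))
print(caps.shape)  # torch.Size([2, 16, 16])
```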

2.1.7 Feature fusion layer

Due to the utilization of two parallel networks in this approach, the features of the original input text are first extracted separately using multi-BiLSTM and the improved Capsule Neural Network. Subsequently, the features obtained from both networks are fused together. The feature fusion process is represented by Eq. (6).

$$feature \xleftarrow{\dim_{cap} + \dim_{BiLSTM}} \alpha \times output_{n} \oplus \beta \times feature\_cap$$
(6)

In the equation, the output of the last Multi-BiLSTM layer and the features extracted by the capsule network are combined by weighted concatenation, giving a fused feature of dimension dim_cap + dim_BiLSTM. The weights \(\alpha\) and \(\beta\) represent the contributions of the features extracted by the two branches to the extraction of event entities.
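A minimal sketch of the fusion step follows; how the sentence-level capsule features are aligned with the token-level BiLSTM outputs, and the values of α and β, are assumptions:

```python
import torch

def fuse_features(output_n, feature_cap, alpha=0.5, beta=0.5):
    # output_n:    (batch, L, dim_BiLSTM), last Multi-BiLSTM layer output
    # feature_cap: (batch, dim_cap), flattened capsule features
    # Broadcast the capsule vector over tokens, then concatenate along the
    # feature axis so the fused width is dim_cap + dim_BiLSTM (Eq. 6).
    cap = feature_cap.unsqueeze(1).expand(-1, output_n.size(1), -1)
    return torch.cat([alpha * output_n, beta * cap], dim=-1)

fused = fuse_features(torch.randn(2, 10, 192), torch.randn(2, 256))
print(fused.shape)  # torch.Size([2, 10, 448])
```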

2.1.8 Experiment and result analysis

The experiments were conducted using the DLOU-FZ dataset, and several parameters were set to evaluate the performance of the BTCapMB model. These parameters, collected in the configuration sketch after the list, included:

  • Word embeddings: BERT pre-trained word embeddings were utilized, with a dimensionality of 768. Word embeddings are representations of words in a continuous vector space, capturing semantic relationships between words.

  • Batch size: A batch size of 64 was used during the training process. Batch size refers to the number of samples processed in one iteration of training.

  • Optimizer: The Adam optimizer was employed for model optimization. Adam is a popular optimization algorithm that combines the benefits of both AdaGrad and RMSProp.

  • BiLSTM cell size: The BiLSTM cell size was set to 768. BiLSTM (Bidirectional Long Short-Term Memory) is a type of recurrent neural network that processes sequential data in both forward and backward directions, capturing dependencies in both directions.
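The settings above can be gathered into a small configuration sketch; any value not stated in the text (e.g., the learning rate) is a placeholder assumption:

```python
import torch
import torch.nn as nn

config = {
    "embedding_dim": 768,      # BERT pre-trained word embeddings
    "batch_size": 64,
    "bilstm_cell_size": 768,
    "learning_rate": 1e-5,     # assumption: not reported in the text
}

# Stand-in module, only to show how the Adam optimizer is attached.
model = nn.LSTM(config["embedding_dim"], config["bilstm_cell_size"],
                batch_first=True, bidirectional=True)
optimizer = torch.optim.Adam(model.parameters(), lr=config["learning_rate"])
```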

The evaluation of the BTCapMB model was conducted utilizing standard metrics, namely Precision, Recall, and F1 score. Precision quantifies the proportion of correctly predicted positive samples, while Recall calculates the fraction of actual positive samples accurately identified. The F1 score, as the harmonic mean of Precision and Recall, offers a balanced metric for assessing the model's overall performance. This comprehensive evaluation framework ensures a rigorous and scientific assessment of the BTCapMB model's capabilities in extracting aquatic animal disease prevention and control events.
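For reference, a minimal sketch of these metrics computed over predicted and gold entity spans is shown below; the spans are invented for illustration:

```python
def precision_recall_f1(predicted, gold):
    """Entity-level metrics over sets of (start, end, label) spans."""
    tp = len(predicted & gold)                 # correctly predicted positives
    precision = tp / len(predicted) if predicted else 0.0
    recall = tp / len(gold) if gold else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1

# Hypothetical spans: one matching disease entity, one mislabeled argument.
pred = {(0, 5, "H"), (7, 9, "Y")}
gold = {(0, 5, "H"), (7, 9, "F")}
print(precision_recall_f1(pred, gold))  # (0.5, 0.5, 0.5)
```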

By analyzing these experimental parameters and evaluation metrics, the effectiveness of the BTCapMB model in extracting aquatic animal disease prevention and control events can be assessed.

2.1.9 BTCapMB comparison experiments

The experimental design for evaluating the proposed BTCapMB event extraction method includes the comparison of multiple models. Here is a summary of the experimental design:

  1. BERT model: The BERT model is used as a baseline for comparison. BERT is a pre-trained language model that has shown excellent performance in various natural language processing tasks.

  2. Text CNN model: The Text CNN model is included as another baseline. Text CNN is known for its ability to capture local features from text using convolutional neural networks.

  3. BiLSTM model: The BiLSTM model, a bidirectional recurrent neural network, is utilized for its effectiveness in capturing long-range dependencies in text. In this study, an improved version, Multi-BiLSTM, is employed.

  4. BTCapMB model: The proposed BTCapMB model, which integrates the capsule network module, is the main focus of the experiments. The capsule network is expected to capture spatial semantic information and perform feature fusion, potentially enhancing the extraction of aquatic animal disease prevention and control events.

By comparing the performance of these models, the effectiveness of the BTCapMB model can be evaluated. The experimental results, including Precision, Recall, and F1 score, are presented and compared in Table 4, providing insights into the performance of each model in extracting aquatic animal disease prevention and control events.

Table 4 BTCapMB model comparison experiment results

The comparison results presented in Table 4 demonstrate that the proposed BTCapMB model surpasses the other models in extracting aquatic animal disease prevention and control events. With precision, recall, and F1 scores of 75.09%, 76.59%, and 75.83%, respectively, the proposed model exhibits the highest overall performance. Notably, the F1 score of the BTCapMB model is up to 11.19% higher than that of the comparison models, indicating its superiority in event extraction. Furthermore, the experiment utilizing the Text CNN model for capturing local features demonstrates superior performance compared to the CNN model, underscoring the effectiveness of Text CNN in this particular research task.

The experimental results firmly support the conclusion that the BTCapMB model, which integrates the BERT, Text CNN, Multi-BiLSTM, and CapsNet models, effectively captures comprehensive text representations and spatial semantic information of event entities within the DLOU-FZ dataset. This integrated approach significantly boosts the performance of extracting aquatic animal disease prevention and control events. Consequently, the research establishes that the proposed BTCapMB model serves as an efficacious method for extracting such events in the realm of aquatic animal disease prevention and control.

To provide a more intuitive demonstration of the performance of our research model on the extraction of aquatic animal disease prevention and control events, we utilized a bar chart for visual representation. Model names were abbreviated using the initials of each model. Figure 3 displays the bar chart comparing the experimental results of the BTCapMB model with other models.

Fig. 3 Histogram of BTCapMB model comparison experiment results

Based on Fig. 3, it is evident that the BTCapMB model outperforms other comparative models in terms of precision, recall, and F1 score, thereby providing further evidence of the effectiveness of this approach. The visual representation of the experimental results allows for a clearer depiction of the outstanding performance of the proposed model in the task of extracting aquatic animal disease prevention and control events.

2.1.10 BTCapMB ablation experiments

To validate the necessity of each module in the proposed BTCapMB model framework, ablation experiments were conducted on specific functional modules, including Text CNN, Multi-BiLSTM, and CapsNet. The results of these ablation experiments are presented in Table 5.

Table 5 BTCapMB model ablation experiment results

Based on the findings presented in Table 5, it is evident that the removal of each module individually from the proposed BTCapMB model results in reduced precision, recall, and F1 scores compared to the BTCapMB model itself. This observation underscores the BTCapMB model's effectiveness in enhancing the extraction of aquaculture disease prevention and treatment events. By incorporating spatial semantic information extraction and local feature extraction among event entities, along with integrating features from other models, the BTCapMB model successfully addresses the challenge of recognizing long-tail entities, thereby enhancing the overall extraction performance.

To offer a more intuitive visualization of our research model’s performance in extracting aquatic animal disease prevention and control events from textual data, we utilized bar charts for illustrative purposes, as presented in Fig. 4.

Fig. 4 Histogram of BTCapMB model ablation experiment results

After comparing the various models, it is evident that the proposed BTCapMB model exhibits exceptional performance in extracting events related to aquatic animal disease prevention and control, with superior precision, comprehensive recall, and a higher F1 score. The BTCapMB model's strength lies in its integration of diverse deep learning models, each with its own characteristics and capabilities: the BERT model learns the semantic information of events from contextual cues, the Text CNN model captures local features, the CapsNet model handles the spatial semantic information of events, and the Multi-BiLSTM model models event sequences. This multi-model fusion strategy enables the BTCapMB model to capitalize on the strengths of each individual model, effectively tackling the challenge of recognizing long-tail entity instances in the extraction of aquatic animal disease prevention and control events and thereby enhancing the overall performance of event extraction. The proposed BTCapMB model therefore represents an effective and robust approach for event extraction, providing valuable support and guidance for disease prevention and control in the field of aquaculture.

3 Conclusion

The integration of capsule networks within the BTCapMB framework proposed in this paper provides a unique perspective for event extraction. Capsule networks are known for their ability to encode and represent hierarchical relationships between entities, and are well suited to capturing the complex patterns associated with long-tail entities in aquaculture. This not only improves the accuracy of event extraction, but also enhances the interpretability of the results, making them more valuable to practitioners in the field. The practical importance of this research stems from its potential to transform disease prevention and treatment strategies in aquaculture. By precisely extracting relevant events from large amounts of data, the BTCapMB framework can help identify disease outbreaks in a timely manner so proactive measures can be taken. This could significantly reduce economic losses and antibiotic use, promoting sustainable aquaculture practices. However, it is important to acknowledge that, despite the promising results of the BTCapMB approach, it is not without challenges. While reliance on multi-model fusion provides performance benefits, it also introduces training and deployment complexities. Future research should strive to achieve a harmonious balance between performance and practicality, exploring ways to optimize the model while maintaining its effectiveness.

In conclusion, the BTCapMB event extraction method is a valuable addition to the arsenal of aquaculture disease prevention and treatment tools. Its practical significance lies in its ability to accurately extract long-tail entities, enabling smarter and more timely decisions in this critical area. With continued research and development, it has great potential to play a key role in promoting the sustainable development of the aquaculture industry.