
Table 1 Classification performance of the tested methods

From: Automatic detection of procedural knowledge in robotic-assisted surgical texts

| Method | Procedural P | Procedural R | Procedural F1 | Non-Procedural P | Non-Procedural R | Non-Procedural F1 | Accuracy (A) | Macro P | Macro R | Macro F1 | Weighted P (wP) | Weighted R (wR) | Weighted F1 (wF1) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| RandomForest | 0.738 | 0.913 | 0.816 | 0.747 | 0.443 | 0.556 | 0.740 | 0.743 | 0.678 | 0.686 | 0.741 | 0.740 | 0.721 |
| MultinomialNaïveBayes | 0.717 | 0.965 | 0.823 | 0.852 | 0.344 | 0.491 | 0.737 | 0.785 | 0.655 | 0.657 | 0.767 | 0.737 | 0.701 |
| LinearSVM | 0.706 | 0.964 | 0.815 | 0.835 | 0.308 | 0.450 | 0.723 | 0.770 | 0.636 | 0.633 | 0.753 | 0.723 | 0.681 |
| LogisticRegression | 0.678 | 0.981 | 0.802 | 0.861 | 0.199 | 0.323 | 0.694 | 0.770 | 0.590 | 0.562 | 0.745 | 0.694 | 0.626 |
| FastText | 0.821 | 0.846 | 0.833 | 0.720 | 0.683 | 0.701 | 0.786 | 0.771 | 0.765 | 0.767 | 0.784 | 0.786 | 0.785 |
| FastText[bal] | 0.824 | 0.846 | 0.835 | 0.722 | 0.689 | 0.705 | 0.788 | 0.773 | 0.767 | 0.770 | 0.786 | 0.788 | 0.787 |
| 1D-CNN | 0.889 | 0.834 | 0.861 | 0.742 | 0.821 | 0.780 | 0.829 | 0.816 | 0.828 | 0.820 | 0.835 | 0.829 | 0.831 |
| 1D-CNN[bal] | 0.881 | 0.851 | 0.866 | 0.758 | 0.803 | 0.780 | 0.833 | 0.819 | 0.827 | 0.823 | 0.836 | 0.833 | 0.834 |
| BiLSTM | 0.894 | 0.896 | 0.895 | 0.820 | 0.817 | 0.818 | 0.867 | 0.857 | 0.856 | 0.857 | 0.867 | 0.867 | 0.867 |
| BiLSTM[bal] | 0.887 | 0.910 | 0.898 | 0.837 | 0.801 | 0.819 | 0.870 | 0.862 | 0.855 | 0.859 | 0.869 | 0.870 | 0.869 |
| BERT | 0.875 | 0.916 | 0.895 | 0.843 | 0.775 | 0.808 | 0.864 | 0.859 | 0.845 | 0.851 | 0.863 | 0.864 | 0.863 |
| BERT[bal] | 0.867 | 0.922 | 0.894 | 0.850 | 0.757 | 0.801 | 0.862 | 0.859 | 0.840 | 0.847 | 0.861 | 0.862 | 0.860 |
| ClinicalBERT | 0.886 | 0.915 | 0.900 | 0.845 | 0.797 | 0.821 | 0.872 | 0.866 | 0.856 | 0.860 | 0.871 | 0.871 | 0.871 |
| ClinicalBERT[bal] | 0.874 | 0.922 | 0.897 | 0.851 | 0.771 | 0.809 | 0.866 | 0.862 | 0.846 | 0.853 | 0.865 | 0.866 | 0.865 |
  1. “[bal]” indicates training on a 50–50 balanced dataset (upsampling)
  2. Bold values indicate the highest values of the Macro-F1 and Weighted-F1 for each category of classification method considered
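
The Macro columns appear to be the unweighted mean of the Procedural and Non-Procedural scores, and each F1 the harmonic mean of its P and R. The Python sketch below checks this reading against two rows of the table. It is an illustrative check, not the authors' evaluation code: the names `TABLE_ROWS`, `f1_score`, and `macro_average` are hypothetical, and the numbers are copied from the BiLSTM and ClinicalBERT rows. The Weighted columns are not reproduced because they depend on the per-class sample counts, which the table does not report.

```python
# Per-class (precision, recall, F1) and reported macro scores,
# copied from the BiLSTM and ClinicalBERT rows of Table 1.
TABLE_ROWS = {
    "BiLSTM": {
        "procedural": (0.894, 0.896, 0.895),
        "non_procedural": (0.820, 0.817, 0.818),
        "macro": (0.857, 0.856, 0.857),
    },
    "ClinicalBERT": {
        "procedural": (0.886, 0.915, 0.900),
        "non_procedural": (0.845, 0.797, 0.821),
        "macro": (0.866, 0.856, 0.860),
    },
}


def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


def macro_average(class_scores):
    """Unweighted mean of each metric across classes.

    class_scores is a list of (P, R, F1) tuples, one per class.
    """
    n = len(class_scores)
    return tuple(sum(column) / n for column in zip(*class_scores))


for method, row in TABLE_ROWS.items():
    macro = macro_average([row["procedural"], row["non_procedural"]])
    formatted = ", ".join(f"{value:.4f}" for value in macro)
    print(f"{method}: computed macro (P, R, F1) = ({formatted}); reported = {row['macro']}")
```

Because the table entries are rounded to three decimals, the recomputed averages can differ from the reported ones by up to about 0.001.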