Earthquake catalogs record hypocenter information such as the earthquake origin time, location, and magnitude (M). These extremely important datasets are used in many areas of seismology and disaster mitigation, including for understanding the physics of earthquakes from their fault geometry and nucleation processes, monitoring interplate slip, and statistically estimating the probability of foreshocks and aftershocks (Arrowsmith et al. 2022). Performing these analyses at high resolution requires many hypocenters.

The earthquake catalog in Japan (hereafter, the JMA unified catalog) has been compiled by the Japan Meteorological Agency (JMA) in cooperation with the Ministry of Education, Culture, Sports, Science and Technology through visual inspection of seismic waveforms by experts. Tamaribuchi et al. (2016) and Tamaribuchi (2018) developed the phase combination forward search (PF) method to automatically determine hypocenters for the JMA unified catalog. The PF method, which can efficiently determine hypocenters even during high seismicity, was put into operation in JMA’s seismic monitoring system on 1 April 2016. The monitoring system achieved remarkable results during the 2016 Kumamoto earthquake sequence that started on 14 April 2016, automatically determining around 70,000 hypocenters over two months, and it has contributed to real-time seismic monitoring.

In retrospect, however, many small earthquakes remain undetected because of limited human resources. In particular, the 2011 Mw 9.0 off the Pacific coast of Tohoku Earthquake (hereafter, the 2011 Tohoku earthquake) caused many earthquakes over a wide area (Hirose et al. 2011). The magnitude of completeness (Mc, i.e., the minimum magnitude at which earthquakes are completely detected, which depends on the location of the stations and the seismicity) along the Tohoku coast is Mc ≈ 1 under normal conditions, but increased to Mc ≥ 2–3 after the 2011 Tohoku earthquake (Tamaribuchi 2018). Therefore, the details of the widespread microseismicity following the 2011 Tohoku earthquake remain unclear.

Tamaribuchi and Nakagawa (2020) attempted to apply the PF method to seismic waveforms during March 2011 to detect small earthquakes after the 2011 Tohoku earthquake. They detected twice as many earthquakes as recorded in the conventional JMA unified catalog, but up to 10% of the automatically determined hypocenters were false detections due to noise. Of course, the rate of false detections depends on the noise environment and the seismicity. Furthermore, correct phase picking of P- and S-waves is important because falsely picked phases generally require additional computational time for hypocenter estimation.

Traditionally, many statistical approaches have been used for seismic wave detection and phase picking (Allen 1978; Baillard et al. 2013; Nakamula et al. 2007; Saragiotis et al. 2002; Yokota et al. 1981). However, statistical approaches can also detect non-stationary noise that needs to be eliminated after phase picking, which requires feature engineering. Later, Gibbons and Ringdal (2006) developed a template-matching technique that uses the seismic waveform itself as a template to search for similar waveforms. For example, Ross et al. (2019a) used template-matching to detect 1.81 million earthquakes in Southern California from 2008 to 2017, successfully lowering Mc from 1.7 to 0.3. Machine learning, including deep learning, has also been applied to seismology, and many deep learning approaches have been proposed for phase picking (Feng et al. 2022; Mousavi et al. 2020; Ross et al. 2018a, b; Yano et al. 2021; Zhu and Beroza 2018). Machine learning methods have also advanced fundamental techniques required for earthquake monitoring, such as phase association, which compiles the phase picks for each earthquake (Ross et al. 2019b; Zhang et al. 2019; Zhu et al. 2022). These methods are similar to the PF method in that they search for the best combination from a large number of phase picks, and each has been successful. For a summary of applications of machine learning to earthquake monitoring, see the review by Mousavi and Beroza (2023).

Deep learning approaches have also been applied retrospectively to earthquake catalogs. Tan et al. (2021) applied deep learning phase picking and phase association to 1-year waveforms following the 2016 Amatrice earthquake in Italy to determine ~ 900,000 earthquakes, which lowered Mc from 2.2 to 0.5 (Beroza et al. 2021). Yoon et al. (2023) also used an automated workflow based on deep learning to detect ~ 180,000 earthquakes in Southwestern Puerto Rico from late 2019 to 2023, successfully adding ~ 7 times as many events as the conventional catalog.

Using state-of-the-art technology to reanalyze earthquakes previously undetected by experts is extremely important for understanding microseismicity. High-resolution catalogs can provide insights into important physical processes such as dynamic and static earthquake triggering and nucleation processes. For more information on big-data-driven advances in seismology, see, for example, the review by Arrowsmith et al. (2022).

Understanding how seismicity changed after the 2011 Tohoku earthquake is an extremely important and urgent issue when considering large earthquake processes and the risks of aftershocks and induced earthquakes along the Nankai Trough and the Kuril Trench, which are feared to occur in the near future. Therefore, we here incorporated machine learning into the PF method to further improve the accuracy of automatic hypocenter determinations. The objective of this study was to clarify the more detailed seismicity after the 2011 Tohoku earthquake by detecting hidden earthquakes using a fully automated process.

Data and methods


For our analysis, we used seismic waveforms from around 1,400 stations throughout Japan (Fig. 1) and recorded during the year from 00:00 on 1 March 2011 to 24:00 on 29 February 2012 (all times in this article are reported in Japan Standard Time, JST = UTC + 9 h). Most of the seismometers were high-sensitivity, three-component velocity meters sampling at 100 Hz, which were also used for the JMA unified catalog. After the 11 March 2011 Tohoku earthquake, the widespread power outage in the Tohoku region rendered the observation network within an epicentral distance of about 400 km temporarily unavailable. The period of unavailability depends on the station. At 14:00 on 11 March, just before the Tohoku earthquake, the number of available stations was 1374, while at 17:00 on 11 March, it was 1129, with more than 200 stations temporarily unavailable. The number of available stations recovered to 1181 at 17:00 on 12 March, 1255 at 17:00 on 13 March, and 1300 at 17:00 on 14 March. The number of available stations is shown in Fig. 2a.

Fig. 1
figure 1

Tectonic setting. Triangles indicate the seismic observation network; stations in light blue were available at 00:00 JST on 1 March 2011, and stations overlaid by dark blue are those that were still available at 17:00 on 11 March, when much of the seismic network was offline. The red star indicates the epicenter of the Mw 9.0 Tohoku earthquake, and the red contour indicates the area of > 10 m of coseismic slip according to Yoshida et al. (2011). Bold gray arrows indicate the movement of the Pacific Plate and the Philippine Sea Plate relative to the Amur Plate (DeMets et al. 2010)

Fig. 2
figure 2

Number of automatically determined hypocenters per day during the study period. a Number of stations available, and b number of events for each label. Labels K and S are quality labels from the JMA unified catalog and labels A–D are quality labels from this study (see “Quality control label classifier”); only labels K, S, A, and B are included in our merged catalog

Method overview

Our analytical procedure is illustrated in Fig. 3. The automatic hypocenter determination (Fig. 3a) consisted of (1) phase picking, (2) phase association, and (3) hypocenter calculation. (1) Phase picking was performed using conventional phase pickers such as the Autoregressive-Akaike’s Information Criterion (AR-AIC) method (Yokota et al. 1981). We then used a convolutional neural network (CNN) to discriminate picks based on the waveforms around the arrival times (hereafter phase classifier; see “Phase classifier” section). (2) Phase association was used to search for the optimal combination of phase picks using the importance sampling algorithm. For details on phase picking and phase association in the PF method, see Tamaribuchi (2018) and Tamaribuchi et al. (2021). (3) Hypocenters were then calculated using the same method used for the JMA unified catalog (Ueno et al. 2002; Hamada et al. 1983).

Fig. 3
figure 3

Flowcharts of our automatic hypocenter determination and catalog merging procedures. a Hypocenter determination. b Procedure for creating the merged catalog

The automatically determined hypocenters were further processed for (4) blast removal, (5) quality control, and (6) comparison with the JMA unified catalog, and then they were merged with the JMA unified catalog (Fig. 3b). (4) Because the automatically determined hypocenters included blasts, the blasts were identified by waveform correlation and removed (“Removal of blasts” section). (5) For quality control, each hypocenter was assigned a quality label (A–D, in order of decreasing quality) based on various hypocentral and phase picking features (e.g., hypocentral error, the number of phase picks) using a light gradient-boosting machine (LightGBM) algorithm (“Quality control label classifier” section). (6) Finally, we compared our catalog of automatically determined hypocenters with the JMA unified catalog, and added the unmatched automatically determined hypocenters to the JMA unified catalog to create a merged catalog (“Creation of the merged catalog” section).

Phase classifier

After phase picking, we fed 4-s-duration (i.e., 2 s before and after the P and S picks) three-component seismic waveforms to a convolutional neural network (CNN) to obtain the probabilities that each pick was a P-wave, S-wave, or noise using the generalized phase detection model architecture of Ross et al. (2018b). The specific model used in this study was trained on approximately 900,000 waveforms by Kudo et al. (2023). The waveforms were high-pass-filtered at 2 Hz; if any of the P-wave, S-wave, or noise probabilities were above the threshold probability (pth = 0.9, see following paragraphs), the other phase types detected within ± 2 s were removed. That is, if the P-wave probability was greater than 0.9, the S-wave phase picks within ± 2 s were discarded. Similarly, if the S-wave probability was greater than 0.9, the P-wave phase picks within ± 2 s were discarded, and if the noise probability was greater than 0.9, the P- and S-wave phase picks within ± 2 s were discarded. This process is called the phase classifier. Although the CNN can be used for phase picking, as originally proposed by Ross et al. (2018b), a preliminary study by Kudo et al. (2023) on Japanese continuous waveforms showed an increase in both false negatives and false positives compared to the conventional picker. Therefore, we used the CNN as the phase classifier in this study.

To evaluate the performance of the phase classifier under normal and swarm conditions, we applied different pth values to 12-h waveforms under normal (00:00–12:00 on 1 March 2011) and swarm conditions (12:00–24:00 on 11 March 2011), respectively. Figure 4a–d shows an example over a 3-min period under swarm conditions. The waveforms at station ODAWA2 near the epicenter (Fig. 4b) show that when pth = 1.0 (i.e., without the phase classifier), the P- and S- waves were incorrectly read simultaneously upon arrival of the P-wave, whereas when pth = 0.9, the P- and S-waves were appropriately read separately. In the case of the conventional picker without the phase classifier, the AR-AIC method simply detected discontinuities in the seismic waveform, which was insufficient to classify P- and S-waves (Tamaribuchi 2018). Figure 4c, d show the same comparison for the other stations as shown in Fig. 4a.

Fig. 4
figure 4

Example of the phase classifier. a Station distribution (blue triangles). Green stars are epicenters from the JMA unified catalog during 23:52–23:55 on 11 March 2011. b The waveform of the unfiltered vertical component at station ODAWA2. Symbols plotted above the waveform are pick times when pth = 1.0, and those below are pick times when pth = 0.9. Circles and crosses are P- and S-wave picks, respectively. c, d P- and S-phase picks when pth = 1.0 and 0.9, respectively. The dashed lines indicate the latitude of station OADAWA2. e, f Total number of picks and precision (i.e., the percentage of automatically determined hypocenters that coincide with the JMA unified catalog) during 00:00–12:00 on 1 March 2011 and 12:00–24:00 on 11 March 2011, respectively

We also compared our automatically determined hypocenters with those in the JMA unified catalog. Figure 4e, f show the number of phase picks and the precision under normal and swarm conditions, respectively, where the precision is the ratio of the number of automatically determined hypocenters that matched the JMA unified catalog to the total number of automatically determined hypocenters. With decreasing pth under both conditions, the number of detections decreased significantly, but the precision increased until pth = 0.90. This result indicates that the phase classifier removed noise and discriminated P- and S-waves adequately. Because the highest precision was obtained at pth = 0.90 under both normal and swarm conditions, we set to pth = 0.90. Compared to pth = 1.0 under both conditions, the number of automatically determined hypocenters that matched the JMA unified catalog remained almost the same, but the total number of P- and S-wave picks was reduced by about 30%. The reason for the low precision (Fig. 4e, f, up to ~ 40%) is that the automatically determined hypocenters contained many true microearthquakes that are not included in the JMA unified catalog (discussed in “Results” section).

Removal of blasts

In general, automatically determined hypocenters include blast events. Therefore, we performed a waveform correlation analysis using 1655 blast waveforms collected by JMA during 2017–2021 as templates to remove events with similar waveforms. First, we extracted automatically determined hypocenters within 5 km of the known blast sites (Fig. 5a). We calculated cross-correlation coefficients between the blast templates and the automatically determined hypocenters, and hypocenters with correlation coefficients ≥ 0.5 at three or more stations were considered blast candidates. A 2–8 Hz bandpass filter was applied to the waveforms, the waveform window for calculating the correlation coefficient was 3 s, and the search range for the maximum correlation coefficient was within ± 2 s of the P- and S-wave arrival times.

Fig. 5
figure 5

Distribution of blast templates. a Blue circles indicate blasts, and red crosses indicate possible earthquakes that were detected by cross-correlation but are not considered to be blasts. # is the template number. b Histogram of events with high cross-correlation with template #107; these were considered blast events in central Oita Prefecture and removed because the activity was concentrated during morning working hours. c Histogram of events with high cross-correlation with template #1598 in Aizu, Fukushima Prefecture; these events were not removed from the catalog because they did not vary with time of day and because a nearby earthquake swarm was detected

Although the extracted blast candidates were highly correlated with the blast templates, they sometimes contained true earthquake clusters. Therefore, we plotted a time–frequency histogram of blast candidates for each template; blast candidates observed throughout the day and night were considered to be true seismicity, whereas those observed only during the day were considered to be artificial blasts and were removed from the automatically determined hypocenters. Examples of temporally biased and unbiased candidates are shown in Fig. 5b, c, respectively. For example, template index #107 was a blast in central Oita Prefecture and was removed because the activity was concentrated during morning working hours. In contrast, template index #1598 was not actually a blast in Aizu, Fukushima Prefecture, and was kept because it detected a true earthquake cluster occurring in the vicinity of a blast template site.

Overall, 16,371 of 927,899 automatically determined hypocenters from March 2011 to February 2012 were determined to be blasts and removed, leaving 911,528 hypocenters for further processing. Based on our time–frequency analysis, 22,633 events were highly correlated with the blast templates, but were determined to be earthquake clusters.

Quality control label classifier

After removing blasts, we further eliminated characteristic false detections and mislocations within the automatically determined hypocenters. Tamaribuchi et al. (2021) proposed a supervised ensemble learning method to discriminate between earthquakes and noises based on the hypocenter and its phase data determined at 20 stations near the epicenter. In addition to their recommended input phase features, we added maximum amplitudes, the period of the maximum amplitude, and the time of the maximum amplitude after the arrival time at each station (Additional file 1: Table S1). We also checked the status of available stations every hour and selected 20 stations in the vicinity of the epicenter, excluding any stations that were unavailable.

We used the automatically determined earthquake catalog (36,806 events from 1–16 March 2011) developed by Tamaribuchi and Nakagawa (2020) as training data. We considered automatically determined hypocenters within 5 s of the origin time and 50 km of a hypocenter in the JMA unified catalog to be matched, resulting in 12,982 matched hypocenters. The remaining 23,824 hypocenters were visually checked and given one of four labels: A, earthquakes with generally correct picks (12,633); B, earthquakes with one or two clear errors in the picks (4785); C, earthquakes with three or more errors in the picks or one earthquake erroneously separated into several events (3843); and D, false detections caused by noise (2563). Refer to Fig. 2 in Tamaribuchi et al. (2021) for the corresponding waveform examples mentioned in each label. That is, 7% of the hypocenters automatically determined by Tamaribuchi and Nakagawa (2020) contained noise during this period. The 12,982 matched earthquakes were also assigned label A. Figure 6a–d shows the distribution of the training data. This dataset was then randomly separated in a 6:2:2 ratio for training, validation, and testing, respectively. We used LightGBM (Ke et al. 2017) because it had the highest accuracy after a preliminary comparison of several models in the Pycaret library (Ali 2020). LightGBM is a robust and efficient gradient boosting framework. It employs tree-based models to handle small and large datasets promptly, making it popular for tasks such as classification and regression. Its primary advantages include velocity, low memory consumption and high accuracy.

Fig. 6
figure 6

Epicentral distributions for quality labels (a, e, i) A, (b, f, j) B, (c, g, k) C, and (d, h, l) D. Symbol colors indicate depth. ad Training dataset, i.e., automatically determined hypocenters from Tamaribuchi and Nakagawa (2020) that we visually classified into each label. el Automatically classified hypocenters using LightGBM label classifier in this study during the period 1–31 March 2011. il An enlarged view of the area shown by the green box in (f)

Table 1 shows the confusion matrix and performance evaluation metrics for the test data subset. We evaluated model performance based on accuracy, precision, recall, and the F1-score. Accuracy is the proportion of correctly predicted labels among all predictions, calculated for all labels as (TP + TN)/(TP + FP + TN + FN), where TP represents true positives (i.e., the model correctly predicted the label), TN true negatives (correctly predicted other labels), FP false positives (incorrectly predicted the label), and FN false negatives (incorrectly predicted other labels). Precision represents the proportion of correctly predicted labels among all predicted labels, calculated for a given label (or combination of labels) as TP/(TP + FP). Recall represents the proportion of correctly predicted labels among all true labels, calculated as TP/(TP + FN). The F1-score represents the harmonic mean of precision and recall, calculated as 2 (Precision × Recall)/(Precision + Recall). Overall accuracy was 81.9%. Although labels A and D had F1-scores of around 0.8–0.9, labels B and C had lower values of around 0.5, indicating that, even if there were some errors in the phase picks, they did not have a significant impact on the hypocenter calculation. Because we focused on noise elimination, accuracy was 97.7% when the binary label classification of A–C vs. D was considered, which is almost equivalent to the conventional noise classifier proposed by Tamaribuchi et al. (2021).

Table 1 Confusion matrix of the classification results using our LightGBM quality control classifier

Figure 6e–h shows the classification results when this classifier was applied to our refined automatically determined hypocenter dataset during March 2011. Most of the automatically determined hypocenters were classified as label A, whereas those in island areas were mostly classified as label D. Figure 6i–l shows an enlarged image of the off-Tokachi region, where hypocenters were most often assigned label C. This result is due to incorrect phase association. The phase picks of an earthquake that occurred off the coast of the Tohoku region were associated with multiple earthquakes, and one of the hypocenters was incorrectly determined to be off Tokachi. Therefore, labels C and D were excluded from further consideration to eliminate such characteristic false determinations.

Creation of the merged catalog

We created our merged catalog by merging the JMA unified catalog with our refined automatically determined hypocenters. Upon comparison of the two catalogs, hypocenters with origin times and epicentral locations that differed by ≤ 5 s and ≤ 50 km, respectively, were considered to be coincident. If two hypocenters were coincident, we adopted the one in the JMA unified catalog. If there were two or more automatically determined hypocenters within 5 s and 50 km of the hypocenter in the JMA unified catalog, the one with the shortest origin time difference was considered to be coincident. This means that if there are several automatically determined hypocenters within 5 s, one is coincident and the rest are not. Finally, automatically determined hypocenters with labeled A or B that did not match any hypocenter in the JMA unified catalog were considered to be earthquakes and merged with the JMA unified catalog.


Number of earthquakes

When we applied our analysis to seismic waveforms recorded from 1 March 2011 to 29 February 2012, 927,899 hypocenters were automatically determined. We eliminated 16,371 events identified as blasts by waveform correlation, and 21,556 and 7,323 events classified as having quality control labels C and D, respectively. Among the remaining earthquakes with quality control labels A and B, 583,561 and 18,892 hypocenters, respectively, did not match the JMA unified catalog, i.e., they were not already listed in the JMA unified catalog. These were merged with 320,427 events in the JMA unified catalog (labels K, S, excluding low-frequency earthquakes) to obtain a total of 922,880 events in the merged catalog. Here, quality labels K or S are assigned for high or low accuracy, respectively, in the JMA unified catalog. The obtained 922,880 events in our merged catalog (labels K, S, A, and B) were roughly triple the number in the JMA unified catalog alone (320,427 events, labels K and S, excluding low-frequency earthquakes). Figure 2 presents the daily numbers of earthquakes and available stations. Seismic network outages in much of the Tohoku region immediately after 11 March 2011 reduced seismic detectability for a few days. The number of available stations also temporarily dropped to around 1000 twice after April 2011, but that effect was only one to several hours and thus had only minor impacts on the overall catalog; therefore, we will not go into further detail here.

The daily number of earthquakes in the merged catalog (labels K, S, A and B in Fig. 2b) gradually decreased, whereas that in the JMA unified catalog (labels K and S) remained almost constant after June 2011. The constant number seen in the JMA unified catalog is an artifact resulting from the gradual increase in the number of M ~ 1 earthquakes, and there is no difference in the number of earthquakes above M 2 ~ 3 between the two catalogs. This indicates the upper limit of manually determinable earthquakes per day.

Improvements achieved using the phase classifier

Compared to the seismicity reported during March 2011 by Tamaribuchi and Nakagawa (2020), the hypocentral distribution obtained here shows that the application of the phase classifier reduced false detections and mislocations in especially offshore areas (Fig. 7a, b) and resulted in a slight reduction of the number of events with quality label A (by 4%), but a marked reduction of the numbers of events with labels B, C, and D (by 22%, 26%, and 78% respectively; Table 2a).

Fig. 7
figure 7

Comparison of automatically determined hypocenters before and after applying the phase classifier. a, c Automatically determined hypocenters from Tamaribuchi and Nakagawa (2020), i.e., without the phase classifier; b, d the same data after application of our phase classifier. a, b Hypocenter distributions; symbol colors indicate depth. c, d Percentage of false P- and S-phase picks within ± 0.5 s of S- and P-wave arrivals, respectively, in the JMA unified catalog. The vertical blue dashed line indicates the origin time of the 2011 Tohoku earthquake

Table 2 Comparison of the results of Tamaribuchi and Nakagawa (2020), abbreviated TN20, before and after the application of the phase classifier in this study

We counted the number of automatically picked phases at the same station within ± 0.5 s of the P- and S-wave arrival times in the JMA unified catalog and calculated the percentages of falsely detected P- and S-phases as the proportion of S-phases picked within ± 0.5 s of a P-wave arrival and that of P-phases picked within ± 0.5 s of an S-wave arrival, respectively. A comparison of our results with those of Tamaribuchi and Nakagawa (2020) shows that the application of the phase classifier reduced falsely picked P- and S-waves from 1.6 to 0.6% and from 4.2 to 1.6%, respectively (Fig. 7c, d). The proportion of false picks in our results remained unchanged after the 2011 Tohoku earthquake.

Table 2 reports the number of events matched with the JMA unified catalog and their residuals and standard deviations. Tamaribuchi and Nakagawa (2020) detected 122,353 events, whereas we detected 112,004 events, a decrease of nearly 10,000. Of those, however, the number of matches with the JMA unified catalog was almost the same: 44,575–44,566 events, respectively. This result indicates that the number of false detections and mislocations in our results was greatly reduced by the reduction in false picks, which also resulted in lower residuals between the JMA unified catalog and the matched hypocenters.

Frequency-magnitude distribution

The frequency-magnitude distributions of the JMA unified and our merged catalogs are shown in Figs. 8 and Additional file 1: Fig. S1 to illustrate the change in earthquake detectability in the aftershock area of the 2011 Tohoku earthquake. The number of hypocenters more than tripled from 215,709 in the JMA unified catalog to 701,516 in the merged catalog. We determined Mc based on the “MAXC” approach proposed by Wiemer and Wyss (2000), which considers the highest frequency of events in the non-cumulative frequency-magnitude distribution (MAXC) and defines Mc as MAXC + 0.2 (Woessner and Wiemer 2005).

Fig. 8
figure 8

Changes in earthquake detectability along the Japan Trench. a Hypocentral distribution in our merged catalog; symbol colors indicate depth. b Frequency-magnitude distribution for each region. Red and black symbols indicate the number of events in the JMA unified catalog and our merged catalog, respectively. Mc is based on MAXC + 0.2 (Woessner and Wiemer 2005). Region 1 does not include the epicenters in region 4

In our results, Mc in the aftershock area decreased along the coast (areas 1 and 4 in Fig. 8), although the number of large earthquakes was almost unchanged because they were already included in the JMA unified catalog. Farther from the coast (areas 2 and 3 in Fig. 8), our method did not improve detectability because of the sparse station distribution. In contrast, in our results, Mc decreased in many inland areas (Additional file 1: Fig. S1). In particular, the number of hypocenters in the inland areas of the Tohoku region, such as Akita Prefecture (areas 6 and 7 in Additional file 1: Fig. S1), increased by a factor of 4 to 5. Magnitude-time diagrams (Additional file 1: Fig. S1d) show that microearthquakes during April to July 2011 were not included in the JMA unified catalog, resulting in a large discrepancy in the number of events compared with our merged catalog. Although our estimation of Mc is simplified and that in the JMA unified catalog is clearly underestimated, Mc is significantly lower in our merged catalog. Mc was also decreased at distances up to 800 km from the epicenter of the 2011 Tohoku earthquake (star in Fig. 1), such as in area 23 (Additional file 1: Fig. S1), which seems to reflect the limitations of manual inspection.


Swarm migration

Seismicity increased in the northern part of Akita Prefecture (area 6 in Additional file 1: Fig. S1) after 11 March 2011 (Additional file 1: Fig. S1d), even though detectability may have decreased due to station outages during 11–13 March. We applied the double-difference (DD) method (Waldhauser and Ellsworth 2000) to determine the precise locations of events in Akita Prefecture between 1 March and 30 April 2011 in the merged catalog. We used travel-time difference data calculated by waveform cross-correlation. Of 1210 events with labels of K or A, 1118 could be redetermined by the DD method, compared to just 140 in the JMA unified catalog alone. The distribution of the relocated events is shown in Fig. 9, as well as the location of Moriyoshi-zan, a Quaternary volcano. Many of the hypocenters were distributed in clusters aligned on the northwest and north sides of Moriyoshi-zan (cross sections X–X′ and Y–Y′) and hypocenters aligned on the northwest side were several inclined clusters (red arrows in cross section X–X′). Although the focal depths under Moriyoshi-zan were shallow (0–5 km), they tended to deepen farther from the volcano (~ 10 km depth, the northwest side in cross section X–X′), suggesting that geothermal activity near Moriyoshi-zan was higher than in the surrounding areas.

Fig. 9
figure 9

Precise hypocentral distribution in northern Akita Prefecture (region 6 in Fig. S1); except (b), symbol colors indicate depth. a Hypocentral distribution of the merged catalog with events recalculated by the DD method with cross-correlation (1 March–30 April 2011, 0–20 km depth). Blue triangles show observation stations, and the black triangle indicates the Moriyoshi-zan volcano. Bottom panels show cross sections X–X′ and Y–Y′. Green and blue dotted circles indicate probable scatterers estimated by Kosuga (2014) and Amezawa et al. (2019), respectively. b Magnitude-time diagram of events in (a). Red and gray circles are events in the JMA unified catalog and our merged catalog, respectively. c, d Spatiotemporal and depth-time distributions, respectively, projected on line X–X′. e, f Spatiotemporal and depth-time distributions, respectively, projected on line Y–Y′. Vertical blue dashed lines indicate the origin time of the 2011 Tohoku earthquake

Kosuga (2014) applied the DD method to estimate the precise hypocentral distribution of a cluster north of Moriyoshi-zan (cross section Y–Y′, Fig. 9) and estimated the hydraulic diffusivity. They also estimated the locations of scatterers from the reflected/scattered phases. However, their study covered only the period after May 2011. Amezawa et al. (2019) also estimated the location of scatterers north of Moriyoshi-zan (cross section Y–Y′, Fig. 9) using a small seismometer array operated during the period from November 2012 to May 2014. Here, we focused on the period from March to April 2011, during which most of the events in the merged catalog were not listed in the JMA unified catalog, and we identified some clusters consistent with the probable scatterers estimated by Kosuga (2014); red arrows in cross section X–X′, Fig. 9a). This activity has not been identified before because it subsided by early April 2011. The dashed arrows in Fig. 9c show the complicated migration of the hypocenters. The spatiotemporal distribution projected on line Y–Y' in Fig. 9e also shows northward migration during the early phase of the activity (16–21 March), followed by a slight southward migration from 21 March to 8 April. The uncertainties on the locations of their estimated scatterers are beyond the scope of our study, but the seismic clusters were distributed shallower than the scatterers they reported. As interpreted by Kosuga (2014), these activities likely reflect fluid migrations from the scatterers, which are probably geofluid reservoirs. Another possibility is that aseismic slip may also be involved in the complex migrations, as suggested in other volcanic areas (e.g., Yukutake et al. 2022). Further investigation using the merged catalog will contribute to our understanding of swarm migrations recently after a giant earthquake.

Foreshock activity

An M 6.7 earthquake (6 + on the JMA intensity scale) occurred in northern Nagano Prefecture at 3:59 on 12 March 2011 (area 14 in Additional file 1: Fig. S1). In this region, the JMA unified catalog did not list any earthquakes until 21:48 on 11 March, but the merged catalog detected earthquakes as early as 14:55 on 11 March. As a result, we detected 86 earthquakes that occurred after the Mw 9.0 Tohoku earthquake at 14:46 on 11 March and preceding the M 6.7 earthquake (Fig. 10), only 10 of which were listed in the JMA unified catalog. During that period, seismicity was observed south of the epicenter of the M 6.7 earthquake, but largely subsided after the M 6.7 earthquake. The JMA unified catalog did not adequately capture such changes in seismicity. However, there is no evidence of nucleation processes such as the migration or acceleration of foreshocks in this region. Shimojo et al. (2014) detected 139 events during this period by template matching using data from 10 Hi-net stations and suggested that the southern seismicity was likely caused by the migration of geothermal crustal fluids activated by the surface wave of the 2011 Tohoku earthquake. Although we detected fewer hypocenters, the observed trend is consistent with the results of Shimojo et al. (2014). We emphasize that our method does not require a template. We highlight the potential of our automated process to monitor changes in small foreshock activity following a large earthquake. Zaliapin and Ben-Zion (2013) and Tamaribuchi et al. (2018) extracted foreshock activities from many earthquakes in the catalog and contributed to the statistical analysis of foreshocks. Other methods have also been proposed, for example, to statistically extract foreshock conditions from the past catalog and use them to predict mainshocks (Hirose et al. 2021). Because the number of earthquakes in the catalog is important for such analyses, our method of merging earthquake catalogs should prove useful to future contributions on this topic.

Fig. 10
figure 10

Hypocentral distribution in northern Nagano Prefecture (region 14 in Additional file 1: Fig. S1). a Hypocentral distribution in our merged catalog (11–12 March 2011, 0–50 km depth). Symbol colors indicate depth. b Spatiotemporal distribution projected in the N–S direction. c Magnitude-time diagram; red and gray circles are events in the JMA unified catalog and our merged catalog, respectively. Curves are plotted against the right vertical axis and show the cumulative numbers of events. Vertical blue dashed lines indicate the origin time of the 2011 Tohoku earthquake

Seismicity in and around the Tohoku region

Map and cross-sectional views of the hypocentral distribution in the Tohoku region during the year following the 2011 Tohoku earthquake (Figs. 11, 12) clearly highlight the associated aftershock activity. The northern limit of the 2011 Tohoku earthquake aftershock area is about 40.3°N, and little seismicity occurred near the plate boundary in cross section A–A' during that time. In contrast, aftershock seismicity was particularly active along the coast from off Iwate to off Fukushima. Although detectability decreases away from the coast, most of these earthquakes were associated with plate subduction at depths of 10–50 km. Figure 12 highlights particularly active areas from off southern Iwate Prefecture to off Miyagi Prefecture (between lines C–C′ and E–E′) and off Fukushima Prefecture (south of line F–F′). In contrast, relatively little seismicity occurred off northern Iwate Prefecture (between lines B–B′ and C–C′) and off the Miyagi-Fukushima prefectural border (between lines E–E′ and F–F′). Tanioka et al. (1997) explained that the plate boundary off Iwate Prefecture is rough because of a well-developed horst-and-graben structure on the subducting Pacific plate, which can only form small patches of earthquakes there. Thus, the aftershock seismicity observed here may reflect the characteristics of the subducting plate.

Fig. 11
figure 11

Hypocentral distribution (1 March 2011 to 29 February 2012, ≤ 100 km depth). a Hypocentral distribution of the merged catalog; symbol colors indicate depth. AO, AK, IW, YM, MY, and FK indicate Aomori, Akita, Iwate, Yamagata, Miyagi, and Fukushima Prefectures, respectively. b Cross-sectional views of events in the merged catalog (black) and the JMA unified catalog (red). The cross-sectional views show earthquakes occurring within 40 km of each line in (a). The pink shaded area represents the continental crust and the green shaded area represents the oceanic crust of the subducting Pacific Plate from the JIVSM model (Koketsu et al. 2012)

Fig. 12
figure 12

Heat maps of the events in Fig. 11. a Frequency distribution binned in a 0.02° × 0.02° grid. b Cross-sectional views of the frequency distributions counted in 2-km bins. Other details are as in Fig. 11. The blue rectangle in (a) marks the region off Kamaishi, Iwate Prefecture, shown in Fig. 13

Cross sections B–B′ and C–C′ in Figs. 11b and 12b show a clear eastward dip from the continental crust to the subducting Pacific plate between 50 and 100 km along the horizontal axis. Such seismicity was also slightly observed in the 11 years preceding the 2011 Tohoku earthquake (Additional file 1: Figs. S2, S3) but made clearer from the merged catalog after the 2011 Tohoku earthquake. Although inland seismicity within Iwate Prefecture was low both before and after the 2011 Tohoku earthquake, making quantitative evaluation difficult, this unique pattern of seismicity may have been due to changes in the stress field caused by the 2011 Tohoku earthquake (Yoshida et al. 2012), in turn changing the stresses on inland faults.

In cross sections D–D′, E–E′, and F–F′, abundant seismicity is observed in the mantle wedge (Figs. 11, 12), but such activity was also observed before the 2011 Tohoku earthquake (Additional file 1: Figs. S2, S3). Uchida et al. (2010) called these “supraslab” earthquakes and attributed them to seamounts detached from the subducting plate. Because these earthquakes are near the hypocenters of interplate earthquakes, it will be important to evaluate intraplate, interplate, and mantle wedge seismicity separately in future, more detailed hypocenter analyses.

Repeating M 5 earthquakes occur off Kamaishi, Iwate Prefecture, and surrounding seismicity tends to increase in the latter half of the recurrence interval of the M 5 earthquakes (Matsuzawa et al. 2002; Uchida et al. 2007; 2012). Here, we applied the DD method (Waldhauser and Ellsworth 2000) to our merged catalog (labels K and A only) to determine the precise locations of hypocenters in the area 39.3–39.5°N, 141.93–142.15°E, and at 0–70 km depth during 1 March 2011 to 29 February 2012. We used travel-time difference data calculated by waveform cross-correlation. Of 557 events in the merged catalog, 524 could be redetermined by the DD method, compared to only 97 events in the JMA unified catalog alone.

Figure 13b, c show the magnitude-time diagram and cumulative number of events, respectively. Overall, the number of earthquakes decreased with increasing time following the 2011 earthquake. However, focusing on each cycle as delimited by the repeating M 5 events, seismicity seems to increase before the M 5 earthquakes. When normalized to the total number of earthquakes that occurred within each interval, seismicity was consistently low during the first half of the recurrence interval and increased at an accelerating rate in the latter half of the interval (Fig. 13d). Based on the repeating earthquakes before the 2011 Tohoku earthquake, Uchida et al. (2012) concluded that heterogeneous structures exist in the main asperities of the fault on which repeating M 5 earthquakes occur, and that stress concentrations are caused by aseismic slip in the surrounding region. Okuda et al. (2018) detected microearthquakes off Kamaishi immediately after the 2011 Tohoku earthquake using a template-matching technique and observed similar phenomena over two cycles. We obtained similar results over seven cycles during the 1-year period after the 2011 Tohoku earthquake. This temporal consistency supports the existence of heterogeneous structures, such as in the hierarchical asperity model of Hori and Miyazaki (2010). Because such repeating earthquakes can be used to estimate stress on their associated fault patches, the microseismic catalog obtained in this study will provide valuable information for such estimations. Further analyses of similar seismicity may provide detailed information on the slip distribution over a wide area.

Fig. 13
figure 13

Precise hypocentral distribution off Kamaishi, Iwate Prefecture. a Hypocentral distribution of the merged catalog with events recalculated by the DD method with cross-correlation (1 March 2011–29 February 2012; 40–50 km depth; area indicated in Fig. 12a). The bottom panel is a cross-sectional view of earthquakes occurring within a rectangle in (a). Symbol colors indicate depth. b Magnitude-time diagram; red and gray circles are events in the JMA unified catalog and the merged catalog, respectively. c Cumulative number of events. Red stars indicate the repeating M ≥ 4.8 earthquakes (Uchida et al. 2015). Vertical blue dashed lines in (b) and (c) indicate the origin time of the 2011 Tohoku earthquake. (d) Normalized cumulative number of events over seven recurrences. The times immediately after the repeating M ≥ 4.8 earthquakes and immediately before the next one are represented by x = 0 and 1 on the horizontal axis, respectively. The M ≥ 4.8 repeating earthquakes are not included in the count

Summary and future challenges

We incorporated machine learning into automatic hypocenter determination and reduced false detections by approximately 80%. By merging the obtained automatically determined hypocenters with the JMA unified catalog, we created a merged catalog of ~ 920,000 events, nearly triple the ~ 320,000 events in the conventional catalog. Our merged catalog contributes to the detection of fine fault structures and small foreshock activities, as well as to high-resolution monitoring of interplate slips based on repeating earthquakes.

We used a CNN alone for noise reduction and P- and S-wave discrimination. However, many deep learning methods have been proposed for phase picking, and incorporating those methods will further improve the identification of small earthquakes. Nonetheless, we note that Park et al. (2023) pointed out that deep learning methods are not easily reproducible. Therefore, it may be necessary to increase model robustness to unsupervised data while continuing to use robust conventional methods.

We focused on eliminating false detections, which is similar to the method used to detect far-field and deep earthquakes that cannot be detected even by conventional methods. Although we made a strong effort to reduce noise, our results include ~ 1% false detections. It is therefore necessary to improve the phase-association method to achieve sufficient performance for earthquakes with long S–P times. This is a pressing issue for monitoring large earthquakes that occur near trench axes and subsequent far-field seismicity.