Introduction

Since the 1960s, electronic design automation (EDA) research has produced transformative theories, algorithms, and tools to assist the specification, simulation, design, verification, and testing of electronic systems and circuits. EDA has enabled a spectacular increase in electronic design productivity (Scheffer et al., 2018; Sherwani, 1999; Wang et al., 2009). It has been argued that effectively tackling design scaling as described by Moore’s law would have been impossible without EDA research (Rohrer, 2003). Developing popular products based on billion-transistor chips, like modern computers, cellular phones, or smart consumer products, has benefited from advances in EDA.

However, over the last 10 years, there have been concerns about the future development of EDA, especially in comparison with other computer-related domains, like Machine Learning (ML) and Cybersecurity. These discussions started around 2009 with a workshop organized by the National Science Foundation (NSF) (Brayton & Cong, 2009). More recent analyses argue the need to reexamine EDA’s future to exploit new opportunities in computing, e.g., Internet of Things (IoT), Cloud computing, ML, and Cybersecurity, while continuing to support the traditional customer base, i.e., existing semiconductor companies (Kahng & Koushanfar, 2015; Kahng et al., 2015; Potkonjak et al., 2015). Strategic decisions must focus on effectively addressing new and current needs and opportunities in EDA with the existing human, infrastructure, and financial resources. Nevertheless, even though strategic decisions have a major impact, as they affect many members, consume substantial resources, and can have significant negative consequences if miscalculated, there are currently few methods and tools for strategic decision making in research communities. New techniques must be devised not only to aid the future development of EDA, but also to improve the effectiveness of research communities in general.

There has recently been a growing interest in studying the behavior of research communities in an attempt to improve their effectiveness, such as the degree to which the obtained research results match societal needs (Kahng & Koushanfar, 2015; Liu et al., 2021; Nonaka, 1994; Woolley et al., 2015). The behavior of a research community is the process through which new knowledge, e.g., theories, models, and methods, is produced, spread, and prioritized in response to articulated needs. Kuhn explains in his seminal work that imbalances between needs and research results are opportunities for new, high-impact work (Kuhn, 1962). The behavior of research communities is usually analyzed using publication citation graphs (Bi et al., 2011; Cagliero, 2013; Cai et al., 2015). The studied topics include finding the factors that influence publication visibility and impact (Dietz et al., 2007; Gerrish & Blei, 2010; McFadyen et al., 2009; Uzzi et al., 2013; Wang et al., 2013), tracking how explicit and implicit knowledge produces new knowledge (He et al., 2009; Kogut & Zander, 1992; Nonaka, 1994; Zhou et al., 2006), and characterizing group-specific processes and routines (Woolley et al., 2015). Another body of work focuses on text summarization using Neural Networks (Cao et al., 2015, 2017; Khatri et al., 2018) and Generative Adversarial Networks (Liu et al., 2018), on topic modeling (Blei & Lafferty, 2007; Boyd-Graber et al., 2017) using probabilistic Bayesian networks (Vayansky & Kumar, 2020), and on publication clustering by the similarity of key concepts (Blei & McAuliffe, 2010; Kahng & Koushanfar, 2015; Kahng et al., 2015). In summary, research community behavior is mostly described using models that offer interesting predictions about outcomes, but give little causal insight into how the overall behavior emerges. However, causal insight is important in devising strategic decisions to improve a community’s behavior in addressing current and new needs and opportunities.

This paper presents a novel modeling method to explain the possible causes for the behavior of a research community, such as the observed trends and states, so that these causes can be addressed to improve the community. The method was used to study EDA research, but it is not limited to it. The method integrates traditional scientometric and webometric metrics with the model in Liu et al. (2021). The latter describes community behavior as the result of idea combination and refinement, which cause up-hill publication trends, and idea blocking, which produces down-hill trends. However, it uses customized metrics based on citation graphs. This paper extends the applicability of the existing model by considering traditional scientometric and webometric metrics as well as prediction results about future trends. The following metrics were used in this work: Number of patents per year, Number of publications per year, Number of authors per paper, Normalized citations per year, Individual normalized h-index, Impact factor of the main journals and peer-reviewed conferences, and Interest over time on the Web. Time-series prediction using the Autoregressive Integrated Moving Average (ARIMA) method was added to forecast metric trends for the mid-range future, e.g., the next 6 years. Model analysis discussed EDA community behavior based on four components, research needs, produced outputs, knowledge propagation, and participants, together with the connections to other research domains. The analysis then identified three ways to improve the EDA community. In general, the obtained causal insight can support strategic decision making to improve a community’s effectiveness over a mid-range period, such as up to 5 years.

The paper has the following structure. “The past and present of EDA” section offers an overview of the past and current state of EDA. The third section presents the scientometric and webometric analysis of EDA research. A discussion of the analysis follows in the fourth section. Conclusions end the paper.

The past and present of EDA

EDA research has created new theories, models, and algorithms to design electronic applications of increasing size and complexity, like Integrated Circuits (ICs), computing systems, such as embedded systems and Systems on Chip (SoCs), and systems of systems, e.g., Cyber-Physical Systems (CPS), cloud computing, and Internet of Things (IoT) (Scheffer et al., 2018; Stackhouse et al., 2008; Subramanian et al., 2013). Independent of their targeted applications, EDA work can be grouped into the following categories: simulation and analysis, specification, performance prediction, automated design (synthesis), testing, and verification (Ferent & Doboli, 2013; Huang et al., 2021; MacMillen et al., 2000; Sherwani, 1999; Wang et al., 2009). Simulation algorithms simulate the functionality and performance (e.g., timing, power and energy consumption, thermal properties, and so on) of an electronic design by approximating the electromagnetic and material properties of an IC. They are extensions and adaptations of general-purpose numerical methods, like Finite Element Methods and grid techniques, to solve large second-order partial differential equations (Huang et al., 2009; Sehgal et al., 2016; Stojcev, 2006; Vladimirescu, 1994). Specification expresses the different facets of an electronic design using Hardware Description Languages, i.e., Verilog, VHDL, and VHDL-AMS, to describe the functionality, the structure, or both of a design (Doboli et al., 1999; Doboli & Vemuri, 2002, 2003; Huang et al., 2009). Automated design (synthesis) algorithms transform and optimize a design across the abstraction hierarchy.
Synthesis starts from the higher levels of abstraction, usually concerned with the procedural presentation of the desired functionality, followed by creating customized and optimized single-, multi-, custom-, and Intellectual Property (IP)-core based architectures and structures (Das et al., 2015; Sehgal et al., 2016; Subramanian et al., 2013; Yu et al., 2018), and finally devising the layouts of ICs and SoCs (Agnesina et al., 2020; Maxfield, 2008; Schaefer, 1981; Sherwani, 1999; Thepayasuwan & Doboli, 2004; Wang et al., 2009; Ward et al., 2012). A broad range of single- and multi-level (hierarchical) optimization techniques have been explored (Brayton & Cong, 2009; Doboli & Vemuri, 2003; Li et al., 2016; Tang et al., 2006; Zuluaga et al., 2013). The goal of formal verification is to ensure the correctness of a circuit or system design by using mathematical approaches, like theorem proving and model checking (Brayton & Mishchenko, 2010; Eder et al., 2006; Hu et al., 2018).

The current status of EDA was analyzed in a series of workshops (Bahar et al., 2014; Brayton & Cong, 2009) and then summarized in a number of reports and academic papers (Kahng & Koushanfar, 2015; Kahng et al., 2015; Potkonjak et al., 2015). The goal of the NSF workshop entitled “EDA: Past, Present, and Future” was to discuss the accomplishments of EDA and to identify the potential impact of EDA on other domains (Brayton & Cong, 2009). The report proposes three main directions for further EDA development: (i) devising and implementing new algorithms to accelerate design, (ii) applying existing EDA methodologies and techniques to emerging domains, and (iii) increasing the attractiveness of EDA to students. The following workshops organized by the Computing Community Consortium (CCC) focused on student education and training, CPS, and Cybersecurity. The identified opportunities included improving the connection between teaching and real-world practice and a greater focus on modern, application-specific designs. Another strategy based on these workshops suggests three future directions (Potkonjak et al., 2015): (i) repositioning EDA to address new scientific and technological needs, (ii) adjusting the domain to recent high-impact fields and applications, and (iii) incorporating modern theories, models, and methods, e.g., using ML in modeling and optimization. Other work suggests that EDA can increase its visibility through its interdisciplinary nature (Kahng & Koushanfar, 2015; Kahng et al., 2015). A systematic effort towards connecting with other domains can lead to new and interesting research opportunities.

In summary, the EDA community has made important and continuous research contributions that enabled the design of increasingly complex electronic applications. However, a slow-down in EDA growth was observed over the last decade. This situation triggered discussions about strategic directions and opportunities that should be pursued by the community. Current strategic analysis is mostly based on prediction models about expected research outcomes but uses little causal insight into how the community behavior emerges. New ideas on extracting causal insight about research outputs and community behavior were recently proposed (Liu et al., 2021; Liu & Doboli, 2016; Jiao et al., 2018), but they are hard to apply in decision making based on popular scientometric and webometric data.

Scientometric and webometric analysis of EDA research

This section presents the data acquisition process, the used time-series prediction method, and metric computation. Then, the “Discussion of the analysis results” section utilizes the computed metric trends to characterize the behavior of the EDA research community.

Data acquisition

The scientometric and webometric data used in this work were extracted from the following databases: Google databases, e.g., Google Patents, Google Scholar, and Google Trends, and Clarivate Analytics databases, i.e., the Web of Science Core Collection (WOSCC). All data searches used the exact query sequence “electronic design automation”. The Publish or Perish (PoP) tool (Harzing, 2021), querying the Google Scholar database, was used to compute some metrics. Note that while including more databases, like Elsevier’s Scopus, would increase the coverage of EDA publications, the principal journals and conference proceedings are part of the two databases used in this work (Kahng & Koushanfar, 2015).

Journals IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems (TCAD) and ACM Transactions on Design Automation for Electronic Systems (TODAES) were considered as they are arguably the two main journals dedicated to publishing EDA research by the two principal professional associations in computing, IEEE CEDA and ACM SIGDA (Sherwani, 1999). Additionally, another eight journals were analyzed because of their high topic/content similarity with TCAD: IET Computers and Digital Techniques, Journal of Electronic Testing: Theory and Applications, ACM Journal on Emerging Technologies in Computing Systems, IEEE Embedded Systems Letters, ACM Transactions on Reconfigurable Technology and Systems, IEEE Design and Test, Integration, the VLSI Journal, and IEEE Transactions on Computers. Similarly, Design Automation Conference (DAC), Design, Automation, and Test in Europe (DATE), International Conference on Computer-Aided Design (ICCAD), and Asia and South Pacific Design Automation Conference (ASPDAC) were analyzed as they are the main EDA-related conferences (Sherwani, 1999). Given their status and visibility, it is reasonable to assume that these publications offer good coverage of all EDA research problems, theories, and solutions over time. While individual works might not have appeared in these venues, it is unlikely that a main EDA topic was not discussed there.

The following nine metrics were computed using the extracted data: (1) Metric Number of patents per year was retrieved from Google Patents for the period 1993–2021. It describes the degree to which EDA results are expected to be used in real-life. (2) Metric Number of publications per year was retrieved from Google Scholar for the period 1980–2021. It denotes the scientific publication production in EDA. (3) Metric Number of authors per paper was found using PoP for the period 1980–2021. It presents the degree to which researchers team up to solve complex problems. (4–5) Metrics Normalized citations per year and Normalized individual Hirsch-index were obtained with PoP using Google Scholar database for the time period 1980–2021. They describe the impact of individual papers and researchers. (6–8) Metrics Journal Impact Factor and Normalized eigenfactor score for journals and metric Conference Impact Factor for conferences were found based on database WOSCC. Data was extracted for the period 2002–2020 for journals and for the period 2001–2018 for conference proceedings. These metrics measure the scientific impact of the main EDA journals and conferences. (9) Metric Interest over time on the Web was found using Google Trends for the time period 2004–January 2022. It describes the interest in a domain as compared to other research domains.

Time-series prediction

Time-series prediction was performed for each of the computed metrics to express the likely trends of the metrics for the next 6 years. The popular Autoregressive Integrated Moving Average (ARIMA) model (Box et al., 1970) was employed for prediction. ARIMA is a class of statistical models that are effective in non-seasonal time series forecasting (Brownlee, 2018; Durbin & Koopman, 2012). An ARIMA model is formally described as ARIMA(p, d, q), and includes three basic components, each characterized by one parameter: (i) the autoregressive (AR) part indicates how the output variable depends on its own prior values, where p is the autoregression order, (ii) the moving average (MA) part reveals how the output depends on past or current values of a stochastic variable, where q is the order of the moving-average model, and (iii) the integrated (I) part denotes that the data values are substituted by the difference between their current and previous values, where d is the degree of differencing.

The forecasting procedure was automated by implementing an auto-ARIMA procedure (Hyndman & Athanasopoulos, 2018), in which parameters p, d, and q were varied in given intervals (\(p\in \left\{ 1;2;3 \right\}\), \(d\in \left\{ 0;1;2 \right\}\), \(q\in \left\{ 0;1;2 \right\}\) in this work). The best fitted model was selected using the Akaike Information Criterion (AIC) score (Akaike, 1974), as it offers a good trade-off between the simplicity of the model and the quality of its fit. In the following figures, the values predicted with the model having the smallest AIC score are presented as pink solid bars, while the vertical solid black lines for each predicted value show the interval spanned by all forecasts obtained using the considered ARIMA models.
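As an illustration, the grid search with AIC-based model selection can be sketched in pure Python. The sketch below fits only the AR and I parts of the model (the MA component is omitted for brevity; a full implementation would rely on a statistics package such as statsmodels), and all function names are illustrative rather than taken from the paper's toolchain.

```python
import math

def difference(x, d):
    """Apply d-th order differencing to a series."""
    for _ in range(d):
        x = [b - a for a, b in zip(x, x[1:])]
    return x

def fit_ar(x, p):
    """Least-squares AR(p) fit with intercept via the normal equations;
    returns (coefficients, residual sum of squares)."""
    rows = [[1.0] + [x[t - j] for j in range(1, p + 1)] for t in range(p, len(x))]
    y, k = x[p:], p + 1
    A = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    b = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    for col in range(k):  # Gaussian elimination with partial pivoting
        piv = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[piv], b[col], b[piv] = A[piv], A[col], b[piv], b[col]
        for r in range(col + 1, k):
            f = A[r][col] / A[col][col]
            A[r] = [a - f * c for a, c in zip(A[r], A[col])]
            b[r] -= f * b[col]
    beta = [0.0] * k
    for r in reversed(range(k)):  # back-substitution
        beta[r] = (b[r] - sum(A[r][c] * beta[c] for c in range(r + 1, k))) / A[r][r]
    rss = sum((yi - sum(bi * ri for bi, ri in zip(beta, r))) ** 2
              for r, yi in zip(rows, y))
    return beta, rss

def auto_ari_forecast(series, horizon=6, ps=(1, 2, 3), ds=(0, 1, 2)):
    """Sweep p and d, pick the fit with the smallest AIC, forecast ahead."""
    best = None
    for d in ds:
        x = difference(list(series), d)
        for p in ps:
            if len(x) <= p + 2:
                continue
            beta, rss = fit_ar(x, p)
            n = len(x) - p
            aic = n * math.log(max(rss, 1e-12) / n) + 2 * (p + 1)
            if best is None or aic < best[0]:
                best = (aic, p, d, beta)
    _, p, d, beta = best
    levels = [list(series)]           # keep each differencing level
    for _ in range(d):
        levels.append(difference(levels[-1], 1))
    x, base = levels[-1][:], len(levels[-1])
    for _ in range(horizon):          # iterative multi-step forecast
        x.append(beta[0] + sum(beta[j] * x[-j] for j in range(1, p + 1)))
    fc = x[base:]
    for lvl in range(d - 1, -1, -1):  # integrate back to the original scale
        last, integrated = levels[lvl][-1], []
        for v in fc:
            last += v
            integrated.append(last)
        fc = integrated
    return fc, (p, d)
```

For instance, calling `auto_ari_forecast` on an upward-trending metric series returns six forecast values that continue the trend, together with the selected (p, d) orders.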

Predictions of the analyzed metrics were offered for the next 6 years, i.e., the period 2022–2027. Predictions are less accurate beyond this period. However, the used method is expected to be reasonably accurate for mid-range strategic decisions, like framing research needs for the next 3 to 5 years.

Evaluation metrics

The following metrics were analyzed to describe the behavior of EDA research.

Number of patents per year

The number of patents published each year was analyzed as EDA is strongly driven by applications (Kahng & Koushanfar, 2015). Patent data has also been utilized in decision making (Akers, 2003), as it is a reliable indicator of the level of innovation in technology (Saheb & Saheb, 2020). EDA patents are also important in spawning new applications, methods, and ideas, hence they act as stimuli for further research and innovation. Thus, metric Number of patents per year expresses to some degree the scientific and technical progress of a society.

Data on the number of patents per year were acquired from the Google Patents database, searched for the exact sequence “electronic design automation” for the period 1993–2021. Figure 1 plots the trend. Growth was almost exponential until 2015, but recent years show a steady decline, which suggests less innovation. For the future period 2022–2027, the best prediction model, ARIMA(1, 1, 0) with an AIC score of 386.03, predicts an approximately constant number of new patents per year.

Fig. 1
figure 1

Number of patents per year

Fig. 2
figure 2

Patent classification

Also, the most relevant EDA subdomains were identified based on the patent code data from Google Patents for the period 2003–2021 and the patent classification scheme provided by the Cooperative Patent Classification (CPC) of the US Patent and Trademark Office (USPTO). The four subdomains and their codes are as follows: Electric Digital Data Processing (G06F), Measuring Electric Variables, Measuring Magnetic Variables (G01R), Semiconductor Devices, Electric Solid State Devices (H01L), and Photomechanical Production of Textured or Patterned Surfaces for Processing of Semiconductor Devices (G03F). Figure 2 shows the percentages of the patents published each year pertaining to the four CPC subdomains. The majority of EDA patents in each year, between \(51\%\) and \(74\%\), belongs to code G06F (Electric Digital Data Processing). Code H01L (Semiconductor Devices, Electric Solid State Devices) has experienced a growing trend, ranging from \(3.1\%\) in 2003 to \(14.3\%\) in 2020. Moreover, the diversity of CPC subdomains for EDA has increased over time, as shown by the growth of category “Others” in Fig. 2.

Number of publications per year

The second metric to quantify EDA research production is the number of publications (papers, books, technical reports, etc.) appearing in a year.

Fig. 3
figure 3

Number of publications per year

Data was collected from Google Scholar for the period 1980–2021 using “electronic design automation” as the exact search query. Figure 3 shows a constant increase in the number of publications every year, with a short stagnation between 2014 and 2018. Regarding predictions for the period 2022–2027, the best fitted model obtained using the auto-ARIMA procedure was ARIMA(1, 1, 1) with a minimum AIC score of 410.21. As shown in the figure, predictions suggest a slowly increasing trend in the number of publications for the near future, which could reflect a growing number of EDA professionals and publication venues.

Number of authors per paper

An increase in the complexity of research themes likely results in larger research teams. Hence, metric Average number of authors per paper was computed for the period 1980–2021 using data provided by PoP for the exact search “electronic design automation”. The number of papers considered for each year was the total number of retrieved papers, capped at one thousand.

Fig. 4
figure 4

Number of authors per paper

The average number of authors per paper has steadily increased, as shown in Fig. 4. Since 1980, when each paper had an average of 1.5 authors, an almost linear growth followed until 2021, with the metric surpassing the threshold of 3 authors per paper. The trend suggests a progressive growth of EDA research teams to address more complex problems. In the near future, the average number of authors per paper will stay about the same, according to the best prediction model (ARIMA(1, 1, 0) with an AIC score of \(-\,51.12\)).

Normalized citations per year

The impact of EDA domain was characterized through several metrics based on the number of citations.

The first of these metrics was Normalized citations per year. This metric was calculated for the period 1980–2021 using the PoP tool (Harzing, 2021). PoP retrieves a maximum of one thousand papers per search, and the results are presented in descending order of citation count. However, the average number of citations was normalized with the number of retrieved papers, as the number of papers containing the exact sequence “electronic design automation” in each year of the period 1980–2002 is less than 1000. Normalization leads to a fairer comparison between the years with fewer papers and the years that were more productive.

Metric Normalized citations per year (NC) is the average number of citations received in one year by a specified number of the best ranked papers divided by the specified number of papers:

$$\begin{aligned} NC\;(publication\_year) = \frac{\sum _{i=1}^{N}citations\_paper_i}{N\; (actual\_year-publication\_year-1)}, \end{aligned}$$
(1)

where N is the number of considered papers (the maximum value of N was 1000 in this paper).
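The metric can be computed directly from the per-paper citation counts returned by a search. The sketch below mirrors Eq. (1) literally (including its denominator); the function name and arguments are illustrative.

```python
def normalized_citations(citation_counts, publication_year, actual_year):
    """Eq. (1): total citations of the N top-ranked papers from a given
    publication year, divided by N and by the number of elapsed years."""
    n = len(citation_counts)                      # N, capped at 1000 in the study
    elapsed = actual_year - publication_year - 1  # denominator term as in Eq. (1)
    return sum(citation_counts) / (n * elapsed)
```

For example, three papers published in 2018 with 10, 20, and 30 citations, evaluated in 2022, yield 60 / (3 × 3) ≈ 6.67 citations per paper per year.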

Fig. 5
figure 5

Normalized citations per year

Figure 5 shows an almost exponential increase of metric Normalized citations per year for the period 1990–2004. However, since then, the metric has been approximately constant, at around 1.5 citations per paper per year. This trend is expected to continue for the next 6 years, as suggested by the best prediction model, ARIMA(1, 1, 1) with an AIC score of 96.02.

Normalized individual h-index (PoP variation)

The Hirsch index (h-index) (Hirsch, 2005) was calculated to capture both a researcher’s publication productivity and his/her impact on the domain (Glänzel, 2006). However, to address two limitations of the h-index, namely domain and career-stage differences, metric Normalized Individual h-index (hI-norm) was also computed using PoP. The metric is defined as follows: first, the number of citations for each paper is normalized by dividing it by the number of authors of that paper, and then metric hI-norm is computed as the h-index of the normalized citation counts (Harzing et al., 2014).
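The two-step definition above can be sketched in a few lines. Note that how PoP rounds the fractional normalized counts is an assumption here; plain float division is used, and the function names are illustrative.

```python
def h_index(citations):
    """Largest h such that at least h items have at least h citations."""
    h = 0
    for i, c in enumerate(sorted(citations, reverse=True), start=1):
        if c >= i:
            h = i
    return h

def hi_norm(papers):
    """hI-norm (Harzing et al., 2014): divide each paper's citation count
    by its number of authors, then take the h-index of those values.
    `papers` is a list of (citations, number_of_authors) pairs."""
    return h_index([cites / n_authors for cites, n_authors in papers])
```

For instance, papers with (citations, authors) = (40, 2), (30, 3), (12, 4), (9, 1), (4, 2) have normalized counts 20, 10, 3, 9, 2, giving an hI-norm of 3.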

Fig. 6
figure 6

Normalized individual h-index

Figure 6 presents the hI-norm evolution for the EDA domain. Data was collected using PoP for the period 1980–2021. Two peaked segments can be distinguished. The first covers the period 1980–1990, with a peak in 1987. The second corresponds to the period 1992–present, with its peak in 2002. The two ascending periods coincide with the intervals when milestone theories and technologies were developed, suggesting the presence of high-visibility researchers in EDA. After its 2002 peak, the hI-norm metric has continuously declined to its current low level in 2021. The decreasing trend will continue in the following years too, according to predictions using the best fitted model, ARIMA(1, 1, 1) with an AIC score of 240.65. A possible reason is that in recent years well-ranked publications included more work by authors in earlier career stages, while work by more seasoned researchers decreased.

Impact factor and normalized eigenfactor score of Main EDA journals

A different perspective on the impact of EDA research is offered by the altmetric indices of the domain’s flagship journals and conferences.

First, the time evolution of the Journal Impact Factor (JIF), acquired from the Clarivate Analytics database for the period 2002–2020, was found for the two main EDA journals, namely IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems (TCAD) and ACM Transactions on Design Automation for Electronic Systems (TODAES). Metric JIF for a given year was computed by dividing the total number of citations received in that year by the journal’s papers published in the two previous years by the total number of papers published during those two years (Garfield, 2006). The evolution of metric JIF for the two journals is presented in Figs. 7 and 8.
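The two-year JIF computation described above can be expressed as a short function. The data layout (per-year dictionaries) is an illustrative assumption, not the Clarivate format.

```python
def journal_impact_factor(citations_received, papers_published, jcr_year):
    """Two-year impact factor (Garfield, 2006): citations received in
    `jcr_year` to items the journal published in the two preceding years,
    divided by the number of items published in those two years.
    `citations_received[y]`: citations in `jcr_year` to papers from year y;
    `papers_published[y]`: number of papers the journal published in year y."""
    prev = (jcr_year - 1, jcr_year - 2)
    return (sum(citations_received[y] for y in prev)
            / sum(papers_published[y] for y in prev))
```

For example, a journal with 200 and 220 papers in 2018 and 2019, receiving 300 and 260 citations to them in 2020, has a 2020 JIF of 560 / 420 ≈ 1.33.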

Fig. 7
figure 7

Impact factor for IEEE Transactions on CADICS

Fig. 8
figure 8

Impact factor for ACM Transactions on DAES

TCAD was placed by Clarivate Analytics in the Q2/Q3 quartiles. However, metric JIF showed an increasing trend, from 1.05 in 2002 to 2.81 in 2020, although this trend will likely slow down in the future according to the best prediction model, ARIMA(1, 1, 0), with an AIC score of 13.99. Metric JIF for TODAES was lower, placing the journal in the Q4 quartile. Stagnating JIF values around 0.95 are expected according to the best forecasting model, ARIMA(2, 1, 0), with an AIC value of \(-13.18\).

Another metric of interest calculated by Clarivate Analytics is the Normalized eigenfactor (Bergstrom et al., 2008). It analyzes the activity of a journal based on the number of citations of its articles published in the past 5 years, and it also considers which journals have contributed these citations: highly cited journals influence the network more than less cited ones. The advantage of this metric is its simple interpretation: an average journal included in the Journal Citation Reports (JCR) database has a score of one, so metric Normalized eigenfactor enables a fast comparison between journals (Bergstrom et al., 2008).

TCAD has an almost constant Normalized eigenfactor score over the past 8 years, between 0.72 and 0.95. A similar constant Normalized eigenfactor score was observed for TODAES too, between 0.10 and 0.23. Thus, the two journals have a lower-than-average impact.

For additional insight, the JIF trends for another eight journals were analyzed and then summarized in Table 1: IET Computers and Digital Techniques (IET-CDT), Journal of Electronic Testing: Theory and Applications (JETTA), ACM Journal on Emerging Technologies in Computing Systems (ACM-JETC), IEEE Embedded Systems Letters (IEEE-ESL), ACM Transactions on Reconfigurable Technology and Systems (ACM-TRTS), IEEE Design and Test (IEEE-DT), Integration, the VLSI Journal (ItVLSIJ), and IEEE Transactions on Computers (IEEE-TC). These publications were selected from Clarivate’s Master Journal List based on the highest topic/content similarity with TCAD as indicated by the SJR-SCImago Journal & Country Rank (SCImago, 2022). These analysis results as well as the predicted JIF values displayed in Figs. 9, 10, 11, 12, 13, 14, 15 and 16 confirm an almost constant EDA trend.

Table 1 Additional EDA journals
Fig. 9
figure 9

Impact factor for IET-CDT

Fig. 10
figure 10

Impact factor for JETTA

Fig. 11
figure 11

Impact factor for ACM-JETC

Fig. 12
figure 12

Impact factor for IEEE-ESL

Fig. 13
figure 13

Impact factor for ACM-TRTS

Fig. 14
figure 14

Impact factor for IEEE-DT

Fig. 15
figure 15

Impact factor for ItVLSIJ

Fig. 16
figure 16

Impact factor for IEEE-TC

Impact factor of main conferences in EDA

The time evolution of the impact of four main EDA conferences, namely Design Automation Conference (DAC), Design, Automation, and Test in Europe Conference (DATE), International Conference on Computer-Aided Design (ICCAD), and Asia and South Pacific Design Automation Conference (ASPDAC), was also computed based on metric Conference Impact Factor (CIF). CIF was calculated using a formula analogous to JIF, for data retrieved from the Clarivate Analytics Web of Science database for the period 2002–2020. The evolution of CIF for the four conferences is displayed in Figs. 17, 18, 19 and 20.

Fig. 17
figure 17

Impact factor for DAC

Fig. 18
figure 18

Impact factor for DATE

Fig. 19
figure 19

Impact factor for ICCAD

Fig. 20
figure 20

Impact factor for ASPDAC

The best prediction models were of type ARIMA(1, 1, 0) for all four time series, having an AIC score of 15.6 for DAC, 4.98 for DATE, 1.2 for ICCAD and 1.9 for ASPDAC. These models forecast a constant trend around 1.7 for DAC, 1.6 for DATE, 32.3 for ICCAD and 2.08 for ASPDAC, suggesting that the four EDA conferences are well ranked among IEEE conferences. In other fields, the main results tend to be published in high-ranked journals, but EDA researchers seem to choose the main conferences to publish their work.

Interest over time on the web

Finally, webometric metric Interest over time on the Web was found using data acquired from Google Trends, which assesses how a given topic was searched on the web over a period of time.

Figure 21 illustrates how the web popularity of EDA evolved, comparing it with three other computing-related areas: Cybersecurity, Cloud Computing, and IoT. Each value of metric Interest over time for the four domains is presented as a percentage relative to the highest point on the plot for the considered period; a value of one hundred denotes the peak popularity among all represented terms. All worldwide web searches on the selected terms were considered for the period January 2004–January 2022. While EDA was the most popular of the four domains at the beginning of the period, this changed after 2008, when the popularity of Cloud Computing and IoT exceeded that of EDA. At the end of 2021, the web popularity of EDA was much smaller than that of the other domains.
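The relative scaling used in Fig. 21 can be sketched as follows. The input format (raw counts per term) is a hypothetical stand-in, since Google Trends exposes only the already-normalized index, not raw search counts.

```python
def interest_over_time(search_counts):
    """Rescale raw per-term search counts so that the single highest value
    across all terms and time points maps to 100, mirroring how Google
    Trends presents Interest over time for compared terms."""
    peak = max(max(series) for series in search_counts.values())
    return {term: [round(100 * v / peak) for v in series]
            for term, series in search_counts.items()}
```

For example, with hypothetical counts {"eda": [5, 10], "iot": [20, 50]}, the peak 50 maps to 100 and every other value is scaled proportionally.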

Fig. 21
figure 21

Web-Interest over time in selected computing fields

Discussion of the analysis results

The experimental results were analyzed using the white box modeling methodology described in Liu et al. (2021) and Liu & Doboli (2016). The produced models describe the likely causes and dependencies that produced the observed behavior of the metrics. Devising systematic, global analysis methods to give insight into the causality of a research domain’s behavior is a complex task that currently lacks a mature solution.

The analysis integrates the metrics presented in the previous sections into a global model of EDA behavior over time. It starts from the process that creates and propagates new knowledge in a domain to address the needs posed to the research community, and then uses the behavior of the process to explain the trends characterized through metrics (Liu et al., 2021). Figure 22 illustrates the knowledge creation and propagation process in a community, and Fig. 23 presents the possible trends that can result from the process. Then, trend analysis starting from the metric values offers insight into possible causes that influenced the EDA community behavior.

Fig. 22
figure 22

Flow diagram showing knowledge creation and propagation in a technology-driven research domain

Knowledge creation and propagation is a complex process that includes four, highly-connected components: research outputs, research needs, knowledge dissemination (propagation), and participants (Liu et al., 2021) (Fig. 22):

  • Research outputs comprise all research-related knowledge, artifacts (e.g., hardware and software), results, and skills developed as a result of conducting research work. They include new basic knowledge that lays the foundations for possibly new research directions, and knowledge that extends the state-of-the-art of an existing direction for which the basic knowledge is already sufficient. Although it captures only peer-reviewed publications, the metric Number of publications over time was used to quantify the produced research outputs over time.

  • Research needs describe all research problems that the community should consider. Needs include two components: explicit needs and implicit needs. Explicit needs are already-framed problems and challenges, like tackling new types of designs and performance. Implicit needs refer to problems that, even though not yet defined, are likely to emerge out of the development of explicit needs. The model considered that the metric Number of patents over time describes the explicit needs. As opposed to publications, patents are driven mainly by devising solutions to technological challenges, and technology has been an important driver behind EDA research. Note that the gap between research needs and outputs is characterized by the degree to which needs were satisfied or not (Fig. 22). The slope change of the metric Number of patents over time was used as a descriptor of the satisfied needs, which in turn gives insight into the unsatisfied needs.

  • Knowledge dissemination refers to how the produced research outputs propagate through the community. It includes the vehicles for dissemination, like journals and conference proceedings, as well as the way in which dissemination affects the nature of knowledge, such as research outputs. Depending on how they disseminate, research outputs can be of three kinds: (i) the Active knowledge front includes the concepts that are, at a certain time, the main drivers for devising subsequent research outputs; (ii) the Latent knowledge front comprises the concepts that could have been further explored but never gained visibility within the community; and (iii) the Blocked knowledge front consists of the concepts that became arguably obsolete after new research outputs were devised. The metric Number of citations over time was used in the model to describe the size of the Active knowledge front, and the metric Change in the number of citations expressed the Blocked knowledge front. Knowledge granularity describes the complexity of the tackled problems; the metrics Number of authors per paper and Number of citations over time are qualitative descriptors of granularity.

  • Participants are the community members. They interpret research needs and use the Active knowledge front to create research outputs that can block current solutions. Their collaboration with other members was described by the metric Number of authors per paper, and their participation in the Active knowledge front was reflected by the metric Normalized individual h-index.
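The three knowledge fronts introduced above can be illustrated as a classification over citation trajectories. This is a hypothetical sketch, not the paper's actual procedure: the thresholds, the window size, and the concept names and counts below are all assumptions made for illustration:

```python
# Illustrative classification of concepts into the three knowledge fronts
# by their yearly citation counts. Thresholds and data are hypothetical.

def classify_front(yearly_citations, recent_window=3, active_min=5):
    """Return 'active', 'latent', or 'blocked' for one concept."""
    recent = sum(yearly_citations[-recent_window:])
    peak = max(yearly_citations)
    if recent >= active_min:
        return "active"      # still driving new research outputs
    if peak >= active_min:
        return "blocked"     # was once visible, then superseded
    return "latent"          # never gained visibility in the community

concepts = {
    "symbolic model generation": [1, 2, 1, 0, 1, 0],    # never took off
    "library-based flows":       [9, 12, 7, 2, 1, 0],   # peaked, then faded
    "ML-assisted placement":     [0, 1, 4, 8, 11, 15],  # currently driving work
}
fronts = {name: classify_front(c) for name, c in concepts.items()}
```

The design choice here mirrors the model's definitions: the Active front is defined by recent citation activity, the Blocked front by past activity that has stopped, and the Latent front by the absence of visibility at any time.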

In addition to the four components, the connections to related domains were described by the correlation coefficients between the domains' metric Web-interest over time.
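The correlation coefficients mentioned above are standard Pearson correlations between two time series. A minimal self-contained sketch, with hypothetical interest values shaped like the diverging EDA and IoT trends:

```python
# Minimal Pearson correlation, as could be used to relate the
# Web-interest time series of two domains. Data values are hypothetical.
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

eda = [100, 80, 55, 40, 30, 25]   # hypothetical declining interest
iot = [5, 10, 30, 55, 75, 90]     # hypothetical rising interest
r = pearson(eda, iot)             # strongly negative: the trends diverge
```

A strongly negative coefficient between two domains, as in this toy example, indicates diverging trends rather than a shared driver; near-zero values, as reported later for EDA versus the related areas, indicate little linear relationship.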

Figure 23 relates the metric values to the knowledge creation and propagation flow model in Fig. 22. The figure shows that the up-trend in Number of patents ended around 2008 and was followed by stagnation until 2016, then by a down-trend. The up-trend was mainly correlated with higher percentages of patents in manufacturing (G03F) and Others. Patents on data processing (G06F) always represented the majority of patents, hence they seem less correlated with the general trend of patent publications. The percentage of patents on semiconductor devices (H01L) was high during the up-trend, but stayed high during the down-trend too. The percentage of patents on measuring (G01R) was mostly constant. The metric Number of papers had an up-trend that has significantly slowed down recently, but it did not enter a down-trend.
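Detecting where an up-trend ends and a stagnation or down-trend begins can be sketched with a least-squares slope computed over sliding windows. This is a hypothetical illustration of slope-based trend labeling, not the study's algorithm; the window size, flat-band threshold, and patent counts are all assumptions:

```python
# Sketch of slope-based trend labeling: fit a least-squares slope over a
# sliding window and label each window up/flat/down. The yearly patent
# counts below are hypothetical, shaped like the trend in the text
# (rise, stagnation, decline).

def slope(values):
    """Least-squares slope of values against their indices 0..n-1."""
    n = len(values)
    mx = (n - 1) / 2
    my = sum(values) / n
    num = sum((x - mx) * (v - my) for x, v in enumerate(values))
    den = sum((x - mx) ** 2 for x in range(n))
    return num / den

def label_trend(values, flat_band=1.0):
    """Label a window 'up', 'flat', or 'down' by its fitted slope."""
    s = slope(values)
    if s > flat_band:
        return "up"
    if s < -flat_band:
        return "down"
    return "flat"

patents = [40, 55, 70, 85, 86, 85, 84, 86, 85, 70, 55, 40]
labels = [label_trend(patents[i:i + 4]) for i in range(0, len(patents) - 3, 4)]
# Three consecutive windows: a rising start, a stagnation plateau,
# and a falling tail.
```

The flat band around zero slope is what distinguishes stagnation from a genuine down-trend, mirroring the three-phase patent trajectory (up to 2008, flat until 2016, then down) described above.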

Fig. 23 Connection between metrics and knowledge creation and propagation

These results, together with the observed increase of the metric Average number of authors per paper, suggest that the complexity of published research work has increased. The increase in complexity slowed down the up-trend, as it became more difficult to jointly address the constraints of new problems. A lower productivity can be argued, as more effort was needed to generate new research outputs, e.g., publications. Also, the trends in Number of patents and Number of publications were uncorrelated after 2008. This is an interesting observation, and could suggest a mismatch between the perceived and satisfied needs as described by patents (arguably pursuing a more pragmatic goal) and the satisfied and unsatisfied needs as identified through peer-reviewed publications (arguably targeting a more academic goal). It is worth noting that the trend of the metric Number of patents was not reversed by the increasing trend of the metric Number of publications.

Moreover, the decrease of the metric Average number of citations suggests a greater diversity of concepts that are not further pursued by the community. Hence, the Latent knowledge front increased relative to the Active knowledge front, and the fraction of common knowledge agreed on by the community decreased. Combined with the decreasing Normalized individual h-index, these trends suggest a higher knowledge granularity at the community level, possibly indicating that more papers focus on specific needs but are not continued by subsequent work.
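For reference, the h-index underlying the Normalized individual h-index metric is the largest h such that an author has h papers with at least h citations each. The sketch below shows the standard computation plus one common normalization (dividing each paper's citations by its author count, as in Harzing's hI,norm); whether the paper uses exactly this normalization is an assumption, and the citation counts are hypothetical:

```python
# Standard h-index, plus a per-author normalized variant. The exact
# normalization used in the paper's metric is assumed; data hypothetical.

def h_index(citations):
    """Largest h such that h papers each have >= h citations."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for i, c in enumerate(ranked, start=1):
        if c >= i:
            h = i
        else:
            break
    return h

def normalized_h_index(papers):
    """papers: list of (citations, n_authors) pairs. Citations are split
    equally among co-authors before computing the h-index."""
    return h_index([c / n for c, n in papers])
```

With more authors per paper, each author's normalized share of citations shrinks, so the metric can fall even when raw citation counts hold steady, which is consistent with the combined trends discussed above.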

The observed trends of the impact factors of the two journals and the two conference proceedings are correlated with the increasing number of publications only after 2013. This suggests an increasing importance of these dissemination vehicles for the community, even though the average impact of a paper declined. Moreover, there was little correlation between the metric Web-interest for EDA and the three related areas, IoT, Cybersecurity, and Cloud Computing.

Possible opportunities

Even though it is hard to precisely identify causal interventions that would change the concerning trends mentioned in Potkonjak et al. (2015), Kahng & Koushanfar (2015), and Kahng et al. (2015), it is still important to suggest ideas to tackle the imbalances observed during trend analysis. The following three opportunities were identified based on the observed trends:

  1.

    Increase the Active knowledge front by reducing the amount of latent knowledge. This step avoids having new ideas of promising potential ignored by the community. As a result, knowledge diversity would also increase, with the possibility of new clusterings of knowledge and more connections to other domains (see Fig. 22). For example, EDA developed interesting methods to optimize and model dynamic, strongly-connected, mixed-signal systems, including techniques for hierarchical and adaptive optimization of systems with many continuous-valued variables (Gielen & Rutenbar, 2000; Tang et al., 2006) as well as techniques for automated symbolic model generation (Doboli & Vemuri, 2001; Gielen & Rutenbar, 2000). Some of these methods did not become part of the Active knowledge front, as suggested by their lower citation numbers, even though new ideas on optimization of continuous-valued variables can help improve Deep Neural Network (DNN) training (Goodfellow et al., 2016), while concepts on automated symbolic model generation can lead to explainable DNNs, e.g., IF-THEN rule extraction for a trained DNN (Zhang et al., 2007). This goal can be achieved by encouraging work that comprehensively characterizes the state-of-the-art, e.g., that systematically compares and highlights the advantages and limitations of present methods based on quantitative metrics. It would require a new way of selecting, analyzing, and experimentally evaluating the discussed related work. This change is expected to better frame the satisfied as well as the unsatisfied needs (including needs in basic theory), which could increase the convergence of the community around a set of high-impact needs and concepts, and reduce the emphasis on work addressing only specific situations. Increasing the focus on main concepts and their combinations would also identify more implicit needs, which could potentially lead to more new ideas to serve as recovery points.

  2.

    Reduce the complexity of the tackled problems. While this is hard to control, history suggests that advances in manufacturing and other domains have produced situations in which more innovation is possible due to less demanding physical implementations. Often, manufacturing processes devised only to achieve higher integration densities pose complex design constraints when tackling other needs, like energy/power, thermal, or reliability requirements. Traditional solutions, which use a library-based, hierarchical EDA flow, have often been challenged by low-level, physical constraints stemming from manufacturing and semiconductor device design. Therefore, strategic decisions, like switching to another (potentially more expensive) manufacturing process or a different semiconductor device type instead of continuing with the current solutions, can be aided by new prediction methods and metrics that indicate the expected future design challenges and costs. In general, techniques are needed to predict when design complexity becomes untenable.

  3.

    Address the gap between the satisfied and the unsatisfied needs. This gap is reflected by the trends in the number of patents and the number of publications over time. Patents indicate technological solutions that satisfy needs to a certain degree, while research papers also propose solutions to needs, but without necessarily being ready for real-world application; therefore, these needs arguably remain unsatisfied. This opportunity is similar to recent efforts of funding agencies to support commercialization of academic research, like NSF's I-CORPS and SBIR programs. Increasing the focus on the basic needs of the traditional customer base and related areas is likely to close the gap, as new advancements in basic knowledge will trigger more practical opportunities and thus patent applications. It would also lead to contributions that are more unique to the domain, and thus increase its connections to related domains that could benefit from these advancements. Traditionally, EDA has studied methods that can go beyond silicon Integrated Circuit design, such as numerical solving of linear and nonlinear equations, scaling, optimization, and transformation. Incorporating needs and constraints from related domains could help identify EDA-specific opportunities that are important for other domains too. An example is using model checking, initially devised to formally verify complex ICs, to solve other complex verification problems, like verifying power grids (Sugumar et al., 2019). However, this step requires a tighter involvement of businesses in framing the faced needs, including articulating how these needs relate to present limitations in theory. We feel that new collaboration structures, mechanisms, and metrics are required to offer a stronger coupling between real-world needs and academic solutions.

Limitations of the trend analysis

The limitations of this work are summarized as follows:

  • Beyond mid-range trend prediction. The current analysis identified mid-range strategic objectives for a research community, like opportunities to improve EDA effectiveness by addressing the complexity of the tackled problems, and the gaps between active and latent knowledge or between the unsatisfied needs and the researched problems. These objectives assume a time period of up to 5 years during which the fundamental research needs remain the same. Beyond this time frame, trend analysis should likely address new emerging research directions and problems that might disrupt the present needs. This requires not only keyword extraction and topic modeling, but also identifying gaps between different domains as well as retargeting and reframing current theories and solutions to new domains.

  • Data collection. The data sets could include additional sources, like Elsevier's Scopus database, books, book chapters, research reports, and white papers. While it is unlikely that significant research outcomes were missed in this work by using only the Google and Clarivate Analytics databases, it is possible that a shift in the relevant publications can occur if more authors opt for open-access repositories of electronic publications, like arXiv. Such repositories are popular for areas like ML, but not yet for EDA.

  • Trend prediction. Trend analysis beyond mid-range prediction requires devising new algorithms that do not rely exclusively on prior values and/or that use approaches other than regression. Also, situations in which major changes in the pursued directions or disruptions occurred in a research domain should be part of trend prediction too.

Conclusions

This paper presents a method to support mid-range strategic decision making by identifying causal insight that explains the observed behavior of a research community. It combines traditional scientometric and webometric metrics with a recent model that captures the causal factors conditioning the up- and down-trends of the metric values. In addition, the method includes time series prediction using the Autoregressive Integrated Moving Average (ARIMA) method to forecast metric trends for the next 6 years. Past and present trends in EDA were analyzed to suggest possible ways to increase the domain's impact.
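To make the forecasting step concrete, the sketch below hand-rolls the simplest member of the ARIMA family, ARIMA(1,1,0): an AR(1) model fitted on the first-differenced series, with forecasts re-integrated to the original scale. The paper does not state its ARIMA order, so this order, the zero-mean AR fit, and the yearly publication counts are all assumptions; a production analysis would use a library such as statsmodels:

```python
# Hand-rolled ARIMA(1,1,0) sketch: difference the series, fit an AR(1)
# coefficient on the differences by least squares (zero-mean form),
# forecast future differences recursively, and re-integrate.
# The yearly counts below are hypothetical.

def arima_110_forecast(series, steps):
    """Forecast `steps` future values of `series` with ARIMA(1,1,0)."""
    # First difference removes the trend level.
    d = [b - a for a, b in zip(series, series[1:])]
    # Least-squares AR(1) coefficient on the differences.
    num = sum(d[t] * d[t - 1] for t in range(1, len(d)))
    den = sum(d[t - 1] ** 2 for t in range(1, len(d)))
    phi = num / den
    # Recursively forecast differences, then accumulate back to levels.
    forecasts, last_diff, level = [], d[-1], series[-1]
    for _ in range(steps):
        last_diff = phi * last_diff
        level += last_diff
        forecasts.append(level)
    return forecasts

papers = [120, 135, 152, 171, 180, 186, 189]  # hypothetical yearly counts
next6 = arima_110_forecast(papers, steps=6)   # 6-year horizon, as in the paper
```

With 0 < phi < 1, the forecast increments shrink geometrically, producing the kind of decelerating up-trend described for the publication counts.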

The analysis for EDA showed disconnections between the trends of the number of publications per year, the number of granted patents per year, and the impact of its main journals and conference proceedings. The higher number of authors per paper suggests an increased difficulty level of the addressed problems. Together with low interest on the web and decreasing normalized individual h-indexes, these metrics indicate a decreasing productivity and visibility of the area, and a higher fragmentation of the domain's knowledge. Having new knowledge accepted by the entire community seems less likely.

As compared to existing work, the causal insight gained through the method supports strategic decision making to address the imbalances in a community's behavior. For EDA, these include reducing solution complexity, e.g., by focusing on new manufacturing and on the basic needs and problems of the domain, as previously the number of patents increased when the percentage of manufacturing patents was higher. Improved manufacturing could simplify design by eliminating some of the current constraints. Encouraging work on basic needs should include detailed comparisons to current solutions, so that advantages and limitations are better framed. In-depth contributions to EDA-related basic needs and opportunities are likely to increase the domain's links with emerging fields, e.g., IoT, ML, and Cybersecurity.

The presented model can be extended beyond supporting mid-range strategic decision making, such as to include situations when major changes in needs or disruptions occurred. This requires enhancing data acquisition and trend prediction by adding topic modeling and research idea tracking. Also, more databases and publication types, like books, technical reports, and white papers, could be added to the analysis. Future work will address these issues.