Understanding Ajax applications by connecting client and server-side execution traces
Cite this article as: Zaidman, A., Matthijssen, N., Storey, MA. et al. Empir Software Eng (2013) 18: 181. doi:10.1007/s10664-012-9200-5
Ajax-enabled Web applications are a new breed of highly interactive, highly dynamic Web applications. Although Ajax allows developers to create rich Web applications, Ajax applications can be difficult to comprehend and thus to maintain. For this reason, we have created FireDetective, a tool that uses dynamic analysis at both the client (browser) and server-side to facilitate the understanding of Ajax applications. We evaluate FireDetective using (1) a pretest-posttest user study and (2) a field user study. Preliminary evidence shows that the FireDetective tool is an effective aid for Web developers striving to understand Ajax applications.
Keywords: Ajax, Web applications, Program comprehension, Reverse engineering, Dynamic analysis, Execution traces
Before the dawn of Ajax, Hassan and Holt already noted that “Web applications are the legacy software of the future” and “Maintaining such systems is problematic” (Hassan and Holt 2002). We expect that the interactivity and complexity that Ajax adds will certainly not improve this situation.
Software maintenance starts with building up understanding and subsequently making the necessary modifications. This understanding step is known to be very costly, with Corbi reporting that as much as 50% of the time of a maintenance task is spent on understanding. However, papers focusing on program understanding specifically for Ajax applications are scarce, as observed by Cornelissen et al. (2009a).
These observations, together with the rapidly growing number of Ajax enabled Web applications, motivated us to examine ways to support Web developers in maintaining this new breed of Web applications. In particular, in this paper we investigate what kind of problems Web developers struggle with when understanding an Ajax application and how we can leverage dynamic analysis (Ball 1999) to better support Web developers in understanding Ajax applications.
In order to facilitate a better understanding of Ajax-based Web applications, we have built FireDetective, a tool that records execution traces on both the client (browser) and server, and subsequently visualizes them in a combined way.
RQ1 Which strategies do Web developers currently use when trying to understand Ajax applications?
RQ2 Can we use dynamic analysis, as presented through the FireDetective tool, to improve program understanding for Ajax applications?
The rest of this paper is organized as follows. Section 2 describes the design and implementation of FireDetective. Section 3 documents the design of our user study, while Section 4 describes and discusses the findings of this user study. Sections 5 and 6 respectively describe the experimental design and the findings of the field user study that we performed. Threats to validity are covered in Section 7. Section 8 discusses related work. Finally, Section 9 presents our conclusions and identifies future opportunities.
2 Tool Design
2.2 Using Abstractions to Link Traces
Web template invocations are not specific to Ajax, and are used in many Web applications. In our case, we are working with JSP templates. Since these templates are compiled prior to use, they do not end up in the trace in their original form. Therefore, we reconstruct JSP invocations from the original trace and link them to the points in the traces where they took place.
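The reconstruction step can be illustrated with a minimal sketch. It assumes that a compiled template shows up in the server trace as a call to a generated class named `org.apache.jsp.<path>_jsp` with a `_jspService` method, a naming scheme used by some JSP compilers. The paper does not state which markers FireDetective relies on, so the pattern, the function names and the trace format below are all hypothetical, and name mangling of special characters in file names is not handled.

```python
import re

# Assumption (not from the paper): compiled templates appear in the trace as
# calls to a generated class "org.apache.jsp.<path>_jsp" whose rendering
# method is "_jspService".
JSP_CLASS = re.compile(r"^org\.apache\.jsp\.(?P<path>[\w.]+)_jsp$")


def jsp_file_for(class_name, method_name):
    """Return the template path for a compiled-JSP call, or None."""
    if method_name != "_jspService":
        return None
    match = JSP_CLASS.match(class_name)
    if match is None:
        return None
    # Package segments of the generated class mirror the template's folders.
    return "/" + match.group("path").replace(".", "/") + ".jsp"


def reconstruct_jsp_invocations(trace):
    """Map a flat trace to the template invocations it contains.

    trace: list of (class_name, method_name) pairs in call order.
    Returns a list of (trace_index, jsp_path) pairs, so that each
    reconstructed invocation stays linked to its position in the trace.
    """
    invocations = []
    for index, (class_name, method_name) in enumerate(trace):
        path = jsp_file_for(class_name, method_name)
        if path is not None:
            invocations.append((index, path))
    return invocations
```

Keeping the trace index next to each reconstructed path is what allows a viewer to jump from a template invocation back to the point in the trace where it occurred.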
The abstractions were identified through our own experiences as Ajax developers. In Section 4 we offer possible additions to this list. We used different mechanisms for recording and reconstructing these abstractions, and linking them to the relevant traces. These mechanisms are briefly described in Section 2.6.
2.3 Interactive Visualization
Three main views are used, each of which shows a different level of detail. The first view is a high-level view, which shows a tree representation of the aforementioned abstractions (except template invocations). Expandable tree nodes may reveal more detail, e.g., expanding an Ajax request node shows its relation to particular traces and calls, i.e., the life cycle of the request. The second view is a trace view which displays one execution trace at a time, as a call tree (this also means that different invocations of the same functions are represented separately). Each tree node represents a single call, with expandable subcalls. The third view is a standard source code view.
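The call-tree representation used by the trace view can be sketched as follows. This is a minimal illustration, assuming the recorder emits a flat stream of `("enter", name)` and `("exit", name)` events; that event format, `CallNode`, `build_call_tree` and `render` are hypothetical names, not FireDetective's actual data model.

```python
from dataclasses import dataclass, field


@dataclass
class CallNode:
    """One call in the trace; each invocation gets its own node."""
    name: str
    children: list = field(default_factory=list)


def build_call_tree(events):
    """Turn a flat stream of enter/exit events into a call tree."""
    root = CallNode("<trace>")
    stack = [root]  # stack[-1] is the call currently executing
    for kind, name in events:
        if kind == "enter":
            node = CallNode(name)
            stack[-1].children.append(node)
            stack.append(node)
        elif kind == "exit":
            if len(stack) == 1 or stack[-1].name != name:
                raise ValueError(f"unbalanced exit for {name!r}")
            stack.pop()
        else:
            raise ValueError(f"unknown event kind {kind!r}")
    if len(stack) != 1:
        raise ValueError("trace ended inside a call")
    return root


def render(node, depth=0):
    """Return the tree as indented lines, one line per call."""
    lines = ["  " * depth + node.name]
    for child in node.children:
        lines.extend(render(child, depth + 1))
    return lines
```

Because a new node is created on every `enter` event, two invocations of the same function appear as two separate subtrees, which matches the behaviour described for the trace view.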
A disadvantage of execution traces is that they can quickly grow to massive proportions. In order to reduce the size of traces, we use two simple, well-known trace reduction mechanisms (Cornelissen et al. 2008a). The first one is to filter out all library calls and only keep calls that are specific to the Ajax application that is being analyzed. Both client-side libraries (such as Dojo) and server-side libraries (such as Java EE server internals) are filtered out. The second mechanism concerns allowing the user to start and stop trace recording. This allows the user to time slice the Ajax application, and, for example, to find out how a particular interaction with the Ajax application is handled.
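Both reduction mechanisms amount to simple filters over the recorded calls. The sketch below assumes each call is a dictionary with a qualified `name` and a numeric `time`; that record layout, the prefix-based definition of "application code" and the function names are illustrative assumptions rather than the tool's implementation.

```python
def filter_library_calls(calls, app_prefixes):
    """Keep only calls whose qualified name starts with an application prefix.

    Everything else (client-side libraries, server internals) is dropped.
    """
    prefixes = tuple(app_prefixes)
    return [call for call in calls if call["name"].startswith(prefixes)]


def time_slice(calls, start, stop):
    """Keep only calls recorded between a start mark and a stop mark.

    The interval is half-open: start <= time < stop.
    """
    return [call for call in calls if start <= call["time"] < stop]
```

For example, with hypothetical names, filtering on the prefix `"petstore."` would drop a `dojo.xhrGet` call while keeping `petstore.showHeadline`, and slicing around a single button click would keep only the calls triggered by that interaction.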
2.4 Relation to FireBug
As a tool, FireBug is very important as it is currently widely used by Web developers. FireBug is also part of our experiment in Section 3.
2.5 Barriers to Comprehension
The tool solves this problem by computing a hash code for the response text of every Ajax request, and every “eval”-ed string. When the tool shows a fragment of “eval”-ed code and finds a matching Ajax response text hash, the tool can reconstruct the filename of the “eval”-ed code.
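The hash-matching idea can be sketched in a few lines. The paper does not say which hash function is used or how the table is stored, so SHA-256, the `EvalSourceResolver` class and its method names are assumptions made purely for illustration.

```python
import hashlib


def digest(text):
    """Hash a piece of text; SHA-256 is an assumption, not the tool's choice."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


class EvalSourceResolver:
    """Remembers Ajax response bodies so eval-ed code can be traced to a URL."""

    def __init__(self):
        self._url_by_hash = {}

    def record_response(self, url, response_text):
        """Store the hash of an Ajax response together with its URL."""
        self._url_by_hash[digest(response_text)] = url

    def resolve(self, evaled_code):
        """Return the URL whose response matches the eval-ed string, or None."""
        return self._url_by_hash.get(digest(evaled_code))
```

A match is only found when the "eval"-ed string is identical to the response text; code that is modified or concatenated before evaluation would not resolve under this scheme.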
2.6 Implementation Details
3 Design of the User Study
We used an exploratory pre-experimental user study to address our research questions: which strategies do Web developers currently use, and can dynamic analysis improve program understanding for Ajax applications? The type of experiment is called pre-experimental to indicate that it does not meet the scientific standards of experimental design (Babbie 2007), yet it allows us to report on facts of real user behavior, even those observed in under-controlled, limited-sample experiences. In particular, we use a one-group pretest-posttest design, which entails that there is only an experimental group and no control group. A consequence of this design is that it does not allow us to identify an event related to the dependent variable that intervenes between the pretest and the posttest and whose effects could be confused with those of the independent variable (Babbie 2007).
Part A: Observing current understanding strategies. Participants used a standard set of Web development tools: Eclipse and Firefox with the popular FireBug add-on. The purpose of this part is to provide insight into which strategies Web developers use when trying to understand Ajax applications, and whether these strategies are sufficient (RQ1).
Part B: Support through dynamic analysis. Participants used Eclipse and Firefox with FireDetective. The purpose of this part is to provide insight into whether dynamic analysis techniques as provided through FireDetective can improve understanding, and if so, how (RQ2).
Our approach is exploratory as we are still at an early stage in this research project. We focus on observing participants as they work on assigned tasks with and without the FireDetective tool. We asked participants to think aloud during the study, and since the study was conducted in a lab setting we were able to make audio and screen recordings for later analysis. We also gave questionnaires to the participants to determine their perceptions of the benefits of using dynamic analysis both before and after using FireDetective. After each part, participants were subjected to a short interview. In the following sections, these aspects are described in more detail.
3.1 Design of Part A: Observing Current Understanding Strategies
Part A started with a background interview and questionnaire, to gauge the development experience of the participant. This was followed by a 10-minute introduction to the tools used in this part of the study: Eclipse and FireBug. Since participants were likely to have experience with these tools (this was indeed the case, see Section 4), the introduction served mostly to refresh the participants’ memory.
After the introduction, participants worked on a set of program understanding tasks for 35 min. We emphasized that they could use any feature they wished, to minimize bias towards using the features that we had shown them. Participants were informed that they could move on to the next task if they failed to make progress on their current task, and that they could ask questions about the tools at any time (questions about the target application itself were not answered, for obvious reasons). Also, if the experiment leader noticed that a participant was struggling with a particular tool feature, the participant would be given a short explanation of the feature. Since our goal was to find out as much as we can about the strategies that participants use, we did not want them to be stalled with a tool issue for too long. A short interview asking participants about encountered problems concluded part A.
3.2 Design of Part B: Support Through Dynamic Analysis
Better understanding. Will the tool allow Web developers to understand Ajax applications more effectively?
Quicker understanding. Will the tool allow users to understand Ajax applications more efficiently?
More confident about understanding. Will use of the tool make Web developers more confident about their understanding of an Ajax application?
Minimal value. This attribute is inversely related to the above attributes. Will the tool provide value?
A. Software development experience
1 = never used it
2 = used it for a couple of hours or less
3 = used it for one or two projects
4 = I use it regularly
5 = I’ve been using it regularly for over two years now
2. Java Server Pages (JSP)
8. Java PetStore
B. Understanding Web applications
For each of the next statements, please indicate to what extent you agree with them, ranging from 1 (completely disagree) to 5 (completely agree).
1. Such a tool could allow me to better understand Web apps.
2. Such a tool could allow me to be more confident that I really understand the Web application that I’m investigating.
3. The value added by such a tool will be minimal.
4. Such a tool could save me time.
For each of the next statements, please indicate to what extent you agree with them, ranging from 1 (completely disagree) to 5 (completely agree).
A. Tool user experience
1. I found FireDetective easy to use.
2. FireDetective should have been integrated with Eclipse.
B. Tool adequacy
1. There’s added value in using dynamic (i.e., runtime) information for analyzing Web applications.
2. The value added by a tool like FireDetective is minimal.
3. A tool like FireDetective saves me time.
4. A tool like FireDetective allows me to better understand Web apps.
5. A tool like FireDetective makes me more confident that I really understand the Web application that I’m investigating.
C. Tool features
Below are a number of features of the FireDetective tool. Please select your top 3 features. Put a “1” next to the best feature, a “2” next to the second best, and a “3” next to the third best feature.
Please mark features that you didn’t find useful with an ‘X’.
2. Being able to directly jump from (Ajax) requests to the corresponding server side code.
3. Real-time trace analysis, i.e.: (almost) no delay between capturing traces and analyzing them.
4. Marking sections of a trace using the Firefox add-on, by using “Begin mark” and “End mark”.
5. Being able to easily track xhr (Ajax) requests.
6. Filtering packages and java files based on the current page or trace.
After a 10-minute introduction in which we showed all features of FireDetective, participants worked on a different set of program understanding tasks for 25 min. The decision to have less time for this part was made to keep the complete duration of the study under two hours.
Working on the tasks was followed by the posttest questionnaire. We also asked participants to rate their top 3 features in FireDetective. Finally, another short interview was conducted, asking about encountered problems, least and best liked parts of the FireDetective tool and suggestions for improvement.
3.3 Target Application
The Java BluePrints library is used extensively in the Pet Store, and we found that not including its client-side code limited us in the task design. Moreover, this code would show up in FireBug and FireDetective anyway. Hence, we made sure that all client side code that was potentially visible in FireBug and FireDetective could also be found in Eclipse. This amounted to +6KLoc for BluePrints and +97KLoc for Dojo.
3.4 Task Design
Task set A
Task 1—the headline bar
Near the top of most pages of the Java Petstore application is a gray headline bar. The headline text switches from time to time.
b) Explain how these functions call each other when the text switches, and how they keep the switching going.
c) From what Web URL does the application get the headlines? Where in the code can you change that? (give file name + line number)
Task 2—server code
The Petstore consists of 6 sub pages: home, seller, search, catalog, maps and tag.
a) Which of those sub pages call—either directly or indirectly—methods of the GeoCoder class? (package: com.sun.javaee.blueprints.petstore.proxy)
b) Is the class SQLParser (package: com.sun.javaee.blueprints.petstore.search) being called—either directly or indirectly—on the search page?
Task 3—seller page
Navigate to the seller page.
a) Clicking on the next button does not trigger the form’s validation check. The Java Petstore manager has encountered several users who complained about this. He asks you to change the pet store, such that validation is also performed after clicking the next button. Which function or method do you need to modify? How do you modify it?
b) The user is required to enter a city and state on the second page of the form. As the user types in these text fields, an auto complete box shows up that allows the user to select cities and states but only US cities are listed. Of course, this is unacceptable! Which parts of the application (e.g. which functions or class methods) need to be modified for Canadian provinces and cities to show up in the auto complete box?
Task 4—popup view
Navigate to the search or tags page. Note that a popup appears when you hover over a pet name with the mouse.
b) What Java classes and JSP files are involved on the server side?
c) How come the popup doesn’t appear if you quickly hover over a description?
Task set B
Task 1—search & tag page
Navigate to the search page and click “Submit”. Clicking the little icons under “Map” allows you to (un)check all checkboxes.
Navigate to the tags page. Notice how you can click the tags to update the list.
b) Does clicking the tags trigger an ajax request? If yes, which JSP file(s) and server class(es) are involved?
Task 2—server code
a) Is the IndexDocument class (package: com.sun.javaee.blueprints.petstore.search) really being used on the search page? If yes, give a possible chain of events/calls leading to a use of the class (e.g., “user moves mouse” → handled by handleEvent → etc. → calls IndexDocument).
b) The Petstore consists of 6 sub pages: home, seller, search, catalog, maps and tag. Which of those sub pages make use—either directly or indirectly—of the EntryFilter class? (package: com.sun.javaee.blueprints.petstore.controller)
c) What is the purpose of the ImageAction class (package: com.sun.javaee.blueprints.petstore.controller.actions)?
Task 3—catalog page
Navigate to the catalog page.
a) The Pet store owner asks you to speed up the scrolling of the filmstrip at the bottom. Which parts of the code do you modify? (give file names(s) + line numbers)
b) Go to the “fish” and then “small fish” category. Scroll to the right in the bottom bar and try to flag “Nick’s goldfish” as inappropriate. This is supposed to delete the pet. Why does this not work? (i.e.: where’s the bug?)
Task 4—catalog & search page
Go to the catalog page.
a) You can click on the “star strip” to rate a pet. How is the rating computed?
Go to the search page.
c) At the moment, searching causes the whole page to refresh. In order to improve the user experience of the pet store, we want to ajaxify this process: i.e., we want search results to appear without refreshing the page. Which parts of the application (client side + server side) would we need to modify?
For the generalizability of the study it is important to make sure that the tasks are realistic and that they accurately represent a significant part of the program understanding task domain. Therefore, we used open-ended questions rather than multiple choice questions. Moreover, we designed tasks using Pacione et al.’s taxonomy of 9 principal activities (Pacione et al. 2004), and strove for coverage of the first 6 activities they suggest: A1. Investigating the functionality of (a part of) the system; A2. Adding to or changing the system’s functionality; A3. Investigating the internal structure of an artifact; A4. Investigating dependencies between artifacts; A5. Investigating runtime interactions in the system; A6. Investigating how much an artifact is used. We did not cover the last three activities, (1) to limit the number of tasks, (2) to reduce the risk of our participants becoming fatigued during the study and (3) because these three activities are less atomic and can be composed of several activities that are already captured in the first six activities. For completeness, the other three principal activities from Pacione et al. are: A7. Investigating patterns in the system’s execution; A8. Assessing the quality of the system’s design; and A9. Understanding the domain of the system.
Since we were keen to observe how FireDetective would be used on unfamiliar code, we strove to choose tasks for the second set that would involve code not inspected in part A of the study. Nevertheless, a learning effect might be possible due to the fact that the software system remains the same. However, this should not impact our results, as we are not measuring an increase (or decrease) in efficiency from developers using FireDetective, but instead, we are gauging the FireDetective user experience.
3.5 Pilot Sessions
Three pilot sessions were conducted to fine-tune the study. Two of the three pilot participants were co-workers of the second author of this paper; the third pilot participant was recruited via the same route as all of the other participants. All pilot sessions were run in the same way as the actual study, except that we were particularly interested in whether all tasks were clear, whether the tasks were deemed too difficult, and in what other ways we could improve the settings of our experiment.
The first pilot session did not use think aloud, and it turned out to be hard to reconstruct the participant’s thinking steps. As a result, we switched to think aloud with audio and screen recordings. Also, the questionnaires were reduced in size, with more emphasis on participant interviews. To keep the total length of the study under 2 h, the duration of the second part (during which participants use FireDetective) was reduced from 35 to 25 min.
During the second pilot we found that the tasks were too difficult, so they were altered to make them slightly easier. To reduce pressure on participants, we decided to give out tasks one at a time. Also, at the beginning of the study we made it clear that if participants were unsure what to do next, they could indicate this and move on to the next task. Finally, FireDetective’s user interface was improved and simplified.
The third pilot session ran without major problems and only a few minor adjustments were made afterwards. In particular, we altered the introduction to Eclipse to exclude explanations of Eclipse features (such as “Call hierarchy”) as such explanations may bias participants towards using these features. Also, some of the task descriptions were adjusted to make them clearer.
3.6 Participant Profile
Our eight participants represent our target population quite well. Five had a professional Web development job: one full-time and four part-time. Two others had a professional software development job: one full-time and one part-time. Both of these participants indicated that they worked on Web development projects for at least a part of their jobs. Except for the two full-time developers, the six other participants were either computer science or software engineering students: four undergraduate and two PhD students. Participants’ median number of years of Web development experience was 2 years (min. 1 year, max. 5 years); it can be argued that this is a low number. However, technologies like Ajax have not been around for that long: at the time of our study, the term Ajax had been coined less than 5 years earlier (Garrett 2005). Moreover, the median number of years of software development experience was 5.5 years (min. 2 years, max. 10 years), which shows that participants did have general software development skills.
4 Findings and Discussion of the User Study
Our findings cover the Ajax understanding strategies currently used (part A, Section 4.1), as well as the way in which dynamic analysis in general and FireDetective in particular can support this understanding (part B, Section 4.2).
Correctness scores for part A of the user study (0.5 scores were given in case of a partially correct solution)
Correctness scores for part B of the user study (0.5 scores were given in case of a partially correct solution)
4.1 Part A: Observing Current Understanding Strategies
Central to the first part of the study is our first research question: “which strategies do Web developers currently use when trying to understand Ajax applications?” While participants were working with Eclipse and FireBug, we were able to make a number of observations.
First of all, participants relied almost solely on bottom-up comprehension strategies, i.e., starting at the lowest level—e.g., code fragments—and trying to piece the fragments that they found together. Participants mainly focused on exploring control flow relationships (Pennington 1987), i.e., finding definitions and/or occurrences of functions, methods and classes.
Another use of text search, specific to Web applications, was mapping an id of an element in the DOM tree (usually found through the FireBug element inspector) to where the id was used in the code. We also noticed more ad hoc uses of text search, such as searching for (part of) a URL or searching for some text of the Web page, used both successfully and unsuccessfully by participants to get an idea of where a particular element or URL was generated on the server.
Text search leads to a number of problems. Important results are sometimes missed because of cluttering of the search results window or choosing the wrong search scope. The biggest problem is that text search only allows the user to explore one control flow link at a time, making it easy to lose track. During a task when participants were required to follow a small but branching call tree, participants quickly lost track of which branches they had already explored, causing them to make mistakes: only two participants were able to provide a correct answer.
From this we conclude that the strategies that Web developers currently use can be improved. Participants rely mostly on looking at code and text search, which can be better supported by tools. Since following control flow constitutes a fairly big chunk of participants’ actions, supporting this process seems useful. Considering the incompleteness of static analysis and the highly dynamic nature of Web applications, our findings support our argument that dynamic analysis would be beneficial in tool support.
4.2 Part B: Support Through Dynamic Analysis
Central to this part of the study is our second research question: “Can dynamic analysis improve program understanding for Ajax applications?” We explore this question by considering whether dynamic analysis as provided by the FireDetective tool can be used to improve understanding of Ajax Web applications. Furthermore, if this is indeed the case, we would also like to learn more about how this works, and what we can do to further improve understanding. We obtained insights into these questions via four different routes: the pretest-posttest questionnaires, the questionnaire about feature usefulness, observations of participants using the tool and the final interview.
In particular, participants indicate that the tool can help them to understand Web applications more effectively (Fig. 4a) and more efficiently (Fig. 4b). Participants also seem convinced that the tool helps them to be more confident about their understanding of the Web application they are investigating (Fig. 4c), although their answers are somewhat more distributed compared to the other questions. One participant answered “strongly disagree” during the posttest, as can be seen from the figure. Interestingly enough, when asked why this was, the participant answered that the tool made some tasks almost too easy: “It seemed like I caught [the answer] a lot quicker than I was expecting, so that questioned how much I really trusted the results that I came up with.” Finally, participants acknowledged that the tool adds value (Fig. 4d).
While these are preliminary findings, we consider them encouraging. They suggest that FireDetective, which leverages dynamic analysis techniques, is indeed capable of improving program understanding for Ajax applications.
All participants used the first three features (F1, F2, F3). This is not too surprising, since these features are central to the tool. Six out of eight participants used the time slice feature (F5) and 4 participants briefly explored the life cycle feature (F4). Use of F6 is implicit. Participants’ subjective preferences towards features are shown in Fig. 5. We can see that there is no clear preferred feature. However, we can observe some trends, which may give us some insight into how FireDetective helped improve program understanding.
The high-level overview (F1) and time slicing of the high-level view (F5) seem to be popular with three #1 votes each, as well as jumping between client and server (F3)—two #1 votes. A possible explanation for this popularity could be that these three features all play a role in enabling a more top-down understanding process, which, as we could see from part A of the study, participants did not previously use. Rather than starting with low-level code, participants can now look at abstractions such as Ajax requests and DOM events and use them as starting points to explore the code. The filtered files view (F2) has the largest number of votes in general, and may play a similar role. From part A of the study, we saw that participants often did not know all of the files that were relevant to a certain page of the Pet Store: the filtered files view provides an initial overview of these relevant files, such that participants have a better starting point for investigation.
It is hard to determine exactly which elements of FireDetective are the main contributors to its usefulness. Some features are untestable via a questionnaire, such as “code view” and “naming of anonymous functions” (automatic): these features are used all the time, but because of that it can be hard for participants to determine whether these features are actually useful.
4.2.3 Observations and Interview
Besides the expected learning curve and usability issues (see Matthijssen 2010), participants encountered a number of issues when working with FireDetective.
One interesting issue that several participants encountered had to do with Java servlet filters, server-side classes defined by the Web application that process requests. Since the tool records calls to all methods, it also shows calls made to filter classes. However, it cannot show why these calls occur, since the internal server logic that calls the filters is hidden from view, and even if the tool were to show these internal calls, it would produce a distorted picture, since the real cause of the filter being called is a binding specified in an XML file. During the study, several participants encountered this problem. They were wondering why the EntryFilter class of the PetStore was invoked, but the tool was unable to give them this information.
Finally, participants were slightly confused by the way the tool presents full-page requests. The high-level view was filtered to show only the last full-page request. However, participants did not always notice this, causing them to think that they were dealing with an Ajax request, while it was actually a full-page request. This confused them because they were looking for an Ajax request that did not exist.
When asked about potential tool improvements, participants often indicated integration with FireBug, providing a first indication that features of FireBug and FireDetective are considered complementary. Participants also asked for mechanisms to reduce the amount of visible information: they were sometimes overwhelmed by the information shown. Since we used only basic trace visualization and reduction techniques, this was to be expected. Participants asked for particular static analysis techniques, such as full text search, possibly because they are attached to their old way of working, but probably because static and dynamic analysis are complementary techniques.
Different kinds of visualizations. FireDetective’s visualizations are straightforward representations of the recorded abstractions and traces. Only simple trace reduction techniques were used, which—expectedly—caused participants to be overloaded with information on various occasions. We should investigate how to visualize the connected network of abstractions, traces and code in better ways.
Integration with existing tools. From the study it became clear that FireDetective and FireBug are complementary tools. It could be interesting to investigate how these tools exactly complement each other and how they can be integrated more tightly.
5 Design of the Field User Study
The user study we describe in Sections 3 and 4 helps us understand how a tool such as FireDetective can aid in understanding Ajax applications. However, the time that each participant spent with FireDetective was limited to 25 min. Moreover, the study was also conducted in a controlled lab setting using mainly student programmers (although most had industrial experience) and doing preassigned tasks. What we lack is a more in-depth look at how FireDetective can be used in more open ended tasks. Furthermore, we also wanted to gauge whether professional Web developers with more background in Ajax and related technologies would have a different opinion on the strengths and weaknesses of FireDetective. This section describes the experimental setup of a user study conducted in a field setting with these goals in mind. This field user study is meant to strengthen and broaden our insights into RQ2 in which we investigate whether dynamic analysis, as presented through the FireDetective tool, can improve program understanding for Ajax applications.
5.1 Field Setting
In order to get a more in-depth perspective of how FireDetective would be used by seasoned and experienced professional programmers, we set up a field user study with two professional Web developers from a company called Mendix. Mendix was founded in 2005 as a spin-off from the Delft University of Technology and the Erasmus University of Rotterdam. In 5 years, the company has grown from a start-up to a company with 75 employees. Their core business is to rapidly develop business applications that can easily be integrated into existing IT environments. Mendix is at the forefront of technology, as they are actively using relatively new technologies from the realms of model-driven engineering (MDE) and Ajax. Mendix was kind enough to be willing to cooperate in evaluating FireDetective and two experienced Web developers volunteered to participate in our field user study.
5.2 Experimental Setup
The field user study consisted of a single half-day session with the two developers from Mendix. The study was again centered around the Java Pet Store (see Section 3.3). This provided us with the benefit that we already had a reasonable level of knowledge of the application, while the two developers that participated in the field user study had no experience with it.
Step 1: Demo of FireDetective
We started the session with a short demo of FireDetective, in which we showed the two developers all features of FireDetective on a very small toy Ajax application.
Step 2: Free exploration
In this step we asked the two developers to freely explore the Java Pet Store application for around 1.5 h. We gave them the goal of getting a good understanding of the implementation of the major functionality in the Java Pet Store, and told them that we would discuss the implementation details of this functionality later on during the session. We provided technical assistance during the user study. The two developers were allowed to work in pair-programming style and we asked them to think aloud, which allowed us to gain insight into their way of working with FireDetective.
Step 3: Questionnaire
When the two developers were satisfied with their reconnaissance mission of the Java Pet Store, we gave them the same set of questions that we also used in the posttest of our user study (see Section 4.2). This step only took a few minutes and was mainly done to be able to compare both groups of subjects.
Step 4: Contextual interview
While we already gained quite a lot of information during the free exploration phase, we intensified the interview once the developers felt they were comfortable with their understanding of the Java Pet Store. In particular, we used a contextual interview (Holtzblatt and Jones 1995). This contextual interview already started during the second step, where we observed how the developers explored the system. In particular, we took note of which questions they were asking and how they were using FireDetective to answer these questions. Subsequently, we continued the interview and we aimed to further explore the possibilities of FireDetective and identify circumstances in which FireDetective can be of benefit. In order to steer this conversation, we used the work by Sillito et al. (2006), who identified a set of 44 typical questions developers have when maintaining a piece of software. In order to save the participants’ time, we focused on questions that could benefit from the dynamic analysis of Web applications. The eliminated questions could either easily be answered using other static analysis features in an IDE such as “Where does this type fit in the type hierarchy?” or they were not pertinent for Web applications. See Table 8 for the 25 questions that we asked the developers. For each of the questions that we discussed, we asked the two developers to rate the usefulness of FireDetective to answer the question. For this, we used a 5-point Likert scale that ranged from “totally disagree” (score 1) to “totally agree” (score 5). This interview took close to two hours and during the interview, the developers frequently went back to FireDetective to get a good appraisal of FireDetective on specific aspects that are highlighted by Sillito et al.’s questions.
5.3 Participant Profile
6 Findings and Discussion of the Field User Study
We already indicated in Section 5 that the developers first had the opportunity to freely explore the Java Pet Store application for 1.5 h. During this free exploration they formed and refined hypotheses. When they were satisfied with an initial hypothesis on the implementation of a particular feature, they started investigating the application itself in order to verify their hypothesis.
After the free exploration of FireDetective, we presented the two Mendix developers with a short questionnaire (Section 6.1), and we continued our contextual interview with the aim of getting more in-depth feedback on tool requirements (Section 6.2).
About FireDetective as a tool
Results to the questionnaire about the FireDetective tool on a scale of 1 (totally disagree) to 5 (totally agree)
I found FireDetective easy to use
FireDetective should be integrated into Eclipse (or another IDE)
There’s added value in using dynamic analysis for analyzing Web applications
There’s added value in using dynamic analysis for understanding Web applications
A tool like FireDetective is likely to save me time
A tool like FireDetective allows me to better understand Web applications
A tool like FireDetective makes me more confident that I really understand the Web application that I’m investigating
While both developers filled in their questionnaire forms individually, their scores agree on almost all statements. When we compare their answers to Fig. 4, subgraphs (a), (b) and (c), we see that these two experienced developers are at least as positive about FireDetective as the participants from the user study were, if not slightly more so.
Features of FireDetective
6.2 Contextual Interview
The 25 questions that we selected from Sillito et al. (2006) and that we asked the developers, including the score of the developers with 1 indicating “Totally disagree” and 5 indicating “Totally agree”
Where in the code is the text in this error message or UI element?
Where is there any code involved in the implementation of this behavior?
Where is this method called or type referenced?
When during the execution is this method called?
Where are instances of this class created?
Where is this variable or data structure being accessed?
What are the arguments to this function?
What are the values of these arguments at runtime?
What data is being modified in this code?
How are instances of these types created and assembled?
How are these types or objects related?
How is this feature or concern implemented?
What is the behavior these types provide together and how is it distributed over the types?
What is the “correct” way to use or access this data structure?
How does this data structure look at runtime?
How can data be passed to (or accessed at) this point in the code?
How is control getting (from here to) here?
Why isn’t control reaching this point in the code?
Which execution path is being taken in this case?
Under what circumstances is this method called or exception thrown?
What parts of this data structure are accessed in this code?
What is the mapping between these UI types and these model types?
Where in the UI should this functionality be added?
To move this feature into this code, what else needs to be moved?
How can we know this object has been created and initialized correctly?
We first gave the two Mendix developers the aforementioned set of questions and asked them to think about the questions and rate the usefulness of FireDetective for answering each of the questions, on a 5-point Likert scale. In a subsequent step, we compared the responses of the developers as a group and tried to bridge differing opinions. In all cases where we identified differing opinions (2 cases, to be precise), this was due to a different interpretation of the question. In addition to comparing notes, we also collected anecdotes on how FireDetective was useful for answering a question. In particular, when we were discussing the questions as a group, the developers frequently referred to using some FireDetective functionality to investigate the implementation of a particular feature. In the following paragraphs, we highlight some of the more interesting anecdotes. It is also important to note that FireDetective was available to the developers during the entire interview, so that they could check up on certain ideas or opinions.
Where in the code is the text in this error message or UI element? Both developers agreed here that FireDetective is useful (score: 4) and that finding the origins of an error message in the trace is actually more efficient than using a textual search tool, which would sometimes need to be applied on both the client and server-side code in order to find the origins of an error message.
Where is there any code involved in the implementation of this behavior? Many times during the exploration of the Java Pet Store case, the two developers formed a hypothesis that a particular functionality was purely implemented on the client side, or both on the client and the server-side. FireDetective helped them to verify their hypotheses. As an example, the two Mendix developers hypothesized that when the Java Pet Store puts a marker on the Google Map when clicking on the address of a physical shop, this functionality is purely implemented on the client-side. Investigation with FireDetective confirmed their initial hypothesis. The developers valued FireDetective’s ability to answer this question with scores of 4 and 5.
How is this feature or concern implemented? Both developers strongly agreed that FireDetective comes in very handy for answering this question and both gave a score of 5 for FireDetective’s ability to answer this question. Both developers really appreciated the trace marking feature of FireDetective, which effectively enabled them to do feature location (Wilde and Scully 1995).
How can data be passed to (or accessed at) this point in the code? Both developers immediately indicated that FireDetective makes it very easy to see how data can be passed to a point in code, but how it is accessed is not always clear. For the first option, both developers decided to give a score of “totally agree” (5) for FireDetective’s capabilities to support this question. What they particularly liked about FireDetective in this regard is its ability to connect the client and the server-side code, with the added benefit of being able to navigate both the client and the server-side code from within FireDetective. This effectively allows them to follow the control flow, which helps to understand how data can be passed.
How is control getting (from here to) here? Again, both developers indicated that FireDetective supports this question very well (score of 5 by both). Furthermore, the developers stressed that FireDetective allowed them to get a really good overview of everything that is happening at the client-side, and while they appreciated the connection to the server, they thought that the main strength of FireDetective was its way of visualizing calls and interactions that occur within the client-side (browser). At this point, they also compared FireBug with FireDetective, stating that for understanding purposes FireDetective is superior because it provides a better overview of what is going on, compared to the breakpointing facility offered by FireBug.
Which execution path is being taken in this case? FireDetective’s facilities to answer this question were rated with a score of 5 by both developers.
Additional insights obtained from the contextual interview
In this section we will briefly discuss some of the additional insights that we gained during the contextual interview with the two developers. These additional insights could not be directly mapped to one of the comprehension questions in the previous part.
One of the developers suggested to incorporate profiling functionality into FireDetective, to provide more obvious insights into the performance of the application. This would effectively mean integrating some of the functionality of DynaTrace, a tool both developers knew (also see Section 8), into FireDetective. There was actually no consensus on the benefits of integrating all functionality (for understanding and for profiling) in one tool versus using two separate tools, while each separately would possibly be more powerful in its own right.
With regard to FireBug, the two Mendix developers would appreciate the ability to investigate the actual values of parameters, variables and return values in FireDetective, a feature that FireBug currently does have. Intuitively, one of the developers wanted to start up FireBug during the exploration phase, but he quickly abandoned this route when he realized that FireDetective and FireBug could not be used on the same installation of Firefox.
They finished the interview by stating that they felt that FireDetective excelled at giving the developer a feeling of confidence in his understanding, which they rated as more important than any time gain from using FireDetective.
The two experienced Web developers from Mendix that we recruited for this user study have provided us with additional indications of the perceived usefulness of FireDetective as a program comprehension tool for understanding complex Ajax Web applications. Their opinions are similar to the results that we obtained from our first user study.
In particular, if we compare the results from the posttest of the user study (see Fig. 4) to the opinion of the two Mendix developers, we see a similar trend: Web developers are convinced that FireDetective can help them in understanding applications. From the insights that we gained from interacting with the two expert developers, however, we gathered that while FireDetective might speed up the comprehension process, the ultimate benefit they see is that FireDetective increases their confidence in their understanding.
Also of interest to note is that the two expert developers found FireDetective’s features to investigate client-side interactions to be the most useful, because it is at the client side that things often become difficult to follow. They still appreciated the fact that they also had the opportunity to investigate the server-side behavior, without having to make a context-switch to an IDE or another tool.
Furthermore, the hypothesis-driven approach that the developers followed for understanding the Java Pet Store gives an indication that they were following a top-down comprehension strategy, where they created a hypothesis, executed part of the application under study, and started investigating the behavior drilling down to the source code level to accept or reject their hypothesis. This is similar to what we saw in our user study.
7 Threats to Validity
7.1 Internal Validity
Participants might have been inclined to rate the tool more positively than they actually value it, because they might have felt this was the more desirable answer. For both user studies, we mitigated this concern by indicating to participants that only honest answers were valuable. Nevertheless, we recognize that such a bias possibility still exists.
Next, the introduction sessions might have biased participants towards using the features that we showed them. We tried to neutralize this threat in the following way. During the introduction session for part A of the user study we only showed participants basic information on where they could find the different parts (i.e., server-side code, client-side code) within the Eclipse project, and the basic FireBug views. Explanations of other features were not included and participants were told they could use any feature they desired. For part B of the user study, we made sure to explain all features of FireDetective, so that participants would not be biased towards using any feature in particular. For the field user study, we showed all features of FireDetective to the two developers.
The tasks of the user study might have been too easy or too difficult. However, through three pilot sessions we adjusted the task difficulty level accordingly. Also, participants of the user study might have felt time pressure, causing them to behave differently. We minimized this problem by telling them that the number of tasks completed was not important and by handing out tasks one at a time, without revealing how many there were to come. For the field user study, we indicated that the free exploration period would take approximately 1.5 h, but that if the developers felt the need, they could continue beyond this time frame.
7.2 External Validity
A concern regarding the generalizability of the results of the user study is that most participants of the user study were students. However, as shown in Section 3.6 a lot of these participants had a relevant part-time job. Participants of the user study were not familiar with two of the technologies used in the study, JSP and Dojo. We admit that the learning curve involved has likely impacted the results. Yet, we also think that this impact is limited because both JSP and Dojo are technologies that are very similar to rivaling technologies. Moreover, participants were given a brief introduction to JSP, and were allowed to ask questions about the technologies involved at any time. Additionally, for the field user study that we performed, we recruited two experienced Web developers and while they did not execute the same tasks, their opinion of FireDetective does not differ much.
The Java Pet Store, our target application, is a showcase application. This might cause one to question whether this application is representative of a real-world Ajax application. However, the application represents the state-of-the-practice and manual inspection of the application shows that it uses Ajax on most of its pages and is clearly more than just a “toy example”. Moreover, the application has been used in previous program understanding research efforts, e.g., (Li and Wohlstadter 2009). While the two experienced Web developers that we recruited for the field user study had no prior knowledge of the Java Pet Store, they acknowledged that this project contains many of the typical (Ajax) idioms that one would also find in industrial-strength Ajax Web applications, which is an extra argument for the representativeness of the Java Pet Store application.
The tasks for the user study might not have been representative of real-world tasks. Because of the limited time frame, tasks are likely to be shorter than real-world tasks, and they might not have covered all program understanding aspects. We tried to mitigate this threat by using Pacione’s framework of principal comprehension activities (Pacione et al. 2004) to make sure that the tasks are realistic and cover a significant portion of the program comprehension spectrum.
Similarly, the questions that we discussed with the two Mendix developers might not have been realistic. In order to mitigate this threat, we reused the list of questions that was previously identified by Sillito et al. (2006).
Both the user study that we describe in Section 3 and the industrial field study from Section 5 let the participants deal with a software system that they are not familiar with; both experiments deal with a situation in which the participants are considered software immigrants (Sim and Holt 1998), i.e., developers that are getting to know the domain of the case study, in this case the Java Pet Store. As such, the findings that we report upon are based on developers trying to find their way in a previously unknown software system and do not reflect situations where developers are already familiar with the system. We acknowledge that a follow-up study should also investigate the usefulness of FireDetective in situations where the developers are already familiar with the domain.
8 Related Work
Reverse engineering approaches can generally be categorized into static, dynamic or hybrid (combining static and dynamic) analysis techniques. This section provides a brief overview of approaches that use dynamic analysis (Cornelissen et al. 2009a). We start by discussing some general trace analysis techniques, after which we focus specifically on techniques for reverse engineering and understanding Web applications.
8.1 Trace Analysis
Trace analysis concerns itself with sequences of run-time events and how these sequences can be used to gain insight into the workings of the program. Since traces may quickly grow to massive proportions (Zaidman et al. 2006b, 2005, 2006a), we need ways to deal with their size (Cornelissen et al. 2009a). We consider two common ways to do so: trace reduction and trace visualization, which are often combined.
In language-based filtering methods, particular kinds of programming constructs are omitted from a trace without sacrificing too much of the information the trace conveys. Examples are getters and setters that are called from within a class (when called between classes, getter and setter accesses can indicate important relationships!), and constructors and destructors of unimportant or unused objects (Cornelissen et al. 2007a; Hamou-Lhadj and Lethbridge 2003). We can also filter elements of the program or its libraries, i.e., calls to specific components, classes, methods, etc.
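To make the idea of language-based filtering concrete, the following is a minimal sketch rather than the implementation of any of the cited tools. It assumes a hypothetical `Call` record and a simple naming heuristic (`getX`/`setX`) for recognizing accessors; the class and method names in the example trace are invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Call:
    caller_class: str  # class containing the call site
    callee_class: str  # class that declares the called method
    method: str        # name of the called method


def is_accessor(method: str) -> bool:
    """Naming heuristic (an assumption): getX and setX are accessors."""
    for prefix in ("get", "set"):
        rest = method[len(prefix):]
        if method.startswith(prefix) and rest[:1].isupper():
            return True
    return False


def filter_trace(trace, ignored_classes=frozenset()):
    """Drop intra-class accessor calls and calls into ignored classes."""
    kept = []
    for call in trace:
        if call.callee_class in ignored_classes:
            continue  # element-based filtering, e.g. library classes
        if is_accessor(call.method) and call.caller_class == call.callee_class:
            continue  # accessor used inside its own class: little information
        kept.append(call)
    return kept


# Hypothetical trace; the names are illustrative only.
trace = [
    Call("Cart", "Cart", "getItems"),      # intra-class getter: dropped
    Call("Checkout", "Cart", "getItems"),  # inter-class getter: kept
    Call("Checkout", "Logger", "debug"),   # ignored class: dropped
    Call("Checkout", "Order", "submit"),   # ordinary call: kept
]
filtered = filter_trace(trace, ignored_classes=frozenset({"Logger"}))
```

Note that the getter called from `Checkout` survives the filter, reflecting the observation above that accessor calls between classes can indicate important relationships.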
Metric-based filtering methods can be used to determine which parts to keep and which parts to discard from a trace. Examples are: using stack depth as a metric, i.e., filtering all calls above a specific depth (Pauw et al. 1998) or below a specific depth (Cornelissen et al. 2007a). Hamou-Lhadj and Lethbridge put forward a utilityhood metric that indicates the probability that a specific method is a utility method, which is based on fan-in and fan-out analysis, and use a threshold value to filter parts of the trace (Hamou-Lhadj and Lethbridge 2006). A similar technique for finding important classes has been proposed by Zaidman and Demeyer (2008).
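As an illustration of the simplest metric mentioned here, stack depth, the sketch below keeps only those events whose depth lies within a chosen range. The `(depth, name)` event representation and the function names are assumptions made for this example; the utilityhood metric is deliberately not reproduced.

```python
def filter_by_depth(events, min_depth=0, max_depth=None):
    """Keep (depth, name) events whose depth lies in [min_depth, max_depth]."""
    return [
        (depth, name)
        for depth, name in events
        if depth >= min_depth and (max_depth is None or depth <= max_depth)
    ]


# Hypothetical trace: depth 0 is the entry point of the execution.
events = [
    (0, "onClick"),
    (1, "loadCatalog"),
    (2, "buildUrl"),
    (3, "encodeParam"),
    (1, "render"),
]

shallow = filter_by_depth(events, max_depth=1)  # high-level overview
deep = filter_by_depth(events, min_depth=2)     # low-level details only
```

Which side of the threshold to discard depends on the comprehension task: a shallow cut gives an overview of a feature, whereas a deep cut isolates low-level details.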
Trace summarization is meant to find patterns within traces in order to compact these patterns. Typically, there are a lot of those patterns, since programs often contain repetitions, and “execution patterns of iterative behavior rarely justify the space they consume” (Pauw et al. 1998). Examples are methods based on string matching (Systä et al. 2001), run-length encoding or grammars (Reiss and Renieris 2001), techniques that are borrowed from the signal processing field (Kuhn and Greevy 2006; Zaidman and Demeyer 2004) and approaches that use information from source code (Myers et al. 2010). A question that arises when identifying patterns is how far we should go with generalizing parts of traces to patterns. Seldom will we see many exact recurrences of a pattern. Instead, each recurrence often differs by a slight amount (Pauw et al. 1998). De Pauw et al. propose various measures to decide which parts can be considered equivalent, such as: class identity (the same classes are being called), message structure identity (the same methods are being called) and repetition identity (different numbers of repetition are considered the same) (Pauw et al. 1998).
Trace visualization is a popular research area: many techniques have been suggested. Sequence diagrams, and variations of them, are the most common way to visualize execution traces. Bennett et al. investigate the importance of several features of sequence diagrams, and provide a survey of different approaches (Bennett et al. 2008). Rather than mentioning every trace visualization technique that has been proposed over the years, we mention several techniques that, in our opinion, are among the more interesting and novel ones. Reiss (2003) puts forward a real-time visualization of program activity in the form of real-time-box views. Such a view consists of a grid in which every square represents information about a single program element (e.g., class, method, etc.). Ducasse et al. take this idea a step further by introducing polymetric views, a more general version of the former views (Ducasse et al. 2004). For example, instead of squares in a grid, they use nodes in a graph to represent program elements. Cornelissen et al. describe the idea of circular bundle views, in which a system’s components are shown on the boundary of a circle, and bundles within the circle represent relationships between components (Cornelissen et al. 2007b, 2008b). In subsequent research, they also investigated the effectiveness of their circular bundle views in a controlled experiment (Cornelissen et al. 2009b).
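The following sketch illustrates two of the ideas above on a trace represented as a plain list of event names: run-length encoding of consecutive repetitions, and a comparison in the spirit of repetition identity, in which the number of repetitions is ignored. It is a simplified illustration under these assumptions, not the algorithm of any of the cited approaches, and the function names are our own.

```python
def run_length_encode(trace):
    """Collapse consecutive identical events into (event, count) pairs."""
    runs = []
    for event in trace:
        if runs and runs[-1][0] == event:
            runs[-1][1] += 1  # extend the current run
        else:
            runs.append([event, 1])  # start a new run
    return [(event, count) for event, count in runs]


def repetition_signature(trace):
    """Drop the counts, so traces that differ only in how often
    an event repeats consecutively get the same signature."""
    return [event for event, _ in run_length_encode(trace)]


# Two hypothetical traces that differ only in the number of reads.
a = ["open", "read", "read", "read", "close"]
b = ["open", "read", "close"]

encoded = run_length_encode(a)
same_pattern = repetition_signature(a) == repetition_signature(b)
```

Here `a` and `b` are treated as recurrences of the same pattern even though they are not identical, which is the kind of generalization decision the measures above are meant to support.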
8.2 Dynamic Analysis for Understanding Web Applications
Early Web application reverse engineering efforts were mainly focused on architecture reconstruction, e.g., (Hassan and Holt 2002; Ricca and Tonella 2001; Di Lucca et al. 2002; Tonella et al. 2002; Antoniol et al. 2004). Static analysis alone does not suffice because of the dynamic nature of Web applications (Tonella et al. 2002; Antoniol et al. 2004), so in most cases the static analysis is complemented by dynamic analysis. However, many client-side aspects that are common in Ajax applications are not taken into account.
De Pauw et al. (2005) present the Web Services Navigator, a tool that offers insight into message and transaction flows in systems of multiple Web services. The tool combines multiple Web service event logs to reconstruct meaningful abstractions in the Web service domain and has some similarities with FireDetective, albeit applied to a different domain.
Oney and Myers (2009) present FireCrystal, which enables a user to view a timeline of DOM events and DOM modifications, and view code coverage per DOM event.
Our approach differs from these last two approaches in a number of ways. First, our approach visualizes execution traces. Second, it combines client and server-side information to show a complete picture of an Ajax application. Third, it uses a different and larger set of abstractions from the Ajax/Web domain to link traces together (in contrast to only DOM mutations and DOM events).
Finally, there is one commercial tool of interest: DynaTrace Ajax.16 DynaTrace Ajax and FireDetective are quite similar: they both record execution traces, they both use abstractions from the Ajax domain to link traces, and they both combine client-side and server-side data. However, DynaTrace is primarily focused on performance analysis, whereas FireDetective is primarily focused on improving understanding. FireDetective lacks performance analysis features, but instead has features that aid the program understanding process, such as showing code in its original context.
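To illustrate what linking client-side and server-side traces can look like in principle, the sketch below joins the two kinds of traces on a shared request identifier. This is only a sketch under stated assumptions: the `ClientEvent` and `ServerTrace` records, the identifier scheme and all handler and method names are hypothetical, and the code does not describe how FireDetective or DynaTrace Ajax actually correlate traces.

```python
from dataclasses import dataclass, field


@dataclass
class ClientEvent:
    handler: str  # name of the client-side event handler
    request_ids: list = field(default_factory=list)  # Ajax requests it sent


@dataclass
class ServerTrace:
    request_id: str  # identifier shared with the client (an assumption)
    calls: list      # server-side method calls, in order


def link_traces(client_events, server_traces):
    """Pair each client event with the server traces of its requests."""
    by_id = {trace.request_id: trace for trace in server_traces}
    linked = []
    for event in client_events:
        matched = [by_id[r] for r in event.request_ids if r in by_id]
        linked.append((event, matched))
    return linked


# Hypothetical data: one purely client-side event, one that hits the server.
client_events = [
    ClientEvent("onAddressClick"),
    ClientEvent("onSearchKeyUp", ["r1"]),
]
server_traces = [
    ServerTrace("r1", ["SearchServlet.doGet", "CatalogFacade.search"]),
]

linked = link_traces(client_events, server_traces)
```

An event with no matched server traces corresponds to functionality that is implemented purely on the client side, which is the kind of hypothesis the developers in the field user study verified.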
9 Conclusions and Future Work
In this paper we have introduced FireDetective, a dynamic analysis tool for analyzing Ajax applications. FireDetective records execution traces on both the browser and server, captures information about Ajax/Web abstractions, and presents this information in a linked way.
RQ1 Which strategies do Web developers currently use when trying to understand Ajax applications? In the user study we gauged how the user study participants work with traditional Web development tools. We witnessed that participants mainly use a bottom-up approach, and heavily rely on text search. This strategy is ad hoc and problematic for understanding Ajax applications, of which the logic is spread over the client and server-side. It is our argument that tool support can improve on this situation.
During the field user study that we performed with two experienced Ajax Web developers, we witnessed a hypothesis-driven top-down understanding strategy. When understanding Ajax applications, the developers found that visualizing the client-side interactions was actually the most beneficial part of the FireDetective tool. On some occasions, they also investigated the server-side information that FireDetective provides, and they appreciated the fact that they could see all this information integrated into a single tool. Finally, the two experienced developers also noted that, while they are not sure whether FireDetective would save them time when trying to understand Ajax applications, they feel that FireDetective does make them more confident in their understanding.
We have designed and implemented FireDetective, a dynamic analysis tool for understanding Ajax applications.
We have shown how to employ abstractions in the Ajax/Web domain to link execution traces.
We have carried out a preliminary user study that showed us (1) how developers traditionally go about understanding Ajax applications and (2) that dynamic analysis techniques, in particular the trace analysis capabilities of FireDetective, can improve their understanding.
We have carried out a field user study with two experienced Ajax developers that gave us additional insight into how experienced Ajax developers use FireDetective for understanding complex Web applications.
An interesting avenue for future work is to explore ways to further improve program understanding of Ajax applications. At the same time we must carefully evaluate empirically how individual aspects and techniques affect the understanding process.
In order to strengthen our evaluation, it is our aim to pursue two distinct routes, namely a longitudinal study and a controlled experiment. The longitudinal study would explore the long term effects of using FireDetective in a real Web development environment, while the controlled experiment would give us additional insight into the actual effectiveness of FireDetective in terms of improvements in speed and correctness of maintenance tasks performed with or without FireDetective, similar to Cornelissen et al. (2011).
Incorporating some FireBug features into FireDetective, e.g., the inspection of values of parameters and return values.
Extending the server tracing facilities to encompass other platforms besides Java EE.
Creating a client-side plug-in for the WebKit browser development platform, which would also enable the investigation of Ajax Web applications developed for mobile platforms such as iOS or Android.
In a somewhat different direction for future work, it is our aim to investigate whether understanding rich client GUI applications suffers from the same difficulties as understanding Ajax Web applications. Furthermore, if our assumption of seeing the same difficulties in that domain is true, we might be able to reuse some of the ideas of FireDetective to also help developers understand these rich client GUI applications.
FireDetective is open source and can be downloaded from http://swerl.tudelft.nl/bin/view/Main/FireDetective.
The last two types of links were only implemented after conducting the user study.
FireBug version 1.5.0.
One of these two developers was an undergraduate student in the Software Engineering Research Group of the Delft University of Technology a few years back.
See http://ajax.dynatrace.com/. DynaTrace Ajax Edition was released in September 2009, after we built FireDetective.
We would like to thank Ian Bull for providing valuable comments on earlier versions of this paper. Additionally, our gratitude goes out to all volunteers that participated in our user study. We also want to thank Johan den Haan, Michiel Kalkman and Michel Weststrate from Mendix for enabling the field user study.
This work has been sponsored by the Center for Dependable ICT (CeDICT), an initiative of NIRICT, the Netherlands Institute for Research on ICT.
This article is distributed under the terms of the Creative Commons Attribution License which permits any use, distribution, and reproduction in any medium, provided the original author(s) and the source are credited.