Abstract
Program comprehension tools used with assembly language, often for maintaining legacy software or reverse engineering malware threats, are dated and lack rudimentary features found in tool support for higher-level languages. The need for people who can maintain these legacy systems is growing, as is the number of malicious cyberspace threats. To build new visualization and analysis tools within this domain, we need to understand the unique challenges faced by these developers. This paper presents the results of an exploratory case study to elicit requirements from two highly specialized groups of assembly language developers in industrial settings: a large multi-national company developing mainframe software and a government defense facility analyzing malware and security flaws. In addition to surveys, observations and interviews, this study applies social psychology and nominal group techniques. We provide a ranking and a detailed description of the requirements elicited in each group, and include additional requirements obtained from observational studies. We conclude that although the groups are similar at a high level, closer inspection shows that each has distinct tooling needs.
Acknowledgments
The authors would like to thank the members of the Alpha group and Beta group for participating in our research. This work was partially funded by NSERC (Natural Sciences and Engineering Research Council of Canada).
Appendices
Appendix 1: Script used during the nominal group session
This script was used for the 2-hour session with the Alpha group of six participants. The same script was used with the Beta group; the times should be adjusted to the size of the participant group.
| Time | Action | Script |
|---|---|---|
| 0-min introduction | SAY | Hi, I’m RESEARCHER NAME from UNIVERSITY NAME. For my PhD in Computer Science, I am exploring how visualization and tool support for assembly language might be useful. My work is being funded by COMPANY NAMES |
| | | Since you are experts in the area, we really value your experience and expertise in defining the issues |
| | | This is SECOND RESEARCHER NAME and I’ll let him introduce himself |
| | | University |
| | | Degree |
| | | Research interest |
| | | To get started, I’d like to collect the ethics forms that you were given yesterday |
| | DO | Collect the ethics forms |
| | SAY | This session should take no longer than 2 h, including a 20 min break. The aim is to discuss and critically rank all of the items from the exercise yesterday. If you come up with new ideas during the session, please add them to your list. Feel free to be creative |
| | | Does everyone have the blue pages? |
| | | First of all, I’d like to go around the table and have everyone introduce themselves and tell us about your job. We also know from the survey that your teams are expertise-centered, so it would be great to hear about that, as well as your interests |
| 10-min listing of ideas | SAY | Now to begin the group exercise, we will go around the table and each person will share one item from their list at a time. At this time, please avoid discussion or talking out of turn |
| | | After all of the items are listed, we will have a discussion to clarify the items. If you have any new ideas then feel free to add them to your sheet. If you want to skip a turn, that is also fine |
| | DO | Record word for word what each person says on the power point slide |
| 30-min discussion of ideas | SAY | We will now have a 30 min discussion on all the ideas generated |
| | | Now is the time to ask for clarification or elaboration on an idea, or dispute or defend an item |
| | | You are also welcome to suggest new items during this time, but no items can be eliminated |
| | | We’ll go through them item by item |
| | DO | Announce each item on the list and ask what it means, or how people feel about it. Record any new ideas on the power point slide |
| 60-min ranking to select the “top ten” ideas | SAY | Now if everyone could take out their yellow sheet for preliminary ranking |
| | | You can see there are 10 spaces to be filled in. You can select 10 items that are the most important for you from all of the options. Then assign them a rank which is a numbering between 1 and 10, where 10 is the most important |
| | | Once you are finished, please turn it face down on the table and then you are free to take a break for about 20 min |
| 70-min break | DO | Go around the table and transcribe and sum up the points from the ranking sheets onto the power point slides. Then reorder them on the slide based on the greatest number of points |
| | | Collect everyone from after their break |
| 90-min discussion of vote | SAY | We have reordered the items according to rank and you can see the score for them. We have also highlighted the top ten |
| | | We will now have a free-for-all discussion about the nature and content of the top ten |
| | | We would also like to hear how you feel about items that should have been included or excluded from this list |
| 110-min re-ranking and rating revised “top ten” items | SAY | Now if everyone could take out their green sheet for final ranking. Here you will again list the top ten items that you think are the most important |
| | | This may be the same ten, or feel free to modify which items are in your top ten |
| | | The ranking here is different in that 100 points will be given to the most important item. Every other item can have a value between 0 and 100. Two items can have the same ranking |
| | | Once you are finished, please hand in your sheets to me face down, and then we’re all done! |
| | DO | Collect the green sheets from everyone and tally up the final scores based on the 0–100 ranking |
| END CASE STUDY AT (START + 120 MIN) | | |
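The two tallying steps in the script (summing the preliminary 1–10 ranks, then the final 0–100 ratings, and reordering by total points) can be sketched as follows. This is only an illustrative sketch, not the procedure the researchers actually used for tallying: the function names, the dictionary representation of a sheet, and the alphabetical tie-break are all assumptions, and the item names in the usage note are hypothetical.

```python
from collections import defaultdict


def tally_rankings(sheets):
    """Sum the points each item received across all participants' sheets.

    `sheets` is a list of dicts, one per participant, mapping an item name
    to the points that participant awarded it. Items a participant left off
    their sheet contribute nothing. Returns (item, total) pairs ordered by
    total points, highest first. Ties are broken alphabetically, which is an
    assumption made here so that the ordering is deterministic.
    """
    totals = defaultdict(int)
    for sheet in sheets:
        for item, points in sheet.items():
            totals[item] += points
    return sorted(totals.items(), key=lambda pair: (-pair[1], pair[0]))


def top_n(sheets, n=10):
    """Return the n highest-scoring items, e.g. the "top ten"."""
    return tally_rankings(sheets)[:n]


def validate_final_sheet(sheet, max_items=10):
    """Check a final (green) sheet against the rules stated in the script:
    at most ten items, every value between 0 and 100, and the most
    important item given exactly 100 points. Equal values are allowed.
    """
    if len(sheet) > max_items:
        raise ValueError("a sheet may list at most %d items" % max_items)
    if any(not 0 <= points <= 100 for points in sheet.values()):
        raise ValueError("every value must lie between 0 and 100")
    if sheet and max(sheet.values()) != 100:
        raise ValueError("the most important item must receive 100 points")
```

For example, with two hypothetical final sheets `{"navigation": 100, "debugging": 80}` and `{"debugging": 100, "documentation": 100}`, `tally_rankings` places `debugging` first with 180 points, followed by `documentation` and `navigation` with 100 points each.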
Appendix 2: Issues observed at the Alpha group during activity-based protocol elicitation
| Requirement category | Issue | Description |
|---|---|---|
| First session | | |
| Browsing and navigation | XREF works on only 8 character long names | When there are more, search must be used, which only finds them one at a time in the code |
| | Bookmarking lines of code | Have to create names “a,” “b.” If the name already exists, it is just overwritten |
| | Lack of navigation | Need to scroll through many screens of code to look for the right spot |
| Build | | |
| Control flow | Hard to find main task | |
| | Tools would need to support multi-threading | |
| Data | | |
| Debugging | Timing issues were tricky | Timing dumps not useful because they are too complicated |
| | Couldn’t work out what was causing the cancel | Need some way to trap the event |
| | XREF plus debugger to find the correct place to debug | Step-through debugging might be helpful |
| De-obfuscation | Redundant code makes the code confusing to read | Statements such as branching to the next address. Unnecessary since that code is next to be executed |
| Documentation | Look up vendor error code in CA documentation | Not indexable online so need to download CA docs to search them. CA error code is then used to look up IBM Manual error code. Codes are OS version dependent |
| | User prints off whole modules | The printoff is portable and more comfortable to look at (easier on the eyes). There are also sticky notes and writing on the pages. These written notes include variable names, addresses and error codes |
| | The dump was scrolling off the page | There were so many errors, it did not fit. Need a way to condense it |
| Integration | | |
| References | | |
| Source control | Object module replacement | Overwrites whole module, have to be careful not to overwrite a change. Have to check prerequisite chain, and which fixes supersede others |
| Source editing | | |
| Second session | | |
| Browsing and navigation | *temp is used as a TODO | Shows up only when you dig into the module you’re interested in. Used pdsman to scan and find. Scan doesn’t show the active module however |
| | Switching terminal screens constantly | Need to scroll through many screens of code to look for the right spot. Kept many terminal screens open. Was hard to keep track of which showed the right code |
| Build | Register usage | Waits for compile error to say that the register is in use |
| | ? at the start of lines | To ensure you get errors, but do not want to deal with the actual errors (stub error) |
| | Scanning software for changes he knows he has to make | Otherwise waits for compile errors. Compile errors would be better if they occurred during editing. Context aware correction suggestions (i.e., does not exist, did you mean…?). Calls out to code that does not exist anymore |
| Control flow | | |
| Data | | |
| Debugging | No breakpoints in XDC | Puts code in to make it fail |
| De-obfuscation | | |
| Documentation | IBM Principles of Hardware Manual | Useful to double check some things |
| Integration | | |
| References | Code module: fan in, fan out | Wanted to know what module was being called dependent on the code and what code it depended on |
| Source control | | |
| Source editing | Tedious refactoring of modules | Splitting larger modules into smaller ones to use as templates. Templates are not useful, not maintained that much, but useful for people starting from scratch. Instead he uses something else he’s working on, copies it and butchers it (side by side editing) |
| | Forgot to save the file | No alert was given |
| | Code shortcuts | Stuff he does more than once |
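The fan-in and fan-out issue recorded under References in the second session can be illustrated with a small sketch. It assumes the call relationships between modules have already been extracted as (caller, callee) pairs; how those pairs would be recovered from assembly source is not covered here. The function name and the module names in the usage note are hypothetical, and nothing below describes a tool used by either group.

```python
from collections import defaultdict


def fan_in_fan_out(calls):
    """Compute, for every module, who calls it and whom it calls.

    `calls` is an iterable of (caller, callee) module-name pairs.
    Returns a dict mapping each module to a dict with two sorted lists:
    "fan_in" (modules that call it) and "fan_out" (modules it calls).
    """
    callers = defaultdict(set)  # callee -> set of modules calling it
    callees = defaultdict(set)  # caller -> set of modules it calls
    for caller, callee in calls:
        callees[caller].add(callee)
        callers[callee].add(caller)

    modules = set(callers) | set(callees)
    return {
        module: {
            "fan_in": sorted(callers.get(module, ())),
            "fan_out": sorted(callees.get(module, ())),
        }
        for module in sorted(modules)
    }
```

Given the hypothetical edges `[("MODA", "MODB"), ("MODA", "MODC"), ("MODB", "MODC")]`, the entry for `MODC` lists `MODA` and `MODB` as fan-in and has an empty fan-out, which answers both halves of the question in the table: what depends on a module, and what the module depends on.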
Cite this article
Baldwin, J., Teh, A., Baniassad, E. et al. Requirements for tools for comprehending highly specialized assembly language code and how to elicit these requirements. Requirements Eng 21, 131–159 (2016). https://doi.org/10.1007/s00766-014-0214-y
DOI: https://doi.org/10.1007/s00766-014-0214-y