Requirements for tools for comprehending highly specialized assembly language code and how to elicit these requirements

Baldwin, Jennifer; Teh, Alvin; Baniassad, Elisa; van Rooy, Dirk; Coady, Yvonne

doi:10.1007/s00766-014-0214-y

Original Article
Published: 09 October 2014

Requirements Engineering, Volume 21, pages 131–159 (2016)

Jennifer Baldwin¹,
Alvin Teh²,
Elisa Baniassad²,
Dirk van Rooy³ &
Yvonne Coady¹

Abstract

Program comprehension tools used with assembly language—often for maintaining legacy software or reverse engineering malware threats—are dated and fail to provide rudimentary features found in tool support for higher-level languages. The need for people who can maintain these legacy systems is growing, as is the number of malicious cyberspace threats. To build new visualization and analysis tools within this domain, we need to understand the unique challenges faced by these developers. This paper presents the results of an exploratory case study to elicit requirements from two uniquely specialized groups of assembly language developers in an industrial setting: a large multi-national company developing mainframe software and a government defense facility analyzing malware and security flaws. In addition to surveys, observations and interviews, this study applies social psychology and nominal group techniques. We provide a ranking, and detailed description, for the requirements elicited in each group. We further include additional requirements obtained from observational studies. The ultimate conclusion we reach is that while similarities exist at a high level, upon deeper inspection, each group is quite unique with regard to their tooling needs.

Acknowledgments

The authors would like to thank the members of the Alpha group and Beta group for participating in our research. This work was partially funded by NSERC (Natural Sciences and Engineering Research Council of Canada).

Author information

Authors and Affiliations

Department of Computer Science, University of Victoria, Victoria, BC, Canada
Jennifer Baldwin & Yvonne Coady
Department of Computer Science, Australian National University, Canberra, ACT, Australia
Alvin Teh & Elisa Baniassad
Department of Psychology, Australian National University, Canberra, ACT, Australia
Dirk van Rooy

Corresponding author

Correspondence to Jennifer Baldwin.

Appendices

Appendix 1: Script used during the nominal group session

This script was used for the 2-hour session with the Alpha group of six participants. The same script was also used with the Beta group; the times should be adjusted to the size of the participant group.

Time	Action	Script
0-min introduction	SAY	Hi, I’m RESEARCHER NAME from UNIVERSITY NAME. For my PhD in Computer Science, I am exploring how visualization and tool support for assembly language might be useful. My work is being funded by COMPANY NAMES
		Since you are experts in the area, we really value your experience and expertise in defining the issues
		This is SECOND RESEARCHER NAME and I’ll let him introduce himself
		University
		Degree
		Research interest
		To get started, I’d like to collect the ethics forms that you were given yesterday
	DO	Collect the ethics forms
	SAY	This session should take no longer than 2 hours, including a 20-minute break. The aim is to discuss and critically rank all of the items from the exercise yesterday. If you come up with new ideas during the session, please add them to your list. Feel free to be creative
		Does everyone have the blue pages?
		First of all, I’d like to go around the table and have everyone introduce themselves and tell us about your job. We also know from the survey that your teams are expertise-centered, so it would be great to hear about that, as well as your interests
10-min listing of ideas	SAY	Now to begin the group exercise, we will go around the table and each person will share one item from their list at a time. At this time, please avoid discussion or talking out of turn
	SAY	After all of the items are listed, we will have a discussion to clarify the items. If you have any new ideas then feel free to add them to your sheet. If you want to skip a turn, that is also fine
	DO	Record word for word what each person says on the PowerPoint slide
30-min discussion of ideas	SAY	We will now have a 30 min discussion on all the ideas generated
		Now is the time to ask for clarification or elaboration on an idea, or dispute or defend an item
		You are also welcome to suggest new items during this time, but no items can be eliminated
		We’ll go through them item by item
	DO	Announce each item on the list and ask what it means, or how people feel about it. Record any new ideas on the PowerPoint slide
60-min ranking to select the “top ten” ideas	SAY	Now if everyone could take out their yellow sheet for preliminary ranking
		You can see there are 10 spaces to be filled in. From all of the options, select the 10 items that are most important to you. Then assign each a rank from 1 to 10, where 10 is the most important
		Once you are finished, please turn it face down on the table and then you are free to take a break for about 20 min
70-min break	DO	Go around the table, transcribe the points from the ranking sheets onto the PowerPoint slides, and sum them per item. Then reorder the items on the slide from the most points to the fewest
70-min break	DO	Gather everyone after their break
90-min discussion of vote	SAY	We have reordered the items according to rank and you can see the score for them. We have also highlighted the top ten
		We will now have a free-for-all discussion about the nature and content of the top ten
		We would also like to hear how you feel about items that should have been included or excluded from this list
110-min re-ranking and rating revised “top ten” items	SAY	Now if everyone could take out their green sheet for final ranking. Here you will again list the top ten items that you think are the most important
		This may be the same ten, or feel free to modify which items are in your top ten
		The ranking here is different in that 100 points will be given to the most important item. Every other item can have a value between 0 and 100. Two items can have the same ranking
		Once you are finished, please hand in your sheets to me face down, and then we’re all done!
	DO	Collect the green sheets from everyone and tally up the final scores based on the 0–100 ranking
	DO	END CASE STUDY AT (START + 120 MIN)
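
The tallying steps in the script (summing each participant's points per item, then reordering the items by total) can be sketched in a few lines of Python. This is only an illustrative sketch, not the procedure or tooling the researchers used: the function names, the sample sheets, and the alphabetical tie-break are all assumptions introduced here. The same function applies to the preliminary 1 to 10 ranking and to the final 0 to 100 rating, since both are summed per item.

```python
from collections import defaultdict


def tally_rankings(sheets):
    """Sum the points each item received across all sheets.

    `sheets` is a list of dicts mapping item name -> points given by one
    participant. Returns (item, total) pairs, highest total first.
    """
    totals = defaultdict(int)
    for sheet in sheets:
        for item, points in sheet.items():
            totals[item] += points
    # Assumption: ties are broken alphabetically so the order is deterministic.
    return sorted(totals.items(), key=lambda pair: (-pair[1], pair[0]))


def top_n(ranked, n=10):
    """Return the names of the n highest-scoring items."""
    return [item for item, _ in ranked[:n]]


# Hypothetical preliminary sheets from three participants (10 = most important).
preliminary = [
    {"navigation": 10, "debugging": 9, "documentation": 8},
    {"debugging": 10, "source control": 9, "navigation": 7},
    {"documentation": 10, "navigation": 9, "debugging": 8},
]

ranked = tally_rankings(preliminary)
for item, total in ranked:
    print(f"{total:3d}  {item}")
```

With these made-up sheets, "debugging" totals 27 points and "navigation" 26, so they lead the reordered list. Items a participant leaves off their sheet simply contribute no points.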

Appendix 2: Issues observed at the Alpha group during activity-based protocol elicitation

Requirement category	Issue	Description
First session
Browsing and navigation	XREF works only on names of up to 8 characters	For longer names, search must be used, which finds occurrences in the code only one at a time
	Bookmarking lines of code	Have to create names “a,” “b.” If the name already exists, it is just overwritten
	Lack of navigation	Need to scroll through many screens of code to look for the right spot
Build
Control flow	Hard to find main task
Control flow	Tools would need to support multi-threading
Data
Debugging	Timing issues were tricky	Timing dumps not useful because they are too complicated
	Couldn’t work out what was causing the cancel	Need some way to trap the event
	XREF plus debugger to find the correct place to debug	Step-through debugging might be helpful
De-obfuscation	Redundant code makes the code confusing to read	Statements such as branching to the next address. Unnecessary since that code is next to be executed
Documentation	Look up vendor error code in CA documentation	Not indexable online so need to download CA docs to search them. CA error code is then used to look up IBM Manual error code. Codes are OS version dependent
	User prints off whole modules	The printout is portable and more comfortable to look at (easier on the eyes). There are also sticky notes and writing on the pages. These written notes include variable names, addresses and error codes
	The dump was scrolling off the page	There were so many errors, it did not fit. Need a way to condense it
Integration
References
Source control	Object module replacement	Overwrites whole module, have to be careful not to overwrite a change. Have to check prerequisite chain, and which fixes supersede others
Source editing
Second session
Browsing and navigation	*temp is used as a TODO	Shows up only when you dig into the module you’re interested in. Used pdsman to scan and find. Scan doesn’t show the active module however
Browsing and navigation	Switching terminal screens constantly	Need to scroll through many screens of code to look for the right spot. Kept many terminal screens open. Was hard to keep track of which showed the right code
Build	Register usage	Waits for compile error to say that the register is in use
	? at the start of lines	To ensure you get errors, but do not want to deal with the actual errors (stub error)
	Scanning software for changes he knows he has to make	Otherwise waits for compile errors. Compile errors would be better if they occurred during editing. Context-aware correction suggestions (i.e., does not exist, did you mean…?). Calls out to code that does not exist anymore
Control flow
Data
Debugging	No breakpoints in XDC	Puts code in to make it fail
De-obfuscation
Documentation	IBM Principles of Hardware Manual	Useful to double check some things
Integration
References	Code module—fan in, fan out	Wanted to know what module was being called dependent on the code and what code it depended on
Source control
Source editing	Tedious refactoring of modules	Splitting larger modules into smaller ones to use as templates. Templates are not useful, not maintained that much, but useful for people starting from scratch. Instead he uses something else he’s working on, copies it and butchers it (side by side editing)
	Forgot to save the file	No alert was given
	Code shortcuts	Stuff he does more than once
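
The "fan in, fan out" issue in the References category asks which modules call a given module and which modules it calls. As a rough sketch of what a tool might compute, the Python below derives both directions from a list of (caller, callee) pairs. The module names and the input format are hypothetical; the study does not describe how such call information would be extracted from the assembly source.

```python
from collections import defaultdict


def fan_in_out(calls):
    """Map each module to (callers, callees) given (caller, callee) pairs."""
    fan_out = defaultdict(set)  # module -> modules it calls
    fan_in = defaultdict(set)   # module -> modules that call it
    for caller, callee in calls:
        fan_out[caller].add(callee)
        fan_in[callee].add(caller)
    modules = set(fan_out) | set(fan_in)
    return {m: (sorted(fan_in[m]), sorted(fan_out[m])) for m in sorted(modules)}


# Hypothetical call pairs between three modules.
calls = [("MAIN", "PARSE"), ("MAIN", "IOERR"), ("PARSE", "IOERR")]

report = fan_in_out(calls)
for module, (callers, callees) in report.items():
    print(f"{module}: called by {callers}, calls {callees}")
```

Here the hypothetical IOERR module has a fan-in of two (MAIN and PARSE) and a fan-out of zero, which is the kind of summary the participant wanted before changing a module.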

About this article

Cite this article

Baldwin, J., Teh, A., Baniassad, E. et al. Requirements for tools for comprehending highly specialized assembly language code and how to elicit these requirements. Requirements Eng 21, 131–159 (2016). https://doi.org/10.1007/s00766-014-0214-y

Received: 08 July 2013
Accepted: 24 September 2014
Published: 09 October 2014
Issue Date: March 2016
DOI: https://doi.org/10.1007/s00766-014-0214-y
