Requirements for tools for comprehending highly specialized assembly language code and how to elicit these requirements

  • Original Article
  • Journal: Requirements Engineering

Abstract

Program comprehension tools used with assembly language—often for maintaining legacy software or reverse engineering malware threats—are dated and fail to provide rudimentary features found in tool support for higher-level languages. The need for people who can maintain these legacy systems is growing, as is the number of malicious cyberspace threats. To build new visualization and analysis tools within this domain, we need to understand the unique challenges faced by these developers. This paper presents the results of an exploratory case study to elicit requirements from two uniquely specialized groups of assembly language developers in an industrial setting: a large multi-national company developing mainframe software and a government defense facility analyzing malware and security flaws. In addition to surveys, observations and interviews, this study applies social psychology and nominal group techniques. We provide a ranking, and detailed description, for the requirements elicited in each group. We further include additional requirements obtained from observational studies. The ultimate conclusion we reach is that while similarities exist at a high level, upon deeper inspection, each group is quite unique with regard to their tooling needs.

Notes

  1. https://github.com/cbenning/idapro_dataflow.

  2. https://github.com/cbenning/idapro_comment, https://github.com/cbenning/idapro_comment_template.

Acknowledgments

The authors would like to thank the members of the Alpha group and Beta group for participating in our research. This work was partially funded by NSERC (Natural Sciences and Engineering Research Council of Canada).

Author information

Corresponding author

Correspondence to Jennifer Baldwin.

Appendices

Appendix 1: Script used during the nominal group session

This script was used for the 2-h-long session with the Alpha group of six participants. While the same script was also used with the Beta group, times should be adjusted according to the size of the participant group.

The script lists, for each time slot, the action (SAY or DO) and the corresponding script.

0 min: Introduction

SAY

  Hi, I’m RESEARCHER NAME from UNIVERSITY NAME. For my PhD in Computer Science, I am exploring how visualization and tool support for assembly language might be useful. My work is being funded by COMPANY NAMES.

  Since you are experts in the area, we really value your experience and expertise in defining the issues.

  This is SECOND RESEARCHER NAME and I’ll let him introduce himself:

    • University
    • Degree
    • Research interest

  To get started, I’d like to collect the ethics forms that you were given yesterday.

DO

  Collect the ethics forms.

SAY

  This session should take no longer than 2 h, including a 20 min break. The aim is to discuss and critically rank all of the items from the exercise yesterday. If you come up with new ideas during the session, please add them to your list. Feel free to be creative.

  Does everyone have the blue pages?

  First of all, I’d like to go around the table and have everyone introduce themselves and tell us about your job. We also know from the survey that your teams are expertise-centered, so it would be great to hear about that, as well as your interests.

10 min: Listing of ideas

SAY

  Now to begin the group exercise, we will go around the table and each person will share one item from their list at a time. At this time, please avoid discussion or talking out of turn.

  After all of the items are listed, we will have a discussion to clarify the items. If you have any new ideas then feel free to add them to your sheet. If you want to skip a turn, that is also fine.

DO

  Record word for word what each person says on the PowerPoint slide.

30 min: Discussion of ideas

SAY

  We will now have a 30 min discussion on all the ideas generated.

  Now is the time to ask for clarification or elaboration on an idea, or dispute or defend an item.

  You are also welcome to suggest new items during this time, but no items can be eliminated.

  We’ll go through them item by item.

DO

  Announce each item on the list and ask what it means, or how people feel about it. Record any new ideas on the PowerPoint slide.

60 min: Ranking to select the “top ten” ideas

SAY

  Now if everyone could take out their yellow sheet for preliminary ranking.

  You can see there are 10 spaces to be filled in. Select the 10 items that are the most important for you from all of the options. Then assign each a rank between 1 and 10, where 10 is the most important.

  Once you are finished, please turn it face down on the table and then you are free to take a break for about 20 min.

70 min: Break

DO

  Go around the table and transcribe and sum up the points from the ranking sheets onto the PowerPoint slides. Then reorder them on the slide based on the greatest number of points.

  Collect everyone after their break.

90 min: Discussion of vote

SAY

  We have reordered the items according to rank and you can see the score for them. We have also highlighted the top ten.

  We will now have a free-for-all discussion about the nature and content of the top ten.

  We would also like to hear how you feel about items that should have been included or excluded from this list.

110 min: Re-ranking and rating revised “top ten” items

SAY

  Now if everyone could take out their green sheet for final ranking. Here you will again list the top ten items that you think are the most important.

  This may be the same ten, or feel free to modify which items are in your top ten.

  The ranking here is different in that 100 points will be given to the most important item. Every other item can have a value between 0 and 100. Two items can have the same value.

  Once you are finished, please hand in your sheets to me face down, and then we’re all done!

DO

  Collect the green sheets from everyone and tally up the final scores based on the 0–100 ranking.

END CASE STUDY AT (START + 120 MIN)
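
The two tallies in the script (summing the preliminary 1 to 10 ranks during the break, then summing the final 0 to 100 ratings) amount to the same arithmetic: add up the points each item received across all sheets and sort by total. The following is a minimal sketch of that step, not the tooling the researchers used. The function name `tally`, the item names, and all scores are hypothetical, and the alphabetical tie-break is an assumption, since the script does not say how ties are ordered.

```python
from collections import Counter


def tally(sheets):
    """Sum the points each item received across all participants' sheets.

    `sheets` is a list of dicts mapping item name -> points, one dict per
    participant. Returns (item, total) pairs, highest total first. Ties are
    broken alphabetically (an assumption) so the order is deterministic.
    """
    totals = Counter()
    for sheet in sheets:
        totals.update(sheet)  # adds this participant's points per item
    return sorted(totals.items(), key=lambda pair: (-pair[1], pair[0]))


# Hypothetical preliminary (yellow) sheets: rank 10 = most important.
preliminary = [
    {"step-through debugging": 10, "cross-references": 9, "bookmarks": 8},
    {"cross-references": 10, "bookmarks": 9, "step-through debugging": 7},
]
ranked = tally(preliminary)

# Hypothetical final (green) sheets: 100 = most important, values may repeat.
final = [
    {"cross-references": 100, "bookmarks": 40},
    {"cross-references": 60, "bookmarks": 100},
]
final_ranked = tally(final)

for item, points in ranked:
    print(f"{points:4d}  {item}")
```

The same function serves both rounds because only the point scale changes between the yellow and green sheets.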

Appendix 2: Issues observed at the Alpha group during activity-based protocol elicitation

Requirement category | Issue | Description

First session

Browsing and navigation | XREF works on only 8 character long names | When there are more, search must be used, which only finds them one at a time in the code
Browsing and navigation | Bookmarking lines of code | Have to create names “a,” “b.” If the name already exists, it is just overwritten
Browsing and navigation | Lack of navigation | Need to scroll through many screens of code to look for the right spot
Build | - | -
Control flow | Hard to find main task | -
Control flow | Tools would need to support multi-threading | -
Data | - | -
Debugging | Timing issues were tricky | Timing dumps not useful because they are too complicated
Debugging | Couldn’t work out what was causing the cancel | Need some way to trap the event
Debugging | XREF plus debugger to find the correct place to debug | Step-through debugging might be helpful
De-obfuscation | Redundant code makes the code confusing to read | Statements such as branching to the next address. Unnecessary since that code is next to be executed
Documentation | Look up vendor error code in CA documentation | Not indexable online so need to download CA docs to search them. CA error code is then used to look up IBM Manual error code. Codes are OS version dependent
Documentation | User prints off whole modules | The printoff is portable and more comfortable to look at (easier on the eyes). There are also sticky notes and writing on the pages. These written notes include variable names, addresses and error codes
Documentation | The dump was scrolling off the page | There were so many errors, it did not fit. Need a way to condense it
Integration | - | -
References | - | -
Source control | Object module replacement | Overwrites whole module, have to be careful not to overwrite a change. Have to check prerequisite chain, and which fixes supersede others
Source editing | - | -

Second session

Browsing and navigation | *temp is used as a TODO | Shows up only when you dig into the module you’re interested in. Used pdsman to scan and find. Scan doesn’t show the active module however
Browsing and navigation | Switching terminal screens constantly | Need to scroll through many screens of code to look for the right spot. Kept many terminal screens open. Was hard to keep track of which showed the right code
Build | Register usage | Waits for compile error to say that the register is in use
Build | ? at the start of lines | To ensure you get errors, but do not want to deal with the actual errors (stub error)
Build | Scanning software for changes he knows he has to make | Otherwise waits for compile errors. Compile errors would be better if they occurred during editing. Context aware correction suggestions (i.e., does not exist, did you mean…?). Calls out to code that does not exist anymore
Control flow | - | -
Data | - | -
Debugging | No breakpoints in XDC | Puts code in to make it fail
De-obfuscation | - | -
Documentation | IBM Principles of Hardware Manual | Useful to double check some things
Integration | - | -
References | Code module: fan in, fan out | Wanted to know what module was being called dependent on the code and what code it depended on
Source control | - | -
Source editing | Tedious refactoring of modules | Splitting larger modules into smaller ones to use as templates. Templates are not useful, not maintained that much, but useful for people starting from scratch. Instead he uses something else he’s working on, copies it and butchers it (side by side editing)
Source editing | Forgot to save the file | No alert was given
Source editing | Code shortcuts | Stuff he does more than once
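
The fan in, fan out issue recorded under References in the second session asks, for a given module, which modules call it and which modules it calls. The following is a minimal sketch of that lookup, assuming the call relationships have already been extracted as (caller, callee) pairs. The helper `fan_in_out` and the module names are hypothetical and do not come from any tool named in the study.

```python
from collections import defaultdict


def fan_in_out(calls):
    """Build fan-in and fan-out maps from (caller, callee) pairs.

    fan_in[m]  -> set of modules that call m
    fan_out[m] -> set of modules that m calls
    """
    fan_in = defaultdict(set)
    fan_out = defaultdict(set)
    for caller, callee in calls:
        fan_out[caller].add(callee)
        fan_in[callee].add(caller)
    return fan_in, fan_out


# Hypothetical call pairs between modules.
calls = [
    ("MAIN", "PARSE"),
    ("MAIN", "REPORT"),
    ("PARSE", "IOUTIL"),
    ("REPORT", "IOUTIL"),
]
fan_in, fan_out = fan_in_out(calls)

for module in ("MAIN", "IOUTIL"):
    print(module, "called by", sorted(fan_in[module]),
          "calls", sorted(fan_out[module]))
```

Sets are used so that repeated calls between the same pair of modules count once, which matches a module-level view rather than a per-call-site view.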

About this article

Cite this article

Baldwin, J., Teh, A., Baniassad, E. et al. Requirements for tools for comprehending highly specialized assembly language code and how to elicit these requirements. Requirements Eng 21, 131–159 (2016). https://doi.org/10.1007/s00766-014-0214-y

  • DOI: https://doi.org/10.1007/s00766-014-0214-y
