Abstract

Acceptance testing is crucial to determine whether a system fulfills end-user requirements. However, the creation of acceptance tests is a laborious task entailing two major challenges: (1) practitioners need to determine the right set of test cases that fully covers a requirement, and (2) they need to create test cases manually due to insufficient tool support. Existing approaches for automatically deriving test cases require semi-formal or even formal notations of requirements, though unrestricted natural language is prevalent in practice. In this paper, we present our tool-supported approach CiRA (Conditionals in Requirements Artifacts) capable of creating the minimal set of required test cases from conditional statements in informal requirements. We demonstrate the feasibility of CiRA in a case study with three industry partners. In our study, out of 578 manually created test cases, 71.8% can be generated automatically. Additionally, CiRA discovered 80 relevant test cases that were missed in manual test case design. CiRA is publicly available at www.cira.bth.se/demo/.

Introduction

Acceptance tests are used to verify the conformity between end-user requirements and actual system behavior (ISO/IEC/IEEE 24765:2010(E)). Each acceptance test contains a finite set of test cases that specify certain test inputs and expected results. Test case design is a very laborious activity that easily accounts for 40-70% of the total effort in the testing process (Beller et al., 2015). This stems from the following challenges.

Determining the right set of test cases that fully covers a requirement is a difficult task, especially for complex requirements. A requirement is considered fully covered if a set of associated test cases assures the behavior implied by that requirement (Whalen et al., 2006). In a previous study (Fischbach et al., 2020a), we found that acceptance tests are often not created systematically, resulting in incomplete or excessive test cases. When test cases are missing, system defects are not (or only partially) detected. In contrast, excessive test cases lead to unnecessary testing effort and increased test maintenance costs. Consequently, practitioners need to strike a balance between full test coverage and the number of required test cases.
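
To make this trade-off concrete, consider a requirement with two causes combined by a disjunction. The following minimal Python sketch is our own illustration (the access-control requirement, the fault assumption, and all names are hypothetical, not taken from the paper); it contrasts exhaustive input coverage with a cause-effect-style minimal suite.

    # Hypothetical requirement: "If the user is an admin (c1) or holds a valid token (c2),
    # access shall be granted (e)."
    from itertools import product

    def access_granted(is_admin: bool, has_token: bool) -> bool:
        return is_admin or has_token

    # Exhaustive coverage enumerates every input combination: 2^2 = 4 test cases.
    exhaustive = [((c1, c2), access_granted(c1, c2)) for c1, c2 in product([True, False], repeat=2)]

    # A cause-effect-style minimal suite keeps one case per cause in isolation plus the
    # all-false case; it still distinguishes a correct OR from, e.g., an accidental AND.
    minimal = [((True, False), True), ((False, True), True), ((False, False), False)]

    print(f"{len(exhaustive)} exhaustive vs. {len(minimal)} minimal test cases")

For n independent causes the gap grows from 2^n exhaustive combinations to roughly n + 1 cases for simple conjunctions or disjunctions, which is why excessive, combinatorially generated test cases quickly become a maintenance burden.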

Creating acceptance tests is a predominantly manual task due to insufficient tool support (Garousi et al., 2020). Most of the existing approaches allow the derivation of test cases from semi-formal requirements (Wang et al., 2020, Carvalho et al., 2014, Barros et al., 2011) (e.g., expressed in controlled natural language) or formal requirements (Liu and Nakajima, 2020, Sharma and Biswas, 2014) (e.g., expressed in linear temporal logic), but are not suitable to process informal requirements. However, studies (Kassab et al., 2014) have shown that requirements are usually expressed in unrestricted natural language (NL). Some approaches (Fischbach et al., 2020c, Verma and Beg, 2013, Santiago Júnior and Vijaykumar, 2012) address this research gap and focus on deriving test cases from informal requirements. Nevertheless, they show poor performance when evaluated on unseen real-world data. Specifically, they are not robust against grammatical errors and fail to process words that are not yet part of their training vocabulary (Fischbach et al., 2021c).

We aim to develop a tool-supported approach to derive the minimal set of required test cases automatically from NL requirements by applying Natural Language Processing (NLP).

Functional requirements often describe system behavior by relating events to each other, e.g., “If the system detects an error (e1), an error message shall be shown (e2)”. Previous studies (Fischbach et al., 2021b, Fischbach et al., 2020c) show that such conditional statements are prevalent in both traditional and agile requirements such as acceptance criteria. In this paper, we focus on conditionals in NL requirements and utilize their embedded logical knowledge for the automatic derivation of test cases. We answer three research questions (RQ):

RQ 1: How can conditionals be extracted from NL requirements, and how can their implied relationships be used for automatic test case derivation?

RQ 2: Can our automated approach create the same test cases as the manual approach?

RQ 3: What are the reasons for deviating test cases?

The answers to RQ 1 shall inform the implementation of a new tool-supported approach for automatic test case derivation. RQ 2 and RQ 3 study the impact of the new approach: does it match the status quo, or does it even improve on manual test case derivation? To this end, we conduct a case study with three industry partners and compare automatically created test cases with existing, manually created test cases. In summary, this paper makes the following contributions (C):

C 1: To answer RQ 1, we present our tool-supported approach CiRA (Conditionals in Requirements Artifacts), capable of (1) detecting conditional statements in NL requirements, (2) extracting their implied relationships in fine-grained form, and (3) mapping these relationships to a Cause-Effect-Graph from which the minimal set of required test cases can be derived automatically. CiRA outputs manual test cases which, if required, can be converted into automated test cases using third-party tools such as Selenium, Robot Framework, or Tricentis.

C 2: To answer RQ 2 and RQ 3, we conduct a case study with three companies and compare CiRA to the manual test case design. We show that CiRA is able to automatically generate 71.8% of the 578 manually created test cases. In addition, CiRA identifies 80 relevant test cases that were missed in the manual test case design.

C 3: To strengthen transparency and facilitate replication, we make our tool, code, annotated data set, and all trained models publicly available.

In the course of the paper, we demonstrate the functionality of CiRA by means of a running example. Specifically, we explain how CiRA automatically derives the minimum number of required test cases for the requirements specification shown in Fig. 1. The specification contains an excerpt of requirements that describe the functionality of “The Energy Management System” (THEMAS). THEMAS is intended to be used by people who maintain the heating and cooling systems in a building. We retrieved the requirements from the PURE (PUblic REquirements) data set (Ferrari et al., 2017), which contains 79 publicly available natural language requirements documents collected from the Web. We encourage the readers of this paper to use our online demo to process the running example on their own, allowing them to follow each individual step of CiRA.

The remainder of this paper is organized as follows: Section 2 provides the theoretical background. Section 3 answers RQ 1 and introduces CiRA in detail. Section 4 presents the results of our case study and answers RQ 2 and RQ 3. Section 5 discusses our results and indicates directions for both research and practice. Section 6 briefly surveys related work. Finally, Section 7 presents our conclusions.

Section snippets

Test case.

A test case is a set of certain test inputs (input parameters) and expected results (output parameters) used to verify compliance with a specific requirement (ISO/IEC/IEEE 24765:2010(E)). Each input and output parameter is defined by a variable and a condition that the parameter can take (Sneed, 2007). For example, the parameter “the system detects an error” can be decomposed into the variable “the system” and the condition “detects an error”. All test cases that constitute a single acceptance test are summarized in a test case specification.
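
Read in this way, a test case can be captured by a small data structure. The sketch below is our own minimal rendering of the definitions above (the class and field names are illustrative and not CiRA’s internal model), instantiated for the positive scenario of the error-detection example.

    from dataclasses import dataclass

    @dataclass(frozen=True)
    class Parameter:
        variable: str   # e.g. "the system"
        condition: str  # e.g. "detects an error"
        value: bool     # does the condition hold in this test case?

    @dataclass(frozen=True)
    class TestCase:
        inputs: tuple[Parameter, ...]    # test inputs (input parameters)
        expected: tuple[Parameter, ...]  # expected results (output parameters)

    # Positive test case: the error is detected, so the message must be shown.
    tc = TestCase(
        inputs=(Parameter("the system", "detects an error", True),),
        expected=(Parameter("an error message", "is shown", True),),
    )
    print(tc)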

CiRA pipeline

As shown in Fig. 2, CiRA consists of three steps: We first detect whether an NL requirement contains a conditional (see Section 3.1). Second, we extract the conditional in fine-grained form (see Section 3.2). Specifically, we consider the combinatorics between causes and effects and split them into more granular text fragments (e.g., variable and condition), making the extracted conditionals suitable for automatic test case derivation. Third, we map the extracted causes and effects into a Cause-Effect-Graph (CEG), from which the minimal set of required test cases is derived automatically.
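
The sketch below walks through these three steps end to end for the error-detection requirement from the introduction. It is deliberately simplified: where CiRA uses trained transformer models and an explicit Cause-Effect-Graph, we substitute a cue-word heuristic, a regular expression, and a direct derivation of the positive and negated scenario; all function names and heuristics here are ours and do not reflect CiRA’s actual implementation or API.

    import re

    def detect_conditional(requirement: str) -> bool:
        # Step 1 (placeholder): CiRA uses a fine-tuned classifier; here a cue word suffices.
        return requirement.lower().startswith("if ")

    def extract_conditional(requirement: str):
        # Step 2 (placeholder): split an "If <cause>, <effect>." sentence into its two events.
        match = re.match(r"if (?P<cause>.+?), (?P<effect>.+?)\.?$", requirement, re.IGNORECASE)
        return (match.group("cause"), match.group("effect")) if match else None

    def derive_test_cases(cause: str, effect: str):
        # Step 3 (placeholder): the positive and the negated scenario together exercise the
        # implied relationship (the negated case assumes the conditional is read as a
        # biconditional, i.e. no detected error implies no error message).
        return [
            {"inputs": {cause: True},  "expected": {effect: True}},
            {"inputs": {cause: False}, "expected": {effect: False}},
        ]

    requirement = "If the system detects an error, an error message shall be shown."
    if detect_conditional(requirement):
        cause, effect = extract_conditional(requirement)
        for case in derive_test_cases(cause, effect):
            print(case)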

Objective.

To answer RQ 2 and RQ 3, we conduct a case study in an exploratory fashion. We aim to evaluate whether CiRA could either replace or augment the existing manual approach for creating test cases. For our study, we follow the guidelines by Runeson and Höst (2009) for conducting case study research.

Case sampling and study objects

We apply purposive case sampling augmented with convenience sampling (Kitchenham and Pfleeger, 2002). Specifically, we approached some of our industry contacts inquiring whether they are interested in

Discussion

This section discusses our results and summarizes both the potential and the limitations of CiRA. Based on our discussion, we deduce key take-aways for practitioners.

Related work

Since the early 1980s, NLP techniques have been applied to RE artifacts to support a variety of use cases: e.g., requirements classification (Hey et al., 2020), topic modeling (Gülle et al., 2020), and quality checks (Femmer et al., 2017). A comprehensive overview of existing NLP4RE tools is provided by Zhao et al. (2021). In this paper, we use NLP methods to extract conditionals from requirements in fine-grained form and to derive test cases automatically. This section reviews existing

Conclusion

Acceptance testing evaluates the conformance between actual and expected system behavior. However, the creation of acceptance tests is laborious and requires manual work due to missing tool support. In this paper, we focus on conditional statements in functional requirements and demonstrate how NLP can be used to automatically generate the minimum set of required test cases. Specifically, we present our tool-supported approach CiRA capable of (1) detecting conditional statements, (2) extracting

CRediT authorship contribution statement

Jannik Fischbach: Conceptualization, Methodology, Software, Validation, Investigation, Data curation, Writing – original draft, Writing – review & editing, Visualization, Supervision. Julian Frattini: Methodology, Software, Validation, Investigation, Data curation, Writing – review & editing, Visualization. Andreas Vogelsang: Conceptualization, Supervision, Project administration. Daniel Mendez: Methodology, Writing – review & editing, Supervision. Michael Unterkalmsteiner: Methodology, Writing

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgments

We would like to acknowledge that this work was supported by the KKS foundation through the S.E.R.T. Research Profile project at Blekinge Institute of Technology. We thank all collaborating companies for providing access to their data and test designers who volunteered their time to participate in this research.


References (57)

  • Wang, C., et al., 2020. Automatic generation of acceptance test cases from use case specifications: an NLP-based approach. IEEE Trans. Softw. Eng.

  • Tran, H.K.V., et al., 2021. Assessing test artifact quality—A tertiary study. Inf. Softw. Technol.

  • Sarmiento, E., et al., 2016. Test scenario generation from natural language requirements descriptions based on Petri-nets. Electron. Notes Theor. Comput. Sci.

  • Liu, S., et al., 2020. Automatic test case and test oracle generation based on functional scenarios in formal specifications for conformance testing. IEEE Trans. Softw. Eng.

  • Li, Z., et al., 2021. Causality extraction based on self-attentive BiLSTM-CRF with transferred embeddings. Neurocomputing.

  • Hripcsak, G., 2005. Agreement, the F-measure, and reliability in information retrieval. J. Am. Med. Inform. Assoc.

  • Garousi, V., et al., 2020. NLP-assisted software testing: A systematic mapping of the literature. Inf. Softw. Technol.

  • Femmer, H., et al., 2017. Rapid quality assurance with requirements smells. J. Syst. Softw.

  • Ahsan, I., et al. A comprehensive investigation of natural language processing techniques and tools to generate automated test cases.

  • Barros, F.A., Neves, L., Hori, E., Torres, D., 2011. The ucsCNL: A Controlled Natural Language for Use Case...

  • Beller, M., et al. When, how, and why developers (do not) test in their IDEs.

  • Bergstra, J., et al. Algorithms for hyper-parameter optimization.

  • Carvalho, G., et al. Model-based testing from controlled natural language requirements.

  • Chang, D.-S., et al. Causal relation extraction using cue phrase and lexical pair probabilities.

  • Dasgupta, T., et al. Automatic extraction of causal relations from text using linguistically informed deep neural networks.

  • Devlin, J., et al. BERT: Pre-training of deep bidirectional transformers for language understanding.

  • Dwarakanath, A., et al. Litmus: Generation of test cases from functional requirements in natural language.

  • Fernández, D.M., et al., 2017. Naming the pain in requirements engineering. Empir. Softw. Eng.

  • Ferrari, A., et al. PURE: A dataset of public requirements documents.

  • Fischbach, J., et al. What makes agile test artifacts useful? An activity-based quality model from a practitioners’ perspective.

  • Fischbach, J., et al. How do practitioners interpret conditionals in requirements?

  • Fischbach, J., et al. Automatic detection of causality in requirement artifacts: The CiRA approach.

  • Fischbach, J., et al. Towards causality extraction from requirements.

  • Fischbach, J., et al. Fine-grained causality extraction from natural language requirements using recursive neural tensor networks.

  • Fischbach, J., et al. SPECMATE: Automated creation of test cases from acceptance criteria.

  • Girju, R., et al. Semeval-2007 task 04: Classification of semantic relations between nominals.

  • Goffi, A., et al. Automatic generation of oracles for exceptional behaviors.

  • Gülle, K.J., et al. Topic modeling on user stories using word mover’s distance.


    Jannik Fischbach is a Ph.D. student at the Institute of Computer Science of the University of Cologne. From 2019 to 2022, Jannik also worked as a consultant at Qualicen GmbH - a spin-off founded out of the Technical University of Munich focusing on software and systems engineering. In June 2022, he joined Netlight as a consultant. His main research interests include requirements engineering and, in particular, the application of natural language processing methods on requirements artifacts. Jannik holds a Master’s degree in Information Systems from the Technical University of Munich.

    Julian Frattini is a Ph.D. student at the Blekinge Institute of Technology in Karlskrona, Sweden. Since 2020, he has been working under the supervision of Daniel Mendez in the area of requirements quality, specifically investigating the notion of good-enough requirements engineering. Through the collaboration with Ericsson Karlskrona, the research is applied and grounded in practice. Julian holds a Master’s degree in Informatics from the Technical University of Munich.

    Andreas Vogelsang is a full professor of Software and Systems Engineering at the University of Cologne. He received a PhD from the Technical University of Munich. His research interests comprise requirements engineering, model-based systems engineering, and software architectures for embedded systems. He has published over 70 papers in international journals and conferences such as TSE, JSS, IEEE Software, and ICSE. In 2018, he was appointed as Junior-Fellow of the German Society for Informatics (GI). Further information can be obtained from https://cs.uni-koeln.de/sse.

    Daniel Mendez is a full professor at the Blekinge Institute of Technology, Sweden, and Lead Researcher heading the research division Requirements Engineering at fortiss, the research and transfer institute of the Free State of Bavaria for software-intensive systems and services. After studying Computer Science and Cognitive Neuroscience at the Ludwig Maximilian University of Munich, he pursued his doctoral and habilitation degrees at the Technical University of Munich. Since then, his research has focused on Empirical Software Engineering, with a particular emphasis on interdisciplinary, qualitative research in Requirements Engineering and its quality improvement, all in close collaboration with the relevant industries. He is also an editorial board member for EMSE and JSS, where he co-chairs the special tracks Reproducibility & Open Science (EMSE) and In Practice (JSS), respectively. Finally, he is a member of the ACM, the German association of university professors and lecturers, the German Informatics Society, and ISERN. Further information is available at http://www.mendezfe.org.

    Michael Unterkalmsteiner is a senior lecturer at the Blekinge Institute of Technology, Sweden, where he also received a PhD in Software Engineering. He has been researching Software Engineering since 2009, focusing in particular on the coordination between requirements engineering and software testing. His research work is shaped by empirical problem identification, in-depth analysis of the state-of-art and practice, and collaborative solution development. This empirical, practice-driven approach has led to innovative and scalable solutions. His current research focuses on designing and implementing automated decision support systems for software engineers. Further information is available at https://lmsteiner.com.

    Andreas Wehrle is a software engineer at Allianz Deutschland AG. His main interest is the application of natural language processing methods to requirements. Andreas holds a Master’s degree in Information Systems from the Technical University of Munich.

    Pablo Restrepo Henao is an IT consultant for software engineering and machine learning at Netlight Consulting. He has worked as a software engineer and technical lead at multiple companies and is currently pursuing his Master’s degree in Computer Science at the Technical University of Munich. His main research interest is the application of natural language processing techniques in the software engineering area.

    Parisa Yousefi is a line manager and architecture owner with Business Solution System (BSS) at Ericsson. Having a background as a developer, her interests lie in AI/ML, platform-related advances, and new technologies, as well as core software engineering principles and methodologies such as Agile, test-driven development, and requirements engineering.

    Tedi Juricic is a technical quality assurance officer in functional testing within the Business Support Solution Charging and Billing unit at Ericsson. He is responsible for asserting the official quality stamp on the work packages as well as assuring the overall quality of the BSS product.

    Jeannette Radduenz is a quality and test manager within the Platform Management & Testing Services unit at Allianz Technology. She is mainly responsible for planning, coordination, and control of test activities.

    Carsten Wiecher is a development engineer at KOSTAL Automobil Elektrik GmbH & Co. KG. Since 2013, Carsten has been working in different development projects in the field of e-mobility. His main focus is software integration for complex electronic control units. Since 2018, Carsten has also been a research associate at the IDiAL institute, which is part of Dortmund University of Applied Sciences and Arts. Since 2020, he has been involved in a research project with KOSTAL focusing on model-based systems engineering, requirements analysis, and test specification. Carsten holds a Master’s degree in Information Technology from Dortmund University of Applied Sciences and Arts.


    © 2022 Published by Elsevier Inc.
