++++++++++++++++++ FME 2001: Review reports for paper 42 +++++++++++++++++++++
Dear author(s): this file was extracted automatically from a very large
mailbox. In case you find any problem concerning mail encodings (or any
other kind of anomaly disturbing your understanding of the reviews), please
email either of the PC co-chairs (pamela@research.att.com, jno@di.uminho.pt).
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

PAPER NUMBER: 42
CATEGORY: 1
TITLE: A Combined Testing and Verification Approach for Software Reliability
AUTHOR(S): Natasha Sharygina, Doron Peled
--------------------------------------------------------------------------------

1. Briefly SUMMARIZE the paper (2-3 lines):

The authors present an approach to system verification that starts with
formal modeling. White box testing is used to validate the model and
continues as long as modeling errors are found, until a sufficient level of
confidence in the model is achieved. Model checking with SPIN is then
applied. When an error is found, neighborhood testing is performed, and the
detected error is corrected in the original code or in the model. The
authors demonstrate the effectiveness of this approach by applying it to a
Robot Control System (RCS), in particular a part of a multiple-criterion
decision support software system used for controlling robots that have more
than six degrees of freedom.

2. RELEVANCE: Please provide a rating of the paper's relevance to the FME
Symposium, using the scale:
     0 = Out of scope
     1 = Marginal interest
     2 = Minority interest
     3 = Majority interest
     4 = Outstanding interest

Numeric Rating: 4

Please comment your rating: The paper describes the use of model checking in
conjunction with white box testing on a real system.

3.
OVERALL RECOMMENDATION: Please provide a rating of the paper's
acceptability, using the scale:
     1 = Strong reject
     2 = Weak reject
     3 = Could go either way
     4 = Weak accept
     5 = Strong accept

Numeric Rating: 5

NB: There should be a correlation between the two rates above.

4. CONFIDENCE LEVEL: Please provide a rating of your expertise in the area
addressed by the paper, using the scale:
     1 = Know virtually nothing about this area
     2 = Not too knowledgeable, but I know a bit
     3 = Know a moderate amount, about average I'd say
     4 = Not my main area of work, but I know a lot about it
     5 = My area of work, I know it extremely well

Numeric Rating: 3

NB: PC members are responsible for ensuring that 1 is not used here.

5. ORIGINALITY. What is NEW and SIGNIFICANT in the work reported here?

Comment: The testing methodology and its application.

6. How WORTHWHILE is the goal of the authors?

Comment: Very important.

7. How well is this goal EXPLAINED and JUSTIFIED?

Comment: Well-written paper.

8. TECHNICAL QUALITY. Are the technical parts (definitions, statements,
specifications, proofs, algorithms, etc.) SOUND?

Comment: The technical quality is high.

9. APPLICABILITY. If the work is primarily theoretical or conceptual, are
the implications in terms of applicability adequately justified? If the
paper is about a new formal technique, are satisfactory arguments presented
in favor of the new technique? If a methodology is proposed, is it sound?
If experimental results are presented, is their relevance justified?

Comment: Methodological concept + practical application + results.

10. PRESENTATION: Describe the QUALITY of the writing, including
suggestions for changes where appropriate.

Comment: The writing quality is very good.

11. Were there any formatting or mechanical problems with this paper?: No
Are the figures and length acceptable?: Yes
Are the references correct?: Yes

12.
OTHER COMMENTS you believe would be useful to the author(s), including
pointers to missing relevant work:

I would suggest that you not start the paper with the phrase "software
reliability". I would use "software dependability" instead. The term
"software reliability" carries the connotation of probabilistic assessment,
i.e. calculating the probability that the software will fail.

+++++++++++++++++++++ End of FME 2001 Paper Review Report ++++++++++++++++++++++

PAPER NUMBER: 42
CATEGORY: 1
TITLE: A Combined Testing and Verification Approach for Software Reliability
AUTHOR(S): Natasha Sharygina, Doron Peled
--------------------------------------------------------------------------------

1. Briefly SUMMARIZE the paper (2-3 lines):

The authors propose a methodology combining testing and verification.
Starting from actual code, white box testing continues as long as modeling
errors are found. Then model checking is performed. If a counterexample is
found, neighborhood testing is performed.

2. RELEVANCE: Please provide a rating of the paper's relevance to the FME
Symposium, using the scale:
     0 = Out of scope
     1 = Marginal interest
     2 = Minority interest
     3 = Majority interest
     4 = Outstanding interest

Numeric Rating: 3

Please comment your rating: Combining verification of the model with
testing is attractive in order to increase confidence in the code. The
proposed method uses two tools: the Pet tester and the model checker Spin.

3. OVERALL RECOMMENDATION: Please provide a rating of the paper's
acceptability, using the scale:
     1 = Strong reject
     2 = Weak reject
     3 = Could go either way
     4 = Weak accept
     5 = Strong accept

Numeric Rating: 2

NB: There should be a correlation between the two rates above.

4.
CONFIDENCE LEVEL: Please provide a rating of your expertise in the area
addressed by the paper, using the scale:
     1 = Know virtually nothing about this area
     2 = Not too knowledgeable, but I know a bit
     3 = Know a moderate amount, about average I'd say
     4 = Not my main area of work, but I know a lot about it
     5 = My area of work, I know it extremely well

Numeric Rating: 5

NB: PC members are responsible for ensuring that 1 is not used here.

5. ORIGINALITY. What is NEW and SIGNIFICANT in the work reported here?

Comment: The notion of the neighborhood of a path seems original, but this
notion is not formalized. How can one formalize it in order to automate the
testing process?

6. How WORTHWHILE is the goal of the authors?

Comment:

7. How well is this goal EXPLAINED and JUSTIFIED?

Comment:

8. TECHNICAL QUALITY. Are the technical parts (definitions, statements,
specifications, proofs, algorithms, etc.) SOUND?

Comment: There is no formal definition.

9. APPLICABILITY. If the work is primarily theoretical or conceptual, are
the implications in terms of applicability adequately justified? If the
paper is about a new formal technique, are satisfactory arguments presented
in favor of the new technique? If a methodology is proposed, is it sound?
If experimental results are presented, is their relevance justified?

Comment: For software engineers, the method is interesting, but the
software developer must be expert in two tools: the Pet tester and the
model checker Spin.

10. PRESENTATION: Describe the QUALITY of the writing, including
suggestions for changes where appropriate.

Comment:

11. Were there any formatting or mechanical problems with this paper?:
Are the figures and length acceptable?:
Are the references correct?: The references are not relevant. The authors
seem to ignore a lot of work on testing, model checking and program
analysis published after 1995 (see CAV, SAS, TACAS, IWTCS, FME, ...).

12.
OTHER COMMENTS you believe would be useful to the author(s), including
pointers to missing relevant work:

The main (and attractive) idea of the paper is to use testing and model
checking together to obtain a certain degree of confidence in the actual
code. This approach is illustrated with the use of two tools. However, the
paper does not address the following points:
- As long as one has the source code, and not only object code (assembly,
  object, ...), why not envisage the automatic extraction of a model? What
  kind of automatic translation is possible?
- The main idea of the paper becomes weaker because there exist two
  models: one for the model checker and another for the tester. This may
  not be problematic for the chosen example, but the limits of the
  approach are not clearly stated.

+++++++++++++++++++++ End of FME 2001 Paper Review Report ++++++++++++++++++++++

PAPER NUMBER: 42
CATEGORY: 1
TITLE: A Combined Testing and Verification Approach for Software Reliability
AUTHOR(S): Natasha Sharygina, Doron Peled
--------------------------------------------------------------------------------

1. Briefly SUMMARIZE the paper (2-3 lines):

The paper proposes a combined verification and testing approach to improve
the reliability of software: testing (via the tool PET) is used to validate
the correctness of the translation of the actual code into a model which
serves as input to a model checking tool (here SPIN). On the model,
correctness properties expressed in Linear Temporal Logic are verified.
The methodology is demonstrated on the example of a robot control system
(RCS).

2. RELEVANCE: Please provide a rating of the paper's relevance to the FME
Symposium, using the scale:
     0 = Out of scope
     1 = Marginal interest
     2 = Minority interest
     3 = Majority interest
     4 = Outstanding interest

Numeric Rating: 3

Please comment your rating: Pragmatic approaches like the one described in
this paper are important to promote the application of formal methods in
industry.
Since formal verification of actual code is hard to achieve in practice,
the hybrid approach proposed in this paper, which combines white box
testing techniques with the benefits of formal model checking, looks
interesting to the referee. In particular, it is in accordance with one of
the main goals of the conference, the transfer of formal techniques into
industry.

3. OVERALL RECOMMENDATION: Please provide a rating of the paper's
acceptability, using the scale:
     1 = Strong reject
     2 = Weak reject
     3 = Could go either way
     4 = Weak accept
     5 = Strong accept

Numeric Rating: 4

NB: There should be a correlation between the two rates above.

4. CONFIDENCE LEVEL: Please provide a rating of your expertise in the area
addressed by the paper, using the scale:
     1 = Know virtually nothing about this area
     2 = Not too knowledgeable, but I know a bit
     3 = Know a moderate amount, about average I'd say
     4 = Not my main area of work, but I know a lot about it
     5 = My area of work, I know it extremely well

Numeric Rating: 3

NB: PC members are responsible for ensuring that 1 is not used here.

5. ORIGINALITY. What is NEW and SIGNIFICANT in the work reported here?

Comment: To the referee, the combined application of white box testing and
verification looks rather original.

6. How WORTHWHILE is the goal of the authors?

Comment: See the comments on 2. RELEVANCE.

7. How well is this goal EXPLAINED and JUSTIFIED?

Comment: The authors motivate their hybrid approach by stating that
exhaustive consistency checks between the actual code and the model are
usually not possible. Therefore they propose to use white box testing for
consistency checks, combined with formal model checking on the model. To
the referee, however, it did not become entirely clear which kind of test
concept underlies the white box test of the model (via the tool PET). The
authors say that they examined the model and compared it "to the known
behavior of the original code" using PET.
In which way was this "known behavior" expressed? And how was the
comparison done? Some more details would be helpful.

8. TECHNICAL QUALITY. Are the technical parts (definitions, statements,
specifications, proofs, algorithms, etc.) SOUND?

Comment: It did not become evident to the referee how the (manual)
translation from the Object Bench code of the RCS example to the PROMELA
model was achieved. Since only non-self-contained parts of both the Object
Bench code and the model code are presented, and since the method guiding
the translation is not explained, it is difficult to follow the discussion
in Section 3.4 (Testing and Verification Results) from a technical point of
view. In particular, it would be interesting to get some information on how
the state variables in the model are obtained from the Object Bench
transition diagram presented in Figure 4.

9. APPLICABILITY. If the work is primarily theoretical or conceptual, are
the implications in terms of applicability adequately justified? If the
paper is about a new formal technique, are satisfactory arguments presented
in favor of the new technique? If a methodology is proposed, is it sound?
If experimental results are presented, is their relevance justified?

Comment: The methodology is supported by tools and has been used on a real
application.

10. PRESENTATION: Describe the QUALITY of the writing, including
suggestions for changes where appropriate.

Comment: The paper is well written. As already mentioned under 8., some
more technical details concerning the translation from the Object Bench
code into the model code would be helpful.

11. Were there any formatting or mechanical problems with this paper?: No
Are the figures and length acceptable?: When printing out the paper, the
text labels in Figure 5 did not come out in a readable way.
Are the references correct?: The reference to [17] (Precise UML group) in
the second line of the second paragraph on page 2 ("Simulation of the model
execution [17]") looks strange. Moreover, the given link does not work.

12. OTHER COMMENTS you believe would be useful to the author(s), including
pointers to missing relevant work:

a) The definition of the neighbourhood concept did not become clear to the
referee. You say that "a related path of the flow graph can be obtained
from it by selecting at some point an alternative edge out of a condition
node. Continuing this process of obtaining alternative paths we obtain a
tree which we call neighbourhood of the original path." But if you can
choose arbitrary condition nodes and arbitrary alternative edges of the
original path, then it seems to the referee that a neighbourhood will cover
the entire flow graph of the sequential program.

b) 3.4 Testing and Verification Results, Formula Number 3:

   <> (abort_var = 1 U (end_position = 1 \/ (recovery_status = 1 /\ counter = number_joints) )

   i) A closing bracket is missing.

   ii) The informal description of this formula given in the paragraph
   below the formula does not really seem to fit the formula: since the
   "Until" operator is used, the formula implies that
   (end_position = 1 \/ (recovery_status = 1 /\ counter = number_joints))
   eventually has to hold. This does not come out in the informal
   description. Moreover, in order to get a better understanding of this
   formula, some information should be given on the use of the abort_var
   variable: Which process sets it to 1? (Only the Arm process?) When is it
   set to 1? (When the task is completed?)

+++++++++++++++++++++ End of FME 2001 Paper Review Report ++++++++++++++++++++++
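
[Editorial note] The third referee's objection (a) -- that taking arbitrary
alternative edges at condition nodes makes the "neighbourhood" of a path
degenerate into the whole flow graph -- can be made concrete with a small
sketch. The toy flow graph, the path representation, and all function names
below are invented for illustration; this is not the paper's algorithm,
only one literal reading of its informal definition.

```python
from collections import deque

# Toy acyclic flow graph: nodes with two successors are condition nodes.
flow = {
    "start": ["c1"],
    "c1": ["a", "b"],   # condition node
    "a": ["c2"],
    "b": ["c2"],
    "c2": ["c", "d"],   # condition node
    "c": ["end"],
    "d": ["end"],
    "end": [],
}

def paths_from(node):
    """All maximal paths starting at `node` (graph is acyclic)."""
    if not flow[node]:
        return [[node]]
    return [[node] + rest for nxt in flow[node] for rest in paths_from(nxt)]

def neighbourhood(path):
    """Close the given path under 'select an alternative edge out of a
    condition node at some point, then continue to completion' -- the
    informal definition quoted in comment (a)."""
    seen, frontier = {tuple(path)}, deque([path])
    while frontier:
        p = frontier.popleft()
        for i, node in enumerate(p[:-1]):
            for alt in flow[node]:
                if alt != p[i + 1]:               # an alternative edge
                    for tail in paths_from(alt):  # follow it to the end
                        q = p[:i + 1] + tail
                        if tuple(q) not in seen:
                            seen.add(tuple(q))
                            frontier.append(q)
    return seen

original = ["start", "c1", "a", "c2", "c", "end"]
# Under this unconstrained reading, the neighbourhood is *every* path of
# the flow graph -- exactly the referee's point.
assert neighbourhood(original) == {tuple(p) for p in paths_from("start")}
```

A formalization would presumably bound the number of switches (e.g. paths
differing from the original in at most one condition outcome) to keep the
neighbourhood a proper subset of the graph's paths.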
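
[Editorial note] The referee's remark (b)(ii) about the Until operator can
also be checked mechanically. The evaluator below is a minimal sketch over
finite traces (SPIN's LTL semantics are over infinite runs, so this is only
an approximation); the formula encoding and the traces are invented for
illustration. It shows that "p U q" fails whenever q never holds, i.e. the
eventuality the informal description leaves out.

```python
def holds(formula, trace, i=0):
    """Evaluate a tiny LTL fragment at position i of a finite trace.
    Formulas: ("ap", name), ("U", f, g), ("F", f)  -- F is the <> operator."""
    op = formula[0]
    if op == "ap":
        return i < len(trace) and trace[i].get(formula[1], False)
    if op == "U":  # f U g: g must *eventually* hold, with f holding until then
        _, f, g = formula
        return any(holds(g, trace, k) and
                   all(holds(f, trace, j) for j in range(i, k))
                   for k in range(i, len(trace)))
    if op == "F":  # <> f: f holds at some position from i onward
        return any(holds(formula[1], trace, k) for k in range(i, len(trace)))
    raise ValueError(f"unknown operator: {op}")

# abort_var stays 1 forever, but end_position never becomes 1:
stuck = [{"abort_var": True}] * 3
# (abort_var = 1) U (end_position = 1) is FALSE on this trace, because
# Until demands its right-hand side eventually hold.
assert not holds(("U", ("ap", "abort_var"), ("ap", "end_position")), stuck)

# Once end_position does become 1, the Until is satisfied:
done = [{"abort_var": True}, {"abort_var": True}, {"end_position": True}]
assert holds(("U", ("ap", "abort_var"), ("ap", "end_position")), done)
```

If the authors intended only "while aborting, the goal may be reached", a
weaker connective (e.g. the weak-until form p W q) would match the informal
description better than U.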