++++++++++++++++++ FME 2001: Review reports for paper 42 +++++++++++++++++++++
Dear author(s): this file was extracted automatically from a very large
mailbox. In case you find any problem concerning mail encodings (or any
other kind of anomaly disturbing your understanding of the reviews), please
email either of the PC co-chairs (pamela@research.att.com, jno@di.uminho.pt).
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

PAPER NUMBER: 42
CATEGORY: 1
TITLE: A Combined Testing and Verification Approach for Software Reliability
AUTHOR(S): Natasha Sharygina, Doron Peled
--------------------------------------------------------------------------------

1. Briefly SUMMARIZE the paper (2-3 lines):

The authors present an approach to system verification that starts with
formal modeling. White box testing is used to validate the model and
continues as long as modeling errors are found, until a sufficient level of
confidence in the model is achieved. Model checking with SPIN is then
applied. When an error is found, neighborhood testing is performed, and the
detected error is corrected in the original code or in the model. The
authors demonstrate the effectiveness of this approach by applying it to a
Robot Control System (RCS), in particular a part of a multiple-criterion
decision support software system used for controlling robots that have more
than six degrees of freedom.

2. RELEVANCE: Please provide a rating of the paper's relevance to the FME
Symposium, using the scale:
     0 = Out of scope
     1 = Marginal interest
     2 = Minority interest
     3 = Majority interest
     4 = Outstanding interest

Numeric Rating: 4

Please comment your rating: The paper describes the use of model checking in
conjunction with white box testing on a real system.

3.
OVERALL RECOMMENDATION: Please provide a rating of the paper's
acceptability, using the scale:
     1 = Strong reject
     2 = Weak reject
     3 = Could go either way
     4 = Weak accept
     5 = Strong accept

Numeric Rating: 5

NB: There should be a correlation between the two rates above.

4. CONFIDENCE LEVEL: Please provide a rating of your expertise in the area
addressed by the paper, using the scale:
     1 = Know virtually nothing about this area
     2 = Not too knowledgeable, but I know a bit
     3 = Know a moderate amount, about average I'd say
     4 = Not my main area of work, but I know a lot about it
     5 = My area of work, I know it extremely well

Numeric Rating: 3

NB: PC members are responsible for ensuring that 1 is not used here.

5. ORIGINALITY. What is NEW and SIGNIFICANT in the work reported here?

Comment: The testing methodology and its application.

6. How WORTHWHILE is the goal of the authors?

Comment: Very important.

7. How well is this goal EXPLAINED and JUSTIFIED?

Comment: Well-written paper.

8. TECHNICAL QUALITY. Are the technical parts (definitions, statements,
specifications, proofs, algorithms, etc.) SOUND?

Comment: The technical quality is high.

9. APPLICABILITY. If the work is primarily theoretical or conceptual, are
the implications in terms of applicability adequately justified? If the
paper is about a new formal technique, are satisfactory arguments presented
in favor of the new technique? If a methodology is proposed, is it sound?
If experimental results are presented, is their relevance justified?

Comment: Methodological concept + practical application + results.

10. PRESENTATION: Describe the QUALITY of the writing, including
suggestions for changes where appropriate.

Comment: The writing quality is very good.

11. Were there any formatting or mechanical problems with this paper?: No
Are the figures and length acceptable?: Yes
Are the references correct?: Yes

12.
OTHER COMMENTS you believe would be useful to the author(s), including
pointers to missing relevant work:

I would suggest that you not start the paper with the phrase "software
reliability". I would use "software dependability" instead. The term
"software reliability" carries the connotation of probabilistic assessment,
i.e. calculating the probability that the software will fail.

+++++++++++++++++++++ End of FME 2001 Paper Review Report ++++++++++++++++++++++

PAPER NUMBER: 42
CATEGORY: 1
TITLE: A Combined Testing and Verification Approach for Software Reliability
AUTHOR(S): Natasha Sharygina, Doron Peled
--------------------------------------------------------------------------------

1. Briefly SUMMARIZE the paper (2-3 lines):

The authors propose a methodology combining testing and verification.
Starting from actual code, white box testing continues as long as modeling
errors are found. Then model checking is performed. If a counterexample is
found, neighborhood testing is performed.

2. RELEVANCE: Please provide a rating of the paper's relevance to the FME
Symposium, using the scale:
     0 = Out of scope
     1 = Marginal interest
     2 = Minority interest
     3 = Majority interest
     4 = Outstanding interest

Numeric Rating: 3

Please comment your rating: Combining verification of the model with
testing is attractive in order to increase confidence in the code. The
proposed method uses two tools: the Pet tester and the model checker Spin.

3. OVERALL RECOMMENDATION: Please provide a rating of the paper's
acceptability, using the scale:
     1 = Strong reject
     2 = Weak reject
     3 = Could go either way
     4 = Weak accept
     5 = Strong accept

Numeric Rating: 2

NB: There should be a correlation between the two rates above.

4.
CONFIDENCE LEVEL: Please provide a rating of your expertise in the area
addressed by the paper, using the scale:
     1 = Know virtually nothing about this area
     2 = Not too knowledgeable, but I know a bit
     3 = Know a moderate amount, about average I'd say
     4 = Not my main area of work, but I know a lot about it
     5 = My area of work, I know it extremely well

Numeric Rating: 5

NB: PC members are responsible for ensuring that 1 is not used here.

5. ORIGINALITY. What is NEW and SIGNIFICANT in the work reported here?

Comment: The notion of the neighborhood of a path seems original, but this
notion is not formalized. How can one formalize it in order to automate the
testing process?

6. How WORTHWHILE is the goal of the authors?

Comment:

7. How well is this goal EXPLAINED and JUSTIFIED?

Comment:

8. TECHNICAL QUALITY. Are the technical parts (definitions, statements,
specifications, proofs, algorithms, etc.) SOUND?

Comment: There is no formal definition.

9. APPLICABILITY. If the work is primarily theoretical or conceptual, are
the implications in terms of applicability adequately justified? If the
paper is about a new formal technique, are satisfactory arguments presented
in favor of the new technique? If a methodology is proposed, is it sound?
If experimental results are presented, is their relevance justified?

Comment: For software engineers, the method is interesting, but the
software developer must be expert in two tools: the Pet tester and the
model checker Spin.

10. PRESENTATION: Describe the QUALITY of the writing, including
suggestions for changes where appropriate.

Comment:

11. Were there any formatting or mechanical problems with this paper?:
Are the figures and length acceptable?:
Are the references correct?: The references are not relevant. The authors
seem to ignore a lot of work on testing, model checking and program
analysis published after 1995 (see CAV, SAS, TACAS, IWTCS, FME, ...).

12.
OTHER COMMENTS you believe would be useful to the author(s), including
pointers to missing relevant work:

The main (and attractive) idea of the paper is to use testing and model
checking together to obtain a certain degree of confidence in the actual
code. This approach is illustrated with the use of two tools. However, the
paper does not address the following points:
- As long as one has the source code, and not only object code (assembly,
  object, ...), why not envisage the automatic extraction of a model? What
  kind of automatic translation is possible?
- The main idea of the paper becomes weaker because there exist two
  models: one for the model checker and another for the tester. This may
  not be problematic for the chosen example, but the limits of the
  approach are not clearly stated.

+++++++++++++++++++++ End of FME 2001 Paper Review Report ++++++++++++++++++++++

PAPER NUMBER: 42
CATEGORY: 1
TITLE: A Combined Testing and Verification Approach for Software Reliability
AUTHOR(S): Natasha Sharygina, Doron Peled
--------------------------------------------------------------------------------

1. Briefly SUMMARIZE the paper (2-3 lines):

The paper proposes a combined verification and testing approach to improve
the reliability of software: testing (via the tool PET) is used to validate
the correctness of the translation of the actual code into a model which
serves as input to a model checking tool (here SPIN). On the model,
correctness properties expressed in Linear Temporal Logic are verified.
The methodology is demonstrated on the example of a robot control system
(RCS).

2. RELEVANCE: Please provide a rating of the paper's relevance to the FME
Symposium, using the scale:
     0 = Out of scope
     1 = Marginal interest
     2 = Minority interest
     3 = Majority interest
     4 = Outstanding interest

Numeric Rating: 3

Please comment your rating: Pragmatic approaches like the one described in
this paper are important to promote the application of formal methods in
industry.
Since formal verification of actual code is hard to achieve in practice,
the hybrid approach proposed in this paper, which combines white box
testing techniques with the benefits of formal model checking, looks
interesting to the referee. In particular, it is in accordance with one of
the main goals of the conference, the transfer of formal techniques into
industry.

3. OVERALL RECOMMENDATION: Please provide a rating of the paper's
acceptability, using the scale:
     1 = Strong reject
     2 = Weak reject
     3 = Could go either way
     4 = Weak accept
     5 = Strong accept

Numeric Rating: 4

NB: There should be a correlation between the two rates above.

4. CONFIDENCE LEVEL: Please provide a rating of your expertise in the area
addressed by the paper, using the scale:
     1 = Know virtually nothing about this area
     2 = Not too knowledgeable, but I know a bit
     3 = Know a moderate amount, about average I'd say
     4 = Not my main area of work, but I know a lot about it
     5 = My area of work, I know it extremely well

Numeric Rating: 3

NB: PC members are responsible for ensuring that 1 is not used here.

5. ORIGINALITY. What is NEW and SIGNIFICANT in the work reported here?

Comment: To the referee, the combined application of white box testing and
verification looks rather original.

6. How WORTHWHILE is the goal of the authors?

Comment: See the comments on 2. RELEVANCE.

7. How well is this goal EXPLAINED and JUSTIFIED?

Comment: The authors motivate their hybrid approach by stating that
exhaustive consistency checks between the actual code and the model are
usually not possible. Therefore they propose to use white box testing for
consistency checks, combined with formal model checking on the model. To
the referee, however, it did not become entirely clear which kind of test
concept underlies the white box test of the model (via the tool PET). The
authors say that they examined the model and compared it "to the known
behavior of the original code" using PET.
In which way was this "known behavior" expressed? And how was the
comparison done? Some more details would be helpful.

8. TECHNICAL QUALITY. Are the technical parts (definitions, statements,
specifications, proofs, algorithms, etc.) SOUND?

Comment: It did not become evident to the referee how the (manual)
translation from the Object Bench code of the RCS example to the PROMELA
model was achieved. Since only non-self-contained parts of both the Object
Bench code and the model code are presented, and since the method guiding
the translation is not explained, it is difficult to follow the discussion
in Section 3.4 (Testing and Verification Results) from a technical point of
view. In particular, it would be interesting to get some information on how
the state variables in the model are obtained from the Object Bench
transition diagram presented in Figure 4.

9. APPLICABILITY. If the work is primarily theoretical or conceptual, are
the implications in terms of applicability adequately justified? If the
paper is about a new formal technique, are satisfactory arguments presented
in favor of the new technique? If a methodology is proposed, is it sound?
If experimental results are presented, is their relevance justified?

Comment: The methodology is supported by tools and has been used on a real
application.

10. PRESENTATION: Describe the QUALITY of the writing, including
suggestions for changes where appropriate.

Comment: The paper is well written. As already mentioned under 8., some
more technical details concerning the translation from the Object Bench
code into the model code would be helpful.

11. Were there any formatting or mechanical problems with this paper?: No
Are the figures and length acceptable?: When printing out the paper, the
text labels in Figure 5 did not come out in a readable way.
Are the references correct?: The reference to [17] (Precise UML group) in
the second line of the second paragraph on page 2 ("Simulation of the model
execution [17]") looks strange. Moreover, the given link does not work.

12. OTHER COMMENTS you believe would be useful to the author(s), including
pointers to missing relevant work:

a) The definition of the neighbourhood concept did not become clear to the
referee. You say that "a related path of the flow graph can be obtained
from it by selecting at some point an alternative edge out of a condition
node. Continuing this process of obtaining alternative paths we obtain a
tree which we call neighbourhood of the original path." But if you can
choose arbitrary condition nodes and arbitrary alternative edges of the
original path, then it seems to the referee that a neighbourhood will cover
the entire flow graph of the sequential program.

b) 3.4 Testing and Verification Results, Formula Number 3:

   <> (abort_var = 1 U (end_position = 1 \/ (recovery_status = 1 /\ counter = number_joints) )

   i) A closing bracket is missing.

   ii) The informal description of this formula given in the paragraph
   below the formula does not really seem to fit the formula: since the
   "Until" operator is used, the formula implies that
   (end_position = 1 \/ (recovery_status = 1 /\ counter = number_joints))
   eventually has to hold. This does not come out in the informal
   description. Moreover, in order to get a better understanding of this
   formula, some information should be given on the use of the abort_var
   variable: Which process sets it to 1? (Only the Arm process?) When is it
   set to 1? (When the task is completed?)

+++++++++++++++++++++ End of FME 2001 Paper Review Report ++++++++++++++++++++++
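
[Editorial note] The third referee's objection (a) -- that taking arbitrary
alternative edges at condition nodes makes the "neighbourhood" of a path
degenerate into the whole flow graph -- can be made concrete with a small
sketch. The toy flow graph, the path representation, and all function names
below are invented for illustration; this is not the paper's algorithm,
only one literal reading of its informal definition.

```python
from collections import deque

# Toy acyclic flow graph: nodes with two successors are condition nodes.
flow = {
    "start": ["c1"],
    "c1": ["a", "b"],   # condition node
    "a": ["c2"],
    "b": ["c2"],
    "c2": ["c", "d"],   # condition node
    "c": ["end"],
    "d": ["end"],
    "end": [],
}

def paths_from(node):
    """All maximal paths starting at `node` (graph is acyclic)."""
    if not flow[node]:
        return [[node]]
    return [[node] + rest for nxt in flow[node] for rest in paths_from(nxt)]

def neighbourhood(path):
    """Close the given path under 'select an alternative edge out of a
    condition node at some point, then continue to completion' -- the
    informal definition quoted in comment (a)."""
    seen, frontier = {tuple(path)}, deque([path])
    while frontier:
        p = frontier.popleft()
        for i, node in enumerate(p[:-1]):
            for alt in flow[node]:
                if alt != p[i + 1]:               # an alternative edge
                    for tail in paths_from(alt):  # follow it to the end
                        q = p[:i + 1] + tail
                        if tuple(q) not in seen:
                            seen.add(tuple(q))
                            frontier.append(q)
    return seen

original = ["start", "c1", "a", "c2", "c", "end"]
# Under this unconstrained reading, the neighbourhood is *every* path of
# the flow graph -- exactly the referee's point.
assert neighbourhood(original) == {tuple(p) for p in paths_from("start")}
```

A formalization would presumably bound the number of switches (e.g. paths
differing from the original in at most one condition outcome) to keep the
neighbourhood a proper subset of the graph's paths.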
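
[Editorial note] The referee's remark (b)(ii) about the Until operator can
also be checked mechanically. The evaluator below is a minimal sketch over
finite traces (SPIN's LTL semantics are over infinite runs, so this is only
an approximation); the formula encoding and the traces are invented for
illustration. It shows that "p U q" fails whenever q never holds, i.e. the
eventuality the informal description leaves out.

```python
def holds(formula, trace, i=0):
    """Evaluate a tiny LTL fragment at position i of a finite trace.
    Formulas: ("ap", name), ("U", f, g), ("F", f)  -- F is the <> operator."""
    op = formula[0]
    if op == "ap":
        return i < len(trace) and trace[i].get(formula[1], False)
    if op == "U":  # f U g: g must *eventually* hold, with f holding until then
        _, f, g = formula
        return any(holds(g, trace, k) and
                   all(holds(f, trace, j) for j in range(i, k))
                   for k in range(i, len(trace)))
    if op == "F":  # <> f: f holds at some position from i onward
        return any(holds(formula[1], trace, k) for k in range(i, len(trace)))
    raise ValueError(f"unknown operator: {op}")

# abort_var stays 1 forever, but end_position never becomes 1:
stuck = [{"abort_var": True}] * 3
# (abort_var = 1) U (end_position = 1) is FALSE on this trace, because
# Until demands its right-hand side eventually hold.
assert not holds(("U", ("ap", "abort_var"), ("ap", "end_position")), stuck)

# Once end_position does become 1, the Until is satisfied:
done = [{"abort_var": True}, {"abort_var": True}, {"end_position": True}]
assert holds(("U", ("ap", "abort_var"), ("ap", "end_position")), done)
```

If the authors intended only "while aborting, the goal may be reached", a
weaker connective (e.g. the weak-until form p W q) would match the informal
description better than U.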