++++++++++++++++++ FME 2001: Review reports for paper 53 +++++++++++++++++++++

Dear author(s): this file was extracted automatically from a very large
mailbox. In case you find any problem concerning mail encodings (or any
other kind of anomaly disturbing your understanding of the reviews), please
email either of the PC co-chairs (pamela@research.att.com, jno@di.uminho.pt).

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

PAPER NUMBER: 53
CATEGORY: 2
TITLE: Houdini, an Annotation Assistant for ESC/Java
AUTHOR(S): Cormac Flanagan
           K. Rustan M. Leino

--------------------------------------------------------------------------------

1. Briefly SUMMARIZE the paper (2-3 lines):

The paper describes the design, implementation, and evaluation of an
annotation assistant for ESC/Java, a static checker for Java. The assistant
automatically generates annotations (pre- and postconditions, class
invariants) which are then used by ESC/Java to find defects in Java programs.

2. RELEVANCE: Please provide a rating of the paper's relevance to the FME
Symposium, using the scale:

   0 = Out of scope
   1 = Marginal interest
   2 = Minority interest
   3 = Majority interest
   4 = Outstanding interest

Numeric Rating: 3

Please comment on your rating:

The system described by the paper uses formal methods techniques to generate
annotations that can be used to increase productivity in both documentation
and validation, two of the main topics suggested for the symposium. However,
the paper focuses on the system and does not give details of the formal
methods techniques used to build or justify it; that is left for another
paper to be published elsewhere.

3. OVERALL RECOMMENDATION: Please provide a rating of the paper's
acceptability, using the scale:

   1 = Strong reject
   2 = Weak reject
   3 = Could go either way
   4 = Weak accept
   5 = Strong accept

Numeric Rating: 3

NB: There should be a correlation between the two ratings above.

4. CONFIDENCE LEVEL: Please provide a rating of your expertise in the area
addressed by the paper, using the scale:

   1 = Know virtually nothing about this area
   2 = Not too knowledgeable, but I know a bit
   3 = Know a moderate amount, about average I'd say
   4 = Not my main area of work, but I know a lot about it
   5 = My area of work, I know it extremely well

Numeric Rating: 3

NB: PC members are responsible for ensuring that 1 is not used here.

5. ORIGINALITY. What is NEW and SIGNIFICANT in the work reported here?

Comment: The algorithms and the heuristics used to generate candidate
annotations seem to be new, but also rather straightforward. The experiments
performed with the system, and the related observations, are new and
significant.

6. How WORTHWHILE is the goal of the authors?

Comment: The main goal of the authors is to improve the productivity of the
validation of unannotated Java programs, which is completely worthwhile.

7. How well is this goal EXPLAINED and JUSTIFIED?

Comment: The goal is very well explained and justified throughout the paper.

8. TECHNICAL QUALITY. Are the technical parts (definitions, statements,
specifications, proofs, algorithms, etc.) SOUND?

Comment: The algorithms seem to be sound, but the paper contains no proofs
of that, nor further technical details; the authors simply refer to another
paper to be published elsewhere. It is not clear to me whether the
algorithms would work for classes with mutually recursive methods.
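[For concreteness, the guess-and-check loop that items 1 and 8 refer to can
be sketched as follows. This is a minimal illustration of the described
approach, not the authors' implementation; refutedBy is a hypothetical
stand-in for running ESC/Java on the program annotated with the candidates.]

    import java.util.HashSet;
    import java.util.Set;

    // Minimal sketch, assuming the approach the reports describe:
    // guess candidate annotations, let ESC/Java refute some of them,
    // and iterate until the remaining set is stable.
    class HoudiniLoopSketch {

        // Hypothetical stand-in for invoking ESC/Java; a real version
        // would check the annotated program and return the refuted
        // candidates.
        static Set<String> refutedBy(String program, Set<String> candidates) {
            return new HashSet<>();
        }

        static Set<String> houdini(String program, Set<String> candidates) {
            Set<String> remaining = new HashSet<>(candidates);
            boolean changed = true;
            while (changed) {
                // Drop every refuted candidate and re-check, since removing
                // one annotation can invalidate another that relied on it.
                changed = remaining.removeAll(refutedBy(program, remaining));
            }
            return remaining; // unrefuted candidates become the annotations
        }
    }

[On this reading, the loop terminates because the candidate set is finite
and shrinks on every iteration that continues.]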
9. APPLICABILITY. If the work is primarily theoretical or conceptual, are
the implications in terms of applicability adequately justified? If the
paper is about a new formal technique, are satisfactory arguments presented
in favor of the new technique? If a methodology is proposed, is it sound? If
experimental results are presented, is their relevance justified?

Comment: The relevance of the experimental results is well justified, and
the current drawbacks of the proposed approach are clearly discussed.

10. PRESENTATION: Describe the QUALITY of the writing, including suggestions
for changes where appropriate.

Comment: The quality of the writing is OK, but could be improved in some
places:

- The limitations of ESC/Java (its incompleteness, etc.) are discussed only
  on page 12; I think it would be better to say that from the beginning, at
  the end of Section 1, for example. That would simplify the life of readers
  who are not familiar with ESC/Java.

- The description of the algorithms says that they give warnings as output,
  but at the beginning of Section 5 the reader learns that the annotated
  program is also given as output. I think this should be clear from the
  beginning; and if that is the case, why not argue for the use of the
  generated annotations as documentation? Wouldn't they be useful?

- It would be nice to say something about the other annotation assistants
  discussed in Section 7. Who developed them? It seems that the authors did,
  but it is not clear. Are there any references to them?

There are also some typos:

- pg. 3, "...the into the program..."
- pg. 13, "Is this table..."
- pg. 13, "... hold hold..."

11. Were there any formatting or mechanical problems with this paper?: Yes:
Table 1.
Are the figures and length acceptable?: Yes.
Are the references correct?: Yes.

12. OTHER COMMENTS you believe would be useful to the author(s), including
pointers to missing relevant work:

The results compare the use of Houdini+ESC/Java with the use of ESC/Java on
unannotated programs, which seems to be the most important case, but it
would be nice to also say something about the total validation effort of
using Houdini+ESC/Java versus using ManualAnnotation+ESC/Java. As it is not
clear to me whether the algorithms would work for classes with mutually
recursive methods, it would be useful to say something about that.

+++++++++++++++++++++ End of FME 2001 Paper Review Report ++++++++++++++++++++++

PAPER NUMBER: 53
CATEGORY: unspecified
TITLE: Houdini, an Annotation Assistant for ESC/Java
AUTHOR(S): Cormac Flanagan

--------------------------------------------------------------------------------

1. Briefly SUMMARIZE the paper (2-3 lines):

The paper describes Houdini, a tool for automating the annotation of
existing Java code by inferring candidate ESC/Java annotations (an
illustrative fragment of such annotations appears after item 3 below) and
accepting or rejecting them through verification by ESC/Java.

2. RELEVANCE: Please provide a rating of the paper's relevance to the FME
Symposium, using the scale:

   0 = Out of scope
   1 = Marginal interest
   2 = Minority interest
   3 = Majority interest
   4 = Outstanding interest

Numeric Rating: 2

Please comment on your rating:

The paper seems a little premature ("premature" in the sense that the paper
needs to mature a little; I'll try to explain below).

3. OVERALL RECOMMENDATION: Please provide a rating of the paper's
acceptability, using the scale:

   1 = Strong reject
   2 = Weak reject
   3 = Could go either way
   4 = Weak accept
   5 = Strong accept

Numeric Rating: 4

NB: There should be a correlation between the two ratings above.
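[For readers unfamiliar with ESC/Java, the fragment below illustrates the
kind of annotations item 1 mentions. The code is ours, not the paper's; it
uses ESC/Java's standard comment-pragma syntax (requires, ensures,
invariant), which the checker reads from stylized Java comments.]

    // Illustrative fragment (not from the paper under review).
    class Counter {
        /*@ invariant count >= 0; */           // a class invariant
        private int count;

        //@ requires n >= 0;                   // a precondition
        //@ ensures count == \old(count) + n;  // a postcondition
        public void add(int n) {
            count = count + n;
        }
    }

[Item 8 below discusses how much depends on the initial candidate set from
which such pragmas are guessed.]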
4. CONFIDENCE LEVEL: Please provide a rating of your expertise in the area
addressed by the paper, using the scale:

   1 = Know virtually nothing about this area
   2 = Not too knowledgeable, but I know a bit
   3 = Know a moderate amount, about average I'd say
   4 = Not my main area of work, but I know a lot about it
   5 = My area of work, I know it extremely well

Numeric Rating: 4

NB: PC members are responsible for ensuring that 1 is not used here.

5. ORIGINALITY. What is NEW and SIGNIFICANT in the work reported here?

Comment: The authors' attempt to automate the addition of formal annotations
to an existing code base could be of significant benefit to the testing and
maintenance phases of the program life-cycle.

6. How WORTHWHILE is the goal of the authors?

Comment: Their attempt to improve program quality is very worthwhile.

7. How well is this goal EXPLAINED and JUSTIFIED?

Comment: The goal, reducing the number of program defects through formal
analysis, is not very well explained. The tool they propose would be of most
benefit to either maintenance programmers or program testers, for debugging
existing code with no formal annotations. This context is not given.

8. TECHNICAL QUALITY. Are the technical parts (definitions, statements,
specifications, proofs, algorithms, etc.) SOUND?

Comment: Although the effort seems promising, a lot depends on the chosen
initial candidate annotation set. The paper does not go into much detail
justifying the set proposed by the authors; in particular, the trade-off
between false positives and false negatives could have been analysed more
formally (instead of just "this appears to be a more cost-effective
strategy"). Another danger lies in the fact that the same static checker is
used both for inferring the annotation set and for checking the final
combination of program code and annotations. This could lead to certain
categories of errors going undetected (a partial "blindness" of the chosen
approach). Research in this area is needed!

9. APPLICABILITY. If the work is primarily theoretical or conceptual, are
the implications in terms of applicability adequately justified? If the
paper is about a new formal technique, are satisfactory arguments presented
in favor of the new technique? If a methodology is proposed, is it sound? If
experimental results are presented, is their relevance justified?

Comment: Although experimental results are given (on real-life programs),
the effectiveness of the tool is not clear: no comparisons are made with
more traditional methods of defect finding. Additional measurements are
needed to see whether the tool provides significant improvements.

10. PRESENTATION: Describe the QUALITY of the writing, including suggestions
for changes where appropriate.

Comment: Table 1 on page 14 shows an error in rows 4, 5 and 6 (the totals
for these rows seem to have been rotated). For the rest, the paper is clear,
the writing style is good, and there were surprisingly few typos (page 13,
bottom: "Is this table" and "hold hold").

11. Were there any formatting or mechanical problems with this paper?: No.
Are the figures and length acceptable?: Yes.
Are the references correct?: Yes.

12. OTHER COMMENTS you believe would be useful to the author(s), including
pointers to missing relevant work:

None.

+++++++++++++++++++++ End of FME 2001 Paper Review Report ++++++++++++++++++++++