More on V&V and SQA at LANL for ASC and the NNSA
This previous post mentioned V&V and SQA at Los Alamos National Laboratory (LANL) within the framework of the Advanced Simulation & Computing (ASC) Program for the National Nuclear Security Administration (NNSA).
Verification and Validation seems to have become a very important part of scientific and engineering software development at Los Alamos National Laboratory.
This section:
Typical Questions That V&V Can Answer…
Has this entry:
• “Models can be validated without data.”
Wrong! There is no validation without data because model validation must assess prediction accuracy relative to a physical reality. While code verification and calculation verification are concerned with the accuracy of the numerical implementation and convergence, respectively, validation activities focus on the adequacy of numerical simulations when applied to the description of reality, which requires experimental observations. We nevertheless recognize that the lack of test data can pose serious problems to model validation. Rigorously controlled expert elicitation techniques can provide information that is substituted for experimental testing in cases of severe lack of data and uncertainty.
Code-to-code comparisons are not Validation. Never have been, never will be.
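To make the distinction concrete, here is a minimal Python sketch of the three activities; every number in it is hypothetical and invented purely for illustration, not taken from the LANL material. Calculation verification checks the observed order of accuracy from a grid-refinement study, validation compares a prediction with an experimental measurement and its uncertainty, and a code-to-code comparison references nothing but another code.

```python
# A minimal sketch, not taken from the LANL documents; every number here is hypothetical.
import math

# --- Calculation verification: observed order of accuracy from a grid-refinement study.
# Computed values of a scalar quantity of interest on three grids, refinement ratio r = 2.
f_coarse, f_medium, f_fine = 1.0820, 1.0205, 1.0051
r = 2.0
p_observed = math.log((f_coarse - f_medium) / (f_medium - f_fine)) / math.log(r)
print(f"Observed order of accuracy: {p_observed:.2f} (compare with the formal order of the method)")

# --- Validation: compare the prediction against an experimental measurement and its uncertainty.
prediction = f_fine      # finest-grid result
measurement = 1.010      # hypothetical experimental value
meas_sigma = 0.008       # hypothetical experimental uncertainty (one standard deviation)
print(f"|prediction - measurement| = {abs(prediction - measurement):.4f}"
      f" vs experimental uncertainty {meas_sigma:.4f}")

# --- Code-to-code comparison: agreement between two codes says nothing about physical reality.
other_code = 1.0049      # hypothetical result from a second, independent code
print(f"Code-to-code difference: {abs(prediction - other_code):.4f}"
      " (useful for debugging, but it is not validation)")
```

The layout makes the point plainly: the first two checks are anchored to the governing equations and to physical measurements, respectively, while the third is anchored only to another piece of software.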
Special Pleading Continues
Professor Easterbrook continues the special pleading for exemption that is so common among software developers who have not yet properly addressed important IV&V and SQA issues. This latest version of the Professor's pleading includes the following aspects: (1) a Red Herring, (2) broad generalizations, and (3) an appeal to authority.
Appeal to Authority (His Own)
Professor Easterbrook has not yet cited any of the literature associated with IV&V and SQA for engineering and scientific software, literature that presents procedures and processes that have been widely accepted and proven successful. What more can you say about that? Self-reference based on a position of authority always fails.
Red Herring
Professor Easterbrook is the only person who has suggested that the IV&V and SQA procedures and processes applied to commercial software be applied to engineering and scientific software. The subject software is not generally developed fresh from scratch; instead it has evolved over decades. The engineering community has developed procedures and processes that take into account this obvious and critically important aspect of real-world complex software. Easterbrook attempts to introduce a comparison of apples and zebras into the discussions.
Broad Generalizations
The activities Easterbrook describes are Standard Operating Procedures for every software project that I have experience with. Direct experience. It is not IV&V Lite; it's plain and simple SOP. And SOP is not IV&V and SQA; never has been, never will be.
Collaborative comparisons of software results, and more importantly collaborative comparisons with experimental data, have a very long history in engineering and science software. This software is almost always not commercial software. In my industry, this work started in the mid-1970s. In turbulence modeling the work started in the early 1980s with the (in)famous Stanford Olympics; infamous because so many calculations got so many different wrong results in so many different ways, sometimes even when using the same models, methods, and software. Turbulence is a hard problem.
It is common in many industries for user groups to be formed around a single piece of software. These groups have memberships numbering in the tens to a few hundreds. Importantly, the groups focus on a single version of the software, the frozen production-grade version of the code. That's a lot of eyes looking in detail into all aspects of a single piece of software. As I understand Easterbrook, the same kind of effort in the GCM community is significantly diluted relative to this standard.
A Prediction
I predict that Professor Easterbrook will write papers declaring that the Climate Science GCM software is pure, that these papers will be reviewed by friendly cohorts equally unaware of the literature presenting the modern methods applicable to engineering and science software, that they will be published only in the proper peer-reviewed Scientific Journals, and that they will be quoted in the next IPCC reports. However, as mentioned above, self-reference based solely on a position of authority always fails.
Not Even Wrong
This Comment by John Mashey is simply wrong. Look around this site, and this one, and this one, and this one, and this one, and this one.
Looks like we’re getting some Traction
This is interesting: Computational science: …Error. From Nature News, even. Comments are allowed over there.
Old Science, New Science
Old Science:
Testable hypotheses Validated by Causality. Quality.
See Robert M. Pirsig, 1974.
New Science:
Plausible connections; Validation and Causality optional.
V&V and SQA: Part 5, SQA
Software Quality Assurance Procedures
Continue reading
V&V and SQA: Part 2, Requirements for Production-Grade Software
The requirements for release of software for production-grade applications include:
Continue reading
V&V and SQA: Part 1, Definitions and Characterization
I'm going to post a series of short summaries of some of the central aspects of verification and validation (V&V) and software quality assurance (SQA) for production-grade computer software. These subjects have received significant investigation starting in the 1980s (more or less), have matured, and have been successfully applied to a wide range of scientific and engineering software. I don't intend to give a complete exposition of the subjects; the field is much too big.
Continue reading
Hard Concepts
Boy, it’s difficult to get my mind around many of the concepts discussed in the post Tracking down the uncertainties in weather and climate prediction.
Updated July 10, 2010.
I have looked around but have not been successful in finding additional material from either the meeting or the presentation. I suspect all information presented at the meeting will eventually show up on the CCSM Web site.
Here's a part that I find very unsettling, starting at the 15th paragraph of the post.
And now, we have another problem: climate change is reducing the suitability of observations from the recent past to validate the models, even for seasonal prediction:
[Figure from the post: Climate change shifts the climatology, so that models tuned to 20th century climate might no longer give good forecasts.]
Hence, a 40-year hindcast set might no longer be useful for validating future forecasts. As an example, the UK Met Office got into trouble for failing to predict the cold winter in the UK for 2009-2010. Re-analysis of the forecasts indicates why: Models that are calibrated on a 40-year hindcast gave only 20% probability of cold winter (and this was what was used for the seasonal forecast last year). However, models that are calibrated on just the past 20-years gave a 45% probability. Which indicates that the past 40 years might no longer be a good indicator of future seasonal weather. Climate change makes seasonal forecasting harder!
The conclusion, "Climate change makes seasonal forecasting harder!", is basically unsupported. There are a very large number of critically important aspects between "Analysis" and "Changed climatology" that are simply skipped over.
Firstly, the Analysis has been conducted with models, methods, computer code, associated application procedures, and users, any one of which separately, or in combination with the others, could contribute to the differences between the 40-year and 20-year hindcasts. Secondly, within each of these aspects there are many individual parts and pieces that could cause the difference; taken together the sum is enormous. Thirdly, relative to the time scales for climate change in the physical world, 20 years seems rather short, and maybe even 40 years is, too. Fourthly, no evidence has been offered to show that the climatology has in fact changed sufficiently to contribute to the difference.
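On the third point above, sampling uncertainty alone deserves attention. The following Python sketch uses hypothetical cold-winter counts, chosen only so that the frequencies match the 20% and 45% figures quoted from the post, to show the binomial standard error attached to a frequency estimated from 40 versus 20 seasons; it is not the Met Office analysis.

```python
# A minimal sketch with hypothetical event counts; this is not the Met Office analysis.
import math

def freq_and_stderr(n_events, n_seasons):
    """Empirical event frequency and its binomial standard error."""
    p = n_events / n_seasons
    return p, math.sqrt(p * (1.0 - p) / n_seasons)

# Hypothetical cold-winter counts chosen to reproduce the quoted 20% and 45% frequencies.
for n_events, n_seasons in [(8, 40), (9, 20)]:
    p, se = freq_and_stderr(n_events, n_seasons)
    print(f"{n_seasons}-season window: frequency = {p:.2f} +/- {se:.2f} (one standard error)")
```

With these assumed counts the standard errors are roughly 0.06 and 0.11, so the difference between the two estimates is on the order of two combined standard errors; small samples by themselves account for a good part of the apparent shift, before any change in climatology is invoked.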
The presentation seems to have leapt from (1) there are differences, to (2) the climatology has changed. I find this very unsettling. The phrase, Jumping to conclusions, seems to be applicable.
With the given information, I think about all we can say is that the models, methods, code, application procedures, and users did not successfully calculate the data.
I don’t see that any ‘tracking down’ was done.