Finally, an excuse to learn to use theorem provers and include them in my resear...

dselsam · on June 29, 2017

Author here.

> Also as scribu states, this doesn't allow you to prove final goals of the system like "classify images with 95% accuracy", nor does it save you from insufficient or inaccurate data.

We focused on implementation-based notions of correctness, which are both much easier to state and much more useful when actually developing software, but in principle downstream notions of correctness could be stated and proved as well. For example, for some models and under some assumptions about the data, you could formally verify that the accuracy of your trained model will probably be high on an unseen test set. However, the limiting factor is that nobody knows that much yet about the assumptions under which most useful models (such as neural networks) are guaranteed to perform well.

kg9000 · on June 29, 2017

> Note that they claim performance similar to TensorFlow in CPU only. But a direct translation of this to Python TensorFlow code could run in GPU, and is probably easy to make. It wouldn't be proven, but it would be much more likely to be correct than the code you have constructed for your research just now.

They are wrapping unverified C++ code (Eigen) for the primitive kernels anyway, such as gemm, so AFAIK it could be extended to launch kernels on GPUs without any modification to the part in Lean that they proved correct.

Since most ML users are happy to trust TensorFlow, one could also wrap (unverified) TensorFlow and then use their approach to prove properties about real TensorFlow programs, e.g. that various model-transformations are sound.

dselsam · on June 29, 2017

Author here.

> They are wrapping unverified C++ code (Eigen) for the primitive kernels anyway, such as gemm, so AFAIK it could be extended to launch kernels on GPUs without any modification to the part in Lean that they proved correct.

That is correct. Our Lean development assumes that we have primitive kernels that we trust, and it is otherwise agnostic about how they are implemented. It would be straightforward to have kernels run on GPUs.

> Since most ML users are happy to trust TensorFlow, one could also wrap (unverified) TensorFlow and then use their approach to prove properties about real TensorFlow programs, e.g. that various model-transformations are sound.

This is a great idea. There are a bunch of useful properties TensorFlow users could prove about their programs. For example, a user could prove that their TensorFlow program will never produce a NaN, or that a mutation operation is "safe" and won't corrupt the gradients (see https://groups.google.com/a/tensorflow.org/forum/#!topic/dis... for an example of unsafe usage). Moreover, since verifying specific properties of specific programs is much more routine than verifying algorithms, most of the proofs could be push-button.

To TensorFlow users reading this: what are some common errors you make that are annoying to track down, that we may be able to catch (or prove the absence of) using formal methods? If there is enough demand perhaps we will work on this next.