Machine learning models are being used extensively in many important areas,
but there is no guarantee a model will always perform well or as its developers
intended. Understanding the correctness of a model is crucial to prevent
potential failures that may have a significant detrimental impact in critical
application areas. In this paper, we propose a novel framework to efficiently
test a machine learning model using only a small amount of labeled test data.
The idea is to estimate the metrics of interest for a model under test using
a Bayesian neural network (BNN). We develop a novel data augmentation method
that helps train the BNN to high accuracy. We also devise an
information-theoretic sampling strategy to select data points so as to obtain
accurate estimates of the metrics of interest. Finally, we conduct an
extensive set of experiments testing various machine learning models on
different types of metrics. Our experiments show that the metric estimates
produced by our method are significantly more accurate than those of existing
baselines.