Most of the available advanced misfit functions for full waveform inversion (FWI) are hand-crafted, and the performance of those misfit functions is data-dependent. Thus, we propose to learn a misfit function for FWI, entitled ML-misfit, based on machine learning. Inspired by the optimal transport of the matching filter misfit, we design a neural network (NN) architecture for the misfit function in a form similar to comparing the mean and variance for two distributions. To guarantee the resulting learned misfit is a metric, we accommodate the symmetry of the misfit with respect to its input and a Hinge loss regularization term in a meta-loss function to satisfy the "triangle inequality" rule. In the framework of meta-learning, we train the network by running FWI to invert for randomly generated velocity models and update the parameters of the NN by minimizing the meta-loss, which is defined as accumulated difference between the true and inverted models. We first illustrate the basic principle of the ML-misfit for learning a convex misfit function for travel-time shifted signals. Further, we train the NN on 2D horizontally layered models, and we demonstrate the effectiveness and robustness of the learned ML-misfit by applying it to the well-known Marmousi model.