We consider solving ill-posed imaging inverse problems under a generic forward model. Because such problems are ill-posed, prior models that encourage certain image structure are needed to restrict the space of candidate images when finding a solution. Common approaches use hand-crafted prior models whose parameters are tuned through trial and error, which can be time-intensive and prone to human bias. Other approaches based on machine learning try to learn the underlying image generation model from samples of the data distribution of interest and use it to solve a constrained inverse problem; however, in many applications ground-truth images are unavailable. In contrast, we propose to either select or learn an image generation model from the noisy measurements alone, without incorporating prior constraints on image structure. We first show how, given a collection of candidate models, the Evidence Lower Bound (ELBO) of a variational distribution can be used to select an appropriate prior. We then show how, in the absence of available priors, one can learn the underlying model directly from a set of noisy measurements using the ELBO. Crucially, we assume that the ground-truth images share common structure because they are drawn from the same underlying distribution. The learned model exploits this structure through its architecture: a shared generator with a compressed latent space, in which the posterior for each measurement is learned variationally. This allows the model to learn global properties of the data distribution from noisy observations without overfitting. We illustrate our framework on a variety of inverse problems, ranging from denoising to compressed sensing problems inspired by black-hole imaging.
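To make the two uses of the ELBO concrete, one possible way to write them is sketched below. All notation here is an assumption introduced for illustration and is not taken from the text above: $y$ denotes a noisy measurement, $G$ a generator mapping a latent code $z$ with prior $p(z)$ to an image, $p(y \mid G(z))$ the measurement likelihood induced by the forward model and noise, and $q_\phi(z)$ a variational distribution with parameters $\phi$. The first line is the standard variational bound on the model evidence; the second selects among $K$ candidate generators; the third learns a shared generator $G_\theta$ from $N$ measurements $y^{(1)}, \dots, y^{(N)}$, each with its own variational posterior.

```latex
% Assumed notation (illustrative only): y measurement, G generator,
% z latent code with prior p(z), q_\phi(z) variational distribution,
% p(y | G(z)) likelihood induced by the forward model and noise.
\begin{align}
\log p(y \mid G)
  &\ge \mathbb{E}_{z \sim q_\phi(z)}\big[\log p(y \mid G(z)) + \log p(z) - \log q_\phi(z)\big]
  =: \mathrm{ELBO}(G, q_\phi; y), \\
\hat{G}
  &= \arg\max_{G \in \{G_1, \dots, G_K\}} \; \max_{\phi} \; \mathrm{ELBO}(G, q_\phi; y), \\
(\hat{\theta}, \hat{\phi}_1, \dots, \hat{\phi}_N)
  &= \arg\max_{\theta,\, \phi_1, \dots, \phi_N} \; \frac{1}{N} \sum_{i=1}^{N} \mathrm{ELBO}\big(G_\theta, q_{\phi_i}; y^{(i)}\big).
\end{align}
```

Under these assumptions, the bound is tight when $q_\phi(z)$ equals the true posterior over $z$ given $y$, which is why a larger optimized ELBO can serve as a proxy for higher model evidence when comparing candidate priors.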