We released the preprint "On the Value of Cross-Modal Misalignment in Multimodal Representation Learning."