We enrich BERT's self-supervision strategy by applying self-supervision at the word-sense level. Our model, SenseBERT, achieves significantly improved lexical disambiguation abilities, setting state-of-the-art results on the Word-in-Context (WiC) task.
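At a high level, sense-level self-supervision can be pictured as a second masked-prediction head that targets coarse WordNet supersenses alongside the usual masked-word head. The sketch below is a minimal PyTorch illustration of such a joint loss; the dimensions, module names, and soft-labeling scheme are illustrative assumptions, not SenseBERT's exact implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Illustrative sizes only (assumptions, not SenseBERT's exact configuration).
VOCAB_SIZE = 30522        # BERT-base WordPiece vocabulary
NUM_SUPERSENSES = 45      # WordNet coarse-grained supersenses
HIDDEN = 768              # encoder hidden size

class MaskedWordAndSenseHeads(nn.Module):
    """Two heads over the encoder state of each masked token:
    the standard masked-word head plus a supersense head."""
    def __init__(self):
        super().__init__()
        self.word_head = nn.Linear(HIDDEN, VOCAB_SIZE)
        self.sense_head = nn.Linear(HIDDEN, NUM_SUPERSENSES)

    def forward(self, masked_states):  # (num_masked, HIDDEN)
        return self.word_head(masked_states), self.sense_head(masked_states)

def joint_self_supervised_loss(word_logits, sense_logits, word_labels, allowed_senses):
    """word_labels: (num_masked,) gold token ids.
    allowed_senses: (num_masked, NUM_SUPERSENSES) binary mask of the
    supersenses WordNet lists for each masked word. Without sense-annotated
    text, any listed supersense counts, so the target is a soft distribution."""
    word_loss = F.cross_entropy(word_logits, word_labels)
    soft_targets = allowed_senses / allowed_senses.sum(-1, keepdim=True)
    sense_loss = -(soft_targets * F.log_softmax(sense_logits, -1)).sum(-1).mean()
    return word_loss + sense_loss

# Toy usage: 8 masked positions, each word allowing a single supersense.
heads = MaskedWordAndSenseHeads()
word_logits, sense_logits = heads(torch.randn(8, HIDDEN))
labels = torch.randint(0, VOCAB_SIZE, (8,))
allowed = torch.zeros(8, NUM_SUPERSENSES)
allowed[:, 3] = 1.0
loss = joint_self_supervised_loss(word_logits, sense_logits, labels, allowed)
```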
We develop Exemplar Guided Active Learning (EGAL), a method for economically annotating training data for tasks with extremely skewed label distributions, such as disambiguating rare word senses.
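The "exemplar guided" idea can be illustrated with a toy sampler: rank the unlabeled pool by embedding similarity to a handful of exemplars of the rare sense, so annotation effort concentrates where positives are likely, with a small random slice for exploration. The sketch below is a simplification under those assumptions (the embedding space, pool, and selection rule are all placeholders); the actual EGAL procedure is more involved.

```python
import numpy as np

def cosine_sim(a, b):
    # Row-wise cosine similarity between two embedding matrices.
    a = a / np.linalg.norm(a, axis=-1, keepdims=True)
    b = b / np.linalg.norm(b, axis=-1, keepdims=True)
    return a @ b.T

def exemplar_guided_batch(pool_embs, exemplar_embs, batch_size, explore_frac=0.2):
    """Pick the next annotation batch: mostly the pool items closest to any
    rare-sense exemplar, plus a few random items for exploration."""
    rng = np.random.default_rng(0)
    scores = cosine_sim(pool_embs, exemplar_embs).max(axis=1)  # nearest exemplar
    n_explore = int(batch_size * explore_frac)
    n_exploit = batch_size - n_explore
    ranked = np.argsort(-scores)               # descending similarity
    exploit = ranked[:n_exploit]
    explore = rng.choice(ranked[n_exploit:], size=n_explore, replace=False)
    return np.concatenate([exploit, explore])

# Toy usage: 1000 pool sentences and 5 exemplars in a 128-d embedding space.
pool = np.random.randn(1000, 128)
exemplars = np.random.randn(5, 128)
to_annotate = exemplar_guided_batch(pool, exemplars, batch_size=20)
```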
Text generation via language models (LMs) is improving dramatically, but LMs do not attribute their generated text to its sources and often make mistakes. We propose the simple framework of In-Context Retrieval-Augmented Language Models (In-Context RALM), which grounds any off-the-shelf LM in knowledge from external sources and attributes the text it generates to those sources.
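Concretely, in-context retrieval augmentation requires no change to the LM itself: retrieved documents are simply prepended to the input before generation, and the retrieved document can be surfaced for attribution. The sketch below shows this pattern with Hugging Face transformers and a deliberately trivial keyword retriever; the corpus, retriever, and model choice are placeholders, not the paper's setup (which pairs a frozen LM with a standard retriever such as BM25).

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder corpus; in practice this would be a large external knowledge source.
CORPUS = [
    "Apollo 11 landed on the Moon on July 20, 1969.",
    "The Eiffel Tower is located in Paris, France.",
]

def retrieve(query: str) -> str:
    # Toy scorer: count overlapping lowercase words between query and document.
    overlap = lambda doc: len(set(query.lower().split()) & set(doc.lower().split()))
    return max(CORPUS, key=overlap)

tokenizer = AutoTokenizer.from_pretrained("gpt2")  # any off-the-shelf LM works
model = AutoModelForCausalLM.from_pretrained("gpt2")

prompt = "When did Apollo 11 land on the Moon?"
doc = retrieve(prompt)

# The core of in-context RALM: prepend the retrieved text to the LM's input,
# leaving the model itself untouched.
inputs = tokenizer(doc + "\n" + prompt, return_tensors="pt")
output = model.generate(**inputs, max_new_tokens=20)
answer = tokenizer.decode(output[0], skip_special_tokens=True)

print(answer)
print("Source:", doc)  # attribution: the grounding document is known
```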