Hey guys, in this episode I talk about the pre-training process of state-of-the-art NLP and computer vision transformer architectures. Since 2017 we have been training NLP networks (BERT, GPT, ELECTRA) with self-supervised objectives such as masked language modeling, and now (since 2022) we can also train vision networks (MAE) with the same masked-modeling idea, masking image patches instead of tokens. This kind of self-supervised pre-training enables us to train accurate models that really understand semantics and context without labeled data. I also talk about a tabular transformer architecture (TabTransformer - 2020) that uses the same approach and achieves state-of-the-art results compared to ensemble methods.
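
To make the masked-modeling idea concrete, here is a minimal sketch of the pre-training objective in plain Python. This is a simplified illustration, not the exact recipe of any of the papers below: the mask token, the 15% masking probability, and the helper function are assumptions for demonstration purposes.

```python
import random

MASK = "[MASK]"

def mask_tokens(tokens, mask_prob=0.15, seed=1):
    """Randomly hide a fraction of the tokens; the pre-training task is to
    predict the original token at each masked position (no labels needed)."""
    rng = random.Random(seed)
    inputs, targets = [], []
    for tok in tokens:
        if rng.random() < mask_prob:
            inputs.append(MASK)   # the model sees only the mask token here
            targets.append(tok)   # ...and must reconstruct the original
        else:
            inputs.append(tok)
            targets.append(None)  # no loss computed at unmasked positions
    return inputs, targets

sentence = "the cat sat on the mat".split()
inputs, targets = mask_tokens(sentence)
```

The same principle carries over to vision (MAE masks image patches rather than words) and to tabular data (TabTransformer masks categorical feature values), which is why one self-supervised recipe works across all three domains.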




Instagram: https://www.instagram.com/podcast.lifewithai/


Linkedin: https://www.linkedin.com/company/life-with-ai


BERT paper: https://arxiv.org/pdf/1810.04805.pdf


GPT-3 paper: https://arxiv.org/pdf/2005.14165.pdf


ELECTRA paper: https://arxiv.org/pdf/2003.10555.pdf


MAE paper: https://arxiv.org/pdf/2111.06377.pdf


TabTransformer paper: https://arxiv.org/pdf/2012.06678.pdf