My research interests include language and biology. During my PhD I worked on distributional semantics, developing new ways of learning interpretable and disentangled representations of words and texts. More recently, I have become increasingly interested in advancing the natural sciences with machine learning methods, and so joined the DeepMind Science team to develop new models for tasks in biology.

Recent publications

Highly accurate protein structure prediction with AlphaFold. John Jumper, Richard Evans, Alexander Pritzel, Tim Green, Michael Figurnov, Olaf Ronneberger, Kathryn Tunyasuvunakool, Russ Bates, Augustin Žídek, Anna Potapenko, Alex Bridgland, Clemens Meyer, Simon A. A. Kohl, Andrew J. Ballard, Andrew Cowie, Bernardino Romera-Paredes, Stanislav Nikolov, Rishub Jain, Jonas Adler, Trevor Back, Stig Petersen, David Reiman, Ellen Clancy, Michal Zielinski, Martin Steinegger, Michalina Pacholska, Tamas Berghammer, Sebastian Bodenstein, David Silver, Oriol Vinyals, Andrew W. Senior, Koray Kavukcuoglu, Pushmeet Kohli & Demis Hassabis. Nature, 2021.

Highly accurate protein structure prediction for the human proteome. Kathryn Tunyasuvunakool, Jonas Adler, Zachary Wu, Tim Green, Michal Zielinski, Augustin Žídek, Alex Bridgland, Andrew Cowie, Clemens Meyer, Agata Laydon, Sameer Velankar, Gerard J. Kleywegt, Alex Bateman, Richard Evans, Alexander Pritzel, Michael Figurnov, Olaf Ronneberger, Russ Bates, Simon A. A. Kohl, Anna Potapenko, Andrew J. Ballard, Bernardino Romera-Paredes, Stanislav Nikolov, Rishub Jain, Ellen Clancy, David Reiman, Stig Petersen, Andrew W. Senior, Koray Kavukcuoglu, Ewan Birney, Pushmeet Kohli, John Jumper & Demis Hassabis. Nature, 2021.

Multi-agent Communication meets Natural Language: Synergies between Functional and Structural Language Learning. Angeliki Lazaridou, Anna Potapenko, and Olivier Tieleman. ACL-2020.

Compressive Transformers for Long-Range Sequence Modelling. Jack W. Rae, Anna Potapenko, Siddhant M. Jayakumar, Chloe Hillier, Timothy P. Lillicrap. ICLR-2020.

PhD (2014 - 2019)

“You shall know a word by the company it keeps” (Firth, 1957).

During my PhD, I worked on various methods of distributional semantics. I find it fascinating that people (and machines) can learn the meaning of words from their contexts alone.

I worked on Probabilistic Topic Models, in particular the non-Bayesian approach of Additive Regularization, which makes it possible to satisfy multiple practical requirements for a model simultaneously.
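The distributional hypothesis behind this line of work can be illustrated with a small sketch (a hypothetical toy corpus, not code from any of the publications below): words that occur in similar contexts end up with similar co-occurrence vectors.

```python
# Toy illustration of the distributional hypothesis: words sharing
# contexts get similar vectors. Hypothetical example corpus.
from collections import Counter

corpus = [
    "the cat sat on the mat",
    "the dog sat on the rug",
    "the cat chased the dog",
    "stocks fell on the market",
    "stocks rose on the market",
]

window = 2
vocab = sorted({w for line in corpus for w in line.split()})

# Count context words within a +/- 2-word window around each word.
counts = {w: Counter() for w in vocab}
for line in corpus:
    words = line.split()
    for i, w in enumerate(words):
        for j in range(max(0, i - window), min(len(words), i + window + 1)):
            if j != i:
                counts[w][words[j]] += 1

def cosine(a, b):
    # Cosine similarity between the raw co-occurrence vectors of a and b.
    va = [counts[a][w] for w in vocab]
    vb = [counts[b][w] for w in vocab]
    dot = sum(x * y for x, y in zip(va, vb))
    na = sum(x * x for x in va) ** 0.5
    nb = sum(y * y for y in vb) ** 0.5
    return dot / (na * nb)

# "cat" and "dog" share contexts (the, sat, on, chased), so they end up
# closer to each other than either is to "stocks".
print(cosine("cat", "dog"), cosine("cat", "stocks"))
```

Topic models and regularized word embeddings refine this basic idea by factorizing such co-occurrence statistics under interpretability constraints.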

Events and schools

Gave a talk on ARTM embeddings [Slides] while an academic guest in Thomas Hofmann’s group at ETH Zurich (November 2017 - April 2018).

Rep4NLP workshop co-located with ACL-2017, Vancouver, Canada, July 30 - August 4, 2017. Regularized Topic Models for Sparse Interpretable Word Embeddings [Poster].

DataFest: data science conference and workshops, Moscow, Russia, February 11-12, 2017. Vector representations of words and documents [Video (in Russian)].

DeepHack.Q&A: hackathon on Deep Learning and Q&A systems, Moscow, Russia, February 2016. Word embeddings and topic models: bridging the gap [Slides].

The 2nd Conference of the Yandex School of Data Analysis, “Machine Learning: Prospects and Applications”, Berlin, Germany, October 5-8, 2015. Linguistic regularization of topic models [Poster].

The 5th Lisbon Machine Learning School (LxMLS-2015), Lisbon, Portugal, July 16-23, 2015.

Visit to Microsoft Research Cambridge, UK, April 2015. Additive regularization of topic models and its parallel implementation [Slides].

The 8th Russian Summer School in Information Retrieval (RuSSIR 2014), Nizhny Novgorod, Russia, August 18-22, 2014. Additive Regularization for Learning Interpretable Topic Models [Poster].

Earlier publications

A. Potapenko. Probabilistic approach for embedding arbitrary features of text. Analysis of Images, Social Networks and Texts (AIST-2018). [Slides].

A. Potapenko, A. Popov, K. Vorontsov. Interpretable probabilistic embeddings: bridging the gap between topic models and neural networks. Artificial Intelligence and Natural Language (AINL-2017). [Slides].

K. V. Vorontsov, A. A. Potapenko. Additive Regularization of Topic Models. Machine Learning, Special Issue “Data Analysis and Intelligent Optimization”. Springer, 2015. Volume 101, Issue 1, pp. 303-323. DOI: 10.1007/s10994-014-5476-6.

K. V. Vorontsov, A. A. Potapenko, A. V. Plavin. Additive Regularization of Topic Models for Topic Selection and Sparse Factorization. The Third International Symposium on Learning and Data Sciences (SLDS 2015), April 20-22, 2015, Royal Holloway, University of London, UK. Springer, A. Gammerman et al. (Eds.), LNAI 9047, pp. 193-202.

K. V. Vorontsov, A. A. Potapenko. Tutorial on Probabilistic Topic Modeling: Additive Regularization for Stochastic Matrix Factorization. Analysis of Images, Social Networks, and Texts. Springer, 2014, CCIS vol. 436, pp. 29-46.

A. A. Potapenko, K. V. Vorontsov. Robust PLSA Performs Better Than LDA. The 35th European Conference on Information Retrieval (ECIR-2013), Moscow, Russia, 24-27 March 2013. LNCS 7814, Springer-Verlag, 2013, pp. 784-787.

K. V. Vorontsov, A. A. Potapenko. Regularization of probabilistic topic models to improve interpretability and determine the number of topics. International Conference on Computational Linguistics “Dialogue”. Computational Linguistics and Intellectual Technologies, Moscow, 2014, pp. 707-719.

A. A. Potapenko. Regularization of probabilistic topic model for forming topic kernels. XXI International Scientific Conference “Lomonosov-2014”. Moscow: Publishing Department of the Faculty of CMC, MSU, 2014, pp. 80-82.

K. V. Vorontsov, A. A. Potapenko. Modifications of the Generalized EM-algorithm for Probabilistic Topic Modeling. Journal of Machine Learning and Data Analysis (ISSN 2223-3792), 2013.

A. A. Potapenko. Sparse Probabilistic Topic Models. Abstracts of the 16th Russian Conference “Mathematical Methods of Pattern Recognition”, Kazan. Moscow: MAKS Press, 2013, p. 89.

K. V. Vorontsov, A. A. Potapenko. Regularization, Robustness and Sparsity of Probabilistic Topic Models. Computer Research and Modeling, 2012. Vol. 4, No. 4, pp. 693-706.

K. V. Vorontsov, A. A. Potapenko. Robust Sparse Probabilistic Topic Models. The 9th International Conference “Intellectualization of Information Processing” (IIP-2012), Budva, Montenegro. Moscow: Torus Press, 2012, pp. 605-608.