Exploring the unknown, together

Cohere For AI is a non-profit research lab that seeks to solve complex machine learning problems. We support fundamental research that explores the unknown, and are focused on creating more points of entry into machine learning research.

Curiosity-driven collaboration

We are committed to making meaningful progress in machine learning research through open collaboration. We believe that technology is powerful, and empowering different perspectives ensures responsible innovation.

Fundamental research lab

We contribute to progress in machine learning through fundamental research. We see contributions to traditional conferences and publications in journals as an important part of our work, but also support efforts that go “beyond the research paper” and encourage scientific communication through different mediums.

Meet our Staff

Scholars Program

The Scholars Program provides the opportunity to work alongside some of the best researchers and engineers in the world, exploring the unknown, together. It will serve as an open, supportive environment that provides an alternative point of entry into NLP research.

Accepted applicants will join a dedicated team of passionate researchers and industry experts from January 2024 to August 2024 and will be paired with a project proposal that allows them to grow as researchers. Participation is full-time and paid. As part of the program, Scholars will have access to a large-scale experimental framework and world-class research experts, and will help advance our commitment to supporting responsible, fundamental research on machine learning topics while prioritizing good stewardship of open source scientific practices.

Learn more

Aya: An Open Science Initiative

Aya is a global project that aims to build a multilingual language model via instruction tuning that harnesses the collective wisdom and contributions of people from all over the world. The goal is to make language model development more accessible and collaborative and to address the under-representation of certain languages in natural language processing research. Aya is open to anyone who is passionate about advancing the field of natural language processing and is committed to promoting open science. Learn more about the project in this blog post.

Join the Aya Discord server and start contributing in your language today.

Start contributing to Aya

Our Open Science Community

We’re not just another research group. We are an open science community that conducts top-tier ML research while creating more points of entry into the field.

Our research community is a space where researchers, engineers, linguists, social scientists, and lifelong learners connect and collaborate with each other. We come together from over 100 countries around the world and support large and small scale research collaborations.

Join Us
Research Grant Program

Cohere For AI research grants are designed to support academic partners who are conducting research with the goal of releasing a peer-reviewed scientific artifact. Our program provides academic partners, developers, researchers, and other members of our community with subsidized access to the Cohere API. We are interested in supporting requests for API access that enable data-for-good applications of large language models (LLMs) and/or the responsible use of LLMs. Learn more about the goals of this program on our blog.

Apply now


We bring together leading researchers and rising stars in the field of machine learning to discuss their research learning journeys and showcase their technical achievements. Research is inherently a human endeavor, and our event series provides insights from beginning to breakthrough.

To stay up to date on upcoming talks, sign up to our mailing list.

Sign up to our mailing list

Spotlight papers

The Grand Illusion: The Myth of Software Portability and Implications for ML Progress

How portable are popular ML software frameworks? This work reveals how costly straying from a narrow set of hardware-software combinations can be.

Authors: Fraser Mince, Dzung Dinh, Jonas Kgomo, Neil Thompson, Sara Hooker

Read the paper

Pushing Mixture of Experts to the Limit: Extremely Parameter Efficient MoE for Instruction Tuning

Can we use Mixture of Experts (MoE) for instruction tuning under extreme parameter constraints? Here, we push MoEs to the limit with ultra-lightweight experts, enabling parameter-efficient MoEs with high performance.

Authors: Ted Zadouri, Ahmet Üstün, Arash Ahmadian, Beyza Ermiş, Acyr Locatelli, Sara Hooker

Read the paper

When Less is More: Investigating Data Pruning for Pretraining LLMs at Scale

Leveraging data pruning to examine what makes “good data.” We explore several metrics for measuring LLM pretraining data and find that we can remove up to 70% of pretraining data while achieving better test set performance.

Authors: Max Marion, Ahmet Üstün, Luiza Pozzobon, Alex Wang, Marzieh Fadaee, Sara Hooker

Read the paper

Evaluating the Social Impact of Generative AI Systems in Systems and Society

How do we benchmark the social impact of generative systems? This cross-institutional research collaboration provides a guide to evaluating the social impact of Generative AI Systems.

Authors: Irene Solaiman, Zeerak Talat, William Agnew, Lama Ahmad, Dylan Baker, Su Lin Blodgett, Hal Daumé III, Jesse Dodge, Ellie Evans, Sara Hooker, Yacine Jernite, Alexandra Sasha Luccioni, Alberto Lusoli, Margaret Mitchell, Jessica Newman, Marie-Therese Png, Andrew Strait, Apostol Vassilev

Read the paper

Intriguing Properties of Quantization at Scale

Are emergent quantization difficulties of LLMs truly inherent to scale, or can they be altered and conditioned by optimization choices made during pre-training?

Authors: Arash Ahmadian, Saurabh Dash, Hongyu Chen, Bharat Venkitesh, Stephen Gou, Phil Blunsom, Ahmet Üstün, Sara Hooker

Read the paper

On the Challenges of Using Black-Box APIs for Toxicity Evaluation in Research

How changes to the Perspective API, used widely for toxicity evaluation, impact research reproducibility and rankings of model risk.

Authors: Luiza Pozzobon, Beyza Ermis, Patrick Lewis, Sara Hooker

Read the paper

Metadata Archaeology: Unearthing Data Subsets by Leveraging Training Dynamics

Providing a unified and efficient framework for Metadata Archaeology – uncovering and inferring metadata of examples in a dataset.

Authors: Shoaib Ahmed Siddiqui, Nitarshan Rajkumar, Tegan Maharaj, David Krueger, Sara Hooker

Read the paper

Efficient Methods for Natural Language Processing: A Survey

Synthesizing methods and findings in NLP efficiency, guiding new researchers in the field, and inspiring the development of new methods.

Authors: Marcos Treviso, Tianchu Ji, Ji-Ung Lee, Betty van Aken, Qingqing Cao, Manuel R. Ciosici, Michael Hassid, Kenneth Heafield, Sara Hooker, Pedro H. Martins, André F. T. Martins, Peter Milder, Colin Raffel, Edwin Simpson, Noam Slonim, Niranjan Balasubramanian, Leon Derczynski, Roy Schwartz

Read the paper

Intriguing Properties of Compression on Multilingual Models

Exploring compression as a way to improve model robustness for low-resource languages.

Authors: Kelechi Ogueji, Orevaoghene Ahia, Gbemileke Onilude, Sebastian Gehrmann, Sara Hooker, Julia Kreutzer

Read the paper

Large Language Models are not Zero Shot Communicators

Investigating the implicature gap in Large Language Models.

Authors: Laura Ruis, Akbir Khan, Stella Biderman, Sara Hooker, Tim Rocktäschel, Edward Grefenstette

Read the paper

More Papers

Work by Cohere For AI and Technical Staff at Cohere
  • Adversarial Nibbler: A Data-Centric Challenge for Improving the Safety of Text-to-Image Models


    Alicia Parrish, Hannah Rose Kirk, Jessica Quaye, Charvi Rastogi, Max Bartolo, Oana Inel, Juan Ciro, Rafael Mosquera, Addison Howard, Will Cukierski, D. Sculley, Vijay Janapa Reddi, Lora Aroyo

    Read the paper
  • The Presidio Recommendations on Responsible Generative AI - World Economic Forum


    Sara Hooker, and over 100 other thought leaders.

    Read the recommendations
  • Language Models Don't Always Say What They Think: Unfaithful Explanations in Chain-of-Thought Prompting


    Miles Turpin, Julian Michael, Ethan Perez, Samuel R. Bowman

    Read the paper
  • FAIR-Ensemble: When Fairness Naturally Emerges From Deep Ensembling


    Wei-Yin Ko, Daniel D’souza, Karina Nguyen, Randall Balestriero, Sara Hooker

    Read the paper
  • Associative Memory Augmented Asynchronous Spatiotemporal Representation Learning for Event-based Perception


    Uday Kamal, Saurabh Dash, Saibal Mukhopadhyay

    Read the paper
  • PASHA: Efficient HPO and NAS with Progressive Resource Allocation


    Andrej Bohdal, Lukas Balles, Martin Wistuba, Beyza Ermis, Cedric Archambeau, Giovanni Zappella

    Read the paper
  • BigScience: A Case Study in the Social Construction of a Multilingual Large Language Model


    Christopher Akiki, Giada Pistilli, Margot Mieskes, Matthias Gallé, Thomas Wolf, Suzana Ilic, Yacine Jernite

    Read the paper
  • MTEB: Massive Text Embedding Benchmark


    Niklas Muennighoff, Nouamane Tazi, Loïc Magne, Nils Reimers

    Read the paper
  • Improving Policy Learning via Language Dynamics Distillation


    Victor Zhong, Jesse Mu, Luke Zettlemoyer, Edward Grefenstette, Tim Rocktäschel

    Read the paper
  • Prioritized Training on Points that are Learnable, Worth Learning, and Not Yet Learnt


    Sören Mindermann, Jan Brauner, Muhammed Razzak, Mrinank Sharma, Andreas Kirsch, Winnie Xu, Benedikt Höltgen, Aidan N. Gomez, Adrien Morisot, Sebastian Farquhar, Yarin Gal

  • Studying the Impact of Magnitude Pruning on Contrastive Learning Methods


    Francesco Corti, Rahim Entezari, Sara Hooker, Davide Bacciu, Olga Saukh

    Read the paper
  • Robust Distillation for Worst-class Performance


    Serena Wang, Harikrishna Narasimhan, Yichen Zhou, Sara Hooker, Michal Lukasik, Aditya Krishna Menon

    Read the paper
  • Lifting the Veil on Hyper-parameters for Value-based Deep Reinforcement Learning


    João G.M. Araújo, Johan S. Obando-Ceron, Pablo Samuel Castro

    Read the paper
  • αNAS: Neural Architecture Search using Property Guided Synthesis


    Charles Jin, Phitchaya Mangpo Phothilimthana, Sudip Roy

    Read the paper
  • Scalable Training of Language Models using PAX pjit and TPUv4


    Joanna Yoo, Kuba Perlin, Siddhartha Rao Kamalakara, João G.M. Araújo

    Read the paper
  • Mitigating Harm in Language Models with Conditional-Likelihood Filtration


    Helen Ngo, Cooper Raterink, João G.M. Araújo, Ivan Zhang, Carol Chen, Adrien Morisot, Nicholas Frosst

    Read the paper
  • No News is Good News: A Critique of the One Billion Word Benchmark


    Helen Ngo, João G.M. Araújo, Jeffrey Hui, Nicholas Frosst

    Read the paper
  • Exploring Low Rank Training of Deep Neural Networks


    Siddhartha Rao Kamalakara, Acyr Locatelli, Bharat Venkitesh, Jimmy Ba, Yarin Gal, Aidan N. Gomez

    Read the paper
  • Predicting Twitter Engagement With Deep Language Models


    Maksim N Volkovs, Zhaoyue Cheng, Mathieu Ravaut, Hojin Yang, Kevin Shen, Jinpeng Zhou, Anson Wong, Saba Zuberi, Ivan Zhang, Nick Frosst, Helen Ngo, Carol Chen, Bharat Venkitesh, Stephen Gou, Aidan N. Gomez

    Read the paper
  • Interlocking Backpropagation: Improving depthwise model-parallelism


    Aidan N. Gomez, Oscar Key, Kuba Perlin, Stephen Gou, Nick Frosst, Jeff Dean, Yarin Gal


Sparking great conversations, collaborations, and community


Who we are

Cohere For AI is a registered non-profit, and core to our mission statement is contributing to knowledge in the public domain. We collaborate with researchers from private and public institutions, as well as independent researchers unaffiliated with an institution.

We are committed to open sourcing code from our programs, and promoting good stewardship of open source scientific practices.

Our Team

History of For AI

In 2017, a team of friends, classmates, and engineers started a distributed research collaboration, with a focus on creating a medium for early-career AI enthusiasts to engage with experienced researchers – they called it “for.ai.” Two of those co-founding members, Aidan Gomez and Ivan Zhang, later went on to co-found Cohere, and many of the original members went on to do exciting things (pursuing PhDs, working at industry and academic labs).

At the time, For AI was one of the first community-driven research groups to support independent researchers around the world. Today, Cohere is proud to reintroduce For AI as Cohere For AI, a dedicated research lab and community for exploring the unknown, together.

Frequently Asked Questions

  • Do we charge for our educational programs or community membership?

    Cohere For AI is a registered non-profit. We do not charge for participation in any of our programs, and we are committed to supporting educational outreach programs, including the compute resources and infrastructure needed to participate in machine learning research.

  • Are you hiring for research positions or interns?

    Our full list of open positions is available here.