Review of Jean-Pierre LORRÉ's speech at the IMA

LUCIE7B, a small model with big ambitions.


«We are a big fish made up of small fish. By joining forces and sharing both data and models in an open format, we can come up with solutions comparable to those of our major competitors, who have considerably greater resources than we do.»

Our Research Director Jean-Pierre LORRÉ gave an exclusive presentation at the IMA - Innovation Makers Alliance on ‘LUCIE, the French open source LLM: challenges, implementation and roadmap’.


LUCIE, our open-source AI, is a model with 7 billion parameters, but don't let its size fool you:

  • Smaller models like this one are less computationally intensive.
  • They often achieve similar or even better performance on targeted tasks.


To obtain a high-performance AI model, two elements are essential: the quantity and quality of the data. But there is one factor that is often overlooked... the language of the data!

We trained LUCIE on a dataset with a strong share of French data:

  • 40% of the data is in French
  • 25% in English

The rest is divided between German, Italian and Spanish.
To give you an idea, only a small percentage of the data used to train ChatGPT is in French.
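
As a rough illustration of the mix described above, the share left for the remaining languages can be worked out in a few lines of Python. This is only a sketch: the variable names are hypothetical, and the presentation does not say how the remainder is split between German, Italian and Spanish, so only their combined share is computed.

```python
# Language shares quoted in the presentation, in percent.
stated_shares = {"French": 40, "English": 25}

# The other languages named in the presentation. Their individual
# shares are not given, so only the combined remainder is computed.
other_languages = ["German", "Italian", "Spanish"]

# Whatever is not French or English is left for the other languages.
remainder = 100 - sum(stated_shares.values())

print(f"Combined share of {', '.join(other_languages)}: {remainder}%")
```

In other words, French alone accounts for more of the data than German, Italian and Spanish put together.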


Betting on French-language data means guaranteeing AI that is more relevant, more sovereign, and better aligned with our linguistic practices.