IMTBS - Sovereign artificial intelligence in France: a strategic and technological challenge

Michel-Marie Maudet, co-founder and CEO of LINAGORA, took part in an afternoon session at the Institut Mines-Télécom Business School (IMT-BS) dedicated to generative Artificial Intelligence (AI). The session began with a research workshop that brought together researchers and teachers to explore current AI issues. It was followed by a masterclass in which Michel-Marie shared his expertise with students, addressing the challenges and opportunities that generative AI presents for the workplace and for society. Together, the two sessions gave participants ample room for reflection and debate on the future of the technology.

The beginnings of LINAGORA

Michel-Marie Maudet and Alexandre Zapolsky, the two founders of LINAGORA, began this adventure on the premises of IMT-BS, formerly the Institut National des Télécoms (INT), at the end of 1999. At the time, Michel-Marie was a soldier at the Brétigny air base and Alexandre was a student on the Evry campus, where they both played university rugby. They created LINAGORA in 2000. Today, 25 years later, the company has over 200 employees and is present on four continents. It specialises in enterprise collaboration and uses Open Source to develop sovereign alternatives to tools such as Office 365 or Google Workspace.

Open Source, a fundamental pillar of digital sovereignty

The rise of Open Source technologies in France and Europe is a crucial step towards digital sovereignty. For 25 years, we have been developing Open Source technologies that offer control and transparency, reinforcing confidence in the technologies used. With AI, this transparency is becoming essential, especially in the long term when it comes to critical decisions (insurance, credit, etc.). This link between technology and future decision-making raises ethical questions and societal choices, particularly concerning the cultural influence of American data on AI models.

When it comes to digital sovereignty, Michel-Marie asks:

« 80% of internet and cloud services are actually American solutions. Is that a problem? »

This problem extends to AI, where existing models are mostly trained on English-language data, which influences the responses they generate and introduces cultural biases. This can lead to distortions, for example in the recognition of certain historical facts or in the analysis of students' CVs.
Open Source offers transparency, enabling us to maintain control over the data we use, and thus guarantee greater confidence in the tool.
As Michel-Marie Maudet points out, ‘transparency is trust’.

This is particularly relevant in the field of artificial intelligence (AI), where mastery of data and algorithms is becoming a major issue in ensuring sovereign and fair decision-making.

LINAGORA's mission is therefore to reduce dependence on large American companies, which is essential when it comes to protecting sensitive information held by French government departments and businesses.

The biases of current AI models and the importance of sovereign models

Current AI models such as ChatGPT and Claude, which come mainly from American and Chinese giants, have cultural biases that raise questions of relevance and security in other national contexts, particularly in France and Europe. Michel-Marie Maudet highlights that

« language models, such as Llama or Falcon, are mainly trained on English data, with less than 1% of data in French ».

This leads to biased responses and preferences, which do not necessarily reflect European culture or values. The example of erroneous answers about the history of the first microcomputer is striking:

« When I ask the question: what was the first microcomputer to come on the market?

A model that was trained massively on English data will answer the Altair 8800, and indeed, that is an American machine that appeared in the 70s. But the first personal microcomputer to be marketed was actually the Micral N, from a French company. »

These ‘hallucinations’ of AI models reinforce the need to develop local AI, with datasets that are representative of local cultures and contexts. LINAGORA is therefore committed, with the support of the OpenLLM-France community, to the creation of LUCIE, an open source generative AI with ‘30% French data’. LUCIE is a compact, energy-efficient model centred on European languages, and an essential step towards technological autonomy in France and Europe.

« The first iteration we did was with a model whose dataset contained less than 1% French. In Lucie, we have 30% French, 30% English, 20% code and mathematics, because this helps the models to reason. The remaining 10% is split between the three other major European languages: German, Italian and Spanish. »
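
To make the quoted proportions concrete, here is a minimal Python sketch that splits a hypothetical token budget according to them. The budget and the category labels are assumptions chosen for illustration, not details of how LUCIE was actually trained. The figures as quoted add up to 90%, so the sketch also reports the share that the quote leaves unspecified.

```python
# Shares quoted in the talk, in percent of the training dataset.
# Labels are illustrative; this is not the LUCIE training pipeline.
QUOTED_SHARES = {
    "french": 30,
    "english": 30,
    "code_and_math": 20,
    "german_italian_spanish": 10,
}


def tokens_per_source(total_tokens: int, shares: dict[str, int]) -> dict[str, int]:
    """Split a token budget according to integer percentage shares."""
    return {name: total_tokens * pct // 100 for name, pct in shares.items()}


def unaccounted_share(shares: dict[str, int]) -> int:
    """Percentage of the dataset not covered by the given shares."""
    return 100 - sum(shares.values())


if __name__ == "__main__":
    budget = 1_000_000  # hypothetical token budget, chosen for easy arithmetic
    for name, tokens in tokens_per_source(budget, QUOTED_SHARES).items():
        print(f"{name}: {tokens}")
    print(f"not covered by the quoted figures: {unaccounted_share(QUOTED_SHARES)}%")
```

Integer percentages are used rather than floating-point fractions so that the sums stay exact.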

Environmental and economic issues: towards more sober models

The environmental impact of training AI models is an increasingly hotly debated topic. Michel-Marie MAUDET points out that

« each request on ChatGPT consumes around a litre of water »,

an alarming figure at a time when sustainability is becoming a global priority. LINAGORA is taking a pragmatic approach by developing smaller, more frugal AI models such as LUCIE. This strategy aims to reconcile technological performance with environmental responsibility. The issue is not only environmental, but also economic.

Growing dependence on expensive foreign services is forcing French companies to look for alternatives. By creating sovereign AI, local companies can not only reduce their costs, but also contribute to the national economy by developing exportable technologies aligned with European values of transparency and inclusiveness.

Workshop with IMTBS researchers