Our Origins

The journey to LUCIE started in June 2023, when LINAGORA decided to bootstrap the OpenLLM France community, uniting contributors (more than 900 members as of January 2025) around the goal of building open generative AI aligned with shared European values. By February 2024, this community had evolved into OpenLLM Europe, which aims to connect and strengthen European initiatives for open, ethical generative AI models.

To take this vision further, LINAGORA led the OpenLLM France consortium, formed with 11 partners from the community, to answer the "Communs Numériques dans le domaine de l'IA Générative" (Digital Commons for Generative AI) call for projects. Now a France 2030 laureate, the consortium is embarking on a two-year mission, starting in late 2024, to create open generative AI commons, with a particular focus on education and the EdTech sector.

In parallel, at the end of 2023, LINAGORA initiated LUCIE's training with the support of the community, and especially of GENCI (Grand Équipement National de Calcul Intensif), which provided access to the Jean Zay supercomputer. Today, in January 2025, the release of LUCIE marks a significant step forward in delivering a truly open-source, ethical, and efficient AI model for Europe and beyond.

What makes LUCIE truly open source?

Transparent Data

All training datasets are open and licensed for public use. From collection to curation, we ensure transparency at every step.

Open Algorithms

Our training methodologies, fine-tuning processes, and "secret sauce" are fully documented and openly available for anyone to explore, use, and improve.

Freely Accessible Models

LUCIE's weights, checkpoints, and source code are accessible under the Apache 2.0 license. This permissive license allows anyone, anywhere in the world, to use, adapt, and deploy the model for any purpose, including commercial use, subject only to the license's attribution and notice requirements, ensuring true global accessibility and innovation.

Designed for sovereignty and sustainability

LUCIE was built to address the unique challenges of creating ethical, efficient, and accessible AI.

European Sovereignty

LUCIE embodies a commitment to European values: respecting cultural diversity, promoting ethical AI development, and complying with the EU AI Act.

Compact and Efficient

Optimized for low-resource environments, LUCIE's architecture enables deployment on "GPU poor" infrastructures and even mobile devices.

Eco-Responsibility

By focusing on quality over quantity in training data, we ensure a lighter environmental footprint without compromising performance.

Test LUCIE on your own

You can test LUCIE's capabilities firsthand through our dedicated SaaS platform, available now at LUCIE.chat. Whether you're exploring the model's performance or integrating it into your workflows, the platform offers seamless access to LUCIE's features.

Try it now

LUCIE in figures

7 billion parameters

Model Size: 7 billion parameters, compact and optimized for high performance across diverse applications. In 2025, we will build a more compact version of LUCIE (under 3 billion parameters).

3.1 trillion tokens

Training Dataset: 3.1 trillion tokens, carefully curated to balance quality and diversity, including French, English, German, Spanish, Italian, and code.

600K GPU hours

Training Hours: Over 600,000 GPU hours on the Jean Zay supercomputer, utilizing 512 NVIDIA H100 GPUs in parallel.

Languages Supported

Multilingual focus, with a primary emphasis on French and the main European languages, ensuring cultural and linguistic representation.

2023-2025

Development Timeline: Training initiated in late 2023, culminating in the model's release in January 2025.
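To put the figures above in perspective, here is a rough back-of-envelope sketch in Python. It uses only the nominal numbers quoted in this section (7 billion parameters, 600,000 GPU hours, 512 GPUs). The bytes-per-parameter values for each weight format are general assumptions and not LUCIE-specific measurements, and the wall-clock estimate assumes that all 512 GPUs ran concurrently for the whole job, which this page does not state.

```python
# Back-of-envelope arithmetic from the published figures; illustrative only.

PARAMS = 7e9            # "7 billion parameters" (nominal figure)
GPU_HOURS = 600_000     # "over 600,000 GPU hours" (treated as a lower bound)
GPUS = 512              # NVIDIA H100 GPUs used in parallel

# Assumed storage cost per parameter for common weight formats
# (generic values, not taken from the LUCIE documentation).
BYTES_PER_PARAM = {"fp32": 4.0, "bf16": 2.0, "int8": 1.0, "int4": 0.5}


def weight_memory_gb(params: float, bytes_per_param: float) -> float:
    """Memory needed to hold the weights alone, in GB (10**9 bytes)."""
    return params * bytes_per_param / 1e9


def wall_clock_days(gpu_hours: float, gpus: int) -> float:
    """Elapsed days if every GPU ran concurrently for the whole job."""
    return gpu_hours / gpus / 24


if __name__ == "__main__":
    for fmt, size in BYTES_PER_PARAM.items():
        print(f"{fmt}: {weight_memory_gb(PARAMS, size):.1f} GB of weights")
    print(f"wall clock: at least {wall_clock_days(GPU_HOURS, GPUS):.1f} days")
```

Under these assumptions, the weights alone take roughly 14 GB in 16-bit precision and about 3.5 GB at 4 bits per parameter, and the training run corresponds to a little under 49 days of continuous use of the 512 GPUs. Real memory use at inference time is higher, since activations and the attention cache are not counted here.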

Future of LUCIE in 2025

The journey of LUCIE doesn't stop here. Our roadmap for 2025 outlines ambitious milestones to enhance capabilities and expand the model's applications:

Q1

Enhanced Fine-Tuning and a Better Toolkit for AI Makers

We will refine LUCIE's instruction-following capabilities (fine-instruct), introduce function calling for better integration with external systems, and release at least one model under 3 billion parameters to ensure accessibility for resource-constrained environments.

Q2

Advanced Retrieval-Augmented Generation (RAG)

LUCIE will gain an advanced RAG function, enabling it to leverage external knowledge bases for more accurate and context-aware responses.

Q3

Multimodal Expansion with Voice Support

We will extend LUCIE's capabilities into multimodal AI, with a focus on voice processing in French, opening new possibilities for applications in education, accessibility, and beyond.

Q4

Agentic AI Framework

LUCIE will evolve into a robust agentic AI framework, harnessing its capabilities to power autonomous systems and to lay the foundations for Large Action Models (LAMs), while maintaining transparency, trust, and ethical safeguards.