In this episode we take a closer look at L3arbi, which was presented by the DevoxxMA team at the event, from a technical point of view with Nouamane and Faissal.




Guests

Faissal Boutaounte

Nouamane Tazi

Notes


0:00:00 - Introduction and welcoming


0:01:21 - Who/what is L3arbi?


0:16:24 - Is there a possibility that L3arbi will learn Arabic dialects other than Darija?


0:26:35 - Challenges of audio transcription


0:27:31 - Live demo of L3arbi


0:32:56 - Training data, Whisper, and how many hours of Darija data are used?


0:34:30 - Format of the data used for training, and architecture of the web application presented at DevoxxMA


0:39:37 - The use case of DevoxxMA


0:44:30 - Fine-tuning an LLM: models and details


0:56:20 - Evaluation set for the different dialects per region in Morocco


1:03:31 - Did you use manual transcription for the audio sets?


1:12:09 - Future plans for the data sources of the L3arbi solution


1:15:12 - Plans to open-source? Are there APIs available for developers to extend its functionality?


1:25:00 - Q&A and giveaway


1:59:40 - Conclusion and goodbye.




Links

Hugging Face
Translation demo for/from any language
Common Voice
Whisper API



Prepared and Presented by

Meriem Zaid

