Updates
Progress notes and research updates as ÒreAyò develops. Updated periodically with honest reporting on what's working, what's challenging, and what comes next.
Stage 1 Progress: Building the Foundations
Summary
Stage 1 focused on building and validating the infrastructure required to train speech models for Nigerian languages. No production models were trained during this stage.
What Stage 1 covered
- Data processing and preparation pipelines
- Training and experimentation infrastructure
- Configuration systems for multilingual research
Stage 1 intentionally excluded model training, speech recognition, and user-facing features.
What now exists
By the end of Stage 1, ÒreAyò has a complete system capable of running speech model training end-to-end in a controlled research environment.
Key learnings
- Nigerian speech data is significantly more limited than expected
- Multilingual balance introduces non-trivial trade-offs
- Early models will require iterative refinement rather than one-off training
These insights will directly inform Stage 2.
Known limitations
Stage 1 does not yet enable:
- Speech recognition or transcription
- Real-world evaluation
- User-facing functionality
This stage was strictly foundational.
What Stage 1 enables
Stage 1 enables systematic experimentation with self-supervised learning for Nigerian languages.
What comes next
Stage 2 will focus on training and evaluating the first production-scale self-supervised speech encoder, while validating whether current data volumes are sufficient.
Stage 1 was about understanding the problem properly. Stage 2 will test whether that understanding holds under real training conditions.
Want to follow along?
Join the waitlist to receive updates as ÒreAyò progresses through research and development.