A complementary open-source chain.
Large language models tokenize text inefficiently in most languages and often fail on domain-specific questions. At Magibu we address this at every link of the chain with open source: tokenization faithful to a language's morphology, a methodology for building embeddings in your own language, high-quality fine-tuning data, and tools that connect the model to the outside world.
Morphological Tokenizer
Split text into morphology-faithful units and recombine them.
Language-Native Embeddings
An open method for building tokenizers and embeddings in your own language.
Fine-tune Datasets
High-quality open datasets tailored to task and persona.
LLM Tools
The ability to call the right tool at the right time.
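As a rough illustration of the "split and recombine" idea behind a morphological tokenizer, here is a minimal Python sketch. It is not Magibu's tokenizer: the suffix list, the function names `split_morphemes` and `recombine`, and the greedy suffix-stripping strategy are all assumptions made for this example, and a real system would need a proper morphological analysis of the language.

```python
# Toy illustration only. SUFFIXES is a tiny hypothetical list of Turkish
# suffixes, and greedy stripping is an assumed strategy, not Magibu's method.
SUFFIXES = ["ler", "lar", "iniz", "den", "dan", "de", "da"]


def split_morphemes(word: str) -> list[str]:
    """Greedily peel known suffixes off the end of a word."""
    parts: list[str] = []
    stem = word
    changed = True
    while changed:
        changed = False
        # Try longer suffixes first so a long match is preferred.
        for suffix in sorted(SUFFIXES, key=len, reverse=True):
            # Always leave at least two characters for the stem.
            if stem.endswith(suffix) and len(stem) - len(suffix) >= 2:
                parts.insert(0, suffix)
                stem = stem[: -len(suffix)]
                changed = True
                break
    return [stem] + parts


def recombine(units: list[str]) -> str:
    """Join the units back into the original surface form."""
    return "".join(units)


units = split_morphemes("evlerinizden")  # Turkish: "from your houses"
print(units)
print(recombine(units))
```

The point of the round trip is that no information is lost: the units concatenate back to exactly the input word, which is what lets a tokenizer of this kind be used for both encoding and decoding.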
How do I contribute?
- 01 Go to the repo you're interested in on GitHub and review the open issues
- 02 Comment on an issue or open a new one
- 03 Fork the repo and create a new branch
- 04 Make your change, test it, and document it
- 05 Open a pull request and explain what you changed and why
- 06 After review, your change is merged and you join the contributor list
Turkey-focused open R&D. Pick up issues on GitHub or apply to join the team. Contributors are given priority in future hiring.
Open science, benchmarks, and community contributions.
Magibu AI Weekly
Open-source weekly digest: AI news, papers, models, benchmarks, and underrepresented language updates.