Fine-tune Datasets

A community project gathering high-quality Turkish/English dialogue datasets in a single format to adapt models to a task, language, or persona.

datasetsfine-tuningturkishhuggingfacecommunity

Vision

Build, with the community, the high-quality, openly licensed fine-tune datasets Turkish language models need to be genuinely useful. Each contributor publishes from their own HuggingFace profile; this repo serves as the standard and index.

Contribution areas

Produce 100+ Turkish + 100+ English examples in any category
Extend existing datasets or run quality control
Contribute fine-tune notebooks and scripts
Build data quality validation tools

Tech stack: Python · HuggingFace Datasets · Parquet · License: CC BY 4.0

I want to join this project

Verify your Google account, fill out the form, then pick a task from the GitHub issue list to get started.

Fine-tune Datasets

Vision

Categories

Contribution areas

Resources & links

I want to join this project