RE: LeoThread 2025-11-05 15-48

in LeoFinance, 21 days ago

Part 6/9:

  • Saving prompts and generated outputs systematically with timestamped filenames to facilitate data management.

This method makes it practical to build large, varied datasets without writing each example by hand.
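
As a rough illustration of the timestamped-filename idea, here is a minimal Python sketch. The function name `save_pair`, the `prompts` and `completions` folder names, and the choice of a nanosecond timestamp are all assumptions made for this example, not details taken from the video.

```python
import time
from pathlib import Path


def save_pair(root: Path, prompt: str, completion: str) -> str:
    """Save a prompt and its generated output under one shared timestamp.

    Hypothetical layout: <root>/prompts/<stamp>.txt and
    <root>/completions/<stamp>.txt, so the two files can be matched by name.
    """
    # One timestamp per pair; nanoseconds keep names distinct when many
    # examples are generated in quick succession.
    stamp = str(time.time_ns())
    for folder, text in (("prompts", prompt), ("completions", completion)):
        target = root / folder
        target.mkdir(parents=True, exist_ok=True)
        (target / f"{stamp}.txt").write_text(text, encoding="utf-8")
    return stamp
```

Because both files share a name, a later step can pair them up simply by listing one folder and looking up the same filename in the other.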

Fine-Tuning Data Formatting

Once the synthetic prompts and responses are generated, the next step involves formatting the data for fine-tuning:

  • Structuring each record with a clear prompt and corresponding completion.

  • Employing string formatting to embed variables (e.g., genre, location) into prompts dynamically.

  • Managing file storage and naming conventions, and splitting the data properly for training.
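
The formatting steps above could look roughly like the following Python sketch. The template text, the function names, the 80/20 split, and the one-JSON-object-per-line layout are assumptions for illustration; the source only says that each record pairs a prompt with a completion and that variables such as genre and location are embedded with string formatting.

```python
import json
import random

# Hypothetical template; the variable names follow the examples in the text.
PROMPT_TEMPLATE = "Write a {genre} story set in {location}."


def build_records(rows):
    """Turn dicts with 'genre', 'location', 'completion' into prompt/completion records."""
    return [
        {
            "prompt": PROMPT_TEMPLATE.format(genre=row["genre"], location=row["location"]),
            "completion": row["completion"],
        }
        for row in rows
    ]


def split_records(records, validation_fraction=0.2, seed=0):
    """Shuffle with a fixed seed, then return (training, validation) lists."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    n_validation = int(len(shuffled) * validation_fraction)
    return shuffled[n_validation:], shuffled[:n_validation]


def to_jsonl(records):
    """Serialize records as one JSON object per line."""
    return "\n".join(json.dumps(record, ensure_ascii=False) for record in records)
```

Fixing the seed keeps the split reproducible, so the same examples stay in the validation set across runs.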

Shapiro emphasizes testing the generated examples before any actual training, to confirm that the prompts reliably yield the desired outputs.
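
One way to picture that pre-training check is a small spot-check loop, sketched below in Python. Both `generate` (a stand-in for whatever function calls the model) and `is_acceptable` (whatever test defines a "desired output") are hypothetical callables supplied by the user; nothing here reflects a specific API from the video.

```python
def spot_check(prompts, generate, is_acceptable):
    """Return (prompt, output) pairs whose output fails the acceptance check.

    generate: hypothetical callable mapping a prompt string to model output.
    is_acceptable: hypothetical callable returning True for a usable output.
    """
    failures = []
    for prompt in prompts:
        output = generate(prompt)
        if not is_acceptable(output):
            failures.append((prompt, output))
    return failures
```

Running this on a small sample of prompts and reading the failures by hand is a cheap way to catch a badly worded template before it is used to generate the full dataset.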