Distilling LLMs: Training an In-House Model with a Mature One for Retrieval-Augmented Generation

June 25 @ 7:00 pm - 8:00 pm

This presentation outlines a practical pipeline for distilling large language models into compact, in-house alternatives using mature teacher models. We focus on retrieval-augmented generation workflows, where distilled students must learn context grounding, citation discipline, and latency-efficient behavior. The methodology covers data curation from teacher-generated RAG trajectories, constrained supervised fine-tuning with context masking, and evaluation using faithfulness, robustness, and efficiency metrics.
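
As a sketch of the context-masking step, the Python snippet below (using the transformers library) masks retrieved-context and question tokens out of the training loss, so the student is supervised only on the answer tokens it must learn to produce. The prompt template, tokenizer choice, and helper name are illustrative assumptions, not the speaker's exact pipeline.

from transformers import AutoTokenizer

# Placeholder student tokenizer; any causal-LM tokenizer works the same way.
tokenizer = AutoTokenizer.from_pretrained("gpt2")

def build_masked_example(context: str, question: str, answer: str) -> dict:
    # Render one teacher-generated RAG trajectory as a single sequence.
    prompt = f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"
    prompt_ids = tokenizer(prompt, add_special_tokens=False)["input_ids"]
    # Append EOS so the student also learns where the answer stops.
    answer_ids = tokenizer(" " + answer + tokenizer.eos_token,
                           add_special_tokens=False)["input_ids"]
    # -100 is the ignore_index of the cross-entropy loss in transformers:
    # context and question positions contribute no gradient, so the student
    # is graded only on the grounded answer, not on parroting the context.
    return {
        "input_ids": prompt_ids + answer_ids,
        "labels": [-100] * len(prompt_ids) + answer_ids,
    }

example = build_masked_example(
    context="Paris has been the capital of France since 508 CE.",
    question="What is the capital of France?",
    answer="Paris.",
)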

We demonstrate end-to-end integration with the Hugging Face platform, leveraging the Model Hub for teacher selection, Datasets for curated training data, trl and peft for parameter-efficient training, Spaces for interactive validation, and Leaderboards for community benchmarking. Comparative analysis shows distillation achieves 85–95% of teacher performance at 20% of the inference cost, enabling deployable, privacy-compliant RAG systems. The talk concludes with mitigation strategies for common pitfalls and future directions in agentic and multi-teacher distillation.
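
To illustrate the trl/peft portion of that stack, here is a minimal parameter-efficient training sketch, assuming a recent trl release that ships SFTConfig (the API has changed across versions). The student model, toy dataset, and hyperparameters are placeholders, not the configuration used in the talk.

from datasets import Dataset
from peft import LoraConfig
from trl import SFTConfig, SFTTrainer

# Toy stand-in for a curated dataset of teacher-generated RAG trajectories.
train_dataset = Dataset.from_list(
    [{"text": "Context: ...\nQuestion: ...\nAnswer: ..."}] * 8
)

# LoRA adapter: only a small set of low-rank weights is trained, which is
# what keeps student fine-tuning cheap relative to full fine-tuning.
peft_config = LoraConfig(r=16, lora_alpha=32, lora_dropout=0.05,
                         task_type="CAUSAL_LM")

trainer = SFTTrainer(
    model="Qwen/Qwen2.5-0.5B",  # illustrative compact student from the Model Hub
    train_dataset=train_dataset,
    peft_config=peft_config,
    args=SFTConfig(output_dir="student-rag-distill", max_steps=100),
)
trainer.train()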

Speaker(s): Zichao Li

Agenda:
7:00PM – Introduction to the IEEE Hamilton Section

7:15PM – Presentation

8:00PM – Q&A

8:15PM – Refreshments

Room: Multipurpose Room 3, Bldg: Trafalgar Park Community Centre, 133 Rebecca St, Oakville, Ontario, Canada, L6K 1J5