Many of us use LLMs, but we don't really understand why they work. By fine-tuning an LLM, I want to first show how an LLM actually functions under the hood: why do we even fine-tune them, and how do they comprehend information? Then I'll show how we can do this for our own purposes using Unsloth, an Apache 2.0-licensed framework for fine-tuning open-source models. This is an outline of what I will cover:
the mathematical magic behind LLMs
why we do things the way we do today, and how it differs from past approaches
"Attention is All You Need" research papers impact on LLMs .
Attention / memory
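The attention mechanism at the heart of that paper can be sketched in a few lines of plain Python: scaled dot-product attention computes softmax(QKᵀ/√d_k)·V. The toy matrices below are invented purely for illustration — real models work with hundreds of dimensions and many heads.

```python
import math

def softmax(xs):
    # Subtract the max for numerical stability before exponentiating.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def attention(Q, K, V):
    """Scaled dot-product attention: softmax(Q K^T / sqrt(d_k)) V,
    written out for small Python lists-of-lists."""
    d_k = len(K[0])
    out = []
    for q in Q:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k)
                  for k in K]
        weights = softmax(scores)           # how much each position is "attended to"
        out.append([sum(w * v[j] for w, v in zip(weights, V))
                    for j in range(len(V[0]))])
    return out

# Toy example: one query attending over two key/value pairs.
Q = [[1.0, 0.0]]
K = [[1.0, 0.0], [0.0, 1.0]]
V = [[1.0, 0.0], [0.0, 1.0]]
result = attention(Q, K, V)
```

Because the query aligns with the first key, the first value gets the larger attention weight — that weighting is the "memory" the model uses to decide which earlier tokens matter.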
Tokenization: subword, word, character
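The tokenization granularities above can be compared in plain Python: word and character splits are trivial, and a single BPE-style merge step shows how subword vocabularies are built from characters. The tiny corpus and the merge logic here are a simplified sketch, not any real tokenizer's implementation.

```python
from collections import Counter

text = "low lower lowest"

# Word-level: split on whitespace; vocabulary grows with every new word form.
word_tokens = text.split()                 # ['low', 'lower', 'lowest']

# Character-level: tiny vocabulary, but very long sequences.
char_tokens = list(text.replace(" ", ""))

# Subword (BPE-style): start from characters and repeatedly merge the most
# frequent adjacent pair. One merge step is shown here.
def most_frequent_pair(words):
    pairs = Counter()
    for w in words:
        for a, b in zip(w, w[1:]):
            pairs[(a, b)] += 1
    return pairs.most_common(1)[0][0]

def merge_pair(words, pair):
    merged = []
    for w in words:
        out, i = [], 0
        while i < len(w):
            if i + 1 < len(w) and (w[i], w[i + 1]) == pair:
                out.append(w[i] + w[i + 1])   # fuse the pair into one subword
                i += 2
            else:
                out.append(w[i])
                i += 1
        merged.append(out)
    return merged

words = [list(w) for w in word_tokens]     # start at character level
pair = most_frequent_pair(words)           # ('l', 'o') occurs in all three words
words = merge_pair(words, pair)            # 'low' is now ['lo', 'w']
```

Repeating the merge step thousands of times yields a vocabulary where common words are single tokens and rare words split into reusable pieces — the middle ground between the word and character extremes.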
Instruction tuning
Bias tuning / reducing racist or otherwise harmful outputs
Question answering
Domain specialization
Reduced cost
Fine-tuning = update weights
RAG = external retrieval
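That contrast can be made concrete with a toy sketch (all names and values here are invented for illustration): fine-tuning changes the model's parameters via gradient steps, while RAG leaves the weights untouched and prepends retrieved text to the prompt.

```python
# Fine-tuning: parameters change. Toy example — one gradient step on a
# single weight w for the loss L(w) = (w - target)**2.
w, target, lr = 0.0, 2.0, 0.1
grad = 2 * (w - target)        # dL/dw
w = w - lr * grad              # w moves toward the target: 0.0 -> 0.4

# RAG: parameters do NOT change. Retrieval just augments the prompt.
docs = {
    "goku": "Goku is the main protagonist of Dragon Ball Z.",
    "lora": "LoRA adds small trainable matrices to a frozen model.",
}

def retrieve(query: str) -> str:
    # Naive keyword lookup stands in for a real vector search.
    for key, text in docs.items():
        if key in query.lower():
            return text
    return ""

def rag_prompt(query: str) -> str:
    return f"Context: {retrieve(query)}\nQuestion: {query}"
```

The practical upshot: fine-tuning bakes knowledge and style into the weights, while RAG swaps knowledge in and out at inference time without any training.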
A base model (an open-source, free model, in keeping with the FOSS vision)
Dataset
Compute
Framework (Unsloth / PEFT)
Specialize in a domain
Model distillation
Reduced cost
Decensoring (and why it can be harmful)
Dataset → collection of DBZ (Dragon Ball Z) scripts and fan prompts
Style/format prompts
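One common way to turn raw script lines into training examples is the instruction/input/output JSON record format; the field names and prompt template below are just one widely used convention, not a requirement of any specific framework.

```python
import json

def to_record(character: str, context: str, line: str) -> dict:
    """Wrap one script line as an instruction-tuning example."""
    return {
        "instruction": f"Write {character}'s next line in the style of the show.",
        "input": context,
        "output": line,
    }

# Hypothetical example built from an invented scene.
record = to_record(
    character="Goku",
    context="Vegeta challenges Goku to a rematch.",
    line="You've gotten stronger, Vegeta. Let's do this!",
)

jsonl_line = json.dumps(record)   # one record per line in a .jsonl dataset file
```

Keeping every example in the same template matters: the model learns the format as much as the content, so inconsistent prompts dilute the style you're trying to teach.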
Cross-entropy / Perplexity
LLM/Human as a Judge
Human eval (pairwise win-rate)
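Cross-entropy and perplexity are directly related: perplexity is the exponential of the average negative log-probability the model assigns to the true tokens. A minimal computation, with made-up probabilities:

```python
import math

# Probabilities the model assigned to each correct next token (invented values).
token_probs = [0.5, 0.25, 0.125]

# Cross-entropy in nats: average negative log-likelihood of the true tokens.
cross_entropy = -sum(math.log(p) for p in token_probs) / len(token_probs)

# Perplexity: exp(cross-entropy). Lower is better; 1.0 would mean the model
# was certain of every token. Here it works out to exactly 4.0, the inverse
# geometric mean of the probabilities.
perplexity = math.exp(cross_entropy)
```

Perplexity is cheap and automatic, which is why it pairs well with the judge-based and human evaluations above: it catches regressions in raw language modeling even when style judgments are subjective.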
Full fine-tuning
LoRA
QLoRA
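LoRA's core trick fits in a few lines: freeze the original weight matrix W and learn a low-rank update B·A, so only r·(d_in + d_out) parameters train instead of d_in·d_out. A pure-Python sketch with toy dimensions (real models use thousands of dimensions per layer):

```python
# Toy LoRA forward pass: effective weight = W + (alpha / r) * (B @ A).

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

d, r, alpha = 4, 1, 2                                     # hidden size, rank, scaling
W = [[1.0 if i == j else 0.0 for j in range(d)] for i in range(d)]  # frozen
B = [[0.1] for _ in range(d)]                             # d x r, trainable
A = [[0.2, 0.0, 0.0, 0.0]]                                # r x d, trainable

delta = matmul(B, A)                                      # d x d low-rank update
W_eff = [[W[i][j] + (alpha / r) * delta[i][j] for j in range(d)]
         for i in range(d)]

# Trainable parameters: 2*d*r = 8 instead of d*d = 16 here —
# and the saving grows quadratically with d at a fixed rank.
```

QLoRA applies the same idea on top of a 4-bit-quantized frozen base model, which is what lets consumer GPUs fine-tune models that would otherwise not fit in memory.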
Unsloth
Quantization (4-bit, 8-bit)
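Quantization maps float weights to small integers plus a scale factor. The sketch below shows absmax 8-bit quantization, the simplest variant; real 4-bit schemes like NF4 use non-uniform bins, and the weight values here are invented.

```python
# Absmax 8-bit quantization of a toy weight vector.
weights = [0.4, -1.2, 0.05, 0.9]

scale = max(abs(w) for w in weights) / 127     # map the largest magnitude to 127
q = [round(w / scale) for w in weights]        # int8 values in [-127, 127]
dequant = [qi * scale for qi in q]             # lossy reconstruction

# Rounding error is bounded by scale / 2 per weight.
max_error = max(abs(w - d) for w, d in zip(weights, dequant))
```

Storing one byte (or half a byte) per weight instead of four is what shrinks a model enough to serve, at the cost of the small reconstruction error measured above.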
Distillation
Optimized serving
Fine-tune + RAG hybrids
Open datasets / community fine-tunes
how LLMs work
what is fine-tuning
fine-tuning our own LLM to generate anime-like episodes