TIL: Masked Language Models Are Surprisingly Capable Zero-Shot Learners
Explores using a masked language model's head for zero-shot tasks, achieving strong results without task-specific heads.
Explores new research on instruction masking and LoRA fine-tuning techniques for improving large language models (LLMs).
Analysis of new LLM research on instruction masking and LoRA fine-tuning methods, with practical insights for developers.
A developer compares 8 LLMs on a custom retrieval task using medical transcripts, analyzing performance on simple to complex questions.
Explores methods for generating synthetic data (distillation and self-improvement) and using it for LLM pretraining, instruction-tuning, and preference-tuning.
Explores dataset-centric strategies for fine-tuning LLMs, focusing on instruction datasets to improve model performance without altering architecture.
Strategies for improving LLM performance through dataset-centric fine-tuning, focusing on instruction datasets rather than model architecture changes.
A technical guide on instruction-tuning Meta's Llama 2 model to generate instructions from inputs, enabling personalized LLM applications.
Introduces IGEL, an instruction-tuned German large language model based on BLOOM, for NLP tasks like translation and QA.