DeepSeek-R1 is Now Available!
On January 20, the DeepSeek-R1 model was released on Hugging Face. (See https://huggingface.co/deepseek-ai/DeepSeek-R1)
Jan 23

A DeepSeek Chain-of-Thought Demo
Anson shared an X post about DeepSeek, so I decided to try its “Deep Think” mode with a classic brain-teaser. What stood out was its…
Nov 21, 2024

How to Train Your Llama: The Trilogy
This is a “content page” for the writeup of my slides. For those who would rather read than watch a video, just jump to the parts that interest…
Oct 22, 2024

How to Train Your Llama: Hyper Parameters (Part 3)
This is Part 3. You can read Part 1, Part 2 or watch the video presentation. The final dataset used is available on GitHub.
Oct 22, 2024

How to Train Your Llama: Go Wide, Go Deep (Part 2)
This is Part 2. You can read Part 1 or watch the video presentation. The final dataset used is available on GitHub.
Oct 22, 2024

How to Train Your Llama: For Fun & Profit (Part 1)
I recently presented at the “Dive Deeper into LLMs with Nationwide LLM League Winners” event. For those who prefer reading, I’ve created…
Oct 22, 2024

I’m imPROMPTool: Let Me Help Your Prompt Engineering Skills Take Flight
I originally sat down to write a proper introduction for imPROMPTool, but as I started, it became clear that this AI assistant is so…
Oct 17, 2024