LLMs Study Plan and First Steps

Hey Folks,
If you haven’t introduced yourself please do so! It’s great to get to know each other.

Once you’ve done that, here’s the study plan and goals. Each session will be spaced roughly 1 to 3 weeks apart. It’s still fluid, so if you have thoughts, leave them below.

Without further ado

Goals

  • Gain a solid working knowledge of how these models work
  • Understand the various components of a model and workflows needed to build them
  • Implement one or two from scratch
  • Load and run a pretrained checkpoint from Hugging Face
  • Learn ways to shape LLMs through fine-tuning and reinforcement learning

Session 0: Why the switch, why I’m interested, and study club focus

  • Kicking things off

Session 1: Building the foundation with basic neural nets

  • NN Frameworks and what they do
  • Model creation vs estimation
    • Backprop vs. SGD vs. Bayesian estimation
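To give a flavor of what backprop and SGD mean in practice, here’s a minimal sketch (illustrative only, not session material): fitting y = 2x with a single weight, where the gradient is computed by the chain rule and gradient descent updates the weight.

```python
import numpy as np

# Toy data: learn the function y = 2x from samples.
rng = np.random.default_rng(0)
xs = rng.uniform(-1, 1, size=100)
ys = 2.0 * xs

w = 0.0   # initial weight
lr = 0.1  # learning rate
for _ in range(200):
    pred = w * xs                         # forward pass
    grad = np.mean(2 * (pred - ys) * xs)  # d(MSE)/dw via the chain rule ("backprop")
    w -= lr * grad                        # gradient descent update

print(round(w, 3))  # approaches 2.0
```

Frameworks like PyTorch automate exactly this gradient computation, which is part of what Session 1 will cover.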

Session 2: Upping the level

  • More complicated feedforward neural networks
  • Different types of layers, how they work, and why we use them

Session 3: Transformers and other NN Architectures

  • What are different neural network architectures
  • Why do they exist and how do they work
  • Why are transformers taking over everything?
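A big part of the transformer answer is self-attention. Here’s a minimal single-head sketch (illustrative only, with random weights standing in for learned ones): each token’s output is a softmax-weighted mix of every token’s value vector.

```python
import numpy as np

def self_attention(X, Wq, Wk, Wv):
    """Single-head scaled dot-product self-attention (illustrative sketch)."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = Q @ K.T / np.sqrt(K.shape[-1])           # token-to-token similarities
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)    # softmax over each row
    return weights @ V                                # weighted mix of values

rng = np.random.default_rng(0)
X = rng.normal(size=(4, 8))  # 4 tokens, 8-dim embeddings
Wq, Wk, Wv = (rng.normal(size=(8, 8)) for _ in range(3))
out = self_attention(X, Wq, Wk, Wv)
print(out.shape)  # (4, 8): one contextualized vector per token
```

Because every token attends to every other token in one matrix multiply, this layer parallelizes well on GPUs, which is part of why transformers have taken over.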

Session 4: Language models focus

  • Basic Bayesian Language model
  • Neural Network Transformer Model
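As a preview of the first bullet, here’s a sketch of about the simplest "Bayesian" language model there is (the corpus and names are made up): a bigram model with add-one (Laplace) smoothing, which corresponds to a uniform Dirichlet prior over next-word probabilities.

```python
from collections import defaultdict

# Tiny toy corpus; count how often each word follows each other word.
corpus = "the cat sat on the mat the cat ate".split()
vocab = sorted(set(corpus))
counts = defaultdict(lambda: defaultdict(int))
for prev, nxt in zip(corpus, corpus[1:]):
    counts[prev][nxt] += 1

def p_next(prev, nxt):
    """P(nxt | prev) with add-one smoothing (uniform Dirichlet prior)."""
    total = sum(counts[prev].values()) + len(vocab)
    return (counts[prev][nxt] + 1) / total

print(round(p_next("the", "cat"), 3))  # 3/9 = 0.333 under add-one smoothing
```

A transformer language model estimates the same conditional distribution P(next token | context), just with a neural network over a much longer context instead of counts.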

Session 5: Shaping LLMs

  • The strategies used to train your model to do what you want
    • SFT
    • RLHF
    • Prompting

I’m pretty excited about this, happy for the pivot

Thanks for the kind words!

I’m a bit sceptical about the hype but interested to get alternative views.

Really looking forward to this. I’m out of town this weekend but plan on attending future events.

Glad you mentioned this. We’ll definitely be separating the hype from the math from the reality in this study club. By the end, you’ll have everything you need to form your own opinion.

Hi, I am also looking forward to this.

Kick off session is tomorrow! See you folks there

Looking forward to tomorrow’s kick off!

Although I’m sceptical of the hype around all this, I’m expecting higher-ups to ask me more and more to offer solutions with these techniques. So I want to see under the hype to understand A) whether using these techniques to solve a problem offers value over simpler systems, and B) the limits of these techniques, so I can push back to higher-ups before ideas become reality.

Looking forward to continuing to learn with you all!

LLMs are quite magical, and it would be nice to know how that magic happens. That is my motivation for joining this book club. On the other hand, how often do people customize an LLM for their own application? My guess is not that often. For that reason, do I really need to study these?

This is a great question, and one that is being actively answered by many organizations. Different companies and individuals are taking different approaches. We’ll talk about it in this book club in a couple of sessions.

Hi, and thank you to Ravin for organizing this book club. I don’t work for a big firm but for a small startup, so my experience is not generalizable. But here is my $0.02.

Currently, training and fine-tuning general LLMs like GPT-3.5/4 is not on the table for us. From our perspective, we are in an exploration phase of how best to use this new technology. While it might be useful for some tasks, for others it might not be. It’s also a resource question: should we as a company spend time and money on building something LLM-based, or is that effort better spent elsewhere?

That resource question ultimately leads to the decision on whether or not to fine-tune/train a model for a custom use case. Are there other avenues we can explore? Embeddings and clever prompting are two good ways of testing and gauging the possibilities with LLMs. Fine-tuning, for us, is still a bit further away.
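For anyone curious what the embeddings route looks like, here’s a hypothetical sketch: rank documents by cosine similarity to a query embedding. The vectors below are random stand-ins; a real system would get them from an embedding model.

```python
import numpy as np

rng = np.random.default_rng(0)
docs = rng.normal(size=(5, 16))  # 5 document embeddings (stand-ins)
query = rng.normal(size=16)      # query embedding (stand-in)

def cosine(a, b):
    """Cosine similarity between two vectors."""
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

sims = [cosine(query, d) for d in docs]
best = int(np.argmax(sims))  # index of the most similar document
print(best)
```

The retrieved document then gets stuffed into the prompt, which is the "clever prompting" half of the approach.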

Happy to discuss more.
