Hey Folks,
If you haven’t introduced yourself, please do so! It’s great to get to know each other.
Once you’ve done that, here’s the study plan and goals. Each session will be spaced roughly 1 to 3 weeks apart. It’s still fluid, so if you have thoughts, leave them below.
Without further ado…
Goals
Gain a solid working knowledge of how these models work
Understand the various components of a model and the workflows needed to build them
Implement one or two from scratch
Load and run a pretrained checkpoint from Hugging Face (quick sketch after this list)
Learn ways to shape LLMs through fine-tuning and reinforcement learning
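To make the checkpoint goal above concrete, here’s a minimal sketch of loading and running a pretrained model (assuming the transformers library, with GPT-2 purely as an example checkpoint):

```python
# Minimal sketch: load a pretrained checkpoint from Hugging Face and
# generate a few tokens. GPT-2 is just an example model choice.
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

inputs = tokenizer("Neural networks are", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```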
Session 0: Why the switch, why I’m interested, and study club focus
Kicking things off
Session 1: Building the foundation with basic neural nets
NN Frameworks and what they do
Model creation vs estimation
Backprop vs. SGD vs. Bayesian approaches (see the teaser sketch below)
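As a teaser for this session, here’s a minimal sketch of backprop plus an SGD step in PyTorch (the toy network and random data are just placeholders):

```python
# Minimal sketch: a tiny feedforward net trained with backprop + SGD in PyTorch.
# The data is random noise, purely to show the mechanics of a training step.
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 1))
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
loss_fn = nn.MSELoss()

X = torch.randn(32, 4)  # 32 fake examples, 4 features each
y = torch.randn(32, 1)  # fake regression targets

for step in range(100):
    optimizer.zero_grad()        # clear gradients from the last step
    loss = loss_fn(model(X), y)  # forward pass
    loss.backward()              # backprop: compute gradients
    optimizer.step()             # SGD: update the weights
```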
Session 2: Upping the level
More complicated feedforward neural networks
Different types of layers, how they work, and why we use them
Session 3: Transformers and other NN Architectures
What are different neural network architectures
Why do they exist and how do they work
Why are transformers taking over everything?
Session 4: Language model focus
Basic Bayesian language model (rough sketch after this list)
Neural network transformer model
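To give a flavour of the first bullet, here’s a bare-bones count-based bigram model; a properly Bayesian version would put a prior on the next-word probabilities (add-one/Laplace smoothing being the simplest case):

```python
# Sketch of a count-based bigram language model: predict the next word
# from counts of word pairs in a tiny toy corpus.
from collections import Counter, defaultdict

corpus = "the cat sat on the mat the cat ran".split()

counts = defaultdict(Counter)
for prev, word in zip(corpus, corpus[1:]):
    counts[prev][word] += 1

def next_word_probs(prev):
    """Maximum-likelihood estimate of P(next word | prev word)."""
    total = sum(counts[prev].values())
    return {w: c / total for w, c in counts[prev].items()}

print(next_word_probs("the"))  # {'cat': 0.67, 'mat': 0.33} (roughly)
```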
Session 5: Shaping LLMs
The strategies used to train your model to do what you want
Glad you mentioned this. We’ll definitely be separating the hype from the math from the reality in this study club. By the end you’ll have everything you need to form your own opinion.
Although I’m sceptical of the hype around all this, I’m expecting to be asked a lot more by higher-ups to offer solutions with these techniques. So I want to be able to see under the hype and understand A) whether using these techniques to solve a problem offers value over simpler systems, and B) the limits of these techniques, so I can push back on higher-ups before ideas become reality.
Looking forward to continuing to learn with you all!
LLMs are quite magical, and it would be nice to know how that magic happens. That is my motivation for joining this book club. On the other hand, how often do people customize an LLM for their own application? My guess is not that often. For that reason, do I really need to study these?
This is a great question, and one that many organizations are actively working to answer. Different companies and individuals are taking different approaches. We’ll talk about it in this book club in a couple of sessions.
Hi, and thank you to Ravin for organizing this book club. I don’t work for a big firm but for a small startup, so my experience is not generalizable. But here is my $0.02.
Currently, training and fine-tuning general LLMs like GPT-3.5/4 is not on the table for us. From our perspective, we are in an exploration phase, figuring out how to best use this new technology. While it might be useful for some tasks, for others it might not be. It’s also a resource question: should we as a company spend time and money on building something LLM-based, or is that effort better spent elsewhere?
That resource question ultimately leads to the decision on whether or not to fine-tune/train a model for a custom use case. There are other avenues we can explore first: embeddings and clever prompting are two good ways of testing and gauging the possibilities with LLMs. Fine-tuning, for us, is still a bit further away.
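To make the embeddings avenue concrete, here’s roughly the shape of it (a minimal sketch; the sentence-transformers package and the all-MiniLM-L6-v2 model are just example choices, and the documents are made up):

```python
# Rough sketch of the "embeddings" route: rank documents by similarity to a
# query, with no fine-tuning involved. Package and model names are examples.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")

docs = ["How to reset my password", "Pricing for the enterprise plan"]
query = "I forgot my login credentials"

doc_emb = model.encode(docs, convert_to_tensor=True)
query_emb = model.encode(query, convert_to_tensor=True)

scores = util.cos_sim(query_emb, doc_emb)  # cosine similarity to each doc
print(docs[int(scores.argmax())])          # most relevant doc for the query
```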