To make this book club most useful, let me know what you know, and what you don’t know! By doing both you’ll also implicitly tell me what you don’t know that you don’t know.
And with all three sets we can maximize the usefulness-to-time ratio of this study club!
So let me kick it off.

I know:
- The basics of LLM implementation
- The idea behind transformers and the general intuition
- A lot about open source vs closed source
- The fundamental mathematics, such as matrix multiplication, softmax, etc.
- How to code basic models
I don’t know:
- The specifics of why you’d pick, say, 4 self-attention heads versus another number
- The specific details of the newer Llama models versus HF models
- How to comfortably train these models on my own commodity hardware
- The details of methods used to shape models, such as LoRA, SFT, and PEFT
- How hyperparameters like dropout are chosen in neural networks