What do you know about LLMs, and what do you not know?

To make this book club most useful for everyone, let me know what you know and what you don’t know! By doing both, you’ll also implicitly tell me what you don’t know that you don’t know.

And with all three sets we can maximize the usefulness to time ratio of this study club!

So let me kick it off

I know

  • The basics of LLM implementation
  • The idea behind transformers and the general intuition
  • A lot about open source vs closed source
  • The fundamental mathematics, such as matrix multiplication, softmax, etc.
  • How to code basic models
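Since softmax shows up in the list above, here’s a minimal NumPy sketch of it, just as an illustration and not tied to any particular model:

```python
import numpy as np

def softmax(x):
    # Subtract the max for numerical stability before exponentiating.
    shifted = x - np.max(x)
    exps = np.exp(shifted)
    # Normalise so the outputs sum to 1 and can be read as probabilities.
    return exps / np.sum(exps)

scores = np.array([2.0, 1.0, 0.1])
probs = softmax(scores)
print(probs)        # probabilities, largest for the largest score
print(probs.sum())  # sums to 1.0
```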

I don’t know

  • The specifics of why to pick, say, 4 self-attention heads versus another number
  • The specific details of the newer Llama models versus HF models
  • How to comfortably train these models on my own commodity hardware
  • The details of methods used to shape models, such as LoRA, SFT, and PEFT
  • How hyperparameters like dropout are chosen in neural networks

I know:

  • Convolutional Neural Networks as they pertain to computer vision (and classifying satellite imagery).
  • The basics of Recurrent Neural Networks.
  • I’m learning/relearning Keras/TensorFlow and I know some PyTorch.
  • Fundamental mathematics, such as linear algebra, matrix multiplication, softmax, etc.

I don’t know:

  • I’m not even sure what I don’t know about LLMs. I basically don’t know anything specific to LLMs.
  • I’m very interested in learning LLMs: underlying architecture, how to tune the hyperparameters, how to incorporate LLMs into my work productivity, and how to create specialized versions adapted to my unique workflow.


Know:

  • Basics of how NNs work, as well as a few of the more application specific architectures (e.g. CNNs for images/locality, RNNs for sequences)
  • The more popular NN frameworks (PyTorch, Keras/TF)
  • The idea behind transfer learning and fine-tuning
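Since transfer learning and fine-tuning are on the list, here’s a minimal sketch of the usual freeze-the-backbone pattern in PyTorch. The tiny `nn.Sequential` is a hypothetical stand-in for a real pretrained model, which you’d normally load from torchvision or Hugging Face:

```python
import torch.nn as nn

# Stand-in for a pretrained backbone (hypothetical; in practice you'd
# load an actual pretrained model and reuse its weights).
backbone = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 8))

# Freeze the pretrained weights so gradients don't update them.
for param in backbone.parameters():
    param.requires_grad = False

# Attach a new task-specific head, here for 3 output classes.
head = nn.Linear(8, 3)
model = nn.Sequential(backbone, head)

# Only the new head's parameters remain trainable.
trainable = [n for n, p in model.named_parameters() if p.requires_grad]
print(trainable)  # ['1.weight', '1.bias']
```

During fine-tuning you’d then pass only the trainable parameters to the optimizer, so the pretrained features stay fixed while the head adapts to the new task.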

Don’t know:

  • Pretty much anything behind transformers.
  • I’ve used ChatGPT, but none of the other models out there, open or closed source. No idea what the state of the other models is presently.
  • Useful/interesting domain cases for LLMs outside of ChatGPT, Copilot


Know:

  • Fundamental NLP concepts, e.g., tokenisation, stop words, normalisation
  • Deep learning fundamentals
  • Bit of PyTorch
  • Use of pretrained LLMs for some NLP tasks (used BERT embeddings to deal with a news-article duplication problem at work)
  • Rough understanding of transformers
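For the tokenisation, stop-word, and normalisation bullet above, here’s a toy pure-Python sketch. Real pipelines use libraries like spaCy or Hugging Face tokenizers, and the stop-word set here is just an example:

```python
# A small example stop-word list (real ones are much longer).
STOP_WORDS = {"the", "a", "is", "of"}

def preprocess(text):
    # Normalise: lowercase and strip punctuation from token edges.
    tokens = [t.strip(".,!?").lower() for t in text.split()]
    # Remove stop words and any tokens left empty after stripping.
    return [t for t in tokens if t and t not in STOP_WORDS]

print(preprocess("The cat is on the mat."))
# ['cat', 'on', 'mat']
```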

Don’t know:

  • Basically anything LLM-related from the ground up; most NLP things I have used are libraries like Hugging Face, spaCy, and SentenceTransformers