Session 3: More Complicated NN Architectures Reading List

Hey Folks,
The tentative date for our next session is two weeks from now, on June 4th.

As of now we’ll be studying

  • More complicated NN structures such as
    • Deeper Dense Networks
    • CNNs and why differing network architectures work
    • Transformers, the one that’s taking over them all
  • Good practice for NN code in production

We’re going to focus on this example from Flax, and I’ll be updating and tweaking it to fit our book club.

As I go through building out this session I may post additional examples as well.


Hey folks, I’m putting together the material. Here’s what we’ll cover in more depth.

The TL;DR is that training NNs these days can basically be summarized as “How do I create the most efficient learning architecture that suits my data and that I can train in a finite amount of time, getting as much as possible out of limited hardware?”

To do this, we’ll go through

  1. An applied example in Flax showing the differences in performance between a Vanilla NN and a CNN
  2. An overview of infrastructure and code setup that makes this efficient
  3. A short intro on why transformers are taking over
  4. A list of my best references for understanding transformers
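
As a rough preview of item 1, here’s a back-of-the-envelope sketch in plain Python (not the Flax example itself) of one reason CNNs behave differently from vanilla dense networks: a conv layer reuses the same small kernel at every position, so it needs far fewer weights than a dense layer producing the same number of outputs. The 28x28x1 input, 3x3 kernel, 32 filters, and the helper names are all assumptions I picked for illustration, and the conv layer is assumed to keep the spatial size (“same” padding).

```python
def dense_params(n_in: int, n_out: int) -> int:
    """Weights plus biases for a fully connected layer."""
    return n_in * n_out + n_out


def conv_params(kernel_h: int, kernel_w: int, c_in: int, c_out: int) -> int:
    """Weights plus biases for a 2D conv layer (kernel shared across positions)."""
    return kernel_h * kernel_w * c_in * c_out + c_out


# Assumed MNIST-like input: 28x28 pixels, 1 channel.
height, width, channels = 28, 28, 1
filters = 32  # assumed number of conv filters

# Conv layer: 3x3 kernel, 1 -> 32 channels, output assumed to stay 28x28.
conv_count = conv_params(3, 3, channels, filters)

# Dense layer mapping the flattened image to the same number of output values.
n_outputs = height * width * filters
dense_count = dense_params(height * width * channels, n_outputs)

print(f"conv layer parameters:  {conv_count:,}")
print(f"dense layer parameters: {dense_count:,}")
```

This only counts parameters for a single layer; it says nothing about accuracy or training time, which is what the applied example is meant to show.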

Tools we’re going to use

  • Flax
  • Colab
  • Tensorboard

Let me know if there are any other topics you’d like me to cover!

Third session is scheduled for one week from now! Livestream link below