172: Transformers and Large Language Models



Intro topic: Is WFH actually WFC?

News/Links:

Book of the Show

Patreon Plug https://www.patreon.com/programmingthrowdown?ty=h

Tool of the Show

Topic: Transformers and Large Language Models

  • How neural networks store information
    • Latent variables
  • Transformers
    • Encoders & Decoders
  • Attention Layers
    • History
      • RNN
        • Vanishing Gradient Problem
      • LSTM
        • Short term (gradient explodes), Long term (gradient vanishes)
    • Differentiable algebra
    • Key-Query-Value
    • Self Attention
  • Self-Supervised Learning & Forward Models
  • Human Feedback
    • Reinforcement Learning from Human Feedback
    • Direct Preference Optimization (Pairwise Ranking)
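
For listeners who want to see the Key-Query-Value and Self Attention items from the outline in code, here is a minimal sketch of single-head scaled dot-product self-attention in plain Python. It is not taken from the episode: the function names, the toy input, and the identity projection matrices are all assumptions chosen for illustration, and real transformers use learned weights, multiple heads, and a tensor library.

```python
import math

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(m):
    """Swap rows and columns."""
    return [list(col) for col in zip(*m)]

def softmax(row):
    """Numerically stable softmax over one row of scores."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(x, w_q, w_k, w_v):
    """Single-head scaled dot-product self-attention.

    x:             token vectors, shape (seq_len, d_model)
    w_q, w_k, w_v: projection matrices, shape (d_model, d_k)
    Returns (output, weights).
    """
    q = matmul(x, w_q)  # queries: what each token is looking for
    k = matmul(x, w_k)  # keys: what each token offers to be matched on
    v = matmul(x, w_v)  # values: the content that gets mixed together
    d_k = len(q[0])
    scores = matmul(q, transpose(k))  # query-key similarity for every token pair
    # Scale by sqrt(d_k), then normalize each row into attention weights.
    weights = [softmax([s / math.sqrt(d_k) for s in row]) for row in scores]
    return matmul(weights, v), weights

# Hypothetical toy input: 3 tokens with 2 features each.
x = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
# Identity projections (an assumption) keep the arithmetic easy to follow.
identity = [[1.0, 0.0], [0.0, 1.0]]
output, weights = self_attention(x, identity, identity, identity)
```

Each row of `weights` sums to 1 and says how much one token attends to every token in the sequence, itself included. Every step is differentiable, which is what lets the projection matrices be learned by gradient descent.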


★ Support this podcast on Patreon ★



