• Feyd@programming.dev · 2 days ago

    Not really. None of what has been going on with transformer models has been anything other than hyperscaling. It’s not that fundamental advances in the technology are being made; it’s that they decided what they had, at the scale they had it, made convincing enough demos that the scam could start.

    • JealousJail@feddit.org · 2 days ago

      It has been more than just hyperscaling. First of all, the invention of transformers would likely have been significantly delayed without the hype around CNNs in the first AI wave around 2014. OpenAI wouldn’t have been founded, and its early contributions (like Proximal Policy Optimization) could have taken longer to be explored.

      While I agree that the transformer architecture itself hasn’t advanced far since 2018 apart from scaling, its success has significantly contributed to self-learning policies.

      RLHF, Direct Preference Optimization, and in particular DeepSeek’s GRPO are huge milestones for reinforcement learning, which is arguably the most promising trajectory toward actual intelligence. Those are a direct consequence of the money pumped into AI and the appeal it has to many smart and talented people around the world.
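
      For the curious, here’s a minimal sketch of GRPO’s central idea as I understand it from the DeepSeekMath paper: instead of training a separate value critic like PPO does, it scores each sampled response relative to the group of responses drawn for the same prompt. The function name and toy rewards below are just illustrative:

      ```python
      import numpy as np

      def group_relative_advantages(rewards):
          """GRPO-style advantage: normalize each response's reward against
          the mean/std of its sampling group, so no learned value function
          is needed."""
          r = np.asarray(rewards, dtype=np.float64)
          std = r.std()
          if std < 1e-8:
              # All responses scored the same: no learning signal for this group.
              return np.zeros_like(r)
          return (r - r.mean()) / std

      # Toy example: 4 sampled answers to one prompt, scored by a reward model.
      print(group_relative_advantages([1.0, 0.0, 0.5, 0.0]))
      ```

      Those advantages then plug into a PPO-style clipped policy-gradient loss; dropping the critic network is a big part of why it’s so much cheaper to run at scale.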