• white_nrdy@programming.dev · 5 hours ago

    I started using LLM tools recently after taking a new job where a lot of people use them. I’ve discovered they’re actually fairly helpful not only for explanations, but in two other respects:

    • Sifting through immense amounts of documentation. I have to deal with datasheets that run to hundreds of pages, with relevant info scattered throughout, and it’s very helpful for digging through those.
    • Doing boilerplate “plumbing work” in my code. I’m mostly drawing a line where I don’t want it doing the “core” work in the areas where I’m an expert, since I agree that if I stop doing that, I’ll atrophy. However, it can accelerate my process if I pass off some of the minutiae that I don’t feel the need to do myself.

    All that said, I’m honestly pretty impressed with how well it works. I’ve mostly been using Claude, and damn, it’s honestly pretty competent. I had it make me a helper Python GUI program to test some stuff (I’m not a UI/high-level engineer like that, I’m an FPGA engineer), and it did a decent job. It definitely needed a good amount of massaging and guidance. Still, I can see the appeal. It’s a slippery slope, and I need to make sure I remain disciplined about not letting it do everything.

    • MangoCats@feddit.it · 5 hours ago

      One trap is trusting it as a means to accommodate unreasonable schedule pressure.

      Sure - this thing looks like it works, hell, it probably does work - but do you really want to launch a “probably works” product? If your management does, consider shopping around for a raise/promotion under different management. It’s never easy to move, but if you’re moving on your own terms, you can often make the effort worth your while.

      Another note: I find the LLMs to be wickedly detail-oriented code reviewers - they’ll point out the tiniest discrepancies and edge cases, and what they (Claude, at least) report is usually “real.” Now, that doesn’t mean they find everything that’s wrong on the first pass, but once you’ve addressed everything from the first pass, you can make a second pass, a third, and so on, each time with a different focus: documentation complete? Implementation functions as intended? Technical debt? Test coverage? Security issues? Maintainability? Documentation in sync with implementation? A specific aspect of the implementation functions as intended? If you address all the findings after each review cycle (and addressing a finding can mean clarifying a requirement to relax about certain unimportant aspects…), eventually the findings slow down and only ridiculously unimportant things turn up.
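      The review loop described above can be sketched roughly as follows. This is just an illustration of the control flow, not any real tool’s API: `ask_model` stands in for whatever LLM call your harness makes, and the focus list and `address` helper are hypothetical.

```python
# Sketch of the multi-pass review workflow: run one pass per focus
# area, address every finding, and repeat until a full cycle over all
# focus areas comes back clean.

FOCUS_AREAS = [
    "documentation complete?",
    "implementation functions as intended?",
    "technical debt?",
    "test coverage?",
    "security issues?",
    "maintainability?",
    "documentation in sync with implementation?",
]

def address(changelist, finding):
    # Addressing a finding might mean fixing code *or* relaxing a
    # requirement; here we just record that it was handled by adding
    # the finding to the changelist's "resolved" state.
    return changelist | {finding}

def review_until_quiet(changelist, ask_model, max_cycles=10):
    """Run repeated review passes, one focus per pass, until a full
    cycle over every focus area produces no findings. Returns the
    number of cycles it took to come back clean."""
    for cycle in range(max_cycles):
        findings = []
        for focus in FOCUS_AREAS:
            findings.extend(ask_model(changelist, focus))
        if not findings:
            return cycle  # clean cycle: the findings have dried up
        for finding in findings:
            changelist = address(changelist, finding)
    return max_cycles
```

      The point of the structure is the termination condition: you don’t stop after one pass, you stop when an entire cycle of differently-focused passes finds nothing worth reporting.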

        • FaceDeer@fedia.io · 3 hours ago

        A thing I found quite amusing about the AI agents I’ve toyed with is that they have a step where they do a code review of their own changelist, usually switching to a different “persona” when they write it so that they’re not seeing it as “their own” code. It’s funny reading the critiques and compliments it gives the “other agent” whose changes it’s checking.

        I haven’t seen this feature yet, but a good future enhancement might be to have the harness literally use a different model for the code review than the one that wrote the code in the first place - if Claude wrote the code, have GPT do the review, and vice versa. I wouldn’t be surprised if the feature exists and I just haven’t spotted it yet, though; things change fast.

          • MangoCats@feddit.it · 3 hours ago

          I use Cursor for work (Claude Code at home), and Cursor gives you the option to select your model. I’ve dabbled a bit with GPT reviewing Claude’s code - I haven’t found it dramatically better than just prompting Claude to “wear the reviewer hat now.”

            • FaceDeer@fedia.io · 3 hours ago

            Yeah, I wouldn’t use a framework that didn’t let you select the base model. I’m just thinking about having it automatically switch to a different one during the review “phase”. It’s not as popular a coding agent these days, but I like using Google’s Antigravity, and it’s capable of being told to go through the sequence “plan -> write documentation -> implement the plan -> run unit tests -> do a code review” automatically, without needing to be prompted at each step. That’s where it would be nice to have it switch models automatically for the review.

            “Wear the reviewer hat now” does seem to work quite well with the same model, but if models from different lineages are available, switching to another one for the review just seems like the right thing to do.
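            The pipeline-with-model-switch idea could look something like this. To be clear, this is a hypothetical sketch, not Antigravity’s (or any harness’s) actual API: the phase names follow the sequence quoted above, and `run_step` and the model names are placeholders.

```python
# Sketch of an agent pipeline that uses one "author" model for most
# phases and swaps to a different model lineage for the review phase,
# so the reviewer isn't grading its own homework.

PIPELINE = [
    "plan",
    "write documentation",
    "implement the plan",
    "run unit tests",
    "do a code review",
]

def run_pipeline(task, run_step, author_model="claude", reviewer_model="gpt"):
    """Execute each phase with the author model, except the code
    review, which is routed to a model from a different lineage.
    `run_step(model, phase, task, artifacts)` is a placeholder for
    the harness's actual model call; artifacts from earlier phases
    are passed along so later phases can build on them."""
    artifacts = {}
    for phase in PIPELINE:
        model = reviewer_model if phase == "do a code review" else author_model
        artifacts[phase] = run_step(model, phase, task, artifacts)
    return artifacts
```

            The only real design decision here is the routing rule: everything stays with the author model except the review step, which is the one place where a fresh lineage plausibly catches blind spots the author model shares with itself.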