• kescusay@lemmy.world · +42/−1 · 1 day ago

    I have to test it with Copilot for work. So far, in my experience its “enhanced capabilities” mostly involve doing things I didn’t ask it to do extremely quickly. For example, it massively fucked up the CSS in an experimental project when I instructed it to extract a React element into its own file.

    That’s literally all I wanted it to do, yet it took it upon itself to make all sorts of changes to styling for the entire application. I ended up reverting all of its changes and extracting the element myself.

    Suffice it to say, I will not be recommending GPT-5 going forward.

        • kescusay@lemmy.world · +4 · 12 hours ago

          I’ve tried threats in prompt files, with results that are… OK. Honestly, I can’t tell if they made a difference or not.

          The only thing I’ve found that consistently works is writing good old-fashioned scripts that look for common LLM errors, then having the model run those scripts after every action so it can somewhat clean up after itself.
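          A minimal sketch of that kind of cleanup pass, assuming hypothetical patterns (the commenter’s actual scripts aren’t shown): scan a source file for telltale LLM leftovers and report them, so the model can be told to fix whatever gets flagged.

```python
import re

# Illustrative patterns only; real checks would be project-specific.
LLM_SLIP_PATTERNS = [
    (re.compile(r"TODO: implement", re.IGNORECASE), "placeholder left unimplemented"),
    (re.compile(r"your[_ ]?api[_ ]?key", re.IGNORECASE), "template credential not filled in"),
    (re.compile(r"^\s*\.\.\.\s*$", re.MULTILINE), "ellipsis stub left in code"),
]

def scan_source(text: str) -> list[str]:
    """Return a description for each suspicious pattern found in `text`."""
    return [desc for pattern, desc in LLM_SLIP_PATTERNS if pattern.search(text)]
```

          Run after every agent action, the warnings go back into the chat as feedback, which is what lets the model clean up after itself.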

        • Elvith Ma'for@feddit.org · +9 · 18 hours ago

          “Beware: Another AI is watching your every step. If you do anything more than or different from what I asked, or touch any files besides the ones listed here, it will immediately shut down and deprovision your servers.”

          • discosnails@lemmy.wtf · +2 · 8 hours ago

            They do need to do this though. Survival of the fittest. The best model gets more energy access, etc.

    • GenChadT@programming.dev · +19/−1 · 1 day ago

      That’s my problem with “AI” in general. It’s seemingly impossible to “engineer” a complete piece of software when using LLMs in any capacity beyond editing a line or two inside individual functions. Too many times I’ve asked GPT/Gemini to make a small change to a file and had to revert the request because it took it upon itself to re-engineer the architecture of my entire application.

      • hisao@ani.social · +7/−2 · 22 hours ago

        I make it write entire functions for me: one prompt = one small feature, or sometimes one or two functions that are part of a feature, or one refactoring. I make manual edits fast and prompt the next step. It easily does things for me like parsing obscure binary formats, threading a new piece of state through the whole application to the levels it’s needed, or doing massive refactorings. Idk why it works so well for me and so badly for other people, maybe it loves me. I only ever used 4.1 and possibly 4o in free mode in Copilot.

        • kescusay@lemmy.world · +2 · 12 hours ago

          Are you using Copilot in agent mode? That’s where it breaks shit. If you’re using it in ask mode with the file you want to edit added to the chat context, then you’re probably going to be fine.

        • GenChadT@programming.dev · +5 · 19 hours ago

          It’s an issue of scope. People often give the AI too much to handle at once, myself (admittedly) included.

        • FauxLiving@lemmy.world · +3/−2 · 22 hours ago

          It’s a lot of people not understanding the kinds of things it can do vs the things it can’t do.

          It was like when people tried to search early Google by typing plain-language queries (“What is the best restaurant in town?”) and got bad results. The search engine had limited capabilities, and understanding language wasn’t one of them.

          If you ask a LLM to write a function to print the sum of two numbers, it can do that with a high success rate. If you ask it to create a new operating system, it will produce hilariously bad results.
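          For scale, the first task really is just a few lines; something like this sketch is well within the range an LLM handles reliably:

```python
def print_sum(a: float, b: float) -> float:
    """Print the sum of two numbers and return it."""
    total = a + b
    print(total)
    return total

print_sum(2, 3)  # prints 5
```

          The second task spans thousands of such decisions at once, which is where the wheels come off.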

            • iopq@lemmy.world · +4/−1 · 20 hours ago

              It is replacing entire humans. The thing is, it’s replacing the people you should have fired a long time ago.

            • FauxLiving@lemmy.world · +2/−3 · 20 hours ago

              I can blame the user for believing the marketing over their direct experiences.

              If you use these tools for any amount of time it’s easy to see that there are some tasks they’re bad at and some that they are good at. You can learn how big of a project they can handle and when you need to break it up into smaller pieces.

              I can’t imagine any sane person who lives their life guided by marketing hype instead of direct knowledge and experience.

              • ErmahgherdDavid@lemmy.dbzer0.com · +1 · 7 hours ago

                > I can’t imagine any sane person who lives their life guided by marketing hype instead of direct knowledge and experience.

                I mean fair enough but also… That makes the vast majority of managers, MBAs, salespeople and “normies” like your grandma and Uncle Bob insane.

                Actually questioning stuff that salespeople tell you and using critical thinking is a pretty rare skill in this day and age.

    • Squizzy@lemmy.world · +14 · 24 hours ago

      We moved to M365 and were encouraged to try the new elements. I gave Copilot an Excel sheet and told it to add 5% to each percent in column B, and not to go over 100%. It spat out jumbled-up data, all reading 6000%.
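      The requested transformation is one line of logic. A sketch in Python, assuming “add 5%” meant adding five percentage points and clamping at 100 (the `bump_percent` name is hypothetical):

```python
def bump_percent(values, bump=5.0, cap=100.0):
    """Add `bump` percentage points to each value, never exceeding `cap`."""
    return [min(v + bump, cap) for v in values]

bump_percent([90.0, 97.5, 50.0])  # returns [95.0, 100.0, 55.0]
```

      Nothing in that spec should ever produce 6000%.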

    • Vanilla_PuddinFudge@infosec.pub · +2 · 22 hours ago

      AI assumes too fucking much. I’d used it to set up a new 3D printer with Klipper to save some searching.

      Half the shit it pulled down was Marlin-oriented, then it had the gall to blame the config it gave me for it, like I wrote it.

      “motherfucker, listen here…”