Earlier this week, I bought myself a new 2TB NVMe drive on sale, and had planned to install it today. Knowing that the computer, built in Dec 2021, had an issue with PCIe at the time that prevented x4 drives from running at x4, but now had a BIOS update that fixed that (and introduced support for two additional CPU generations), I thought I should go ahead and flash the BIOS too.
Sparing unnecessary details, everything went fine. I unseated the 3060Ti to get at the M.2 ports and swapped out the old drive for the new one. Plugged in a spinner I had lying around while I was in there. Machine wouldn’t boot - so I reinstalled GRUB (yes, I’m a Linux user) and that was fixed. The disk had everything copied over from the old one, so that was fine too, other than some power management spam in the journal that was fixed with a kernel param. Seems fine, right?
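For anyone wanting to check whether a drive is actually negotiating its full lane count, here is a rough sketch. It assumes a Linux box where sysfs exposes `current_link_width` and `max_link_width` for each device under `/sys/bus/pci/devices/`; the function name and the `root` parameter are made up for illustration.

```python
from pathlib import Path


def degraded_links(root="/sys/bus/pci/devices"):
    """Return (address, current, maximum) for devices linked below their max width.

    Assumes the Linux sysfs layout. Devices without link attributes are skipped.
    """
    base = Path(root)
    if not base.is_dir():
        return []  # not Linux, or sysfs is not mounted here
    results = []
    for dev in sorted(base.iterdir()):
        try:
            current = int((dev / "current_link_width").read_text().strip())
            maximum = int((dev / "max_link_width").read_text().strip())
        except (OSError, ValueError):
            continue  # no PCIe link info for this device, or it is unreadable
        # A width of 0 means no usable reading, so only flag real downgrades.
        if 0 < current < maximum:
            results.append((dev.name, current, maximum))
    return results


if __name__ == "__main__":
    for address, current, maximum in degraded_links():
        print(f"{address}: running x{current}, capable of x{maximum}")
```

Note that `max_link_width` is what the device itself supports, so a card sitting in a physically narrower slot will also show up here without anything being wrong.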
So I spin up Minecraft to have a quick session. Frame rates are like, 3-20 fps for about 2 full minutes, then they get up to 60fps (Vsync’d). But every few minutes, they keep dropping and creep back up. Sometimes they bounce up.
There are a dozen potential problems at least that could cause something like this. So I went back into the newly flashed BIOS knowing that all my settings were reset, and started poking around and fixing settings. This was good either way, but it didn’t fix my sporadic fps dips. The game ALWAYS started at sub-10 fps, but it would get to 60 eventually. Sometimes in a handful of seconds, sometimes in minutes. One time, all the textures refused to load. Another time, the GPU fell off the bus.
This was not a problem at all prior to the work, so I started reviewing what I’d done so far. I’d added a disk with all copied data… but there’s an NVIDIA cache in there… cleared that. No dice. I looked at heat. No problem. Same drivers as before, but maybe I have to roll back drivers, or kernel, or BIOS. I dunno. Frustrated as fuck.
Could also be a power issue. Or a reseating the card issue. I hadn’t added any devices with more power consumption and I have ~100W of buffer on a good PSU, so it really shouldn’t be a problem, but I reseat the card and replug the power cables anyway.
The problem is worse.
So I spend another hour and a half poking around forums looking at every possible thing I could do on the software side to fix this, before I decide to reseat the card one more time. I mean, the first time changed SOMETHING, even if it was for the worse.
Popping out the card, I blew into the slot as I thought I saw a tiny speck of dust. I used canned air the first time but this was really an afterthought by this point…
The body of a tiny moth popped out of the slot.
I reseated the card, spun up Minecraft, and everything is fine. Perfect frame rates, just like before.
I’ll never get the hours back that I spent troubleshooting this today. At least several other items were improved along the way, such as RAM timings. But in the event this story helps anyone in the future, it’s worth the time it took to type it.
Your PC had a bug
literally the origin of the term - that bugs (literally) used to get into computers and mess them up
No, it isn’t the origin of the term.
The term “bug” to describe defects has been a part of engineering jargon since the 1870s[7] and predates electronics and computers; it may have originally been used in hardware engineering to describe mechanical malfunctions. For instance, Thomas Edison wrote in a letter to an associate in 1878:[8]
… difficulties arise—this thing gives out and [it is] then that “Bugs”—as such little faults and difficulties are called—show themselves[9]
Ah Cunningham’s law. I’ll take my L.
No, they didn’t.
Stolen from wiki but it’s a common story that Grace Hopper would tell:
In 1946, when Hopper was released from active duty, she joined the Harvard Faculty at the Computation Laboratory where she continued her work on the Mark II and Mark III. Operators traced an error in the Mark II to a moth trapped in a relay, coining the term bug. This bug was carefully removed and taped to the log book. Stemming from the first bug, today we call errors or glitches in a program a bug.[14]
Literally the same thing as OP’s story.
It’s a good story, but it turns out to be a folk etymology. Two good articles discuss this issue: one from the New York Times and one from Computerworld.
Congratulations, you’ve now become a debugger
Wow, that was actually worth the read. Well done, glad you found the bug!
This is actually the origin of the word “bug” for computer errors. Early room-sized computers had lots of moving parts and used a lot of power. Bugs liked the warmth, so they would regularly crawl inside then get crushed somewhere by a moving part. So when your computer started acting up, you had to debug your computer by literally digging around inside and finding all the dead bugs.
In 1946, when Hopper was released from active duty, she joined the Harvard Faculty at the Computation Laboratory where she continued her work on the Mark II and Mark III. Operators traced an error in the Mark II to a moth trapped in a relay, coining the term bug. This bug was carefully removed and taped to the log book. Stemming from the first bug, today we call errors or glitches in a program a bug.
To have such a legendary event happen. I wouldn’t even be mad about broken parts. This story is for life.
Has anyone mentioned the origin of the term, computer bug? It is not a story a Jedi would tell you.
In the days of yore, when the marvels of computation were in their infancy, there arose a curious tale of a most peculiar occurrence. Lo and behold, within the realms of the great Harvard Mark II computer, a glitch did manifest itself. And this glitch, verily, was likened unto a pestilence that did plague the workings of the machine.
Now, the diligent engineers sought to uncover the source of this disturbance, to fathom the cause behind the malfunction. And as they delved deeper into the intricate labyrinth of wires and circuits, their eyes beheld a most astonishing sight—a tiny creature, not of flesh and blood, but of the insect kind.
With great astonishment, they did identify this creature as a moth, a small winged being that had inadvertently made its abode within the sacred confines of the machine. Thus, they spake among themselves, saying, “Behold, a bug hath infiltrated our realm of computation, and caused this mighty disturbance.”
And so, the tale of the computer bug was born, a term coined to describe those minute creatures that wrought havoc upon the intricate mechanisms of these contraptions. Though time hath passed and technology hath advanced, the legacy of this tale remains, reminding us of the humble origins of our digital dominion.
Playing Minecraft when your problem with bugs would be better solved in Grounded. There’s your problem! 😆
I don’t want to question your debugging skills, but imo you’re lucky to have found that out at all.
I’ve been in IT a long time and building PCs for longer, but my debugging skills degrade in direct proportion to how frustrated I am by the problem. That usually starts off very high and gets higher as obvious things get checked off the list.
Evidently I still have some kind of problem, because the GPU just fell off the bus again… but I haven’t had any frame rate issues in the last hour, so I think it’s a coincident problem and maybe not related to the moth. I did just update a BIOS, so I dunno. Too late to worry about it tonight.
When I’m at peak frustration, I start nuking things or shopping for parts online. Eventually the time I’ve wasted seems like so much of a waste compared to just replacing everything. If I were rich, I’d probably throw out an entire PC and get a new one to see if that fixed my internet. “Oh, I fixed it! I guess it was…something.”
You and I are VERY similar in that way. I was literally shopping for new parts last night for a whole new PC since this is effectively a motherboard problem.
With a decent sleep behind me, I got back at it today and flashed every BIOS from my manufacturer, backward from the newest, until I arrived at one that didn’t have the GPU issue, which I thought was caused by the moth. It turns out that it was not, in the end, the bug’s fault.
Having started on Friday morning with the very first manufacturer BIOS, which shipped with the board, I knew that it worked in a stable fashion but was missing a key feature for my new hardware. By the time I found one that well and truly didn’t exhibit the issue that was crashing my GPU, I was back to the 3rd BIOS release and now I think we are at “stable.”
The board in question, in case anyone finds this on a search engine, is the Gigabyte Aorus Z690 Ultra DDR4, and the shipped BIOS is version F3. My video card is a Gigabyte RTX 3060Ti, and the bug causing hard locks / crashes is present in every BIOS from F20 and upward. Version F6 has another issue I didn’t bother to diagnose that was causing instability for unknown reasons. F8, however, works just as well as F3 so far, though I want the rest of the weekend to test. So far none of the F20 series issues are in the journal though, so I think that’s where the issue was introduced.
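In case it helps anyone comparing BIOS versions the same way, this is roughly how the journal check could be scripted. It is only a sketch: it assumes the driver logs the crash with the phrase "fallen off the bus" (adjust the `needle` if your message differs), and the function names are made up for illustration.

```python
import subprocess


def find_bus_drops(lines, needle="fallen off the bus"):
    """Return the log lines that mention the GPU dropping off the PCIe bus."""
    needle = needle.lower()
    return [line.rstrip("\n") for line in lines if needle in line.lower()]


def kernel_log(boot="0"):
    """Read kernel messages for one boot via journalctl (0 = current, -1 = previous)."""
    try:
        proc = subprocess.run(
            ["journalctl", "-k", f"--boot={boot}", "--no-pager"],
            capture_output=True, text=True, check=False,
        )
    except FileNotFoundError:
        return []  # journalctl is not installed on this system
    return proc.stdout.splitlines()


if __name__ == "__main__":
    hits = find_bus_drops(kernel_log())
    print(f"{len(hits)} bus-drop message(s) this boot")
    for line in hits:
        print(line)
```

Running it once per flashed BIOS, after a gaming session, gives a quick yes/no on whether that version still drops the card.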
And for Gigabyte haters, I feel you, but it’s a SFF build and I needed a mITX Z690 board and a 3060Ti that was short enough to fit in a NF200 case built at the peak of the GPU shortage which only made things more complicated. My options were serendipitously limited to that manufacturer. :)