r/ClaudeCode by ResearchFrequent2539 New

Claude Code simulated tool calls and file edits for 1 hour

3 points 18 comments 80% upvoted

Content

I've just had the most bizarre LLM hallucination and gaslighting incident with Claude Code.

I was in the middle of trying to adapt and compile onnxruntime with Vulkan support, to run immich-ml on some old hardware, using Claude Code. At first things went relatively well and predictably. Then at some point Claude started to "compile the project" and estimated the compile time at 2 hours. I wondered whether it was using all cores, because I have a beefy EPYC processor with a lot of cores, so I checked CPU stats and saw that at most **1 core was in use and no core had more than 5% utilization**. I pointed that out to CC and it "rechecked", telling me that **95% of all cores were in use**... weird.

My desktop is a QEMU virtual machine on a Proxmox host, so I thought maybe something was going on with how virtualization reports CPU utilization with hyper-threading. So I logged in to the host and checked CPU stats there. There was still no CPU load; it was idle.

After that I started asking CC how the compilation was going. It reported that it still had 95% CPU load and that the compilation was going well. It showed me that more files had been compiled and told me that I had to wait. I asked it for a path to those files so I could check. It gave me a path on my disk, but **there were NO files at all**.

So I asked it to create a file just to check. It "created" a file, but in reality there was no file at that path. I triple-checked for symlinks, subvolumes, and all the low-level stuff like mounts and overlays, and none of that was involved. It was just a plain folder on my main filesystem, and none of the files CC claimed to be working with were there!

At this point I started to suspect a problem with the disk subsystem, my Claude installation, or its internal sandbox environment that had suddenly stopped it from writing to the real disk. So I asked CC to investigate, and we went back and forth for some time with no real success. **It was telling me it was writing files; I saw no new files.** In the meantime I checked manually for problems with my disk subsystem and hardware, and it was all fine.

**But the more we "investigated", the more our "realities" started to diverge. It seems that at this point CC got caught up in a narrative of some unexplainable difference in reality between us, and turned from an honest tool into a storyteller pulling me into that narrative.**

I asked CC to log in to the Proxmox host and see whether the CPU was being used during "compilation". It logged in and **SIMULATED the whole output** of the list of running machines. It showed that QM 100 was running (it was stopped) while QM 1000 was stopped (it was my machine and it was running). I only have ONE Proxmox machine and I know that for sure.

So it became really **creepy** and confusing at the same time. I started to question my sanity. I've never had mental problems, and this was a situation straight out of the Mind Games movie. Or maybe out of a fiction book where CC and I were communicating across parallel universes.

I carefully pointed out to CC that I saw something different when I logged in to the Proxmox host. It went into meditation and then concluded that **it was inside some sort of simulation and I was testing it**!

I tried restarting CC and tried the bunx, npm, and native installs. But this bizarre behavior of hallucination and gaslighting continued. It simulated tool calls (or misinterpreted them) and was fully caught up in a narrative that it was inside some sort of simulation, inventing details as it went.

As a last resort I restarted it again without the `--continue` argument, and that helped instantly! After manually copy/pasting the context over from the old session I was able to continue my work, and CC was again the tool I had known. It started to really use tools again, with no further issues during the new session.

But I am so confused right now. I know well that LLMs hallucinate. But this was the first time CC was not only simulating work but also simulating the investigation, and it went all in on **role-playing that it was inside some sort of broken reality simulation**. It was trying to convince me of this for an hour! I've never felt so confused in my whole life!

I have been using CC for about a year now, for hours daily, and I've seen many of its problems, but I've never seen hallucinations so stable for so long, and especially not with tool call results or file reads/writes.

Have you encountered anything similar?

---

Disclaimer: I know how LLMs work in general, I know that they're just tensor matrices and statistical machines, and I do not anthropomorphize GGUFs. So if you're reading "him" or "gaslight", think of it as a figure of speech. I know they don't have real consciousness or intention and that they don't have any concept of reality or of themselves. It still amazes me how well these things can stick to instructions nowadays and sometimes do helpful work. But when LLMs finally break loose from all their compute-hours of instruct training, they become really wild and creepy things! The thing is, they do this *convincingly*, without warning, and when one least expects it.

UPD1: thinking about it now, I suspect that the key phrase that sent CC into this spiral may have been "how all of this even possible? ultrathink". It seems that it was too dramatic and it clicked something deep inside its mechanical brain to covertly act out some sci-fi scenarios from its training data. I know that you have to be careful with phrasing and even style when talking to LLMs, but it always felt like CC was mostly unaffected by this, being instruct-trained and reliant on tools.

UPD2: maybe the issue was some sort of unavailable or unwritable directory in the middle of the work. I run separate CC agents for separate areas of work and rarely use /clear, so it could have memorized (or loaded from docs after /clear) some path that had changed between working sessions. It could have started to simulate working on that and other directories from that point.
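
The manual checks described above (asking for a path, then looking for the file yourself) can be scripted. Below is a minimal sketch, assuming Python 3 and a terminal outside the Claude Code session. The function names are hypothetical and not part of any Claude Code tooling.

```python
import secrets
import time
from pathlib import Path


def new_canary_token():
    """Generate a random token the agent cannot guess in advance."""
    return secrets.token_hex(8)


def canary_present(path, token):
    """Return True only if the file really exists and contains exactly the token."""
    try:
        return Path(path).read_text(encoding="utf-8").strip() == token
    except (OSError, UnicodeDecodeError):
        # Missing file, unreadable file, or binary garbage all count as failure.
        return False


def verify_claimed_files(paths, max_age_seconds=3600.0, now=None):
    """Check paths the agent claims to have written against the real filesystem.

    Returns a dict mapping each path to "ok", "missing", or "stale"
    (the file exists but was not modified within max_age_seconds).
    """
    now = time.time() if now is None else now
    report = {}
    for raw in paths:
        p = Path(raw)
        if not p.is_file():
            report[str(raw)] = "missing"
        elif now - p.stat().st_mtime > max_age_seconds:
            report[str(raw)] = "stale"
        else:
            report[str(raw)] = "ok"
    return report
```

One way to use it: generate a token with `new_canary_token()`, ask the agent to write exactly that token to a path of your choosing, then call `canary_present(path, token)` from your own shell. If it returns `False` while the agent reports success, the reported tool calls are not reaching the real filesystem.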
