Frequent "CLOCK_WATCHDOG_TIMEOUT" BSOD and other random errors
About a week and a half ago, I bought a new MOBO, RAM, CPU, SSD, and case to give my system a bit of an upgrade, since my old one was getting dated. I followed all instructions when setting everything up, was sure to discharge any static, and was very gentle with the parts. I POSTed on the first try, and everything seemed to be working out pretty well. After I finished installing all of my drivers, getting basic software on the system, and playing some games, I started getting the "CLOCK_WATCHDOG_TIMEOUT" BSOD at random, but frequent, intervals. It seems to happen regardless of what I'm doing. I can be installing software, updating, playing a game, using Chrome, or nothing at all, and I'll get the error.
So I downloaded Driver Booster and updated all of my drivers (some were old), made sure Windows was updated, updated my BIOS, cleaned off any potentially problematic software, and reset the BIOS settings to default. However, I was still getting the error at the same frequency, along with some other random errors: the system boots to BIOS on its own on seemingly every other restart, whether it's after a BSOD or a manual restart; ARMA 3 crashes from memory errors; and Civ 6 crashes, but I can't get any info on why. BlueScreenView reported the driver causing the error was "hal.dll" every time (even the one time I got a WHEA_UNCORRECTABLE_ERROR).
I figured it had to be a corrupt Windows file, so I did a clean Windows reinstall, then updated all of my drivers and updated Windows to the latest version. It made no noticeable difference in the crash frequency. After scouring what has to be hundreds of threads from people with similar (but not the same) problems on all different kinds of forums, I'm fairly certain it has to be a hardware issue.
I've been running Memtest86 for the past couple of afternoons to see if my RAM is the issue, and I haven't been able to complete a reliable number of passes. It always either hangs (Memtest86 freezes and causes a restart) or generates an insane number of errors (in the tens of millions) in the same test (Test 8, random number sequence). Sometimes a pass will complete just fine, but it will eventually do the same thing on a later pass. I've run tests with one stick, with both sticks, and with them moved around to all the slots, and I keep getting the same issues on the same test.
At this point, I'm fairly certain my RAM came DOA, or I somehow zapped it when I was setting everything up, but the fact that it's happening on every slot makes me worried that it may also be my MOBO. I'm wondering if someone could give me a second opinion, before I RMA my RAM and wait a week or more to get a new set back (since I live in Alaska).
System specs:
CPU: Intel i7-6700K @ 4.0 GHz (4 cores, 8 threads)
MOBO: ASUS Z170 Deluxe
GPU: EVGA GTX 970 SSC w/ACX 2.0+
RAM: G.SKILL TridentZ Series 16GB (2x8GB) DDR4 3200 1.35 V
Storage: 500GB Samsung 850 (OS Drive), 250 GB Samsung 850 (Game drive), and a 2TB Seagate HDD (Nothing on this one).
BSOD Dumps:
This one is the CLOCK_WATCHDOG_TIMEOUT, it just didn't populate the bug check string for some reason
[code]==================================================
Dump File : 120716-4843-01.dmp
Crash Time : 12/7/2016 7:05:41 PM
Bug Check String :
Bug Check Code : 0x00000101
Parameter 1 : 00000000`00000018
Parameter 2 : 00000000`00000000
Parameter 3 : ffff8c80`af8bb180
Parameter 4 : 00000000`00000003
Caused By Driver : hal.dll
Caused By Address : hal.dll+4e490
File Description :
Product Name :
Company :
File Version :
Processor : x64
Crash Address : ntoskrnl.exe+14a510
Stack Address 1 :
Stack Address 2 :
Stack Address 3 :
Computer Name :
Full Path: C:\Windows\Minidump\120716-4843-01.dmp
Processors Count: 8
Major Version : 15
Minor Version : 14393
Dump File Size : 545,572
Dump File Time : 12/7/2016 7:06:35 PM
==================================================[/code]
Got another one when typing this up, and answering a captcha after previewing my post
[code]==================================================
Dump File : 120716-4828-01.dmp
Crash Time : 12/7/2016 7:26:44 PM
Bug Check String : SYSTEM_SERVICE_EXCEPTION
Bug Check Code : 0x0000003b
Parameter 1 : 00000000`c0000005
Parameter 2 : fffff802`7eb70dde
Parameter 3 : ffffaf01`5a4f26d0
Parameter 4 : 00000000`00000000
Caused By Driver : ntoskrnl.exe
Caused By Address : ntoskrnl.exe+14a510
File Description :
Product Name :
Company :
File Version :
Processor : x64
Crash Address : ntoskrnl.exe+14a510
Stack Address 1 :
Stack Address 2 :
Stack Address 3 :
Computer Name :
Full Path: C:\Windows\Minidump\120716-4828-01.dmp
Processors Count: 8
Major Version : 15
Minor Version : 14393
Dump File Size : 545,556
Dump File Time : 12/7/2016 7:28:14 PM
==================================================[/code]
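Since the first dump came back with an empty bug check string, here is a minimal sketch of how the code can be mapped back to a name by hand. The helper names (`parse_dump_report`, `bug_check_name`, `param`) are made up for illustration and are not part of BlueScreenView, and the lookup table only covers the codes that show up in this thread. The parameter meaning noted in the comments (parameter 4 of 0x101 being the index of the hung processor) is my reading of Microsoft's bug check reference, so treat it as an assumption.

```python
# Illustrative lookup table: only the bug checks mentioned in this thread.
BUG_CHECK_NAMES = {
    0x00000101: "CLOCK_WATCHDOG_TIMEOUT",
    0x0000003B: "SYSTEM_SERVICE_EXCEPTION",
    0x00000124: "WHEA_UNCORRECTABLE_ERROR",
}


def parse_dump_report(text):
    """Turn 'Key : Value' lines from a BlueScreenView-style text report into a dict."""
    fields = {}
    for line in text.splitlines():
        # Skip separator rows and anything without a key/value split.
        if line.startswith("=") or ":" not in line:
            continue
        key, _, value = line.partition(":")
        fields[key.strip()] = value.strip()
    return fields


def bug_check_name(fields):
    """Use the reported string if present, otherwise fall back to the code lookup."""
    code = int(fields["Bug Check Code"], 16)
    return fields.get("Bug Check String") or BUG_CHECK_NAMES.get(code, "UNKNOWN")


def param(fields, n):
    """Return bug check parameter n as an integer (the backtick is only a digit separator)."""
    return int(fields[f"Parameter {n}"].replace("`", ""), 16)


# A few lines copied from the first dump above.
SAMPLE = """\
Dump File : 120716-4843-01.dmp
Bug Check String :
Bug Check Code : 0x00000101
Parameter 1 : 00000000`00000018
Parameter 4 : 00000000`00000003
"""

fields = parse_dump_report(SAMPLE)
print(bug_check_name(fields))  # name recovered from the code
print(param(fields, 4))        # assumed: index of the unresponsive processor
```

If that reading of the parameters is right, the first dump points at logical processor 3 failing to respond to the clock interrupt, which says nothing on its own about whether the RAM, board, or CPU is at fault.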
What happens if I run Memtest86 and get errors:
[t]https://i.imgur.com/Pd03MtB.jpg[/t]
If you haven't put the RMA/warranty request in yet, do so now and start waiting. That RAM kit is definitely shot.
These things happen. Having component failure happen early in the game isn't uncommon. Infant mortality (early failure) in hardware is a well known thing.
Luckily RAM is piss easy to RMA and most companies have lifetime warranties.
I was able to get my hands on some new RAM, and tried to boot up and run memtest, but it's still doing all the same stuff, plus the blue screens seem to be more frequent (I've gotten like 4 or 5 in the past 20 minutes). One of them was a CLOCK_WATCHDOG_TIMEOUT, but the others were a SYSTEM_SERVICE_EXCEPTION.
Is it still likely a hardware issue? I'm doing a 100% clean reset of windows again, but could it be another component, like my CPU or MOBO?
Run memtest on the new ram. If it's throwing out a lot of errors it could be a problem with the board (or most rarely the CPU)
[QUOTE=tyanet;51501386]I was able to get my hands on some new RAM, and tried to boot up and run memtest, but it's still doing all the same stuff, plus the blue screens seem to be more frequent (I've gotten like 4 or 5 in the past 20 minutes). One of them was a CLOCK_WATCHDOG_TIMEOUT, but the others were a SYSTEM_SERVICE_EXCEPTION.
Is it still likely a hardware issue? I'm doing a 100% clean reset of windows again, but could it be another component, like my CPU or MOBO?[/QUOTE]
Have you looked at the memory settings in BIOS?
[QUOTE=Richard Simmons;51502873]Run memtest on the new ram. If it's throwing out a lot of errors it could be a problem with the board (or most rarely the CPU)[/QUOTE]
I did, and it's doing the same as before. Either tens of millions of errors, the test locks up, or it restarts the computer mid test.
Is there any way to determine with confidence that it's the motherboard or the CPU?
[editline]9th December 2016[/editline]
[QUOTE=Killervalon;51502961]Have you looked at the memory settings in BIOS?[/QUOTE]
I did. I set it to 1.25 V (factory is 1.2) and a clock rate of 2400 (same as factory).
What power supply are you running, and do you have any sort of OC on the CPU right now?
The symptoms are identical to the usual symptoms of an unstable CPU, either from pushing an overclock too far or from the CPU failing to get enough voltage to stay stable under certain conditions (a motherboard or PSU problem).
Check your voltages in CPU-Z under various loads and try disabling SpeedStep.
[QUOTE=fishyfish777;51503408]What power supply are you running and do you have any sort of OC on the CPU right now
The symptoms are identical to the regular symptoms of an unstable CPU from pushing it too far overclocking or the CPU failing to get enough voltage to get stable under certain conditions (motherboard or PSU problem)
Check your voltages in CPU-Z under various load and try disabling SpeedStep[/QUOTE]
I'm running a Corsair RM750 from my last build.
I OC'd the CPU when I first set everything up, but I reset all the bios settings to default as soon as I started getting issues. It's been running at the stock 4.0 GHz since then.
The voltage defaults to around 1.2-1.25 V. I manually set it to around 1.35 V when I started getting issues to see if it would help, but it didn't seem to have an effect.
[editline]9th December 2016[/editline]
I'll try playing with SpeedStep and monitoring my voltages this afternoon.
I get that watchdog error every time when using display driver uninstaller for some reason.
So the problem is my battery's not working due to the BIOS, but to update the BIOS I need a battery -_-
[QUOTE=fishyfish777;51503408]Check your voltages in CPU-Z under various load and try disabling SpeedStep[/QUOTE]
So I disabled SpeedStep, and it seems pretty stable now. I was able to run games for about 5 solid hours last night, which is longer than I could before, and I left it on all night and it didn't reset, which was impossible before.
It's nice that it works now, but could something still be wrong with my hardware? I don't want to be running on a faulty setup, since I'll have to make this rig last for a few years.
Also, I'm still having the issue where every time I boot up or restart my system, it goes straight into the BIOS, and I can only get into Windows when I click the "save and exit" option.
As mentioned above, that may just be the motherboard or PSU shitting out when asked to vary the voltage and clock speed.
I'd buy a burner cheapo microATX LGA1151 motherboard (preferably from a brick and mortar store) to check and RMA your current motherboard if it works with the cheapo. Return once the RMA comes through or keep as a backup.
You can also do that with a shitty 600w thermaltake PSU from best buy as well, but an RM750 is a good quality PSU that should still be holding up.
While you're at it, unplug/spray/replug all the cable connections in your computer lightly with some [URL="http://www.homedepot.com/p/WD-40-SPECIALIST-11-oz-Contact-Cleaner-300083/206356189"]WD40 electronics cleaner[/URL] (a.k.a isopropyl alcohol in a spray can) to make sure they have good contact. Loose cables can cause insufficient voltage.