QNAP TS-453B Random Restarts

Server Version#: V1.12.1

Hi all, I was wondering if you could help me with an issue I have been having over the past few weeks, please.

Anywhere from once or twice a day to once every few days, my QNAP NAS (TS-453B) randomly restarts. I have attached images of the notifications/log entries that I get.

I use my NAS purely as a Plex Media Server. I have tried downgrading to a previous firmware, but the issue persists.


I have also run the file system check each time with no errors, but it keeps happening.

I am running 16GB Crucial RAM (2x8GB). I have also tried each 8GB RAM stick individually and still had the same issue both times.

I submitted a ticket to QNAP along with logs. They said it was Plex related, but couldn’t really help much further.

Thanks in advance.

@DonCaliOni

I’m sorry for not responding sooner. Thanks for pinging.

Yes, I did get to the bottom of a few things.

I’ve updated firmware to 4.5.1.1495.
The QNAP is stable now. I’ve not had any issues with it. It’s been running for 11 days now without issue. In that time, my UPS has engaged a few times due to winter power fluctuations / interruptions.

My TS-128 (development system) is not connected to the UPS and did lose power. It did have a dirty file system when it restarted.

A dirty file system after a power loss is VERY expected and normal for any computer. Windows systems run CHKDSK, Macs have their utility, Linux has fsck. QNAP doesn’t automatically clean filesystems. I wish they would change that behavior, but so far they haven’t. I will continue to nudge them as best I can.

If your AC power is fluctuating enough to trigger a power-fail event every few days, I urge you to get a UPS.

I don’t know which model QNAP you have, but you only need enough VA (volt-amperes) to keep it running for about a minute (a short power interruption), plus enough time to perform a controlled shutdown.
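For anyone sizing a UPS along these lines, here is a minimal sketch of the VA arithmetic. The load wattage, power factor, and headroom below are assumptions for illustration only, not measured values for any QNAP model, and `min_ups_va` is a hypothetical helper name. Note that VA only tells you whether the UPS can carry the load; how long it carries it depends on battery capacity, so check the vendor's runtime chart for that.

```python
def min_ups_va(load_watts: float, power_factor: float = 0.6, headroom: float = 1.25) -> float:
    """Rough minimum UPS VA rating for a given load.

    load_watts   : measured or estimated draw of the NAS (assumption: you supply this)
    power_factor : assumed ratio of watts to VA for the UPS (0.6 is a guess, not a spec)
    headroom     : assumed safety margin so the UPS is not run at its limit
    """
    if load_watts <= 0:
        raise ValueError("load_watts must be positive")
    if not 0 < power_factor <= 1:
        raise ValueError("power_factor must be in (0, 1]")
    if headroom < 1:
        raise ValueError("headroom must be at least 1")
    # Apparent power (VA) = real power (W) / power factor, then add margin.
    return load_watts / power_factor * headroom


# Hypothetical 40 W load; replace with a real measurement.
print(f"Look for a UPS rated at or above roughly {min_ups_va(40):.0f} VA")
```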

If you’d like some help selecting the right one, I’d be happy to help.
The UPS management is automatic with QNAP. You only need a UPS with a USB port on it to connect to the QNAP. The QNAP will detect it and then manage it from there.

To ease the search, here’s the QNAP - UPS compatibility page.

For this link, I preselected the TS-453B

I personally use APC UPS units. They’ve served me well for the past 20+ years.

Regarding that notification center?

I don’t use it. The Event Notifications window is more than enough for me to maintain the unit.

I don’t get the error messages you get.

I don’t know if I’ve asked, but when was the firmware last updated / reloaded?

Also, is the unit in a dusty environment? I have a fair amount of dust here (surrounded by farms) and need to clean my unit 2-3 times a year.

Thank you for the detailed response. This issue has occurred both before and after installing a UPS (I use an APC UPS, which the QNAP recognised and which provides about 35 minutes of uptime). I last updated the firmware about a week ago. The issue has occurred across 3 different firmware versions, as I downgraded to see if it was a firmware issue.

I wouldn’t say the unit is in a dusty area, and it doesn’t look dusty on inspection.

Had a quick read over the QNAP ticket again, and the agent said: “For some reason, it looks like it happens because of the GPU in Plex.”

Sorry for the late reply on my side too. I’m in the UK, so I think we’re in different timezones!

Hi, I thought I’d attach a set of QNAP logs too, taken after a crash, in case it is helpful! Q18BB06082 (1).zip (724.8 KB)

@DonCaliOni

Thanks for supplying the log, but we don’t know how to read their logs.

If there is a problem with the GPU then every TS-453B in existence would have the same problem – which isn’t the case.

Everything you’re describing sounds like the first TVS-1282 I had.

There was nothing wrong with Plex and everything wrong with that unit.
They eventually shipped me a new NAS and there hasn’t been a problem since.

Manufacturing defects happen. It might indeed be the CPU, or it might be as simple as the CPU not being seated properly in the socket or a heat sink compound problem. Those things happen.

Please do press on them to replace the unit while still under warranty.

Unfortunately it’s out of warranty. It has been running completely fine for ages, and then suddenly, just starts to crash/restart almost daily. It wasn’t after any updates to PMS or the QNAP either.

I guess I’ll just have to keep trying different things and see where I get to. I’ll start with trying the stock QNAP RAM, as officially, these are made to take 2x4GB RAM sticks rather than 1x8GB.

While you’ve got the top off…

Found it: (the hardware can’t map a single 8GB stick properly)

System Memory 4 GB SO-DIMM DDR3L (2 x 2 GB)

You might sneak in 8GB (2 x 4GB), because that’s all the CPU supports.

https://ark.intel.com/content/www/us/en/ark/products/95594/intel-celeron-processor-j3455-2m-cache-up-to-2-3-ghz.html

Cool, will give that a try then! It’s weird that it worked fine for ages with 16GB of RAM though…

Linux kernels are like that. A small change in the MMU can cause a lot of strange behavior.

As I think through this now, you’re probably seeing a hard “kernel panic” because of memory addressing faults. What you describe is, in retrospect, consistent with that.

Bring it down to 8GB, obey the CPU spec, and observe the results.
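To confirm what the kernel actually sees before and after swapping sticks, something like the sketch below could be run over the contents of `/proc/meminfo`. The 8 GiB limit is the figure quoted above from the Intel ARK page; the helper names and the sample `MemTotal` values are hypothetical. `MemTotal` is always a little below the installed amount, so a simple "greater than the limit" check is enough to spot a 16GB configuration.

```python
import re

CPU_MAX_GIB = 8  # limit quoted in this thread from Intel ARK for the J3455


def mem_total_gib(meminfo_text: str) -> float:
    """Extract MemTotal from /proc/meminfo-style text and convert kB to GiB."""
    match = re.search(r"^MemTotal:\s+(\d+)\s+kB", meminfo_text, re.MULTILINE)
    if match is None:
        raise ValueError("MemTotal line not found")
    return int(match.group(1)) / (1024 * 1024)


def exceeds_cpu_limit(meminfo_text: str, limit_gib: float = CPU_MAX_GIB) -> bool:
    """True when the kernel reports more RAM than the assumed CPU limit."""
    return mem_total_gib(meminfo_text) > limit_gib


# Hypothetical sample; on the NAS you would read open("/proc/meminfo").read() instead.
sample = "MemTotal:       16282004 kB\nMemFree:         1234567 kB\n"
print(f"{mem_total_gib(sample):.1f} GiB visible, over limit: {exceeds_cpu_limit(sample)}")
```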

There were similar discussions early on about why the C2538’s can have 16 GB but the newer J-series could only have 8 GB. Those discussions are here in the forum somewhere.

Makes sense! Will give it a shot and report back

Whilst I’m waiting for the original RAM to arrive, do these log entries mean much to you? (I appreciate they aren’t Plex logs, so probably not)…

[ 5268.787082] [drm] GPU HANG: ecode 9:1:0xeeffefa1, in Plex Transcoder [25657], reason: Hang on bcs0, action: reset
[ 5268.797623] i915 0000:00:02.0: Resetting chip after gpu hang
[ 5268.805134] BUG: unable to handle kernel NULL pointer dereference at 0000000000000070
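If it helps to track how often this sequence shows up across crashes, here is a minimal sketch that tags kernel log lines like the three above. The regular expressions are written against these exact lines only and are assumptions about the format; other kernel versions may word the messages differently. The names `PATTERNS` and `classify` are hypothetical, and the script only labels lines, it does not establish a root cause.

```python
import re

# Patterns modelled on the three log lines quoted in this post (assumed format).
PATTERNS = {
    "gpu_hang": re.compile(
        r"\[drm\] GPU HANG: .*?, in (?P<process>.+?) \[(?P<pid>\d+)\], "
        r"reason: (?P<reason>.+?), action: (?P<action>\w+)"
    ),
    "gpu_reset": re.compile(r"i915 .*Resetting chip after gpu hang"),
    "kernel_null_deref": re.compile(
        r"BUG: unable to handle kernel NULL pointer dereference"
    ),
}


def classify(lines):
    """Return (kind, details) tuples for each line matching a known pattern."""
    events = []
    for line in lines:
        for kind, pattern in PATTERNS.items():
            match = pattern.search(line)
            if match:
                events.append((kind, match.groupdict()))
                break  # one label per line is enough for this sketch
    return events


log = [
    "[ 5268.787082] [drm] GPU HANG: ecode 9:1:0xeeffefa1, in Plex Transcoder [25657], reason: Hang on bcs0, action: reset",
    "[ 5268.797623] i915 0000:00:02.0: Resetting chip after gpu hang",
    "[ 5268.805134] BUG: unable to handle kernel NULL pointer dereference at 0000000000000070",
]
for kind, details in classify(log):
    print(kind, details)
```

Running it over a saved log after each crash would show whether a GPU hang in the transcoder always precedes the kernel fault, or whether the two also occur independently.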

@ChuckPa, if you recall, there was A LOT on the QNAP forums over this. After the 4.3.1 upgrades, many people had random reboots and we found several panic errors in the logs specific to memory.

I had to change from a single 8GB stick of RAM to two 4GB sticks, which is specifically what QNAP calls for. After I switched, it has worked like a champ.

Thanks for the info @skwor01, fingers crossed it sorts my issue too!

Out of interest, were the two 4GB RAM sticks aftermarket, or QNAP original?

I bought my TS-453Be from QNAPDirect, a third-party distributor. They sold the unit with one 8GB stick.

After many hours of troubleshooting, I finally bit the bullet and purchased two 4GB sticks direct from QNAP. It made all the difference in the world.

In my opinion, QNAP units are very picky when it comes to RAM. Best to be very exact and match their specs, including using their RAM.


@skwor01

You might want to let QNAPDirect know that their configuration is not compatible with current firmware.

You can explain the problem and how going by the Intel CPU specification and 2x 4GB resolved it.

You might be able to get some moneys back for the bad RAM they sold with it. :wink:

Did and tried. The problem was that I was a year into owning it when the reboots started happening and I discovered that the actual specs called for two 4GB sticks.

I did make known to them my displeasure with their inability to sell units as a direct vendor and NOT align with the actual manufacturer’s specifications :face_with_symbols_over_mouth:

I would still buy from them; I would just make sure all the specs match myself now. Live and learn.

This started happening to me in January and I’m still trying to debug it. Also with a 453Be. Running QTS 4.5.1.1540 and PMS 1.21.2.3939. It seems to be triggered by running multiple transcodes. I already did a full one-pass Memtest86 run, and drive diagnostics seem fine.

@UncleBabyBilly

  1. 8GB of RAM? (2 x 4GB sticks – as per spec)
  2. 1.21.2.3939 has been superseded by a hotfix for 2 issues. The current version is now 1.21.2.3943.

Yes, Crucial 8GB DDR3L 1600 MHz SODIMM Memory Module Kit (2 x 4GB). I did find the entries below in the QNAP log, but mcelog doesn’t appear to give any more info. I’m not sure what this is pointing to since, as previously stated, the Memtest passed. I do have the original 4GB stick somewhere.

2021-01-27 13:40:05 -06:00 <6> [ 0.049064] mce: [Hardware Error]: Machine check events logged
2021-01-27 13:40:05 -06:00 <0> [ 0.050004] mce: [Hardware Error]: CPU 0: Machine Check: 0 Bank 4: a600000000020408
2021-01-27 13:40:05 -06:00 <0> [ 0.051002] mce: [Hardware Error]: TSC 0 ADDR fef13580
2021-01-27 13:40:05 -06:00 <0> [ 0.052002] mce: [Hardware Error]: PROCESSOR 0:506c9 TIME 1611754482 SOCKET 0 APIC 0 microcode 1c
2021-01-27 13:40:05 -06:00 <6> [ 0.053055] Performance Events: PEBS fmt3+, Goldmont events, 32-deep LBR, full-width counters, Intel PMU driver.
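The `Bank 4: a600000000020408` value in the log above is a raw machine-check status register. As a rough aid, the sketch below splits it into the flag bits and code fields, assuming the standard IA32_MCi_STATUS layout from the Intel Software Developer's Manual (VAL at bit 63 down to PCC at bit 57, MCA error code in bits 15:0). `decode_mci_status` is a hypothetical helper, and it only extracts fields; working out what the error code means for this particular CPU still needs the Intel manual.

```python
def decode_mci_status(status: int) -> dict:
    """Split a 64-bit machine-check status value into its architectural fields.

    Bit positions assume the standard IA32_MCi_STATUS layout.
    """
    def bit(n: int) -> bool:
        return bool((status >> n) & 1)

    return {
        "valid": bit(63),            # VAL: register holds a logged error
        "overflow": bit(62),         # OVER: an earlier error was overwritten
        "uncorrected": bit(61),      # UC: hardware could not correct the error
        "enabled": bit(60),          # EN: error reporting was enabled
        "misc_valid": bit(59),       # MISCV: the MISC register holds extra info
        "addr_valid": bit(58),       # ADDRV: the ADDR register holds an address
        "context_corrupt": bit(57),  # PCC: processor context may be corrupt
        "model_specific_code": (status >> 16) & 0xFFFF,
        "mca_error_code": status & 0xFFFF,
    }


# Value copied from the "Bank 4" line in the log above.
fields = decode_mci_status(0xA600000000020408)
for name, value in fields.items():
    shown = f"0x{value:04x}" if isinstance(value, int) and not isinstance(value, bool) else value
    print(f"{name:20} {shown}")
```

For this value the valid, uncorrected, addr_valid, and context_corrupt flags come out set, which lines up with the `ADDR fef13580` line being present in the same log.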

[/var/log] # mcelog --client > mcelog.txt
[/var/log] #
[/var/log] # cat mcelog.txt
Memory errors
SOCKET 1 CHANNEL 4 DIMM 0
DMI_NAME "A1_DIMM0" DMI_LOCATION "A1_BANK0"
corrected memory errors:
0 total
uncorrected memory errors:
0 total

SOCKET 1 CHANNEL 4 DIMM 1
DMI_NAME "A1_DIMM1" DMI_LOCATION "A1_BANK1"
corrected memory errors:
0 total
uncorrected memory errors:
0 total