Almost daily unreachable PMS (Chuck save me)

Alright, here are the logs from the most recent (containerized) instance. I thought the issue might have resolved itself, since it went three days without a problem, but it finally happened again last night.

Plex Media Server Logs_2020-02-27_03-59-07.zip (6.2 MB)

I'll update with native logs after it happens again.

I'm a bit confused here.

  1. This is from the container?
  2. It contains another container crash / failure?
  3. It’s the Plex Relay message?
  1. As we originally discussed, I was running a native Plex alongside the Dockerized instance to avoid service interruption for my users, so yes, these logs are from the Docker instance.

  2. Yes, after three days of running fine, the Docker instance finally became unreachable.

  3. Yes, this log as well as the originally posted log from a few days back both displayed that Plex Relay message in the logs.

Also just so you know I’m running the same version natively as I was inside the container: 1.18.4.2171-ac2afe5f8

Thanks. I just want to be absolutely certain of what I’m looking at.

What’s interesting is: the same binary executables are used for the container as well as the desktop. The only possible difference is glibc. (All of PMS has a single minimum version requirement: glibc must be 2.14.1 or higher.)

If there is a difference between the container’s version and the typical desktop which is causing this, it will be a good and interesting find but likely a pain to resolve.
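For what it's worth, the glibc version can be checked from inside either environment. A minimal sketch (assuming a Linux/glibc system; `platform.libc_ver()` reports an empty string elsewhere, and the `meets_minimum` helper is my own, not anything PMS ships):

```python
import platform

# Report the C library this process is linked against; on a glibc
# system this returns e.g. ('glibc', '2.23').
lib, version = platform.libc_ver()
print(lib, version)

def meets_minimum(version, minimum="2.14.1"):
    """Compare dotted version strings numerically (PMS needs >= 2.14.1)."""
    as_tuple = lambda v: tuple(int(part) for part in v.split("."))
    return as_tuple(version) >= as_tuple(minimum)
```

Running this in both the container and on the host would confirm whether the two environments really are on the same glibc.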

I have a couple things to do first but will start diving deep into this later this afternoon.

Sure, I appreciate it.

The container/native environment both report: (Ubuntu GLIBC 2.23-0ubuntu11)

This is concerning.

Feb 27, 2020 03:58:41.168 [0x7fccb6ffd700] ERROR - Failed to begin transaction (../Library/MetadataCollection.cpp:174) (tries=6): Cannot begin transaction. database is locked
Feb 27, 2020 03:58:41.586 [0x7fccb6ffd700] ERROR - Failed to begin transaction (../Library/MetadataCollection.cpp:174) (tries=7): Cannot begin transaction. database is locked

Where is the metadata stored? Is it on the local machine or across the network?

Local machine. Those messages appeared, I believe, in both sets of logs.

But doesn’t it correct itself and keep going, or does it die at that point?

It keeps going (as you can see by the retry count).

The concern is “why”. There is something monopolizing the DB resources.
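For anyone following along, the “database is locked” error itself is easy to reproduce outside PMS with two SQLite connections; this is only an illustrative sketch and assumes nothing about how PMS actually structures its transactions:

```python
import os
import sqlite3
import tempfile

path = os.path.join(tempfile.mkdtemp(), "meta.db")

# isolation_level=None means autocommit; transactions are issued
# explicitly. timeout=0 makes a lock conflict fail immediately.
writer = sqlite3.connect(path, timeout=0, isolation_level=None)
other = sqlite3.connect(path, timeout=0, isolation_level=None)

writer.execute("CREATE TABLE metadata (id INTEGER)")
writer.execute("BEGIN IMMEDIATE")  # first writer holds the write lock

try:
    other.execute("BEGIN IMMEDIATE")  # second writer cannot begin
except sqlite3.OperationalError as exc:
    error = str(exc)
    print(error)  # "database is locked"
```

Whatever is monopolizing the DB here plays the role of `writer`; the transaction that logs the error plays the role of `other`.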

I did see a flurry of sync activity before one set.
I think this is worth investigating: whether sync activity itself is ‘hogging’ the DB time.

Yeah, I saw it keep going but after the tenth attempt I see:

Feb 27, 2020 03:58:43.439 [0x7fccb6ffd700] ERROR - Failed to begin transaction (../Library/MetadataCollection.cpp:174) (tries=10): Cannot begin transaction. database is locked
Feb 27, 2020 03:58:43.439 [0x7fccb6ffd700] ERROR - Fixup 29DatabaseFixupsSyncCollections threw an exception: Cannot begin transaction. database is locked
Feb 27, 2020 03:58:43.439 [0x7fccb6ffd700] DEBUG - Running database fix-ups completed in 1.1 seconds.

after which it never appears again.
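The tries=N counter climbing to 10 and then the fixup throwing suggests a bounded retry loop. How PMS actually implements it is internal; the general pattern looks something like this sketch (function name, backoff, and cap are my own assumptions, with the cap chosen to mirror the tries=10 in the logs):

```python
import sqlite3
import time

def begin_with_retry(con, max_tries=10, base_delay=0.05):
    """Try to open a write transaction, retrying on 'database is locked'
    up to max_tries times before letting the exception surface."""
    for attempt in range(1, max_tries + 1):
        try:
            con.execute("BEGIN IMMEDIATE")
            return attempt  # number of tries it took
        except sqlite3.OperationalError as exc:
            if "locked" not in str(exc) or attempt == max_tries:
                raise  # out of retries: surfaces like the fixup exception
            time.sleep(base_delay * attempt)  # simple linear backoff
```

On an uncontended database this succeeds on the first try; with a competing writer holding the lock for the whole window, it raises after the tenth attempt, which matches the “Fixup … threw an exception” line.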

Yes, that’s exactly where I’m looking too.

It’s starting to look like a messy sync list is causing problems.

I see. How could we investigate further whether that’s the trigger?

I need to first speak with Engineering about how that works.

This might be a red-herring error: it appears because the database fixups are running, and it may not be the root problem.

There does appear to be a race condition. Now to identify which elements are fighting.

Is there a way to clear out all the sync related data from the db in an effort to remove whatever leftover data is affecting the application?

Yes, manually go through your sync lists and clear out the dead ones.

Well, I went to the “Syncs” section and removed everything. I also saw this:

Feb 27, 2020 03:58:43.439 [0x7fccb6ffd700] ERROR - Fixup 29DatabaseFixupsSyncCollections threw an exception: Cannot begin transaction. database is locked
Feb 27, 2020 03:58:43.439 [0x7fccb6ffd700] DEBUG - Running database fix-ups completed in 1.1 seconds.

You cleared it out before or after that?

Also, after performing the cleanup, did you perform a shutdown/restart?

I ask because, during the shutdown/restart sequence, the contents of the WAL (write-ahead log) are integrated into the static DB.

It was after all the crashes that I removed them.

So it’s recovering from the crash; that’s going to happen.
The WAL is going to be recovered (replayed) to ‘fix up’ the current DB image.
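The WAL behavior described here can be seen directly with SQLite: while a WAL-mode database is open, recent writes live in a `-wal` side file, and a checkpoint (which a clean shutdown performs) replays them into the main DB. A minimal sketch, assuming nothing about PMS’s own schema:

```python
import os
import sqlite3
import tempfile

path = os.path.join(tempfile.mkdtemp(), "meta.db")
con = sqlite3.connect(path, isolation_level=None)
con.execute("PRAGMA journal_mode=WAL")
con.execute("CREATE TABLE syncs (id INTEGER)")
con.execute("INSERT INTO syncs VALUES (1)")

# Recent writes live in the write-ahead log next to the main file.
wal_exists = os.path.exists(path + "-wal")

# A TRUNCATE checkpoint replays the WAL into the main DB and empties
# the log; busy == 0 means no reader blocked the checkpoint.
busy, _, _ = con.execute("PRAGMA wal_checkpoint(TRUNCATE)").fetchone()
con.close()
```

A crash skips the clean checkpoint, so the replay happens on the next startup instead, which is why those recovery messages show up after the crashes.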

Assuming I’ve cleared it out, there should be no mention of sync in the logs any more, correct? (Assuming no one starts a new sync session.)

Correct. When sync is free of errors (LPE warnings, etc.) and ‘pristine’, none of that should be happening.