Plex Database sqlite file has more than doubled in 10 days

OttoKerner · August 22, 2020, 4:57pm

I’d rather go with the above sugggestion to

simply because the “watched” history is a valuable tool for many.
You can use it to decide which items you want to keep and which ones you rather delete to make room for new stuff.
Or in the case of music, you can create some clever smart playlists using this data.

Volts · August 22, 2020, 4:59pm

Oh agreed! And that’s a good point. There are lots of reasons that you might care about that data. I don’t mean to imply that nobody does or should.

I haven’t done any surgery (or butchery) on it in a while, and mine’s still a reasonable size at the moment. It’s possible that I was experiencing something more like what @pshanew mentioned, and that a more delicate approach would have worked too.

pshanew · August 22, 2020, 6:38pm

Yeah, it was really odd. Whenever I remove media from my library (and sometimes just 'cause) I perform a manual database optimization. It normally takes seconds to complete (small libraries), but on this occasion it took minutes. And during that time, the web client was extremely unresponsive. That prompted my investigation of the database.

I don’t care much for those statistics myself, and since it that table doesn’t seem to be a requirement for other tables, I didn’t mind deleting the records. Had it not worked, I likely would have deleted that table’s entire contents.

I still have no idea what, if any, client was causing the issue.

danielg1976 · August 22, 2020, 8:33pm

As I read the various comments I still have one nagging question - what exactly is being tracked by this table? How are the statistics ‘manifest’ or ‘viewed’ in Plex? Is it really just watch history by client?

What I’ve collected so far is:

It gets written to often if any client is open
It is used for watch history, which helps understand usage and what is and isn’t being watched
Watch history can be used to make smart playlists (sounds intriguing, like a playlist of everything I have listened to within the last 30 days?)

pshanew · August 22, 2020, 8:38pm

My experience is that, under normal circumstances, this table is only written to when media is played, not just by virtue of having a client open.

It tracks the type of media played, the account that played it, a count, and a duration. I believe it is used to populate the Top Users and Play History tables in the web client’s dashboard.

Volts · August 22, 2020, 8:43pm

That’s statistics_bandwidth. An open client checks in frequently, which causes a little bit of bandwidth usage.

The statistics_* tables are definitely used to populate the Bandwidth and Play History graphs on Status → Dashboard.

I’m also curious what else they support/enable.

Oooh I see Otto typing. Waiting now …

danielg1976 · August 22, 2020, 8:44pm

I keep pondering this.

Okay, so if it is watch history, how come 99.9% of the entries are all written with a timestamp of the first day of the month, at midnight? In fact, I can’t find an entry that isn’t written with a midnight timestamp. Wouldn’t they have much more random timestamps, like when we actually play media? I’m sure our household doesn’t all of a sudden watch millions of streams at exactly midnight.

OttoKerner · August 22, 2020, 8:44pm

The excessive number of entries is a bug.

Precisely. Or the opposite, to avoid hearing the same tracks too often.
See [HowTo] Create Playlists

danielg1976 · August 22, 2020, 8:46pm

Wait, I think i realize that since these are STATS they are being compiled by plex server? Not necessarily being written right when something happens? So in that case the exact timestamp may be when data was compiled or less important…

pshanew · August 22, 2020, 8:46pm

The lowest time duration of the filters in those tables in the dashboard is seven days so it doesn’t need to track the actual time played, just the date.

OttoKerner · August 22, 2020, 8:47pm

These could very well be caused by a server bug, yes

Volts · August 22, 2020, 8:47pm

I’m not convinced these are “play history” at all.

I think they’re truly statistics. Rolled up into daily/weekly/monthly, periodically, so that the database doesn’t (shouldn’t) get gigantic.

The schema for statistics_media doesn’t support the idea that it’s tracking per-user/per-item history.

id|account_id|device_id|timespan|at|metadata_type|count|duration

danielg1976 · August 22, 2020, 9:00pm

Yes, that was starting to dawn on me as well. So I think there might be a server bug.

Millions of entries all have null account ID, our Living Room Sony Bravia TV (android) as the device. Three entries over and over again in various combinations (sometimes two of them repeat for a few million):

row, account ID, deviceid, timespan, at, metadata_type, count, duration
9968161	NULL	10	1	2020-08-01 00:00:00	1	0	0
9968162	NULL	10	2	2020-08-10 00:00:00	1	0	0
9968163	NULL	10	0	2020-01-01 00:00:00	1	0	0

We may have triggered it with a change in how we started using Plex. One day our internet went out for like 8 hours, some backhoe or something nearby. Even though I’d setup our server according to the article about offline access, many of the Plex clients around the house wouldn’t work. That when we found out that if we signed them out (using my phone’s hotspot) they worked much better with the internet down. We left most of our clients signed out, and just signed the server in when the internet came back up - just in case we had more internet outages. I wonder if signed out clients might be more likely to trigger bugs?

Volts · August 22, 2020, 9:03pm

I think metadata_item_views is where per-item watch history is stored.

Volts · August 22, 2020, 9:08pm

There was also a recent Plex authentication outage. That may also have helped cause/trigger/exacerbate something.

Otto mentioned that there is a bug. Maybe it’s not “supposed” to be writing the NULL at all, or maybe it isn’t doing the roll-up/clean-up/delete completely on them?

pshanew · August 22, 2020, 9:12pm

Indeed. I just ran:
SELECT grandparent_title, title, viewed_at FROM metadata_item_views WHERE grandparent_title LIKE 'Stargate SG-1';

And it returned all the views I’d expect, including repeats.

Volts · August 22, 2020, 9:26pm

Agreed, and I just did almost exactly the same thing to confirm. I was joining accounts to be fancy but you beat me to it.

I don’t think partially-watched item status is stored there, though. Now I’m curious about that too.

Edit: Move Viewstate/Ratings from One Install to Another | Plex Support

metadata_item_settings apparently. Wouldn’t have guessed from the name.

OttoKerner · August 22, 2020, 9:36pm

This might be caused by the DLNA client on that device. Some have a feature to build their own library by scanning all available DLNA servers in the local network.
If you cannot find a preference to switch that off, you might have to disable the Plex DLNA server.

danielg1976 · August 22, 2020, 9:40pm

Yes, I think the statistics table should actually be relatively small. metadata_type seems to just apply to very broad things like video (‘1’), TV (‘4’), special features & movies in the photo library (‘12’), pictures (‘13’), picture folders (‘14’), etc.

danielg1976 · August 22, 2020, 9:42pm

I did notice that in the DLNA logs, there is constant chatter from some clients, like my Sony BluRay players (I should disconnect them from the network…). Things like the smart TVs, not much choice to leave them on. I’ll look more into client DLNA settings and consider turning it off until such time as I really need it.

Topic		Replies	Views
Slow library performance due to increasingly growing database Plex Media Server server-linux	77	2788	May 23, 2022
Database size increasing significantly above the ~100MB Expectation General Discussions	43	534	January 8, 2020
Slow Query time - unable to scan or play media Plex Media Server server-linux	65	1695	February 22, 2022
DB locks and wait times Desktops & Laptops server-linux , server-docker	56	994	March 13, 2022
Database corruption issues on a database only a couple of days old General Discussions	21	476	January 8, 2020

Plex Database sqlite file has more than doubled in 10 days

Related topics