Plex Database sqlite file has more than doubled in 10 days

I’d rather go with the above sugggestion to

simply because the “watched” history is a valuable tool for many.
You can use it to decide which items you want to keep and which ones you rather delete to make room for new stuff.
Or in the case of music, you can create some clever smart playlists using this data.

Oh agreed! And that’s a good point. There are lots of reasons that you might care about that data. I don’t mean to imply that nobody does or should.

I haven’t done any surgery (or butchery) on it in a while, and mine’s still a reasonable size at the moment. It’s possible that I was experiencing something more like what @pshanew mentioned, and that a more delicate approach would have worked too. :slight_smile:

Yeah, it was really odd. Whenever I remove media from my library (and sometimes just 'cause) I perform a manual database optimization. It normally takes seconds to complete (small libraries), but on this occasion it took minutes. And during that time, the web client was extremely unresponsive. That prompted my investigation of the database.

I don’t care much for those statistics myself, and since it that table doesn’t seem to be a requirement for other tables, I didn’t mind deleting the records. Had it not worked, I likely would have deleted that table’s entire contents.

I still have no idea what, if any, client was causing the issue.

As I read the various comments I still have one nagging question - what exactly is being tracked by this table? How are the statistics ‘manifest’ or ‘viewed’ in Plex? Is it really just watch history by client?

What I’ve collected so far is:

  • It gets written to often if any client is open
  • It is used for watch history, which helps understand usage and what is and isn’t being watched
  • Watch history can be used to make smart playlists (sounds intriguing, like a playlist of everything I have listened to within the last 30 days?)

My experience is that, under normal circumstances, this table is only written to when media is played, not just by virtue of having a client open.

It tracks the type of media played, the account that played it, a count, and a duration. I believe it is used to populate the Top Users and Play History tables in the web client’s dashboard.

1 Like

That’s statistics_bandwidth. An open client checks in frequently, which causes a little bit of bandwidth usage.

The statistics_* tables are definitely used to populate the Bandwidth and Play History graphs on Status → Dashboard.

I’m also curious what else they support/enable.

Oooh I see Otto typing. Waiting now …

I keep pondering this.

Okay, so if it is watch history, how come 99.9% of the entries are all written with a timestamp of the first day of the month, at midnight? In fact, I can’t find an entry that isn’t written with a midnight timestamp. Wouldn’t they have much more random timestamps, like when we actually play media? I’m sure our household doesn’t all of a sudden watch millions of streams at exactly midnight.

The excessive number of entries is a bug.

Precisely. Or the opposite, to avoid hearing the same tracks too often.
See [HowTo] Create Playlists

Wait, I think i realize that since these are STATS they are being compiled by plex server? Not necessarily being written right when something happens? So in that case the exact timestamp may be when data was compiled or less important…

The lowest time duration of the filters in those tables in the dashboard is seven days so it doesn’t need to track the actual time played, just the date.

These could very well be caused by a server bug, yes

I’m not convinced these are “play history” at all.

I think they’re truly statistics. Rolled up into daily/weekly/monthly, periodically, so that the database doesn’t (shouldn’t) get gigantic.

The schema for statistics_media doesn’t support the idea that it’s tracking per-user/per-item history.

id|account_id|device_id|timespan|at|metadata_type|count|duration

Yes, that was starting to dawn on me as well. So I think there might be a server bug.

Millions of entries all have null account ID, our Living Room Sony Bravia TV (android) as the device. Three entries over and over again in various combinations (sometimes two of them repeat for a few million):

row, account ID, deviceid, timespan, at, metadata_type, count, duration
9968161	NULL	10	1	2020-08-01 00:00:00	1	0	0
9968162	NULL	10	2	2020-08-10 00:00:00	1	0	0
9968163	NULL	10	0	2020-01-01 00:00:00	1	0	0

We may have triggered it with a change in how we started using Plex. One day our internet went out for like 8 hours, some backhoe or something nearby. Even though I’d setup our server according to the article about offline access, many of the Plex clients around the house wouldn’t work. That when we found out that if we signed them out (using my phone’s hotspot) they worked much better with the internet down. We left most of our clients signed out, and just signed the server in when the internet came back up - just in case we had more internet outages. I wonder if signed out clients might be more likely to trigger bugs?

1 Like

I think metadata_item_views is where per-item watch history is stored.

1 Like

There was also a recent Plex authentication outage. That may also have helped cause/trigger/exacerbate something.

Otto mentioned that there is a bug. Maybe it’s not “supposed” to be writing the NULL at all, or maybe it isn’t doing the roll-up/clean-up/delete completely on them?

Indeed. I just ran:
SELECT grandparent_title, title, viewed_at FROM metadata_item_views WHERE grandparent_title LIKE 'Stargate SG-1';

And it returned all the views I’d expect, including repeats.

1 Like

Agreed, and I just did almost exactly the same thing to confirm. I was joining accounts to be fancy but you beat me to it.

I don’t think partially-watched item status is stored there, though. Now I’m curious about that too. :slight_smile:

Edit: Move Viewstate/Ratings from One Install to Another | Plex Support

metadata_item_settings apparently. Wouldn’t have guessed from the name.

1 Like

This might be caused by the DLNA client on that device. Some have a feature to build their own library by scanning all available DLNA servers in the local network.
If you cannot find a preference to switch that off, you might have to disable the Plex DLNA server.

Yes, I think the statistics table should actually be relatively small. metadata_type seems to just apply to very broad things like video (‘1’), TV (‘4’), special features & movies in the photo library (‘12’), pictures (‘13’), picture folders (‘14’), etc.

I did notice that in the DLNA logs, there is constant chatter from some clients, like my Sony BluRay players (I should disconnect them from the network…). Things like the smart TVs, not much choice to leave them on. I’ll look more into client DLNA settings and consider turning it off until such time as I really need it.