Tracks with null duration make the whole Subsonic scan fail

Issue description:

Hi there!

I have quite a large federated library on Funkwhale (~60k tracks), and sometimes newly imported tracks are given a null duration for reasons I don’t get.

The issue is that whenever such tracks are encountered, the whole scan hangs and fails, until the field is set to an integer value (even 0 works) in the database.

Here is a real example from 2025; the behaviour is the same with the latest version. Take the following excerpt from the Funkwhale Subsonic API:

{
  "id": 13615,
  "artistId": 15418,
  "name": "Allatte demo",
  "artist": "musichette",
  "created": "2025-08-11T21:41:53.000Z",
  "duration": null,
  "playCount": 0,
  "coverArt": "al-13615",
  "genre": "",
  "year": 2025,
  "songCount": 1
}

The fact that duration is null throws this exception in Symfonium:

2025-10-25 01:50:59.663 Error/SubsonicLogger: Error (0/3)
a40.h: Expected an int but was NULL at path $.subsonic-response.searchResult3.album[120].duration

If I manually change the duration (in the Funkwhale DB) with:

update music_upload set duration=0 where duration is null;

the scan works properly.


I understand that the root issue is that duration == null. Should not happen. But could Symfonium be more permissive in such events?

Thanks!

Logs:

I hope I gave enough details in the issue; otherwise I can manually reproduce the issue.

Media provider:

Subsonic

Symfonium relies on the provider giving correct data. As tracks cannot have a null duration, you’d be better off getting Funkwhale to sort out their scanner - once Funkwhale stops sending duff data then Symfonium will work.

@evilnick I understand your point. I wish I could do something, however it is maybe even beyond Funkwhale, i.e. some users uploading music with missing ID3v2 tags. Due to the federated nature of Funkwhale, this cannot be fixed easily (besides inserting a trigger in the database, which is dirty and ephemeral).

But I do not fully agree with your position. Maybe this is more of a philosophical debate. But when I code or design a system, I do it with the idea in mind that users can submit garbage. Nobody disagrees with handling duff data when it comes to SQL injections.

But beyond the philosophical stance, it comes down to the actual users of Symfonium. I hope we agree that user experience is important. So, are principles more important than user experience?

I think it is worth thinking about it.

I won’t bother you much if we do not agree afterwards. Really, my message is just a sincere, non-angry thought.

APIs have definitions for a reason. If servers send random data that does not respect an API, it opens the door to everything, which is the reason for many hacks in many places BTW.

They can easily fix this by returning zero when the database contains null. If they do not sanitize their output, anything can be sent, not only null, and that could be problematic.

This is something they can easily fix if you report it on their GitLab; they fixed a couple of other issues I reported there. (They also never fixed some other issues I reported there :p)
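
To illustrate the kind of output sanitization meant here, below is a minimal Python sketch. It is not Funkwhale's actual serializer code: the function name `sanitize_album` is hypothetical, and the field names are taken only from the JSON excerpt earlier in this thread.

```python
from typing import Any


def sanitize_album(album: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of the payload with a null or missing duration set to 0.

    Hypothetical helper: it mirrors the JSON excerpt from this thread,
    not Funkwhale's real serializer.
    """
    cleaned = dict(album)  # shallow copy, the input is left untouched
    if cleaned.get("duration") is None:
        cleaned["duration"] = 0
    return cleaned


album = {"id": 13615, "name": "Allatte demo", "duration": None, "songCount": 1}
print(sanitize_album(album)["duration"])  # prints 0
```

This has the same effect as the manual `update music_upload set duration=0 where duration is null;` workaround, but applied when the response is built, so it would survive the next federated sync.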

Hey, I feel for you, in this particular use case, but I still think you’re asking in the wrong place.

Clearly something is attempting to put invalid data into a place where an integer is supposed to be; and that something is where the fix should be made.

From what I can tell, Symfonium isn’t doing anything wrong, but Funkwhale is (or somehow the invalid user data that comes from Funkwhale is) so that’s where the fix should ideally be situated.

If Symfonium had to account for all the edge cases from all the other systems that it supports, it’d end up as a hackish mess, and nobody wants that to happen.

So, I wasn’t trying to come across as a big ol’ meanie, it’s (to me) more of a question of where the effort should be focused to come to the best solution for the situation you’re in.

I hope that makes sense!

@evilnick @Tolriq thanks for clarifying your position, I really appreciate your time.

To be fair I really get the idea that Symfonium should not implement dirty, ephemeral fixes for data that do not adhere to the API definition. It makes dirty code full of obscure conditionals. It would be like adding a database trigger on a particular instance each time data is wrong.

However, I realize that I did not make a proposition. The issue there is that a whole library scan can abort because of that exception. I don’t think that “monkeypatching” data is a good way.

But do you have any objection to continuing the scan even if one particular track is wrong, and simply ignoring it? In my case, a persistent, zombie federated instance keeps overriding the DB and only 8 tracks out of 50k have a null duration. What do you think about that?
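
As a rough sketch of what "continue and ignore" could look like, here is a small Python example. It says nothing about how Symfonium actually scans; `scan_albums` and the record layout are assumptions based on the JSON excerpt above. Records with a non-integer duration are set aside and reported rather than aborting the whole run.

```python
from typing import Any, Iterable


def scan_albums(
    albums: Iterable[dict[str, Any]],
) -> tuple[list[dict[str, Any]], list[Any]]:
    """Split records into imported ones and the ids of skipped ones.

    Hypothetical scan loop: a record is skipped when its duration
    is not an integer (for example null).
    """
    imported: list[dict[str, Any]] = []
    skipped: list[Any] = []
    for album in albums:
        duration = album.get("duration")
        # bool is a subclass of int in Python, so exclude it explicitly.
        if isinstance(duration, int) and not isinstance(duration, bool):
            imported.append(album)
        else:
            skipped.append(album.get("id"))
    return imported, skipped


records = [
    {"id": 13614, "duration": 215},
    {"id": 13615, "duration": None},
]
imported, skipped = scan_albums(records)
print(len(imported), skipped)  # prints: 1 [13615]
```

Returning the skipped ids keeps the gaps visible, so a user can at least see which records were left out and why.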

/me steps aside for @Tolriq :stuck_out_tongue:

Much easier to remove 8 tracks with bogus data no? :wink:

Data integrity :wink: What about skipping an artist ?

And easier to deal with a failed sync than trying to find why a couple of songs are missing.

@Tolriq yeah, it feels right. Should not happen often, and if it happens on an artist’s track, it is more likely to happen on others of the same “block”. Less dirty than having sparse sync at track level. And also more encouraging for users when such events happen :grinning_face_with_smiling_eyes:

Yeah, sure! But, as I mentioned earlier, the federated nature of Funkwhale makes it difficult to “remove” tracks. Those are not my tracks, but come from public instances and/or libraries that I or other users follow.

And if you think about it, solving this is much more difficult than it seems. Doing ad-hoc blacklisting is not a solution. Changing the local database is ephemeral, because it would be overwritten by the next sync. Contacting the responsible user so that they change the metadata… well, not sure it works. Asking the Funkwhale devs is probably the right way (I didn’t know that the Subsonic API was strict), but they could also object that users are responsible for their metadata.
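
For comparison, skipping at artist level could be sketched as follows in Python. Again this is only an illustration under assumptions: `scan_by_artist` is a hypothetical name, and the `artistId` and `duration` fields come from the JSON excerpt above, not from Symfonium's code. An artist is either imported completely or not at all.

```python
from collections import defaultdict
from typing import Any, Iterable


def scan_by_artist(
    albums: Iterable[dict[str, Any]],
) -> tuple[list[dict[str, Any]], list[Any]]:
    """Import an artist's records only if every one has an integer duration."""
    by_artist: dict[Any, list[dict[str, Any]]] = defaultdict(list)
    for album in albums:
        by_artist[album.get("artistId")].append(album)

    imported: list[dict[str, Any]] = []
    skipped_artists: list[Any] = []
    for artist_id, entries in by_artist.items():
        # type(...) is int rejects None, strings, floats and booleans.
        if all(type(entry.get("duration")) is int for entry in entries):
            imported.extend(entries)
        else:
            skipped_artists.append(artist_id)
    return imported, skipped_artists
```

The trade-off is the one discussed above: one bad record hides a whole artist, but the library never ends up with an artist whose tracks are only partly present.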

I inserted a trigger/procedure in my local database. It will do for now.

But, besides me, I feel that this is not the right angle. Data is messy in practice. Whether it should be or not is kind of speaking of an ideal, theoretical world. To me the right question is: given what’s around us, and users having to deal with it, what can be done to improve their experience without messing with the code? What do you think? :wink: