Music skip to next randomly

Yes, this is a debug message, meaning that the client closed the connection before the server could send the response.

Yes, this is a debug message, meaning that the client closed the connection before the server could send the response.

Thanks for the clarification !

Deleted HTTP_Frontend (because only here to redirect to HTTPS) and NAVIDROME_backend tunings, and drop values of HTTPS_Frontend :

frontend HTTPS_Frontend
    [...]
    timeout client 50s
    timeout http-request 10s
    timeout http-keep-alive 2s

Seems to works fine.

Digging up this old topic as I’m facing the same issue with Navidrome behind HAProxy, and the HAProxy configuration suggested above does not fix the issue for me. I had to spend a few hours to find a proxy configuration that works, and I am still not 100% happy with it.

I spent a few hours looking at the detailed timeline of the issue trying to understand exactly what is happening, and I do think there are a few things that could be changed in Symfonium to prevent the issue from happening or make it easier to troubleshoot and find the right configuration.

Bug reproduction timeline

All timestamps are in london timezone UTC+01:00. My phone is connected to my server via a public IP over 4G network. I am using the default debian configuration for HAProxy, in particular timeout client 50s.

0. 2024-08-30T23:04:04 Beginning of timeline

I add “song A” and then “song B” to an empty play queue in the Symfonium app, playback starts immediately.

1. 2024-08-30T23:04:04 Navidrome gets stream.view request

Navidrome gets original request for song A and starts streaming the file

From navidrome logs:

2024-08-30T23:04:04.477901+01:00 pierre-nuc navidrome[40260]: time="2024-08-30T23:04:04+01:00" level=debug msg="API: New request /rest/stream.view" client=Symfonium requestId=pierre-nuc/<REDACTED>-010121 username=pierre version=1.13.0
2024-08-30T23:04:04.480679+01:00 pierre-nuc navidrome[40260]: time="2024-08-30T23:04:04+01:00" level=debug msg="Found matching player" client=Symfonium id=<REDACTED> requestId=pierre-nuc/<REDACTED>-010121 type=Symfonium/Android username=pierre
2024-08-30T23:04:04.485201+01:00 pierre-nuc navidrome[40260]: time="2024-08-30T23:04:04+01:00" level=debug msg="Streaming RAW file" id=ad537e7727b74013f2290829888c532c originalBitrate=1336 originalFormat=flac path="/remote/library/Album/songA.flac" requestBitrate=0 requestFormat= requestId=pierre-nuc/<REDACTED>-010121 requestOffset=0 selectedBitrate=0 selectedFormat=raw
2024-08-30T23:04:04.511375+01:00 pierre-nuc navidrome[40260]: time="2024-08-30T23:04:04+01:00" level=info msg="Streaming file" artist="artistA" bitRate=0 cached=false format=raw originalBitRate=1336 originalFormat=flac requestId=pierre-nuc/<REDACTED>-010121 title="songA" transcoding=false user=pierre

2. 2024-08-30T23:04:37 Navidrome is done sending data for song A

We don’t get full details for every packet sent, but after about 33s navidrome seems to indicate that it is done sending the file

From navidrome logs:

2024-08-30T23:04:37.820005+01:00 pierre-nuc navidrome[40260]: time="2024-08-30T23:04:37+01:00" level=debug msg="HTTP: GET http://<REDACTED>/rest/stream.view?id=ad537e7727b74013f2290829888c532c&u=pierre&t=[REDACTED]&s=[REDACTED]&v=1.13.0&c=Symfonium&f=json" elapsedTime=33.34s httpStatus=200 remoteAddr="<REDACTED>" requestId=pierre-nuc/<REDACTED>-010121 responseSize=40172994 userAgent="Symfonium/11.1.0 (Linux;Android 11)"

Response seems to be successful (HTTP 200), and response size is 40172994 which is the exact size of the file (in bytes)

There is no “Request was interrupted” warning message in navidrome logs, or anything that indicates that HAProxy closed the connection on the server side at that point.

I could not find a relevant log line around that time in symfonium logs explicitly telling me what symfonium is doing with the data, but we see the periodic “Verbose/ExoPlayer: loading” which I assume means the data is being fed regularly into the decoder and buffered somewhere in memory.

3. 2024-08-30T23:05:28 Things go wrong

HAProxy logs the streamview request with success HTTP response code but indicates that the connection was terminated abruptly due to a timeout on the client side

From HAProxy logs:

2024-08-30T23:05:28.375441+01:00 pierre-nuc haproxy[51863]: <REDACTED>:48764 [30/Aug/2024:23:04:04.475] navidrome_front~ navidrome_back/navidrome 0/0/0/835/83900 200 33572384 - - cD-- 1/1/0/0/0 0/0 "GET https://<REDACTED>/rest/stream.view?id=ad537e7727b74013f2290829888c532c&u=pierre&t=<REDACTED>&s=<REDACTED>&v=1.13.0&c=Symfonium&f=json HTTP/2.0"

The cD code indicates (see: HAProxy Enterprise Documentation version 2.9r1 (1.0.0-328.464) - Configuration Manual)

  • c: the client-side timeout expired while waiting for the client to send or receive data.
  • D: the stream was in the DATA phase

This is consistent with the reported byte count of 33572384 wich is about 7MB short of the total file size, and this explains why we only get part of the song.

Looking at the timeout diagram here: Introduction to HAProxy Logging

  • The total active time (Ta) of ~84s reported for the http session is consistent with log timestamps: original request sent at 23:04:04, response finished sending by the server about 33s later, connection interrupted about 51s later.
  • The 51s value is suspiciously close to my client timeout in haproxy (50s).
  • The Haproxy measured server response time (Tr) of 835ms confirms that navidrome started sending at least some data fairly quickly. Most likely it sent the whole file as fast as it could and it took 32s in total which is sensible (phone is connected over 4G network).

I have rather blurry memories of how TCP and HTTP work, and I’m unsure exactly how HAProxy enforces the timeout, but to me this suggests that:

  • either the problem is a lack of “ack”: symfonium sent the last ack around the time the server finished sending the data. It could mean that it acked the first packet, or the first few packets but then stopped? I think this is how a TCP timeout would be implemented but I don’t think that is what is happening here.
  • Possibly, in HTTP mode, the problem is not that there was no ack at all for 50s but that it took more than 50s after the server stopped sending data for the client to confirm it received all the data (so to send all tcp acks, presumably). This could be mean that symfonium consumes the data (and sends individual tcp acks) too slowly even if it continues sending acks at regular intervals. This would be consistent with the UI behavior where the song loading bar progresses slowly, slightly ahead of the playing cursor on the “Now playing” view.

I could not find any error message suggesting the connection closing is detected in symfonium logs. Note that symfonium logs have many connection errors referring 192.168.1.43, this is unrelated (and, arguably, a separate bug in symfonium): I have two subsonic media providers configured, one with a public IP and one with a local IP, I have disabled the one with the local IP using the “filter” feature in the UI, so I can only view, select and play songs from the one with the public IP. The local IP is not reachable anyway since my phone is using 4G and not connected to wifi. I have no idea why symfonium still tries to hit that private IP but it probably should not.

4. 2024-08-30T23:06:16 Navidrome gets next stream.view request

Navidrome gets a request for the next song and starts streaming it.

From Symfonium logs:

2024-08-30 23:06:15.688 Verbose/ExoPlayer: loading [eventTime=26584.64, mediaPos=130.82, window=0, period=0, true]
2024-08-30 23:06:15.793 Verbose/ExoPlayer: loading [eventTime=26584.74, mediaPos=130.92, window=0, period=0, false]
2024-08-30 23:06:15.800 Verbose/ExoPlayer: loading [eventTime=26584.75, mediaPos=130.93, window=0, period=0, true]
2024-08-30 23:06:15.803 Verbose/ExoPlayer: timeline [eventTime=26584.75, mediaPos=130.93, window=0, period=0, periodCount=2, windowCount=2, reason=SOURCE_UPDATE
2024-08-30 23:06:15.804 Verbose/ExoPlayer:   period [240.51]
2024-08-30 23:06:15.805 Verbose/ExoPlayer:   period [?]
2024-08-30 23:06:15.807 Verbose/ExoPlayer:   window [240.51, seekable=true, dynamic=false]
2024-08-30 23:06:15.808 Verbose/ExoPlayer:   window [?, seekable=false, dynamic=false]
2024-08-30 23:06:15.810 Verbose/ExoPlayer: ]
2024-08-30 23:06:15.811 Verbose/MusicPlayer: onTimelineChanged: 1
2024-08-30 23:06:15.811 Verbose/FirebaseManager: Result: T999
2024-08-30 23:06:15.815 Verbose/MediaPrepareManager: getPreparedMedia[M]: PlayableMediaItem(source=MediaItem{externalId='dc1e695025fa74b09f999313b1ad8c5b', mediaType=Song, file='Album/songB.flac', title='songB', resume=0}, url=https://<REDACTED>/rest/stream.view?id=dc1e695025fa74b09f999313b1ad8c5b&u=REDACTED&t=REDACTED&s=REDACTED&v=1.13.0&c=Symfonium&f=json, externalThumbnail=, originalUrl=, urlIncludeResumePoint=false, urlIncludeResumePointValue=null, alreadyResolved=false, mimeTypeOverride=, payload=, cacheKey=, preparationError=false, preparationErrorMessage=, playlistPosition=1, repeatCount=0, headers=null, mediaFormat=null, lyrics=[], chapters=null, uuid=868459ea-9829-455b-bf3d-442545ce4158, supportSeeking=true, playsAsAdvertisement=false, isPublicStreamable=false)

From navidrome logs:

2024-08-30T23:06:16.906257+01:00 pierre-nuc navidrome[40260]: time="2024-08-30T23:06:16+01:00" level=debug msg="API: New request /rest/stream.view" client=Symfonium requestId=pierre-nuc/<REDACTED>-010139 username=pierre version=1.13.0
2024-08-30T23:06:16.909512+01:00 pierre-nuc navidrome[40260]: time="2024-08-30T23:06:16+01:00" level=debug msg="Found matching player" client=Symfonium id=<REDACTED> requestId=pierre-nuc/<REDACTED>-010139 type=Symfonium/Android username=pierre
2024-08-30T23:06:16.911215+01:00 pierre-nuc navidrome[40260]: time="2024-08-30T23:06:16+01:00" level=debug msg="Streaming RAW file" id=dc1e695025fa74b09f999313b1ad8c5b originalBitrate=1328 originalFormat=flac path="/remote/library/Album/songB.flac" requestBitrate=0 requestFormat= requestId=pierre-nuc/<REDACTED>-010139 requestOffset=0 selectedBitrate=0 selectedFormat=raw
2024-08-30T23:06:16.920520+01:00 pierre-nuc navidrome[40260]: time="2024-08-30T23:06:16+01:00" level=info msg="Streaming file" artist=artistB bitRate=0 cached=false format=raw originalBitRate=1328 originalFormat=flac requestId=pierre-nuc/<REDACTED>-010139 title="songB" transcoding=false user=pierre

This seems to be a logical consequence of (3) on the client side. Symfonium is still happily playing song A at that point (about 2:16 into the song) and it is processing the data it got from the server about 1 minute ahead of that, judging from the progress bar on the “now playing” screen. When the loading bar reaches about 3:12 out of the 4 minutes song, it runs out of data as the server didn’t send the rest (presumably it remained buffered in HAProxy, not sure, but navidrome claims to have sent it all in step (2)). Symfonium doesn’t detect this as an error but simply loads the next song B, while continuing to play song A as long as there is data left in the buffer.

5. 2024-08-30 23:07:20 Song interrupted in symfonium

Symfonium client is still happily playing song A, but when it reaches about 03:12 out of the 04:00 song, it runs out of data and skips to the next song.

Symfonium logs show:

2024-08-30 23:07:20.281 Verbose/ExoPlayer: positionDiscontinuity [eventTime=26649.23, mediaPos=0.01, window=1, period=1, reason=AUTO_TRANSITION, PositionInfo:old [mediaItem=0, period=0, pos=240512], PositionInfo:new [mediaItem=1, period=1, pos=0]]
2024-08-30 23:07:20.282 Verbose/MusicPlayer: onPositionDiscontinuity [0]
2024-08-30 23:07:20.283 Verbose/ExoPlayer: mediaItem [eventTime=26649.23, mediaPos=0.01, window=1, period=1, reason=AUTO]
2024-08-30 23:07:20.285 Verbose/ExoPlayer: tracks [eventTime=26649.23, mediaPos=0.01, window=1, period=1
2024-08-30 23:07:20.286 Verbose/ExoPlayer:   group [
2024-08-30 23:07:20.287 Verbose/ExoPlayer:     [X] Track:0, id=null, mimeType=audio/flac, pcmEncoding=21, channels=2, sample_rate=44100, supported=YES
2024-08-30 23:07:20.288 Verbose/ExoPlayer:   ]
2024-08-30 23:07:20.290 Verbose/ExoPlayer:   Metadata [
2024-08-30 23:07:20.291 Verbose/ExoPlayer:     VC: artist=artistB
2024-08-30 23:07:20.291 Verbose/ExoPlayer:     VC: TITLE=songB
2024-08-30 23:07:20.292 Verbose/ExoPlayer:     VC: ALBUM=Album
2024-08-30 23:07:20.292 Verbose/ExoPlayer:     VC: GENRE=Classical
2024-08-30 23:07:20.293 Verbose/ExoPlayer:     VC: TRACKNUMBER=06
2024-08-30 23:07:20.294 Verbose/ExoPlayer:     VC: ENCODER=Lavf59.27.100
2024-08-30 23:07:20.296 Verbose/ExoPlayer:   ]
2024-08-30 23:07:20.300 Verbose/ExoPlayer: ]
2024-08-30 23:07:20.300 Verbose/MusicPlayer: onTracksInfoChanged
2024-08-30 23:07:20.303 Verbose/PlaybackController: closePreparedForRendering

This seems to be the logical consequence of what happened at points (3) and (4). Note that if there are no other songs in the queue in such a case, Symfonium will continue “playing” song A, just playing silence until it reaches the 4 minutes mark.

There does not seem to be any relevant logs on the navidrome side, and in general it seems Navidrome is blissfully unaware of what is happening as it is just streaming the next song already at this point. We do see one warning for “Request was interrupted” about 10 seconds before the song skips:

2024-08-30T23:07:10.292217+01:00 pierre-nuc navidrome[40260]: time="2024-08-30T23:07:10+01:00" level=warning msg="Request was interrupted" endpoint=/rest/stream.view error="context canceled" requestId=pierre-nuc/<REDACTED>-010139

Given how late this happens in the process I suspect it is either unrelated, or a consequence of the problem, rather than a cause.

Questions and change suggestions

Regarding the client timeout in step (3) which is apparently the root cause of all our problems here:

  • Could symfonium send acks more regularly / or send all the acks faster (I’m not 100% sure exactly how client inactivity is defined in the data receiving phase, but there must be a way for the client not to be considered inactive) rather than wait for the song to progress as it seems to be doing?
  • Failing that, when the connection is closed abruptly by the remote endpoint, can this be detected in symfonium? Rather than silently ignoring it and moving to the next song: clearly log it, possibly retry loading the same song from where it failed, if the API allows that, rather than loading the next song?
  • Failing that, what is the recommended client timeout setting? I cannot seem to find a good value: for instance using timeout client 180s in my HAProxy frontend config makes the issue considerably less frequent, but it still happens for longer songs (> 7-8 mins). Maybe using a timeout longer than my longest song would work, but that does not seem very practical and seems to go against HAProxy configuration good practices.

There may be a way for Symfonium to fix it by addressing the problem at step (4): if it realises that it runs out of data at 03:15 for a song that it knows should last 4 minutes, Symfonium could request the data again?

Note that there may be other (possibly related) issues with how Symfonium handles the TCP connection. When I increase the client timeout in HAProxy, I see many connections terminated with “CD” in HAProxy logs:

2024-08-30T23:39:44.380166+01:00 pierre-nuc haproxy[51863]: <REDACTED>:48898 [30/Aug/2024:23:39:44.156] navidrome_front~ navidrome_back/navidrome 0/0/1/21/222 200 475408 - - CD-- 1/1/2/1/0 0/0 "GET https://<REDACTED>/rest/stream.view?id=dc1e695025fa74b09f999313b1ad8c5b&u=pierre&t=<REDACTED>&s=<REDACTED>&v=1.13.0&c=Symfonium&f=json HTTP/2.0"

The uppercase “C” code indicates: the TCP session was unexpectedly aborted by the client. This does not seem to have any adverse effect, but seems unexpected nonetheless, so pasting it here in case it can help with the investigation.

That’s it, I’ve been using symfonium happily for months without HAProxy, so I’m hoping this can help make it more proxy-friendly.

The loading is handled by Google media3 that is optimized a lot by them to ensure things works correctly and using okHttp that is the most used, stable and RFC compliant http client on Android.

Then it works with all other proxy and without proxy, and connection cut are properly detected with every other cases.

So there is something HAProxy does not do like all the other things do.

If you want to have Symfonium load everything without buffer, enable playback cache and configure song preloading.

But there’s not much I can do on my side without impacting battery usage a lot.

“It works fine with X so it must be Y” :wink: I suspect it works fine without a proxy, or with many default proxy configurations, because they have no client timeouts. This is however not a very nice proxy configuration and I’m worried it may create other issues with multiple users.

I do not think it very likely there is a bug in the implementation of the haproxy client timeout, so my operating assumtion here is that haproxy is doing the right thing and simply revealing the underlying problem. Of course it may be wrong, but let’s entertain the idea for a few minutes.

What makes me curious is why does it take 79 seconds for Symfonium to process data that has already been sent by the server (between steps (2) and (4)) in the event sequence above? It looks like it is waiting for the song to progress enough before reading the next batch of data from the socket, and possibly just reading this data from the underlying http library and storing it somewhere in memory would be enough to fix the problem?

Apologies if what i’m suggesting doesn’t make sense, it’s quite hard to debug stuff without looking at the code.

Note that for some reason the issue still happens when I enable playback cache, and I can’t really explain it. Presumably, with caching, all data would be read from the socket immediately and stored somewhere in heap memory, so maybe my idea above doesn’t really work. Hard to say as I haven’t done the sequencing work I pasted above with cache enabled.

I’m also curious to hear what configuration other people are using to get HAProxy to work nicely. Currently I’m using

timeout connect 5000 # debian default
timeout server  50000 # debian default
timeout client 600s

The 600s client timeout makes the issue mostly go away but theoretically it could still happen for a song longer than 10 minutes. I’m also wondering if it means that the proxy is frantically trying to re-transmit packets that it believes have been lost while waiting for the final acks.

The loader buffers around 1 minute of data but keep the connection up as per the spec, it seems that in all cases the proxy does not close the connection correctly due to the wrong timeout.

So there’s load to the server usually at most every 1m30, so the issue is not due to Symfonium not requesting data for 10 minutes, there’s no enough memory to buffer that much.

What I suspect is that Symfonium request the file, the proxy buffers a lot, then at some point timeout with Navidrome leading to this issue.
Navidrome probably have timeouts too to release the transcoding session if you transcode.

Check if there’s settings about the cache and pre load and how large file transfer is handled by the proxy.

And for the record, the only issue reported here is about Navidrome and HAProxy, so quite limited in scope.

Could also be some issue when switching between http2 and 1.1.

How does the loader ask for 1 minute of data? The navidrome server seems to send everything as part of the response to a single HTTP GET stream.view request in step (2).

By which mechanism are you expecting to limit the flow of data returned by the navidrome server? Is it on purpose that you don’t consume the data so that it is not acked and the server withholds it for you? Is that not going to cause wasteful TCP retransmissions?

On the other hand if the flow control mechanism is at the HTTP level I would expect HAProxy to handle it properly and not interfere, but I will dig a bit deeper into preload / file transfer configuration for HAProxy.

What do you think is the right timeout in that case?

Possible, but there is no sign of any timeout between HAProxy and the navidrome server, neither in the navidrome logs, nor in HAproxy logs, except one at 23:07:10 which is after the issue happened.

Indeed that was my first hypothesis but I think I managed to rule it out because I see the same issue when checking the “Slower downloads” option in Symfonium which I believe will force using only HTTP1.1.

You probably want to use wireshark and check more about how this works at low level for http.

For the timeout I meant handling, for every other proxy, Symfonium properly receive close events or bad EOF and not a proper close that make it think that the data is fully sent.

We are talking about transcoding so the data is transmitted until there’s a proper ACK to know the file is ended, if Media3 does not error then something is wrong at the network level.

And no slower downloads have nothing to do with playback.