Add two settings to adjust thresholds for Skip Silence

norgan · January 10, 2024, 6:41pm

Feature description:

The Skip Silence feature (as far as I can tell) currently skips all silences completely, even very short ones. My idea is to add two settings that set thresholds for how the Skip Silence feature works…

The first setting would allow a user to set a threshold for how long a period of silence needs to be in order for it to be skipped. For example, a “Silence Threshold” setting of “2 seconds” would mean that silences that only last 1.5 seconds are not skipped, while silences lasting longer than 2 seconds will be skipped as normal.

The second setting would allow a user to set an amount of silence that will not be skipped (apologies, I don’t have a nice user-friendly name for this at the moment, maybe something like “Keep X seconds of silence”?). Basically, if there is a period of silence lasting 30 seconds that will be skipped, this value could be set to something like 2 seconds, which would result in only 28 seconds of the silence being skipped, with 2 seconds of silence being left in.

These two settings can be used in conjunction with each other to achieve an effect where, as an example, all silences that are longer than 5 seconds will effectively be truncated down to 1 second, and any silences shorter than 5 seconds will be left alone and not skipped.
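
To make the combined behavior concrete, here is a minimal sketch in Python. It is only an illustration of the request, not how the app works: the function name `plan_skips`, its parameters, and the choice to leave the kept silence at the end of each gap are all assumptions made for the example.

```python
def plan_skips(silences, threshold_s, keep_s):
    """Return the (start, end) spans, in seconds, that would be skipped.

    silences:    list of (start, end) times for each silent stretch.
    threshold_s: silences no longer than this are left alone ("Silence Threshold").
    keep_s:      seconds of silence left in place for each skipped stretch
                 ("Keep X seconds of silence"). Assumption: kept at the end.
    """
    skips = []
    for start, end in silences:
        if end - start <= threshold_s:
            continue  # too short: an intentional pause, do not touch it
        skip_end = end - keep_s
        if skip_end > start:
            skips.append((start, skip_end))
    return skips


# A 30-second gap and a 1.5-second pause, with a 5-second threshold
# and 1 second of silence kept.
print(plan_skips([(100.0, 130.0), (400.0, 401.5)], threshold_s=5.0, keep_s=1.0))
# -> [(100.0, 129.0)]
```

The 30-second gap is cut down to 1 second, and the 1.5-second pause is untouched.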

Problem solved:

I’ve read in some other threads that the Skip Silence feature is intended for audio books, but it incidentally has been very nice for skipping long periods of silence that often separate a bonus track on the last track of an album.

Problem 1 is: the Skip Silence feature currently skips the silence so completely that the bonus track kicks in very abruptly with no audible separation. Allowing a setting to leave a second or two of silence would allow these long periods of silence (sometimes minutes of silence) to be skipped, but also leave in a second or two of breathing room.

Problem 2 is: the Skip Silence feature currently skips even very short periods of silence, which can often ruin the pacing of audio tracks with intentional short periods of silence, for example some albums with dialog or skit tracks where speaking and/or musical hits are sparse and separated, but their timing is important for musical or humorous effect. A setting for how long a silence needs to be in order to be skipped would let users, at their own discretion, keep these shorter intentional silences from being ruined.

I have an example track that I can share in which the Skip Silence feature is very useful due to a long silence separating the bonus track, but also negatively affects the bonus track’s intro due to how aggressive/sensitive the skipping is.

Brought benefits:

I believe this would make the Skip Silence feature useful for more use cases beyond the intended audio book use case, and allow users to fine-tune the behavior to their own preference. It would certainly make albums with silence-separated bonus tracks more enjoyable to listen to, while not negatively affecting the original use case of audio book listening.

Additional description and context:

I’ve attached two Audacity screenshots that demonstrate the ultimate effect my feature request is hoping for. They focus on a large gap of silence leading into a bonus track. I added the red sections to show the parts of the song that the Skip Silence feature will actually skip.

You’ll notice the bit around 6:40 where there are very short bits of silence that currently get skipped. A “Silence Threshold” setting of 5 seconds could prevent these short bits of silence from being skipped, as shown in the second screenshot.

You’ll also notice the large periods of silence are currently skipped so completely that there is no breathing room while listening. While listening to this track with the current Skip Silence feature, itsoundslikewhatasentencelookslikewithoutanyspaces. A “Leave X Seconds of Silence” setting of 2 could prevent this collapse and make it a much more enjoyable experience while listening, while still skipping almost all of the silence. You can see this in the 2nd screenshot around 5:30, 6:00, and 6:30.

I hope this helps explain what I’m talking about.

Screenshots / Mockup:

Tolriq · January 11, 2024, 8:24am

The skip works at the audio packet level, not the file level, so it does not know the future.

What’s possible is a configurable minimal duration before audio is considered silence (current default is 150ms),
and a configurable padding silence that must be lower than that minimal duration (current default is 20ms).

So the padding would be possible, but before the skip, not after it as in your image.

And you need to understand that high values like 5 seconds mean that the audio is delayed in order to process that much data before taking action. In most cases this is not really a problem, as devices process data way faster than actual playback, but on some devices with high-bitrate files this can cause issues.
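
The constraints above can be sketched roughly as follows in Python. This is a simplified illustration, not the player's actual code: the function `skip_silence`, its parameter names, and the sample-by-sample loop are assumptions made for the example. It shows why quiet samples have to be held back until the minimal duration is reached, and why the padding ends up before the skip.

```python
def skip_silence(samples, threshold_level, min_silence, padding):
    """Simplified streaming silence skipper over 16-bit PCM sample values.

    threshold_level: a sample with an absolute value below this counts as quiet.
    min_silence:     number of consecutive quiet samples before skipping starts.
    padding:         number of quiet samples kept before the skip.
    """
    if padding >= min_silence:
        raise ValueError("padding must be lower than min_silence")
    out = []
    held = []         # quiet samples waiting for a decision (this is the delay)
    skipping = False  # True once the current quiet run is confirmed as silence
    for s in samples:
        if abs(s) >= threshold_level:
            out.extend(held)  # run was too short: keep it intact
            held.clear()
            skipping = False
            out.append(s)
        elif skipping:
            continue          # confirmed silence: drop the sample
        else:
            held.append(s)
            if len(held) >= min_silence:
                out.extend(held[:padding])  # keep the padding, drop the rest
                held.clear()
                skipping = True
    out.extend(held)  # end of input: flush whatever is still undecided
    return out


samples = [1000, 0, 0, 1000, 0, 0, 0, 0, 0, 0, 1000]
print(skip_silence(samples, threshold_level=10, min_silence=4, padding=1))
# -> [1000, 0, 0, 1000, 0, 1000]
```

In this sketch the two-sample pause survives, and the six-sample gap collapses to one padding sample. For scale, at a 44.1 kHz sample rate 150ms is 6,615 samples per channel and 20ms is 882, while a 5-second minimum would mean holding 220,500 samples per channel before any decision.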

norgan · January 16, 2024, 7:40pm

Hey Tolriq,

Thanks for the reply and explanation. I certainly was assuming the silence skipping feature had more foresight than it actually does. In fact, I was testing this out on a song with a much larger section of silence, and indeed I saw it was repeatedly skipping only as far as what was buffered, until the whole silent section was eventually buffered and skipped over the course of a few seconds.

The two configurations you described sound exactly like what I was hoping for, and with better names too! I haven’t found anything like them in any of the settings screens, so I assume they are not exposed to us end users currently. Now that I know better how the skip feature works, though, I understand even if I could change those values, it may not work as smoothly as I hope for my use case. And I understand that allowing users to change those values can potentially create side effects that are not obvious to end users and may even function inconsistently for users with different hardware and connections. So with all of that in mind, I completely understand if you feel it is best to not expose those settings within the app.

I have one other thought that may be less imposing but may still address what I’m hoping for. Some of the bits being skipped too aggressively in the songs I’ve tried out aren’t actually perfectly silent, but rather are only very quiet. I understand that every audio file will have a different noise floor due to its encoding as well as its mastering, and audiobooks in particular tend to be a bit noisy, so the dB threshold your silence detector uses is of course configured appropriately. Instead of the two settings I originally asked about, would it be more reasonable to instead allow users to adjust the dB threshold for what is treated as silence?

Anyway thanks again for your work on Symfonium and your open communication and patience here. I really appreciate it.

Tolriq · January 16, 2024, 8:06pm

The threshold is configured via silenceThresholdLevel, which is defined as “The absolute level below which an individual PCM sample is classified as silent.” That makes the value quite abstract to explain in a settings screen.

splinter · January 16, 2024, 8:46pm

Just out of interest: what kind of values are allowed for that setting? Does that differ with the bitrate?

Tolriq · January 17, 2024, 6:46am

It’s a short between 0 and 32767. I don’t think bitrate has an impact, but sample rate probably does.
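
For anyone who thinks in dB rather than raw sample values, the two can be related with the usual amplitude formula, dB = 20 · log10(level / full scale). The sketch below is only an illustration: the helper names are made up, it assumes 32767 as full scale for 16-bit PCM, and the level 1024 is just an example value.

```python
import math

FULL_SCALE = 32767  # assumption: largest positive 16-bit PCM sample value


def level_to_dbfs(level):
    """Convert an absolute sample level (1..32767) to dB relative to full scale."""
    return 20 * math.log10(level / FULL_SCALE)


def dbfs_to_level(dbfs):
    """Convert a dB value relative to full scale (0 or lower) back to a sample level."""
    return round(FULL_SCALE * 10 ** (dbfs / 20))


print(round(level_to_dbfs(1024), 1))  # about -30.1 dB
print(dbfs_to_level(-40.0))           # roughly 328
```

A settings screen could in principle show the dB figure and store the raw level, but that is a design idea, not something the app does today.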

splinter · January 17, 2024, 5:17pm

No, sample rate makes no sense, because the value affects each individual PCM sample. So no matter if 48k or 96k samples per second, they all will be affected.

But the PCM bit depth (not the same as e.g. mp3 bitrate - sorry, I chose the wrong term above) actually determines how many different values a sample can have, and by that how many different values the amplitude of the wave signal can have, so effectively how many volume steps there are in a sample. With 16 bits you have 65536 different values, but as values could be above or below the zero line, you have 32768 values in each phase direction. For the volume it is not relevant if the sample is above or below the zero line. So this value seems to be adjusted for 16 bit (L)PCM, and it defines the value of deviation from the zero line which should still be treated as silence.
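
The arithmetic above can be checked with a few lines of Python. The helper names are hypothetical, and `rescale_threshold` is only an illustration of why a level tuned for 16 bit would not mean the same thing at another bit depth; nothing here is taken from the player's code.

```python
def sample_range(bit_depth):
    """Return the (min, max) signed integer values for a given PCM bit depth."""
    total = 2 ** bit_depth  # 65536 distinct values for 16 bit
    return -(total // 2), total // 2 - 1


def is_silent(sample, threshold_level):
    """A sample is quiet when its distance from the zero line is below the level."""
    return abs(sample) < threshold_level


def rescale_threshold(level_16bit, bit_depth):
    """Hypothetical: the level with the same relative loudness at another bit depth."""
    return level_16bit * 2 ** (bit_depth - 16)


print(sample_range(16))             # (-32768, 32767)
print(is_silent(-500, 1024))        # True: polarity does not matter
print(rescale_threshold(1024, 24))  # 262144
```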

After searching for the description you quoted it is no surprise anymore:

An AudioProcessor that skips silence in the input stream. Input and output are 16-bit PCM.

(from: SilenceSkippingAudioProcessor | Android Developers)

Does that mean that ExoPlayer internally deals with 16 bit PCM only? I don’t want to open that box, might just be interesting for the hi-res folks.

Tolriq · January 17, 2024, 5:23pm

No, just that some processors only work when the data is 16 bit.