Fix I/O freezes on Linux with large RAM
TL;DR
According to this pull request, set vm.dirty_bytes to 268435456 and vm.dirty_background_bytes to 134217728, which is the same as 256 MiB and 128 MiB.
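As a quick sanity check on those two numbers, here is a minimal sketch that converts the MiB sizes to bytes and prints them in the key = value form that /etc/sysctl.conf uses. The helper names are made up for illustration; only the two sysctl keys and their values come from the pull request.

```python
# Convert MiB sizes into the byte counts that the vm.dirty_*_bytes sysctls expect.
MIB = 1024 * 1024


def mib_to_bytes(mib: int) -> int:
    """Return the number of bytes in `mib` mebibytes."""
    return mib * MIB


# Values from the TL;DR: 256 MiB hard limit, 128 MiB background limit.
settings = {
    "vm.dirty_bytes": mib_to_bytes(256),
    "vm.dirty_background_bytes": mib_to_bytes(128),
}


def render_sysctl(values: dict) -> str:
    """Format the settings as lines suitable for /etc/sysctl.conf."""
    return "\n".join(f"{key} = {value}" for key, value in values.items())


print(render_sysctl(settings))
```

The printed lines can be pasted into /etc/sysctl.conf and loaded with sysctl -p (as root), or picked up on the next boot.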
I never had any issues on my Linux laptop, at least not until I installed Steam on it. Linux gaming was really taking off around the time I bought my gaming laptop. Thanks to Valve and Proton, gaming on Linux has become more and more reliable since the Steam Deck release, and I can enjoy almost every game in my Steam library. But when I started downloading games, I noticed random freezes during downloads and game installation.
After some investigation I realised it was most likely because my BTRFS + cryptsetup setup bottlenecks my NVMe performance. However, back when I was deciding how to encrypt my BTRFS partition, I had already done the research: based on my calculations, normal I/O usage should not cause freezes, so it should have been totally fine. Hence I turned my attention to dirty pages.
Dirty page, how dirty is that?
As for how dirty pages work, this StackOverflow answer describes it in much more detail, and better than I can.
I'm not going to explain all the technical details of how dirty pages work in Linux, but I can paint you a picture. Think of dirty pages as a bit like the pagefile in Windows, except that instead of creating a pagefile.sys inside your C:\, or a swapfile that sits on your partition like on Linux, the kernel keeps written data cached in RAM, and once the dirty pages pass a threshold, the kernel starts a flusher to write them out to disk. By default, dirty_ratio is 20 percent of your RAM and dirty_background_ratio is 10 percent (strictly speaking, of available memory rather than total RAM). In my case that is around 4.8 GB for dirty_background_ratio and 9.6 GB for dirty_ratio, based on the 48 GB of RAM in my laptop. No HDD can handle 4.8 GB of writes per second, and neither can my encrypted NVMe drive, so eventually the kernel stalls the writing processes and waits for the flusher to clean out the dirty pages.
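To make the arithmetic concrete, here is a rough sketch of those thresholds. It assumes decimal gigabytes and treats the ratios as a share of total RAM (the kernel actually uses available memory, so real values are somewhat lower). The 500 MB/s write speed is a hypothetical figure for illustration, not a measurement from my laptop.

```python
GB = 1000 ** 3  # decimal gigabytes, matching the rounded figures in the text


def dirty_thresholds(ram_bytes, background_ratio=10, dirty_ratio=20):
    """Approximate the two thresholds as a percentage of total RAM."""
    background = ram_bytes * background_ratio // 100
    limit = ram_bytes * dirty_ratio // 100
    return background, limit


def seconds_to_flush(dirty_bytes, write_bytes_per_sec):
    """Time needed to write out `dirty_bytes` at a sustained write speed."""
    return dirty_bytes / write_bytes_per_sec


ram = 48 * GB
background, limit = dirty_thresholds(ram)
print(f"background flushing starts at ~{background / GB:.1f} GB")
print(f"writing processes are throttled at ~{limit / GB:.1f} GB")

# Hypothetical sustained write speed of an encrypted drive: 500 MB/s.
assumed_write_speed = 500 * 1000 ** 2
print(f"draining the full limit takes ~{seconds_to_flush(limit, assumed_write_speed):.0f} s")
```

Even with a generous write speed, several gigabytes of pending writes take many seconds to drain, which is the window in which the system appears frozen.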
How to prevent I/O freeze
My first thought was to reduce the dirty ratio percentages, but you cannot just lower the values and hope that fixes everything. Zeroing out both dirty_ratio and dirty_background_ratio simply makes the system unusable, because the kernel flusher is constantly writing data to disk and the system freezes almost all the time. So the ideal is to find values for dirty_background_ratio and dirty_ratio that let the flusher process dirty pages without hanging every other process. However, I did not know how to accomplish this, because I had no idea how the calculation behind dirty_ratio works, even after a few days of googling around. Moreover, dirty_ratio and dirty_background_ratio do not accept decimal values, so the lowest usable values would be around 480 MB (1%) and 960 MB (2%).
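The granularity problem can be sketched like this: given a target size, what percentage would be needed? The example assumes 48 GB of RAM in decimal units and, as before, treats the ratio as a share of total RAM. The 256 MiB target is only an illustrative size.

```python
def ratio_needed(target_bytes, ram_bytes):
    """Percentage of RAM that corresponds to `target_bytes`."""
    return 100 * target_bytes / ram_bytes


ram = 48 * 1000 ** 3        # 48 GB, decimal units (assumption)
target = 256 * 1024 ** 2    # illustrative target of 256 MiB

exact = ratio_needed(target, ram)
smallest_step = ram * 1 // 100  # size of one whole percent, the finest step a ratio allows

print(f"ratio needed for the target: {exact:.3f} %")
print(f"smallest non-zero ratio (1 %) already means {smallest_step / 1000 ** 2:.0f} MB")
```

Any target below one whole percent of RAM simply cannot be expressed with the ratio settings on a machine with this much memory.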
Percentage or Bytes?
From reading the Linux kernel documentation, I realised there is an alternative to dirty_ratio called dirty_bytes. This means I can limit dirty pages to a precise size, which is more reliable in my case. After trying different sizes of dirty_bytes without success, I decided to dig into various Linux distros, hoping to find one that sets dirty_bytes by default instead of dirty_ratio.
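According to the kernel documentation, only one of dirty_ratio and dirty_bytes is in effect at a time: writing one makes the other read back as 0. A small sketch for checking which one is currently active, assuming the usual /proc/sys/vm location (the function names are my own):

```python
from pathlib import Path

NAMES = (
    "dirty_ratio",
    "dirty_bytes",
    "dirty_background_ratio",
    "dirty_background_bytes",
)


def read_dirty_settings(base="/proc/sys/vm"):
    """Read the dirty-page sysctls that exist under `base` as integers."""
    values = {}
    for name in NAMES:
        path = Path(base) / name
        if path.is_file():
            values[name] = int(path.read_text().strip())
    return values


def active_limit(values, prefix="dirty"):
    """Describe the active limit: the *_bytes value wins when it is non-zero."""
    byte_value = values.get(prefix + "_bytes", 0)
    if byte_value:
        return f"{byte_value} bytes"
    ratio_value = values.get(prefix + "_ratio", 0)
    return f"{ratio_value} %"


current = read_dirty_settings()
if current:  # empty on systems without /proc/sys/vm
    print("hard limit:", active_limit(current, "dirty"))
    print("background limit:", active_limit(current, "dirty_background"))
```

On a stock system this usually reports the percentage form; after setting dirty_bytes it should report the byte value instead.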
Solution: thank you, random stranger!
After a while I found that the Pop!_OS GitHub has a similar issue. Moreover, there is a pull request that uses dirty_bytes. I put those values into my /etc/sysctl.conf and the system is almost freeze-free!
Random freezes still happen from time to time when installing a game or transferring large files, but I'm happy with what I have now. If someday I decide to remove cryptsetup from my system and stick to unencrypted BTRFS (which requires a full wipe), I will probably never get a freeze again.
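To see whether the remaining freezes line up with a pile-up of dirty pages, the Dirty and Writeback counters in /proc/meminfo can be watched during a large download. A minimal sketch, assuming the standard /proc/meminfo layout with values in kB:

```python
def parse_meminfo(text):
    """Return the Dirty and Writeback counters (in kB) from /proc/meminfo-style text."""
    wanted = {"Dirty", "Writeback"}
    result = {}
    for line in text.splitlines():
        key, _, rest = line.partition(":")
        if key in wanted:
            # The value looks like "   123456 kB"; keep only the number.
            result[key] = int(rest.split()[0])
    return result


try:
    with open("/proc/meminfo") as handle:
        counters = parse_meminfo(handle.read())
    for name, kilobytes in counters.items():
        print(f"{name}: {kilobytes / 1024:.1f} MiB")
except OSError:
    # Not on Linux, or /proc is unavailable; nothing to report.
    pass
```

Running this in a loop (for example with watch) while a game installs shows how close the Dirty value gets to the configured limit.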