Fixing I/O freezes on Linux machines with a lot of RAM

Photo by Nathaniel Sison / Unsplash

TL;DR

According to this pull request, set vm.dirty_bytes to 268435456 and vm.dirty_background_bytes to 134217728, which is 256 MiB and 128 MiB respectively.

I never had any issues on my Linux laptop, at least not until I installed Steam on it. Linux gaming was really taking off around the time I bought my gaming laptop. Thanks to Valve and Proton, gaming on Linux has become more and more reliable since the Steam Deck release, and I can enjoy almost every game in my Steam library. But when I started downloading games, I noticed random freezes during downloads and game installation.

After some investigation I realised the most likely cause was my BTRFS + cryptsetup setup bottlenecking my NVMe drive. However, back when I was deciding which encryption to set up for my BTRFS partition, I had already done the research, and based on my calculations normal I/O usage should not cause freezes. So I turned my attention to dirty pages.

Dirty pages, how dirty is that?

As for how dirty pages work, this StackOverflow answer describes it in much more detail, and better than I can.

I'm not going to explain all the technical details of how dirty pages work in Linux, but here is a rough picture. When a program writes to a file, the kernel first keeps the changed data in RAM as dirty pages instead of writing it to disk right away. Think of it as a write cache that lives in memory, rather than in a file like Windows' pagefile.sys on C:\ or a Linux swapfile on your partition. Once the amount of dirty data passes dirty_background_ratio, the kernel's flusher threads start writing it out in the background. Once it passes dirty_ratio, processes that are writing get blocked until enough dirty pages have been flushed. By default dirty_ratio is 20 percent and dirty_background_ratio is 10 percent, measured against roughly your total RAM. In my case that is around 4.8 GB for dirty_background_ratio and 9.6 GB for dirty_ratio, based on the 48 GB of RAM in my laptop. Very few drives can sustain 4.8 GB of writes per second, and mine behind encryption certainly cannot, so eventually the kernel stalls processes while they wait for the flusher to clean out the dirty pages.
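To make the arithmetic above concrete, here is a small sketch that works out both thresholds from the RAM size and estimates how long a full backlog would take to drain. It treats the ratios as a share of total RAM, which is only an approximation, and the 0.5 GB/s write speed is a made-up figure for illustration, not something I measured.

```python
def dirty_thresholds_gb(total_ram_gb, background_ratio=10, ratio=20):
    """Return (background, hard) dirty-page thresholds in GB.

    Approximation: treats the ratios as a percentage of total RAM.
    """
    background = total_ram_gb * background_ratio / 100
    hard = total_ram_gb * ratio / 100
    return background, hard


def drain_seconds(dirty_gb, write_speed_gb_per_s):
    """Seconds needed to write out dirty_gb at a constant write speed."""
    return dirty_gb / write_speed_gb_per_s


if __name__ == "__main__":
    background, hard = dirty_thresholds_gb(48)
    print(f"background flush starts at about {background:.1f} GB")
    print(f"writers get blocked at about {hard:.1f} GB")

    # Hypothetical sustained write speed of an encrypted drive (assumption).
    assumed_speed = 0.5
    print(f"draining {hard:.1f} GB takes about "
          f"{drain_seconds(hard, assumed_speed):.0f} s at {assumed_speed} GB/s")
```

Even with a generous speed assumption, several gigabytes of backlog means many seconds of blocked writes, which matches what a freeze feels like.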

How to prevent I/O freeze

My first thought was to reduce the dirty ratio percentages, but you cannot just lower the values and hope that fixes everything. Zeroing out both dirty_ratio and dirty_background_ratio simply makes the system unusable, because the kernel flusher is constantly writing data to disk and the system freezes almost all the time. The ideal is to find values of dirty_background_ratio and dirty_ratio that let the flusher keep up with dirty pages without stalling every other process. However, I did not know how to work those out, because even after a few days of googling I had no idea how the calculation behind dirty_ratio works. Moreover, dirty_ratio and dirty_background_ratio do not accept decimal values, so the lowest non-zero settings on my machine would be around 480 MB (1%) and 960 MB (2%).

Percentage or Bytes?

Reading the Linux kernel documentation, I realised that dirty_ratio has a byte-based alternative called dirty_bytes. That means I can limit dirty pages to a precise size, which is more reliable in my case. After trying different sizes of dirty_bytes without success, I decided to dig into various Linux distros, hoping to find one that ships with dirty_bytes set by default instead of dirty_ratio.
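Before changing anything, it helps to see which of the two styles your system is currently using. The sketch below reads the four writeback settings from /proc/sys/vm, where the kernel exposes them as plain text files. The function names are my own, and the base directory is a parameter only so the code can be tried against a scratch folder.

```python
from pathlib import Path

# The four writeback knobs discussed in this post.
VM_KEYS = (
    "dirty_ratio",
    "dirty_background_ratio",
    "dirty_bytes",
    "dirty_background_bytes",
)


def read_vm_settings(base="/proc/sys/vm"):
    """Return {name: value} for every writeback knob found under base."""
    settings = {}
    for key in VM_KEYS:
        path = Path(base) / key
        if path.is_file():
            settings[key] = int(path.read_text().strip())
    return settings


def active_mode(settings):
    """Report 'bytes' when a byte limit is set, otherwise 'ratio'."""
    return "bytes" if settings.get("dirty_bytes", 0) > 0 else "ratio"


if __name__ == "__main__":
    current = read_vm_settings()
    for name, value in current.items():
        print(f"vm.{name} = {value}")
    print("active mode:", active_mode(current))
```

According to the kernel documentation, only one of dirty_ratio and dirty_bytes is in effect at a time, and the one not in use reads back as 0, which is what active_mode relies on.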

Solution: thank you, random stranger!

After a while I found that the Pop!_OS GitHub has a similar issue, along with a pull request that uses dirty_bytes. I put those values into my /etc/sysctl.conf and the system is almost freeze-free!
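For reference, this is roughly what ends up in /etc/sysctl.conf. The sketch below converts the 256 and 128 megabyte limits from the TL;DR into bytes and prints the two lines. The helper names are mine, and the check that the background limit stays below the hard limit is my own sanity check, not something taken from the pull request.

```python
MIB = 1024 * 1024


def mib_to_bytes(mib):
    """Convert mebibytes to bytes."""
    return mib * MIB


def render_sysctl(dirty_mib=256, background_mib=128):
    """Build sysctl.conf lines for byte-based dirty page limits."""
    if background_mib >= dirty_mib:
        # Background writeback should kick in before writers get blocked.
        raise ValueError("background limit must be below the hard limit")
    lines = [
        f"vm.dirty_background_bytes = {mib_to_bytes(background_mib)}",
        f"vm.dirty_bytes = {mib_to_bytes(dirty_mib)}",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_sysctl(), end="")
```

After adding the lines, running sysctl -p as root reloads /etc/sysctl.conf, so there is no need to reboot to try the new values.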

Random freezes still happen from time to time when installing a game or transferring large files, but I'm happy with what I have now. If someday I decide to remove cryptsetup from my system and stick with unencrypted BTRFS (which needs a full wipe), I will probably never get a freeze again.