What is this post about?
A couple of days ago, some colleagues were complaining about the unbearable performance of our SAP application, which is currently hosted on an openSUSE Azure VM. Shortly after, my manager called me in on Monday morning and told me to reboot the SAP production environment.
I knew something would hit the fan, because I clearly remembered that the last time we rebooted that machine was 700+ days ago.
As expected, the VM got stuck closing the DB process for an hour and a half. After I briefed my manager, he said "Hard reset." and it died, completely.
I tried to stop it but NooOoOOooOOoo, they didn't listen to me!
So this post is about how I spent 26 hours relying only on Microsoft's messy documentation and trial and error.
I'm writing it down just in case, so I won't step into this mud again.
CAUTION
Performing a hard reset during a Linux shutdown is highly likely to damage your boot partition.
Diagnose Problem
In the Azure Portal, find your VM and check the boot log by clicking Serial Console on the left.
If a message like the one below appears, you might have yourself a broken boot partition.
Failed to start File System Check on /dev/disk/by-uu...d-121f462d7e8d
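If you would rather check from a terminal than from the portal, here is a minimal sketch. The helper name has_fsck_failure is made up for this post, and the usage line assumes boot diagnostics is enabled on the VM so that az vm boot-diagnostics get-boot-log has something to return.

```shell
# Return 0 if the boot log on stdin contains a failed file system check.
has_fsck_failure() {
  grep -q "Failed to start File System Check"
}

# Hypothetical usage; resource group and VM name are placeholders:
#   az vm boot-diagnostics get-boot-log -g MyResourceGroup -n MyVM | has_fsck_failure \
#     && echo "boot partition probably needs fsck"
```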
Solution
Prerequisite
Install az cli extension
Refer to the Installation Guide.
Install vm-repair Extension
If you haven't installed this extension before:
$ az extension add -n vm-repair
If you have already installed this extension, it's always a good idea to check for updates.
$ az extension update -n vm-repair
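The two cases above can be folded into one helper. This is only a sketch: ensure_vm_repair is a name I made up, and it assumes az extension show exits with a non-zero status when the extension is not installed.

```shell
# Install vm-repair if it is missing, otherwise update it.
ensure_vm_repair() {
  if az extension show -n vm-repair >/dev/null 2>&1; then
    # Already installed: pull the latest version.
    az extension update -n vm-repair
  else
    # Not installed yet: add it.
    az extension add -n vm-repair
  fi
}

# Usage:
#   ensure_vm_repair
```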
Setup Repair VM
$ az vm repair create -g {{MyResourceGroup}} -n {{myVM}} --repair-username {{username}} --repair-password {{password!234}} --verbose
What this command does is copy the system disk of the broken VM and attach the copy to a new repair VM, which is created automatically.
Wait a couple of minutes and you can SSH into your new VM.
Start Repair
Repair Command
SSH into your repair VM and use
$ lsblk
to check the device name of your broken partition, then run:
$ fsck /dev/{{device_name}}
Answer Y to all questions.
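fsck reports what happened through its exit status, which is a bit mask documented in the fsck(8) man page. The sketch below decodes it; describe_fsck_status is a made-up helper name. The usage comment assumes an ext-family filesystem whose checker accepts -y (answer yes to everything), and the device name in it is a placeholder.

```shell
# Translate an fsck exit status (a bit mask, see fsck(8)) into words.
describe_fsck_status() {
  status=$1
  if [ "$status" -eq 0 ]; then
    echo "no errors"
    return 0
  fi
  msg=""
  [ $((status & 1)) -ne 0 ] && msg="$msg, errors corrected"
  [ $((status & 2)) -ne 0 ] && msg="$msg, system should be rebooted"
  [ $((status & 4)) -ne 0 ] && msg="$msg, errors left uncorrected"
  [ $((status & 8)) -ne 0 ] && msg="$msg, operational error"
  [ $((status & 16)) -ne 0 ] && msg="$msg, usage or syntax error"
  [ $((status & 32)) -ne 0 ] && msg="$msg, canceled by user"
  [ $((status & 128)) -ne 0 ] && msg="$msg, shared library error"
  # Strip the leading ", " before printing.
  echo "${msg#, }"
}

# Hypothetical usage on the repair VM (device name is a placeholder):
#   fsck -y /dev/sdc1; describe_fsck_status $?
```

If the result still includes "errors left uncorrected", the partition is not fixed yet and restoring it will most likely give you the same boot failure.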
Unattended Script
Azure also provides an automatic repair script, but I haven't tried it.
$ az vm repair run -g {{MyResourceGroup}} -n {{MyVM}} --run-on-repair --run-id 2 --verbose
Complete and Restore
Delete Repair VM
Use the command below to restore your original VM and delete your repair VM.
$ az vm repair restore -g {{MyResourceGroup}} -n {{MyVM}} --verbose
This replaces the broken partition on the original VM with the one we just fixed.
Depending on your needs, you can choose to keep or delete your repair VM, but you will still be charged for it even when it's not booted.
Boot up VM
Now you can boot up your VM with the fixed boot partition.
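If you want to do this last step from the CLI as well, something like the sketch below should work. start_and_check is a made-up helper name, and the JMESPath query is just one way to pull the power state out of az vm get-instance-view.

```shell
# Start the VM, then print its power state (for example "VM running").
start_and_check() {
  rg=$1
  vm=$2
  # Give up early if the start request itself fails.
  az vm start -g "$rg" -n "$vm" || return 1
  az vm get-instance-view -g "$rg" -n "$vm" \
    --query "instanceView.statuses[?starts_with(code, 'PowerState/')].displayStatus" \
    -o tsv
}

# Hypothetical usage; resource group and VM name are placeholders:
#   start_and_check MyResourceGroup MyVM
```

A running power state only tells you Azure started the machine, so it is still worth opening Serial Console once more to confirm the file system check passes this time.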