Bug #252
OpenNebula shutdown check in 1.4 for KVM is harmful
| Status: | Closed | Start date: | 06/04/2010 |
|---|---|---|---|
| Priority: | Normal | Due date: | |
| Assignee: | Jaime Melis | % Done: | 0% |
| Category: | Core & System | | |
| Target version: | Release 2.0 | | |
| Resolution: | | Pull request: | |
| Affected Versions: | | | |
Description
OpenNebula 1.4 implements the shutdown of KVM machines like this:
cmd="#{LIBVIRT[:shutdown]} #{deploy_id} && " \
    "while [ $(#{LIBVIRT[:poll]} #{deploy_id} > /dev/null 2>&1; " \
    "echo $?) -eq \"0\" ]; do sleep 2; done ; sleep 4"
This behavior is -- how do I put it politely? -- extremely not good. It tries to shut down the machine and then runs virsh dominfo over and over until that command returns an error. In short: if anything goes wrong, we say everything is fine. In my setup libvirt sometimes produces weird error messages, so it happens regularly (at least for me) that virsh dominfo fails although the machine is still running. That is bad, because it will then invoke the scripts that copy the image of the still-running machine back to the master host.
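To make the failure mode concrete, here is a minimal simulation of the wait loop's exit condition. It is only a sketch: the method name and the list of exit statuses are hypothetical stand-ins for successive virsh dominfo calls (0 = success, non-zero = error), not OpenNebula code.

```ruby
# Hypothetical simulation of the 1.4 wait loop: polling stops at the FIRST
# failed poll. `poll_results` stands in for the exit statuses of successive
# `virsh dominfo` calls (0 = success, non-zero = error).
def polls_until_first_failure(poll_results)
  poll_results.each_with_index do |status, i|
    # Any non-zero status ends the loop, whatever the cause of the error.
    return i + 1 unless status.zero?
  end
  nil # no poll failed: the real loop would keep waiting
end

# The VM is still running, but the second poll hits a transient libvirt error.
statuses = [0, 1, 0, 0, 0]
puts polls_until_first_failure(statuses)
```

With this input the loop gives up after the second poll, even though every later poll would have shown the machine as still running.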
I implemented a more robust version of the shutdown command:
cmd = "#{LIBVIRT[:shutdown]} #{deploy_id} && " \
      "c=0 && while [ $c -lt \"5\" ]; do " \
      "if [ $(#{LIBVIRT[:poll]} #{deploy_id} > /dev/null 2>&1; echo $?) -ne \"0\" ]; " \
      "then let 'c=c+1'; else c=0; fi; sleep 2; done; sleep 4"
It's not a perfect solution, but it's much more robust, since it requires virsh dominfo to fail 5 times consecutively before it accepts that the machine is shut down. After changing the code accordingly, I never again had the problem described above (machine still running although ONE regards it as shut down).
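The counter logic of the proposed loop can be sketched in plain Ruby as follows. This is an illustration under assumptions, not the driver code: the method name, the `required` parameter, and the status lists are hypothetical, with each list entry standing in for the exit status of one virsh dominfo call.

```ruby
# Hypothetical sketch of the consecutive-failure check: the machine counts
# as shut down only after `required` failed polls in a row. A single
# successful poll (status 0) resets the counter, as `c=0` does in the shell loop.
def polls_until_confirmed_shutdown(poll_results, required = 5)
  consecutive = 0
  poll_results.each_with_index do |status, i|
    consecutive = status.zero? ? 0 : consecutive + 1
    return i + 1 if consecutive >= required
  end
  nil # the sequence ended before enough consecutive failures were seen
end

# A transient error on the second poll no longer ends the wait.
puts polls_until_confirmed_shutdown([0, 1, 0, 0]).inspect

# Five failures in a row are needed before the loop exits.
puts polls_until_confirmed_shutdown([0, 1, 0, 1, 1, 1, 1, 1]).inspect
```

As in the shell version, nothing bounds the wait: if the machine never shuts down, the loop keeps polling.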
History
#1 Updated by Ruben S. Montero about 11 years ago
- Status changed from New to Assigned
- Assignee set to Javi Fontan
- Target version changed from Release 1.4 to Release 2.0
#2 Updated by Ruben S. Montero almost 11 years ago
- Assignee changed from Javi Fontan to Jaime Melis
#3 Updated by Jaime Melis almost 11 years ago
- Status changed from Assigned to Closed
Updated and tested according to the suggested solution. Closing ticket.