Bug #594
Failed to restart VM
Status: | Closed | Start date: | 04/27/2011
---|---|---|---
Priority: | Normal | Due date: |
Assignee: | - | % Done: | 0%
Category: | - | |
Target version: | - | |
Resolution: | fixed | Pull request: |
Affected Versions: | OpenNebula 2.2 | |
Description
After a reboot, OpenNebula failed to restart a VM.
In oned.log:
Wed Apr 27 09:25:05 2011 [ReM][D]: VirtualMachineAction invoked
Wed Apr 27 09:25:05 2011 [DiM][D]: Restarting VM 19
Wed Apr 27 09:25:05 2011 [TM][D]: Message received: LOG - 19 tm_delete.sh: Deleting
Wed Apr 27 09:25:05 2011 [TM][D]: Message received: LOG - 19 tm_delete.sh: Executed "rm -rf ".
Wed Apr 27 09:25:05 2011 [TM][D]: Message received: TRANSFER SUCCESS 19 -
Wed Apr 27 09:25:10 2011 [ReM][D]: HostPoolInfo method invoked
This is on Ubuntu 11.04 with OpenNebula from Git (27 May 2011).
History
#1 Updated by Frederic Dreier about 10 years ago
It seems to appear only when I kill the VM process on the node, restart OpenNebula on the controller, wait for the VM to be set to the "unknown" state, and then run the "onevm restart <id>" command.
If I kill the process and restart the VM while it is in the unknown state (skipping the daemon restart), the VM reboots properly.
I first observed it after a power problem affected a controller and a node. I was then able to reproduce it twice on a test system (fresh install).
#2 Updated by Frederic Dreier about 10 years ago
I get an error when trying to resubmit the failed VM:
Wed Apr 27 22:53:01 2011 [TM][D]: Message received: LOG - 2 tm_clone.sh: DST: /srv/cloud/one/opennebula/var//2/images/disk.0
Wed Apr 27 22:53:01 2011 [TM][D]: Message received: LOG - 2 tm_clone.sh: Creating directory /srv/cloud/one/opennebula/var//2/images
Wed Apr 27 22:53:01 2011 [TM][D]: Message received: LOG - 2 tm_clone.sh: Executed "mkdir -p /srv/cloud/one/opennebula/var//2/images".
Wed Apr 27 22:53:01 2011 [TM][D]: Message received: LOG - 2 tm_clone.sh: Executed "chmod a+w /srv/cloud/one/opennebula/var//2/images".
Wed Apr 27 22:53:01 2011 [TM][D]: Message received: LOG - 2 tm_clone.sh: Cloning /srv/cloud/images/template-ubuntu10.4_64bits_10g.img
Wed Apr 27 22:53:01 2011 [TM][D]: Message received: LOG - 2 tm_clone.sh: Executed "cp -r /srv/cloud/images/template-ubuntu10.4_64bits_10g.img /srv/cloud/one/opennebula/var//2/images/disk.0".
Wed Apr 27 22:53:01 2011 [TM][D]: Message received: LOG - 2 tm_clone.sh: ERROR: Command "chmod a+rw /srv/cloud/one/opennebula/var//2/images/disk.0" failed.
Wed Apr 27 22:53:01 2011 [TM][D]: Message received: LOG - 2 tm_clone.sh: ERROR: chmod: changing permissions of `/srv/cloud/one/opennebula/var//2/images/disk.0': Operation not permitted
Wed Apr 27 22:53:01 2011 [TM][D]: Message received: TRANSFER FAILURE 2 chmod: changing permissions of `/srv/cloud/one/opennebula/var//2/images/disk.0': Operation not permitted
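When scanning a long oned.log for this kind of problem, it can help to pull out only the final TRANSFER result per VM. The following is a minimal Python sketch, assuming the log lines have exactly the shape quoted in this report; the names `TRANSFER_RE` and `transfer_results` are hypothetical helpers, not part of OpenNebula.

```python
import re

# Assumed line shape (taken from the excerpts in this report):
#   <timestamp> [TM][D]: Message received: TRANSFER <SUCCESS|FAILURE> <vm_id> <detail>
TRANSFER_RE = re.compile(
    r"^(?P<timestamp>.+?) \[TM\]\[D\]: Message received: "
    r"TRANSFER (?P<result>SUCCESS|FAILURE) (?P<vm_id>\d+)(?:\s+(?P<detail>.*))?$"
)


def transfer_results(lines):
    """Return a (vm_id, result, detail) tuple for every TRANSFER line."""
    results = []
    for line in lines:
        match = TRANSFER_RE.match(line.strip())
        if match:
            results.append(
                (
                    int(match.group("vm_id")),
                    match.group("result"),
                    match.group("detail") or "",
                )
            )
    return results


# Sample lines copied from the log excerpts in this report.
sample = [
    "Wed Apr 27 09:25:05 2011 [TM][D]: Message received: TRANSFER SUCCESS 19 -",
    "Wed Apr 27 22:53:01 2011 [TM][D]: Message received: LOG - 2 tm_clone.sh: "
    "Cloning /srv/cloud/images/template-ubuntu10.4_64bits_10g.img",
    "Wed Apr 27 22:53:01 2011 [TM][D]: Message received: TRANSFER FAILURE 2 "
    "chmod: changing permissions of "
    "`/srv/cloud/one/opennebula/var//2/images/disk.0': Operation not permitted",
]

for vm_id, result, detail in transfer_results(sample):
    print(vm_id, result, detail)
```

LOG lines are skipped on purpose, so only the outcome of each transfer and the error text attached to it remain.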
The permissions actually look like this (captured before the failure, because the images directory is deleted afterwards):
/srv/cloud/one/opennebula/var/2/images:
total 1452920
-rw-r--r-- 1 oneadmin cloud 787 2011-04-27 22:47 deployment.7
-rw-rw-rw- 1 root root 1487405056 2011-04-27 22:48 disk.0
-rw-r--r-- 1 libvirt-qemu kvm 374784 2011-04-27 22:44 disk.1
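The listing suggests why the chmod in tm_clone.sh fails: disk.0 is owned by root, and on Linux only a file's owner or root may change its mode, so any other user gets "Operation not permitted". The Python sketch below illustrates that rule. It assumes the transfer script runs as a non-root user such as oneadmin (not confirmed in this report), it ignores capabilities such as CAP_FOWNER, and `can_chmod` is a hypothetical helper name.

```python
import os
import tempfile


def can_chmod(path, euid=None):
    """Return True if a process with effective UID `euid` may chmod `path`.

    Simplified rule: only the file's owner or root (UID 0) may change the
    mode. Linux capabilities (e.g. CAP_FOWNER) are deliberately ignored.
    """
    if euid is None:
        euid = os.geteuid()
    owner_uid = os.stat(path).st_uid
    return euid == 0 or euid == owner_uid


# Demonstration on a temporary file owned by the current user.
with tempfile.NamedTemporaryFile() as tmp:
    owner = os.stat(tmp.name).st_uid
    print(can_chmod(tmp.name, euid=owner))      # the owner may chmod
    print(can_chmod(tmp.name, euid=owner + 1))  # another non-root user may not
```

Applied to the listing above, a check like this would return False for disk.0 when run as oneadmin, which matches the error reported by tm_clone.sh.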
#3 Updated by Ruben S. Montero about 8 years ago
- Status changed from New to Closed
- Resolution set to fixed
- Affected Versions OpenNebula 2.2 added
The restart cycle has been improved and should work now.