StarCluster - Mailing List Archive

Re: load balanced nodes accepting jobs before ready

From: Stewart, Andrew <no email>
Date: Mon, 14 Apr 2014 18:02:20 +0000

pkginstaller was called during add_node, but the node was added to the host list and its queue enabled before pkginstaller had a chance to finish installing dependencies. So it looks like a race condition. I did bump pkginstaller to the front of the plugins line (ahead of IPCluster), but I haven’t yet tested whether that helps. The most reliable fix would be to keep the node's queue disabled until provisioning is complete.
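
A minimal sketch of the "disable the queue until provisioning is complete" idea, not tested against a live cluster. It assumes the default SGE queue name all.q and uses qmod -d / qmod -e to disable and re-enable a single queue instance. The names qmod_command, provision_then_enable, run_on_master and install are hypothetical; in a StarCluster plugin, run_on_master would presumably be something like master.ssh.execute called from on_add_node. A short window can still remain between the host being added and the queue being disabled, unless the disable happens as part of adding the host.

```python
import shlex


def qmod_command(action, host, queue="all.q"):
    """Build the SGE command that disables ('d') or enables ('e')
    a single queue instance, e.g. all.q@node001."""
    if action not in ("d", "e"):
        raise ValueError("action must be 'd' (disable) or 'e' (enable)")
    return "qmod -%s %s" % (action, shlex.quote("%s@%s" % (queue, host)))


def provision_then_enable(host, run_on_master, install, queue="all.q"):
    """Keep the node's queue instance disabled while packages install.

    run_on_master: callable that runs a shell command on the SGE master
                   (hypothetical; supplied by the caller).
    install:       callable that provisions the node (hypothetical).
    """
    run_on_master(qmod_command("d", host, queue))
    # If install() raises, the queue is intentionally left disabled so
    # that no jobs land on a half-provisioned node.
    install(host)
    run_on_master(qmod_command("e", host, queue))
```

Leaving the queue disabled on failure is a deliberate choice here: a node that could not be provisioned should not start accepting jobs.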

I actually think the simpler solution would be to bypass pkginstaller and just share managed packages with compute nodes via NFS. Why reinstall the same package N times?


--
Andrew Stewart
Office of Research Information Services (ORIS),
Office of the Chief Information Officer (OCIO),
Smithsonian Institution
202-505-3633

From: Rajat Banerjee <rajatb_at_post.harvard.edu>
Date: Monday, April 14, 2014 at 10:49 AM
To: Andrew Stewart <stewarta_at_si.edu>
Cc: starcluster_at_mit.edu
Subject: Re: [StarCluster] load balanced nodes accepting jobs before ready

Hi,
Does that mean the pkginstaller plugin doesn't get called during add_node, before the host is added to the SGE host list?
Raj
Received on Mon Apr 14 2014 - 14:02:25 EDT
This archive was generated by hypermail 2.3.0.
