StarCluster - Mailing List Archive

Re: Cluster-wide DB access

From: Justin Riley <no email>
Date: Thu, 5 Apr 2012 10:40:38 -0400

Hi Dan,

Awesome, thanks for sharing these details. Do you by chance have a
mongodb plugin for StarCluster that does all of this or would you be
willing to put one together? I'd like to include a mongodb plugin in the
next feature release...
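
In the meantime, here's a rough, untested sketch of what such a plugin
might look like, based on the tunnel approach you describe below. The hook
names follow my reading of the ClusterSetup plugin interface, and the
class/attribute names are just placeholders, so treat it as a starting
point rather than a working implementation:

    from starcluster.clustersetup import ClusterSetup

    MONGO_PORT = 27017

    class MongoTunnelPlugin(ClusterSetup):
        """Forward each worker's local MongoDB port to the master over ssh."""

        def _open_tunnel(self, node, master):
            # -f: go to background, -N: no remote command, -L: local forward
            node.ssh.execute("ssh -f -N -L %d:localhost:%d root@%s"
                             % (MONGO_PORT, MONGO_PORT, master.alias))

        def run(self, nodes, master, user, user_shell, volumes):
            # called once at cluster start: tunnel every worker to the master
            for node in nodes:
                if node.alias != master.alias:
                    self._open_tunnel(node, master)

        def on_add_node(self, node, nodes, master, user, user_shell, volumes):
            # called when a node is added to an already-running cluster
            self._open_tunnel(node, master)

Registering it should then just be a matter of a [plugin] section in the
config (module/section names here are placeholders) and adding it to the
cluster template's plugin list:

    [plugin mongotunnel]
    SETUP_CLASS = mongotunnel.MongoTunnelPlugin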

~Justin

On Wed, Apr 04, 2012 at 08:57:01AM -0400, Dan Yamins wrote:
> 1) What is the format of your data? And how big are the entries? I
> like MongoDB for this sort of thing, but it may depend on what kind of
> thing you want to store.
> 2) ssh tunnels may be a good solution for having a common DB backing the
> cluster. Basically, if you use a DB that is accessible as a service on a
> port, then if you ssh tunnel from the various worker nodes to the node
> running the DB, software running on the worker nodes can act "as if" the
> database were purely local.
> In other words, do three things:
> A) set up a single DB actually running on one designated node, on
> some port, e.g. port 27017 on master.
> B) write code in your worker that pretends the DB is local on that
> port (here's pythonesque code for MongoDB):
>     import pymongo
>     connection = pymongo.Connection(host='localhost', port=27017)
>     collection = connection['my_database']['my_collection']
>     collection.insert(my_record)
>     <etc.....>
> C) and then separately establish an ssh tunnel from the worker node
> to the master (or wherever the single DB is running). This can be done
> in a starcluster plugin in the "add_node" or "run" methods like this:
> workernode.ssh.execute("ssh -f -N -L 27017:localhost:27017 root@master")
> Of course you could start this by hand on all the nodes as well, but that
> gets a little tedious, and the plugin system is perfect for this kind of
> job.
> Having done A), B), and C), when you run the code in B) on your worker
> node, the code will simply read and write to the single master database
> from A) without having to know anything about the fact that it's running
> on a cluster.
> On Tue, Apr 3, 2012 at 11:22 PM, Chris Diehl <cpdiehl_at_gmail.com> wrote:
>
> Hello,
> I would like to use StarCluster to do some web scraping and I'd like to
> store the collected data in a DB that is available to all of the cluster
> nodes. Is there a way to have a common DB backing the entire cluster?
> Any particular DBs that anyone has had success with?
> Thanks for your assistance!
> Chris

> _______________________________________________
> StarCluster mailing list
> StarCluster_at_mit.edu
> http://mailman.mit.edu/mailman/listinfo/starcluster




