qsubsec (2015-12-03)

qsubsec is a simple template language system for creating Grid Engine job scripts. The source can be found on GitHub.

Python JSV API Available on github (2014-04-28)

Grid Engine now also has a Python JSV implementation. Adam Tygart just published one on GitHub. Thanks Adam!

AJO - The Asynchronous Job Operator Portal for Grid Engine (2013-02-09)

AJO simplifies the task of job submission and file staging over a secure connection from a submission portal (or just your local computer) to your compute farm. It was an internal project at RDLAB and was recently published under the open source GPL3 license. The scripts are downloadable as a tarball or accessible via SVN from http://svn-rdlab.lsi.upc.edu/subversion/ajo/public. The username and password are both public_ajo.

All it requires is an ssh client installation and Ruby. AJO is a set of Ruby scripts plus a configuration file (config.rb), which must be adapted to your environment. You have to set the hostname of the Grid Engine submission host and the path to your remote Grid Engine installation ($SGE_ROOT). For the encryption you need to set the cipher salt and keys. Next, you configure the local directories and files with the data you need on your remote Univa Grid Engine cluster, as well as the output directories and files you need back on your local host. These files and directories are copied transparently to the cluster when the job is launched. Finally, there is a section where you can insert the contents of the GE job scripts (the job scripts are generated on the fly). Each of them is started as a separate Grid Engine job.

After you have configured your template, you can launch it with:

      ./ajo -c config.rb -s
      Job submitted correctly. The job identifier is
      da558c9bf39fd052806e69d6afbc36a7e0718a53604eaff47bf6efd081fe40a239aa6572...
 

You can track the status of your jobs remotely with a secure token generated during submission.

      ./ajo -q
      da558c9bf39fd052806e69d6afbc36a7e0718a53604eaff47bf6efd081fe40a239aa...
      Your job has finished running on Sat Feb 9 09:36:58 2013. You can now do
      './ajo --retrieve ID' to download the output files and folders.
 

This token-based system makes AJO an ideal candidate for use within a web-based job submission portal.

Finally, you will want to get the output back to your local machine.

      ./ajo --retrieve da558c9bf39fd052806e69d6afbc36a7e0718a53604eaff47bf....
      Downloaded the output to /tmp/tmp.ZpiQWNfD..
 

You can find more detailed information about how it works on the AJO web page at RDLAB.
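
For a portal that drives AJO from another program, the three commands shown above can be wrapped in a few helper functions. The following Python sketch is only an illustration, not part of AJO: the function names are hypothetical, the path `./ajo` is taken from the examples above, and it assumes that the token is passed directly after `-q` (the transcript above shows it on a wrapped line). Extracting the token from the submission output is left out on purpose, because the exact output format is not documented here.

```python
import subprocess

# Path to the ajo script as used in the examples above; adjust as needed.
AJO = "./ajo"


def submit_cmd(config="config.rb"):
    """Build the command line that submits the jobs described in `config`."""
    return [AJO, "-c", config, "-s"]


def query_cmd(token):
    """Build the status query command (assumes the token follows -q)."""
    return [AJO, "-q", token]


def retrieve_cmd(token):
    """Build the command that downloads the output of a finished job."""
    return [AJO, "--retrieve", token]


def run(cmd):
    """Run one of the commands above and return whatever ajo printed."""
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return result.stdout
```

A portal would call `run(submit_cmd())` once, store the token that AJO prints, and later pass that token to `query_cmd` and `retrieve_cmd`.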

Wok - A Workflow Management System Supports Grid Engine through DRMAA

Wok describes itself as follows:

"Wok is a workflow management system implemented in Python that makes very easy to structure the workflows, parallelize their execution and monitor its progress among other things. It is designed in a modular way allowing to adapt it to different infrastructures.
For the time being it is strongly focused on clusters implementing any DRMAA compatible resource manager (i.e. Oracle Grid Engine) which working nodes have a shared folder in common. Other, more flexible infrastructures (such as the Amazon EC2) are considered for future implementations..."

Read more on the Wok project page on GitHub.

KNIME - The Konstanz Information Miner

KNIME is a graphical compute workflow tool based on the Eclipse framework. It is available for free as well as with commercial support. It basically gives you a drawing board where you visually design your compute workflow (similar to WEKA). You can drag, drop, and connect different nodes. Each node represents some functionality of your workflow. There are nodes for reading data from databases or files, nodes for calculations such as data clustering, and nodes for handling output or visualizing the results. The enterprise edition can exploit a compute cluster by submitting jobs to Univa Grid Engine.

XtalOpt supports Grid Engine

From XtalOpt:

"XtalOpt is a free and truly open source evolutionary algorithm designed to predict crystal structures. It is implemented as an extension to the Avogadro molecular editor."

It also supports Grid Engine as a job scheduler. This blog entry provides more information.

PythonGrid - A High Level Python Wrapper for Grid Engine

Pythongrid is freely available (GNU GPL v2) at github.com and offers job submission and job monitoring capabilities for Grid Engine. The developers describe the project as follows (excerpt from http://code.google.com/p/pythongrid/):

"This module provides high level functionality for cluster computing in python using the Sun Grid Engine. As some cluster environments are notoriously unreliable, pythongrid attempts to handle job monitoring and resubmission (in case of sudden death of nodes) under the hood, while providing the user with a simple map-reduce like interface.

main features
  • Uses ZMQ-based heart-beat to monitor job status
  • Robust error detection (out-of-memory, node failure)
  • Automated resubmission in case of unexpected failure
  • Error emails, including CPU/MEM statistics
  • Optional web-interface to monitor jobs
  • Let's you easily switch between local multiprocessing and cluster computing"

 

Qmem: Grid Engine Memory Usage Statistics

From: https://github.com/txemaheredia/qmem

"Qmem is a script designed to describe the memory usage of a SGE cluster. If your cluster has memory restrictions, the usage of qstat solely is not enough to monitor its state properly. Qmem attempts to solve that."
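
To give an idea of the kind of per-host memory view that goes beyond plain qstat, here is a small Python sketch. It is not how Qmem itself works. It assumes the tabular text printed by `qhost` (columns HOSTNAME, MEMTOT, MEMUSE and so on); the sample text, host names, and function names are made up for illustration.

```python
# Illustrative qhost-style output; the hosts and values are invented.
SAMPLE = """\
HOSTNAME ARCH NCPU LOAD MEMTOT MEMUSE SWAPTO SWAPUS
----------------------------------------------------
global - - - - - - -
node01 lx-amd64 8 0.50 16.0G 4.0G 2.0G 0.0
node02 lx-amd64 8 7.90 16.0G 12.0G 2.0G 512.0M
"""

# Uppercase suffixes are treated as binary multipliers.
UNITS = {"K": 1024, "M": 1024 ** 2, "G": 1024 ** 3, "T": 1024 ** 4}


def to_bytes(field):
    """Convert a field such as '16.0G' to bytes; '-' means no value."""
    if field == "-":
        return None
    if field[-1] in UNITS:
        return float(field[:-1]) * UNITS[field[-1]]
    return float(field)


def memory_by_host(text):
    """Return {hostname: used/total memory ratio} for hosts with data."""
    lines = [line for line in text.splitlines() if line.strip()]
    header = lines[0].split()
    tot_i, use_i = header.index("MEMTOT"), header.index("MEMUSE")
    usage = {}
    for line in lines[1:]:
        if set(line.strip()) == {"-"}:  # separator line under the header
            continue
        fields = line.split()
        total, used = to_bytes(fields[tot_i]), to_bytes(fields[use_i])
        if not total or used is None:  # skips the "global" pseudo host
            continue
        usage[fields[0]] = used / total
    return usage


if __name__ == "__main__":
    for host, ratio in memory_by_host(SAMPLE).items():
        print(f"{host}: {ratio:.0%} of memory in use")
```

Looking up the columns by header name rather than by fixed position keeps the sketch usable if the installed qhost prints additional columns.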