Gridware Cluster Scheduler 9.0.7: Enhanced Stability and Performance (2025-07-08)
Release Date: 2025-07-08
We're pleased to announce the release of Gridware Cluster Scheduler 9.0.7, built on Open Cluster Scheduler 9.0.7 (formerly known as "Sun Grid Engine"). This release continues our commitment to delivering reliable, high-performance workload management for HPC environments across diverse computing architectures.
Key Improvements in 9.0.7
Enhanced Stability and Reliability
Version 9.0.7 addresses several important areas to improve system reliability:
Thread Safety Improvements: The accounting and reporting code has been made fully thread-safe, eliminating potential race conditions in high-throughput environments.
Core Binding Fixes: Resolved issues with both striding and explicit core binding strategies that could prevent optimal core allocation even when cores were available.
Error Reporting: Fixed truncation issues in error messages displayed by qstat -j, ensuring administrators receive complete diagnostic information (see the example after this list).
Installation Improvements: Addressed installer issues and corrected documentation references to ensure smooth deployment experiences.
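As a quick illustration, administrators can pull the complete diagnostic record for a job with the standard qstat switch; the job ID below is just a placeholder.

# show the full job information, including any error reason, for job 4711
qstat -j 4711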
Seamless Binary Replacement Upgrades
For existing 9.0.x deployments, upgrading to 9.0.7 remains straightforward with our binary replacement approach:
- Stop current services
- Replace binaries with 9.0.7 versions
- Restart services
No configuration changes or extended maintenance windows are required, making this an ideal upgrade for production environments.
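For illustration only, here is a rough sketch of what a binary replacement might look like on a tarball-based installation using the classic sgemaster/sgeexecd scripts; the archive name is a placeholder, and the exact paths and service handling depend on how your cluster was installed, so follow the release notes for the authoritative procedure.

# stop services on the qmaster host and on every execution host
$SGE_ROOT/$SGE_CELL/common/sgemaster stop
$SGE_ROOT/$SGE_CELL/common/sgeexecd stop

# back up the existing installation, then unpack the 9.0.7 binaries
# (the archive name below is a placeholder)
tar -C $SGE_ROOT -xzf ocs-9.0.7-bin-lx-amd64.tar.gz

# restart services
$SGE_ROOT/$SGE_CELL/common/sgemaster start
$SGE_ROOT/$SGE_CELL/common/sgeexecd start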
Comprehensive Architecture Support
Open Cluster Scheduler 9.0.7 maintains extensive support for modern computing architectures:
- x86-64: Full support across major Linux distributions (RHEL, Rocky, Ubuntu, SUSE)
- ARM64: Comprehensive support including NVIDIA Grace Hopper platforms
- Specialized Architectures: Support for PowerPC (ppc64le), s390x, and RISC-V platforms
- Operating Systems: Linux distributions, FreeBSD, Solaris, and macOS (client tools)
This broad compatibility ensures organizations can deploy consistent workload management across heterogeneous computing environments.
Notable Features from the 9.0.x Series
Since many users may be upgrading from earlier versions, it's worth highlighting key capabilities introduced throughout the 9.0.x series:
qtelemetry (Developer Preview)
Integrated metrics exporter for Prometheus and Grafana, providing detailed cluster monitoring including host metrics, job statistics, and qmaster performance data.
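To check the exporter from a shell, you can probe its Prometheus-style metrics endpoint; the host name and port below are placeholders that depend on your qtelemetry configuration.

# fetch the first exported metrics lines (host and port are placeholders)
curl -s http://qmaster.example.com:9464/metrics | head -n 20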
Enhanced NVIDIA GPU Support
The qgpu command simplifies GPU resource management with automatic setup, per-job accounting, and support for Grace Hopper architectures.
MPI Integration Templates
Out-of-the-box support for major MPI distributions (Intel MPI, OpenMPI, MPICH, MVAPICH) with ready-to-use parallel environment configurations.
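As an example, once one of these parallel environment templates is registered in the cluster, an MPI job can be submitted with the standard -pe switch; the PE name, slot count, and script below are placeholders for your own setup.

# request 64 slots from a parallel environment named "openmpi" (placeholder name)
qsub -pe openmpi 64 -cwd ./run_simulation.sh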
Advanced Resource Management
- RSMAP (Resource Map) complex type for managing specialized resources like GPU devices (see the sketch after this list)
- Per-host consumable resources
- Resource and queue requests per scope for parallel jobs
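The following sketch shows the general idea under assumed names: a host-level RSMAP complex called gpu holding four device IDs, and a job requesting one instance of it; consult the release documentation for the exact complex definition syntax in your version.

# attach four GPU IDs to an execution host (complex name and IDs are placeholders);
# qconf -me opens the host configuration in an editor, where you would set:
#   complex_values   gpu=4(0 1 2 3)
qconf -me node01

# request a single GPU instance for a job
qsub -l gpu=1 -cwd ./gpu_job.sh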
Performance and Scalability
The 9.0.x series represents significant performance improvements over previous versions through:
- Multi-threaded Architecture: Separate thread pools for different request types
- Enhanced Data Stores: Multiple data stores reducing internal contention
- Automatic Session Management: Ensures data consistency while maintaining performance
- Optimized Scheduling: Improved algorithms for large-scale deployments
Continued 9.0.x Support
We remain committed to supporting the entire 9.0.x series with ongoing maintenance, security updates, and technical support. This provides organizations with confidence in their long-term deployment strategy while allowing flexibility in upgrade timing.
Getting Started
Quick Evaluation
For testing Open Cluster Scheduler 9.0.7 (the most feature-rich and modern open-source "Sun Grid Engine" successor) on major Linux distributions:
# Review the script before running
curl -s https://raw.githubusercontent.com/hpc-gridware/quickinstall/refs/heads/main/ocs.sh | OCS_VERSION=9.0.7 sh
If you are interested in our commercially supported Gridware Cluster Scheduler, please get in touch with us.
Production Deployment
Production environments should follow the comprehensive installation guide included with the release to ensure the configuration matches your specific requirements and environment.
Resources
- Source Code & Documentation: GitHub Repository
- Release Notes: Complete technical details and full changelog
- Community Support: Active development and user community
Looking Forward
Version 9.0.7 reflects our ongoing dedication to providing robust, high-performance workload management solutions. Whether you're running traditional HPC simulations, modern AI workloads, or mixed computing environments, Gridware Cluster Scheduler delivers the reliability and performance your critical applications require.
The combination of enhanced stability, seamless upgrade paths, and broad architecture support makes 9.0.7 an excellent foundation for both current and future computing needs.
For technical questions or deployment assistance, please connect with our community through GitHub or contact our support team. We're committed to helping you maximize the value of your HPC infrastructure.
HPC Gridware Unveils Gridware Cluster Scheduler 9.0.2, Adding NVIDIA Grace Hopper Support (2025-01-23)
A Leap Forward for AI and HPC Workloads
HPC Gridware has unveiled the latest iteration of its Gridware Cluster Scheduler (GCS), version 9.0.2, now featuring native support for NVIDIA's Grace Hopper Superchip. This update, highlighted by HPCwire, marks a significant stride in optimizing high-performance computing (HPC) and AI infrastructure.
Podcast: Open Cluster Scheduler vs. Gridware Cluster Scheduler (2024-10-21)
I couldn't resist using NotebookLM to create a podcast about our first releases: the Open Cluster Scheduler and the Gridware Cluster Scheduler. NotebookLM is gaining viral attention, thanks to its remarkable capabilities tailored for such tasks.
Creating this podcast was a five-minute task—simply uploading the blog posts about the Open Cluster Scheduler and the Gridware Cluster Scheduler and letting the conversation be generated. Most of the time was spent double-checking the content, and I must admit, it got it right on the first try!
As one of the co-founders of HPC Gridware, I absolutely agree with what the AI says about it. Hear for yourself! :-)
Open Cluster Scheduler: The Future of Open Source Workload Management (2024-06-10)
See also our announcement at HPC Gridware
Dear Community,
We are thrilled to announce that the source code repository for the Open Cluster Scheduler is now officially open-sourced and available at github.com/hpc-gridware/clusterscheduler.
The Open Cluster Scheduler is the cutting-edge successor to renowned open-source workload management systems such as "Sun Grid Engine", "Univa Grid Engine Open Core", "Son of Grid Engine," and others. With a development history spanning over three decades, its origins can be traced back to the Distributed Queueing System (DQS), and it achieved widespread adoption under the name "Sun Grid Engine".
A Solution for the AI Era
As the world pivots towards artificial intelligence and high-performance computing, the necessity for an efficient and open-source cluster scheduler has never been more urgent. In today's GPU cluster environments, harnessing full hardware utilization is not only economically beneficial but also accelerates results, enables more inference tasks per hour, and facilitates the creation of more intricate AI models.
Why Open Cluster Scheduler?
There is a real gap in the market for open-source workload managers, and Open Cluster Scheduler is here to fill it with a whole host of remarkable features:
- Dynamic, On-Demand Cluster Configuration: Make changes without the need to restart services or daemons.
- Standard-Compliant Interfaces and APIs: Enjoy compatibility with standard command-line interfaces (qsub, qstat, …) and standard APIs like DRMAA (see the short example after this list).
- High Throughput: Efficiently handle millions of independent compute jobs daily.
- Mixed Job Support: Run large MPI jobs alongside small, short single-node tasks seamlessly without altering configurations.
- Rapid Submission: Submit thousands of different jobs within seconds.
- High Availability: Ensure reliability and continuous operation.
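As a small taste of those standard interfaces, the familiar Grid Engine command line works unchanged; the job name and script below are only examples.

# submit a simple batch job from the current directory and watch its state
qsub -N hello -cwd ./hello_world.sh
qstat -u $USER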
Optimized for Performance
Open Cluster Scheduler is meticulously optimized across all dimensions:
- Binary Protocol Between Daemons: Enhances communication efficiency.
- Multi-threaded Scheduler: Ensures optimal performance.
- Written in C++/C: Delivers robust and high-speed computing.
- Multi-OS and Architecture Support: Compatible with architectures including AMD64, ARM64, RISC-V, and more.
Looking Forward
We are committed to evolving Open Cluster Scheduler into a modern solution that will be capable of managing highly demanding compute workloads across diverse computational environments, whether on-premises or in the cloud.
We invite you to explore, contribute, and join us in this exciting new chapter. Together, we can shape the future of high-performance computing.
Visit our repository: github.com/hpc-gridware/clusterscheduler
Thank you for your continued support and enthusiasm.
Sincerely,
Daniel, Ernst, Joachim
UberCloud Releases Multi-Cloud, Hybrid-Cloud HPC Application Platform (2020/11/06)
The way enterprises run High Performance Computing (HPC) applications has changed. With Cloud providers offering improved security, better cost/performance, and seemingly endless compute capacity, more enterprises are turning to Cloud for their HPC workloads.
However, many companies are finding that replicating an existing on-premise HPC architecture in the cloud does not lead to the desired breakthrough improvements. With this in mind, the UberCloud HPC Application Platform was designed for cloud computing from day one, resulting in greatly increased productivity for HPC engineers, significantly improved IT security, cloud costs and administrative overhead reduced to a minimum, and full control for engineers and corporate IT over their HPC cloud environment. Today, we are announcing UberCloud's next-generation HPC Application Platform.
Building blocks of the UberCloud Platform, including HPC, Cloud, Containers, and Kubernetes, have been previously discussed on HPCwire: Kubernetes, Containers and HPC, and Kubernetes and HPC Applications in Hybrid Cloud Environments.
Key Stakeholders when Driving HPC Cloud Adoption
When we started designing the UberCloud HPC Application Platform, we recognized that three major stakeholders are crucial for the overall success of a company's HPC cloud journey: HPC engineers, Enterprise IT, and the HPC IT team.
HPC application engineers are the driving force behind innovation. To excel in (and enjoy) their job, they require a frictionless, self-service user portal for allocating computational resources when they need them. They don't necessarily need to understand how compute nodes, GPUs, storage, or fast network interconnects have to be configured. They expect to be able to allocate and shut down fully configured HPC application environments.
Enterprise IT demands the software tools and pre-configured containerized HPC applications needed to create fully automated, thoroughly tested environments. These environments must be able to interact with HPC applications and accommodate their special requirements for resources and license servers. The platform needs to plug into modern IT environments and support technologies like CI/CD pipelines and Kubernetes orchestration.
The HPC IT team (often quite independent of Enterprise IT) requires a hybrid cloud strategy for enhancing its existing on-premise HPC infrastructure with cloud resources for bursting and hybrid cloud scenarios. This team demands control over software versions and puts emphasis on the entire engineering lifecycle, from design to manufacturing.
Introducing the UberCloud HPC Application Platform
The UberCloud HPC Application Platform aims to support each of these three key stakeholders during their HPC cloud adoption journey. How is that achieved?
For HPC application engineers, UberCloud provides a self-service HPC user interface where they select their application(s) along with the hardware parameters they need. With a single click, the fully automated UberCloud HPC Application Platform allocates the dedicated computing infrastructure, deploys the application, and configures access for the engineer for instant productivity. Similarly, the HPC application infrastructure can be resized at any point in time to run distributed-memory simulations, parameter studies, or a design of experiments. After the work is done, the application and the simulation platform can be safely shut down.
Enterprise IT operations often have their own way of managing cloud-based resources. Infrastructure as Code, GitOps, and DevOps are some of the paradigms found in those organizations. The UberCloud HPC Application Platform contains a management tool that can be integrated into any kind of automation or CI/CD pipeline tool chain. UberCloud's application platform management tool takes care of all aspects of managing containerized HPC applications using Kubernetes-based container orchestrators like GKE, AKS, and EKS.
HPC IT teams require integration points for allocating cloud resources and distributing HPC jobs between their on-premise HPC clusters and dynamically allocated cloud resources. The UberCloud HPC Cloud Dispatcher provides batch job interfaces for hybrid cloud, cloud bursting, and high-throughput computing. It relies on open standards throughout the whole application stack to provide stable integration interfaces.
Putting UberCloud’s HPC Application Platform into Practice
Our first customer to enjoy the benefits of the UberCloud HPC Application Platform is FLSmidth, a Danish multinational engineering company providing the global cement and mineral industries with factories, machinery, services, and know-how. The proof-of-concept implementation at the end of last year has recently been summarized here, and the extended case study (including a description of the hybrid cloud architecture) is freely available on request by e-mail.
More Articles...
- UberCloud Webinar with Microsoft (2020/10/26)
- Hybrid Cloud interactive HPC Applications on Kubernetes (2020-03-19)
- Univa Grid Engine 8.4.1 Released (2016-07-19)
- Update of the Univa Grid Engine Vagrant integration (2016-06-15)
- Univa Tech Days and Upcoming Univa Grid Engine Webinar (2016-06-14)
- Univa Grid Engine 8.4 Release (2016-06-14)
- Free Univa Webinars in April (2016-4-4)
- Webinar: High Performance Computing in the Cloud? (2016/02/01)
- Univa is Founding Member of Cloud Native Computing Foundation (2015-07-21)
- Univa Tech Day - Gothenburg - 17th of March (2015-03-12)
- Univa Tech Days 2015 (2015-02-06)
- Univa Grid Engine 8.2.1 Available (2014-12-15)
- Webinar About Using Coprocessors in Univa Grid Engine (2014-11-24)
- New Grid Engine Trainings (2014-07-21)
- Univa Interview at ISC 2014 in Leipzig (2014-07-02)
- Schlumberger's ECLIPSE Integrates Univa Grid Engine (2014-06-20)
- Webcast about Workload and Resource Management (2014-06-12)
- Univa Grid Engine meets Sahara Force India Formula 1 (2014-06-04)
- SoGE 8.1.7 released (2014-06-03)
- UniCloud Explained in May (2014-04-28)
- The cgroups Grid Engine Webinar Available for Download (2014-04-12)
- Free Webinar about Univa UniSight on April, 16th 2014 (2014-04-12)
- Univa Grid Engine Forums 2014 (2014-02-18)
- This Wednesday: Webinar about Upgrading to Univa Grid Engine (2014-03-03)
- Tomorrow: Free Webinar about cgroup integration in Univa Grid Engine (2014-02-24)
- Grid Engine Training Locations for 2014 (2014-02-05)
- Univa Grid Engine 8.1.7 Released (2014-01-15)
- Article about 20th Anniversary of Grid Engine (2013-11-21)
- Univa got Editors' Choice Award for Top 5 Vendors to Watch (2013-11-20)
- Visit Univa at Supercomputing 2013 in Denver (2013-11-16)
- Slidecast about Univa and Acquisition of Grid Engine Assets from Oracle (2013-11-06)
- Great Day for Grid Engine and Univa - Univa got Copyrights of Grid Engine Code from Oracle (2013-10-23)
- Univa at HEPiX Fall 2013 Workshop (2013-10-17)
- Grid Engine Forum 2013 and Grid Engine Training (2013-10-10)
- Univa Grid Engine 8.1.6 is out! (2013-10-9)
- Grid Engine Training and Forum Series Continues 2 (2013-10-03)
- Grid Engine Forum Series Continues (2013-09-25)
- Grid Engine Forum 2013 and Grid Engine Training (2013-08-01)
- Son of Grid Engine 8.1.4 Released (2013-09-06)
- License Orchestrator - Why License Management is important (2013-08-28)
- Univa Announces Partnership with MapR (2013-07-23)
- insideHPC Technical Computing Survey (2013-03-25)
- Grid Engine in 2013 (2013-03-20)
- Son of Grid Engine 8.1.3 released (2013-02-27)
- Free Archimedes / Univa Webinar in March about Shared Hadoop Infrastructures (2013-02-25)
- Interview with Univa CEO about Grid Engine's ARM, License Orchestrator, and Hadoop Support (2012-02-16)
- Univa Announces Grid Engine Support for ARM-Servers - Partnership with Calxeda (2013-02-14)
- A First Outlook on the Univa Grid Engine License Orchestrator (2012-02-13)
- Univa Announces Grid Engine 8.1.3 at SC 2012 (2012-11-13)
- Grid Engine in the News: "4 Ways to Create Business Value in a Bad Economy With Infrastructure Transformation" (2012-10-26)
- Grid Engine in the News: "Big data projects: Is the hardware infrastructure overlooked?" (2012-10-18)
- Interested in Grid Engine? Join us on October 1–2, 2012 in Regensburg (Germany)
- Univa is Hiring!
- Grid Engine in the News: "Managing MapReduce Applications in a Shared Infrastructure" (2012-09-26)
- Grid Engine in the News: "Grid Engine: Running on All Four Cylinders" (2012-09-25)
- Son of Grid Engine 8.1.2 released
- Univa Grid Engine 8.1 Available for Public Download (2012-08-20)
- Grid Engine Evolution Summit 2012 (2012-07-25)
- Univa Grid Engine 8.1 Enhancement (Part 7): Univa Grid Engine Job Classes (2012-06-29)
- Univa is filling Sun gap in EDA industry (2012-06-29)
- ISC '12 - Univa Booth (2012-06-18)
- Interview With Univa CEO (2012-06-12)
- Son of Grid Engine Released Version SoG 8.1.0 with Security Fix and 5 New Bug Fixes since 8.0e (2012-06-12)
- Univa Grid Engine 8.1.0 in the News (2012-05-02)
- Univa Releases Results of HPC Survey (2012-03-15)
- DRMAA Version 2 Final Publication (2012-01-27)
- Univa Grid Engine Man Pages Added (2012-01-02)
- New Grid Engine Survey (2011-11-04)
- Univa Grid Engine 8.0.1 released
- Univa Grid Engine 8.0.1 reached beta state
- Son of Grid Engine publishes binaries
- Univa offers free Univa Grid Engine Trial version!
- Univa Grid Engine On Demand RightScale Webinar
- Univa Grid Engine Summer Summit 2011