Saturday, May 1, 2010

Grid Monitoring Tool

Many distributed systems, such as grids and clouds, use Ganglia as their resource monitoring tool. The out-of-the-box solution that Ganglia provides is powerful and useful enough for an operations center. However, in terms of flexibility and ease of integration with external applications, it is not straightforward.

I designed and developed a web application that runs on top of Ganglia and adds a few enhanced features, listed below.
  1. Process the Ganglia gmetad XML output and save it to a relational database
  2. API and hooks for processing user-defined data in the gmetad XML
  3. Event triggering and configuration
  4. Threshold configuration
  5. Job monitoring and statistics
  6. New UI and data presentation that give a better overview of resource status
  7. Data mining and report generation
  8. User authentication and authorization
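
To give an idea of what the first feature involves, here is a minimal sketch in Python. It is not the code of the tool itself. It assumes the usual gmetad layout of GRID, CLUSTER, HOST and METRIC elements; the sample XML, the `metrics` table and the function names `parse_metrics` and `save_metrics` are made up for illustration, and SQLite stands in for whatever relational database is actually used.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical sample shaped like gmetad output (GRID > CLUSTER > HOST > METRIC).
SAMPLE_XML = """<GANGLIA_XML VERSION="3.1.7" SOURCE="gmetad">
  <GRID NAME="demo-grid" AUTHORITY="http://example.org/ganglia/" LOCALTIME="1272672000">
    <CLUSTER NAME="cluster-a" LOCALTIME="1272672000" OWNER="unspecified">
      <HOST NAME="node01" IP="10.0.0.1" REPORTED="1272671990">
        <METRIC NAME="load_one" VAL="0.42" TYPE="float" UNITS=""/>
        <METRIC NAME="mem_free" VAL="204800" TYPE="uint32" UNITS="KB"/>
      </HOST>
    </CLUSTER>
  </GRID>
</GANGLIA_XML>"""


def parse_metrics(xml_text):
    """Flatten the XML into (grid, cluster, host, reported, metric, value, units) rows.

    Only top-level grids are walked; nested grids are ignored in this sketch.
    """
    root = ET.fromstring(xml_text)
    rows = []
    for grid in root.findall("GRID"):
        for cluster in grid.findall("CLUSTER"):
            for host in cluster.findall("HOST"):
                reported = int(host.get("REPORTED"))
                for metric in host.findall("METRIC"):
                    rows.append((
                        grid.get("NAME"),
                        cluster.get("NAME"),
                        host.get("NAME"),
                        reported,
                        metric.get("NAME"),
                        metric.get("VAL"),   # kept as text; the type varies per metric
                        metric.get("UNITS"),
                    ))
    return rows


def save_metrics(conn, rows):
    """Create the (hypothetical) metrics table if needed and insert the rows."""
    conn.execute(
        """CREATE TABLE IF NOT EXISTS metrics (
               grid TEXT, cluster TEXT, host TEXT, reported INTEGER,
               metric TEXT, value TEXT, units TEXT)"""
    )
    conn.executemany("INSERT INTO metrics VALUES (?, ?, ?, ?, ?, ?, ?)", rows)
    conn.commit()


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    save_metrics(conn, parse_metrics(SAMPLE_XML))
    for row in conn.execute("SELECT host, metric, value, units FROM metrics"):
        print(row)
```

Once the data is in a table like this, the reports, thresholds and external integrations can work with plain SQL instead of re-parsing the XML each time.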

The following are a few screenshots of the monitoring tool.
Eagle Eye view - for management and showcases

Eagle Eye view - second-level view for each graph

Threshold and event triggering control panel

Event panel with acknowledgment of multiple events at once. It shows three severity levels and the number of recurrences.

Event detail, opened by double-clicking an event in the panel

Tree view using lazy loading, with three levels: grid, cluster and node.

Hot Spot view that lets the user group multiple clusters within a hot spot.
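
The idea behind this panel, thresholds that raise events with a severity and a recurrence counter, can be sketched roughly as below. This is only an illustration, not how the tool is actually implemented: the severity names (`minor`, `major`, `critical`), the threshold values and the `EventPanel` class are all assumptions.

```python
from dataclasses import dataclass

# Hypothetical names for the three severity levels, lowest to highest.
SEVERITIES = ("minor", "major", "critical")


@dataclass
class Event:
    host: str
    metric: str
    severity: str
    count: int = 1            # number of times the event has recurred
    acknowledged: bool = False


def classify(value, thresholds):
    """Return the highest severity whose threshold the value reaches, or None."""
    result = None
    for severity in SEVERITIES:
        if value >= thresholds[severity]:
            result = severity
    return result


class EventPanel:
    def __init__(self):
        self.events = {}  # keyed by (host, metric, severity)

    def record(self, host, metric, value, thresholds):
        """Raise a new event, or bump the counter of an open one."""
        severity = classify(value, thresholds)
        if severity is None:
            return None
        key = (host, metric, severity)
        event = self.events.get(key)
        if event is None or event.acknowledged:
            # First occurrence, or the previous one was already acknowledged.
            event = Event(host, metric, severity)
            self.events[key] = event
        else:
            event.count += 1
        return event

    def acknowledge(self, keys):
        """Acknowledge several events in one call."""
        for key in keys:
            if key in self.events:
                self.events[key].acknowledged = True


if __name__ == "__main__":
    cpu = {"minor": 70, "major": 85, "critical": 95}  # assumed thresholds
    panel = EventPanel()
    panel.record("node01", "cpu_user", 90, cpu)
    panel.record("node01", "cpu_user", 88, cpu)
    panel.record("node02", "cpu_user", 99, cpu)
    panel.acknowledge(list(panel.events))
    for event in panel.events.values():
        print(event)
```

Keeping one row per (host, metric, severity) with a counter is one way to stop a flapping metric from flooding the panel with identical events.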

Ganglia View - summary of grid and cluster

Ganglia View - cluster level summary

Ganglia View - node-level detail view

Assorted report panels

Snapshot Report, available at each of the three levels

Summary Report, available at each of the three levels

User's Job Accounting Report

Job Summary Report
