Project Description

WebFamulus is a web application that uses the Quartz scheduler framework for process automation. It provides a web interface to schedule, monitor, and interact with Java processes that are configured as 'jobs'. Access to user and administrative interfaces is based on permissions and group ownership. Likewise, the configuration of jobs, alerts, performance monitoring, and statistics is based on group ownership and permissions granted to an individual user. The application is currently under development, and a beta version is available to selected clients.


Typical applications for the Quartz framework are network and hardware monitors, database maintenance, reporting, etc. Tedious, time-consuming tasks can run repeatedly on sophisticated schedules that take into account selected dates and weekdays, black-out times, holidays, etc. The field of application, however, is limited only by what the programming language can provide: Quartz makes it possible to schedule and automatically execute just about anything that can be programmed in Java.


In addition, WebFamulus can execute jobs as sequences in which each step may (optionally) depend on the evaluation of the preceding job's exit status. Furthermore, the application is backed by an extensive database that lets non-administrative users create, configure, interact with, and monitor their own jobs, alerts, and reports on-line or via electronic mail.
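To make the idea of a job sequence more concrete, here is a minimal sketch in plain Java. It is not WebFamulus code and it does not use the Quartz API; the class JobSequence, its methods, and the job names are hypothetical and only illustrate how the optional evaluation of an exit status could decide whether the next job runs.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntSupplier;

// Hypothetical illustration only: none of these names come from WebFamulus or Quartz.
class JobSequence {

    // One step of the sequence: the job plus a flag that switches
    // exit-status evaluation on or off for this step.
    private static final class Step {
        final String name;
        final IntSupplier job;        // returns an exit status, 0 meaning success
        final boolean evaluateStatus; // if true, a non-zero status stops the sequence

        Step(String name, IntSupplier job, boolean evaluateStatus) {
            this.name = name;
            this.job = job;
            this.evaluateStatus = evaluateStatus;
        }
    }

    private final List<Step> steps = new ArrayList<>();

    // Appends a job to the sequence and returns this object so calls can be chained.
    JobSequence then(String name, IntSupplier job, boolean evaluateStatus) {
        steps.add(new Step(name, job, evaluateStatus));
        return this;
    }

    // Runs the jobs in order and returns the names of the jobs that were executed.
    List<String> run() {
        List<String> executed = new ArrayList<>();
        for (Step step : steps) {
            int status = step.job.getAsInt();
            executed.add(step.name);
            if (step.evaluateStatus && status != 0) {
                break; // evaluation is enabled and the job failed: skip the remaining jobs
            }
        }
        return executed;
    }

    public static void main(String[] args) {
        List<String> ran = new JobSequence()
                .then("collect", () -> 0, true)
                .then("parse", () -> 1, true)   // simulated failure
                .then("mail", () -> 0, true)
                .run();
        System.out.println(ran); // the "mail" job is not reached
    }
}
```

Setting the flag to false for a step lets the sequence continue regardless of that step's outcome, which corresponds to the "optional" part of the evaluation.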


2013-12-06

4. Using Quartz to Monitor SMART Hard Drives as an Automated Service

Processing Steps

Using the Quartz Scheduler to monitor SMART-enabled hard drives makes it possible to automate the entire process from data generation and collection to reporting. Key advantages are minimal user intervention and the compilation of a consistent long-term data set. Quartz provides a powerful and highly flexible scheduling framework that can execute hundreds or thousands of computing tasks repeatedly without user intervention. Once set up, a schedule will run until the scheduler is interrupted or shut down. This is especially advantageous for recurring and repetitive tasks such as monitoring hardware. When the data gathered by such tasks is not only evaluated for possible errors or faulty behavior but also stored on a long-term basis, it is possible to compile a record of 'normal operation' from which newly occurring anomalies stand out more clearly.

SMART data is generated, of course, by running smartctl with the appropriate options on the host machine (see 1. below). If the host machine is accessible via SSH, WebFamulus can execute smartctl itself and capture the output. Alternatively, a script (Python, Perl, etc.) installed on the host machine and configured to be executed by a utility program such as cron can run smartctl and forward the output via email or file upload to WebFamulus.
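The SSH variant can be sketched with nothing more than the standard ProcessBuilder class. This is only an assumption about how such a call might look, not the way WebFamulus actually does it: the class SmartctlRunner, the account monitor@host1, and the device /dev/sda are made up for the example, and the remote account is assumed to be allowed to run smartctl without a password prompt.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.List;

// Hypothetical helper; not part of WebFamulus, Quartz, or smartmontools.
class SmartctlRunner {

    // Builds the argument vector for: ssh <sshTarget> smartctl -a <device>
    // (-a asks smartctl to print all SMART information for the device).
    static List<String> buildCommand(String sshTarget, String device) {
        return List.of("ssh", sshTarget, "smartctl", "-a", device);
    }

    // Starts the command, merges stderr into stdout, and returns everything
    // the process printed. The exit status is deliberately not treated as an
    // error here, because smartctl reports its findings as a bit mask in it.
    static String run(List<String> command) throws IOException, InterruptedException {
        Process process = new ProcessBuilder(command)
                .redirectErrorStream(true)
                .start();
        String output = new String(process.getInputStream().readAllBytes(),
                StandardCharsets.UTF_8);
        process.waitFor();
        return output;
    }

    public static void main(String[] args) {
        // Only prints the command; call run(...) on a machine with SSH access.
        System.out.println(String.join(" ", buildCommand("monitor@host1", "/dev/sda")));
    }
}
```

Keeping the command construction separate from the execution makes it easy to log or test the exact command without touching a remote host.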

SMART-Data Processing

The Smartmontools daemon (smartd) can be configured to send out emails when a new error is detected in a logfile or when a SMART attribute is failing. However, error logs are only updated when self-tests are run and, hence, an error may only be detected after a test has been initiated. SMART-enabled devices usually can run two self-tests: short and long. A short test, which takes about two to three minutes, updates the SMART attributes, checks the electrical and mechanical parts, and performs spot checks on the disk surface. A long test, which scans the entire disk surface for errors, may take hours to complete, depending on the size of the disk. Hence, regular testing of hard drives is imperative in order to minimize the danger of data loss due to equipment failure.

WebFamulus uses the Quartz scheduler and an extensive background database to generate, store, and analyze SMART data. All steps are automated and run on a daily, weekly, or longer schedule but, through a web interface, they can also be executed on demand. Furthermore, some jobs are executed as sequences. For example, before a daily report is issued, a collector job gathers all SMART reports that have not been processed and initiates a parsing job; then graphs and summary reports are compiled, and a mailer job sends the report to the stakeholders. (That is, steps 2 to 6 are executed as a sequence before step 7 is initiated. Otherwise, step 2, for example, may be running independently all the time at regular intervals.)
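As an illustration of what a parsing job has to do, the following sketch extracts the attribute table from a smartctl report. It is not the WebFamulus parser; the class and field names are hypothetical, and it assumes the usual ten-column layout of the attribute table (ID#, ATTRIBUTE_NAME, FLAG, VALUE, WORST, THRESH, TYPE, UPDATED, WHEN_FAILED, RAW_VALUE). Lines that do not fit this layout are simply skipped.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical parser sketch; assumes the common ten-column attribute table.
class SmartAttributeParser {

    // The fields of one attribute row that are most useful for long-term storage.
    static final class Attribute {
        final int id;
        final String name;
        final int value;
        final int worst;
        final int threshold;
        final String raw;

        Attribute(int id, String name, int value, int worst, int threshold, String raw) {
            this.id = id;
            this.name = name;
            this.value = value;
            this.worst = worst;
            this.threshold = threshold;
            this.raw = raw;
        }
    }

    // Returns one Attribute per table row found in the report text.
    static List<Attribute> parse(String report) {
        List<Attribute> attributes = new ArrayList<>();
        for (String line : report.split("\\R")) {
            // Limit 10 keeps a raw value such as "36 (Min/Max 20/45)" in one piece.
            String[] f = line.trim().split("\\s+", 10);
            boolean looksLikeRow = f.length == 10
                    && f[0].matches("\\d+")
                    && f[3].matches("\\d+")
                    && f[4].matches("\\d+")
                    && f[5].matches("\\d+");
            if (!looksLikeRow) {
                continue; // header, log entries, blank lines, etc.
            }
            attributes.add(new Attribute(Integer.parseInt(f[0]), f[1],
                    Integer.parseInt(f[3]), Integer.parseInt(f[4]),
                    Integer.parseInt(f[5]), f[9]));
        }
        return attributes;
    }

    public static void main(String[] args) {
        // Made-up sample rows in the assumed layout.
        String sample =
                "ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE\n"
              + "  5 Reallocated_Sector_Ct   0x0033   100   100   010    Pre-fail  Always       -       0\n"
              + "194 Temperature_Celsius     0x0022   064   055   000    Old_age   Always       -       36 (Min/Max 20/45)\n";
        for (Attribute a : parse(sample)) {
            System.out.println(a.id + " " + a.name + " value=" + a.value + " raw=" + a.raw);
        }
    }
}
```

Once the rows are available as objects, storing them in a database table keyed by host, device, and time stamp is a straightforward next step.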

The advantages of regular testing and reporting are obvious: they establish a baseline of normal operation against which errors and faulty behavior are more easily noticed. A rise in the average working temperature, for example, could be an indication of a fan not working properly. This may go unnoticed if only occasional spot checks are done. With automated SMART testing and long-term data storage, the chances of early detection, before a serious hardware failure, are much higher.
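How a stored baseline helps can be shown with a small sketch. It compares the latest temperature reading with the mean of the stored history and flags a reading that lies more than k standard deviations above it. This is just one simple way to do it, not the method used by WebFamulus; the class name, the factor k, and the minimum spread of 1.0 degree are assumptions chosen for the example.

```java
// Hypothetical baseline check; names and thresholds are assumptions.
class TemperatureBaseline {

    // Guards against a near-zero spread when all stored readings are almost identical.
    static final double MIN_SPREAD = 1.0;

    static double mean(double[] values) {
        double sum = 0.0;
        for (double v : values) {
            sum += v;
        }
        return sum / values.length;
    }

    // Population standard deviation of the stored readings.
    static double stdDev(double[] values, double mean) {
        double sumSquares = 0.0;
        for (double v : values) {
            sumSquares += (v - mean) * (v - mean);
        }
        return Math.sqrt(sumSquares / values.length);
    }

    // True if the latest reading is more than k spreads above the historical mean.
    // With fewer than two stored readings there is no baseline, so nothing is flagged.
    static boolean isUnusuallyHigh(double[] history, double latest, double k) {
        if (history.length < 2) {
            return false;
        }
        double m = mean(history);
        double spread = Math.max(stdDev(history, m), MIN_SPREAD);
        return latest > m + k * spread;
    }

    public static void main(String[] args) {
        double[] history = {35, 36, 35, 34, 36, 35, 35}; // made-up daily readings in Celsius
        System.out.println(isUnusuallyHigh(history, 36, 3.0)); // within normal operation
        System.out.println(isUnusuallyHigh(history, 42, 3.0)); // well above the baseline
    }
}
```

A single spot check that returns 42 degrees says little on its own; it is the comparison with weeks of readings around 35 degrees that makes the value stand out.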

Yet even a small office may have dozens of computers to monitor, and testing and evaluating them can be a time-consuming task. Automating the process with a Quartz-based application like WebFamulus allows hundreds or thousands of hard drives to be monitored every day with minimal intervention beyond the initial set-up. Moreover, having test results stored in a database makes it possible to document past performance and maintenance, which is very useful for system administrators when hard disk monitoring is part of a service agreement.
