GSSIM User Guide

Introduction

Grid scheduling algorithms have been a subject of intense research over the last decade. However, evaluation and comparative analysis of these algorithms, as well as research experiments, are often difficult to perform. The reasons include difficulties in obtaining exclusive access to large-scale infrastructures for research purposes and the lack of certain functionality in real resource management systems, such as advance reservation (AR).

Therefore, taking into account the diversity of distributed resource management systems and the significant technical effort needed to establish large-scale grid environments, simulations are commonly used to evaluate candidate algorithms and architectures. However, most simulations are developed for a specific purpose by the developers of an algorithm, so they cannot be reused elsewhere. For this reason, several generic grid simulation tools were introduced, such as SimGrid and GridSim. Some of these tools provide a good basis for the implementation and simulation of a wide range of algorithms. Nevertheless, in most cases developers must implement whole experiments by themselves using just the basic functionality of the simulators. In consequence, setting up an experiment requires a lot of work and the result is rarely usable by others.

Additionally, in order to perform a reliable simulation experiment, researchers must cope with several issues. Workloads usually come from single, often independent local clusters, are collected under specific conditions, and do not contain information about workflows, co-allocation requests, etc. Moreover, available simulation environments usually do not allow simulating multiple autonomous scheduling elements, such as Grid and local schedulers. As a consequence of all these problems, researchers have limited opportunities to reuse and compare the results of their analyses.

To address these issues we introduce the Grid Scheduling Simulator (GSSIM), which provides an automated framework for the management of experiments related to grid resource management. GSSIM is based on GridSim, extending its basic functionality and adding further tools on top of it. These tools and add-ins enable flexible and automated management of a research experiment, including plugging scheduling algorithms into the simulated environment, modeling realistic workloads as well as adopting real traces, configuring the environment topology, and many other features described in detail in the following sections. The GSSIM framework is complemented by the GSSIM portal which, apart from generic information about the project, provides descriptions of workloads and resources, the possibility to generate these descriptions on-line, and implementations of various algorithms.

In this way GSSIM provides a comprehensive environment enabling researchers to test resource management algorithms and architectures, and exchange not only workloads but also results of experiments and implementations of algorithms.

GSSIM architecture and model

The GSSIM framework is based on the GridSim and Sim-Java2 packages. On top of GridSim it adds a layer with capabilities that enable easy and flexible modeling of Grid scheduling components. GSSIM also provides an advanced generator module using real and synthetic workloads. The overall architecture of GSSIM is illustrated in Figure 1.

Figure 1. Generic GSSIM architecture

GSSIM distinguishes between two types of scheduling components: Grid brokers and resource providers. As shown in Figure 1, multiple scheduling strategies may be plugged into both levels. Input data can be read from real sources or generated using the generator module.

One of the major GSSIM objectives is flexibility in terms of using a variety of Grid scheduling strategies and workloads. However, Grid jobs may have various shapes and levels of complexity, ranging from workflows, through large-scale parallel applications, to single tasks that require single resources. Depending on the type of Grid jobs, scheduling strategies may have different scope and need different input data. Therefore, to make the development of scheduling plugins easier on the one hand and keep it flexible on the other, GSSIM distinguishes several levels of information about incoming jobs. These levels are presented in Figure 2. We assume that there is a queue of jobs submitted to a Grid scheduler. Each job consists of one or more tasks. Thus, if precedence constraints are defined, a job may be a whole workflow.

Figure 2. Levels of information about jobs

Input data modeling

In general, input data in GSSIM consist of a single configuration file, a description of the workload and a description of the resources. Users may either generate new synthetic data or read existing synthetic data. Third-party real workloads can also be imported by GSSIM. If any parameters are missing after importing a workload, they can be generated by GSSIM and added.

Configuration file

The experiment configuration file has the typical Java resource bundle format. All available parameters and their interpretation are listed below.

  • gridschedulingpluginname - specifies which plugin is used to manage broker interface
  • exectimeestimationpluginname - specifies which plugin is used to estimate task execution time
  • localallocpolicypluginname - specifies which plugin is used to manage local policies
  • resdesc - path to file containing description of resources
  • networktopologyfilename - path to txt file containing description of network topology (optional)


  • readscenario.workloadfilename - path to workload file (in swf or gwf format)
  • readscenario.inputfolder - path to directory with xml job descriptions (optional)


  • createscenario.tasksdesc - path to xml file which describes the detailed workload generator configuration
  • createscenario.outputfolder - path to directory where all generated jobs will be placed
  • createscenario.workloadfilename - name of workload file in swf format which will be generated
  • createscenario.overwrite_files - determines if previously generated files should be overwritten. Two possible values of this field are: "true" and "false".


  • creatediagrams.processors - determines if processors gantt chart should be generated. Two possible values of this field are: "true" and "false".
  • creatediagrams.resources - determines if resources gantt chart should be generated. Two possible values of this field are: "true" and "false".
  • creatediagrams.tasks - determines if tasks gantt diagram should be generated. Two possible values of this field are: "true" and "false".
  • creatediagrams.taskswaitingtime - determines if tasks waiting time gantt diagram should be generated. Two possible values of this field are: "true" and "false".
  • creatediagrams.reservations - determines if reservations of tasks gantt diagram should be generated. Two possible values of this field are: "true" and "false".


  • createstatistics.accumulatedresources - specifies whether accumulated resources statistics should be created
  • createstatistics.extendedtasks - specifies whether extended tasks statistics should be created
  • createstatistics.gridlethistory - specifies whether history of gridlets should be created
  • createstatistics.jobs - specifies whether jobs (accumulated task) statistics should be created
  • createstatistics.simulation - specifies whether accumulated simulation statistics should be created
  • createstatistics.formatoutput - determines if generated statistic files should use tabular format (value "true") or semicolon separated format (value "false")


Possible values of the parameters from the createstatistics group above are: "true" and "false". By default, all createstatistics fields, except formatoutput, are set to "true".

Parameters from the readscenario and createscenario groups are mutually exclusive: use one group or the other, not both.

The path to this configuration file is the main argument of the GSSIM program. It is used as follows:

java simulator.GridSchedulingSimulator path/to/workload.properties
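For orientation, a minimal configuration that reads an existing workload might look like the fragment below. This is only a sketch: the parameter names come from the list above, but all plugin names and paths are hypothetical placeholders, and whether plugins are referenced by fully qualified class name is an assumption.

```properties
# Plugins (hypothetical names, assumed to be fully qualified class names)
gridschedulingpluginname=example.MyGridSchedulingPlugin
exectimeestimationpluginname=example.MyExecTimeEstimationPlugin
localallocpolicypluginname=example.MyLocalAllocPolicyPlugin

# Resource description (hypothetical path)
resdesc=example/resources.xml

# Read an existing workload; no createscenario.* parameters are set,
# because the two groups are mutually exclusive
readscenario.workloadfilename=example/workload.swf
readscenario.inputfolder=example/jobs

# Output options
creatediagrams.tasks=true
createstatistics.simulation=true
createstatistics.formatoutput=false
```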

Workload

The workload contains information about jobs, their structure, resource requirements, relationships, time intervals, etc. We assume a model in which each job consists of one or more tasks. A job may contain precedence constraints between tasks (a workflow). The next sections provide information on how workloads are described and generated in GSSIM.

Workload description

The workload used to perform an experiment consists of two parts: a required swf/gwf format file and an optional xml job description. Swf is the standard workload format, described by Dror Feitelson - see http://www.cs.huji.ac.il/labs/parallel/workload for details. The xml job description format is described by the xsd schema GrmsJobDescriptionSchema.xsd. This schema was borrowed from the Grid Resource Management System (GRMS), which is part of the Gridge initiative.

For better flexibility and functionality, GSSIM supports the following new swf header comments:

StartTime - defines moment in time when the simulation starts. If it is not provided, then "Thu Jan 01 01:00:00 CET 1970" is used.

Example:

;StartTime: Mon Nov 03 10:00:00 CET 2008

PUSpeed - defines processing unit speed. This value has no predefined unit. It is up to you how this field is interpreted, for example as instructions per second or millions of instructions per second. If it is not provided, the default value 1 is used. The value of PUSpeed is used to estimate the task length expressed in instructions, which is calculated as the product of the PUSpeed value and the Run Time and Number of Allocated Processors fields.

Example:

;PUSpeed: 1
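The task length estimate described above is a plain product of three numbers. The following sketch illustrates it; the class and method names are hypothetical and are not part of GSSIM.

```java
// Hypothetical helper for illustration; not part of the GSSIM code base.
final class TaskLengthEstimate {

    /** Task length = PUSpeed * Run Time * Number of Allocated Processors. */
    static long lengthInInstructions(long puSpeed, long runTimeSeconds, long allocatedProcessors) {
        // multiplyExact throws ArithmeticException on overflow instead of wrapping silently
        return Math.multiplyExact(Math.multiplyExact(puSpeed, runTimeSeconds), allocatedProcessors);
    }

    public static void main(String[] args) {
        // PUSpeed 1 (the default), run time 120 s, 4 allocated processors
        System.out.println(lengthInInstructions(1, 120, 4)); // prints 480
    }
}
```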

IDMapping - this section allows you to join multiple jobs from the swf file into a single job with multiple tasks. The IDMapping section consists of:

  • begin line: ;IDMapping: swfID:jobID:taskID
  • mapping between swf job id and new job and task id: ; id from swf file:new job id:task id
  • end line: ;IDMapping: end

Example:

Assume that the swf file contains two tasks with ids 1 and 2. You can create a new job with two tasks by defining the following mapping:

;IDMapping: swfID:jobID:taskID
; 1:4:10, 2:4:20
;IDMapping: end

A new job with id = 4, consisting of two tasks with ids 10 and 20, will be created.

The only constraint of the IDMapping section is that the swf jobs which will become tasks of a new job must occur in the swf file one after another. No other jobs are allowed between the swf jobs that are mapped to tasks of one new job.
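To make the section layout concrete, the sketch below reads the begin line, the mapping lines and the end line from a list of header lines. It is a hypothetical parser written for this guide, not the code GSSIM uses, and it does not check the adjacency constraint.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical parser for illustration; GSSIM's own parsing code may differ.
final class IdMappingParser {

    /** One mapping entry: an SWF job id and the job/task ids it is mapped to. */
    static final class Entry {
        final int swfId;
        final int jobId;
        final int taskId;

        Entry(int swfId, int jobId, int taskId) {
            this.swfId = swfId;
            this.jobId = jobId;
            this.taskId = taskId;
        }
    }

    /** Collects all swfID:jobID:taskID triples found between the begin and end lines. */
    static List<Entry> parse(List<String> headerLines) {
        List<Entry> entries = new ArrayList<>();
        boolean inside = false;
        for (String raw : headerLines) {
            String line = raw.trim();
            if (!line.startsWith(";")) {
                continue; // not an SWF header comment
            }
            String body = line.substring(1).trim();
            if (body.startsWith("IDMapping:")) {
                // ";IDMapping: end" closes the section, any other IDMapping line opens it
                inside = !body.substring("IDMapping:".length()).trim().equals("end");
                continue;
            }
            if (!inside || body.isEmpty()) {
                continue;
            }
            for (String triple : body.split(",")) {
                String[] parts = triple.trim().split(":");
                if (parts.length != 3) {
                    throw new IllegalArgumentException("Malformed mapping: " + triple);
                }
                entries.add(new Entry(Integer.parseInt(parts[0].trim()),
                        Integer.parseInt(parts[1].trim()),
                        Integer.parseInt(parts[2].trim())));
            }
        }
        return entries;
    }
}
```

Applied to the three header lines of the example above, the parser yields two entries: swf job 1 becomes task 10 of job 4, and swf job 2 becomes task 20 of job 4.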

The experiment can be executed using a single swf file or an swf file with an xml extension. If a single swf file is used, task requirements such as cpu count and requested memory are read directly from the swf file. Note that the information included in the swf file is insufficient for using advance reservation in a scheduling algorithm. To do so, you must provide the xml extension of each job description and fill in its executionTime section. In the xml files you can use any ids for jobs and tasks, but you must provide a correct IDMapping section (in the swf file header) between the xml job/task ids and the swf job ids. Otherwise, task start-up parameters such as submit time or task length in instructions will not be calculated correctly. If an xml job description is used, task requirements are read from the xml description instead of the swf file.

Workload generation

The main goal of the workload design was to ensure that all job descriptions which were used in a real resource management system like GRMS, or obtained from an swf/gwf log, can be used to perform experiments in the GSSIM simulator. However, it may be difficult for some users to obtain such workloads, therefore a workload generator was created.

The workload generator allows you to create any number of jobs and tasks with sophisticated resource and time requirements. The results of the generation process are: the desired number of job descriptions in xml format and an swf file with job descriptions and all necessary header parameters (see workload description for details).

Configuration options are provided by two files:

  • *.properties file, which should provide values of all parameters from createscenario group and resdesc parameter (see configuration file for details)
  • xml configuration file, which is described by xsd schema WorkloadSchema.xsd and contains configuration of random numbers generators, used to create job/task/workload parameters.

The properties file is described above. The remainder of this section describes all elements of the xml configuration file.

Main workload configuration elements:

SimulationStartTime

Defines the start time of the simulation in human readable form. The value should be provided in xsd dateTime format, as in the example below. See www.w3.org for details.

Example:

 <SimulationStartTime>2009-01-15T10:00:00</SimulationStartTime>

JobCount

Defines the number of jobs to be generated. This element is used as an alternative to <SimulationTime/>.

<JobCount/> is an element of type RandParams.

The following example shows how to create exactly 100 jobs:

 <JobCount avg="100" distribution="constant"/>

SimulationTime

Defines the length of the simulation. The generator will create as many jobs as can be executed in order during SimulationTime. The value should be provided in xsd duration format. See www.w3.org for details. This element is used as an alternative to <JobCount/>.

TaskCount

Defines number of tasks in each job.

<TaskCount/> is an element of type RandParams.

The following example shows how to configure the generator to create a minimum of 1 and a maximum of 10 tasks in each job. The average number of tasks in a job will be 5, with standard deviation 3.0 and normal distribution.

 <TaskCount avg="5" min="1" max="10" stdev="3.0" distribution="normal"/>

TaskLength

Defines the length of the task in number of instructions. This value is translated into seconds on the assumption that a task of this length is executed on a single processor, the slowest one. The minimum processor speed is taken as the minimum value of the cpuspeed host parameters from the resource description. The resource description file is specified by the resdesc parameter in the *.properties file.

<TaskLength/> is an element of type RandParams.

The following example shows how to configure the generator to create tasks of minimum 500 and maximum 1500 instructions. The average length of all tasks will be 1000 instructions, with standard deviation 500.0 and normal distribution.

 <TaskLength avg="1000" min="500" max="1500" stdev="500.0" distribution="normal"/>

The values of all fields in the swf file that are expressed in seconds are calculated by dividing the task length in instructions by the minimum speed of a single processor. If the minimum processor speed is 2, then the value of the runtime field in the swf file for a task of length 764 instructions will be calculated as 764/2 = 382 seconds.
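The conversion can be sketched as follows. The class is a hypothetical illustration, not GSSIM code, and the source does not state how non-integer results are rounded, so the sketch simply returns a floating-point value.

```java
// Hypothetical helper for illustration; not part of the GSSIM code base.
final class RuntimeFromLength {

    /** Smallest cpuspeed value among the hosts of the resource description. */
    static double minSpeed(double[] hostCpuSpeeds) {
        if (hostCpuSpeeds.length == 0) {
            throw new IllegalArgumentException("At least one host is required");
        }
        double min = hostCpuSpeeds[0];
        for (double speed : hostCpuSpeeds) {
            min = Math.min(min, speed);
        }
        return min;
    }

    /** Run time in seconds = task length in instructions / minimum processor speed. */
    static double runtimeSeconds(double lengthInInstructions, double[] hostCpuSpeeds) {
        return lengthInInstructions / minSpeed(hostCpuSpeeds);
    }

    public static void main(String[] args) {
        // Hosts with cpuspeed 2, 4 and 8: the slowest one (2) determines the run time.
        System.out.println(runtimeSeconds(764, new double[] {2, 4, 8})); // prints 382.0
    }
}
```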

JobPackageLength

Defines the number of jobs which have the same submit time. Tasks which belong to one job always have the same submit time. In the swf file, submit time is interpreted as the number of seconds after the simulation start time.

<JobPackageLength/> is an element of type RandParams.

JobInterval

Defines the time between submissions of successive jobs. In other words, this is the difference between the submission times of two successive jobs. The value is expressed in seconds.

<JobInterval/> is an element of type RandParams.

ComputingResourceHostParameter

Defines the generator which creates the <hostParameter/> element in the task resource requirements section. See GrmsJobDescriptionSchema.xsd for a detailed description of task resource requirements.

This element requires an attribute named metric. The value of this attribute is passed to the task <hostParameter/> element as the value of its name attribute. The possible values of the <hostParameter/> name attribute, and thereby of the <ComputingResourceHostParameter/> metric attribute, are: osname, ostype, puarch, osversion, osrelease, memory, freememory, cpucount, freecpus, cpuspeed, application, diskspace, freediskspace, remoteSubmissionInterface, localResourceManager, hostname. By default, the simulation uses the values of the cpucount and memory host parameters. The others are currently ignored.

<ComputingResourceHostParameter/> is an element of type RandParams.

Preferences

This is a complex element designed to describe the <preferences/> section in task requirements. See GrmsJobDescriptionSchema.xsd for a detailed description of task resource requirements.

The <preferences/> element consists of a list of <parameter/> elements. Each parameter must provide <name/>, <importance/>, <optimizationType/> and <value/> elements. <endpoint/> is optional. The values of these elements are passed to the attributes of the parameter element in the task resource requirements section. The names of the parameter child elements and of the task parameter attributes are the same.

A workload generated only for simulation purposes does not require the <parameter/> section in task resource requirements, therefore the <preferences/> element may be skipped in the xml workload configuration.

<importance/> and <value/> are elements of type RandParams.

ExecutionTime

This complex element was designed to describe the <ExecutionTime/> section in the task description. See GrmsJobDescriptionSchema.xsd for details.

<ExecutionTime/> consists of four child elements with the following interpretation:

  • <execDuration/> - describes the user's expectation of how long the task is. It differs from the <TaskLength/> element described earlier, which defines the real length of the task. The value of this element is interpreted as a number of instructions and is translated into seconds in the same way as <TaskLength/>.
  • <periodStart/> - defines the point in time from which task execution can be started. The value of this element is interpreted as the number of seconds after <SimulationStartTime/>.
  • <periodEnd/> - defines the point in time by which task execution must end. The value of this element is interpreted as the number of seconds after <SimulationStartTime/>. It can be used as an alternative to <periodDuration/>.
  • <periodDuration/> - defines the number of seconds after which task execution must end. This element can be used as an alternative to <periodEnd/>.

<execDuration/>, <periodStart/>, <periodEnd/> and <periodDuration/> are elements of type RandParams.

The <ExecutionTime/> element is optional in the xml workload configuration. However, if advance reservation is used, then <periodStart/> and <periodEnd/> (or <periodDuration/>) are used as the begin and end time of the reservation.

PrecedingConstraints

This element allows you to create a task workflow. Currently it is not supported in the simulation process.

RandParams

RandParams represents a set of attributes and elements used to configure a random number generator and the way it is used. The attributes of the RandParams type can be divided into two groups:

  • defining statistics - the following attributes are constraints which must be satisfied by the set of numbers created by the generator:

avg - average value.

stdev - standard deviation.

min - minimum value.

max - maximum value.

seed - number which initializes the generator.

distribution - distribution of the generated set of numbers; possible values are: constant, normal, poisson, uniform, exponential, gamma, harmonic.

  • defining dependency - the following attributes are used to define a dependency between any elements in the xml configuration file:

id - element identifier, which must be unique in the entire file. This attribute is required if the value of the containing element is referenced by another element.

refElementId - identifier of the element which is referenced by the containing element.

expr - defines the dependency function. The independent variable x is the value of the element pointed to by the refElementId attribute. The expression may have any form accepted by the BeanShell interpreter. In general, all mathematical operators such as +, -, *, / and parentheses ( ) can be used.

Example:

 <ComputingResourceHostParameter metric="memory">
       <value id="memory" refElementId="cpuspeed" expr="x*100"/>
 </ComputingResourceHostParameter>
 <ComputingResourceHostParameter metric="cpuspeed">
       <value id="cpuspeed" refElementId="cpucnt" expr="x+10"/>
 </ComputingResourceHostParameter>
 <ComputingResourceHostParameter metric="cpucount">
       <value id="cpucnt" avg="5" min="2" max="10" stdev="3.0" seed="21" distribution="normal"/>
 </ComputingResourceHostParameter>

The values above are resolved in the following order: cpucnt -> cpuspeed -> memory.

The value of cpucnt is calculated from the generator parameters.

The value of cpuspeed is calculated as the cpucnt generator result plus 10: cpuspeed = cpucnt + 10

The value of memory is calculated as the cpuspeed value multiplied by 100: memory = (cpucnt + 10) * 100
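The resolution order can be illustrated with a small sketch. It is not GSSIM code: the class is hypothetical, and plain Java lambdas stand in for the BeanShell expressions given in the expr attribute. A circular chain of references is not detected here and would recurse without end.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.DoubleUnaryOperator;

// Hypothetical resolver for illustration; lambdas replace BeanShell expressions.
final class DependencyResolver {
    private final Map<String, String> refs = new HashMap<>();
    private final Map<String, DoubleUnaryOperator> exprs = new HashMap<>();
    private final Map<String, Double> values = new HashMap<>();

    /** Registers a value produced directly by a random number generator. */
    void generated(String id, double value) {
        values.put(id, value);
    }

    /** Registers an element whose value is expr(x), where x is the referenced element. */
    void dependent(String id, String refElementId, DoubleUnaryOperator expr) {
        refs.put(id, refElementId);
        exprs.put(id, expr);
    }

    /** Resolves the referenced element first, then applies the expression to it. */
    double resolve(String id) {
        Double known = values.get(id);
        if (known != null) {
            return known;
        }
        String ref = refs.get(id);
        if (ref == null) {
            throw new IllegalArgumentException("Unknown element id: " + id);
        }
        double result = exprs.get(id).applyAsDouble(resolve(ref));
        values.put(id, result);
        return result;
    }

    public static void main(String[] args) {
        DependencyResolver resolver = new DependencyResolver();
        resolver.dependent("memory", "cpuspeed", x -> x * 100);
        resolver.dependent("cpuspeed", "cpucnt", x -> x + 10);
        resolver.generated("cpucnt", 5); // assume the generator returned 5
        System.out.println(resolver.resolve("memory")); // prints 1500.0
    }
}
```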

Nonlinear functions can also be defined:

 <ComputingResourceHostParameter metric="memory">
       <value id="memory" refElementId="cpuspeed" expr="x*x"/>
 </ComputingResourceHostParameter>

It is possible to combine a generator definition with a dependency definition. In such a case, the value of the element is calculated according to the function from the expr attribute, and the result is then added to the value calculated by the random number generator.

Example:

 <ComputingResourceHostParameter metric="cpuspeed">
       <value id="cpuspeed" refElementId="cpucnt" expr="x+10" avg="15" min="10" max="20" stdev="3.0" distribution="normal"/>
 </ComputingResourceHostParameter>
 <ComputingResourceHostParameter metric="cpucount">
       <value id="cpucnt" avg="5" min="2" max="10" stdev="3.0" seed="21" distribution="normal"/>
 </ComputingResourceHostParameter>

If cpucnt = 5, then cpuspeed may be, for example, 32. Explanation: the dependency expression returns 5 + 10 = 15, and the cpuspeed random number generator creates some value, for example 17, so the result is 15 + 17 = 32.


In addition to the attributes listed above, the RandParams type allows you to define two child elements, which can be used to define a different generator configuration for some time period or for some percentage of generated values.

  • PeriodicValidValues - the attributes of this element define a generator configuration. The <BeginValidTime/> child element defines the start of the period in which this generator configuration applies. The <EndValidTime/> child element defines the end of this period. For all time periods which are not covered by an interval between <BeginValidTime/> and <EndValidTime/>, the generator defined in the enclosing element applies.

Example:

 <execDuration avg="5" min="1" max="10" stdev="3">
   <PeriodicValidValues avg="10" min="1" max="20" stdev="5">
      <BeginValidTime>1970-01-01T01:10:00</BeginValidTime>
      <EndValidTime>1970-01-01T01:20:00</EndValidTime>
   </PeriodicValidValues>
   <PeriodicValidValues avg="20" min="1" max="40" stdev="18">
      <BeginValidTime>1970-01-01T01:30:00</BeginValidTime>
      <EndValidTime>1970-01-01T01:50:00</EndValidTime>
   </PeriodicValidValues>
 </execDuration>

Let us assume that the simulation starts at 1970-01-01T01:00:00 and ends at 1970-01-01T02:00:00. There are three different generator configurations, one on the <execDuration/> level and two on the <PeriodicValidValues/> level. The interpretation of this configuration is as follows: the average execution duration for tasks submitted between 01:10:00 and 01:20:00 equals 10; the average execution duration for tasks submitted between 01:30:00 and 01:50:00 equals 20. The average execution duration for tasks submitted in any other time period ([01:00:00, 01:10:00], [01:20:00, 01:30:00], [01:50:00, 02:00:00]) equals 5.
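The selection of the applicable generator configuration can be sketched as a lookup by submit time. The class below is hypothetical and only tracks the avg attribute. How GSSIM treats a submit time that falls exactly on a period boundary is not stated here, so the sketch assumes that a period includes its begin time and excludes its end time.

```java
import java.time.LocalDateTime;
import java.util.ArrayList;
import java.util.List;

// Hypothetical lookup for illustration; not part of the GSSIM code base.
final class PeriodicAverage {

    private static final class Period {
        final LocalDateTime begin;
        final LocalDateTime end;
        final double avg;

        Period(LocalDateTime begin, LocalDateTime end, double avg) {
            this.begin = begin;
            this.end = end;
            this.avg = avg;
        }
    }

    private final double defaultAvg; // avg of the enclosing element
    private final List<Period> periods = new ArrayList<>();

    PeriodicAverage(double defaultAvg) {
        this.defaultAvg = defaultAvg;
    }

    /** Adds one PeriodicValidValues entry; times use the same format as the xml example. */
    PeriodicAverage period(String beginValidTime, String endValidTime, double avg) {
        periods.add(new Period(LocalDateTime.parse(beginValidTime),
                LocalDateTime.parse(endValidTime), avg));
        return this;
    }

    /** Returns the avg that applies to a task submitted at the given time. */
    double avgAt(String submitTime) {
        LocalDateTime t = LocalDateTime.parse(submitTime);
        for (Period p : periods) {
            // Assumption: begin is inclusive, end is exclusive.
            if (!t.isBefore(p.begin) && t.isBefore(p.end)) {
                return p.avg;
            }
        }
        return defaultAvg;
    }

    public static void main(String[] args) {
        PeriodicAverage execDuration = new PeriodicAverage(5)
                .period("1970-01-01T01:10:00", "1970-01-01T01:20:00", 10)
                .period("1970-01-01T01:30:00", "1970-01-01T01:50:00", 20);
        System.out.println(execDuration.avgAt("1970-01-01T01:15:00")); // prints 10.0
        System.out.println(execDuration.avgAt("1970-01-01T01:25:00")); // prints 5.0
    }
}
```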

  • MultiDistribution - allows you to define different generator configurations for some percentage of generated values.

Example:

 <execDuration >
   <MultiDistribution>
      <dist avg="5" min="1" max="10" stdev="3" distribution="normal">0.3</dist>
      <dist avg="5" min="1" max="10" stdev="3" distribution="uniform">0.5</dist>
      <dist avg="5" min="1" max="10" stdev="3" distribution="poisson">0.2</dist>
   </MultiDistribution>
 </execDuration>

The interpretation of the above configuration is as follows: the generator described by the normal distribution is used for 30% of generated values, uniform for 50% and poisson for 20%. The percentage values can be interpreted as the probability of using the particular generator.

If MultiDistribution is used, generator configuration attributes (avg, min, max, etc.) must not be used in the enclosing element. In such a case, the values of all <dist/> elements must sum up to 1.0.
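One common way to realize such a probabilistic choice is to compare a uniform random draw against the cumulative shares. The sketch below shows that technique; it is a hypothetical illustration and not necessarily how GSSIM selects the generator.

```java
// Hypothetical selector for illustration; not part of the GSSIM code base.
final class MultiDistributionSelector {
    private final double[] shares;

    /** Takes the values of the <dist/> elements, which must sum up to 1.0. */
    MultiDistributionSelector(double... shares) {
        double sum = 0.0;
        for (double share : shares) {
            sum += share;
        }
        if (shares.length == 0 || Math.abs(sum - 1.0) > 1e-9) {
            throw new IllegalArgumentException("Shares must sum up to 1.0");
        }
        this.shares = shares.clone();
    }

    /** Maps a uniform draw u from [0, 1) to the index of the <dist/> element to use. */
    int pick(double u) {
        double cumulative = 0.0;
        for (int i = 0; i < shares.length; i++) {
            cumulative += shares[i];
            if (u < cumulative) {
                return i;
            }
        }
        return shares.length - 1; // guards against floating-point rounding
    }

    public static void main(String[] args) {
        // Shares from the example: normal 0.3, uniform 0.5, poisson 0.2
        MultiDistributionSelector selector = new MultiDistributionSelector(0.3, 0.5, 0.2);
        System.out.println(selector.pick(0.1)); // prints 0 (normal)
        System.out.println(selector.pick(0.5)); // prints 1 (uniform)
        System.out.println(selector.pick(0.9)); // prints 2 (poisson)
    }
}
```

In a simulation, the draw u would typically come from java.util.Random.nextDouble().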

Resources

Resource description

Resources in GSSIM are also described using an XML-based format.

GrmsHostParametersSchema.xsd (binding file)

Energy usage concept

The implementation of the energy usage concept is based on resource energy profiles. Different parts of a computing system contribute differently to the total amount of energy consumed by a resource. Therefore, GSSIM enables expressing this contribution in the form of an energy profile, which characterizes parts of the computing system: processors and computing nodes.

The main aim of the energy profile is to calculate the amount of energy used by the whole computing resource or by a resource component. This calculation should be based on the current resource state and on the voltage and frequency configuration. Furthermore, each calculation is performed in the context of a specific task (resource load), so advanced and application-specific energy consumption models can be implemented.

Besides energy consumption estimation, the resource energy profile allows you to manage the resource energy state and configuration parameters. Four basic states are predefined: ON, OFF, SLEEP, HIBERNATE. New states can be defined by extending the EnergyState enumeration class. During the simulation it is also possible to change the voltage and frequency of components. This is particularly relevant for processor management, but may be less important in the context of a whole computing node. Default values of the energy profile parameters must be provided by the class which implements the EnergyInterface interface.

Energy interface description

Methods and their descriptions:

  • EnergyState getEnergyState() - returns the current energy state of the resource unit.
  • boolean setEnergyState(EnergyState state) - changes the current energy state of the resource unit. This method should return false if the transition from the current state to the new one is incorrect, e.g. for a processor the transition from state OFF to SLEEP is incorrect.
  • int getFrequency() - returns the current frequency.
  • boolean setFrequency(int freq) - changes the current frequency. Should return false if the new frequency is outside the acceptable interval.
  • int getVoltage() - returns the current voltage.
  • boolean setVoltage(int voltage) - changes the current voltage. Should return false if the new voltage is outside the acceptable interval.
  • int getEnergyConsumption() - calculates the energy consumed by the resource in its current state.
  • int getIdleEnergyConsumption() - calculates the energy consumed by the resource working without any load (idle).
  • EnergyInterface clone() - returns a copy of the current object.
  • void visit(...) - this method should accept the resource unit type for which the EnergyInterface implementation is provided.

To create a new, specific implementation of a resource unit energy profile, it is recommended, purely for coding convenience, to extend AbstractEnergyProfile rather than to implement the EnergyInterface interface directly.
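The contract of the state and frequency setters can be sketched with a self-contained class. Everything in it is a stand-in: it does not extend the real AbstractEnergyProfile or use the real EnergyState, whose exact definitions are not reproduced in this guide, and the frequency bounds and consumption model are arbitrary placeholder values chosen for the illustration.

```java
// Stand-in for illustration only; the real GSSIM types may differ in detail.
final class DemoCpuEnergyProfile {

    /** Stand-in for the four predefined energy states. */
    enum State { ON, OFF, SLEEP, HIBERNATE }

    // Placeholder values in arbitrary units (assumptions, not GSSIM defaults).
    private static final int MIN_FREQUENCY = 800;
    private static final int MAX_FREQUENCY = 3000;
    private static final int IDLE_CONSUMPTION = 50;

    private State state = State.ON;
    private int frequency = MAX_FREQUENCY;

    State getEnergyState() {
        return state;
    }

    /** Rejects OFF -> SLEEP, the incorrect processor transition mentioned above. */
    boolean setEnergyState(State newState) {
        if (state == State.OFF && newState == State.SLEEP) {
            return false;
        }
        state = newState;
        return true;
    }

    int getFrequency() {
        return frequency;
    }

    /** Returns false and leaves the frequency unchanged if it is out of range. */
    boolean setFrequency(int freq) {
        if (freq < MIN_FREQUENCY || freq > MAX_FREQUENCY) {
            return false;
        }
        frequency = freq;
        return true;
    }

    int getIdleEnergyConsumption() {
        return IDLE_CONSUMPTION;
    }

    /**
     * Placeholder model: idle cost plus a term linear in frequency while ON.
     * Every other state is treated as consuming nothing, which is a simplification.
     */
    int getEnergyConsumption() {
        if (state != State.ON) {
            return 0;
        }
        return IDLE_CONSUMPTION + frequency / 100;
    }
}
```

A real profile would also cover voltage, clone() and visit(...), and would compute consumption in the context of the task being executed.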

Network topology

Network topology description

The network topology file is a txt file which contains information about routers (NETWORK NODES), users and resources (GRID SITES), and the connections between them (LINKS).

It has the following form:

# specify the number of routers

number_of_routers

# specify the name of each router

router_name1

router_name2

router_name3

... // other router names

# specify the number of connections between routers

number_of_connections_between_routers

# specify the connection between routers

router_name1 router_name2 baud_rate(GB/s) prop_delay(ms) mtu(byte)

router_name1 router_name3 baud_rate(GB/s) prop_delay(ms) mtu(byte)

... // linking other routers

# specify the number of resources

number_of_resources

# specify the connection between resource and router

resource_name router_name baud_rate(GB/s) prop_delay(ms) mtu(byte)

... // linking other resources

# specify the number of users(at present must equal 1)

number_of_users

# specify the connection between user and router(at present user_name must be named "BrokerInterfaceEntity")

user_name router_name baud_rate(GB/s) prop_delay(ms) mtu(byte)
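A filled-in file following this template might look as shown below. It is only a sketch: the router and resource names and all numeric values are hypothetical, and it assumes that blank lines between entries are optional.

```text
# specify the number of routers
2
# specify the name of each router
Router1
Router2
# specify the number of connections between routers
1
# specify the connection between routers
Router1 Router2 1.0 10 1500
# specify the number of resources
1
# specify the connection between resource and router
Resource1 Router1 1.0 5 1500
# specify the number of users(at present must equal 1)
1
# specify the connection between user and router
BrokerInterfaceEntity Router2 1.0 5 1500
```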


Names of connections are created according to the following rules:

The name of a link between routers is the combination of the router names (in definition order, separated by "_") followed by "_link". The names of other connections are formed by appending "_link" to the node name.

Each resource has to be defined in the appropriate resource description file.
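The naming rules for connections can be written down in a few lines. The helper below is hypothetical and only mirrors the two rules stated above; the router names are made up.

```java
// Hypothetical helper for illustration; not part of the GSSIM code base.
final class LinkNames {

    /** Router-to-router link: both router names in definition order, then "_link". */
    static String routerLink(String firstRouter, String secondRouter) {
        return firstRouter + "_" + secondRouter + "_link";
    }

    /** Any other connection (resource or user): node name followed by "_link". */
    static String nodeLink(String nodeName) {
        return nodeName + "_link";
    }

    public static void main(String[] args) {
        System.out.println(routerLink("Router1", "Router2")); // prints Router1_Router2_link
        System.out.println(nodeLink("BrokerInterfaceEntity")); // prints BrokerInterfaceEntity_link
    }
}
```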

Scheduling interfaces

This section describes the scheduling interfaces at the Grid and local levels. The forecast interface is also presented. Each scheduling plugin must implement one of the presented interfaces.

Grid scheduler interface

This interface simulates the environment of a Grid scheduler. It provides all the information needed to schedule jobs in Grids and requires the implementation of the basic functionality expected from Grid schedulers. The major method of the interface is schedule, which handles the different types of events and performs scheduling when a specific event occurs. This method enables implementing various scheduling strategies: off-line scheduling for whole sets of incoming jobs, dynamic scheduling based on specific events, periodic rescheduling, etc. The following events relevant for a Grid scheduler have been considered in GSSIM:

  • TIMER
  • JOB_ARRIVED
  • TASK_ARRIVED
  • TASK_CANCELED
  • TASK_FAILED

The events TIMER and JOB_ARRIVED are of particular importance. TIMER is used to enable periodic scheduling and JOB_ARRIVED is raised when new jobs arrive. The interface method returns scheduling decisions that contain information about the assignment of tasks to selected resources. In the case of scheduling based on advance reservation, reservation identifiers are also included in the scheduling decision.

GridSchedulingPlugin is a basic abstract interface considering Grid scheduler. It consists of the following methods: schedule, getPluginName, initPlugin, getConfiguration. Schedule method is called when event declared in plugin configuration occurs. Frequency of TIMER event can be set by adding an Integer value for key TIMER in plugin configuration. This value is measured in seconds. Parameters of methods provide information necessary to prepare scheduling plan.

Two queues are distinguished: job and task queue. Job queue is the queue of uncompleted jobs (job has not been scheduled yet). Instead of the job queue, developer can use the task queue which contains tasks ready for execution (without preceding constraints). In addition to the information about incoming jobs, a Grid scheduler needs knowledge about the environment. Resources, network topology, reservations and estimation of runtime and resource requirements are given in parameters. A reservation entity provides information about reservations and allows a Grid scheduling plugin to request and negotiate specific reservations. Particular strategies may use various levels of knowledge about resources.

Basic scheduling strategies base their decisions on very limited information while more advanced algorithms can apply knowledge about running jobs, performance predictions, and network topology to schedule jobs efficiently. A prediction strategies can be developed using a Prediction class, where run time of task is predicted based on task length and resource description. Grid scheduler may perform best-effort scheduling based on available information about resources or apply scheduling with QoS by negotiating offers from resource providers. QoS is done by using ReservationManager. This interface provide getting offers from Local Schedulers and management of reservations.

Additionally, to make the development of plugins easier and more focused, a few abstract classes were prepared (e.g. JobSchedulingPlugin or TaskSchedulingPlugin). They provide the following methods: scheduleJobs, scheduleJob, scheduleTasks, and scheduleTask. The advantage of these methods is that authors of scheduling plugins may choose the level at which they implement their algorithm (i.e. which method to override). For instance, if a scheduling algorithm focuses on matching single tasks to resources, only scheduleTask needs to be implemented. If an algorithm schedules all jobs at once, only the scheduleJobs method must be overridden. The GridschedulingFCFSRR and GridschedulingARFCFS plugins are presented in the Examples section.
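The layering of scheduleJobs, scheduleJob, scheduleTasks and scheduleTask can be pictured with the following minimal sketch. It is not the GSSIM implementation: all class names (SketchTask, SketchJob, SketchDecision, LayeredSchedulingSketch, FixedResourceSketch) are hypothetical stand-ins, and the default delegation from each level to the level below is an assumption made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified stand-ins for GSSIM types; not the real API.
final class SketchTask {
    final String jobId;
    final String taskId;
    SketchTask(String jobId, String taskId) { this.jobId = jobId; this.taskId = taskId; }
}

final class SketchJob {
    final List<SketchTask> tasks;
    SketchJob(List<SketchTask> tasks) { this.tasks = tasks; }
}

final class SketchDecision {
    final String taskId;
    final String resourceName;
    SketchDecision(String taskId, String resourceName) {
        this.taskId = taskId;
        this.resourceName = resourceName;
    }
}

// Assumed layering: each level delegates to the level below unless overridden.
abstract class LayeredSchedulingSketch {
    List<SketchDecision> scheduleJobs(List<SketchJob> jobs) {
        List<SketchDecision> plan = new ArrayList<>();
        for (SketchJob job : jobs) {
            plan.addAll(scheduleJob(job));
        }
        return plan;
    }

    List<SketchDecision> scheduleJob(SketchJob job) {
        return scheduleTasks(job.tasks);
    }

    List<SketchDecision> scheduleTasks(List<SketchTask> tasks) {
        List<SketchDecision> plan = new ArrayList<>();
        for (SketchTask task : tasks) {
            plan.add(scheduleTask(task));
        }
        return plan;
    }

    abstract SketchDecision scheduleTask(SketchTask task);
}

// A task-level algorithm overrides only scheduleTask.
class FixedResourceSketch extends LayeredSchedulingSketch {
    @Override
    SketchDecision scheduleTask(SketchTask task) {
        // "compRes1" is an illustrative resource name.
        return new SketchDecision(task.taskId, "compRes1");
    }
}
```

An algorithm that needs a global view of all jobs would instead override scheduleJobs and ignore the lower levels.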



Local scheduler interface

GSSIM distinguishes two types of interfaces at the level of local schedulers. They correspond to two different scheduling approaches in local systems. The "basic" interface is based on a "best effort" approach, where tasks are submitted to local queues and no guarantee is given concerning the start time, resource availability, etc. The class name of this "basic" interface is LocalSchedulingPlugin.

The second type assumes scheduling with QoS guarantees. In this case, resource providers advertise and possibly negotiate their offers and, if the negotiation is successful, reserve the requested resources for a certain period in the future using the advance reservation mechanism. The class name of this interface is LocalSchedulingARPlugin.

Basic Scheduling Interface (LocalSchedulingPlugin)

This interface provides queue management mechanisms for plugin developers. It consists of the following methods:

  • getPluginName()
  • initPlugin()
  • getConfiguration()
  • schedule() chooses which tasks should be moved to execution; it may be invoked periodically (when the TIMER event occurs). The frequency of the TIMER event can be set by adding an Integer value for the TIMER key in the plugin configuration.
  • placeTasksInQueues() is responsible for distributing tasks between queues. New tasks should be placed in queues and then, when a declared event arrives, moved to execution.
  • estimatePowerConsumption() estimates the power consumption of the whole computing resource/queuing system. This method should also calculate the power required to execute a single task, if such information is required in the task statistics.
  • chooseResourcesFor() retrieves resource units which satisfy the given resource requirements. If the schedule() method defines only the order in which tasks are moved to execution, then this method must choose particular resource units, e.g. processors. If the schedule() method calculates the best resource allocation (with all details), then this method must return the previously calculated resource allocation for the specific task.

Input data for the scheduling methods consists of: the list of tasks being executed, queues, and the state of resources. An implementation of LocalSchedulingPlugin (FCFSAllocPolicy) is described in the Examples section.

QoS-based Scheduling Interface (LocalSchedulingARPlugin)

This interface provides advance reservation and negotiation mechanisms for plugin developers. It distinguishes initial and committed reservations and therefore enables the development of reservation strategies based on two-phase commit protocols. It extends LocalAllocPlugin with a set of methods for managing reservations and preparing offers for the Grid scheduler. The following methods may be implemented: getOffers, createReservation, commitReservation, cancelReservation, modifyReservation, getStatus. These methods make it possible to develop different scheduling strategies for different resource allocation policies. getOffers returns reservation offers based on time and resource requirements. createReservation initially reserves a requested slot or rejects the request. If the reservation is not committed before a certain time, the initial reservation expires. The commitReservation, cancelReservation and modifyReservation methods are responsible for the management of reservations. Input data for the scheduling methods of this interface include a list of reservation requests, lists of existing committed and initial reservations, tasks being executed, queues, and the state of resources. Additionally, time and resource requirements and proposed offers are passed to the methods responsible for negotiating reservations. LocalGridPluginAR is an example implementation.

Method             Description
getOffers          Returns reservation offers based on time and resource requirements
createReservation  Initially reserves a requested slot or rejects the request; if the reservation is not committed before a certain time, the initial reservation expires
commitReservation  Commits the reservation or rejects the request
getStatus          Returns the status of a reservation
cancelReservation  Cancels a reservation
modifyReservation  Decides whether the reservation can be modified
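The life cycle of initial and committed reservations can be illustrated with a small sketch. It is only a model under stated assumptions, not the LocalSchedulingARPlugin API: the class ReservationBook, its Status values, the integer reservation IDs and the commitTimeout parameter are all hypothetical, and time is a plain number supplied by the caller.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of the initial/committed reservation life cycle; not the GSSIM API.
class ReservationBook {
    enum Status { INITIAL, COMMITTED, CANCELED, EXPIRED }

    private static final class Entry {
        Status status = Status.INITIAL;
        final long expiresAt; // latest time at which the initial reservation may be committed
        Entry(long expiresAt) { this.expiresAt = expiresAt; }
    }

    private final Map<Integer, Entry> entries = new HashMap<>();
    private final long commitTimeout; // assumed to use the same unit as the "now" arguments
    private int nextId = 1;

    ReservationBook(long commitTimeout) { this.commitTimeout = commitTimeout; }

    // Phase 1: tentatively hold a slot and return the reservation id.
    int createReservation(long now) {
        entries.put(nextId, new Entry(now + commitTimeout));
        return nextId++;
    }

    // Phase 2: confirm the hold; fails if the initial reservation has already expired.
    boolean commitReservation(int id, long now) {
        return getStatus(id, now) == Status.INITIAL && setStatus(id, Status.COMMITTED);
    }

    boolean cancelReservation(int id, long now) {
        Status s = getStatus(id, now);
        return (s == Status.INITIAL || s == Status.COMMITTED) && setStatus(id, Status.CANCELED);
    }

    // Expiry is evaluated lazily, whenever the status is queried.
    Status getStatus(int id, long now) {
        Entry e = entries.get(id);
        if (e == null) {
            throw new IllegalArgumentException("unknown reservation " + id);
        }
        if (e.status == Status.INITIAL && now > e.expiresAt) {
            e.status = Status.EXPIRED;
        }
        return e.status;
    }

    private boolean setStatus(int id, Status s) {
        entries.get(id).status = s;
        return true;
    }
}
```

A Grid plugin following this model would call createReservation for the chosen offer and then commitReservation before the timeout elapses; a committed reservation no longer expires.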

Execution Time Estimation interface

This interface is responsible for estimating task execution time and provides the information needed to compute it. The execTimeEstimation method can be used to specify various strategies of time computation. Its input parameters are the allocated resource units, the task requirements and the remaining task length. Its output is the estimated execution time of the task, which is used as the task execution time during the simulation. The Execution Time Estimation plugin thus makes it possible to develop different approaches to execution time computation. An implementation of this interface (ExecTimeEstimationPluginImpl) is presented in the Examples section.
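As an illustration, a simple linear strategy could divide the remaining task length by the total speed of the allocated processors. The sketch below is hypothetical: the class name, the parameter list and the units are assumptions, and the real execTimeEstimation method receives GSSIM objects rather than plain numbers.

```java
// Hypothetical sketch; names and units are assumptions, not the GSSIM API.
class LinearExecTimeEstimator {
    /**
     * @param remainingLength remaining task length, in abstract work units
     * @param cpuSpeed        speed of one allocated processor, in work units per second
     * @param cpuCount        number of allocated processors
     * @return estimated execution time in seconds
     */
    static double execTimeEstimation(double remainingLength, double cpuSpeed, int cpuCount) {
        if (cpuSpeed <= 0 || cpuCount <= 0) {
            throw new IllegalArgumentException("speed and processor count must be positive");
        }
        // Linear model: the work is split evenly among processors, without overhead.
        return remainingLength / (cpuSpeed * cpuCount);
    }
}
```

Other strategies (for example, adding a communication penalty per processor) would only change the body of the method.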

Statistics

This section describes the usage and development of the statistics module, which is a part of GSSIM. The statistics module was created to help in comparative evaluations of scheduling algorithms. Textual and Gantt statistics are generated to illustrate the results of a simulation. It is also possible to obtain accumulated statistics for many simulations, computed from the data generated by each simulation separately. The following subsections present the generated data in detail.

Textual statistics

Textual statistics contain values such as the mean and variance of different properties of the execution. The following subsections describe these values and their interpretation. By default, all textual statistics listed below are computed and presented. The level of granularity of the generated data can easily be modified using dedicated parameters in the configuration file. However, raw resource and task statistics are always generated (they are necessary to compute the other, more complex statistics).

General statistics

General simulation statistics are written to the Stats_simulationID_Simulation.txt file. For each simulation, statistics are written to a separate file with a different simulationID value. Moreover, accumulated simulation statistics are calculated. They present global statistics for n simulations and are written to the Stats_Accumulated_Simulations.txt file. Both kinds of statistics consist of basic characteristics of the simulation. For each characteristic the following values are calculated:

  • mean - arithmetic mean
  • stdev - standard deviation
  • variance - variance
  • minimum - minimum value
  • maximum - maximum value
  • sum - sum of all values used to compute the aforementioned statistics
  • count - number of values used to compute the aforementioned statistics
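The values listed above can be computed from a series of observations as in the following sketch. SummaryStats is a hypothetical helper, not a GSSIM class; in particular, the guide does not state whether the variance is divided by n or by n - 1, so the population form (division by n) used here is an assumption.

```java
// Hypothetical helper; not part of GSSIM.
class SummaryStats {
    final int count;
    final double sum, mean, variance, stdev, minimum, maximum;

    SummaryStats(double[] values) {
        if (values.length == 0) {
            throw new IllegalArgumentException("at least one value is required");
        }
        double s = 0.0, min = values[0], max = values[0];
        for (double v : values) {
            s += v;
            min = Math.min(min, v);
            max = Math.max(max, v);
        }
        double m = s / values.length;
        double squaredDeviations = 0.0;
        for (double v : values) {
            squaredDeviations += (v - m) * (v - m);
        }
        count = values.length;
        sum = s;
        mean = m;
        // Assumption: population variance (division by n).
        variance = squaredDeviations / values.length;
        stdev = Math.sqrt(variance);
        minimum = min;
        maximum = max;
    }
}
```

For example, applying it to the waiting times of all tasks in one simulation yields the row of values reported for the "Task waiting time" characteristic.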

The following basic characteristics are computed:

  • Delayed tasks - number of tasks for which C_j > d_j (completion time later than the due date).
  • Failed requests (tasks) - number of tasks which have not been finished.
  • Makespan - the length of the schedule of all tasks: max(C_j).
  • Resources queue length - mean number of tasks in the queue at submission time.
  • Task completion time - mean completion time of all tasks.
  • Task execution time - mean execution time of all tasks.
  • Task start time - mean start time of all tasks.
  • Task flow time - mean flow time of all tasks.
  • Task waiting time - mean waiting time of all tasks.
  • Task lateness - mean lateness of all tasks.
  • Task tardiness - mean tardiness of all tasks.
  • Resources reservation load - reservation utilization computed as the mean percentage reservation load of all resources (including processors) during the simulation.
  • Resources total load - utilization computed as the mean percentage load of all resources (including processors) during the simulation. For each processor p the Resource Load (RL_p) is calculated as:

RL_p = \frac{duration\ of\ executed\ tasks\ on\ processor} {duration\ of\ simulation}
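A minimal sketch of this calculation follows. ResourceLoadSketch and its method names are hypothetical; the sketch assumes that the execution intervals on one processor do not overlap and that all times use the same unit.

```java
// Hypothetical helper; not part of GSSIM.
class ResourceLoadSketch {
    // RL_p: time processor p spent executing tasks divided by the simulation duration.
    // Each interval is {start, end}; intervals on one processor are assumed not to overlap.
    static double processorLoad(double[][] busyIntervals, double simulationDuration) {
        double busy = 0.0;
        for (double[] interval : busyIntervals) {
            busy += interval[1] - interval[0];
        }
        return busy / simulationDuration;
    }

    // Mean of the per-processor loads, expressed as a percentage.
    static double meanLoadPercent(double[] processorLoads) {
        double sum = 0.0;
        for (double load : processorLoads) {
            sum += load;
        }
        return 100.0 * sum / processorLoads.length;
    }
}
```

For instance, a processor that executes tasks during the intervals 0 to 10 and 20 to 50 of a simulation lasting 100 time units has a load of 40/100.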

Resources statistics

Raw resources statistics are generated to Stats_simulationID_Raw_Resources.txt file. For each resource we can find the following statistics:

  • Resource name
  • Available MEMORY - available resource memory
  • Available CPUs - number of processors
  • CPU speed - speed of each processor
  • Queue length - number of tasks in the queue at submission time
  • Load - utilization of each processor, calculated as (duration of executed tasks on the processor) / (duration of the simulation)
  • Reservation Load - reservation utilization of each processor, calculated as (duration of reservations on the processor) / (duration of the simulation)

If the createstatistics.accumluatedresources parameter in the configuration file is set to "true", accumulated resource statistics are calculated. They are written to the Stats_simulationID_Accumulated_Resoruces.txt file.

  • Load - resource utilization computed as a mean percentage load of all available processors during the simulation
  • Reservation Load - resource reservation utilization computed as a mean percentage reservation load of all available processors during the simulation

Tasks statistics

Raw tasks statistics are generated to Stats_simulationID_Raw_Tasks.txt file. Here, for each task, general statistics are presented.

  • JobID - the job ID
  • TaskID - the task ID
  • UserDN - the ID (distinguished name) of the user who submitted this task
  • ResName - the names of the GridResources that executed this task
  • CpuCnt - the number of CPUs requested to run this task
  • ExecStartDate - the latest execution start date
  • ExecFinishDate - the finish date of task in a GridResource
  • ExecEndDate - the latest date when the execution of task must be ended
  • GB_SubDate - date when task was submitted
  • LB_SubDate - the submission or arrival date of task from the latest GridResource

If the createstatistics.extendedtasks parameter in the configuration file is set to "true", the following statistics are also shown.

  • CompletionTime - C_j (ExecFinishDate − simStartDate)
  • ExecStartTime - S_j (ExecStartDate − simStartDate)
  • ExecutionTime - C_j − S_j, where C_j and S_j are the completion time and the execution start time of task j, respectively
  • StartTime - S_j − sub_j, where S_j is the execution start time and sub_j is the submission time (GB_SubDate − simStartDate) of task j
  • ReadyTime - r_j, the earliest time when the execution of the task can be started (if the ready time is not specified, it is assumed to be equal to the submission time)
  • FlowTime - C_j − r_j, where r_j is the ready time of task j
  • WaitingTime - S_j − lsub_j, where lsub_j is the submission time of task j to the local scheduler (LB_SubDate − simStartDate)
  • GQ_WaitTime - lsub_j − sub_j (LB_SubDate − GB_SubDate)
  • Lateness - C_j − d_j, where d_j is the due date (ExecEndDate − simStartDate) of task j
  • Tardiness - max(0, C_j − d_j)
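The relations between these values can be summarized in a short sketch. TaskTimes is a hypothetical class, not part of GSSIM; each metric is taken as a difference of two time offsets, all arguments are assumed to be already expressed relative to simStartDate, and the argument names are illustrative.

```java
// Hypothetical helper; not part of GSSIM.
class TaskTimes {
    final double completionTime, execStartTime, executionTime, startTime,
            flowTime, waitingTime, gqWaitTime, lateness, tardiness;

    // All arguments are offsets from simStartDate, in the same time unit.
    TaskTimes(double execStart, double execFinish, double dueDate,
              double gridSubmit, double localSubmit, double readyTime) {
        completionTime = execFinish;              // C_j
        execStartTime = execStart;                // S_j
        executionTime = execFinish - execStart;   // C_j - S_j
        startTime = execStart - gridSubmit;       // S_j - sub_j
        flowTime = execFinish - readyTime;        // C_j - r_j
        waitingTime = execStart - localSubmit;    // S_j - lsub_j
        gqWaitTime = localSubmit - gridSubmit;    // lsub_j - sub_j
        lateness = execFinish - dueDate;          // C_j - d_j
        tardiness = Math.max(0.0, lateness);      // max(0, C_j - d_j)
    }
}
```

A task that finishes before its due date has negative lateness and zero tardiness, which is why both values are reported.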

Gridlet statistics

Gridlet statistics are written to the Stats_simulationID_Gridlets.txt file. It describes the processing histories of gridlets (http://www.gridbus.org/gridsim/doc/api/gridsim/Gridlet.html).

Jobs statistics

Job statistics are written to the Stats_simulationID_Jobs.txt file. This file contains general statistics (mean values) for each job, which consists of one or more tasks.

  • meanTaskCompletionTime - mean completion time of all tasks in job
  • meanTaskExecutionTime - mean execution time of all tasks in job
  • meanTaskStartTime - mean start time of all tasks in job
  • meanTaskFlowTime - mean flow time of all tasks in job
  • meanTaskWaitingTime - mean waiting time of all tasks in job
  • meanTaskGQ_WaitingTime - mean global queue waiting time of all tasks in job
  • lateness - mean lateness of all tasks in job
  • tardiness - mean tardiness of all tasks in job
  • makespan - the length of the schedule of all tasks in job: max(C_j)

Gantt statistics

This section describes the charts generated after the simulation. Four types of charts are distinguished: the Processors Gantt chart, the Resources Gantt chart, the Tasks Gantt chart and the Waiting time Gantt chart. If advance reservation is used, additional charts are generated. The Reservations Gantt chart presents all reservations on all resources. Moreover, a Gantt chart is generated for each resource to show the reservations committed during the simulation. Details of each type of chart are described in the following subsections.

Processors gantt chart

In Fig. 3 we can see a Gantt chart generated after the simulation. The horizontal and vertical axes represent processors and time, respectively. Processors are denoted as ID@ResourceName. Tasks are distinguished by color. A legend below the chart shows which color stands for which task.

Figure 3. Gantt for processors

Resources gantt chart

Fig. 4 shows a chart presenting the loads of resources. The horizontal axis represents the number of occupied CPUs and the vertical axis represents time. This chart shows how many CPUs on each resource were used during the simulation.

Figure 4. Gantt for resources

Tasks gantt chart

Fig. 5 presents the tasks Gantt chart. Here, the horizontal axis lists the tasks and the vertical axis represents time. Colors indicate the resources on which tasks were executed. A legend below the chart shows which color stands for which resource.

Figure 5. Gantt for tasks

Waiting time gantt chart

Fig. 6 is very similar to Fig. 5. In addition, the waiting time of each task is shown in a transparent shade of the color of the resource on which the task is executed.

Figure 6. Gantt for waiting times of tasks

Reservations gantt charts

Fig. 7 presents the reservations Gantt chart. Here, the horizontal axis lists the task reservations and the vertical axis represents time. Colors indicate the resources on which reservations were committed. Transparent colors indicate the time when resources were not being used during the reservation. A legend below the chart shows which color stands for which resource.

Figure 7. Gantt for reservations of tasks (All resources)

Fig. 8, 9 and 10 present the reservations of tasks on individual resources. A separate chart is generated for each resource.

Figure 8. Gantt for reservations of tasks (Resource 1)
Figure 9. Gantt for reservations of tasks (Resource 2)
Figure 10. Gantt for reservations of tasks (Resource 3)

Network statistics

Network statistics are written to the Stats_simulationID_Network.txt file. They are computed for each link existing in the network topology and consist of three sections: general link statistics, reservation statistics and transmitted data statistics.

General link statistics contain the basic characteristics of a given link:

  • baud rate (bits/s) - baud rate of the link (in bits/s)
  • delay (ms) - transmission delay that this link introduces
  • size of transmitted data (in bytes) - size of the data transmitted over this link (without reservation)
  • load - link utilization, computed as: (size_of_transmitted_data + sum(duration_of_reservation * reservation_size)) / (link_baud_rate * duration_of_simulation)
  • reservations load - link utilization taking into account only reservations, computed as: sum(duration_of_reservation * reservation_size) / (link_baud_rate * duration_of_simulation)
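The two link formulas can be written down directly, as in the sketch below. LinkLoadSketch is a hypothetical helper, not a GSSIM class. The formulas are applied literally, so the sketch assumes that all quantities have already been converted to consistent units; the guide lists transmitted data in bytes and the baud rate in bits/s, so a conversion may be needed in practice.

```java
// Hypothetical helper; not part of GSSIM. All inputs are assumed to use consistent units.
class LinkLoadSketch {
    // Sum of duration * size over all reservations; each row is {duration, size}.
    private static double reservedVolume(double[][] reservations) {
        double volume = 0.0;
        for (double[] r : reservations) {
            volume += r[0] * r[1];
        }
        return volume;
    }

    // Link utilization that takes only reservations into account.
    static double reservationsLoad(double[][] reservations, double linkBaudRate,
                                   double simulationDuration) {
        return reservedVolume(reservations) / (linkBaudRate * simulationDuration);
    }

    // Link utilization including data transmitted without a reservation.
    static double load(double transmittedData, double[][] reservations,
                       double linkBaudRate, double simulationDuration) {
        return (transmittedData + reservedVolume(reservations))
                / (linkBaudRate * simulationDuration);
    }
}
```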

Reservation characteristics consist of:

  • reservation ID - unique reservation ID
  • links - reserved links
  • start date - start date of the reservation
  • end date - end date of the reservation
  • bandwidth (bits) - reservation size
  • size of transmitted data (in bytes) - size of the data transmitted within a given reservation time
  • load - reservation utilization, computed as: (size_of_transmitted_data / reservation_size) / duration_of_reservation
  • status - status of the reservation (CANCELED or FINISHED)

Transmitted data statistics:

  • JobID - the job ID
  • TaskID - the task ID
  • file name - name of the file (data)
  • data route - route of transmitted data
  • start date - start date of transmission
  • end date - end date of transmission
  • reservation ID - reservation ID, which identifies reservation on a given path during transmission

Examples

Quick start

This section describes how to run an example experiment. Apache Ant is required to compile GSSIM and run an experiment.

  • Prepare a working directory (working_dir) on your local system and check out the GSSIM sources from SVN with the following command:

svn checkout https://apps.man.poznan.pl/svn/gssim/gssim/

  • The example experiment configurations are placed in the working_dir/trunk/gssim/example directory.
  • Go to the main project directory working_dir/trunk/gssim and run the Ant task:

bash$ ant run -Dconfig="example/experiment1.properties"

Parameters:

run - Ant target responsible for compiling the source code and starting the experiment

-Dconfig - path to the *.properties file with the experiment description


The simulation is started by calling the main method of the GridSchedulingSimulator class. The input parameter is an array of Strings; the first String specifies the path to the properties file. For example:

String args[] = {"example/experiment1.properties"};
GridSchedulingSimulator.main(args);

build.xml

The build.xml file contains the Apache Ant script. The available targets compile the source code, build jars and run an experiment. The main targets are:

Target name  Description
compile      Compiles all source code except the test.* package
jar          Depends on the compile target. Creates the following jars: gssim.jar (all compiled classes), schedframe.jar (classes from the schedframe.* package only), gssim_light.jar (all classes except those from the schedframe.* package)
clean        Removes the build directory from the file system
run          Depends on the compile target. Starts execution of the experiment. This target needs an additional parameter, -Dconfig, whose value is the path to the experiment description file. Usage: ant run -Dconfig="example/experiment1.properties"

Properties File

Example properties file:

# Use single *.swf file as workload description.
# Computing environment consists of two computing resources.
# Plugin implementations and resources do not support the reservation scheduling model.
# Grid plugin implements simple round robin algorithm
gridschedulingpluginname=example.gridplugin.GridRoundRobinPlugin
# Local plugin starts task execution in FCFS order
localallocpolicypluginname=example.localplugin.FCFSLocalPlugin
# Linear model of cpu processing power is used for task execution time estimations. 
exectimeestimationpluginname=example.timeestimation.ExecTimeEstimationPlugin
# Path to xml file which describes resource characteristics.
resdesc=example/workload/HostParameters.xml
# Path to swf file with workload description
readscenario.workloadfilename=example/workload/workload.swf
# Choose if files with task processing history should be created.  
printhistory=true
# Choose directory where all result files should be placed.
stats.outputfolder=../experiment1_result

As we can see, three plugins (gridscheduling, exectimeestimation and localallocpolicy) must be specified to run a simulation. Moreover, we have to specify the resource description file. This file should be valid against GrmsHostParametersSchema.xsd. It should contain the description of resources (e.g. the number of processors, their speed, memory etc.). Furthermore, a workload file is necessary to obtain the information needed to execute the simulation (e.g. the execution of tasks). This workload file should be in the Standard Workload Format (http://www.cs.huji.ac.il/labs/parallel/workload/swf.html).

The above files and plugins are sufficient to run a simulation in the read scenario. In this scenario tasks and their properties are loaded from a workload file. Resources are created based on the information included in the resource description file. Plugins specify the behavior of the grid scheduler and local providers. Statistics are generated after the simulation. The read scenario does not involve reservations; hence only four Gantt charts are generated.
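Before starting an experiment it can be convenient to check that a properties file defines everything the read scenario needs. The following sketch uses only java.util.Properties; ExperimentConfigCheck is a hypothetical helper, not part of GSSIM, and the list of required keys is taken from the example properties file above rather than from the GSSIM sources.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

// Hypothetical helper; not part of GSSIM.
class ExperimentConfigCheck {
    // Keys assumed to be required, based on the example properties file.
    static final String[] REQUIRED = {
        "gridschedulingpluginname",
        "localallocpolicypluginname",
        "exectimeestimationpluginname",
        "resdesc",
        "readscenario.workloadfilename"
    };

    // Returns the required keys that are absent or empty in the given properties text.
    static List<String> missingKeys(String propertiesText) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(propertiesText));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        List<String> missing = new ArrayList<>();
        for (String key : REQUIRED) {
            String value = props.getProperty(key);
            if (value == null || value.trim().isEmpty()) {
                missing.add(key);
            }
        }
        return missing;
    }
}
```

The content of example/experiment1.properties can be read into a String (for instance with java.nio.file.Files) and passed to missingKeys; an empty result means that all listed keys are present.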


Resource description

Energy parameters

The following example shows how to provide classes which implement EnergyInterface for processors and/or computing nodes. If a specific configuration (classes) is not provided, GSSIM will use the default implementation of EnergyInterface for processors and computing nodes.

<computingResource  resourceId="compRes1">
	<machineParameters>
		<hostParameter name="cpucount">
			<paramValue>3</paramValue>
			<property name="energyprofile">
				<value>example.energy.profile.CPUEnergyProfile</value>
			</property>
		</hostParameter>
		<otherParameter name="energyprofile">
			<paramValue>example.energy.profile.ComputingNodeEnergyProfile</paramValue>
		</otherParameter>
	</machineParameters>
</computingResource>

Network topology file

For the example network topology:

Figure 11. Exemplary network topology

the corresponding description file should look like this:

# total number of routers
5
# each router name
R1
R2
R3
R4
R5    
# total number of connections between routers
5         
# connections between routers
R1  R2  0.01     300    1500000
R2  R5  0.01     300    1500000
R1  R3  0.01     300    1500000
R3  R4  0.01     300    1500000
R4  R5  0.01     300    1500000
# total number of resources
3
# connections between resource and router
compRes1 R2 0.01     300    1500000
compRes2 R4 0.01     300    1500000
dataRes1 R5 0.01     300    1500000
# total number of users
1
# connection between user and router
BrokerInterfaceEntity R1 0.01     300    1500000

The following connections will be created:

R1_R2_link, R2_R5_link, R1_R3_link, R3_R4_link, R4_R5_link and also compRes1_link, compRes2_link, dataRes1_link and BrokerInterfaceEntity_link.
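The way link names follow from the description file can be sketched as below. TopologyLinks is a hypothetical helper, not the GSSIM reader: it assumes the section order shown in the example (routers, router connections, resources, users), treats lines starting with # as comments, and ignores the three numeric link parameters, whose meaning is not described in this section.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper; not the GSSIM topology reader.
class TopologyLinks {
    static List<String> linkNames(String fileContent) {
        // Keep only data lines: drop comments (#) and blank lines.
        List<String[]> rows = new ArrayList<>();
        for (String line : fileContent.split("\n")) {
            String trimmed = line.trim();
            if (!trimmed.isEmpty() && !trimmed.startsWith("#")) {
                rows.add(trimmed.split("\\s+"));
            }
        }
        List<String> links = new ArrayList<>();
        int pos = 0;
        int routers = Integer.parseInt(rows.get(pos++)[0]);
        pos += routers; // skip the router names, one per line
        int connections = Integer.parseInt(rows.get(pos++)[0]);
        for (int i = 0; i < connections; i++) {
            String[] fields = rows.get(pos++); // routerA routerB + three numeric parameters
            links.add(fields[0] + "_" + fields[1] + "_link");
        }
        // Two more sections follow: resources, then users; each entry is attached to one router.
        for (int section = 0; section < 2; section++) {
            int entries = Integer.parseInt(rows.get(pos++)[0]);
            for (int i = 0; i < entries; i++) {
                links.add(rows.get(pos++)[0] + "_link");
            }
        }
        return links;
    }
}
```

Applied to the file above, the sketch lists the five router-to-router links first, followed by the three resource links and the user link.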


Grid scheduling plugin

This section presents three Grid scheduler plugins. GridschedulingFCFSRR is an example of simple Grid FCFS Round Robin list scheduling: tasks are submitted in order of submission time and providers are selected in round robin mode, hence the name. GridschedulingARFCFS is an example of simple Grid FCFS list scheduling with the advance reservation mechanism, and GridschedulingARNRDijkstra extends GridschedulingARFCFS with a network reservation mechanism.

GridschedulingFCFSRR

This plugin is an implementation of the GridSchedulingPlugin interface. Only the schedule method is implemented. Tasks are served in FCFS order: each task is taken from taskQueue, and resources are selected in round robin mode. The schedule method returns a scheduling plan containing the allocation of tasks to resources. The scheduling plan expresses decisions about the order of scheduled tasks, execution times, reservations, resources etc. This decision is then used to execute task gridlets on the specified resources.

public SchedulingPlanInterface<grms.types.schemas.schedulingplan.SchedulingPlan> schedule(
			SchedulingEvent event, Queue<? extends JobInterface<?>> jobQueue,
			Queue<? extends TaskInterface<?>> taskQueue,
			JobRegistry jobRegistry, 
			ModuleList moduleList,
			Prediction prediction)
			throws Exception {
 
		ResourceDiscovery resources = null;
		for(int i = 0; i < moduleList.size(); i++){
			Module m = moduleList.get(i);
			switch(m.getType()){
				case RESOURCE_DISCOVERY: resources = (ResourceDiscovery) m;
					break;
			}
		}
 
		SchedulingPlan plan = new SchedulingPlan();
		int size = taskQueue.size();
 
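		// order of the resources on this list is not determined
		List<ResourceDescription> availableResources = resources.getResources();
		// lastUsedResource is declared outside this method (presumably as a field
		// of the plugin class); it remembers the provider chosen most recently.
		// Start from the first resource that differs from the last used one and
		// fall back to index 0 when there is none (e.g. a single resource).
		int resourceIdx = 0;
		for(int i = 0; i < availableResources.size(); i++){
			ResourceDescription rd = availableResources.get(i);
			// null-safe comparison: lastUsedResource may not be set yet
			if(!rd.getProvider().getProviderId().equals(lastUsedResource)){
				resourceIdx = i;
				break;
			}
		}
 
		for(int i = 0; i < size; i++) {
			TaskInterface<?> task = taskQueue.remove(0);
 
			// wrap around the actual number of resources instead of a fixed 2
			ResourceDescription rd = availableResources.get(resourceIdx % availableResources.size());
			resourceIdx++;
			lastUsedResource = rd.getProvider().getProviderId();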
 
			Host host = new Host();
			host.setHostname(rd.getProvider().getProviderId());
 
			Allocation allocation = new Allocation();
			allocation.setProcessQuantity(1);
			allocation.setHost(host);
 
			ScheduledTask scheduledTask = new ScheduledTask();
			scheduledTask.setTaskId(task.getId());
			scheduledTask.setJobId(task.getJobId());
			scheduledTask.addAllocation(allocation);		
 
			plan.addTask(scheduledTask);
		}
		return plan;
 
	}

GridschedulingARFCFS

This plugin is an implementation of the GridSchedulingPlugin interface. Its schedule method uses the advance reservation mechanism. Tasks are served in FCFS order. First, each task is taken from taskQueue. Then, in the initialization phase, the reservation manager is used to obtain offers from resource providers for the current task. These offers have to meet the requirements of the task. The plugin chooses an offer and creates an initial reservation. Next, the reservation is committed. The reservation ID and the offer are used to create the task scheduling decision, which becomes part of the scheduling plan. The schedule method returns a scheduling plan containing the allocation of tasks to reservations on resources. The scheduling plan expresses decisions about the order of scheduled tasks, execution times, reservations, resources etc. This decision is then used to execute tasks on the specified resources.

	public SchedulingPlanInterface<grms.types.schemas.schedulingplan.SchedulingPlan> schedule(
			SchedulingEvent event, 
			Queue<? extends JobInterface<?>> jobQueue,
			Queue<? extends TaskInterface<?>> taskQueue,
			JobRegistry jobRegistry, 
			ModuleList moduleList,
			Prediction prediction)
			throws Exception {
 
		SchedulingPlan plan = null;
 
		ResourceDiscovery resources = null;
		ReservationManager reservManager = null;
		for(int i = 0; i < moduleList.size(); i++){
			Module m = moduleList.get(i);
			switch(m.getType()){
				case RESERVATION_MANAGER: reservManager = (ReservationManager) m;
					break;
				case RESOURCE_DISCOVERY: resources = (ResourceDiscovery) m;
					break;
			}
		}
 
		// this method is called every time a new event arrives.
		// Event types which are expected to appear here are defined in plugin configuration.
		// See getConfiguration() method
 
		// choose correct method to serve the event
		switch(event.getType()){
 
			case TASK_ARRIVED: plan = scheduleNewTask(jobQueue, taskQueue, 
						jobRegistry, resources, 
						reservManager, null, 
						prediction); 
				break;
 
			case TASK_CANCELED: TaskCanceledEvent e = (TaskCanceledEvent) event; 
					    plan = rejectTask(e.getJobId(), e.getTaskId());
				break;
 
			default: log.info("Scheduling event " + event.getType().name() + " is not" +
					" supported by the plugin ");
				break;
		}
 
		return plan;
 
	}
 
 
	protected SchedulingPlan scheduleNewTask(Queue<? extends JobInterface<?>> jobQueue,
			Queue<? extends TaskInterface<?>> taskQueue,
					JobRegistry jobRegistry, 
					ResourceDiscovery resources,
					ReservationManager reservManager,
					NetworkManagerInterface networkManager,
					Prediction prediction)
					throws Exception{
 
		// prepare plan with decision of allocating resources to tasks
		SchedulingPlan plan = new SchedulingPlan();
		int size = taskQueue.size();
 
		// iterate over task queue
		for(int i = 0; i < size; i++) {
			TaskInterface<?> task = taskQueue.remove(0);
 
			// prepare description of task time and resource requirements
			TimeRequirements timeRequirements = new TimeRequirements(task);
			ResourceRequirements resourceRequirements = new ResourceRequirements(task);
 
			// get offers from all local resources. Grid plugin MUST be aware of the format
			// and the way in which local resource (plugin) returns offers.
			List<Offer> offers = reservManager.getOffer(timeRequirements, resourceRequirements);
 
			// choose the offer which provides enough free resources. Order of the list "offers" 
			// is not determined, therefore results may differ between executions. 
			Offer offer = null;
			for(int offerIdx = 0; offerIdx < offers.size() && offer == null; offerIdx++){
				// offer contains time allocations suited exactly to task resource and time 
				// requirements, therefore it can be used directly to create reservation. 
				// See local plugin description for details.
				offer = offers.get(offerIdx);
			}
 
			// create reservation. Offer should have only these time resource allocations
			// which should be reserved - all unnecessary allocations must be removed. Allocations
			// may be also brand new objects, but remember to set correct provider info.
			// There is different reservation for different time allocation created.
			List<Reservation> reservations = reservManager.createReservation(offer);
 
			// get the first reservation. In this example, there is only one created
			Reservation r = reservations.get(0);
 
			// set information about task (job and task id) for which this reservation
			// was created
			r.setJobId(task.getJobId());
			r.setTaskId(task.getId());
 
			// commit reservation. From now on the reservation cannot expire.
			r = reservManager.commitReservation(r);
 
			// prepare allocation description in scheduling plan
 
			// information about destination host
			Host host = new Host();
			host.setHostname(r.getAllocatedResource().getProvider().getProviderId());
 
			// information about created reservation
			Allocation allocation = new Allocation();
			allocation.setProcessQuantity(1);
			allocation.setReservationId(r.getId());
 
			allocation.setHost(host);
 
			// information about the task itself
			ScheduledTask scheduledTask = new ScheduledTask();
			scheduledTask.setTaskId(r.getTaskId());
			scheduledTask.setJobId(r.getJobId());
			scheduledTask.addAllocation(allocation);		
 
			plan.addTask(scheduledTask);
		}
		return plan;
	}
 
	protected SchedulingPlan rejectTask(String jobId, String taskId){
		// prepare plan with decision about task rejection.
		SchedulingPlan plan = new SchedulingPlan();
			// create task 
			ScheduledTask task = new ScheduledTask();
				task.setJobId(jobId);
				task.setTaskId(taskId);
				// set its status as rejected
				task.setStatus(AllocationStatus.REJECTED);
		plan.addTask(task);
		return plan;
	}

GridschedulingARNRDijkstra

This plugin is an implementation of the GridSchedulingPlugin interface. It is similar to the Grid plugin with the advance reservation mechanism, but extends it with a network reservation mechanism. Reservations are created according to Dijkstra's algorithm.

After the resource reservation process, the number of input files required by the task is computed. Then, for every file, all necessary network reservation parameters are gathered. This information is used to reserve network infrastructure between two nodes. Basically, there are two ways to create such a reservation. The first, shown in this example, is based on Dijkstra's algorithm, which finds the best (shortest) path between two nodes according to time and bandwidth requirements. Alternatively, the user can define their own path based on the task requirements and the network parameters provided by the network reservation manager. A failed reservation (due to e.g. unsatisfied bandwidth requirements) results in setting the scheduledTask status to AllocationStatus.REJECTED. Finally, the mapping between the created reservation and the specified file is defined. This association is returned by the schedule method as a part of the scheduling plan. This decision is then used to execute task gridlets on the specified resources and to send the files required by the task to the appropriate resource.
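The path selection idea can be illustrated with a generic Dijkstra search that skips links without enough free bandwidth. This is only a sketch under stated assumptions, not the GSSIM network manager: the class BandwidthAwareDijkstra, its methods, the use of link delay as the path cost and the notion of "free bandwidth" per link are all hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.PriorityQueue;

// Hypothetical sketch; not the GSSIM network manager.
class BandwidthAwareDijkstra {
    private static final class Edge {
        final String to;
        final double delay;
        final double freeBandwidth;
        Edge(String to, double delay, double freeBandwidth) {
            this.to = to;
            this.delay = delay;
            this.freeBandwidth = freeBandwidth;
        }
    }

    private static final class Item implements Comparable<Item> {
        final String node;
        final double distance;
        Item(String node, double distance) { this.node = node; this.distance = distance; }
        @Override
        public int compareTo(Item other) { return Double.compare(distance, other.distance); }
    }

    private final Map<String, List<Edge>> graph = new HashMap<>();

    // Links are bidirectional; delays are assumed to be non-negative.
    void addLink(String a, String b, double delay, double freeBandwidth) {
        graph.computeIfAbsent(a, k -> new ArrayList<>()).add(new Edge(b, delay, freeBandwidth));
        graph.computeIfAbsent(b, k -> new ArrayList<>()).add(new Edge(a, delay, freeBandwidth));
    }

    // Returns the minimum-delay path that uses only links with enough free bandwidth,
    // or an empty list if no such path exists.
    List<String> shortestPath(String source, String target, double requiredBandwidth) {
        Map<String, Double> dist = new HashMap<>();
        Map<String, String> previous = new HashMap<>();
        PriorityQueue<Item> queue = new PriorityQueue<>();
        dist.put(source, 0.0);
        queue.add(new Item(source, 0.0));
        while (!queue.isEmpty()) {
            Item current = queue.poll();
            if (current.distance > dist.get(current.node)) continue; // stale queue entry
            if (current.node.equals(target)) break;
            for (Edge e : graph.getOrDefault(current.node, Collections.emptyList())) {
                if (e.freeBandwidth < requiredBandwidth) continue; // link cannot carry the transfer
                double candidate = current.distance + e.delay;
                if (candidate < dist.getOrDefault(e.to, Double.POSITIVE_INFINITY)) {
                    dist.put(e.to, candidate);
                    previous.put(e.to, current.node);
                    queue.add(new Item(e.to, candidate));
                }
            }
        }
        List<String> path = new ArrayList<>();
        if (!dist.containsKey(target)) return path;
        for (String n = target; n != null; n = previous.get(n)) path.add(n);
        Collections.reverse(path);
        return path;
    }
}
```

On a ring of routers R1 to R5 such as the example topology, a transfer from R1 to R5 would take the two-hop route through R2 when that route has enough bandwidth and fall back to the route through R3 and R4 otherwise; an empty result corresponds to a failed reservation.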

IMPORTANT!

If the allocated resource and the file host are the same, the createReservation method throws an exception and the reservation ID remains -1. In that case the value has to be set to 0, which means sending the file without using a reservation. To avoid this situation, the file host resource should be excluded from the potential computing resources. One solution is to create the file host resource with the reservation parameter set to false (when using the advance reservation extension), assuming that this resource is only data storage; another is to implement appropriate functionality in the grid plugin.

	public SchedulingPlanInterface<grms.types.schemas.schedulingplan.SchedulingPlan> schedule(
			SchedulingEvent event, 
			Queue<? extends JobInterface<?>> jobQueue,
			Queue<? extends TaskInterface<?>> taskQueue,
			JobRegistry jobRegistry, 
			ModuleList moduleList,
			Prediction prediction)
			throws Exception {
 
		SchedulingPlan plan = null;
		ResourceDiscovery resources = null;
		ReservationManager reservManager = null;
		NetworkManager networkManager = null;
		for(int i = 0; i < moduleList.size(); i++){
			Module m = moduleList.get(i);
			switch(m.getType()){
				case RESERVATION_MANAGER: reservManager = (ReservationManager) m;
					break;
				case RESOURCE_DISCOVERY: resources = (ResourceDiscovery) m;
					break;
				case NETWORK_MANAGER: networkManager = (NetworkManager) m;
			}
		}	
		// this method is called any time new event arrived. 
		// Events types which are expected to appear here are defined in plugin configuration.
		// See getConfiguration() method
 
		// choose correct method to serve the event
		switch(event.getType()){
			case TASK_ARRIVED: plan = scheduleNewTask(jobQueue, taskQueue, 
						jobRegistry, resources, 
						reservManager, networkManager, 
						prediction); 
				break;
 
			case TASK_CANCELED: TaskCanceledEvent e = (TaskCanceledEvent) event; 
						plan = rejectTask(e.getJobId(), e.getTaskId());
				break;
 
			default: log.info("Scheduling event " + event.getType().name() + " is not" +
					" supported by the plugin ");
				break;
		}
 
		return plan;
 
	}
 
 
	protected SchedulingPlan scheduleNewTask(Queue<? extends JobInterface<?>> jobQueue,
			Queue<? extends TaskInterface<?>> taskQueue,
					JobRegistry jobRegistry, 
					ResourceDiscovery resources,
					ReservationManager reservManager,
					NetworkManagerInterface networkManager,
					Prediction prediction)
					throws Exception{
 
		SchedulingPlan plan = new SchedulingPlan();
		int size = taskQueue.size();
 
		for(int i = 0; i < size; i++) {			
			//see GridARFirstFit plugin for details
			TaskInterface<?> task = taskQueue.remove(0);
 
			TimeRequirements timeRequirements = new TimeRequirements(task);
			ResourceRequirements resourceRequirements = new ResourceRequirements(task);
 
			List<Offer> offers = reservManager.getOffer(timeRequirements, resourceRequirements);
 
			Offer offer = null;
			for(int offerIdx = 0; offerIdx < offers.size() && offer == null; offerIdx++){
				offer = offers.get(offerIdx);
			}
 
			List<Reservation> reservations = reservManager.createReservation(offer);
 
			Reservation r = reservations.get(0);
 
			r.setJobId(task.getJobId());
			r.setTaskId(task.getId());
 
			r = reservManager.commitReservation(r);
 
			Host host = new Host();
			host.setHostname(r.getAllocatedResource().getProvider().getProviderId());
 
			Allocation allocation = new Allocation();
			allocation.setProcessQuantity(1);
			allocation.setReservationId(r.getId());
			allocation.setHost(host);
 
			ScheduledTask scheduledTask = new ScheduledTask();
			scheduledTask.setTaskId(r.getTaskId());
			scheduledTask.setJobId(r.getJobId());
 
			/************NETWORK MANAGEMENT**********/
 
			//get the number of required input files for task
			int nrOfFiles = 0 ;
			try{
				nrOfFiles = ((Task)task).getDescription().
						getExecution().
						getStageInOut().
						getStageInOutItemCount();
			}
			catch(Exception e)
			{
				//e.printStackTrace();
			}
 
			//prepare allocation properties to store mapping between reservation ID 
			//and specified file name
			AllocationProperty[] alpT = new AllocationProperty[nrOfFiles];
 
			//iterate over task input files
			for(int j = 0; j < nrOfFiles; j++){
 
				//gather information about task input file
				String fileHost = ((Task)task).getDescription().
						getExecution().
						getStageInOut().
						getStageInOutItem(j).
						getFile().
						getLocation().getContent();
				String fileName = ((Task)task).getDescription().
						getExecution().
						getStageInOut().
						getStageInOutItem(j).
						getFile().getName();
				double fileSize = ((Task)task).getDescription().
						getExecution().
						getStageInOut().
						getStageInOutItem(j).
						getFile().getSize();
 
				//estimate network reservation parameters
				long reservationBandwidth = 500000;
				DateTime reservationStart = new DateTime(
						task.getExecutionEndTime().getMillis() - 
						task.getExpectedDuration().getMillis() - 
						2 * MILLI_SEC * ((long)fileSize * BITS / reservationBandwidth ));
				DateTime reservationEnd =  new DateTime(
						task.getExecutionEndTime().getMillis() - 
						task.getExpectedDuration().getMillis());
 
				int	nrid = -1;
 
				//create reservation
				try{
					nrid = networkManager.createReservation( 
						fileHost, 
						r.getAllocatedResource().getProvider().getProviderId(), 
						reservationStart, 
						reservationEnd, 
						reservationBandwidth);
				} catch (NetworkException e){
 
					//if allocated resource and file host are the same, 
					//createReservation method will throw an exception
					//and reservationID will be set to -1, therefore we should set nrid to 0, 
					//which means sending file
					//without using reservation; during our tests we created file host resource 
					//with reservation parameter set to false
					//(we assumed that this resource was data storage)
					if(fileHost.compareTo(r.
						getAllocatedResource().
						getProvider().getProviderId()) == 0)
					{
						nrid = 0;
					}
					//if reservation fails set status to AllocationStatus.REJECTED
					else {
						scheduledTask.setStatus(AllocationStatus.REJECTED);
 
						log.warn("Failed to create reservation -" + e.getMessage());
					}
				}
 
				//create mapping between reservation ID and specified file name
				alpT[j]= new AllocationProperty();
				alpT[j].setName(Integer.toString(nrid));
				alpT[j].setContent(fileName);
 
			}
 
			//add previously created mapping to allocation properties
			AdditionalProperties adp = new AdditionalProperties();
			adp.setAllocationProperty(alpT);
			allocation.setAdditionalProperties(adp);			
			/***********************************/
 
			scheduledTask.addAllocation(allocation);	
 
			plan.addTask(scheduledTask);
		}
		return plan;
	}
 
	protected SchedulingPlan rejectTask(String jobId, String taskId){
		// prepare plan with decision about task rejection.
		SchedulingPlan plan = new SchedulingPlan();
			// create task 
			ScheduledTask task = new ScheduledTask();
				task.setJobId(jobId);
				task.setTaskId(taskId);
				// set its status as rejected
				task.setStatus(AllocationStatus.REJECTED);
		plan.addTask(task);
		return plan;
	}
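The network reservation window computed in scheduleNewTask can be checked with a small worked example. The sketch below assumes that the constants MILLI_SEC and BITS, which are not defined in the listing above, equal 1000 and 8, and that bandwidth is expressed in bits per second. The class NetworkReservationWindow is a hypothetical name. The reservation ends at the latest possible task start and begins twice the estimated transfer time earlier.

```java
import java.time.Duration;
import java.time.Instant;

// Hypothetical sketch of the reservation window arithmetic; not GSSIM code.
class NetworkReservationWindow {

    static final long MILLI_SEC = 1000; // assumption: milliseconds per second
    static final long BITS = 8;         // assumption: bits per byte

    // Twice the estimated transfer time, using the same integer arithmetic
    // as the plugin listing (whole seconds, then converted to milliseconds).
    static Duration transferWindow(long fileSizeBytes, long bandwidthBitsPerSec) {
        long transferSeconds = fileSizeBytes * BITS / bandwidthBitsPerSec;
        return Duration.ofMillis(2 * MILLI_SEC * transferSeconds);
    }

    // The window ends when the task must start and begins one window earlier.
    static Instant reservationStart(Instant taskStart, long fileSizeBytes,
                                    long bandwidthBitsPerSec) {
        return taskStart.minus(transferWindow(fileSizeBytes, bandwidthBitsPerSec));
    }

    public static void main(String[] args) {
        Instant taskStart = Instant.parse("2008-11-03T10:00:00Z");
        System.out.println(transferWindow(500000, 500000));
        System.out.println(reservationStart(taskStart, 500000, 500000));
    }
}
```

Under these assumptions a 500000-byte file over a 500000 bit/s reservation needs 8 seconds, so the window is 16 seconds long.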

Local grid plugin

In this section two local scheduler plugins are presented. FCFSAllocPolicy is an implementation of the "basic" local interface (Basic Scheduling Interface), whereas LocalGridPluginAR is an example of a QoS-based plugin (QoS-based Scheduling Interface) in which the advance reservation mechanism is implemented.

FCFSAllocPolicy

It is an implementation of the LocalSchedulingPlugin interface. Tasks are served according to the FCFS algorithm. Two scheduling methods are implemented in this plugin: placeTasksInQueues and schedule. The first puts every new task in the first queue; the second moves each task which can be executed on the considered resource from the queue to the execute list (the list of tasks currently being executed).

/*
* This example implementation puts all new tasks in first queue.
* Tasks are served in the same order they appear in newTasks list.
*/
public int placeTasksInQueues(List<? extends TaskInterface<?>> newTasks,
			List<? extends Queue<? extends TaskInterface<?>>> queues,
			ResourceUnitsManager unitsManager) {
 
		// get the first queue from all available queues.
		Queue<? extends TaskInterface<?>> q = queues.get(0);
 
		// do this trick to enable access to add() method.
		Queue<TaskInterface<?>> queue = (Queue<TaskInterface<?>>) q;
 
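		// move tasks from newTask list to the queue.
		// newTasks shrinks on every remove(), so loop until it is empty
		// (an index-based loop would stop after moving only half of the tasks).
		while(!newTasks.isEmpty()){
			TaskInterface<?> task = newTasks.remove(0);
			queue.add(task);
		}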
 
		return 0;
	}
 
public void schedule(SchedulingEvent event,
			List<? extends TaskInterface<?>> inExecution,
			List<? extends Queue<? extends TaskInterface<?>>> queues,
			ResourceUnitsManager unitsManagerInterface) {
 
		// do this trick to make add() method available
		List <TaskInterface<?>> execute = (List<TaskInterface<?>>) inExecution;
 
		// choose the event types to serve. 
		// Different actions for different events are possible.
		switch(event.getType()){
			case START_TASK_EXECUTION:
			case TASK_FINISHED:
				// our tasks are placed only in first queue (see 
				// BaseLocalPlugin.placeTasksInQueues() method)
				Queue<? extends TaskInterface<?>> q = queues.get(0);
				// check all tasks in queue
				for(int i = 0; i < q.size(); i++){
					TaskInterface<?> task = q.get(i);
					// if status of the task is READY
					if(task.getStatus() == Gridlet.READY){
						// then try to execute this task. Add it to the execute list.
						if(execute.add(task)){
							// if task started successfully, 
							// then remove it from the queue.
							q.remove(i);
							i--; // index trick to get the right position 
							     // in the queue after task removal.
						}
					}
				}
				break;
		}
 
	}
 
/*
* This implementation calculates total energy as sum of energy consumed by all 
* computing nodes. Task context - different energy characteristics of different tasks 
* (cpu intensive vs data intensive) is not taken into the consideration. 
*/
public int estimatePowerConsumption(InExecuionList list, ResourceUnitsManagerInterface unitManager){
 
		int sum = 0;
 
		// Get all computing nodes managed by this queuing system.
		Collection<ComputingNode> collection = unitManager.getComputingNode();
		Iterator<ComputingNode> itr = collection.iterator();
 
		while(itr.hasNext()){
			ComputingNode node = itr.next();
 
			// sum energy consumed by all computing nodes. 
			sum += node.getEnergyProfile().getEnergyConsumption();
		}
 
		return sum;
	}

LocalGridPluginAR

LocalGridPluginAR is an implementation of the LocalSchedulingARPlugin interface, which defines the methods of a local scheduling plugin with advance reservation functionality. This interface has many more methods to implement, so the abstract class BaseLocalARPlugin has been written to facilitate its implementation. This implementation shows how to use the LocalReservationManager to manage reservations on a local site. In LocalGridPluginAR the getOffers method is implemented to enable the negotiation mechanism. This method returns offers for a resource consumer; here it returns the earliest allocation which meets the time and resource requirements.

public List<Offer> getOffers(AbstractTimeRequirements<?> timeReqs,
			AbstractResourceRequirements<?> resReqs,
			List<? extends TaskInterface<?>> inExecution,
			List<Queue<? extends TaskInterface<?>>> queues,
			ResourceUnitsManager unitsManagerInterface,
			LocalReservationManager reservManager)
			throws ReservationException {
 
		LocalReservationManagerImpl reservationManager = (LocalReservationManagerImpl) reservManager;
		ResourceUnitsManagerImpl unitsManager = (ResourceUnitsManagerImpl) unitsManagerInterface;
 
		List<TimeResourceAllocation> usageList = null;
		int reqCpuCnt = 0;
		List<Offer> list = new ArrayList<Offer>(1);
 
		try {
 
			// get number of processors requested by the task
			reqCpuCnt = Double.valueOf(resReqs.getCpuCntRequest()).intValue();
 
			// create unit for which resource usage will be created
			Processors processors = new Processors(unitsManager.getResourceName(),
						 unitsManager.getNumPE(), reqCpuCnt);
 
			DateTime startTime = timeReqs.getStart();
			DateTime endTime = timeReqs.getEnd();
 
			// get resource usage
			usageList = reservationManager.resourceUsageList(processors, startTime, endTime);
 
			// check if resource can provide enough resources in requested time.
			// Prepare one, the earliest time period, which is suited exactly
			// for resource and time requirements.
			TimeResourceAllocation allocation = createSuitedAllocation(usageList,
								(TimeRequirements) timeReqs, 
								(ResourceRequirements) resReqs);
			// allocation is null when it is impossible to provide enough resources 
			// in requested time. Return empty offers list.
			if(allocation == null)
				return list;
 
			// add information about resource described by this allocation.
			// provider
			LocalSystem provider = new LocalSystem(unitsManager.getResourceName(), null, null);
 
			// available resource units
			ResourceStateDescription resDesc = new ResourceStateDescription(provider);
				resDesc.addResourceUnit(new Processors(unitsManager.getResourceName(), 
							unitsManager.getNumPE(), reqCpuCnt));
 
			allocation.setAllocatedResource(resDesc);
 
			// prepare offer
			Offer offer = new Offer();
				offer.setProvider(provider);
				offer.add(allocation);
 
			list.add(offer);
 
		} catch (NoSuchFieldException e) {
			e.printStackTrace();
			return list;
		}
 
		return list;
	}
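The listing above delegates the search for the earliest suitable time period to createSuitedAllocation, which is not shown. The following is a minimal sketch of one way such a search could work, assuming existing reservations are known as (start, end, cpus) triples with times in seconds. The class EarliestSlotFinder and its methods are hypothetical and do not reproduce the GSSIM implementation. The earliest feasible start is always either the beginning of the requested window or the end of an existing reservation, so only those instants are tested.

```java
import java.util.List;
import java.util.OptionalLong;
import java.util.TreeSet;

// Hypothetical sketch of an earliest-fit search; not the GSSIM implementation.
class EarliestSlotFinder {

    // An existing reservation occupying 'cpus' processors in [start, end).
    static final class Reservation {
        final long start;
        final long end;
        final int cpus;
        Reservation(long start, long end, int cpus) {
            this.start = start;
            this.end = end;
            this.cpus = cpus;
        }
    }

    // Number of processors reserved at instant t.
    static int usedAt(List<Reservation> reservations, long t) {
        int used = 0;
        for (Reservation r : reservations) {
            if (r.start <= t && t < r.end) used += r.cpus;
        }
        return used;
    }

    // Usage can only rise at reservation starts, so it is enough to check
    // the beginning of the interval and every start that falls inside it.
    static boolean fits(List<Reservation> reservations, int totalCpus, int reqCpus,
                        long from, long to) {
        if (usedAt(reservations, from) + reqCpus > totalCpus) return false;
        for (Reservation r : reservations) {
            if (r.start > from && r.start < to
                    && usedAt(reservations, r.start) + reqCpus > totalCpus) {
                return false;
            }
        }
        return true;
    }

    // Earliest start in [windowStart, windowEnd - duration] with enough free
    // processors for the whole duration, or empty if there is none.
    static OptionalLong earliestStart(List<Reservation> reservations, int totalCpus,
                                      int reqCpus, long windowStart, long windowEnd,
                                      long duration) {
        TreeSet<Long> candidates = new TreeSet<>();
        candidates.add(windowStart);
        for (Reservation r : reservations) {
            if (r.end > windowStart) candidates.add(r.end);
        }
        for (long t : candidates) {
            if (t + duration > windowEnd) break; // later candidates cannot fit either
            if (fits(reservations, totalCpus, reqCpus, t, t + duration)) {
                return OptionalLong.of(t);
            }
        }
        return OptionalLong.empty();
    }
}
```

An empty result corresponds to the case in which getOffers returns an empty offers list.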

Execution Time Estimation plugin

This is an example implementation of the Execution Time Estimation interface. First, the ratings of the processors are summed up to obtain the total rating. Then the duration of the task is calculated by dividing the task length by the total rating of the available processors.

ExecTimeEstimationPluginImpl

public double execTimeEstimation(
			Map<ResourceParameterName, ResourceUnit> allocatedResources,
			TaskInterface<?> task, double remainingLength) {
 
		// collect all information necessary to do the calculation
		Processors processors = (Processors) allocatedResources.get(ResourceParameterName.CPUCOUNT);
 
		// obtain single processor speed
		int speed = processors.getProcessorSpeed();
 
		// number of used processors
		int cnt = processors.getUsedAmount();
 
		// length of the task (in instructions)
		long taskLength = task.getLength();
 
		// do the calculation; cast to double to avoid truncating integer division
		double execTime = ((double) taskLength / (cnt * speed));
 
		// if the result is very close to 0, but less than one millisecond then round this result to 0.001
		if (Double.compare(execTime, 0.001) < 0) {
			execTime = 0.001;
		}
 
		// time is measured in integer units, so get the nearest execTime int value.
		execTime = Math.ceil(execTime);
		return execTime;
}
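The estimation rule can be tried in isolation with plain numbers. The sketch below assumes floating-point division and passes the task length, the number of used processors and the single processor speed directly as arguments; the class ExecTimeEstimate is a hypothetical stand-alone helper, not a GSSIM class.

```java
// Hypothetical stand-alone version of the estimation rule; not a GSSIM class.
class ExecTimeEstimate {

    static double estimate(long taskLength, int usedProcessors, int processorSpeed) {
        // total rating of the allocated processors
        double totalRating = (double) usedProcessors * processorSpeed;

        // task length divided by total rating
        double execTime = taskLength / totalRating;

        // never report less than one millisecond
        if (execTime < 0.001) {
            execTime = 0.001;
        }

        // time is measured in integer units, so round up
        return Math.ceil(execTime);
    }

    public static void main(String[] args) {
        System.out.println(estimate(1000, 4, 1));
        System.out.println(estimate(10, 4, 1));
    }
}
```

With a task of 1000 instructions on 4 processors of speed 1 the estimate is 250 time units, and a task of 10 instructions on the same allocation is rounded up from 2.5 to 3.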



Workload (read scenario)

The following configuration is based on a real log generated by the DAS2 supercomputer. Other logs can be found in the Parallel Workloads Archive.

The example configuration consists of 3 files:

  • experiment.properties - see the configuration file section for a detailed description of the properties. The file can be downloaded from here.
gridschedulingpluginname=simulator.plugin.gridscheduling.examples.GridSchedulingFCFS
exectimeestimationpluginname=simulator.plugin.estimation.implementation.ExecTimeEstimationPluginImpl
localallocpolicypluginname=simulator.plugin.localallocationpolicies.algorithms.FCFSAllocPolicy
resdesc=experiment/resources.xml
readscenario.workloadfilename=experiment/das2-fs0-2003-1-test3.swf
printhistory=true
  • das2-fs0-2003-1.swf - this is a part of the whole log, consisting of 67 jobs. This log is used in its original form, without any changes and extra parameters. The file can be downloaded from here.
  • resources.xml - this file contains the description of one computing resource. A simple queuing system description is provided in a comment. It is an alternative to the computing resource description and is fully equivalent, so either form can be used and both give the same results. The file can be downloaded from here.


The experiment was executed with the following command:

java -classpath jars_from_lib_dir simulator.GridSchedulingSimulator path/to/experiment.properties

or using Bash script:

./gssim.sh path/to/experiment.properties

The result of the simulation is available here.

Workload (read scenario with xml)

Advance Reservation + not finished tasks

The next example workload consists of two tasks and uses the advance reservation scheduling model. Therefore, both tasks require xml descriptions that define the <executionTime/> section. This workload is fully artificial and was prepared to show the simulator's behaviour when reservations are too short.

  • workload.swf - can be downloaded from here.

Parameters defined in the swf file have the following interpretation:

;StartTime: Mon Nov 03 10:00:00 CET 2008

The simulation starts on Monday, 03 November 2008, at 10:00:00. This is the starting point from which other values, defined as a number of seconds after the start time (like task submit time), are calculated.

;PUSpeed: 1

The default speed of a single CPU equals 1. This value has no unit and can be freely interpreted.

;IDMapping: swfID:jobID:taskID
; 1:0:0, 2:1:0 
;IDMapping: end

The above section defines the mapping between tasks in the swf file and task descriptions in the xml files. This mapping should be read as follows: the swf task with id 1 is mapped to the task with id 0 in the job with id 0, and the swf task with id 2 is mapped to the task with id 0 in the job with id 1.
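The mapping line can be read mechanically. The following is a minimal sketch of parsing an entry list of the form swfID:jobID:taskID, assuming entries are separated by commas and the line starts with the swf comment character. The class IdMappingParser is a hypothetical helper and does not reflect how GSSIM itself reads the header.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper illustrating the IDMapping format; not GSSIM code.
class IdMappingParser {

    // Maps each swf ID to a two-element array {jobID, taskID}.
    static Map<Integer, int[]> parse(String line) {
        Map<Integer, int[]> mapping = new LinkedHashMap<>();

        // drop the leading ';' that marks an swf header comment
        String body = line.replaceFirst("^\\s*;", "").trim();
        if (body.isEmpty()) {
            return mapping;
        }

        for (String entry : body.split(",")) {
            String[] parts = entry.trim().split(":");
            if (parts.length != 3) {
                throw new IllegalArgumentException("Malformed entry: " + entry);
            }
            int swfId = Integer.parseInt(parts[0]);
            int jobId = Integer.parseInt(parts[1]);
            int taskId = Integer.parseInt(parts[2]);
            mapping.put(swfId, new int[] {jobId, taskId});
        }
        return mapping;
    }

    public static void main(String[] args) {
        Map<Integer, int[]> mapping = parse("; 1:0:0, 2:1:0 ");
        for (Map.Entry<Integer, int[]> e : mapping.entrySet()) {
            System.out.println("swf " + e.getKey() + " -> job " + e.getValue()[0]
                    + ", task " + e.getValue()[1]);
        }
    }
}
```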

  • xml job description files - can be downloaded from here.

Both files have the same structure, so the following description is limited to a single, more sophisticated job.

<grmsJob appId="0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" >
	<task taskId="0">	
		<requirements>
			<resourceRequirements>
				<computingResource>
					<hostParameter name="cpucount">
						<value>4</value>
					</hostParameter>
				</computingResource>
			</resourceRequirements>
		</requirements>
		<executionTime>
			<executionDuration>PT50M</executionDuration>
			<timePeriod>
				<periodStart>2008-11-03T11:00:00.000+01:00</periodStart>
				<periodEnd>2008-11-03T13:00:00.000+01:00</periodEnd>
			</timePeriod>
		</executionTime>
	</task>
</grmsJob>

According to the mapping in the workload.swf file, this job description corresponds to job number 1 in the swf file. The "cpucount" resource requirement and the swf "requested number of processors" field have equal values. The xml description defines additional execution time constraints: the task must start after 11:00:00 and finish before 13:00:00 on 03.11.2008. Execution duration is set to 50 minutes, which forces the simulator to create a 50-minute advance reservation. However, the task "run time" in the swf file is set to 3600 seconds (60 minutes), so the reservation is too short. In this case the task is executed for the first 50 minutes and then cancelled at the moment the reservation ends.
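The mismatch between reservation length and run time can be verified with the standard java.time classes. The sketch below assumes, for illustration only, that the reservation starts at the beginning of the requested time period (11:00:00); the actual start is chosen by the scheduler. The class ReservationOutcome is a hypothetical helper.

```java
import java.time.Duration;
import java.time.OffsetDateTime;

// Hypothetical helper comparing a reservation with an swf run time; not GSSIM code.
class ReservationOutcome {

    // The task is cancelled when it needs more time than the reservation offers.
    static boolean isCancelled(Duration reservation, long runTimeSeconds) {
        return runTimeSeconds > reservation.getSeconds();
    }

    // Seconds the task actually executes before it finishes or is cancelled.
    static long executedSeconds(Duration reservation, long runTimeSeconds) {
        return Math.min(runTimeSeconds, reservation.getSeconds());
    }

    static OffsetDateTime reservationEnd(OffsetDateTime start, Duration reservation) {
        return start.plus(reservation);
    }

    public static void main(String[] args) {
        Duration reservation = Duration.parse("PT50M"); // <executionDuration>
        long runTime = 3600;                            // swf "run time" field
        // assumption: the reservation starts at <periodStart>
        OffsetDateTime start = OffsetDateTime.parse("2008-11-03T11:00:00.000+01:00");

        System.out.println("cancelled: " + isCancelled(reservation, runTime));
        System.out.println("executed seconds: " + executedSeconds(reservation, runTime));
        System.out.println("reservation ends: " + reservationEnd(start, reservation));
    }
}
```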

The results of execution of this workload are available here.

Advance Reservation + Network Reservation

The GSSIM network extension adds new parameters to the XML job description file. First, a <networkParameter/> section can be added as a part of <computingResource/>. It lets the user describe additional task requirements related to network topology. As mentioned above, the main purpose of the GSSIM network extension is to simulate a network model, e.g. by transferring data over the network. Therefore the <stageInOut/> section (part of <execution/>) enables the user to specify parameters of task input files, such as file name, size and location.

<grmsJob appId="1" 
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
xsi:noNamespaceSchemaLocation=
"/Users/marcin/workspace/brain/symulator/gssim_current/simulator/schemas/grms3/GrmsJobDescriptionSchema.xsd">
	<task taskId="0">	
		<requirements>
			<resourceRequirements>
				<computingResource>
					<hostParameter name="cpucount">
						<value>1</value>
					</hostParameter>
					<networkParameter name="bandwidth">
						<value>100000</value>
					</networkParameter>
					<networkParameter name="latency">
						<value>300</value>
					</networkParameter>
				</computingResource>
			</resourceRequirements>
		</requirements>
		<execution>
			<stageInOut>
				<file name="file01" type="in" size="500000">
					<location type="REFERENCE">dataRes1</location>
				</file>
				<file name="file02" type="in" size="1000000">
					<location type="REFERENCE">dataRes1</location>
				</file>
			</stageInOut>
		</execution>
		<executionTime>
		<executionDuration>PT2H</executionDuration>
			<timePeriod>
				<periodStart>2008-11-03T00:00:00.000+01:00</periodStart>
				<periodEnd>2008-11-03T02:30:00.000+01:00</periodEnd>
			</timePeriod>
		</executionTime>
	</task>
</grmsJob>

The complete XML job description is shown above and should be read as follows. The task with id = 0 requires two files: file01 (500000 bytes) and file02 (1000000 bytes), both stored on dataRes1. The task also requires specific link parameters: bandwidth 100000 bytes/s and latency 300 ms. Network requirements are not considered during simulation, but they provide helpful information for the Grid scheduler plugin. We assume that dataRes1 is defined in the appropriate resource description file.

Workload (create scenario) - generated workloads

Example configuration

The following listing highlights the main features of the workload generator.

The example configuration has the following interpretation:

  • Simulator will start with initial date: 15.01.2009 at 10:00:00. This time is used as a start date each time the experiment is executed. All other time values (like submit time) are calculated as number of seconds after this initial date.
  • Generator will produce 100 jobs with only one task each. There are no deviations from these values, because both <JobCount/> and <TaskCount/> have constant distribution.
  • The length of each task will vary between 500 and 1500 instructions, but on average it will be around 1000 instructions.
  • Job package length is set to 1, therefore all jobs will have different submit times.
  • Each task will be submitted a minimum of 0 and a maximum of 100 seconds after submission of the previous one. The average distance between submission of two tasks will be around 50 seconds.
  • The value of cpucount resource requirement will vary between 2 and 7, but the average will be around 6.
  • The value of memory resource requirement is defined as a function of the cpucount value. Generator will use the definition with id=cpucnt to create a random value, then multiply this value by 100 and add 3. The result of this calculation will be set as the memory resource requirement value.
  • There are 2 different generator definitions for diskspace resource requirement. For 20% of tasks the value of diskspace resource requirement will be equal to 15. For the remaining 80% of tasks the value of the same requirement will vary between 20 and 80, but the average value will be equal to 50.
  • The value of cpuspeed resource requirement depends on the time when the task is submitted to the system. For tasks submitted between 10:10:00 and 10:20:00 cpuspeed requirement will be equal to 17. For tasks submitted between 10:45:00 and 11:00:00 cpuspeed requirement will be equal to 19. For all tasks submitted outside these intervals the value of cpuspeed resource requirement will be equal to 13.
<tns:WorkloadConfiguration 
	xmlns:tns="http://www.man.poznan.pl/WorkloadSchema" 
	xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
 
  	<SimulationStartTime>2009-01-15T10:00:00</SimulationStartTime>
 
	<JobCount avg="100" distribution="constant"/>
 
	<TaskCount avg="1" distribution="constant"/>
 
	<TaskLength avg="1000" min="500" max="1500" stdev="500.0" distribution="normal"/>
 
	<JobPackageLength avg="1.0" min="1.0" max="1.0" stdev="0.0" distribution="constant"/>
 
	<JobInterval avg="50" min="0" max="100" stdev="30" distribution="normal" seed="21"/>
 
 
	<ComputingResourceHostParameter metric="cpucount">
		<value id="cpucnt" avg="6" min="2" max="7" stdev="3.0" distribution="normal" />
	</ComputingResourceHostParameter>
 
	<ComputingResourceHostParameter metric="memory">
		<value id="memory" refElementId="cpucnt" expr="100*x+3"/>	
	</ComputingResourceHostParameter>
 
	<ComputingResourceHostParameter metric="diskspace">
		<value>
			<MultiDistribution>
				<dist avg="15" distribution="constant">0.2</dist>
				<dist avg="50" min="20" max="80" distribution="normal">0.8</dist>
			</MultiDistribution>
		</value>
	</ComputingResourceHostParameter>
 
	<ComputingResourceHostParameter metric="cpuspeed">
		<value avg="13" distribution="constant">
			<PeriodicValidValues avg="17" distribution="constant">
				<BeginValidTime>2009-01-15T10:10:00</BeginValidTime>
				<EndValidTime>2009-01-15T10:20:00</EndValidTime>
			</PeriodicValidValues>
			<PeriodicValidValues avg="19" distribution="constant">
				<BeginValidTime>2009-01-15T10:45:00</BeginValidTime>
				<EndValidTime>2009-01-15T11:00:00</EndValidTime>
			</PeriodicValidValues>
		</value>
	</ComputingResourceHostParameter>
 
</tns:WorkloadConfiguration>
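The rules expressed by this configuration can be mimicked in a few lines of Java. The sketch below is only an illustration of the semantics described above and is not the generator's implementation. It assumes that out-of-range samples of a bounded normal distribution are redrawn, that validity periods include their begin time and exclude their end time, and that the diskspace normal distribution uses a standard deviation of 15 (the configuration does not state one). The class GeneratorSemantics and its methods are hypothetical names.

```java
import java.time.LocalDateTime;
import java.util.Random;

// Hypothetical illustration of the configuration semantics; not the GSSIM generator.
class GeneratorSemantics {

    // Normal sample restricted to [min, max]; redrawing is an assumption.
    static double boundedNormal(Random rnd, double avg, double stdev,
                                double min, double max) {
        double value;
        do {
            value = avg + stdev * rnd.nextGaussian();
        } while (value < min || value > max);
        return value;
    }

    // memory: expr="100*x+3" where x is the value generated for id=cpucnt
    static double memoryFor(double cpuCount) {
        return 100 * cpuCount + 3;
    }

    // diskspace: constant 15 with probability 0.2, otherwise normal in [20, 80]
    static double diskSpace(Random rnd) {
        double assumedStdev = 15.0; // assumption: not given in the configuration
        return rnd.nextDouble() < 0.2
                ? 15.0
                : boundedNormal(rnd, 50.0, assumedStdev, 20.0, 80.0);
    }

    // cpuspeed: default 13, overridden inside the two validity periods
    static int cpuSpeed(LocalDateTime submitTime) {
        if (within(submitTime, "2009-01-15T10:10:00", "2009-01-15T10:20:00")) return 17;
        if (within(submitTime, "2009-01-15T10:45:00", "2009-01-15T11:00:00")) return 19;
        return 13;
    }

    // half-open interval [begin, end); the boundary handling is an assumption
    private static boolean within(LocalDateTime t, String begin, String end) {
        return !t.isBefore(LocalDateTime.parse(begin)) && t.isBefore(LocalDateTime.parse(end));
    }

    public static void main(String[] args) {
        Random rnd = new Random(21);
        double cpuCount = boundedNormal(rnd, 6.0, 3.0, 2.0, 7.0);
        System.out.println("cpucount: " + cpuCount);
        System.out.println("memory:   " + memoryFor(cpuCount));
        System.out.println("diskspace: " + diskSpace(rnd));
        System.out.println("cpuspeed: " + cpuSpeed(LocalDateTime.parse("2009-01-15T10:15:00")));
    }
}
```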

Example execution

Proper execution of workload generator requires:

Listing of the example properties file is presented below.

resdesc=generator/resources.xml
createscenario.tasksdesc=generator/generatorConfig.xml
createscenario.outputfolder=generator/workload
createscenario.workloadfilename=workload.swf
createscenario.overwrite_files=true

Generator can be executed by the following command:

java -classpath jars_from_lib_dir simulator.workload.generator.WorkloadGenerator path/to/generator.properties

or using Bash script:

./gssim -cw path/to/experiment.properties

Generator files

References
