QoS Monitoring Data Set
This page serves as an appendix to the short paper A Monitoring Data Set for Evaluating QoS-Aware Service-Based Systems, to appear at the 4th International Workshop on Principles of Engineering Service-Oriented Systems (http://www.s-cube-network.eu/pesos-2012). The paper describes a set of monitoring data that we envision being used to evaluate upcoming work on QoS and SLA prediction for composed services and service-based applications.
This artifact is an extensive set of monitoring data, measured from a running sample service composition. The example application was implemented using .NET Windows Communication Foundation (WCF) technology and deployed on a server running Windows 2007 SP2 (64 bit). The server machine was equipped with two 2.99 GHz Xeon X5450 processors and 32 GByte of RAM. A screenshot of the monitored service composition (in Windows Workflow Foundation notation) can be found here: http://www.infosys.tuwien.ac.at/prototypes/VRESCo/workflow.jpg.
The following table displays some basic statistics of the data set.
|Nr. of Instances (Total)|9848|
|Nr. of Attributes|89|
|Nr. of Activities of Monitored Composition|45|
|Nr. of Service Invocation Activities|23|
|Mean Process Duration|36208 ms|
|Process Duration Standard Deviation|6227 ms|
The data set was generated using Windows Workflow Foundation, Windows Communication Foundation and the VRESCo research prototype (see here: https://www.infosys.tuwien.ac.at/prototypes/VRESCo/). The data is stored in the ARFF format of the WEKA machine learning toolkit (http://www.cs.waikato.ac.nz/ml/weka/arff.html), so that it can be analyzed directly with WEKA.
How to install
How to use
Every line in the data set is a single composition instance. Each line is a whitespace-separated list of attribute values. Values that themselves contain whitespace are enclosed in single quotation marks. The special character ? identifies a missing attribute value. The first value in each row is a UUID identifying the instance. The values that follow are various metrics that can be monitored from the running instance, such as the response times of services, the ordered product, and the duration of subbranches. Of these, the most interesting is the first (DELIVERY_TIME), which represents the duration of the process instance as a whole. The boolean attributes at the end of each line indicate whether a given adaptation action has been applied to this instance: 1 means that the action was applied, 0 that it was skipped. The concrete semantics of each adaptation are implementation-specific and cannot be described here for reasons of brevity. More information on each attribute is provided inline.
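As an illustration of the line format described above, here is a minimal Python sketch for reading instances outside of WEKA. It is not part of the data set or its tooling. The function names (is_data_line, parse_instance_line, duration_summary) are hypothetical. The sketch assumes that single-quoted values contain no embedded quote or backslash characters, and that DELIVERY_TIME is the value directly after the UUID (column index 1). ARFF header lines (starting with @) and comment lines (starting with %) are skipped.

```python
import shlex
import statistics

MISSING = "?"  # ARFF marker for a missing attribute value


def is_data_line(line):
    """True for instance rows; False for blank lines, ARFF headers (@...) and comments (%...)."""
    stripped = line.strip()
    return bool(stripped) and not stripped.startswith(("@", "%"))


def parse_instance_line(line):
    """Split one whitespace-separated row, honouring single-quoted values.

    Missing values ('?') are returned as None; all other values stay strings.
    """
    tokens = shlex.split(line)  # POSIX-style splitting handles '...' quoting
    return [None if token == MISSING else token for token in tokens]


def duration_summary(rows, column=1):
    """Mean and sample standard deviation of one numeric column.

    column=1 is an assumption: the value directly after the UUID.
    """
    values = [float(row[column]) for row in rows if row[column] is not None]
    return statistics.mean(values), statistics.stdev(values)


# Usage with made-up rows (not taken from the actual data set):
sample = [
    "% a comment line",
    "@data",
    "id-1 100 'Product A' ? 1 0",
    "id-2 200 'Product B' 5 0 1",
]
rows = [parse_instance_line(line) for line in sample if is_data_line(line)]
print(rows[0])  # ['id-1', '100', 'Product A', None, '1', '0']
print(duration_summary(rows))
```

Whether the standard deviation reported in the table above is a sample or a population value is not stated, so statistics.stdev is only one possible choice here; statistics.pstdev would give the population variant.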
Please contact Philipp Leitner (mailto:email@example.com) for enquiries about this data set.
The data set was originally used in the evaluation of a contribution to IEEE Transactions on Services Computing:
Leitner, P.; Hummer, W.; Dustdar, S.: Cost-Based Optimization of Service Compositions. IEEE Transactions on Services Computing, forthcoming. doi: 10.1109/TSC.2011.53 URL: http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=6072201&isnumber=4629387
Related case study