A Rich Client Platform for Integrated DIA Development

DICE focuses on quality assurance for data-intensive applications (DIA) developed through the Model-Driven Engineering (MDE) paradigm. The project aims to deliver methods and tools that help satisfy quality requirements in data-intensive applications through iterative enhancement of their architecture design. One component of the tool chain developed within the project is the DICE IDE, an Integrated Development Environment (IDE) that accelerates the development of data-intensive applications.

The Eclipse-based DICE IDE integrates most of the tools of the DICE framework and is the basis of the DICE methodology. As highlighted in the deliverable D1.1 State of the Art Analysis, no MDE IDE yet exists on the software market through which a designer can create models to describe and analyse data-intensive or Big Data applications and their underpinning technology stack. This is the motivation for defining the DICE IDE.

The DICE IDE is based on Eclipse, which is the de-facto standard for the creation of software engineering models based on the MDE approach. DICE customizes the Eclipse IDE with suitable plug-ins that integrate the execution of the different DICE tools, in order to minimize learning curves and simplify adoption. In this blog post we explain how the DICE tools introduced to the reader earlier have been integrated into the IDE. So, how is the DICE IDE built?

Securing federated cloud networks using Service Function Chaining

Sébastien Dupont - CETIC

Software-defined networking (SDN), network function virtualization (NFV) and service function chaining (SFC) technologies enable more advanced and flexible cloud federation mechanisms. In this blog post, we will show how to use those technologies in federated clouds to improve security.

Protecting network overlays using Service Function Chaining

Cloud network security can be significantly improved by composing network functions such as firewalls, intrusion detection and deep packet inspection. The image below illustrates how data flows through different paths depending on network security policies.


What about protecting federated networks?

SFC and NFV provide a way to secure each individual network inside a cloud federation. The following figure shows two federated networks belonging to different clouds that are protected using SFC/NFV. Each cloud administrator manages its own network security policy, and an additional global federated network security policy is applied on top. For each cloud, the intra-cloud inbound and outbound traffic goes through a series of virtual network functions.


Protecting an OpenStack federation with SFC/NFV

The OpenStack Heat project provides a template-based orchestration mechanism, formalised in YAML (YAML Ain't Markup Language), that can be extended to support SFC network security policies. The TOSCA project proposes a service manifest specification for NFV, which can be translated into Heat.
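For readers unfamiliar with Heat, the following minimal template sketch shows the general YAML shape. It uses only standard resource types, and the image and flavor names are illustrative; the SFC-specific policy types mentioned above would be the extension point:

heat_template_version: 2016-04-08

description: Minimal illustrative Heat template (no SFC extensions).

resources:
  app_net:
    type: OS::Neutron::Net

  app_server:
    type: OS::Nova::Server
    properties:
      image: ubuntu-16.04      # illustrative image name
      flavor: m1.small
      networks:
        - network: { get_resource: app_net }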


We are currently investigating two OpenStack components to protect an OpenStack cloud federation: Tacker for the NFV management and networking-sfc for the NFV orchestration.

Case studies

SFC/NFV Encryption

In this scenario we consider three clouds, where the connection with one of those clouds is untrusted. To secure the communications, we can add encryption and decryption at the network level using dedicated SFC/NFV.


The global security policy itself is described in a service manifest that pairs each traffic path with the chain of security functions its traffic must traverse.
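As a purely hypothetical illustration of the idea (the field names below are invented for this sketch, not the project's actual schema):

# Hypothetical sketch of a global federated security policy.
federation: three-cloud-scenario
policies:
  - from: cloud-a
    to: cloud-c              # untrusted link: encrypt in transit
    chain: [firewall, encryption]
  - from: cloud-c
    to: cloud-a
    chain: [decryption, firewall]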


SFC/NFV Encryption and Deep Packet Inspection

Some network functions should be performed asynchronously to avoid slowing down the traffic. In this scenario, the encryption and firewalling operations are performed synchronously, because the security system needs to respond directly when traffic goes through those network functions, whereas deep packet inspection (DPI) can be applied after the traffic has already gone through.


References

Philippe Massonet, Anna Levin, Massimo Villari, Sébastien Dupont and Arnaud Michot: Enforcement of Global Security Policies in Federated Cloud Networks with Virtual Network Functions. NCA 2016.

Philippe Massonet, Sébastien Dupont, Arnaud Michot, Anna Levin, Massimo Villari: An architecture for securing federated cloud networks with Service Function Chaining. ISCC 2016: 38-43.

Philippe Massonet, Anna Levin, Antonio Celesti, Massimo Villari: Security Requirements in a Federated Cloud Networking Architecture. ESOCC Workshops 2015: 79-88

Formal Verification of Data-Intensive Applications with Temporal Logic

Besides functional aspects, designers of Data-Intensive Applications have to consider various quality aspects that are specific to applications processing huge volumes of data with high throughput and running in clusters of (many) physical machines. A broad set of non-functional aspects in the areas of performance and safety should be included at an early stage of the design process to guarantee high-quality software development.

Evaluating the correctness of such applications, especially when functional and non-functional aspects are both involved, is definitely not trivial. In the case of Data-Intensive Applications, the inherent distributed architecture, the software stratification and the computational paradigm implementing the logic of the applications pose new questions about the criteria that should be considered to evaluate their correctness.


Data-intensive applications are commonly realized through independent computational nodes that are managed by a supervisor providing resource allocation and node synchronization functionality. Message exchange is guaranteed by an underlying network infrastructure over which the (data-intensive) framework might implement suitable mechanisms to guarantee correct message transfer among the nodes. The logic of the application is the tip of the iceberg of a very complex software architecture which the developer cannot completely govern. Between the application code and the deployed running executables there are many interconnected layers, offering abstractions and running control automatisms, that are not visible to the developers (such as, for instance, the containerization mechanisms, the cluster manager, etc.).

Besides the architectural aspects of the framework, the functionality of data-intensive applications requires, in some cases, a careful analysis of the notion of correctness adopted to evaluate the outcomes. This is the case, for instance, of streaming applications. The functionality of streaming applications is defined through the combination and concatenation of operations on streams, i.e., infinite sequences of messages originating from external data sources or from the computational nodes constituting the application. The operations can transform a stream into a new stream or can aggregate a result by reducing a stream into data. Sometimes, the operations are defined over portions of streams, called windows, that partition the streams on the basis of specific grouping criteria for the messages in the stream. The complexity and the variety of parameters defining the operations make the definition of the streaming transformation semantics far from obvious and the assessment of their correctness far from trivial.

In DICE, the evaluation of correctness concerns “safety” aspects of data-intensive applications. Verification of safety properties is done automatically by means of a model checking analysis that the designer performs at design time. The formal abstraction that models the application behavior is first extracted from the application UML diagrams and later verified to check for the existence of incorrect executions, i.e., executions that do not conform to specific criteria identifying the required behavior. Time and the ordering relation among the events of the application are the main aspects characterizing the formalism used for verification, which is based on specific extensions of Linear Temporal Logic (LTL). As already pointed out, since the technological framework affects the definition of correctness to be adopted for evaluating the final application, the formal modeling devised for DICE verification combines an abstraction of functional aspects with a simplified representation of the computational paradigm adopted to implement the application.
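To give a flavour of such properties, a typical bounded-response requirement in a metric extension of LTL states that every message received by a node is processed within $T$ time units (a generic example, not D-verT's exact encoding):

$\Box\,(\mathrm{receive}(m) \rightarrow \Diamond_{\le T}\,\mathrm{process}(m))$

where $\Box$ reads "at every instant" and $\Diamond_{\le T}$ reads "eventually, within $T$ time units".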

DICE verification is carried out by D-verT and focuses on Apache Storm and (soon) Spark, two baseline technologies for streaming and batch applications. The computational mechanism they implement is captured by means of logical formulae that, when instantiated for a specific DTSM application model, represent the executions of the Storm (or Spark) application. The analyses that the user can perform from the DICE IDE are bottleneck analysis of Storm applications and worst-case time analysis of Spark applications (the latter is a work in progress).

In the first case, the developer can verify the existence of a node of a Storm application that cannot process the incoming workload in a timely manner. Such a node is likely to be a bottleneck for the application, which might cause memory saturation and degrade the overall performance. In the second case, the developer can perform a worst-case analysis of the total time span required by a Spark application to complete a job. The overall job time, which must meet a given deadline at runtime, can be evaluated on the basis of a task time estimation, for the physical resources available in the cluster, that must be known before running the verification.
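In simplified queueing terms (a back-of-the-envelope view, not the tool's logical encoding), a node receiving tuples at rate $\lambda$ and processing them at rate $\mu$ per executor, with $p$ parallel executors, keeps its queue bounded only if

$\lambda < p \cdot \mu$

and D-verT searches for executions in which this kind of timeliness constraint is violated.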

Details about verification techniques can be found in Deliverable D3.5 – Verification tool Initial Version and on the DICE Github repository.

Related material:

  1. Francesco Marconi, Marcello M. Bersani, Madalina Erascu, Matteo Rossi: Towards the Formal Verification of Data-Intensive Applications Through Metric Temporal Logic. ICFEM 2016.
  2. Francesco Marconi, Marcello Maria Bersani and Matteo Rossi: Formal Verification of Storm Topologies through D-verT. SAC 2017.

Marcello M. Bersani and Verification team (PMI)

ENTICE & TEDx - Radu Prodan: The Dark, Disruptive Side of the Cloud

In our latest blog we look back at a recent TEDx talk from the ENTICE Scientific Coordinator, Radu Prodan, in which he provides insight into the technology of clouds, its historical development, how clouds interconnect today and what possibilities there are for the future.

Radu Prodan is a trained engineer and Doctor of Technical Sciences, and Technical Coordinator of the ENTICE project. This talk discusses the mysterious Clouds as today’s de-facto interconnection, storage, and computing paradigm, gathering billions of devices spread around the globe. 

This talk was given at a TEDx event using the TED conference format but independently organized by a local community. Learn more at http://ted.com/tedx

ENTICE: 5th Cloud Assisted Conference

In collaboration with Slovenia’s Chamber of Commerce, the University of Ljubljana and the ENTICE project are co-organising the 5th Cloud Assisted Conference on November 9th, 2016. The programme and the presentations are available online.

At the CLASS 2016 event, several projects of Slovenia's 55 million EUR Smart Specialization funding programme were presented, along with the results of Horizon 2020 projects related to smart cities, homes, communities, eHealth and Industry 4.0.

If you want to know more about ENTICE then why not take a look at our excellent commercial use cases?

Performance and Reliability in DIA Development

Worried about the performance and reliability of your data-intensive application?

Capgemini research shows that only 13% of organizations have achieved full-scale production for their Data-Intensive Applications (DIA). In particular, the research refers to applications using Big Data implementations such as Hadoop MapReduce, Apache Storm or Apache Spark. Apart from the correct deployment and optimization of a DIA, software engineers face the problem of achieving performance and reliability requirements. A framework to assist in guaranteeing these requirements in the very early phases of development could therefore be of great help, considering that in later phases the ecosystem of a cluster is not completely controllable. Predictions of throughput, service times or scalability under varying numbers of users, workloads, network traffic or failures are therefore a necessity. Within the DICE project, the Simulation tool has been developed to help achieve exactly that.


If you are looking for a quality-driven framework for DIA development, the Simulation tool [1] of the DICE project can be your choice. This tool makes it easy to simulate the behavior of the system prior to deployment; in effect, you get a virtual testbed that allows the performance assessment of the DIA. The Simulation tool features:

  • Prediction of performance metrics: throughput, utilization or service time;
  • Detection of performance bottlenecks;
  • Detection of reliability issues.

Once software developers get the simulation results, they can configure, adapt, or optimize their DIA for the specific execution context. The Simulation tool offers a modeling environment integrated within the Papyrus Eclipse tool. It guides the software developer through the design and analysis phases. The Simulation tool covers all the steps of a simulation workflow, as follows:

  1. modeling with high-level description languages, in particular UML, using a novel profile to describe the parameters and characteristics of the system,
  2. transformation to performance models, specifically Stochastic Petri Nets, that are suitable for prediction, and
  3. analysis of the model and retrieval of the performance results.

The following image offers an overview of the simulation workflow with the internal tools, modules, and configurations. The transformation of UML to a Stochastic Petri Net is done by a model-to-model (M2M) transformation written in the QVTo language. The Stochastic Petri Net is then analyzed by the GreatSPN tool, which produces the performance results.
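As a simple example of the kind of result produced: if the analysis estimates a mean service time $s$ for a resource that receives requests at rate $\lambda$, the predicted utilization is

$\rho = \lambda \cdot s$

and a utilization approaching 1 flags that resource as a candidate bottleneck. This is standard queueing reasoning, offered here only to illustrate the metrics; the tool's actual analysis is the Petri-net-based workflow described above.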


The Simulation tool has been integrated within the DICE IDE, but it can also be used as a stand-alone application. Currently, the Simulation tool supports platform-independent models as well as the Storm technology. We plan to extend the technology support to Apache Spark, Tez and Hadoop in the following releases. For more details about the Simulation tool, please visit our GitHub page.

José Merseguer, José I. Requeno and Diego Pérez (ZAR)

References:

[1] A. Gómez, C. Joubert and J. Merseguer. A Tool for Assessing Performance Requirements of Data-Intensive Applications. XXIV Jornadas de Concurrencia y Sistemas Distribuidos (JCSD 2016).

BEACON's Federated SDN

This blog post by OpenNebula Systems outlines the features of the Federated SDN in BEACON and how it is structured.

BEACON is all about federating networks across cloud infrastructures securely. The Federated SDN is the software component that makes it possible to build a Federated Network by aggregating two or more Federated Network Segments. It features an API for Federated Network definitions, and uses adapters to talk to the federation agents' APIs in different cloud infrastructures as well as to the Cloud Management Platforms (CMP). It is in charge of cross-site networking and managing federated networks, and as such will address the following functionality in the first cycle. This component addresses the "Management of L2 overlays" software requirement of the project.


This component features a REST interface to expose the functionality of the core component, which manages pools for the different data objects that represent the networking infrastructure being federated. A database is used to persist the data model, and a well-defined API allows interaction between the Federated SDN core and the underlying cloud through different adapters for OpenStack- and OpenNebula-based infrastructures. A high-level view of this component's architecture is depicted in the following figure.


The Federated SDN features four first-class data citizens: the federated network and federated segment objects, the tenant representation and the different cloud site abstractions. Also, to interact with the different clouds that need to be federated at the network level, the Federated SDN features cloud adapters. Initially two adapters, OpenNebula and OpenStack, have been developed. Each adapter is composed of a set of scripts that receive parameters from standard input and return results on standard output.

Using Apache Storm for Trend Detection in the Social Media

As it is widely known, especially in the media industry, messages posted in social media contain valuable information related to events and trends in the real world. Various industries and brands that analyze social media are gaining valuable insights and information which they use in a number of operations.

For example, in the news industry, trend detection is useful for:

  • identifying emerging news based on the popularity of a certain topic and
  • defining areas of great public interest that should be closely monitored as even a small development affects many people and leads to emerging news.


As another example, in the financial sector, trends may have both short-term and long-term consequences, affecting everything from daily stock prices to a country's macroeconomic indicators. For instance, a trend demanding military action in the Middle East as a result of a terrorist attack may affect oil prices and subsequently decrease car sales.

To this end, and taking into account the large scale of that type of content, it is essential to develop methods for efficient trend detection in real-time.

For example, in recent years the pace of decision-making in breaking-news journalism has significantly increased. This is due to the multiplication of digital sources and incoming data streams, digital production processes, automation, real-time publishing and largely mobile news audiences.

The Storm topology in the following figure is a first sketch for the implementation of a known trend detection method in a distributed manner. The method is a feature-pivot method that analyzes the temporal distributions of words and discovers trends by grouping trending keywords together.


There are different possible inputs to the topology: candidate spouts include the Twitter streaming API and queues that inject messages into the topology (Redis, Apache Kafka). The first processing bolt is responsible for the extraction of entities and keywords from the incoming messages.

Trivial keywords (e.g. stop-words) are discarded while the rest are forwarded to the next bolt. The Timeline Generation bolt aggregates (keyword, timestamp) tuples and creates a set of statistics for each keyword. In other words, this bolt calculates a background model of expected frequencies based on historical data. Tuples associated with the same keyword are aggregated in the same worker of the Timeline Generation bolt, in a similar fashion to MapReduce.

The resulting baseline model is forwarded to the next bolt each time there is an update. Then, the Bursty Keywords Detection bolt compares current frequencies to the baseline model and detects keywords whose deviation from the baseline is extraordinary.

Finally, the detected bursty keywords are clustered together in the final bolt of the topology based on keyword co-occurrences. The extracted trends are stored in a database.
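As a rough illustration of the burst test such a bolt might apply, the Python sketch below flags keywords whose observed frequency in the current window greatly exceeds the expected frequency from the baseline model; the smoothing constant and thresholds are illustrative assumptions, not the actual method used in the topology:

from collections import Counter

def bursty_keywords(current, baseline, min_count=10, min_ratio=3.0):
    """Return (keyword, burstiness) pairs where the frequency observed
    in the current window greatly exceeds the baseline expectation."""
    bursty = []
    for word, count in current.items():
        expected = baseline.get(word, 0.001)  # smoothing for unseen words
        ratio = count / expected
        if count >= min_count and ratio >= min_ratio:
            bursty.append((word, ratio))
    return sorted(bursty, key=lambda pair: -pair[1])

# 'earthquake' normally appears ~2 times per window, now 40: bursty.
current = Counter({"earthquake": 40, "the": 500})
baseline = {"earthquake": 2.0, "the": 480.0}
print(bursty_keywords(current, baseline))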

We are currently conducting experiments on this Trend Detector topology and trying out changes that may improve the quality of results.

The flexiOPS Use Case

“I see only murk. Murk outside; murk inside. I hope, for everyone’s sake, the scanners do better.”— from A Scanner Darkly, by Philip K Dick

In this post, flexiOPS developer Andrew Phee details the implementation of the flexiOPS Use Case for the BEACON project.

The BEACON Use Case involves using an open source security scanner to highlight security limitations of Virtual Machine (VM) deployments. The scanner is configured to support scanning of VMs from multiple cloud platforms. The scanner chosen was OpenVAS, a powerful open source vulnerability scanning framework.

The overall result of the work carried out is that when a new VM is created on the platform, it is scanned by OpenVAS for security vulnerabilities. The generated security report is then emailed to the VM owner, and a firewall is created and applied to the VM as an additional security measure.

Existing on the FCO platform is a program known as a trigger. In the FCO platform, a trigger is a program that “allows an action in Flexiant Cloud Orchestrator to initiate a second action” [1]. In this case, the code is executed when a new VM is created on a specific customer account. The trigger code's resulting action first launches the new VM into a running state, then uses the client socket executable located on the FCO management box to send the VM details (IP, UUID, etc.) to the vulnerability scanner listener program located on a separate server.

Assuming the vulnerability scanner is listening properly, it receives the VM details and uses them to build commands to be sent to the OpenVAS deployment. OpenVAS then performs actions based on the received commands. The main task performed by OpenVAS is to carry out a security vulnerability scan on the newly created VM. This scan generates a report, which provides insight into how vulnerable the VM is to attackers.

This report is sent to the customer email associated with the account used to create the VM. This could potentially be extremely useful for a VM owner, as they can use the report to understand exactly where the security failings are on their VM and make improvements accordingly.

Finally, the vulnerability scanner listener creates a generic firewall on the FCO platform, and applies it to the VM. While not specifically configured to address the security problems highlighted by the OpenVAS scan report, it nonetheless provides an additional security layer for the VM.

This process helps provide immediate security improvements in the form of creating and applying a firewall to the VM. Possible future improvements are also feasible, as the VM owner has the OpenVAS report which highlights areas in which the security of the VM can be improved.

References

[1]. http://docs.flexiant.com/display/DOCS/Triggers

ESOCC 2016 in Vienna

Philippe, Anna and Massimo from our project consortium spoke about BEACON at the 5th European Conference on Service-Oriented and Cloud Computing which took place in Vienna earlier this month. 

Thanks to conference hosts TU Wien for a great event.


Slides from the keynote events are now available.

We welcome any queries or feedback on BEACON so far - please don't hesitate to get in touch with us if you are interested in hearing more.

NEWS: EC Consultation on Cloud Computing

The European Commission is seeking contributions to a consultation from all interested stakeholders on the future research and innovation challenges in the area of Cloud Computing.  Responses will feed into the plans for the forthcoming H2020 LEIT ICT Work Programme 2018-2020.


Visit the EC Digital Single Market website to find out more.

Find out more about the consultation and have your say by 10th October on the consultation website.

BEACON @ NetFutures 2016

We were pleased to have a demonstration booth at this year's NetFutures event in Brussels. Our booth presented two demos made specifically for the event, each showcasing an aspect of the innovations that BEACON will develop.

This video shows our Hybrid Federation: the creation of a federated network between OpenNebula and a public cloud.


Below is a demonstration of BEACON's OpenStack peer federation. This showcases the extension of a federated network based on modification of geo-location constraints.

Going for NoOps: should SysAdmins be worried for their jobs?

Reliable and fast automation drives an efficient quality-driven development process. In DICE, we are factoring the deployment of services such as Storm, Cassandra or Hadoop into this process. We offer this capability in a tool called DICER, and back it up with a technology library that off-loads the installation and configuration work to a set of scripts. In effect, our technology library enables a NoOps experience for the users, because no SysAdmins are required to do the work of setting these services up. But is this bad news for the SysAdmins? Will DICE put them out of a job?


During the recent QUDOS workshop, the participants had an interesting discussion after the DICER paper presentation. A person in the audience commented that we might not want to present this work to SysAdmins. Building a tool that makes people lose their jobs certainly doesn't sound like a good thing.

DICER is a powerful tool in the hands of SysAdmins, as it allows them to write scripts to deploy new Big Data technologies and to give Dev people the possibility to automatically configure and deploy them based on their specific needs. Moreover, it gives both Dev people and SysAdmins an environment where they can work together at an abstract level to analyze and fine-tune the behavior of their Data-Intensive Applications.

Also, the SysAdmins remain valuable for creating and maintaining the services the developers use. This includes managing an internal testbed in companies that leverage a private cloud, or overseeing that policies and quotas are being observed wherever public clouds are in use. Services such as DICER, the DICE deployment service, Jenkins and others also need SysAdmins to be installed, configured and kept up to date.

DICE will also enable the SysAdmins to be actively included in the DevOps workflow. They could easily be assigned to review and provide suggestions and guidance in the architecting and re-architecting of Data-Intensive Applications. Moreover, the DICE monitoring framework and Quality Testing tools give them additional means to assess the fitness of the infrastructure to run resource-hungry applications.

New paradigms such as DevOps and Big Data technologies always cause some change and require adaptation from the people involved. However, tools such as the DICE toolset should primarily help users become more efficient and competitive. Therefore, when talking about NoOps, we mean “no manual operation work”, which means for SysAdmins a new and more effective way to work and to be important actors in setting up complex services. The burden of menial, repetitive work should be taken over by automation, leaving people free to do creative work and make important decisions.

Matej Artač, XLAB

Elisabetta Di Nitto, PMI

A design for life!

Have you ever had problems working with a data intensive application?

If so, you'll know that the difficulty comes from having to deal with various, unavoidable failures. So what do you do? Many people have sought success by designing software to never fail. But there are a few things you should know before you buy and implement a solution in order to ensure your software is actually resilient to failures of the hosting environment. This post will tell you what you need to know to select a much more viable strategy for making your applications reliable, one that lets you properly test applications both during development and after deployment. Within the DICE project, a Fault Injection Tool (FIT) has been developed to help achieve exactly that.

 
If you're looking for a way to generate faults within Virtual Machines and at the Cloud Provider level, the FIT is a great fit. It gives cloud platform owners and application VM owners a means to test the resiliency of a cloud installation and its applications. The FIT allows:

  • application designers to use testing to show where to “harden” their application before it reaches a commercial environment;
  • users/application owners to test and understand their application design/deployment in the event of a cloud failure or outage;
  • mitigation of risk in advance of a cloud-based deployment;
  • developers and cloud operators to simplify testing within the DevOps paradigm for their customers.

In addition, the design and development of the DICE FIT is modular in nature, which allows the replacement of any function that injects faults as well as the ability to extend the tool as required. Further, the FIT downloads, installs and configures only what is required at the time, meaning no unnecessary tools or dependencies are installed.


Now that you understand the DICE approach (in the diagram above), you're ready to develop using a quality-driven DevOps approach to build and test your application to withstand failing parts of the infrastructure or misbehaving services.


Craig Sheridan, FLEXI

DICE Configuration Optimization Tool (BO4CO)

Big Data systems are regarded as a new class of software systems that leverage several emerging technologies to efficiently ingest, process and produce large quantities of data. Each of the constituent technologies (e.g., Hadoop, Spark, Cassandra) typically has dozens of configurable parameters that should be carefully tuned in order to perform optimally. Unfortunately, users of such systems, like data scientists, usually lack the technical skills to tune system internals. Such users would rather use a system that can tune itself. Yet, there is a shortage of automated methods to support the configuration of Big Data systems. One possible explanation is that the influence of configuration options on performance is not well understood [1].


Performance differences between a well-tuned configuration and a poorly configured one can span orders of magnitude. Typically, administrators use a mix of rules of thumb, trial and error, and heuristic methods for setting configuration parameters. However, this way of testing and tuning is slow, and it requires skillful administrators with a good understanding of the system internals. Furthermore, decisions are also affected by the nonlinear interactions between configuration parameters.

Today's mandate for faster business innovation, faster response to changes in the market, and faster development of new products demands a new paradigm for Big Data software development. DevOps [2] is a set of practices that aim to decrease the time between changing a system in Development and transferring the change to the Operations environment, while exploiting Operations data back in Development. Continuous Configuration Optimization is one of the cornerstone practices in DevOps, where the tuning process requires Ops data in order to optimize performance on the Dev side [3].

We have developed a Configuration Optimization tool for Big Data systems called BO4CO. Bayesian Optimization for Configuration Optimization (BO4CO) [1] is an auto-tuning algorithm for Big Data applications. BO4CO helps end users of Big Data systems, such as data scientists or SMEs, to automatically tune the system.

The following figure illustrates the components of BO4CO: (i) an optimization component, (ii) an experimental suite, and (iii) a data broker.


Figure 1. BO4CO Architecture [1].

BO4CO is designed keeping in mind the limitations of sparse sampling from the configuration space. Its features include: (i) sequential planning to perform experiments that ensure coverage of the most promising zones; (ii) memorization of past-collected samples while planning new experiments; (iii) guarantees that optimal configurations will eventually be discovered by the algorithm.
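To make the idea concrete, here is a minimal Python sketch of such a sequential, model-based search loop with a Gaussian-process surrogate; the benchmark function, parameter grid and acquisition rule are illustrative simplifications, not BO4CO's actual algorithm (see [1] for that):

import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor

def measure_latency(config):
    """Stand-in for running a real benchmark with a given configuration."""
    return (config[0] - 7) ** 2 + np.random.normal(0, 0.5)

# Candidate values for a single (illustrative) configuration parameter.
candidates = np.arange(1, 21).reshape(-1, 1)

# Initial design: measure the two extreme configurations.
X, y = [[1], [20]], [measure_latency([1]), measure_latency([20])]

gp = GaussianProcessRegressor()
for _ in range(10):                      # fixed experiment budget
    gp.fit(X, y)                         # update the surrogate model
    mu, sigma = gp.predict(candidates, return_std=True)
    nxt = candidates[np.argmin(mu - 1.96 * sigma)]  # lower-confidence bound
    X.append(list(nxt))
    y.append(measure_latency(nxt))

print("best configuration:", X[int(np.argmin(y))], "latency:", min(y))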

We have carried out extensive experiments with three different stream processing system benchmarks running on Apache Storm. The experimental results demonstrate that BO4CO outperforms the baselines in terms of distance to the optimum performance by at least an order of magnitude. We have also provided some evidence that the model learned throughout the search process can also be useful for performance predictions. For more information regarding the experimental results, please refer to [1].

BO4CO is integrated with the DICE delivery tools (Deployment Service and Continuous Integration), the DICE monitoring platform (DMon) as well as the DICE IDE. BO4CO currently supports Apache Storm and Cassandra. We will extend this technology support to Apache Spark and Hadoop in the next release. The integration with DICE quality testing is also under way. We are also developing a novel approach that takes into account previous system measurements in order to accelerate the tuning process. This approach is suitable in a DevOps context where several system versions are continuously released on a daily basis.

You can either use BO4CO as a stand-alone application, which only requires the royalty-free MATLAB runtime (MCR), to optimize your Big Data application, or as an integrated solution alongside other DICE DevOps tools. For more details about BO4CO, please visit its GitHub page.

Pooyan Jamshidi (IMP)

References

[1] P. Jamshidi, G. Casale, “An Uncertainty-Aware Approach to Optimal Configuration of Stream Processing Systems“, in Proc. of IEEE MASCOTS, 2016.

[2] A. Balalaie, A. Heydarnoori, P. Jamshidi, “Microservices Architecture Enables DevOps: Migration to a Cloud-Native Architecture“, IEEE Software, Vol. 33, Issue 3, pp. 42-52, 2016.

[3] E. Di Nitto, P. Jamshidi, M. Guerriero, I. Spais, D. A. Tamburri, “A Software Architecture Framework for Quality-Aware DevOps“, in Proc. of QUDOS, 2016.

Complementary materials

  • Paper: the key paper about BO4CO.
  • Wiki: more details about the tool and setting up the environment.
  • Data: the experimental datasets.
  • Presentation: a presentation about the tool and our experimental results.
  • Gitxiv: all research materials about the tool in one link.
  • TL4CO: the DevOps-enabled configuration optimization tool.

TOSCA in a Nutshell

The IT industry is not immune to the pressure to speed up the production of its goods – applications and services. The best way to reduce the cost and time needed to build a software solution is to cut the processes that can be done better and faster automatically, without losing the essence of the process. Installing and configuring software is traditionally a manual process, and thus complex, costly and time-consuming. A much better alternative is to describe the whole application in a blueprint, then use a suitable tool to interpret the blueprint and turn it into a live application. OASIS TOSCA provides an emerging standard for describing applications in blueprints.


Virtualization technology and Cloud Computing have been major catalysts in the paradigm shift that enables us to treat infrastructure as if it were code. So on top of code written in Java, Python or C, we also develop the topologies and configurations of the services required to make the application work, version them along with the application and store them in Git, SVN or any other versioning system of choice. The developers can therefore indirectly take part in operations tasks, which is one of the foundations of DevOps. We should note that, here, we are not limited to virtualized solutions, but can also include bare metal in the equation.

As a standard to describe said infrastructure and service topologies, TOSCA stands for “Topology and Orchestration Specification for Cloud Applications”. In TOSCA lies the state of the art in industrial experience and practice with deployment solutions that are both technology-independent and multi-cloud compliant. These intrinsic characteristics stem from the joint effort within which TOSCA was originally specified, i.e., the OASIS standardization process. Within the OASIS TOSCA Technical Committee (TC), big industrial players (e.g., IBM, Huawei, Ericsson) defined the essential elements for the purpose of providing easily deployable specifications for cloud applications. TOSCA-based descriptions cover several key aspects of infrastructure, including, but not limited to, Network Function Virtualization and Infrastructure Monitoring. Essentially, quoting from the TOSCA specification 1.0, “TOSCA […] uses the concept of service templates to describe cloud workloads as a topology template, […]. TOSCA further provides a type system of node types to describe the possible building blocks for constructing a service template, as well as relationship types to describe possible kinds of relations”. The diagram below outlines the essential concepts within TOSCA and their respective relations:


A typical TOSCA description is therefore phrased in terms of (reusable) node types, which define the characteristics, (required/provided) properties and relations of a certain node that can be deployed, as well as plans, which define workflow management and execution.
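For a feel of the notation, here is a minimal TOSCA Simple Profile (YAML) sketch with one compute node hosting one software component; the install script is a hypothetical artifact, not part of any DICE blueprint:

tosca_definitions_version: tosca_simple_yaml_1_0

topology_template:
  node_templates:
    app_host:
      type: tosca.nodes.Compute
      capabilities:
        host:
          properties:
            num_cpus: 2
            mem_size: 4 GB
    app_service:
      type: tosca.nodes.SoftwareComponent
      requirements:
        - host: app_host               # HostedOn relationship
      interfaces:
        Standard:
          create: scripts/install.sh   # hypothetical install artifact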

The standard itself is not very useful without the tools, which take the descriptions based on standards and execute them in some target environment. Luckily, there are a number of initiatives and projects for manipulating TOSCA blueprints and orchestrating cloud applications according to the blueprints. As already explained in our earlier posts, for DICE we use Cloudify as the orchestrator solution of choice. But here are some other interesting solutions:

  • CAMF focuses on three distinct management operations: application description, application deployment and application monitoring. To this end, it adopts the OASIS TOSCA open specification for blueprinting and packaging Cloud Applications. Being part of the Eclipse Software Foundation, part of the CAMF code will be made freely available and open source under the Eclipse Public License v1.0.
  • CELAR and the related tool support within Eclipse, i.e., c-Eclipse. The CELAR project is an initiative specific to multi-cloud elasticity provisioning. In realising said elasticity provisioning services, CELAR and the connected tool bases are working to implement and gradually extend a deployment engine featuring specific TOSCA templates. As part of the Eclipse ecosystem, the complete source code of c-Eclipse is made available under the terms of the Eclipse Public License v1.0. Similarly, the part of the CELAR project code responsible for automated deployment will be made freely available as well.
  • Open-TOSCA is an open-source initiative from the University of Stuttgart to develop free/libre open-source TOSCA modelling, reasoning and orchestration technologies, including support for modelling via the Winery modelling technology as well as TOSCA containment modelling via an ad-hoc OpenTOSCA Container and instantiation via the VinoThek self-service instantiation portal. Because it is composed of a set of technologies, Open-TOSCA does not have a clear and homogeneous open-source licensing model as a single product; rather, individual licensing has to be evaluated for the single modules it is made of.
  • Finally, Alien4Cloud is an interesting solution to manipulate TOSCA models. One of the core functionalities behind this technology is, quoting from the project site: “Create or reuse portable TOSCA blueprints and components. Leverage your existing shell, chef or puppet scripts.” These features suggest that Alien4Cloud may easily be integrated in the various methodological and technological phases intended in the definition and application of the DICE profile in practice. Alien4Cloud is open source under the Apache 2 License. However, Alien4Cloud was designed as a front-end for Cloudify technologies and currently supports deployment and orchestration via Cloudify alone.

Although the maturity of these technologies is rather preliminary, they offer a basis to kick-off valuable research around TOSCA.

Damian Andrew Tamburri, Politecnico di Milano

DICE Monitoring Platform

Big Data is certainly a big hype nowadays, and there is a tremendous number of frameworks available that enable companies to develop Big Data applications. The development of data-intensive applications, like the development of any other software application, involves testing, validation and fine-tuning processes to ensure the performance and reliability the end users expect. Throughout these processes the execution of the application needs to be constantly monitored in order to extract execution trends and spot anomalies. And this is only the beginning. Once in production, monitoring of the application, together with its underlying infrastructure, is a must. But Big Data applications generate Big Monitoring Data, and not only that: the data is generated in different formats and is available either in log files or via APIs.

There are monitoring systems on the market to help you with that. Plenty of them: some open source, some commercial, others using a freemium model. But they tend to be focused on specific areas. Nagios or Ganglia could be used to easily monitor your infrastructure. Others, such as Apache Chukwa or Sematext, could be used to monitor Apache Hadoop, Apache Storm, or Apache Spark. However, all these tools need to be deployed in your infrastructure, and you will certainly need to scale them to cope with the scaling of your infrastructure. Or else, in the case of SaaS platforms, let external services transfer data out of your infrastructure. Hmmm… Big Data seems to mean Big Problems.

The DICE monitoring platform (DMon for short) tries to make your life easier when it comes to collecting, searching, analyzing and visualizing, in real time, data about your data-intensive application. Firstly, by leveraging Elastic's open source stack – Elasticsearch for indexing and searching the data, Logstash for pre-processing incoming data and Kibana for real-time visualization – the DMon platform is fully distributed, highly available and horizontally scalable. All the core components of the platform have been wrapped in microservices accessible through HTTP RESTful APIs for easy control.

Secondly, DMon is able to monitor your infrastructure, thanks to collectd plugins. Additionally, it collects data from multiple Big Data frameworks, such as Apache HDFS, YARN, Spark, or Storm (for now). With DMon you have one platform for monitoring both your infrastructure and your Big Data frameworks.

Next, it streamlines the control and configuration of its core components. With the DMon controller service, you have a single HTTP RESTful API that you can use both to control the core components of the platform (change configuration parameters, start/stop) and to administer the monitored cluster, making it possible to add, update or remove monitored nodes and to start or stop services on them via GET/POST/PUT/DELETE calls. We will also provide a Web user interface wrapping the DMon controller API to have all administration jobs at your fingertips at the click of a button.
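As an illustration of this control style, the following Python snippet sketches two hypothetical calls to the controller; the endpoint paths and payload fields are assumptions made for this sketch, so check the DMon documentation for the real API:

import requests

DMON = "http://dmon-controller.example.com:5001/dmon/v1"  # assumed base URL

# Register a node so that its metrics start being collected
# (hypothetical endpoint and payload).
requests.put(DMON + "/overlord/nodes",
             json={"Nodes": [{"NodeName": "storm-worker-1",
                              "NodeIP": "10.0.0.5"}]})

# Restart a core component after a configuration change
# (hypothetical endpoint).
requests.post(DMON + "/overlord/core/ls/start")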

Visualization of collected data is fully customizable and can be structured in multiple dashboards based on your needs, or tailored to specific roles in your organization, such as administrator, quality assurance engineer or software architect.

The deployment of the platform is integrated with the Chef configuration management system, and we also provide a Vagrant script for a single-node installation, which you will find useful for your development environment. If you are using the full DICE toolchain, it is going to be even simpler for you, because the DICE deployment tool will take care of deploying the platform and the agents on monitored nodes.

You can either use DMon as a stand-alone platform to monitor your infrastructure, or as a raw-data provider for the high-level simulation and optimisation tools available in the DICE toolchain. For more details about DMon, please visit its GitHub page.

Daniel Pop (IEAT)

Location Aware Elasticity in OpenNebula

In the context of the BEACON project, OpenNebula needs to be able to define Virtual Machine placement policies with geographical location constraints.

Moreover, BEACON use cases call for location-aware elasticity (also called auto-scaling). This post describes the framework defined to achieve it.

First, a OneFlow Service needs to be defined. OneFlow is an OpenNebula component that allows users and administrators to define, execute and manage multi-tier applications, or services, composed of interconnected Virtual Machines with deployment dependencies between them. Each group of Virtual Machines is deployed and managed as a single entity. This component also provides auto-scaling capabilities based on configurable policies, using performance metrics and scheduled times.

A Service contains Roles. Each Role is a group of identical Virtual Machines, with a certain initial cardinality (number of Virtual Machines). Following the auto-scaling elasticity policies, OpenNebula adjusts the cardinality of each Role individually.

To implement location-aware scalability, a Service can be defined with a Role for each geographical location. The following figure represents a Service that holds a simple application with a front-end machine, and several worker nodes. The worker nodes are separated into three Roles: one for the local infrastructure, one for the EC2 US region, and one for EC2 Europe region.


Representation of a OneFlow Service with 3 geographical locations

Each Role defines a set of elasticity rules that act individually on that Role. The metric used to control the elasticity here is called CLIENT_REQ_MIN, a monitoring value that the application reports to OpenNebula periodically with the number of client requests per minute.

In Sunstone, the OpenNebula web interface, that OneFlow service template is defined as follows:


OneFlow Template. Note: the elasticity policies are individual for each Role

To implement the location-aware placement, each Role uses a different VM Template, which in turn has unique placement requirements, expressed as:

SCHED_REQUIREMENTS = "HOSTNAME = \"ec2-us-east-1\""


Virtual Machines deployed in the correct location by the OpenNebula scheduler

In order to trigger elasticity rules, OpenNebula VMs must be able to collect relevant information from the remote cloud providers.

The metrics retrieved by the provider’s API are limited, and in some cases they may not be relevant to the performance definition for the application running inside the Service. For this reason, OpenNebula allows VMs to report their own internal metrics using the OneGate component. OneGate exposes a REST interface that, combined with an automated authentication token management, allows the Virtual Machines to push and pull information about the VMs themselves, and the OneFlow service they are part of.


High-level architecture of OneGate monitoring

The application running inside the VMs will periodically report the application performance in each one of the worker nodes. This could be done with a command such as:

$ curl -X "PUT" "${ONEGATE_ENDPOINT}/vm" \
--header "X-ONEGATE-TOKEN: `cat token.txt`" \
--header "X-ONEGATE-VMID: $VMID" \
-d "CLIENT_REQ_MIN = 12"

That attribute, stored in the OpenNebula database, will be accessible to the OneFlow component, which is in charge of the Service scalability.


OneFlow Roles cardinality after the metric is pushed to the VM metadata

The previous image shows the OneFlow service with the cardinality adjusted only for the ‘eu’ role. The expression “CLIENT_REQ_MIN > 10” is responsible for this scale-up action.
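For reference, the elasticity policy of the ‘eu’ Role would sit in the OneFlow service template roughly as follows; this is an abridged sketch loosely following the OneFlow template format, with illustrative values, so consult the OpenNebula documentation for the exact attribute names:

{
  "name": "eu",
  "cardinality": 2,
  "vm_template": 7,
  "elasticity_policies": [
    {
      "type": "CHANGE",
      "adjust": 1,
      "expression": "CLIENT_REQ_MIN > 10",
      "period": 60,
      "period_number": 2,
      "cooldown": 300
    }
  ]
}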

DICE enables Quality-Driven DevOps for Big Data – a White Paper

The DICE project has recently concluded its first year of activity, during which a lot of progress has been made in the definition of an innovative framework to develop Big Data applications. A technical architecture has been defined and initial prototypes are rapidly maturing.
The DICE consortium has recently released a white paper to explain to industrial stakeholders the purpose of DICE, its architecture and tool offering, and the market-oriented demonstrators that are currently being implemented.

Download the DICE White Paper

The first complete release of the DICE tools is set for August 2016, with an integrated development environment set for release in February 2017. Stay tuned!

Giuliano Casale, DICE Project Coordinator

How to start a successful DIA from scratch

Let us imagine that you are a software developer working in a highly innovative data-driven start-up delivering a cutting-edge solution called “Data Digger Solution”, which gathers raw data from various heterogeneous sources (e.g. social media, websites, CRM, online sales, servers, emails, etc.), processes it and gains tangible insights from it, with fresh semantics allowing concrete and profitable interpretations (e.g. in terms of sales and web presence). Your start-up is growing and signing more and more contracts with major actors in various sectors (banks, insurers, retailers, media companies, etc.) and, actually, this is great! Your boss is a visionary man, or maybe he just read the new IDC forecast, which sees the Big Data technology and services market growing at a 26.4% compound annual growth rate to $41.5 billion through 2018, driven by wide adoption across industries. To avoid becoming a victim of your own success, your boss asks you to rapidly design and implement a prototype of the “Data Digger Solution”, aka DDS, using Big Data and Cloud technologies, and to make sure it will keep up with the unstoppable acceleration of the start-up's business, especially in terms of performance, reliability and scalability: “Do it fast, cheap, at scale and don't lose data!”


Being able to deliver this prototype in time will exempt you from explaining to your boss why you deserve a raise! So, motivated, you open your favorite search engine, type “build big data application”, get thousands of articles, read some of them, and at the end of the day you have a plethora of words such as MapReduce, Hadoop, Spark, Cassandra, Storm, VM, Linux, Cloudify, Zookeeper, Kafka, Akka, Java, Scala and maybe also Lambda Architecture. Since you are clever, you get that these are not point-and-click technologies. Yet, you are puzzled: how do you start the project? How do you design your Big Data application? How can you satisfy all the quality requirements? What architecture should you adopt, keeping in mind the future evolution of the system? How can you accelerate quality testing for your release?

Actually, offering an answer to these questions (and more) is the role of the DICE methodology. It is a step-by-step workflow that we are continuously testing and validating on actual Data-Intensive Applications (DIAs).

This methodology relies on all DICE tools which foster an efficient specification, design, development, and deployment of DIAs for various business domains. The DICE toolset incorporates ready-to-use built-in components supporting many DIA platforms and technologies. So far, the DICE methodology consists of ten defined and fully equipped activities going from business modelling and requirement analysis to the deployment and real-time feedback analysis:

  • Business modelling and requirement analysis
  • DIA architecture design
  • DPIM simulation and verification
  • DIA platform and technology mapping
  • DTSM simulation and verification
  • Platform- and technology-specific implementations
  • DIA deployment mapping
  • DDSM optimization
  • DIA deployment
  • Runtime feedback analysis

Each of these activities requires actors to perform identified tasks with existing tools. From design to deployment, they are guided and assisted by the DICE IDE, which interacts with the DICE development and runtime tools. The workflow allows iterations between these steps in order to better meet the designer's requirements and let users take full advantage of the DevOps capabilities of the DICE toolset.

The DICE philosophy built into the DICE IDE proposes an innovative architecture that keeps the entire environment flexible and extensible. The Eclipse Platform and the Papyrus Modelling Environment are deliberate choices, made mainly because of the built-in extension mechanisms offered by these platforms and widely adopted by developers, i.e. potential end users of DICE (you). The extensibility is even more significant in DICE, which lets users adapt and enrich the list of supported Big Data technologies, a list that will surely evolve and grow. Existing and adopted solutions such as Spark, Cassandra or Hadoop are already supported, but more and more new emerging solutions will appear. Thanks to Eclipse and to the DICE extension mechanisms, DICE users will be able to integrate these technologies with no effort in order to benefit from the whole DICE ecosystem. This extensibility feature is also part of the whole methodology.

To sum up, the DICE methodology will (1) guide you through the steps to build an efficient architecture, to test it, to simulate it, to optimize it and to deploy your DIA, and (2) adapt to your needs. Before I thank you for reading this post, let me tell you that the astonishing growth in data will profoundly affect businesses, and this fictional story will become, in the near future, an actual challenge for many SMEs. Coming posts and deliverables will give more details about the DICE methodology.

Youssef Ridene (Netfective Technology)