Cloud Computing - Task Scheduling Based on Genetic Algorithms

Essay by Ekrem YILDIRIM • December 9, 2017 • Research Paper • 3,926 Words (16 Pages) • 1,631 Views

Essay Preview: Cloud Computing - Task Scheduling Based on Genetic Algorithms

prev next

Page 1 of 16

Cloud Computing –

Task Scheduling based on Genetic Algorithms

Eleonora Maria Mocanu, Mihai Florea, Mugurel Ionuţ Andreica, Nicolae Ţăpuş

Computer Science Department, Politehnica University of Bucharest, Bucharest, Romania nora.mocanu@gmail.com, florea.mihai@gmail.com, mugurel.andreica@cs.pub.ro, ntapus@cs.pub.ro

Abstract— Cloud Computing is a cutting edge technology for managing and delivering services over the Internet. Map-Reduce is the programming model used in cloud computing for processing large data sets in parallel over huge clusters. In order to increase efficiency, a good task scheduling is needed. Genetic algorithms are very useful and accurate in finding solutions to large scale optimization problems, such as task scheduling. They have gained immense popularity over last few years as a robust and easily adaptable search technique. Hadoop, the open source implementation of Map-Reduce, has several task schedulers available (FIFO, Fair, Capacity Schedulers), but neither one of them is focused on minimizing the global execution time. The goal of this project is to improve Hadoop’s functionality by implementing a scheduler based on a genetic algorithm, solving the stated problem.

Keywords - Cloud Computing; Genetic Alghorithms; MapReduce; Hadoop

I. INTRODUCTION

Cloud Computing is the next natural step in the evolution of on-demand information technology services. It offers a way to increase storage capacity, computation power and add capabilities on the fly without investing in new infrastructure, licensing new software or training new personnel. In order to do all that, it needs a good taskresource planning.

Task scheduling in distributed systems usually has the goal of distributing the load on processors, maximizing their utilization, while minimizing the global task execution time. Task scheduling can be defined into two kinds: static scheduling and dynamic scheduling.

Static scheduling uses so-called prescheduling technology to schedule known tasks in foregone environment, while dynamic scheduling depends on not only the foregone tasks, but also on the current system state to make scheduling plans. Dynamic scheduling is used in real life because a cloud system is dynamically changing over time. Task flow is uncertain and during the possible long time of execution, available resources with their quantity and form are changing all the way; resources capability, current load, interests and tasks’ requests are dynamic, too.

Dynamic scheduling problem is an NP-complete problem, whose cost of precise computation will increase exponentially with the increase of the problem scale. In fact, we can just find out an approximate solution to NPcomplete problem [7]. At present, three general algorithms become popular to deal with such problems: Simulated Annealing, Tabu Search and Genetic Algorithm. Among them, GA gains wide attention because of its simplicity, robustness and accuracy in finding the global optimum.

This paper is focused on scheduling problems inside Map-Reduce framework, Google’s abstraction for processing large data sets in real time. Hadoop is the open source implementation of Map-Reduce used by large IT companies such as Facebook or Yahoo!. Hadoop comes with four scheduler choices (FIFO, Fair Scheduler, Capacity Scheduler and Dynamic Scheduler). Job scheduling in Hadoop is performed by the master node, which receives heartbeats sent by slaves at 3 seconds interval with their status and asking for new tasks.

The FIFO scheduler assigns tasks in the order of their submit time, but has also 5 priority levels. The Fair scheduler tries to offer equal shares of the cluster resources to user pools and to jobs within them. Jobs are scheduled in the order of their deficit. Capacity Scheduler offers the possibility of defining queues of jobs and is focused on using the whole cluster’s capacity. Dynamic Scheduler comes with the possibility to manage budgets and dynamically increase job weights. Neither of the schedulers is focused on minimizing global execution time. That is why implementing a scheduler based on genetic algorithms could lead to a better throughput and a more efficient task assignment.

The goal of this project is to study how a Genetic Scheduler could improve Hadoop’s task assignment and to develop it using the existing framework.

II. STATE OF THE ART

The following section of the paper will focus on explaining the concepts and technologies that were used in order to construct the best solution for our application.

With our society’s fast and continuous development comes the need to process greater quantities of data in reasonable amounts on time. That is why there is always need for more computation power and greater storage capacities. As silicon based processor technology has reached its limits in matter of computation speed, the next step was to use two or more processors instead of one. This is how parallel and distributed systems were born.

Cloud Computing

Cloud computing provides computation, software, data access, and storage services that do not require end-user knowledge of the physical location and configuration of the platform that delivers the services. It is based on a heterogeneous system of distributed clusters that is transparent to the user.

Cloud computing is Internet-based development and use of computer technology. The cloud is a metaphor for the Internet and is an abstraction for the complex and dynamic infrastructure it conceals [16].

Map-Reduce

Map-Reduce [2] is a programming model developed by Google, used for writing applications that process large sets of data (that can reach petabytes) on parallel and distributed systems. The processed data can be crawled documents, web request logs etc. and the processing applications: determining summaries of number of pages crawled per host, set of most frequent queries, frequency of words in a document etc.Map-Reduce, like the name says, is based on two operations derived from functional programming, Map and Reduce.

...

Download as: txt (23.3 Kb) pdf (342.3 Kb) docx (72.3 Kb)

Continue for 15 more pages »

Read Full Essay Save

Only available on Essays24.com

Similar Essays

Schedule

POSTGRADUATE ACADEMIC CALENDAR FOR THREE (3) SEMESTERS PER ACADEMIC SESSION 2010/2011 (Applies to postgraduate programmes by coursework, and coursework & thesis/dissertation) ACADEMIC CALENDAR Arrival of

828 Words | 4 Pages
Cloud Computing, Strategi Terbaru Penerapan It

Jurnal Business Strategic Development Cloud Computing, strategi terbaru penerapan IT Disusun oleh: Rangga Wiseno Cloud Computing, strategi terbaru penerapan IT Rangga Wiseno Abstract In this

3,418 Words | 14 Pages
Cloud Computing Essentials – Chad Notes

Cloud Computing Essentials – Chad Notes Cloud Computing Self service Rapid-elasticity Metering or Measured Service Shared resources (enabled by virtualization) Broad network access What –

361 Words | 2 Pages
Impacts of Cloud Computing for Small and Medium Sized Enterprises

Impacts of Cloud Computing for Small and Medium Sized Enterprises Author name Registration number Semester, Department Institute, City Email address INTRODUCTION Cloud computing is used

11,499 Words | 46 Pages
Cloud Computing

Define Cloud Computing. · Cloud computing is virtual storage, in the sense that data isn’t on a tangible hard-drive like in computers. Instead data

822 Words | 4 Pages
Cloud Computing

Contents Introduction Economics of Cloud Business Industry Description Firm Analysis – Core Competencies of AWS Economies of Scale and Pricing Positioning: Competition and Coopetition Consumerization

3,572 Words | 15 Pages
Getting Started with Aws Academy Cloud Computing Architecture Curriculum

Getting Started with AWS Academy Cloud Computing Architecture Curriculum The purpose of this document is to help AWS Academy students enroll in and access AWS

1,118 Words | 5 Pages
Cloud Computing Threats

Introduction to threats: the technical and operational capability of an adversary to identify and to exploit vulnerabilities. Threats are potentials for vulnerabilities to turn into

906 Words | 4 Pages