Grid computing
Grid computing is an emerging computing model that achieves higher-throughput computing by using many networked computers to model a virtual computer architecture able to distribute process execution across a parallel infrastructure. Grids use the resources of many separate computers connected by a network (usually the Internet) to solve large-scale computation problems. They make it possible to perform computations on large data sets by breaking them down into many smaller ones, or to perform many more computations at once than would be possible on a single computer, by modelling a parallel division of labour between processes. Today, resource allocation in a grid is done in accordance with service level agreements (SLAs).
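The idea of breaking a large data set into smaller pieces and processing them in parallel can be illustrated with a minimal sketch. It is not taken from any Grid toolkit: the function names (split, work_unit, run_on_grid) are hypothetical, and local threads stand in for the remote computers that a real grid would use.

```python
from concurrent.futures import ThreadPoolExecutor


def split(data, n_chunks):
    """Break a data set into at most n_chunks contiguous pieces."""
    size = max(1, -(-len(data) // n_chunks))  # ceiling division
    return [data[i:i + size] for i in range(0, len(data), size)]


def work_unit(chunk):
    """Hypothetical per-node task: sum of squares over one piece."""
    return sum(x * x for x in chunk)


def run_on_grid(data, n_nodes=4):
    """Scatter pieces to workers, then gather and combine the partial results."""
    chunks = split(data, n_nodes)
    # Assumption: threads stand in for remote nodes in this sketch.
    with ThreadPoolExecutor(max_workers=n_nodes) as pool:
        partials = list(pool.map(work_unit, chunks))
    return sum(partials)


if __name__ == "__main__":
    print(run_on_grid(list(range(1000))))
```

The combined result is the same as computing the sum over the whole data set on one machine; only the division of labour differs.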
Origins
Like the Internet, Grid computing evolved from the computational needs of "big science". The Internet was developed to meet the need for a common communication medium between large, federally funded computing centers. These communication links led to resource and information sharing between these centers and eventually to access for additional users. Ad hoc resource-sharing 'procedures' among these original groups pointed the way toward standardization of the protocols needed to communicate between any administrative domains. Current Grid technology can be viewed as an extension or application of this framework to create a more generic resource-sharing context.
The ideas of the Grid were brought together by Ian Foster, Carl Kesselman and Steve Tuecke, the so-called "fathers of the Grid." They led the effort to create the Globus Toolkit, incorporating not just CPU management (e.g. cluster management and cycle scavenging) but also storage management, security provisioning, data movement, monitoring and a toolkit for developing additional services based on the same infrastructure, including agreement negotiation, notification mechanisms, trigger services and information aggregation. In short, the term Grid has much further-reaching implications than the general public believes. While the Globus Toolkit remains the de facto standard for building Grid solutions, a number of other tools have been built that answer some subset of the services needed to create an enterprise Grid.
The remainder of this article discusses the details behind these notions.
Common features
Grid computing offers a model for solving massive computational problems by making use of the unused resources (CPU cycles and/or disk storage) of large numbers of disparate computers, often desktop computers, treated as a virtual cluster embedded in a distributed telecommunications infrastructure. Grid computing's focus on the ability to support computation across administrative domains sets it apart from traditional computer clusters or traditional distributed computing.
Grids offer a way to solve Grand Challenge problems like protein folding, financial modelling, earthquake simulation, and climate/weather modelling. Grids offer a way of using the information technology resources optimally inside an organization. They also provide a means for offering information technology as a utility bureau for commercial and non-commercial clients, with those clients paying only for what they use, as with electricity or water.
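The use of otherwise unused CPU cycles on desktop computers, mentioned above, is often called cycle scavenging. The following is a minimal sketch of the idea, not the behaviour of any particular product: the names host_is_idle and scavenge, the 0.5 load threshold, and the load_probe parameter are all assumptions made for illustration. The default probe relies on os.getloadavg(), which is available only on Unix-like systems.

```python
import os
from collections import deque


def host_is_idle(threshold=0.5, load_probe=None):
    """Return True when the measured load is below the threshold.

    By default the 1-minute load average is used (Unix-like systems only);
    load_probe lets a caller supply a different measurement.
    """
    probe = load_probe or (lambda: os.getloadavg()[0])
    return probe() < threshold


def scavenge(tasks, threshold=0.5, load_probe=None):
    """Run queued tasks only while the host is idle.

    Returns (results, leftover): the results of the tasks that ran and the
    tasks that were left in the queue because the host became busy.
    """
    queue = deque(tasks)
    results = []
    while queue and host_is_idle(threshold, load_probe):
        task = queue.popleft()
        results.append(task())
    return results, list(queue)
```

A real scavenging system would also have to suspend or migrate a task that is already running when the owner of the machine returns; this sketch only checks for idleness between tasks.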
Grid computing has the design goal of solving problems too big for any single supercomputer, whilst retaining the flexibility to work on multiple smaller problems. Thus Grid computing provides a multi-user environment. Its secondary aims are better exploitation of available computing power and catering for the intermittent demands of large computational exercises.
This approach implies the use of secure authorization techniques to allow remote users to control computing resources.
Grid computing involves sharing heterogeneous resources (based on different platforms, hardware/software architectures, and computer languages), located in different places belonging to different administrative domains over a network using open standards. In short, it involves virtualizing computing resources.
Grid computing is often confused with cluster computing. The key difference is that a cluster is a single set of nodes sitting in one location, while a Grid is composed of many clusters and other kinds of resources (e.g. networks, storage facilities).
Functionally, one can classify Grids into several types:
- Computational Grids (including CPU-scavenging Grids), which focus primarily on computationally intensive operations
- Data Grids, for the controlled sharing and management of large amounts of distributed data
- Equipment Grids, which have a primary piece of equipment (e.g. a telescope) and where the surrounding Grid is used to control the equipment remotely and to analyse the data produced
Definitions of Grid computing
The term Grid computing originated in the early 1990s as a metaphor for making computer power as easy to access as an electric power grid.
Today there are many definitions of Grid computing:
- The standard definition of a Grid is provided by Ian Foster in his article "What is the Grid? A Three Point Checklist". The three points of this checklist are:
  - Computing resources are not administered centrally.
  - Open standards are used.
  - Non-trivial quality of service is achieved.
- Plaszczak/Wellner define Grid technology as "the technology that enables resource virtualization, on-demand provisioning, and service (resource) sharing between organizations."
- IBM says, "Grid is the ability, using a set of open standards and protocols, to gain access to applications and data, processing power, storage capacity and a vast array of other computing resources over the Internet. A Grid is a type of parallel and distributed system that enables the sharing, selection, and aggregation of resources distributed across multiple administrative domains based on the resources availability, capacity, performance, cost and users' quality-of-service requirements"
- Buyya defines Grid as "a type of parallel and distributed system that enables the sharing, selection, and aggregation of geographically distributed resources dynamically at runtime depending on their availability, capability, performance, cost, and users' quality-of-service requirements".
- CERN, one of the largest users of Grid technology, talk of The Grid: "a service for sharing computer power and data storage capacity over the Internet."
- Pragmatically, Grid computing is attractive to geographically-distributed non-profit collaborative research efforts like the NCSA Bioinformatics Grids such as BIRN: external Grids.
- Grid computing is also attractive to large commercial enterprises with complex computation problems who aim to fully exploit their internal computing power: internal Grids.
Grid computing is a subset of distributed computing.
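The three-point checklist quoted among the definitions above can be read as a simple conjunction of conditions. The sketch below encodes it that way; the SystemProfile type and its field names are hypothetical and only paraphrase the three points as they are worded in this article.

```python
from dataclasses import dataclass


@dataclass
class SystemProfile:
    """Hypothetical description of a distributed system."""
    name: str
    centrally_administered: bool
    uses_open_standards: bool
    nontrivial_qos: bool


def is_grid(system: SystemProfile) -> bool:
    """Apply the three checklist points; all three must hold."""
    return (
        not system.centrally_administered
        and system.uses_open_standards
        and system.nontrivial_qos
    )


if __name__ == "__main__":
    cluster = SystemProfile("campus cluster", True, True, True)
    federation = SystemProfile("multi-site federation", False, True, True)
    print(cluster.name, is_grid(cluster))
    print(federation.name, is_grid(federation))
```

Under this reading, a centrally administered cluster fails the first point even if it uses open standards and delivers good quality of service.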
Conceptual framework
Grid computing reflects a conceptual framework rather than a physical resource. The Grid approach is utilized to provision a computational task with administratively-distant resources. The focus of Grid technology is associated with the issues and requirements of flexible computational provisioning beyond the local (home) administrative domain.
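Provisioning a task with resources from other administrative domains implies some form of selection among candidates, along the lines of the criteria named in the definitions earlier (availability, capability, cost). The following sketch shows one possible selection rule; the Resource type, its fields, and the cheapest-eligible policy are assumptions chosen for illustration, not a description of any actual Grid broker.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Resource:
    """Hypothetical record describing one candidate resource."""
    name: str
    domain: str           # administrative domain that owns the resource
    available: bool
    cpus: int
    cost_per_hour: float


def select_resource(candidates: List[Resource], min_cpus: int,
                    max_cost: float) -> Optional[Resource]:
    """Pick the cheapest available resource that meets the requirements."""
    eligible = [
        r for r in candidates
        if r.available and r.cpus >= min_cpus and r.cost_per_hour <= max_cost
    ]
    if not eligible:
        return None
    return min(eligible, key=lambda r: r.cost_per_hour)
```

Other policies (fastest resource, nearest domain, weighted scores) would fit the same shape; only the key used for the final choice changes.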
Virtual organization
A Grid environment is created to address resource needs. The use of those resources (e.g. CPU cycles, disk storage, data, software programs, peripherals) is usually characterized by their availability outside of the context of the local administrative domain. This 'external provisioning' approach entails creating a new administrative domain, referred to as a Virtual organization (VO), with a distinct and separate set of administrative policies (home administration policies plus external resource administrative policies equals the VO [aka your Grid] administrative policies). The context for a Grid 'job execution' is distinguished by the requirements created when operating outside of the home administrative context. Grid technology (aka middleware) is employed to facilitate formalizing and complying with the Grid context associated with your application execution.
Resource utilization
One characteristic that currently distinguishes Grid computing from distributed computing is the abstraction of a 'distributed resource' into a Grid resource. One result of abstraction is that it allows resource substitution to be more easily accomplished. Some of the overhead associated with this flexibility is reflected in the middleware layer and the temporal latency associated with the access of a Grid (or any distributed) resource. This overhead, especially the temporal latency, must be evaluated in terms of the impact on computational performance when a Grid resource is employed.
Web-based resources or Web-based resource access is an appealing approach to Grid resource provisioning. A recent GGF Grid middleware evolutionary development 're-factored' the architecture/design of the Grid resource concept to reflect using the W3C WSDL (Web Services Description Language) to implement the concept of a WS-Resource. The stateless nature of the Web, while enhancing the ability to scale, can be a concern for applications that migrate from a stateful resource access context to the Web-based stateless resource access context. The GGF WS-Resource concept includes discussions on accommodating the statelessness associated with Web resource access.
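One common way to reconcile a stateless access path with stateful resources is to keep the state outside the request handler and to send an identifier for it with every request. The sketch below illustrates that general pattern only. It is not the WS-Resource specification or any toolkit's API: ResourceStore, handle_request, the "free_cpus" field and the operation names are all hypothetical.

```python
import uuid


class ResourceStore:
    """Holds the state of each resource, keyed by an identifier."""

    def __init__(self):
        self._state = {}

    def create(self, initial):
        """Register a new stateful resource and return its key."""
        key = str(uuid.uuid4())
        self._state[key] = dict(initial)
        return key

    def load(self, key):
        return self._state[key]


def handle_request(store, resource_key, operation, amount=0):
    """Stateless handler: the resource key arrives with every request,
    and the handler itself keeps nothing between calls."""
    state = store.load(resource_key)
    if operation == "reserve":
        if amount > state["free_cpus"]:
            raise ValueError("not enough free CPUs")
        state["free_cpus"] -= amount
    elif operation == "release":
        state["free_cpus"] += amount
    elif operation != "query":
        raise ValueError(f"unknown operation: {operation}")
    return dict(state)  # return a copy so callers cannot mutate the store
```

Because any handler instance can serve any request given the key, the handlers scale independently of the state they operate on, which is the trade-off the paragraph above describes.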
State-of-the-art, 2005
The conceptual framework and ancillary infrastructure are evolving at a fast pace and include international participation. The business sector is actively involved in commercialization of the Grid framework. The 'big science' sector is actively addressing the development environment and resource (aka performance) monitoring aspects. Activity is also observed in providing Grid-enabled versions of HPC (High Performance Computing) tools. Activity in the domains of 'little science' appears to be scant at this time. The treatment in the GGF documentation series reflects the HPC roots of the Grid concept framework; this bias should not be interpreted as a restriction on applying the Grid conceptual framework to other research domains or other computational contexts.
Substantial experience is being built through the operation of various Grids, the most notable of them being the EGEE infrastructure supporting LCG, the LHC Computing Grid [link]. LCG is driven by CERN's need to handle a huge amount of data, produced at a rate of almost a gigabyte per second (10 petabytes per year), a history not unlike that of the production NorduGrid. A list of active sites participating within LCG can be found online [link], as can real-time monitoring of the EGEE infrastructure [link]. The relevant software and documentation are also publicly accessible [link].
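The two data-rate figures quoted above for LCG can be related by simple arithmetic. The sketch below assumes decimal units (1 GB = 10^9 bytes, 1 PB = 10^15 bytes) and a 365-day year; these assumptions, and the function names, are not from the source.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600  # assumption: non-leap year, continuous operation
GIGABYTE = 10**9                    # assumption: decimal units
PETABYTE = 10**15


def average_rate_gb_per_s(petabytes_per_year):
    """Average rate needed to accumulate the given yearly volume."""
    return petabytes_per_year * PETABYTE / SECONDS_PER_YEAR / GIGABYTE


def yearly_volume_pb(gb_per_s):
    """Volume accumulated in one year at a constant rate."""
    return gb_per_s * GIGABYTE * SECONDS_PER_YEAR / PETABYTE


if __name__ == "__main__":
    print(f"10 PB/year averages {average_rate_gb_per_s(10):.3f} GB/s")
    print(f"1 GB/s sustained gives {yearly_volume_pb(1):.3f} PB/year")
```

Under these assumptions, 10 petabytes per year corresponds to an average of roughly a third of a gigabyte per second, so the two figures agree only if the gigabyte-per-second rate is not sustained all year. One possible reading is that it is a peak rate, but the text does not say so.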
Grid computing is one method of processing large amounts of data using multiple computers in various locations, connected through the Internet.
Grid-enabling organizations and offerings
The Global Grid Forum
The Global Grid Forum (GGF) has the purpose of defining specifications for Grid computing. GGF is a collaboration between industry and academia with significant support from both.
The Globus Alliance
The Globus Alliance implements some of the standards developed at the GGF through the Globus Toolkit (Grid middleware). As a middleware component, it provides a standard platform for services to build upon, but Grid computing also needs other components, and many other tools operate to support a successful Grid environment.
Globus has implementations of the GGF-defined protocols to provide:
- Resource management: Grid Resource Allocation & Management Protocol (GRAM)
- Information Services: Monitoring and Discovery Service (MDS)
- Security Services: Grid Security Infrastructure (GSI)
- Data Movement and Management: Global Access to Secondary Storage (GASS) and GridFTP
- Gridbus [Grid Service Broker]
- Grid Portal Software such as GridPort, OGCE and [GridSphere]
- Grid Packaging Toolkit (GPT)
- MPICH-G2 (Grid Enabled MPI)
- Network Weather Service (NWS) (Quality-of-Service monitoring and statistics)
- Condor (CPU Cycle Scavenging) and Condor-G (Job Submission)
- Moab Grid Suite
- [HPC4U Middleware] (Fault Tolerant and SLA aware Grid Middleware)
Commercial Grid computing offerings
Computing vendors offer Grid solutions which are based either on the Globus Toolkit or on a proprietary architecture. Confusion remains in that vendors may badge their computing-on-demand or cluster offerings as Grid computing.
See also
- Concepts & related technology
- Distributed computing
- List of distributed computing projects
- High-performance computing
- Render farm
- Semantic Grid
- Supercomputer
- Computer cluster
- Virtual organization
- Computon
- Access Grid
- Enabling Grids for E-sciencE (EGEE)
- IceGrid
- NorduGrid
- Open Science Grid
- OurGrid
- Metropolitan Area Grid
- Sun grid
- Parallel Virtual Machine (PVM)
- Message Passing Interface (MPI)
- Ganglia
- Scalable Cluster Environment (SCE)
- Berkeley Open Infrastructure for Network Computing
- Simple Grid Protocol
- SDSC Storage resource broker (data grid)
- UNICORE
- Invisionix Roaming System Remote (IRSR)
References
- Antony Davies: Computational Intermediation and the Evolution of Computation as a Commodity, Applied Economics, June 2004, [Online version]
- Ian Foster, Carl Kesselman: The Grid: Blueprint for a New Computing Infrastructure, Morgan Kaufmann Publishers, ISBN 1558604758, [Website]
- Pawel Plaszczak, Rich Wellner, Jr: Grid Computing “The Savvy Manager’s Guide” Morgan Kaufmann Publishers, ISBN 0127425039 [Online book companion]
- Fran Berman, Anthony J. G. Hey, Geoffrey C. Fox: Grid Computing: Making The Global Infrastructure a Reality, Wiley, ISBN 0470853190, [Online version]
- Maozhen Li, Mark A. Baker: The Grid: Core Technologies, Wiley, ISBN 0470094176, [Website]
- Charlie Catlett, Larry Smarr: "Metacomputing", Communications of the ACM, vol. 35, no. 6, June 1992, [Website]
- Roger Smith: [Grid Computing: A Brief Technology Analysis], CTO Network Library, 2005.
- Rajkumar Buyya: [Special Theme Issue -- Grid Computing: Making the Global Cyberinfrastructure for eScience a Reality], CSI Communications, Vol.29, No.1, ISSN 0970-647X, Computer Society of India (CSI), Mumbai, India, July 2005.
- Viktors Berstis: [Fundamentals of Grid Computing] http://www.redbooks.ibm.com/abstracts/redp3613.html
- Luis Ferreira et al.: [Grid Computing Products and Services] http://www.redbooks.ibm.com/abstracts/sg246650.html
- Luis Ferreira et al.: [Introduction to Grid Computing with Globus] http://www.redbooks.ibm.com/abstracts/sg246895.html?Open
- Bart Jacob et al.: [Enabling Applications for Grid Computing] http://www.redbooks.ibm.com/abstracts/sg246936.html?Open
- Luis Ferreira et al.: [Grid Services Programming and Application Enablement] http://www.redbooks.ibm.com/abstracts/sg246100.html?Open
- Bart Jacob et al.: [Introduction to Grid Computing] http://www.redbooks.ibm.com/abstracts/sg246778.html?Open
- Luis Ferreira et al.: [Grid Computing in Research and Education] http://www.redbooks.ibm.com/abstracts/sg246649.html?Open
- Luis Ferreira et al.: [Globus Toolkit 3.0 Quick Start] http://www.redbooks.ibm.com/abstracts/redp3697.html?Open
External links
- Upcoming Events
- [GridsWatch]
- [IEEE Distributed Systems Online, Grid Computing Section]
- [Grid Computing - Google News]
- [Primeur magazine - HPC and Grid computing news]
- [GRIDtoday]
- [UtilityComputing.com]
- [LinuxHPC.org] Linux High Performance Computing and Clustering Portal
- [WinHPC.org] Windows High Performance Computing and Clustering Portal
- [Science Grid This Week]
- [Grid Computing Info Center]
- [The GridWorld Blog]
- [Grid Meter - Info World]
- [The Grid Computing Blog]
- [Gridtech blog]
- [West Coast Grid]
- [Gridalogy GridBlog]
- [The Globus Alliance]
- [Global Grid Forum]
- [ApGrid: Asia Pacific Grid]
- [US NSF TeraGrid]
- [EU DataGrid project] Complete, succeeded by EGEE
- [Enabling Grids for E-sciencE (EGEE)]
- [The LHC Computing Grid]
- [The Israeli Association of Grid Technologies (IGT)]
- [ThaiGrid]
- [NorduGrid]
- [Grid Computing Reference Guide]
- [Open Science Grid]
- [D-Grid]
- [Mobile Agent Technologies]
- [Parabon Computation]
- [DataSynapse]
- [Platform Computing]
- [Digipede]
- [EnginFrame]
- [United Devices]
- [Gridalogy]
- [Fujitsu Systems Europe]
- [Scalent Systems]
- [EnterTheGrid directory on Grid computing]
- [IBM Grid Computing website]
- [GridComputing.com]
- [GridSphere Portal Framework (JSR-168 compliant)]
- [GridSummit.com]
- [GridsWatch, Georgetown University]
- [BigBlueRiver]
- [O'Reilly article about grid computing software]
- [Grid Café, the place for everyone to learn about the Grid]
- [GreenTea Software] a totally decentralized pure Java-based Peer-to-Peer (P2P) Grid computing platform
- [Globus Toolkit]
- [GRIA] is an open source Grid middleware that enables commercial use of the Grid in a secure, interoperable and flexible manner.
- [DRMAA API Specification Page]
- [Java Commodity Grid Toolkit (CoG) Kit]
- [Java Parallel Processing Framework (JPPF)] is an Open Source computational grid framework for Java.
- [Apache WSRF]
- [Digipede Framework SDK] provides developers with the tools and information required to build grid-enabled Windows applications (.NET and COM).
- [GridForge]
- [ProActive] is a Java library for parallel, distributed, and concurrent computing with mobility and security
- [Grid Engine], open source grid engine sponsored by Sun Microsystems, runs on many platforms. Also check out the [supporting community]
- [Apple Xgrid], an easy-to-set-up grid solution for Mac OS X
- [BioSimGrid: Grid database for biomolecular simulations]
- [The Condor project] is a grid computing engine by the University of Wisconsin, and runs on many platforms
- [Mobius Data Grid middleware]
- [The OGSA-DAI Data virtualization project]
- [Moab Grid Suite]
- [Berkeley Open Infrastructure for Network Computing]
- [Simple Grid Protocol] is a Freeware grid computing package for Linux and BSD, and is based on the Common Lisp language.
- [Media Grid™]
- [The Java Commodity Grid Toolkit (CoG) Kit]
- [Grid and cluster computing using Java] (LGPL)
- [Condor Shibboleth Integration Project - Distributed Authentication]
- [Invisionix Systems "Invisionix Roaming System Remote -- IRSR"]
- [eyeOS Project]
- [Einstein@Home] Search data from the Laser Interferometer Gravitational wave Observatory (LIGO) in the US and from the GEO 600 gravitational wave observatory in Germany for signals coming from rapidly rotating neutron stars, known as pulsars.
- [LHC@home] Improve the design of the CERN LHC particle accelerator.
- [Climateprediction.net] Improve the accuracy of long-term climate prediction.
- [Predictor@home] Solve biomedical questions and investigate protein-related diseases.
- [How you can fight against diseases using your computer].
- [WorldCommunityGrid.org] A more recently created grid with the aim of running multiple projects on a single grid. From the home page "World Community Grid's mission is to create the largest public computing grid benefiting humanity. Our work is built on the belief that technological innovation combined with visionary scientific research and large-scale volunteerism can change our world for the better. "
- [Folding@home] Protein-Folding project by Stanford University
From Wikipedia, the Free Encyclopedia.
All text is available under the terms of the GNU Free Documentation License. See Wikipedia Copyrights for details.
