Abstract
This document is the D-2 Report deliverable from the ERCPool project. The full title of the ERCPool project is “Pilot Project in Sharing High Performance Computing, Data Intensive Computing and Novel Computing Resources across Scottish HEIs”. For the remainder of this document the project will be referred to as ERCPool.
The two major objectives of the ERCPool project are:
1. To promote and investigate potential opportunities, through equipment pooling initiatives, for the sharing of Edinburgh’s leading-edge high performance and novel computing systems at EPCC with other universities.
2. To deliver a report documenting the strengths, weaknesses, opportunities and threats of a move to a system of resource sharing, as well as looking at the potential benefits to researchers. This report will also look at the practical barriers to resource sharing and undertake a cost/benefit analysis.
This document fulfils the project’s second objective. Its content has been derived from the two major activities undertaken in ERCPool:
1. A regional study, based on a pilot programme for resource sharing in novel computing. This has involved opening up two state-of-the-art novel computing resources to researchers from outside Edinburgh. These are the Data Intensive Research (EDIM1) computer and EPCC’s parallel GPGPU resource testbed. This has enabled the identification of some of the operational barriers to the sharing of these novel computing resources with researchers from outside the University of Edinburgh.
2. An investigation into the potential for operating regional or topical HPC resources instead of each university procuring and operating its own machine. Working with researchers from SUPA (which covers all of the Physics departments in Scotland) and other disciplines, ERCPool has investigated and documented the future computing needs of researchers to determine to what extent these could be met by shared resources. It has also looked at the potential tangible benefits for research of sharing research computing resources, such as increased collaboration between institutions, better access to training and consultancy, equality of access to state-of-the-art facilities for researchers from a wide geographical area, and integration of university, topical and regional facilities into the UK HPC ecosystem. This has involved contact with more than 40 staff from HEIs across Scotland and resulted in three projects running on the state-of-the-art novel computing resources opened up as part of ERCPool’s pilot programme described above.
Section 2 of this document therefore discusses whether the future computing needs expressed by the IS providers and research groups contacted can be met by resource pooling. Section 3 provides a SWOT analysis, whilst Section 4 discusses the possible cost benefits of resource sharing. Finally, Section 5 discusses the practical barriers to resource sharing and how these could be overcome.
In summary, ERCPool has found that the barriers to sharing novel and high performance computing resources can be broadly classified into four areas: access, resource management, application-related issues and costs. Overcoming the access-related barriers is primarily about making the resource application process lightweight and timely. Regarding resource management, it must be recognised that traditional techniques such as batch systems and virtualisation are not appropriate for every circumstance. A range of possibilities needs to be provided that includes simple and flexible mechanisms that recognise, for example, that not all applications scale to hundreds or thousands of cores and so may require execution times of longer than 12 hours. Application-based training, and not just software development training, is required, since many users do not program but instead run third-party applications. Such users need to know how to use these efficiently and to best effect. Probably the greatest barrier to resource pooling, however, is the lack of clarity around funding models and costs. Once these are better understood, and processes that are clear and easy to understand, implement and monitor are in place, resource pooling will be far more appealing to researchers. Resolving this, however, requires RCUK and HEIs to engage with each other.
Original language | English |
---|---|
Publisher | EPCC, University of Edinburgh |
Number of pages | 26 |
Publication status | Published - 18 Oct 2012 |