CERN/FOCUS 2000-007
Minutes 19
31st October 2000
FOCUS
FORUM ON COMPUTING: USERS AND SERVICES
MINUTES OF THE 19th MEETING OF FOCUS
HELD ON THURSDAY 12th OCTOBER 2000
Present: T.Cass, M.Cattaneo (Secretary), M.Delfino, M.Ernst, B.Gobbo, R.Gokieli, A.Grant, F.Hemmer, H.-F.Hoffmann*), V.Innocente, P.Jeffreys (Chairperson), M.Kienzle, J.Knobloch, M.Marquina, N.McCubbin, H.Meinhard, A.Norton, S.O'Neale, M.Pimiä, F.Ranjard, H.Renshall, L.Robertson, A.Sandoval, J.Shiers, A.Silverman, E.Valente*), P.Vande Vyvre, W.von Rüden
Invited: J.-P.Baud*), M.Dimou*), J.Gordon, D.Heagerty*), B.Segal*), J.van Eldik*)
Apologies: S.Jarp, J.May, K.Safarik, R.Voss (represented by H.Meinhard)
Absent: J.Altaber, R.Cashmore, F.Etienne, F.Gagliardi, D.Jacobs, W.Lerche, L.Mapelli, M.Mazzucato
*) part time
1.1 CONSIDERATION OF AGENDA
The chairman opened the meeting by pointing out that the operation of FOCUS is still being tuned to ensure meetings finish on time. Changes introduced with this meeting are that IT has been asked to predict the time needed for items presented under "Update on ongoing IT activities", and that FOCUS members have been asked to request items under A.O.B. at the latest two days before the meeting.
The chairman thanked Helge Meinhard for distributing his excellent paper on requirements for Linux on laptops.
1.2 MINUTES OF THE LAST MEETING
The minutes had already been approved by E-mail. There were no further comments.
1.3 CHAIRMAN’S COMMENTS
The chairman welcomed new members to FOCUS and thanked departing members:
Wolfgang von Rüden replaces Eric McIntosh as IT/PDP group leader.
Maria Kienzle replaces Bob Clare as L3 representative.
Vincenzo Innocente replaces Lucas Taylor as second CMS representative.
In addition, at the suggestion of Jürgen May, the Desktop Forum (DTF) secretary (Alan Silverman) has been invited to attend FOCUS meetings, and the FOCUS secretary has been invited to attend DTF meetings.
The chairman then pointed out that FOCUS should be more explicit when approving proposals presented to it. Some misunderstandings have arisen in the past when proposals made at FOCUS have been misinterpreted as decisions.
Finally he congratulated Denise Heagerty on her appointment as site-wide Computer Security Officer.
1.4 DATES FOR FOCUS MEETINGS IN 2001 (M.Cattaneo) (see slides)
The following dates for FOCUS meetings in 2001 were agreed:
Thursday 1st March, 14:00
Thursday 28th June, 14:00
Thursday 11th October, 14:00
Thursday 6th December, 14:00
2. WEB SERVERS AND SECURITY (D.Heagerty) (see slides)
Denise's talk highlighted the dangers of insecure web servers. There are hundreds of known security holes for web servers, and automatic tools to exploit these holes are publicly available. Risks range from defaced web pages (unwanted publicity) to access to system scripts and deletion of files from inside the CERN firewall. A security check has been made of the web servers registered for external access, but this needs to be extended to unregistered web servers, some of which bypass the firewall on non-standard port numbers.
The Security Team proposes the following actions:
There was some discussion on whether any script accepting the HTTP protocol would be considered a web server. The goal is not to forbid such use but to find vulnerabilities. It was suggested that certain users be authorised to look for their own vulnerabilities.
FOCUS recognises that insecure web servers are potentially very damaging, thanks Denise for bringing this to the attention of the committee, and ENDORSES the proposal made to regulate the situation.
3. CLASP PROJECT (D.Heagerty) (see slides)
Denise presented the CLASP project, whose goal is to reduce the number of login/passwords required by users to access services. It addresses at least the services provided by IT and AS divisions, targeting Linux and W2000 use for web, mail, interactive use and file access, both from inside and outside CERN. Details can be found on the project web site http://cern.ch/proj-clasp. This is NOT a security project, but the definition of security levels and password policy are within its scope.
Under phase 1 of the project, the mandate has been defined and a service survey and feasibility study have been completed. The survey lists more than 30 services across divisions, using more than 12 different passwords. Most IT services use a single login ID through CCDB; AS division services are being integrated. Some password harmonisation exists wherever possible. The explosion of different login/passwords is mainly driven by Web authors.
The feasibility study concludes that Kerberos v5 provides a good basis for common authentication and Single Sign On, but some Public Key Infrastructure (PKI) based on certificates has to be integrated, for GRID applications. Non-Kerberos solutions have to be considered for the Web, since Netscape does not support Kerberos v5. Enhanced security is essential to overcome the vulnerability of the initial sign on. Common Access Rights, covering distribution lists, web page and file protections through "e-groups" looks useful - electronic grouping of people and/or accounts can be defined centrally and made available to applications through LDAP and Active Directory.
Phase 2 of the project has just started. The first milestone, this month, is to make available to services a test authentication environment, serving Kerberos v5, AFS and Grid certificates. Services included in phase 2 are mail, web (IT+AS services), file access, interactive, batch (LSF), Oracle and future Grid services. By February 2001 implementation plans will be available for a production authentication service and for most IT and AS services. Phase 2 will conclude in May 2001 with a final proposal, including a security review, password change and check policy, recommendations for offsite access and a proposal for common access control for web pages, files and e-mail lists. The final proposal will be presented to C5, FOCUS and Desktop Forum.
The audience made several suggestions about the scope of the project. External access to Windows files was requested, and worries were expressed about the scheduling of acrontab tasks, about tape ownership, and about the maturity of the PIE database. It was also suggested to collaborate with other laboratories (through HTASC and HEPCCC). In answer to a question from M.Ernst, Denise said that she is picking up as much as possible from the FNAL "strong authentication project".
FOCUS recognises this to be a topical issue, and notes the progress made under Phase 1 of the project. It ENDORSES the plans for Phase 2 and ENCOURAGES consideration for these activities to be co-ordinated across sites through HTASC.
4. PROPOSAL FOR IMPROVING SERVICE CHANGE ANNOUNCEMENTS (M.Dimou) (see slides)
Maria summarised her proposal; details are available at http://cern.ch/ref/cern/it/us/2000/040/. She addresses the problem of users complaining that they don't know the details of planned changes, or that insufficient explanation is given when problems occur, even though service managers are already using agreed announcement procedures. Several actions are proposed to improve the communication between managers of services (which also allows building up a knowledge base of side effects), and to better target announcements to affected groups of users. When announcing, the reasons and side effects of a change/problem should be explained. Several announcement channels should be used, to tailor to individual tastes: announcements should be replicated in e-mail distribution lists, news, web pages etc., letting people switch off those channels in which they are not interested.
FOCUS is grateful for the comprehensive report and AGREES with the conclusions and the proposed means of addressing the relative inadequacies of the present service change announcement system.
5. MASS STORAGE ISSUES
5.1 Review of current tape technologies and policy (Harry Renshall) (see slides)
Harry began by reviewing current devices and media. CERN strategy is to use data centre quality tape technology for raw and processed physics data, and to expect to use commodity tape technology for import/export. The placing of Linear Tape Open (LTO), currently under field test at CERN, is not yet clear but it is expected to be the successor to DLT for small labs. End of service has been announced for the STK Redwood drive (end 2002). The IBM 3590E (2 CHF/GB) has replaced the IBM 3590.
Since the March FOCUS meeting, a third STK silo has been added in the basement of building 513. Eight IBM 3590E drives have been added (for ADSM and HPSS). A new LHC testbed silo in building 513 is currently equipped with 9 STK 99xx drives. 25 STK 99xx drives could be installed later, but this first requires a call for tender. The STK 99xx drive, currently on beta-test, is expected to be the next drive for bulk physics data. One IBM 3494 robot has been stopped; the second one and the TL820 DLT robot will be stopped later this year.
New objectives for 2001 are:
The following points were raised during the discussion:
FOCUS ENDORSES the objectives for 2001. In particular it SUPPORTS the interim solution of charging 2 CHF/GB in 2001 for new data (existing data will be transferred from Redwoods without charge), purchasing managed storage instead of tapes, which are no longer sold to experiments.
5.2 Overview of Storage Area Networks (Ben Segal) (see slides)
Ben introduced the SAN principles. SCSI commands are sent over a switched LAN (usually FibreChannel), offloading the general purpose LAN. Disk and Tape devices are attached directly to the network (as opposed to servers as for the conventional "Network Attached Storage" (NAS)). They are physically visible to the whole network but are usually logically partitioned among clients without sharing. Shared storage is possible but requires a cluster file system.
The Pros of SAN compared to NAS are extra network bandwidth, storage switchability without recabling, better storage redundancy and manageability, and higher speed due to the elimination of the IP protocol stack and intermediate servers. Cons are the cost per GB of disk storage, which tends to be higher than basic SCSI or EIDE, extra network management, and interoperability problems with FibreChannel. However, work is ongoing, also at CERN, to look for alternatives to FibreChannel, especially Gigabit Ethernet (encapsulating SCSI commands in the very lightweight ST protocol).
Thus SANs make sense for high quality, high availability, high performance, high reliability storage, but not for staging, scratch or for direct desktop connections.
A small pilot SAN will be set up by IT/PDP with RAID disks and a few tape drives, to investigate high performance, highly reliable storage for ORACLE Parallel Server, and for CDR first level storage (i.e. the high performance buffer at the input of CDR). An investigation of Gigabit Ethernet client attachment will be made when ready.
FOCUS is grateful for the review and SUPPORTS a small pilot SAN.
5.3 Plans for HPSS, CASTOR etc. (Jean-Philippe Baud) (see slides)
This year HPSS has been used for CDR (NA57 and LHC test beams), "User tapes", DELPHI simulation data, all CMS data, and the ALICE MDC. CASTOR was used by the Tape movers, the Delphi stager, the ALICE MDC, COMPASS and L3C production, and as an interface to HPSS.
Next year it is proposed to limit HPSS to "user tapes" and existing datasets, and to use CASTOR for all CDR, LHC Data Challenges and high volume physics data.
In the longer term, CASTOR is viewed as the candidate solution for LHC, with HPSS as a possible fallback. There is no realistic hope for a common HEP solution, but Data Grid WP5 will provide a common API (with RFIO as proposed solution) and the definition of an exchange format for data and metadata.
As a possible migration path, two possibilities are suggested for importing existing SHIFT tapes into the CASTOR name space. Experiments can ask to move data from HPSS into CASTOR, but no systematic transfer is currently planned; the copy must be done while CERN has an active HPSS license.
S.O'Neale pointed out that the proposed "transparent" mechanism for migrating data to CASTOR loses the knowledge which allows staging of many files in one mount, and so may require changing the applications. This again led to a broader discussion on the difficulty of migrating out of FATMEN.
It was felt important to give to Tier 1 centres the message that CASTOR is the current direction of CERN, and that CERN should invest sufficient manpower into this project.
FOCUS is pleased to receive the status report on the use of CASTOR and HPSS. It SUPPORTS the proposal to use CASTOR for everything except 'user tapes' and existing datasets next year. FOCUS AGREES that effort should be invested to build towards consideration of CASTOR as the solution for LHC.
5.4 DataGrid and implications for CERN
5.4.1 WP2 (Ben Segal) (see slides)
Ben gave a breakdown of the tasks composing DataGrid WP2 and reviewed the resources available to the project. The goal of WP2 is to provide a universal name space, efficient data transfer between sites, secure WAN data access with caching/replication, and interfacing to mass storage management systems. CERN manages the work package and leads tasks 2.2 (Data Access/Migration, i.e. the interface to WP5) and 2.3 (Data Replication, i.e. how to move data close to processing), but has also budgeted some manpower in the other tasks.
The schedule is similar for all work packages. A report on the survey of current technology should be produced by month 4, and a report on requirements and architecture by month 6. A first prototype release is scheduled for month 9, with second and third prototypes one and two years later.
5.4.2 WP5 (John Gordon) (see slides)
Data Grid WP5 addresses the problem of different regional centres having different mass storage hardware and software. The three main tasks of WP5 are to provide:
If RFIO is adopted, CERN will have to package it for use by others (e.g. documentation). Work will be needed to instrument services to provide resource metadata to the Datagrid Information Service.
Unsolved problems include data flow and grid architecture, and databases: so far only physical files have been considered - the implications of using e.g. Objectivity have still to be worked out.
As a conclusion, the chairman asked whether people were worried about grid activities diverting effort from IT division. It was felt too early to conclude anything, but if the model is correct this will actually bring in effort for something that anyway has to be solved.
6. UPDATE ON ONGOING IT ACTIVITIES
6.1 Termination of X terminal repair service (T.Cass) (see slides)
The service handles requests for X terminal repairs. Usage of the service has been declining (15 repairs in 1999) and repairs take 5-10 weeks to complete at an average cost of 500 CHF. No loan is available during the repair. Tony suggests stopping the service, since X terminals have no long-term future as they are not much cheaper than PCs. Broken terminals could be replaced by old PCs running Linux.
The proposed termination was AGREED.
6.2 Consolidation of SUN physics services (L.Robertson) (see slides)
(This item was in fact taken just before coffee)
Les presented one slide summarising the IT proposal. This is to concentrate physics SUN services on a single shared interactive service "SUNDEV". The platform would provide the environment for physics software development, including support for tools not available on Linux. It would NOT be a generic "desktop support service", i.e. not for mail, browsing, text processing etc. Batch would be available only in background to the basic interactive service. The proposal is a consolidation proposal, not implying any additional investment in SUN support. Specialised servers (e.g. database servers) are not concerned by this proposal.
The ATLAS and CMS representatives were unwilling to agree to this proposal without consulting their collaborations, and without more details of the scope and partitioning of the service. Worries were also expressed by NA48 about the impact of this policy on preserving backward compatibility for data processed on the CS2, and on preserving specialised interactive applications such as the NA48 news system. Given the urgency of a decision, N.McCubbin and M.Pimiä were asked to form an ad hoc working group together with IT/PDP (and NA48), to agree on a policy in time for the COCOTIME meeting on 6th November. H.Hoffmann stressed that IT division has defined a direction to follow, and that too many deviations taking manpower away from this direction are not welcome.
FOCUS AGREES that SUN is required as a second software development and validation platform for physics and that the prime platform remains Linux/Intel. It RECOMMENDS that a small group (organised by N.McCubbin and M.Pimiä - ACTION) be formed to define with IT/PDP group the SUNDEV service before the COCOTIME meeting on 6th November, taking into account the presentation made by Les Robertson and the agreement already reached.
7. ACTIONS OUTSTANDING
Minuted/Section | Action | Who | Status
01/07/1999 14/2 | Organise discussion of Storage Area Networks at a future meeting | M.Cattaneo, P.Jeffreys | See these minutes, section 5.2. CLOSED
01/07/1999 14/2 | Organise review of Mass Storage technologies at October 2000 meeting | M.Cattaneo, P.Jeffreys | See these minutes, section 5.1. CLOSED
01/07/1999 14/4 | Look into new representation on DTF | P.Jeffreys | M.Cattaneo represents FOCUS in DTF, A.Silverman represents DTF in FOCUS. CLOSED
02/12/1999 16/4 | Determine policy for future use of FORTRAN CERN libraries | IT Division, FOCUS | Review of requirements previously planned for FOCUS 20 moved to FOCUS 21
02/12/1999 16/6, 02/03/2000 17/5, 08/06/2000 18/4.1 | Second software validation platform working group to make and document fuller analysis following presentation at FOCUS18 | M.Pimiä and experiment reps. | Superseded by ad hoc working group, see these minutes section 6.2
02/03/2000 17/2 | Report on LHC Computing Review at next meeting | D.Jacobs | Planned for FOCUS 20
02/03/2000 17/4 | Organise discussion on freezing of NICE95/NT | M.Cattaneo, P.Jeffreys | Planned for FOCUS 20
02/03/2000 17/4 | Proposal for centralised migration of Windows home directories to DFS | IT Division | To be included in NICE discussion in FOCUS 20
02/03/2000 17/4 | Organise discussion of remote access to Windows home directories | M.Cattaneo, P.Jeffreys | To be included in NICE discussion in FOCUS 20
02/03/2000 17/5 | LEP experiments to define long term analysis strategy by end 2000 | LEP expt. reps. | R.Cashmore has asked LEP experiments for written report. To be followed up by H.Hoffmann
02/03/2000 17/6 | Experiments to decide how to split their tapes between the different libraries | Experiment reps. | Tape libraries have been split. CLOSED
08/06/2000 18/3 | State requirements for interoperability of Linux and Windows | Experiment reps. | Requirements included in Helge's document, see next action. CLOSED
08/06/2000 18/3 | Distribute ATLAS requirements for Linux on portables to FOCUS mailing list. | Helge Meinhard | Document available on FOCUS19 web site. CLOSED
08/06/2000 18/3 | Organise presentation on Linux/Windows migration at a future meeting | M.Cattaneo, P.Jeffreys | Windows presentation planned for FOCUS20. Linux presentation after creation of migration task in 2001
08/06/2000 18/4.2 | Identify full set of software to be maintained in frozen operating system | Experiments with IT. | To be addressed by meetings of experiments with IT link persons.
08/06/2000 18/4 | Document RISC decommissioning schedule and definition of frozen O.S. | IT Division | Memo from Tim Smith will be distributed after meeting, to be discussed at FOCUS 20
08/06/2000 18/5.1 | Ensure external network connections are optimised and appropriate for import/export data transfers (protection against breaks, reliability, ease of transfers) | IT Division together with external partners | Talk on CERN external connectivity planned for FOCUS 20
08/06/2000 18/5 | Refine questions on experiments' needs for remote backup and archive, including needs for any features unique to ADSM | IT Division | Postponed to 2001
08/06/2000 18/6 | Request experiments to specify requirements for large shared LHC computing test-bed | M.Cattaneo, P.Jeffreys | Spring or Summer 2001. Hans Hoffmann would like to test with real data, not just MDC
08/06/2000 18/7.2 | Report back to FOCUS on details of User Revoking Policy, following ACCU discussion | M.Delfino | Principle agreed by ACCU. New iteration at December ACCU, then finalise at FOCUS.
08/06/2000 18/9.4 | Make proposal on how to improve service change announcements | M.Marquina, S.O'Neale | See these minutes, section 4. CLOSED
8. A.O.B.
8.1 New CERN computing rules
New Computing Rules, covering all users of CERN computing facilities, were recently issued. Full information is available at http://www.cern.ch/ComputingRules. Paper copies are available at divisional secretariats in both English and French.
8.2 LEP Higgs search MC productions (DELPHI: J.van Eldik, ALEPH: F.Ranjard) (see slides)
Jan described his efforts to produce 3M Monte Carlo events in 3 days, to make them available for the Higgs analyses. This compares with 50k events/day normally produced by the biggest DELPHI MC farm. 650 Linux and 45 OSF servers were made available by many experiments with the help of IT division. This was made possible by using the standard configuration (AFS, LSF, rfcp) and the LSF multicluster facility. There were few problems, except for book-keeping, which was rather painful due to the short time available to develop it. Jan concluded by thanking IT/PDP and all the experiments.
Florence added that ALEPH benefited from 150 CPUs from the EFF (originally allocated to CMS) for a similar production; she believes that this was only possible thanks to the standardised configuration of the Linux clusters managed by IT Division. It would never have been possible in the days of experiment-specific "shift" environments.
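As a rough check of the scale of the DELPHI production reported above, the figures quoted (3M events, 3 days, 50k events/day normally, 650 Linux and 45 OSF servers) can be combined as in the sketch below. The variable names are illustrative only, and the per-server average assumes, purely for illustration, that every server contributed equally over the full three days, which the minutes do not state.

```python
# Figures quoted in the minutes for the DELPHI Higgs-search MC production.
target_events = 3_000_000   # Monte Carlo events requested
target_days = 3             # time available
normal_rate = 50_000        # events/day from the biggest DELPHI MC farm

# Rate needed to meet the target, and how it compares with normal running.
required_rate = target_events / target_days        # events/day
speedup = required_rate / normal_rate              # factor over the normal rate
days_at_normal_rate = target_events / normal_rate  # duration without extra servers

# Crude per-server average. ASSUMPTION (not in the minutes): all servers
# contribute equally, regardless of platform, for the whole period.
servers = 650 + 45
events_per_server_per_day = required_rate / servers

print(f"required rate: {required_rate:.0f} events/day ({speedup:.0f}x normal)")
print(f"same production at the normal rate: {days_at_normal_rate:.0f} days")
print(f"average per server: {events_per_server_per_day:.0f} events/day")
```

On these numbers the production needed about twenty times the normal daily output, which is why the borrowed capacity was essential.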
PENDING ACTIONS
Minuted/Section | Action | Who | Status
02/12/1999 16/4 | Determine policy for future use of FORTRAN CERN libraries | IT Division, FOCUS | Review of requirements planned for FOCUS 21
02/12/1999 16/6, 02/03/2000 17/5, 08/06/2000 18/4.1, 12/10/2000 19/6.2 | Second software validation platform. Form ad hoc group to define scope of SUNDEV service before COCOTIME meeting of 6th November 2000. | M.Pimiä, N.McCubbin, IT/PDP | Report planned for FOCUS 20
02/03/2000 17/2 | Report on LHC Computing Review at next meeting | D.Jacobs | Planned for FOCUS 20
02/03/2000 17/4 | Organise discussion on freezing of NICE95/NT | M.Cattaneo, P.Jeffreys | Planned for FOCUS 20
02/03/2000 17/4 | Proposal for centralised migration of Windows home directories to DFS | IT Division | To be included in NICE discussion in FOCUS 20
02/03/2000 17/4 | Organise discussion of remote access to Windows home directories | M.Cattaneo, P.Jeffreys | To be included in NICE discussion in FOCUS 20
02/03/2000 17/5 | Definition of LEP long term analysis strategy. | R.Cashmore, H.Hoffmann | Waiting for action on written report requested by R.Cashmore
08/06/2000 18/3 | Organise presentation on Linux/Windows migration at a future meeting | M.Cattaneo, P.Jeffreys | Windows presentation planned for FOCUS20. Linux presentation after creation of migration task in 2001
08/06/2000 18/4.2 | Identify full set of software to be maintained in frozen operating system | Experiments with IT. | To be addressed by meetings of experiments with IT link persons.
08/06/2000 18/4 | Document RISC decommissioning schedule and definition of frozen O.S. | IT Division | Memo from Tim Smith will be distributed after meeting, to be discussed at FOCUS 20
08/06/2000 18/5.1 | Ensure external network connections are optimised and appropriate for import/export data transfers (protection against breaks, reliability, ease of transfers) | IT Division together with external partners | Talk on CERN external connectivity planned for FOCUS 20
08/06/2000 18/5 | Refine questions on experiments' needs for remote backup and archive, including needs for any features unique to ADSM | IT Division | Postponed to 2001
08/06/2000 18/6 | Request experiments to specify requirements for large shared LHC computing test-bed | M.Cattaneo, P.Jeffreys | Spring or Summer 2001. Hans Hoffmann would like to test with real data, not just MDC
08/06/2000 18/7.2 | Finalise with FOCUS details of User Revoking Policy, following discussion at December ACCU | M.Delfino |