Trilaterale SD-Storage-User support


Collaborative tools for the trilateral meeting 10m

Confluence page: https://confluence.infn.it/display/CNAF/CNAF_Trilaterale+Home

JIRA project: https://issues.infn.it/jira/projects/CNTRI/issues/

Move the minutes to Confluence.

StoRM WebDAV load balancing update 15m

News on the redirector: StoRM WebDAV is the only implementation that uses HTTPS also for the transfer itself, so it could redirect to plain HTTP. This will be implemented in the next release for GET: WebDAV can be configured to redirect from HTTPS to HTTP, but the redirect could also be HTTPS to HTTPS, which would allow a StoRM head node plus a set of nginx transfer nodes (possibly added/removed dynamically).

This makes a direct plain-vs-plain comparison with GridFTP possible.

The performance drop of HTTPS compared to HTTP still needs to be evaluated.
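
A minimal sketch of how the redirect behaviour could be observed from a client, assuming a hypothetical head-node endpoint and test file (hostname, port and path are placeholders, not real production endpoints):

    #!/usr/bin/env python3
    # Hypothetical sketch: follow a WebDAV GET redirect from the StoRM head node
    # to a transfer node. Hostname, port and path are placeholders.
    import requests

    HEAD_NODE_URL = "https://storm-head.example.infn.it:8443/atlas/test/1gb.dat"

    # Ask the head node for the file without following redirects automatically,
    # so the Location header (e.g. a plain-HTTP nginx transfer node) is visible.
    resp = requests.get(HEAD_NODE_URL, allow_redirects=False, timeout=30)
    print("status:", resp.status_code)                  # 302/307 if redirection is enabled
    print("redirect target:", resp.headers.get("Location"))

    # Fetch the data directly from the transfer node indicated by the redirect.
    if resp.is_redirect:
        data = requests.get(resp.headers["Location"], timeout=300)
        print("downloaded", len(data.content), "bytes from the transfer node")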

For the tests:

Step 0: configure an unauthenticated storage area and measure the difference (no redirector needed); we use a production endpoint, e.g. ATLAS (see the timing sketch after these steps). Action: Storage/SD.

Step 1: compare against nginx with the redirector; we can do this on a transfer node, starting from a single machine, even a VM, using localhost.
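
For step 0, a rough way to time the same download over HTTPS and plain HTTP could look like the following sketch; the URLs are placeholders for a test file on an unauthenticated storage area:

    #!/usr/bin/env python3
    # Hypothetical sketch for step 0: time the same download over HTTPS and HTTP
    # to estimate the TLS overhead. The URLs are placeholders.
    import time
    import requests

    URLS = {
        "https": "https://webdav-test.example.infn.it:8443/atlas/scratch/1gb.dat",
        "http":  "http://webdav-test.example.infn.it:8085/atlas/scratch/1gb.dat",
    }

    for scheme, url in URLS.items():
        start = time.monotonic()
        size = 0
        with requests.get(url, stream=True, timeout=300) as resp:
            resp.raise_for_status()
            for chunk in resp.iter_content(chunk_size=1 << 20):  # read in 1 MiB chunks
                size += len(chunk)
        elapsed = time.monotonic() - start
        print(f"{scheme}: {size / elapsed / 1e6:.1f} MB/s in {elapsed:.1f} s")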

WebDAV deployment strategy for authorization consistency

There are doubts about deploying Redis, which has a cost; we are waiting for the Infinispan evaluation, but multicast support needs to be verified.
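
As a quick way to verify multicast support on the network (relevant for Infinispan/JGroups discovery), a sketch like the following could be run on two nodes; the group address and port are arbitrary test values, not the ones Infinispan would actually use:

    #!/usr/bin/env python3
    # Hypothetical sketch: check whether UDP multicast traffic crosses the network.
    # Run "recv" on one node and the plain send on another.
    import socket
    import struct
    import sys

    GROUP = "239.1.2.3"   # arbitrary test multicast group
    PORT = 46655          # arbitrary test port

    def send(message: bytes = b"multicast-test") -> None:
        sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
        sock.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_TTL, 2)
        sock.sendto(message, (GROUP, PORT))
        sock.close()

    def receive(timeout: float = 30.0) -> None:
        sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        sock.bind(("", PORT))
        mreq = struct.pack("4sl", socket.inet_aton(GROUP), socket.INADDR_ANY)
        sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP, mreq)
        sock.settimeout(timeout)
        try:
            data, addr = sock.recvfrom(1024)
            print(f"received {data!r} from {addr}")
        except socket.timeout:
            print("no multicast traffic received: multicast may be blocked")
        finally:
            sock.close()

    if __name__ == "__main__":
        receive() if sys.argv[1:] == ["recv"] else send()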

StoRM WebDAV performance and migration 20m

https://ggus.eu/index.php?mode=ticket_info&ticket_id=150181

The problems seen by ATLAS have been solved/understood (difference in bandwidth of the machines involved).

From the ticket (Petr): solved (StoRM). We now finally understand the StoRM behaviour, and in the latest test we reached:

  • WebDAV: time to transfer 2 TB (2000 x 1 GB) with 200 concurrent connections ... 1403 s (1.42 GB/s)
  • SRM+gsiftp: time to transfer 2 TB (2000 x 1 GB) with 200 concurrent connections ... 217 s (9.22 GB/s)

The reason for the different throughput is not related to StoRM: the WebDAV interface's total capacity to the backend storage is currently 4 GB/s (perfectly OK for current ATLAS TPC transfers to INFN-T1 disks, with a monthly average of ~400 MB/s), while for GridFTP the total capacity is 30 GB/s. This will change once we move everything to WebDAV and the GridFTP interfaces are then converted to WebDAV.
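
As a quick cross-check of the quoted figures (2000 files of 1 GB each):

    # Sanity check of the throughput figures quoted above (2000 x 1 GB files).
    total_gb = 2000 * 1.0
    print(f"WebDAV:     {total_gb / 1403:.2f} GB/s")  # ~1.43 GB/s (quoted as 1.42)
    print(f"SRM+gsiftp: {total_gb /  217:.2f} GB/s")  # ~9.22 GB/s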

Performance during the gridftp -> webdav migration: do we add WebDAV to the NSD servers?

Storm-atlas: xs-201/301/401
webdav atlas: ds-816/916

Avoid WebDAV consuming all the resources available on the NSD servers: we can tune this by limiting the maximum number of connections, or the CPU consumed by HTTPS; we will see the latter in the HTTPS/HTTP tests.

Action: convert one GridFTP server to WebDAV and see what happens, trying to understand what causes the load (xs-201 ==> webdav). For now we do not change the staging in CRIC and we check whether the hardware difference (xs-201) changes anything. Action: Storage.

Storm-cms: xs-402/403
Webdav CMS: xs-404/

storm-lhcb: xs-101/102/xs-2013
webdav lhcb: xs-104

storm-ams: ds-806/906

storm-archive: ds-509/510/511/512 (the GridFTP servers are also StoRM WebDAV servers for Belle, Virgo and CTA-LST)

Deployment of a Vault instance 15m

There is activity also within pett/smartchain (Arianna?) and IoTwins (Barbara). Are there instances already deployed?

  • We install a shared instance (where?)
    We install it with a view to simplifying token management on the UIs, replacing oidc-agent on the UIs. Action: US to proceed with the installation.
    Also check with Duma.

ACL for local groups of the same VO 15m

  • ACL for local groups of the same VO that wrote the file:

    https://issues.infn.it/jira/projects/STOR/issues/STOR-1350
    
    It seems that the ACLs are set to allow the write operation (RW) and are never changed, even after the PUT is done. This could be the intended behaviour, but we need to discuss how to address requirements (e.g. from JUNO) that ask to inhibit writing for local users.
        Should we use different groups, e.g. JUNO_local, or can this be managed via StoRM?
    LHCb seems to expect read-only (R) for the lhcb group after the write operation in /storage/gpfs_lhcb/lhcb/user (see the check sketch after the configuration below):
    
                <aclMode>AoT</aclMode>
                <default-acl>
                    <acl-entry>
                      <groupName>lhcb</groupName>
                      <permissions>R</permissions>
                    </acl-entry>
                </default-acl>
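
A small sketch of how the resulting group ACL could be checked on disk after a PUT, assuming the ACLs are visible as POSIX ACLs via getfacl (on GPFS mmgetacl may be needed instead); the path and group name are placeholders:

    #!/usr/bin/env python3
    # Hypothetical sketch for STOR-1350: check whether the group ACL entry on a
    # file is still rw or has been reduced to r-- after the PUT completed.
    # Path and group are placeholders; on GPFS mmgetacl may be needed instead.
    import subprocess
    import sys

    def group_acl(path: str, group: str) -> str | None:
        out = subprocess.run(["getfacl", "-p", path], capture_output=True,
                             text=True, check=True).stdout
        for line in out.splitlines():
            if line.startswith(f"group:{group}:"):
                return line.split(":")[2]  # e.g. "rw-" or "r--"
        return None

    if __name__ == "__main__":
        path = sys.argv[1] if len(sys.argv) > 1 else "/storage/gpfs_lhcb/lhcb/user/testfile"
        print(group_acl(path, "lhcb"))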
    

AOB

Next meeting on 15/04.

We will start with the actions not yet discussed.
