The Real Time Monitor

Adding your site to the RTM

If your site and its resources are published in a major top-level BDII, they should appear on the RTM automatically. To check that everything is published as you expect, please look here. This list contains all WMS/RB and CEs visible to the RTM, ordered by country and site.

If a resource or its jobs are not present in this text list, they will not appear on the RTM map either.

If your site is in a country not listed in the location list above or your BDII is not being monitored, please mail us, so we can extend the list of countries recognised by the RTM.

You can also supply a logo for this list and the RTM. Just email the image you wish to use to lcg-monitor@imperial.ac.uk. It can be in any web-compatible format, but it should be 72dpi and at most 100px high and 300px wide.

We do not show Logging and Bookkeeping (L&B) systems on the map, but these are the boxes we actually use to obtain information about jobs. We scan for new resources four times a day, adding any new services, and we ping L&B systems to learn whether they are to be monitored. We periodically mail sysadmins with instructions, asking them to allow their site's L&B to be monitored by the RTM.

Enabling your L&B for use with the RTM

L&B version before 2.0.X

If you are running a version of L&B prior to 2.0, you need to allow the RTM to monitor your WMS/RB. To do this, execute the following steps on the box where the Logging & Bookkeeping MySQL database is running. The value of password used below can be obtained by emailing lcg-monitor@imperial.ac.uk.

# Log in to MySQL as root; the password is the one in your YAIM site-info.def.
mysql -u root -p

-- Then, at the mysql prompt:
GRANT SELECT ON lbserver20.events TO 'lcg2mon'@'tl00.hep.ph.ic.ac.uk' IDENTIFIED BY 'password';
GRANT SELECT ON lbserver20.states TO 'lcg2mon'@'tl00.hep.ph.ic.ac.uk' IDENTIFIED BY 'password';
GRANT SELECT ON lbserver20.short_fields TO 'lcg2mon'@'tl00.hep.ph.ic.ac.uk' IDENTIFIED BY 'password';

You also need to ensure that no firewall is blocking connections from the host used by the RTM (tl00.hep.ph.ic.ac.uk).
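
To avoid retyping the three statements, they can be generated by a small shell function and piped into the mysql client. This is only a sketch: the function name rtm_grant_sql is made up for illustration and is not part of the RTM or L&B, and the statements it prints are simply the ones listed above.

```shell
# Hypothetical helper (illustration only): print the three GRANT statements
# from this section, using the password given as the first argument.
rtm_grant_sql() {
    pw=$1
    # Refuse an empty password, or one containing a quote or backslash,
    # since either would break the SQL string quoting below.
    case $pw in
        ''|*\'*|*\\*)
            echo "rtm_grant_sql: empty or unsafe password" >&2
            return 1
            ;;
    esac
    for table in events states short_fields; do
        printf "GRANT SELECT ON lbserver20.%s TO 'lcg2mon'@'tl00.hep.ph.ic.ac.uk' IDENTIFIED BY '%s';\n" \
            "$table" "$pw"
    done
}

# Usage (replace password with the value obtained by email):
#   rtm_grant_sql 'password' | mysql -u root -p
```

Running the function on its own first, without the pipe, lets you inspect the statements before they are sent to the database.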

L&B version 2.0.X

If you are running L&B version 2.0 or later, the way in which the RTM interacts with it has changed. The old 'pull' mode, with direct access to the LB MySQL database, is disabled by default in the RTM from L&B version 2.0 onwards, since the format has changed and we cannot monitor the new server the old way. With L&B 2.0 and later we use LB notifications instead: the required information is pushed to the RTM by the LB servers we subscribe to. To include your L&B servers in the system, a few actions are required on your side.

  • The DN of the X509 identity of the LB harvester (/C=UK/O=eScience/OU=Imperial/L=Physics/CN=rtmsrv00.hep.ph.ic.ac.uk/emailAddress=janusz.martyniak@imperial.ac.uk) must be added to /opt/glite/etc/LB-super-users (for L&B 2.1, add it to the ADMIN_ACCESS section of the /opt/glite/etc/glite-lb/glite-lb-authz.conf file instead).
  • An index on lastUpdateTime is needed ("/opt/glite/bin/glite-lb-bkindex -d" lists all the current indices). This may take a while.
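
For the first step on an L&B 2.0 server, the DN can be appended from the shell without risking a duplicate entry. The sketch below assumes that LB-super-users holds one DN per line; the function name add_rtm_dn is hypothetical, and the function does not handle the sectioned glite-lb-authz.conf used by L&B 2.1.

```shell
# DN of the LB harvester, exactly as given in the list above.
RTM_DN='/C=UK/O=eScience/OU=Imperial/L=Physics/CN=rtmsrv00.hep.ph.ic.ac.uk/emailAddress=janusz.martyniak@imperial.ac.uk'

# Hypothetical helper (illustration only): append the DN to the given file
# unless an identical line is already present.
# Assumption: the file format is one DN per line.
add_rtm_dn() {
    file=$1
    if grep -qxF -- "$RTM_DN" "$file" 2>/dev/null; then
        echo "DN already present in $file"
        return 0
    fi
    # If the file is non-empty and its last line lacks a newline, add one
    # first so that the DN ends up on a line of its own.
    if [ -s "$file" ] && [ -n "$(tail -c 1 "$file")" ]; then
        echo >> "$file"
    fi
    printf '%s\n' "$RTM_DN" >> "$file" && echo "DN added to $file"
}

# Usage (as a user allowed to write the file):
#   add_rtm_dn /opt/glite/etc/LB-super-users
```

Because the function checks for an exact matching line first, it is safe to run more than once.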

When installing a brand new L&B, however, you can use the following YAIM installation options instead of performing these steps by hand.

GLITE_LB_RTM_ENABLED=true
GLITE_LB_RTM_DN="/C=UK/O=eScience/OU=Imperial/L=Physics/CN=rtmsrv00.hep.ph.ic.ac.uk/emailAddress=janusz.martyniak@imperial.ac.uk"

Troubleshooting

If you think your resources should be visible on the map but they are not, you may need to do an LDAP search. This is outlined here.

The RTM relies solely on information stored in a BDII to discover the resources it uses. We are happy to include any EGEE-compatible resource, not necessarily one published in the central BDII systems at CERN or RAL. If your resources are published in your Grid infrastructure's top BDII, we are happy to scan it and include the services in the RTM. Our experience shows, however, that many 'missing' resources are caused by improper listing in the BDII. Below we give examples of queries that you can (and should) perform to see whether your boxes are published properly; these are command-line versions of the queries our scans actually use.

In the following examples the site we are interested in is UKI-LT2-IC-HEP and we are using the RAL top BDII (ldap://lcg-bdii.gridpp.ac.uk:2170). When running these commands, replace these with your own site ID and top-level BDII. If any of the commands fail, you will need to talk to the administrator of your BDII. Let's start from the top and see if your site is there.

ldapsearch -LLL -x -H ldap://lcg-bdii.gridpp.ac.uk:2170 -b mds-vo-name=local,o=grid '(objectClass=GlueSite)' GlueSiteUniqueID
This produces lengthy output, but if your site is visible it will appear in the list with the expected ID, e.g. UKI-LT2-IC-HEP.
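
Rather than reading the whole output by eye, you can pipe it through a small filter. The following is a minimal sketch: the function name site_published is made up for illustration, and it assumes the attribute is printed exactly as "GlueSiteUniqueID: <ID>" on a line of its own. The match is exact and case-sensitive, so a site published under a slightly different ID will not match.

```shell
# Hypothetical helper (illustration only): read ldapsearch output on
# standard input and succeed only if it contains an attribute line for
# exactly the given site ID.
site_published() {
    grep -qxF -- "GlueSiteUniqueID: $1"
}

# Usage, with the BDII and site ID from the example above:
#   ldapsearch -LLL -x -H ldap://lcg-bdii.gridpp.ac.uk:2170 \
#       -b mds-vo-name=local,o=grid '(objectClass=GlueSite)' GlueSiteUniqueID \
#       | site_published UKI-LT2-IC-HEP && echo "site is published"
```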

Once you have confirmed that your site is visible, you need to determine which Computing Elements are being reported. This is done by searching for GlueClusterName and filtering on GlueSiteUniqueID.

ldapsearch -LLL -x -H ldap://lcg-bdii.gridpp.ac.uk:2170 -b mds-vo-name=local,o=grid '(&(objectClass=GlueCluster)(GlueForeignKey=GlueSiteUniqueID=UKI-LT2-IC-HEP))' GlueClusterName
This will list your CEs (GlueClusterName) located at the site if you are publishing correctly.
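
Since the same filter pattern is reused for different object classes, it can help to build it from your own site ID instead of editing the command by hand. This is a sketch only; the function name glue_site_filter and the BDII and SITE variables are hypothetical names introduced for illustration.

```shell
# Hypothetical helper (illustration only): build the LDAP filter used in
# this section for a given Glue object class and site ID.
glue_site_filter() {
    printf '(&(objectClass=%s)(GlueForeignKey=GlueSiteUniqueID=%s))\n' "$1" "$2"
}

# Usage: set these two values for your own infrastructure, then run the query.
#   BDII=ldap://lcg-bdii.gridpp.ac.uk:2170
#   SITE=UKI-LT2-IC-HEP
#   ldapsearch -LLL -x -H "$BDII" -b mds-vo-name=local,o=grid \
#       "$(glue_site_filter GlueCluster "$SITE")" GlueClusterName
```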

The RTM is interested in more than just the CEs, so you also need to check that both your WMS and L&B are working. To do this run the following command:

ldapsearch -LLL -x -H ldap://lcg-bdii.gridpp.ac.uk:2170 -b mds-vo-name=local,o=grid '(&(objectClass=GlueService)(GlueForeignKey=GlueSiteUniqueID=UKI-LT2-IC-HEP))' GlueServiceType GlueServiceEndpoint
To check that your WMS is being reported, look in the output for the line GlueServiceType: org.glite.wms.WMProxy. To check that your L&B is working as expected, look for the line GlueServiceType: org.glite.lb.server. You also need to check that each of these GlueServiceType entries is accompanied by the correct GlueServiceEndpoint value.
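
The same check can be scripted. The sketch below reads the ldapsearch output on standard input, prints the endpoint published for each of the two service types and reports any type that is missing. The function name check_rtm_services is hypothetical, and the parsing assumes plain LDIF as produced with -LLL: entries separated by blank lines, with long lines wrapped onto continuation lines that start with a space. Whether an endpoint is the correct one still has to be judged by you.

```shell
# Hypothetical helper (illustration only): summarise the WMS and L&B
# entries found in ldapsearch output read from standard input.
# Exit status is 0 if both service types are present, 1 otherwise.
check_rtm_services() {
    awk '
        BEGIN {
            RS = ""                                  # one LDIF entry per record
            wanted["org.glite.wms.WMProxy"] = 1
            wanted["org.glite.lb.server"] = 1
        }
        {
            gsub(/\n /, "")                          # unfold wrapped LDIF lines
            type = ""; endpoint = ""
            n = split($0, line, "\n")
            for (i = 1; i <= n; i++) {
                if (line[i] ~ /^GlueServiceType: /)     type = substr(line[i], 18)
                if (line[i] ~ /^GlueServiceEndpoint: /) endpoint = substr(line[i], 22)
            }
            if (type in wanted) {
                print type, (endpoint == "" ? "(no endpoint published)" : endpoint)
                seen[type] = 1
            }
        }
        END {
            for (t in wanted)
                if (!(t in seen)) { print "MISSING", t; missing = 1 }
            exit missing
        }
    '
}

# Usage, with the BDII and site ID from the example above:
#   ldapsearch -LLL -x -H ldap://lcg-bdii.gridpp.ac.uk:2170 \
#       -b mds-vo-name=local,o=grid \
#       "(&(objectClass=GlueService)(GlueForeignKey=GlueSiteUniqueID=UKI-LT2-IC-HEP))" \
#       GlueServiceType GlueServiceEndpoint | check_rtm_services
```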