remove obsolete site.ad file from TW and schedulers #8720

Closed
belforte opened this issue Sep 26, 2024 · 0 comments

For a long time now, only site.ad.json has been used. Ref: #8699 (comment)

text pasted here for convenience:
site.ad vs. site.ad.json

The manipulation of site.ad in AdjustSites.py has been unchanged since Brian's commit in 2013.

if 'CRAB_SiteAdUpdate' in ad:
    newSiteAd = ad['CRAB_SiteAdUpdate']
    with open("site.ad", 'r', encoding='utf-8') as fd:
        siteAd = classad.parseOne(fd)
    siteAd.update(newSiteAd)
    with open("site.ad", "w", encoding='utf-8') as fd:
        fd.write(str(siteAd))

But I can't find any other place in the code base that references CRAB_SiteAdUpdate.
Is that code simply "never executed"?
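
One way to double-check that claim is a plain text search over a checkout of the repository. The sketch below is a generic helper, not part of CRAB: the function name `find_references` and the `.py` suffix filter are assumptions made for illustration only.

```python
import os


def find_references(root, needle, suffixes=(".py",)):
    """Return sorted (path, line_number, line) tuples for lines containing needle."""
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            # Only look at files with one of the requested suffixes.
            if not name.endswith(suffixes):
                continue
            path = os.path.join(dirpath, name)
            # errors="replace" keeps the scan going on oddly encoded files.
            with open(path, encoding="utf-8", errors="replace") as fd:
                for lineno, line in enumerate(fd, start=1):
                    if needle in line:
                        hits.append((path, lineno, line.strip()))
    return sorted(hits)
```

Calling `find_references(".", "CRAB_SiteAdUpdate")` from the top of a checkout lists every Python line that mentions the attribute. Note that a plain text search only finds literal mentions in the scanned files; it cannot see the attribute being set from outside the repository.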

PreJob uses site.ad.json to create the lists of available sites (where we can run) and `datasites` (where the data is). If it does not find it, it falls back to `site.ad` to create the list of available sites only, and at first sight it should crash, since `datasites` is used unconditionally in the following code.
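
To make the suspected crash concrete, here is a toy reproduction of the pattern described above. It is not the PreJob code: the function name `site_lists`, the hard-coded group key `"0"`, and the empty fallback set are all assumptions chosen to keep the example small.

```python
import json
import os


def site_lists(json_path):
    """Toy version of a 'JSON first, legacy fallback' lookup."""
    if os.path.exists(json_path):
        with open(json_path, encoding="utf-8") as fd:
            siteinfo = json.load(fd)
        available = set(siteinfo["group_sites"]["0"])
        datasites = set(siteinfo["group_datasites"]["0"])
    else:
        # Stand-in for the legacy site.ad branch: only `available` is set.
        available = set()
    # `datasites` is used unconditionally, so the fallback branch
    # raises UnboundLocalError here.
    return available, datasites
```

If the real code has the same shape, the fallback branch cannot have been exercised successfully in a long time, which supports the idea that it is dead code.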

There is also this very interesting commit from Brian 10 years ago, 75b85ca, which looks like it introduced the JSON while leaving the old code in place for temporary compatibility (?), and then... the change stuck.

So I am leaning toward the conclusion that site.ad can be removed.

AFAICT those two files are created by DagmanCreator

with open("site.ad.json", "w", encoding='utf-8') as fd:
    json.dump(siteinfo, fd)

Yet I can't be sure, nor can I figure out how they are populated just from reading the code. I need to run it step by step.

site.ad.json contains the siteinfo structure
site.ad contains the sitead structure
Those structures are initialized to "empty" in

# Create site-ad and info. DagmanCreator should only be run in
# succession, never parallel for a task!
if os.path.exists("site.ad.json"):
    with open("site.ad.json", encoding='utf-8') as fd:
        siteinfo = json.load(fd)
else:
    siteinfo = {'group_sites': {}, 'group_datasites': {}}
if os.path.exists("site.ad"):
    with open("site.ad", encoding='utf-8') as fd:
        sitead = classad.parseOne(fd)
else:
    sitead = classad.ClassAd()

(when DagmanCreator runs in TW, there are no files to read)

so in particular we start with

siteinfo = {'group_sites': {}, 'group_datasites': {}}

Then DagmanCreator calls createSubdag(), which gets the list of data sites and available sites for each job group (one job group for each set of locations, IIUC)


availablesites = possiblesites - global_blacklist

availablesites &= acceleratorsites

available -= (siteBlacklist - siteWhitelist)

availablesites = [str(i) for i in availablesites]
datasites = jobs[0]['input_files'][0]['locations']

and calls makeDagSpecs(), where those are added to the siteinfo dictionary:

groupid = len(siteinfo['group_sites'])
siteinfo['group_sites'][groupid] = list(availablesites)
siteinfo['group_datasites'][groupid] = list(datasites)
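
A minimal sketch of how this bookkeeping behaves across two job groups and a write/read cycle. The helper name `add_group` and the site names are made up for illustration; only the three assignment lines mirror the snippet above.

```python
import json


def add_group(siteinfo, availablesites, datasites):
    """Append one job group's site lists, mirroring the snippet above."""
    groupid = len(siteinfo['group_sites'])
    siteinfo['group_sites'][groupid] = list(availablesites)
    siteinfo['group_datasites'][groupid] = list(datasites)
    return groupid


# Start from the "empty" structure used when no site.ad.json exists yet.
siteinfo = {'group_sites': {}, 'group_datasites': {}}
add_group(siteinfo, ['SITE_A', 'SITE_B'], ['SITE_A'])  # made-up site names
add_group(siteinfo, ['SITE_C'], ['SITE_C'])

# Simulate writing site.ad.json and reading it back.
roundtrip = json.loads(json.dumps(siteinfo))

print(list(siteinfo['group_sites']))   # integer group ids in memory
print(list(roundtrip['group_sites']))  # string group ids after JSON
```

One detail worth keeping in mind: JSON object keys are always strings, so the integer group ids used in memory come back as `'0'`, `'1'`, ... once the file is reloaded. Any reader of site.ad.json has to look groups up by string key.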

(no comment on what looks like a horrible dirty trick at L599)

My conclusion from the above is that site.ad is old, obsolete, useless and can be simply removed.
