Program data for each conference, and some scripts and derivatives

AccessLibCon/conference-data


The program data has been marshalled in a Google Sheet, with one worksheet per conference (currently starting with 1996). The worksheets have been exported into the csv directory with names like 1996.csv. Until the data format is finalized, I've considered the Google Sheet to be the authoritative source and the csv files to be working derivatives. To make changes, update the Google Sheet and download new csv files, until the decision is made to treat the csv files as the authoritative source.

List of conferences and sources: Access Conference History

The only non-flat field in the csv is the speakers column, which can contain multiple speakers (pipe-separated). Individual speakers are listed in the form name (institution). Not all speakers have institutions. Institution names have not been normalized, and many different forms appear (University of Alberta, U of Alberta, University of Alberta Libraries, etc.).
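
As a minimal sketch of how the speakers column could be unpacked, the snippet below splits a cell on the pipe character and treats a trailing parenthesised group as the institution. The helper name `parse_speakers` and the sample cell are hypothetical, not part of the repository's scripts, and entries with nested parentheses simply fall back to a name with no institution.

```ruby
# Hypothetical helper: split a speakers cell into name/institution pairs.
def parse_speakers(cell)
  cell.to_s.split('|').map(&:strip).reject(&:empty?).map do |entry|
    # Assumption: a trailing "(...)" group is the institution.
    match = entry.match(/\A(?<name>.*?)\s*\((?<institution>[^()]*)\)\s*\z/)
    if match
      { name: match[:name].strip, institution: match[:institution].strip }
    else
      # No parenthesised group: the speaker has no institution listed.
      { name: entry, institution: nil }
    end
  end
end

# Made-up sample cell in the documented format.
speakers = parse_speakers('Jane Doe (University of Alberta) | John Smith')
speakers.each { |s| puts "#{s[:name]} / #{s[:institution].inspect}" }
```

Because institution names are not normalized, the returned institution strings still need the lookup described below before they can be grouped or geocoded.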

The schedule times are preserved, and are generally in the form 9:00 - 10:00 (with many variations), always in the 12-hour clock.

Geocoding of institutions is a complex chain. The non-normalized institution names from the programs are gathered into institutions.json by gather-institutions.rb.

institutions.json entries look like this:

"Emory University, Atlanta, GA": {
    "city": "Atlanta",
    "type": null,
    "ignore": false
  },

The city property is added manually. It will be used for geocoding by gather-places.rb, so it needs to be specific enough for a lookup. Big cities work on their own (like Toronto); smaller or ambiguous ones need more detail (Victoria, BC, Canada).
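
Since the city values are filled in by hand, a quick check for entries that still lack one can help before running the geocoding step. The following is only a sketch: the helper name `institutions_missing_city` is hypothetical, the second sample entry is made up, and it assumes that `"ignore": true` marks an entry that should be skipped, which the entry format suggests but does not state.

```ruby
require 'json'

# Hypothetical helper: names of entries that still need a manual city.
# Assumption: entries with "ignore": true are meant to be skipped.
def institutions_missing_city(institutions)
  institutions.select do |_name, entry|
    !entry['ignore'] && entry['city'].to_s.strip.empty?
  end.keys
end

# Inline sample in the same shape as institutions.json.
# "Example Library" is a made-up entry for illustration.
sample = JSON.parse(<<~JSON)
  {
    "Emory University, Atlanta, GA": { "city": "Atlanta", "type": null, "ignore": false },
    "Example Library": { "city": null, "type": null, "ignore": false }
  }
JSON

puts institutions_missing_city(sample)
```

To run it against the real file, the sample could be replaced with `JSON.parse(File.read('institutions.json'))`, adjusting the path to wherever the file lives.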

Sample code for geocoding places:

# Excerpt from inside a loop over the keys of source_data (note the `next`).
s = source_data[key]
if mappings.key?(s['city'])
  # City already geocoded: reuse the cached address.
  s['address'] = mappings[s['city']]
else
  # Look the city up and take the first match, if any.
  result = Geocoder.search(s['city']).first
  unless result
    puts 'Not found: ' + s['city']
    next
  end
  place = {
    address: result.address,
    city: (result.city.nil? ? '' : result.city),
    state: result.state,
    country: result.country,
    lat: result.coordinates[0],
    lon: result.coordinates[1],
  }
  # Cache the lookup so the same city is not geocoded twice.
  mappings[s['city']] = result.address
  new_places << place[:city]
  master_place_data[result.address] = place
  puts s['city'] + ': ' + result.address
end

Sample parsing of program times:

require 'time'

# we assume times less than 8:00 are pm
eightoclock = Time.parse('8:00')
twelvehours = 12 * 60 * 60

...

# keep only digits and colons, then split into start and finish
times = row['time'].gsub(/[^0-9:]/, ' ').gsub(/\s+/, ' ').strip.split(' ')
start = Time.parse(times[0])
finish = Time.parse(times[1])

# convert pm times to 24 hour format
start += twelvehours if start < eightoclock
finish += twelvehours if finish < eightoclock

# duration in minutes
duration = (finish - start) / 60
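
The same heuristic can be wrapped in a small function to try it on individual time strings. This is a sketch rather than the repository's code: the name `parse_time_range` and the fixed base date are assumptions made for illustration, and a string containing a stray colon would still raise an error in `Time.parse`.

```ruby
require 'time'

# Hypothetical wrapper around the heuristic above.
# Returns [start, finish, minutes], or nil when two times cannot be found.
# A fixed base date keeps results independent of the day the script runs.
def parse_time_range(text, base = Time.new(2000, 1, 1))
  cutoff = Time.parse('8:00', base)
  twelve_hours = 12 * 60 * 60

  # Keep only digits and colons, then split on the remaining whitespace.
  times = text.gsub(/[^0-9:]/, ' ').split(' ')
  return nil if times.length < 2

  start, finish = times.first(2).map { |t| Time.parse(t, base) }

  # Times before 8:00 are assumed to be pm, as in the sample above.
  start += twelve_hours if start < cutoff
  finish += twelve_hours if finish < cutoff

  [start, finish, ((finish - start) / 60).round]
end

['9:00 - 10:00', '1:30 - 2:15', '11:30 - 1:00'].each do |text|
  start, finish, minutes = parse_time_range(text)
  puts "#{text}: #{start.strftime('%H:%M')} to #{finish.strftime('%H:%M')} (#{minutes} min)"
end
```

One consequence of the 8:00 cutoff is that a session that really does start before 8:00 in the morning would be shifted to the evening, so such rows need checking by hand.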
