Update Context File spec to use newest Context questions #3

Open
DingoEatingFuzz opened this issue Aug 3, 2019 · 0 comments
@DingoEatingFuzz
Collaborator

My initial exercise of translating the Context questions into a YAML spec used out-of-date questions.

The final Context v1 spec should be similar in structure (prefer nested properties, short property names, mindful datatypes) but up to date with the latest methodology requirements.

Example from the initial exercise, for inspiration:

name: 'E-Scooter Routes Traveled Interactive Map data'
history:
  original_purpose: 'E-Scooter pilot program analysis'
  other_purposes:
    known: []
    potential:
      - 'Commute frequency'
      - 'Urban planning'
  funding:
    funded_by: 'PBOT'
    dependencies: []
  experiments: []
  comments: ''
composition:
  instances:
    - type: 'Ride aggregations by geospatial hexagon'
      count: 16000
      fields:
        - longitude
        - latitude
        - segment name
        - trips
  relationships: []
  self_contained: true
  external_guarantees: false
  data_splits: ''
  evaluation_measures: ''
  comments: ''
collection:
  procedure: >
    Trip data was collected by various E-Scooter providers, including Bird and Lyft. Source data was collected at the ride level, including statistics such as start time, end time, max speed, average speed, and route taken.
  participants:
    - Lyft
    - Bird
    - Lime
    - Skip
    - PBOT
  compensation_structure: >
    Lyft, Bird, and Lime collect data for internal usage and analytics. Their incentive for sharing the data with PBOT is to allow the city to analyze ride history during the pilot and determine its outcomes. Ultimately, their goal is to continue providing their services in the city of Portland.
  timeframe:
    start: 2018/07/23
    end: 2018/11/20
  acquisition:
    method: 'From E-Scooter providers to PBOT'
    frequency: 'weekly'
  validation: >
    Before publicly releasing the data, all trips under one minute in duration were dropped. All trips representing an outlier measured by average speed were dropped.
  completeness:
    sample: false
    missing_information: false
  comments: ''
preprocessing:
  methodology: >
    Trips from the entire pilot period were aggregated based on geospatial hexagons that span the city limits of Portland. Each hexagon has an associated segment name. Most segments include multiple hexagons. Some hexagons represent multiple segments (e.g., road intersections).
  source_data_preserved: true
  software_used:
    - name: ArcGIS
      third_party: true
      open_source: false
      link: ''
  aligned_with_motivations: true
  limitations: >
    In this aggregated view, only geospatial frequency can be examined and validated. The original analysis includes route durations, trip frequency by time and day, trip frequency by E-Scooter provider, and many other details.
  comments: ''
distribution:
  channels:
    - PBOT website as a downloadable CSV
  doi: false
  redundant_archival: true
  first_distribution_date: ''
  license: null
  restrictions:
    access: null
    redistribution: null
  comments: ''
maintenance:
  maintainers:
    - 'PBOT'
  updates: null
  erratum: null
  obsolete_notification_procedure: null
  data_usage_tracking: null
  extensions: null
  comments: ''
legal:
  collection_disclosure:
    disclosure: true
    procedure: >
      As part of the 2018 E-Scooter pilot, the city publicly stated data was being collected.
  collection_consent:
    motivation_disclosure: true
    consent_protocol: 'Terms of use'
    ability_to_revoke_consent: false
  individual_legal_implications:
    exposes_individuals_to_harm: false
    exposes_individuals_to_legal_action: false
    mitigation_procedure: >
      Only aggregated, and therefore anonymized, data was published.
  social_advantages:
    groups_advantaged: []
    groups_disadvantaged: []
    details: ''
    mitigation_procedure: >
      Only aggregated, and therefore anonymized, data was published.
  privacy:
    guarantees:
      - Personally identifiable data would not be published
    protections: []
    confidential: false
    personally_identifiable: false
ethics:
  content_warnings: []
  worst_case_scenarios:
    - Data is used to sabotage PBOT's transportation goals in order to drive more people to use private transportation methods, E-Scooters or otherwise.
    - E-Scooter providers use this data to prey on vulnerable populations, such as underage young adults and children (e.g., ensuring a healthy stock of scooters outside of high schools).
  comments: >
    Since this dataset is derived from more detailed and more granular data owned by E-Scooter providers, there is nothing here that is enabling worst case scenarios beyond what these companies could do on their own already.
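
For whoever picks this up, a rough sketch of how a Context file with this structure might be sanity-checked once it has been parsed into a dictionary (for instance with PyYAML's `yaml.safe_load`). The function name `check_context` and the `EXPECTED_TYPES` table are hypothetical, and the section list is taken from the example above, so it would need updating once the v1 spec follows the latest questions.

```python
# Top-level sections, taken from the example above (an assumption about v1).
SECTIONS = ("history", "composition", "collection", "preprocessing",
            "distribution", "maintenance", "legal", "ethics")

# A few nested properties and the datatype the example uses for each.
# This is an illustrative subset, not a complete schema.
EXPECTED_TYPES = {
    ("composition", "instances"): list,
    ("composition", "self_contained"): bool,
    ("collection", "participants"): list,
    ("distribution", "channels"): list,
    ("maintenance", "maintainers"): list,
}


def check_context(doc):
    """Return a list of problems found in an already-parsed Context file."""
    problems = []
    name = doc.get("name")
    if not isinstance(name, str) or not name:
        problems.append("name: expected a non-empty string")
    for section in SECTIONS:
        if not isinstance(doc.get(section), dict):
            problems.append(f"{section}: expected a mapping")
    for (section, key), expected in EXPECTED_TYPES.items():
        parent = doc.get(section)
        if not isinstance(parent, dict):
            continue  # the missing section was already reported above
        if key not in parent:
            problems.append(f"{section}.{key}: missing")
        elif not isinstance(parent[key], expected):
            problems.append(f"{section}.{key}: expected {expected.__name__}")
    return problems
```

An empty list means the file passed these checks. Keeping the expected datatypes in a table is one way to make the "mindful datatypes" goal testable without committing to a full schema language yet.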