Updating to new DataAtWork structure
jflasher committed Aug 16, 2019
1 parent a5b8753 commit 4d753cc
Showing 120 changed files with 1,259 additions and 942 deletions.
52 changes: 32 additions & 20 deletions README.md
@@ -31,10 +31,21 @@ Resources:
Region:
Type:
DataAtWork:
- Title:
URL:
AuthorName:
AuthorURL:
Tutorials:
- Title:
URL:
AuthorName:
AuthorURL:
Tools & Applications:
- Title:
URL:
AuthorName:
AuthorURL:
Publications:
- Title:
URL:
AuthorName:
AuthorURL:
```
The metadata required for each dataset entry is as follows:
@@ -93,22 +104,23 @@ Resources:
Region: us-east-1
Type: SNS Topic
DataAtWork:
- Title: Exploring GDELT with Athena
URL: http://blog.julien.org/2017/03/exploring-gdelt-data-set-with-amazon.html
AuthorName: Julien Simon
AuthorURL: https://twitter.com/julsimon
- Title: Running R on Amazon Athena
URL: https://aws.amazon.com/blogs/big-data/running-r-on-amazon-athena/
AuthorName: Gopal Wunnava
AuthorURL: https://www.linkedin.com/in/gopal-wunnava-b11a77/
- Title: Bootstrapping GeoMesa HBase on AWS S3
URL: http://www.geomesa.org/documentation/tutorials/geomesa-hbase-s3-on-aws.html
AuthorName: Commonwealth Computer Research, Inc.
AuthorURL: https://www.ccri.com
- Title: Creating PySpark DataFrame from CSV in AWS S3 in EMR
URL: https://gist.github.com/jakechen/6955f2de51212163312b6430555b8e0b
AuthorName: Jake Chen
AuthorURL: https://github.com/jakechen
Tutorials:
- Title: Exploring GDELT with Athena
URL: http://blog.julien.org/2017/03/exploring-gdelt-data-set-with-amazon.html
AuthorName: Julien Simon
AuthorURL: https://twitter.com/julsimon
- Title: Running R on Amazon Athena
URL: https://aws.amazon.com/blogs/big-data/running-r-on-amazon-athena/
AuthorName: Gopal Wunnava
AuthorURL: https://www.linkedin.com/in/gopal-wunnava-b11a77/
- Title: Bootstrapping GeoMesa HBase on AWS S3
URL: http://www.geomesa.org/documentation/tutorials/geomesa-hbase-s3-on-aws.html
AuthorName: Commonwealth Computer Research, Inc.
AuthorURL: https://www.ccri.com
- Title: Creating PySpark DataFrame from CSV in AWS S3 in EMR
URL: https://gist.github.com/jakechen/6955f2de51212163312b6430555b8e0b
AuthorName: Jake Chen
AuthorURL: https://github.com/jakechen
```
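
The categorized `DataAtWork` layout shown in this README hunk can be checked mechanically. Below is a minimal sketch, assuming the YAML has already been parsed into Python dicts and lists. The function name `validate_data_at_work` is hypothetical, and treating `Title`, `URL`, and `AuthorName` as mandatory is an assumption of this sketch (some entries in this commit omit `AuthorURL`, so it is treated as optional here).

```python
from __future__ import annotations

from typing import Any

# Category keys as they appear in the README template in this commit.
CATEGORIES = ("Tutorials", "Tools & Applications", "Publications")

# Assumption of this sketch: these three fields are mandatory and
# AuthorURL is optional. The commit itself does not state this.
REQUIRED_FIELDS = ("Title", "URL", "AuthorName")


def validate_data_at_work(section: Any) -> list[str]:
    """Return a list of problems found in a parsed DataAtWork value."""
    if section is None:
        # A bare "DataAtWork:" key with nothing under it parses as None.
        return []
    if not isinstance(section, dict):
        return ["DataAtWork must be a mapping of categories, not a flat list"]

    problems: list[str] = []
    for key in section:
        if key not in CATEGORIES:
            problems.append(f"unknown category: {key!r}")

    for category in CATEGORIES:
        entries = section.get(category)
        if entries is None:
            continue  # empty category, e.g. a bare "Publications:" key
        if not isinstance(entries, list):
            problems.append(f"{category}: expected a list of entries")
            continue
        for index, entry in enumerate(entries):
            if not isinstance(entry, dict):
                problems.append(f"{category}[{index}]: expected a mapping")
                continue
            for field in REQUIRED_FIELDS:
                if not entry.get(field):
                    problems.append(f"{category}[{index}]: missing {field}")
    return problems
```

A check like this reports any key outside the three category names, so a singular `Tutorial:` key would be flagged as an unknown category, and an old-style flat list would be rejected outright.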

## How can I contribute?
11 changes: 7 additions & 4 deletions datasets/1000-genomes.yaml
@@ -15,7 +15,10 @@ Resources:
Region: us-east-1
Type: S3 Bucket
DataAtWork:
- Title: Exploratory data analysis of genomic datasets using ADAM and Mango with Apache Spark on Amazon EMR
URL: https://aws.amazon.com/blogs/big-data/exploratory-data-analysis-of-genomic-datasets-using-adam-and-mango-with-apache-spark-on-amazon-emr/
AuthorName: Alyssa Morrow
AuthorURL: https://research.eecs.berkeley.edu/~akmorrow/
Tutorials:
Tools & Applications:
Publications:
- Title: Exploratory data analysis of genomic datasets using ADAM and Mango with Apache Spark on Amazon EMR
URL: https://aws.amazon.com/blogs/big-data/exploratory-data-analysis-of-genomic-datasets-using-adam-and-mango-with-apache-spark-on-amazon-emr/
AuthorName: Alyssa Morrow
AuthorURL: https://research.eecs.berkeley.edu/~akmorrow/
4 changes: 4 additions & 0 deletions datasets/3kricegenome.yaml
@@ -16,3 +16,7 @@ Resources:
ARN: arn:aws:s3:::3kricegenome
Region: us-east-1
Type: S3 Bucket
DataAtWork:
Tutorials:
Tools & Applications:
Publications:
4 changes: 4 additions & 0 deletions datasets/990-spreadsheets.yaml
@@ -16,3 +16,7 @@ Resources:
ARN: arn:aws:s3:::irs-990-spreadsheets
Region: us-east-1
Type: S3 Bucket
DataAtWork:
Tutorials:
Tools & Applications:
Publications:
21 changes: 12 additions & 9 deletions datasets/afsis.yaml
@@ -1,6 +1,6 @@
Name: Africa Soil Information Service (AfSIS) Soil Chemistry
Description: |
This dataset contains soil infrared spectral data and paired soil property
This dataset contains soil infrared spectral data and paired soil property
reference measurements for georeferenced soil samples that were collected
through the Africa Soil Information Service (AfSIS) project, which lasted
from 2009 through 2018. In this release, we include data collected during
@@ -34,11 +34,14 @@ Resources:
Region: us-east-1
Type: S3 Bucket
DataAtWork:
- Title: AfSIS Soil Chemistry - Usage Tutorial
URL: https://github.com/qedsoftware/afsis-soil-chem-tutorial/
AuthorName: QED
AuthorURL: https://qed.ai
- Title: Goalkeepers 2018, Soil - The Big Data Beneath Your Feet
URL: https://www.youtube.com/watch?v=Fb9R0CnPMkc
AuthorName: QED
AuthorURL: https://qed.ai
Tutorials:
- Title: AfSIS Soil Chemistry - Usage Tutorial
URL: https://github.com/qedsoftware/afsis-soil-chem-tutorial/
AuthorName: QED
AuthorURL: https://qed.ai
Tools & Applications:
Publications:
- Title: Goalkeepers 2018, Soil - The Big Data Beneath Your Feet
URL: https://www.youtube.com/watch?v=Fb9R0CnPMkc
AuthorName: QED
AuthorURL: https://qed.ai
11 changes: 7 additions & 4 deletions datasets/allen-brain-observatory.yaml
@@ -23,7 +23,10 @@ Resources:
Region: us-west-2
Type: S3 Bucket
DataAtWork:
- Title: Use the Allen Brain Observatory – Visual Coding on AWS
URL: https://github.com/AllenInstitute/AllenSDK/wiki/Use-the-Allen-Brain-Observatory-%E2%80%93-Visual-Coding-on-AWS
AuthorName: Nika Keller, David Feng
AuthorURL: https://twitter.com/dyfbrain
Tutorials:
- Title: Use the Allen Brain Observatory – Visual Coding on AWS
URL: https://github.com/AllenInstitute/AllenSDK/wiki/Use-the-Allen-Brain-Observatory-%E2%80%93-Visual-Coding-on-AWS
AuthorName: Nika Keller, David Feng
AuthorURL: https://twitter.com/dyfbrain
Tools & Applications:
Publications:
19 changes: 11 additions & 8 deletions datasets/amazon-bin-imagery.yaml
@@ -15,11 +15,14 @@ Resources:
Region: us-east-1
Type: S3 Bucket
DataAtWork:
- Title: Amazon Bin Image Dataset Challenge
URL: https://github.com/silverbottlep/abid_challenge
AuthorName: silverbottlep
AuthorURL: https://github.com/silverbottlep
- Title: Amazon Inventory Reconciliation using AI
URL: https://github.com/OneNow/AI-Inventory-Reconciliation
AuthorName: Pablo Rodriguez Bertorello, Sravan Sripada, Nutchapol Dendumrongsup
AuthorURL: https://github.com/pablo-tech
Tutorials:
Tools & Applications:
Publications:
- Title: Amazon Bin Image Dataset Challenge
URL: https://github.com/silverbottlep/abid_challenge
AuthorName: silverbottlep
AuthorURL: https://github.com/silverbottlep
- Title: Amazon Inventory Reconciliation using AI
URL: https://github.com/OneNow/AI-Inventory-Reconciliation
AuthorName: Pablo Rodriguez Bertorello, Sravan Sripada, Nutchapol Dendumrongsup
AuthorURL: https://github.com/pablo-tech
27 changes: 15 additions & 12 deletions datasets/amazon-reviews.yaml
@@ -16,15 +16,18 @@ Resources:
Region: us-east-1
Type: S3 Bucket
DataAtWork:
- Title: How to scale sentiment analysis using Amazon Comprehend, AWS Glue and Amazon Athena
URL: https://aws.amazon.com/blogs/machine-learning/how-to-scale-sentiment-analysis-using-amazon-comprehend-aws-glue-and-amazon-athena/
AuthorName: Roy Hasson
AuthorURL: https://twitter.com/royhasson
- Title: Implementing a recommender system with Amazon SageMaker and Apache MXNet Gluon
URL: https://github.com/awslabs/amazon-sagemaker-examples/blob/master/introduction_to_applying_machine_learning/gluon_recommender_system/gluon_recommender_system.ipynb
AuthorName: David Arpin
AuthorURL: https://github.com/djarpin
- Title: Querying Review Data with Kognitio AWS Marketplace product using SQL
URL: https://www.linkedin.com/pulse/100-shades-grey-other-amazon-review-discoveries-mark-chopping/
AuthorName: Mark Chopping
AuthorURL: https://twitter.com/markchopping1
Tutorials:
- Title: How to scale sentiment analysis using Amazon Comprehend, AWS Glue and Amazon Athena
URL: https://aws.amazon.com/blogs/machine-learning/how-to-scale-sentiment-analysis-using-amazon-comprehend-aws-glue-and-amazon-athena/
AuthorName: Roy Hasson
AuthorURL: https://twitter.com/royhasson
- Title: Implementing a recommender system with Amazon SageMaker and Apache MXNet Gluon
URL: https://github.com/awslabs/amazon-sagemaker-examples/blob/master/introduction_to_applying_machine_learning/gluon_recommender_system/gluon_recommender_system.ipynb
AuthorName: David Arpin
AuthorURL: https://github.com/djarpin
- Title: Querying Review Data with Kognitio AWS Marketplace product using SQL
URL: https://www.linkedin.com/pulse/100-shades-grey-other-amazon-review-discoveries-mark-chopping/
AuthorName: Mark Chopping
AuthorURL: https://twitter.com/markchopping1
Tools & Applications:
Publications:
22 changes: 0 additions & 22 deletions datasets/america-ninja-warrior.yaml

This file was deleted.

11 changes: 7 additions & 4 deletions datasets/aws-igenomes.yaml
@@ -16,7 +16,10 @@ Resources:
Region: eu-west-1
Type: S3 Bucket
DataAtWork:
- Title: nf-core analysis pipelines
URL: http://nf-co.re/
AuthorName: Phil Ewels
AuthorURL: http://phil.ewels.co.uk/
Tutorials:
Tools & Applications:
- Title: nf-core analysis pipelines
URL: http://nf-co.re/
AuthorName: Phil Ewels
AuthorURL: http://phil.ewels.co.uk/
Publications:
11 changes: 7 additions & 4 deletions datasets/broad-references.yaml
@@ -18,7 +18,10 @@ Resources:
Region: us-east-1
Type: S3 Bucket
DataAtWork:
- Title: Genomics Workflows on AWS - Cromwell on AWS
URL: https://docs.opendata.aws/genomics-workflows/orchestration/cromwell/cromwell-examples/#real-world-example-haplotypecaller
AuthorName: W. Lee Pang
AuthorURL: https://www.linkedin.com/in/lee-pang-a039a26/
Tutorials:
Tools & Applications:
- Title: Genomics Workflows on AWS - Cromwell on AWS
URL: https://docs.opendata.aws/genomics-workflows/orchestration/cromwell/cromwell-examples/#real-world-example-haplotypecaller
AuthorName: W. Lee Pang
AuthorURL: https://www.linkedin.com/in/lee-pang-a039a26/
Publications:
81 changes: 42 additions & 39 deletions datasets/cbers.yaml
@@ -49,42 +49,45 @@ Resources:
Region: us-east-1
Type: SNS Topic
DataAtWork:
- Title: STAC V0.6 search endpoint for archive
URL: https://stac.amskepler.com/v06/stac/search
AuthorName: AMS Kepler
AuthorURL: https://github.com/fredliporace/cbers-2-stac
- Title: STAC V0.7 search endpoint for archive (beta)
URL: https://stac.amskepler.com/v07/stac/search
AuthorName: AMS Kepler
AuthorURL: https://github.com/fredliporace/cbers-2-stac
- Title: Remote Pixel Viewer
URL: https://viewer.remotepixel.ca
AuthorName: Remote Pixel
AuthorURL: http://remotepixel.ca/
- Title: EOS Land Viewer
URL: https://eos.com/landviewer/
AuthorName: Earth Observing System
AuthorURL: https://eos.com/
- Title: CBERS timelapse GIF generator
URL: https://github.com/fredliporace/cbersgif
AuthorName: Frederico Liporace
AuthorURL: https://github.com/fredliporace
- Title: aws-sat-api-py
URL: https://github.com/RemotePixel/aws-sat-api-py
AuthorName: Remote Pixel
AuthorURL: http://remotepixel.ca/
- Title: rio-tiler
URL: https://github.com/mapbox/rio-tiler
AuthorName: Mapbox
AuthorURL: https://www.mapbox.com/
- Title: cbers-tiler
URL: https://github.com/mapbox/cbers-tiler
AuthorName: Mapbox
AuthorURL: https://www.mapbox.com/
- Title: CBERS static STAC catalog served by stac-browser
URL: https://cbers.stac.cloud
AuthorName: Radiant Earth
AuthorURL: https://github.com/radiantearth/stac-browser
- Title: Keeping a SpatioTemporal Asset Catalog (STAC) Up To Date with SNS/SQS
URL: https://aws.amazon.com/blogs/publicsector/keeping-a-spatiotemporal-asset-catalog-stac-up-to-date-with-sns-sqs/
AuthorName: Frederico Liporace
Tutorials:
- Title: Keeping a SpatioTemporal Asset Catalog (STAC) Up To Date with SNS/SQS
URL: https://aws.amazon.com/blogs/publicsector/keeping-a-spatiotemporal-asset-catalog-stac-up-to-date-with-sns-sqs/
AuthorName: Frederico Liporace
Tools & Applications:
- Title: STAC V0.6 search endpoint for archive
URL: https://stac.amskepler.com/v06/stac/search
AuthorName: AMS Kepler
AuthorURL: https://github.com/fredliporace/cbers-2-stac
- Title: STAC V0.7 search endpoint for archive (beta)
URL: https://stac.amskepler.com/v07/stac/search
AuthorName: AMS Kepler
AuthorURL: https://github.com/fredliporace/cbers-2-stac
- Title: Remote Pixel Viewer
URL: https://viewer.remotepixel.ca
AuthorName: Remote Pixel
AuthorURL: http://remotepixel.ca/
- Title: EOS Land Viewer
URL: https://eos.com/landviewer/
AuthorName: Earth Observing System
AuthorURL: https://eos.com/
- Title: CBERS timelapse GIF generator
URL: https://github.com/fredliporace/cbersgif
AuthorName: Frederico Liporace
AuthorURL: https://github.com/fredliporace
- Title: aws-sat-api-py
URL: https://github.com/RemotePixel/aws-sat-api-py
AuthorName: Remote Pixel
AuthorURL: http://remotepixel.ca/
- Title: rio-tiler
URL: https://github.com/mapbox/rio-tiler
AuthorName: Mapbox
AuthorURL: https://www.mapbox.com/
- Title: cbers-tiler
URL: https://github.com/mapbox/cbers-tiler
AuthorName: Mapbox
AuthorURL: https://www.mapbox.com/
- Title: CBERS static STAC catalog served by stac-browser
URL: https://cbers.stac.cloud
AuthorName: Radiant Earth
AuthorURL: https://github.com/radiantearth/stac-browser
Publications:
11 changes: 7 additions & 4 deletions datasets/cell-painting-image-collection.yaml
@@ -31,7 +31,10 @@ Resources:
Region: us-east-1
Type: S3 Bucket
DataAtWork:
- Title: Example submission for the 2018 CytoData Hackathon (in R and Python)
URL: https://github.com/cytodata/cytodata-hackathon-2018/tree/master/cytodata-toolkit/
AuthorName: Juan Caicedo, Tim Becker
AuthorURL: broadinstitute.org
Tutorials:
Tools & Applications:
Publications:
- Title: Example submission for the 2018 CytoData Hackathon (in R and Python)
URL: https://github.com/cytodata/cytodata-hackathon-2018/tree/master/cytodata-toolkit/
AuthorName: Juan Caicedo, Tim Becker
AuthorURL: broadinstitute.org
11 changes: 7 additions & 4 deletions datasets/census-dataworld-pums.yaml
@@ -16,7 +16,10 @@ Resources:
Region: us-east-1
Type: S3 Bucket
DataAtWork:
- Title: Setting up Blazegraph on EC2
URL: https://docs.data.world/uscensus/#50---getting-started
AuthorName: data.world
AuthorURL: https://data.world/
Tutorials:
- Title: Setting up Blazegraph on EC2
URL: https://docs.data.world/uscensus/#50---getting-started
AuthorName: data.world
AuthorURL: https://data.world/
Tools & Applications:
Publications:
6 changes: 5 additions & 1 deletion datasets/cgiardata.yaml
@@ -14,4 +14,8 @@ Resources:
- Description: ARC GRID, and ARC ASCII format compressed
ARN: arn:aws:s3:::cgiardata
Region: us-west-2
Type: S3 Bucket
Type: S3 Bucket
DataAtWork:
Tutorials:
Tools & Applications:
Publications:
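
The per-file change in the hunks above follows one pattern: a flat `DataAtWork` list becomes a mapping with three category keys, and categories with no entries are left empty. A minimal sketch of that regrouping is below, assuming the entries are already parsed into Python dicts. The names `migrate_data_at_work` and `classify` are hypothetical; the commit does not say how each entry's category was chosen, so the sketch takes that decision as a caller-supplied function.

```python
from __future__ import annotations

from typing import Callable, Optional

# Category keys introduced by this commit.
CATEGORIES = ("Tutorials", "Tools & Applications", "Publications")

Entry = dict
Grouped = dict


def migrate_data_at_work(
    entries: Optional[list[Entry]],
    classify: Callable[[Entry], str],
) -> Grouped:
    """Group a flat DataAtWork list under the three category keys.

    `classify` is a hypothetical callback that returns one of CATEGORIES
    for an entry; how to categorize is left entirely to the caller.
    """
    grouped: dict[str, list[Entry]] = {category: [] for category in CATEGORIES}
    for entry in entries or []:
        category = classify(entry)
        if category not in grouped:
            raise ValueError(f"unknown category: {category!r}")
        grouped[category].append(entry)
    # Empty categories become None, which mirrors how a bare key such as
    # "Tutorials:" is read back by a YAML parser.
    return {category: items or None for category, items in grouped.items()}


# Usage with the single entry from datasets/aws-igenomes.yaml.
igenomes = [
    {
        "Title": "nf-core analysis pipelines",
        "URL": "http://nf-co.re/",
        "AuthorName": "Phil Ewels",
        "AuthorURL": "http://phil.ewels.co.uk/",
    }
]
result = migrate_data_at_work(igenomes, lambda entry: "Tools & Applications")
```

Passing `None` or an empty list yields all three keys with no entries, matching files such as `datasets/3kricegenome.yaml` where only the empty keys were added.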