-
Notifications
You must be signed in to change notification settings - Fork 0
/
index.html
355 lines (344 loc) · 20.3 KB
/
index.html
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
<!DOCTYPE HTML>
<!--
Tessellate by HTML5 UP
html5up.net | @n33co
Free for personal and commercial use under the CCA 3.0 license (html5up.net/license)
-->
<html>
<head>
<title>Monster Mashup: UFOs of Connecticut</title>
<meta charset="utf-8" />
<meta name="viewport" content="width=device-width, initial-scale=1" />
<!--[if lte IE 8]><script src="assets/js/ie/html5shiv.js"></script><![endif]-->
<link rel="stylesheet" href="assets/css/main.css" />
<!--[if lte IE 8]><link rel="stylesheet" href="assets/css/ie8.css" /><![endif]-->
<!--[if lte IE 9]><link rel="stylesheet" href="assets/css/ie9.css" /><![endif]-->
</head>
<body>
<!-- Header -->
<section id="header" class="dark">
<header>
<h1><b>UFOs of Connecticut</b></h1>
<p><b>An exploration of geospatial visualization using Halloween-themed data.</b></p>
</header>
<footer>
<a href="#first" class="button scrolly">Let's get started!</a>
</footer>
</section>
<!-- First -->
<section id="first" class="main">
<header>
<div class="container">
<h2>About the data</h2>
<p>The National UFO Reporting Center is managed by Peter Davenport and Christian Stepien. It's a portal for individuals to report unidentified flying objects. All of the data input is free text, with little guidance given to control the fields.</p>
<p>Why do I mention this? The data is messy. <em>So, so messy.</em> Individuals typing up UFO sightings reports will often mistakenly identify cities, states, and other information about the events. The time duration is not standardized, so the data cannot submit to duration analysis without painstaking cleaning.</p>
<p>I went to the web site and grabbed the data about Connecticut. This guide will walk you through what I did to get it from that original form to something that I could take and mash up with other datasets in CartoDB and in ArcGIS online (pending: kartograph for Python). The data set is not 100% clean – there are still many things that I could change if I wanted to actually do data analysis on this.</p>
</div>
</header>
<div class="content dark style1 featured">
<div class="container">
<div class="row">
<div class="4u 12u(narrow)">
<section>
<span class="feature-icon"><span class="icon fa-clock-o"></span></span>
<header>
<h3>Cleaning Data</h3>
</header>
<p>We will start by cleaning the data, which can be copied and pasted into Excel. The 1000+ records needed a lot of care: The line breaks didn't always match up the way I needed. If one were to actually do analysis on this data, town and city names would need to be checked more thoroughly to ensure that there were no data entry errors. In addition, the non-standard "duration" field would need to be modified so we could get quantitative information out.</p>
</section>
</div>
<div class="4u 12u(narrow)">
<section>
<span class="feature-icon"><span class="icon fa-bolt"></span></span>
<header>
<h3>Geocoding Data</h3>
</header>
<p>OpenRefine allows for geocoding using several services, including the Google Maps API and MapQuest's association with the OpenStreetMap data. This can be a bit tricky, but I will explain more below!</p>
</section>
</div>
<div class="4u 12u(narrow)">
<section>
<span class="feature-icon"><span class="icon fa-cloud"></span></span>
<header>
<h3>Showing Data</h3>
</header>
<p>For Monster Mashup, I am showing data in CartoDB and in ArcGIS Online. CartoDB has a lot of visualization tools that can be applied to the data. ArcGIS Online allows you to search for other data sets to add to the map. We can look for things like flight paths, swamps, or military bases in ArcMap to see if there are any correlations. Of course, desktop-based GIS software would give us better control!</p>
</section>
</div>
</div>
<div class="row">
<div class="12u">
<footer>
<a href="#second" class="button scrolly">Let's talk about this in more detail ...</a>
</footer>
</div>
</div>
</div>
</div>
</section>
<!-- Second -->
<section id="second" class="main">
<header>
<div class="container">
<h2>Step One: Grab Data</h2>
<p>To pull down data, navigate to <a href="http://www.nuforc.org/">the National UFO Reporting Center</a> and select a state of interest. We will select Connecticut because this workshop is happening in New Haven, and it's interesting to see.</p>
<p>The easiest way to clean the data is to import it into Excel, use text-to-columns on the Data tab to get the data to look good, and spot-check the data for irregularities. When I did this for Connecticut, it took me about 30-40 minutes to do basic data cleaning for 1200 rows.</p>
<p>Had I wanted to do more, I could:
<ul>
<li>Standardize the way duration is coded in the Excel file to make it machine-readable.</li>
<li>Check city names to ensure that they all match locations that are actually in Connecticut.</li>
</ul>
And I <em>would have done that</em> if I needed to get this data analysis-ready. Los Angeles, CT, for example, does not exist. You'll notice that I never did, and that's because it's easier for me to teach from dirty data than from completely clean data.</p>
<h2>Step Two: Geocode Address Information</h2>
<p>Once you have edited the data in Excel, save it as a .csv, .tsv, or Excel workbook. I chose to save this data as an Excel workbook because OpenRefine can read Excel files, and the commas and oddities in the data set makes it risky to use commas or tabs as delimiters.</p>
<p>To make the data readable for geoprocessing, we need to combine the city and state fields. Select the drop-down chevron on the State Column, Choose Edit > Add Column Based on This Column, and use the formula <code>cells["City"].value + ", " + cells["State"].value</code> to combine the cities and states.</p>
<p>Once we have that, we can make a call to an API for geocoding. If you use the Google Maps API, you can only put your map in Google Maps — and we're not going to do that. For our purposes, we can use the OpenStreetMap MapQuest API. You will need to apply for a developer's license on the MapQuest API web site to receive a unique user key, and they will allow several thousand unique geocoding calls per month. To avoid maxing out the quota, I recommend testing on sample data sets.</p>
<p>Select the drop-down chevron for the combined City-State column and add a column based on this column. This is the command needed for geocoding: <code>'http://open.mapquestapi.com/nominatim/v1/search.php?' + 'key=INSERTKEYHERE&format=json&' + 'q=' + escape(value, 'url')</code></p> — because it's an API call, you will need to add a time delay of 100-200ms between each call so that you are not throttled. This is an API best practice.
<p>A lot of stuff will come back at you. We just want the latitudes and longitudes. "Add Column Based on This Column" is what we will use on the API-called field we made — and we will use it twice, once for latitude, and once for longitude. This will pull the value for lat or long out of the first record that the JSON query brought back. (And a side note: This might not be what we wanted! You will see when we try to map this data. Remember, we just wanted CT data points!) These are the two piece of code you will use to make a column for latitude and a column for longitude: <code>value.parseJson()[0].lat</code><code>value.parseJson()[0].lon</code></p>
<p>After that is done, collapse the column for the JSON gibberish and export the non-collapsed columns. Now we have data that we can pull into mapping tools!</p>
<p><img src="images/Screenshots/OpenRefine-data.png" width=90%/></p>
<h2>Step Three: Use CartoDB to Visualize the Data</h2>
<p>Map it! You can make some really interesting visualizations in places like CartoDB — just sign up for an account. Here is some of the stuff that I did. Note that there are addresses represented that are not exactly in Connecticut. This happened in part because there are some towns in the data set (e.g., Los Angeles, CT) that don't exist. In other cases, the first result in the JSON query when we geocoded was not a location within Connecticut. CartoDB gives you the opportunity to create heat maps, but we can also classify the points based on image type to see what the distribution of UFO features might be.</p>
<p><img src="images/Screenshots/CartoDB-connecticut-obvious-data-errors.png" width=90%/></p>
<p><img src="images/Screenshots/CartoDB-connecticut-closeup.png" width=90%/></p>
<p><img src="images/Screenshots/CartoDB-classification-alt.png" width=90%/></p>
<p>I have also made an interactive map on CartoDB that you can look at. It is animated — we can see when the sightings happened over time! <a href="https://kayebohemier.cartodb.com/viz/66a70546-7cd4-11e5-b9b6-0e31c9be1b51/map">See the map here.</a></p>
<h2>Step Four: Use ArcGIS Online to Mash Up<br/>and Visualize the Data</h2>
<p>ArcGIS Online (for which you will need an account) will let you import up to a certain number of records. To compare data after uploading files to ArcGIS Online, you can find data sets that have been made available on the platform. For UFO data, we can be creative: I decided to pull down data about wetlands (e.g., swamp gas?), flight paths, and US military bases to see if there was a correlation. Of course, to do a proper analysis, this would need to go into ArcMap, where we could analyze the data points.</p>
<p><img src="images/Screenshots/ArcGIS-online-mashup.png" width=60%/></p>
</div>
</header>
<div class="content dark style2">
<div class="container">
<div class="row">
<div class="4u 12u(narrow)">
<section>
<h3>The Images</h3>
<p>On the right, I have collected the images generated from the Monster Mashup UFO section. Clicking on any one of them will make the images larger.</p>
<footer>
<a href="#third" class="button scrolly">To the next section ...</a>
</footer>
</section>
</div>
<div class="8u 12u(narrow)">
<div class="row">
<div class="6u"><a href="images/Screenshots/OpenRefine-data.png" class="image fit"><img src="images/Screenshots/OpenRefine-data.png" alt="" /></a></div>
<div class="6u"><a href="images/Screenshots/CartoDB-connecticut-obvious-data-errors.png" class="image fit"><img src="images/Screenshots/CartoDB-connecticut-obvious-data-errors.png" alt="" /></a></div>
<div class="6u"><a href="images/Screenshots/CartoDB-connecticut-closeup.png" class="image fit"><img src="images/Screenshots/CartoDB-connecticut-closeup.png" alt="" /></a></div>
<div class="6u"><a href="images/Screenshots/CartoDB-classification-alt.png" class="image fit"><img src="images/Screenshots/CartoDB-classification-alt.png" alt="" /></a></div>
<div class="6u"><a href="images/Screenshots/CartoDB-classification.png" class="image fit"><img src="images/Screenshots/CartoDB-classification.png" alt="" /></a></div>
<div class="6u"><a href="images/Screenshots/ArcGIS-online-mashup.png" class="image fit"><img src="images/Screenshots/ArcGIS-online-mashup.png" alt="" /></a></div>
</div>
</div>
</div>
</div>
</div>
</section>
<!-- Third -->
<section id="third" class="main">
<header>
<div class="container">
<h2>Where To Get Started</h2>
<p>We have spent a lot of time looking at the data we used, and there are many resources that you should be aware of if you try this analysis yourself.</p>
<p><a href="https://www.arcgis.com/home/">ArcGIS Online: Making maps and using data in a ready-to-use, cloud-based system.</a></p>
<p><a href="https://cartodb.com/">CartoDB: A way to map and analyze location-based data.</a></p>
<p><a href="http://www.nuforc.org/">Davenport, Peter, and Stepien, Christian. The National UFO Reporting Center. Accessed 20 October 2015. </a></p>
<p><a href="https://opensas.wordpress.com/2013/06/30/using-openrefine-to-geocode-your-data-using-google-and-openstreetmap-api/">Opensas blog. "Using OpenRefine to geocode your data with Google and OpenStreetMap API."</a></p>
<p><a href="http://openrefine.org/">Open Refine: A free, open-source, and powerful tool for working with messy data.</a></p>
<p>Verborgh, Ruben. (2013). Using OpenRefine: the essential OpenRefine guide that takes you from data analysis and error fixing to linking your dataset to the Web. Birmingham, England: Packt Publishing. <a href="http://search.library.yale.edu/catalog/12263462">Online.</a></p>
</div>
</header>
</div>
</section>
<!-- Basic Elements -->
<!--
<section class="main">
<header>
<div class="container">
<h2>A Whole Lotta Elements</h2>
<p>General purpose elements for every general purpose. Or something like that.</p>
</div>
</header>
<div class="content style1 dark">
<div class="container">
<section>
<header>
<h3>Paragraph</h3>
<p>This is a byline</p>
</header>
<p>Phasellus nisl nisl, varius id <sup>porttitor sed pellentesque</sup> ac orci. Pellentesque
habitant <strong>strong</strong> tristique <b>bold</b> et netus <i>italic</i> malesuada <em>emphasized</em> ac turpis egestas. Morbi
leo suscipit ut. Praesent <sub>id turpis vitae</sub> turpis pretium ultricies. Vestibulum sit
amet risus elit.</p>
</section>
<section>
<header>
<h3>Blockquote</h3>
</header>
<blockquote>Fringilla nisl. Donec accumsan interdum nisi, quis tincidunt felis sagittis eget.
tempus euismod. Vestibulum ante ipsum primis in faucibus. Cras sit amet urna eros, id egestas
tempus ante ipsum primis in faucibus orci luctus et ultrices.</blockquote>
</section>
<section>
<header>
<h3>Divider</h3>
</header>
<p>Donec consectetur <a href="#">vestibulum dolor et pulvinar</a>. Etiam vel felis enim, at viverra
ligula. Ut porttitor sagittis lorem, quis eleifend nisi ornare vel. Praesent nec orci
facilisis leo magna. Cras sit amet urna eros, id egestas urna. Quisque aliquam
tempus euismod. Vestibulum ante ipsum primis in faucibus orci luctus et ultrices
posuere cubilia.</p>
<hr />
<p>Donec consectetur vestibulum dolor et pulvinar. Etiam vel felis enim, at viverra
ligula. Ut porttitor sagittis lorem, quis eleifend nisi ornare vel. Praesent nec orci
facilisis leo magna. Cras sit amet urna eros, id egestas urna. Quisque aliquam
tempus euismod. Vestibulum ante ipsum primis in faucibus orci luctus et ultrices
posuere cubilia.</p>
</section>
<section>
<header>
<h3>Unordered List</h3>
</header>
<ul class="default">
<li>Donec consectetur vestibulum dolor et vel felis enim at viverra ligula. Ut porttitor sagittis lorem.</li>
<li>Donec consectetur vestibulum dolor et vel felis enim at viverra ligula. Ut porttitor sagittis lorem.</li>
<li>Donec consectetur vestibulum dolor et vel felis enim at viverra ligula. Ut porttitor sagittis lorem.</li>
<li>Donec consectetur vestibulum dolor et vel felis enim at viverra ligula. Ut porttitor sagittis lorem.</li>
</ul>
</section>
<section>
<header>
<h3>Ordered List</h3>
</header>
<ol class="default">
<li>Donec consectetur vestibulum dolor et vel felis enim at viverra ligula. Ut porttitor sagittis lorem.</li>
<li>Donec consectetur vestibulum dolor et vel felis enim at viverra ligula. Ut porttitor sagittis lorem.</li>
<li>Donec consectetur vestibulum dolor et vel felis enim at viverra ligula. Ut porttitor sagittis lorem.</li>
<li>Donec consectetur vestibulum dolor et vel felis enim at viverra ligula. Ut porttitor sagittis lorem.</li>
</ol>
</section>
<section>
<header>
<h3>Table</h3>
</header>
<div class="table-wrapper">
<table class="default">
<thead>
<tr>
<th>ID</th>
<th>Name</th>
<th>Description</th>
<th>Price</th>
</tr>
</thead>
<tbody>
<tr>
<td>00001</td>
<td>Lorem ipsum dolor</td>
<td>Ut porttitor sagittis lorem quis nisi ornare.</td>
<td>29.99</td>
</tr>
<tr>
<td>00002</td>
<td>Sit amet nullam</td>
<td>Ut porttitor sagittis lorem quis nisi ornare.</td>
<td>19.99</td>
</tr>
<tr>
<td>00003</td>
<td>Feugiat felis viverra</td>
<td>Ut porttitor sagittis lorem quis nisi ornare.</td>
<td>29.99</td>
</tr>
<tr>
<td>00004</td>
<td>Sagittis enim felis</td>
<td>Ut porttitor sagittis lorem quis nisi ornare.</td>
<td>19.99</td>
</tr>
<tr>
<td>00005</td>
<td>Nullam sed vestibulum</td>
<td>Ut porttitor sagittis lorem quis nisi ornare.</td>
<td>19.99</td>
</tr>
</tbody>
<tfoot>
<tr>
<td colspan="3"></td>
<td>100.00</td>
</tr>
</tfoot>
</table>
</div>
</section>
<section>
<header>
<h3>Form</h3>
</header>
<form method="post" action="#">
<div class="row 50%">
<div class="6u 12u(narrow)">
<input type="text" name="name" id="name" value="" placeholder="John Doe" />
</div>
<div class="6u 12u(narrow)">
<input type="text" name="email" id="email" value="" placeholder="[email protected]" />
</div>
</div>
<div class="row 50%">
<div class="12u">
<div class="select" tabindex="-1">
<select name="department" id="department">
<option value="">Choose a department</option>
<option value="1">Manufacturing</option>
<option value="2">Administration</option>
<option value="3">Support</option>
</select>
</div>
</div>
</div>
<div class="row 50%">
<div class="12u">
<input type="text" name="subject" id="subject" value="" placeholder="Enter your subject" />
</div>
</div>
<div class="row 50%">
<div class="12u">
<textarea name="message" id="message" placeholder="Enter your message"></textarea>
</div>
</div>
<div class="row">
<div class="12u">
<ul class="actions">
<li><input type="submit" class="button" value="Send Message" /></li>
<li><input type="reset" class="button alt" value="Clear Form" /></li>
</ul>
</div>
</div>
</form>
</section>
</div>
</div>
</section>
-->
<!-- Footer -->
<section id="footer">
<ul class="icons">
<li><a href="http://twitter.com/kayeatyalelib" class="icon fa-twitter"><span class="label">Twitter</span></a></li>
<li><a href="https://github.com/kayebohemier" class="icon fa-github"><span class="label">GitHub</span></a></li>
</ul>
<div class="copyright">
<ul class="menu">
<li>© Monster Mashup: A Data Smashup - UFO Data. All rights reserved.</li><li>Design: <a href="http://html5up.net">HTML5 UP</a></li>
</ul>
</div>
</section>
<!-- Scripts -->
<script src="assets/js/jquery.min.js"></script>
<script src="assets/js/jquery.scrolly.min.js"></script>
<script src="assets/js/skel.min.js"></script>
<script src="assets/js/util.js"></script>
<!--[if lte IE 8]><script src="assets/js/ie/respond.min.js"></script><![endif]-->
<script src="assets/js/main.js"></script>
</body>
</html>