Put optional and required fields under a heading to make it clearer
mlandauer committed Nov 7, 2023
1 parent b2aadb5 commit 6092b39
Showing 1 changed file with 159 additions and 153 deletions.
312 changes: 159 additions & 153 deletions app/views/_tailwind/documentation/how_to_write_a_scraper.html.erb
@@ -2,7 +2,7 @@

<%# TODO: Format all the content to match the rest of the site %>
<%# TODO: Extract this prose block into a component %>
<div class="prose prose-2xl max-w-none prose-h2:font-display prose-h2:text-2xl prose-h2:font-bold prose-a:underline prose-a:font-bold prose-a:text-fuchsia hover:prose-a:text-fuchsia-darker active:prose-a:bg-sun-yellow prose-h1:text-4xl prose-h1:font-display prose-h1:font-bold">
<div class="prose prose-2xl prose-th:whitespace-nowrap max-w-none prose-h3:font-sans prose-h3:text-2xl prose-h3:font-bold prose-h2:font-display prose-h2:text-3xl prose-h2:font-bold prose-a:underline prose-a:font-bold prose-a:text-fuchsia hover:prose-a:text-fuchsia-darker active:prose-a:bg-sun-yellow prose-h1:text-4xl prose-h1:font-display prose-h1:font-bold">
<h1><%= yield :page_title %></h1>

<section>
@@ -72,158 +72,164 @@
<%= link_to "create a new scraper", "https://morph.io/scrapers/new" %>
that downloads and saves the following information:
</p>
<p>
<%# TODO: Add highlight to required? %>
The following fields are required. All development applications should have these bits of information.
</p>
<table>
<thead>
<tr>
<th>Field</th>
<th>Example value</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td>council_reference</td>
<td>TA/00323/2012</td>
<td>
<p>
The ID that the council has given the planning application. This also must be the unique key for this data set.
</p>
</td>
</tr>
<tr>
<td>address</td>
<td>1 Sowerby St, Goulburn, NSW</td>
<td>
<p>
The physical address that this application relates to. This will be geocoded, so it doesn't need to be in a specific
format, but the more explicit it is, the more likely it is to be geocoded successfully. If the original
address does not include the state (e.g. "QLD") at the end, add it.
</p>
</td>
</tr>
<tr>
<td>description</td>
<td>Ground floor alterations to rear and first floor addition</td>
<td>
<p>
A text description of what the planning application seeks to carry out.
</p>
</td>
</tr>
<tr>
<td>info_url</td>
<td>http://foo.gov.au/app?key=527230</td>
<td>
<p>
A URL that provides more information about the planning application.
</p>
<p>
This should be a persistent URL, preferably one that is specific to this particular application. In many cases councils force
users to click through a license before they can access planning applications. In that case, be careful about which URL you provide.
Test the link in a browser that hasn't established a session with the council's site to ensure that users of Planning Alerts
can follow it without being presented with an error.
</p>
</td>
</tr>
<tr>
<td>date_scraped</td>
<td>2012-08-01</td>
<td>
<p>
The date that your scraper is collecting this data (i.e. now). Should be in
<%= link_to "ISO 8601", "http://en.wikipedia.org/wiki/ISO_8601" %>
format.
</p>
<p>
Use the following Ruby code:
<code>Date.today.to_s</code>
</p>
</td>
</tr>
</tbody>
</table>
<p>
Note that a "comment_url" field
used to be required. It is no longer used, though you might
still see it referenced in older scrapers.
</p>
<p>
The following fields are optional because not every planning authority provides them. Please do include them if data is available.
</p>
<table>
<thead>
<tr>
<th>Field</th>
<th>Example value</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td>date_received</td>
<td>2012-06-23</td>
<td>
<p>
The date this application was received by council. Should be in
<%= link_to "ISO 8601", "http://en.wikipedia.org/wiki/ISO_8601" %>
format.
</p>
</td>
</tr>
<tr>
<td>on_notice_from</td>
<td>2012-08-01</td>
<td>
<p>
The date from when public submissions can be made about this application. Should be in
<%= link_to "ISO 8601", "http://en.wikipedia.org/wiki/ISO_8601" %>
format.
</p>
</td>
</tr>
<tr>
<td>on_notice_to</td>
<td>2012-08-14</td>
<td>
<p>
The date until when public submissions can be made about this application. Should be in
<%= link_to "ISO 8601", "http://en.wikipedia.org/wiki/ISO_8601" %>
format.
</p>
</td>
</tr>
<tr>
<td>comment_email</td>
<td>[email protected]</td>
<td>
<p>
Only set this in
<strong>extremely unusual</strong>
situations. It allows comments on each application within a single
planning authority to go to a different email address. It should not be set for 99.9%
of authorities, as they use a single email address for all comments. Currently it is only
used for the SA Planning Portal, where comments are ideally sent back to the originating
local council so that state government staff don't have to redirect them by hand.
</p>
</td>
</tr>
<tr>
<td>comment_authority</td>
<td>Acme Council</td>
<td>
<p>
Only set this in
<strong>extremely unusual</strong>
situations. Give the name associated with the comment_email address.
</p>
</td>
</tr>
</tbody>
</table>
<section>
<h3>Required fields</h3>
<p>
<%# TODO: Add highlight to required? %>
The following fields are required. All development applications should have these bits of information.
</p>
<table>
<thead>
<tr>
<th>Field</th>
<th>Example value</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td>council_reference</td>
<td>TA/00323/2012</td>
<td>
<p>
The ID that the council has given the planning application. This also must be the unique key for this data set.
</p>
</td>
</tr>
<tr>
<td>address</td>
<td>1 Sowerby St, Goulburn, NSW</td>
<td>
<p>
The physical address that this application relates to. This will be geocoded, so it doesn't need to be in a specific
format, but the more explicit it is, the more likely it is to be geocoded successfully. If the original
address does not include the state (e.g. "QLD") at the end, add it.
</p>
</td>
</tr>
<tr>
<td>description</td>
<td>Ground floor alterations to rear and first floor addition</td>
<td>
<p>
A text description of what the planning application seeks to carry out.
</p>
</td>
</tr>
<tr>
<td>info_url</td>
<td>http://foo.gov.au/app?key=527230</td>
<td>
<p>
A URL that provides more information about the planning application.
</p>
<p>
This should be a persistent URL, preferably one that is specific to this particular application. In many cases councils force
users to click through a license before they can access planning applications. In that case, be careful about which URL you provide.
Test the link in a browser that hasn't established a session with the council's site to ensure that users of Planning Alerts
can follow it without being presented with an error.
</p>
</td>
</tr>
<tr>
<td>date_scraped</td>
<td>2012-08-01</td>
<td>
<p>
The date that your scraper is collecting this data (i.e. now). Should be in
<%= link_to "ISO 8601", "http://en.wikipedia.org/wiki/ISO_8601" %>
format.
</p>
<p>
Use the following Ruby code:
<code>Date.today.to_s</code>
</p>
</td>
</tr>
</tbody>
</table>
<p>
Note that a "comment_url" field
used to be required. It is no longer used, though you might
still see it referenced in older scrapers.
</p>
</section>
<section>
<h3>Optional fields</h3>
<p>
The following fields are optional because not every planning authority provides them. Please do include them if data is available.
</p>
<table>
<thead>
<tr>
<th>Field</th>
<th>Example value</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td>date_received</td>
<td>2012-06-23</td>
<td>
<p>
The date this application was received by council. Should be in
<%= link_to "ISO 8601", "http://en.wikipedia.org/wiki/ISO_8601" %>
format.
</p>
</td>
</tr>
<tr>
<td>on_notice_from</td>
<td>2012-08-01</td>
<td>
<p>
The date from when public submissions can be made about this application. Should be in
<%= link_to "ISO 8601", "http://en.wikipedia.org/wiki/ISO_8601" %>
format.
</p>
</td>
</tr>
<tr>
<td>on_notice_to</td>
<td>2012-08-14</td>
<td>
<p>
The date until when public submissions can be made about this application. Should be in
<%= link_to "ISO 8601", "http://en.wikipedia.org/wiki/ISO_8601" %>
format.
</p>
</td>
</tr>
<tr>
<td>comment_email</td>
<td>[email protected]</td>
<td>
<p>
Only set this in
<strong>extremely unusual</strong>
situations. It allows comments on each application within a single
planning authority to go to a different email address. It should not be set for 99.9%
of authorities, as they use a single email address for all comments. Currently it is only
used for the SA Planning Portal, where comments are ideally sent back to the originating
local council so that state government staff don't have to redirect them by hand.
</p>
</td>
</tr>
<tr>
<td>comment_authority</td>
<td>Acme Council</td>
<td>
<p>
Only set this in
<strong>extremely unusual</strong>
situations. Give the name associated with the comment_email address.
</p>
</td>
</tr>
</tbody>
</table>
</section>
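<p>
As a rough illustration of the two tables above, the following Ruby sketch builds one record and checks it before saving.
It assumes a record is a plain hash keyed by field name. The names <code>REQUIRED_FIELDS</code>, <code>DATE_FIELDS</code> and
<code>record_problems</code> are hypothetical and exist only for this example. They are not part of any Planning Alerts or morph.io API.
</p>

```ruby
require "date"

# Field names copied from the "Required fields" table.
REQUIRED_FIELDS = %w[council_reference address description info_url date_scraped].freeze

# Fields from both tables that should hold ISO 8601 dates when present.
DATE_FIELDS = %w[date_scraped date_received on_notice_from on_notice_to].freeze

# Hypothetical helper: returns a list of problems found in a record
# (an empty list means the record passes these basic checks).
def record_problems(record)
  problems = []

  # Every required field must be present and non-blank.
  REQUIRED_FIELDS.each do |field|
    value = record[field]
    problems << "#{field} is missing" if value.nil? || value.to_s.strip.empty?
  end

  # Date fields are only checked when a value has been supplied.
  DATE_FIELDS.each do |field|
    value = record[field]
    next if value.nil? || value.to_s.strip.empty?

    begin
      Date.iso8601(value.to_s)
    rescue ArgumentError
      problems << "#{field} is not an ISO 8601 date"
    end
  end

  problems
end

# Example values taken from the tables above.
record = {
  "council_reference" => "TA/00323/2012",
  "address" => "1 Sowerby St, Goulburn, NSW",
  "description" => "Ground floor alterations to rear and first floor addition",
  "info_url" => "http://foo.gov.au/app?key=527230",
  "date_scraped" => Date.today.to_s,
  "date_received" => "2012-06-23"
}

puts record_problems(record).inspect
```

<p>
Optional fields are simply left out of the hash when the planning authority does not publish them, so only the values that are present get checked.
</p>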
</section>

<section>