TSearch Address Dictionaries

Motivation

Full-text indexing of address data doesn't work well with the default language configurations. With the 'english' configuration, street names get stemmed by the English dictionary. The 'simple' configuration is better but still not optimal: it has no handling of obvious abbreviations, so you might fail to match "12 N Oak st" to "12 North Oak Street", or vice versa.

This extension provides an 'addressing_en' configuration that handles common (Canada Post enumerated) street type abbreviations, direction tokens (n, s, e, w, etc.) and numeric street variants (first/1st). There are sure to be piles more abbreviations, as well as many cases that cannot be handled with a simple full-text tokenizing strategy (st = street or saint?). However, using full-text features for basic geocoding is too convenient to ignore.

Using full-text indexing for nationwide address data might still be a bad idea, since there are so many possible false or multiple matches, but for a single city, county or even state, this approach can generate a quick'n'dirty address lookup routine, perfect for tying to an autocomplete form field on a web page.
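As a rough sketch of such a lookup routine, assuming the extension is already installed and a hypothetical table named addresses with a text column named address (neither is part of this extension), an expression index plus a prefix query might look like this:

```sql
-- Hypothetical table: the name and columns are assumptions for illustration.
CREATE TABLE addresses (
    id      serial PRIMARY KEY,
    address text NOT NULL
);

-- Expression index so full-text matches do not have to rescan every row.
-- The query below must use the same expression for the index to apply.
CREATE INDEX addresses_address_fts_idx
    ON addresses
 USING gin (to_tsvector('addressing_en', address));

-- Autocomplete-style lookup: every term must match, and the last term
-- is a prefix match (:*) because the user may still be typing it.
SELECT id, address
  FROM addresses
 WHERE to_tsvector('addressing_en', address)
       @@ to_tsquery('addressing_en', '12 & oa:*')
 LIMIT 10;
```

The query string here is written by hand. A real form handler would need to build it from user input and escape it, which is outside the scope of this sketch.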

Currently there is only support for English (really, North American English) addressing -- pull requests gratefully received.

Installation

Clone the repository, ensure that pg_config is on your PATH, and run make install to copy the dictionary files into place.
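One way to check that the files landed where the server looks for them is to query the standard pg_available_extensions view from psql. This is only a sanity check, not part of the extension itself:

```sql
-- Lists the extension if its control file is visible to the server.
-- installed_version stays NULL until CREATE EXTENSION has been run
-- in the current database.
SELECT name, default_version, installed_version
  FROM pg_available_extensions
 WHERE name = 'addressing_dictionary';
```

If the query returns no rows, the pg_config on your PATH probably belongs to a different PostgreSQL installation than the server you are connected to.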

Examples

> CREATE EXTENSION addressing_dictionary;

> SELECT to_tsvector('addressing_en', '1234 n main st');

                 to_tsvector                  
----------------------------------------------
 '1234':1 'main':4 'n':2 'north':3 'street':5
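To try the abbreviation case from the Motivation section, a sketch like the following compares the two spellings and then uses the built-in ts_debug function to show which dictionary handled each token. The result is not shown here because it depends on the installed version of the dictionaries:

```sql
-- Does the abbreviated form match the spelled-out form?
SELECT to_tsvector('addressing_en', '12 North Oak Street')
       @@ plainto_tsquery('addressing_en', '12 N Oak st') AS matches;

-- Inspect how each token is classified and which lexemes it produces.
SELECT alias, token, dictionaries, lexemes
  FROM ts_debug('addressing_en', '12 N Oak st');
```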

Caveats

In v1.1, cardinal directions are tokenized as both the single-letter abbreviation and the full word, to avoid ambiguity. In the example above, 'n' produces both 'n' and 'north'.
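To see which version is installed in a database, the pg_extension catalog can be queried. The repository includes a 1.0 to 1.1 upgrade script, so an existing 1.0 install should be upgradable with ALTER EXTENSION. Treat this as a sketch and test it on a copy first:

```sql
-- Which version of the extension is installed in this database?
SELECT extversion
  FROM pg_extension
 WHERE extname = 'addressing_dictionary';

-- Move an existing 1.0 install to 1.1 (relies on the shipped upgrade script).
ALTER EXTENSION addressing_dictionary UPDATE TO '1.1';
```

Any tsvector columns or expression indexes built under the old version would likely need to be rebuilt (for example with REINDEX) so that stored lexemes match the new tokenization.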

