Name Extraction with PHP Text Analysis
yooper edited this page Mar 2, 2017 · 1 revision
Name extraction is backed by an SQLite database built from 2010 Census data and surnames provided by the SSN department. The NameCorpus class provides several methods that help identify whether a token is a first name or a last name (surname).
To use this functionality, you must first run the following command:
php textconsole pta:package:install us_names
This command downloads and unpacks the database for you. From there, you can use the following calls to check whether a token is a known name and to look up its frequency data.
<?php
use TextAnalysis\Corpus\NameCorpus;
$corpus = new NameCorpus();
// returns a boolean: true if the name exists; the name is normalized to lower case internally
$corpus->isFirstName('Mike');
$corpus->isLastName('Williamson');
// returns a single record, but multiple records are available, because the underlying dataset has the frequency
// count of persons born with that name since 1915
$corpus->getFirstName('Mike');
// $lastName is an array of data that has additional frequency counts and population statistics associated to
// the given last name
$lastName = $corpus->getLastName('Williamson');
var_dump($lastName);
// takes the first and last tokens and checks if the 1st token is a 1st name
// and if the last token is a last name
$corpus->isFullName('Brad Von Williamson');
// get the raw pdo connection so you can issue your own sql statements
$pdo = $corpus->getPdo();
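
Since this page does not document the database schema, a reasonable first step before issuing your own SQL is to list the tables the connection exposes. The following is a minimal sketch, not part of the library: `listSqliteTables` is a hypothetical helper, and the in-memory database with its `example_names` table is a stand-in so that the snippet runs without the us_names package installed. It assumes the `pdo_sqlite` extension is enabled and relies only on SQLite's built-in `sqlite_master` table.

```php
<?php
// Hypothetical helper (not provided by NameCorpus): returns the names of
// all tables visible through a SQLite PDO connection.
function listSqliteTables(PDO $pdo): array
{
    // sqlite_master is SQLite's built-in schema table.
    $stmt = $pdo->query(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    );
    // FETCH_COLUMN returns a flat array holding the first column of each row.
    return $stmt->fetchAll(PDO::FETCH_COLUMN);
}

// Stand-in database for illustration only; the table name is made up.
// With the package installed, pass $corpus->getPdo() instead.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE example_names (name TEXT, freq INTEGER)');

print_r(listSqliteTables($pdo));
```

Once you know the table names, running `PRAGMA table_info(<table>)` through the same connection shows each table's columns, which should be enough to write your own queries against the real data.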