Naming Errors (Google Refine & FUSE)

kayebohemier edited this page Jun 7, 2011 · 1 revision

One of the problems discovered when testing Google Refine is how these objects have been named. The original list of targets (before consolidating some of the more obvious errors) contained 3,008 unique objects. This version contains 2,714 objects. Neither count takes into account the -BKGD measurements on the objects.

FUSE does not seem to have any standard naming conventions for its objects. Here are just some cursory examples:

HD5394_+3arcmin HD5394_-3arcmin (these two are fine --- the +/- 3 arcmin is actually significant)

MRK106 Mrk106

RX-AND RXAnd

VW-HYI VWHyi

SH2_174 Sh2-174

V405-AUR V405-Aur

MV-LYR MV-Lyr

SS-CYG SSCyg

ZAnd ZAND

Sk-67D111 SK-67D111

HS1700+6416 HS-1700+6416

VW-Hyi VW-HYI

V436-Car V436-CAR

WZSge WZ-SGE

SK-69D243 Sk-69D243

Mrk290 MRK290

HD269006 HD-269006

UX-UMA UX-UMa

AMHer AM-HER

V709-CAS V709-Cas

SYMus SY-MUS

TWHya TW-Hya

MK42 Mk42

... and as you can see, VW-Hyi and VW-HYI are actually the same object, yet they show up in two separate pairs (rows 20-21 and 44-45 of this text document). Google Refine is probably missing some duplicate object names no matter how good its recognition software is. In the third section, I copy/pasted the output directly.
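One way to cross-check for missed duplicates like this is to collapse each name to a normalized key and group the names that share it. The following is only a minimal sketch, not Google Refine's own clustering method. The helper names `name_key` and `group_variants` are hypothetical. The normalization rule (ignore case, and drop a `-` or `_` only when it sits directly between two letters or digits) is an assumption chosen so that the pairs listed above collapse while `HD5394_+3arcmin` and `HD5394_-3arcmin` stay distinct.

```python
import re
from collections import defaultdict

# Assumed rule: a '-' or '_' directly between two letters/digits is cosmetic.
# A sign that follows an underscore (as in "_+3arcmin" / "_-3arcmin") is kept.
_SEPARATOR = re.compile(r"(?<=[A-Z0-9])[-_](?=[A-Z0-9])")


def name_key(name):
    """Return a case- and separator-insensitive key for a target name."""
    return _SEPARATOR.sub("", name.strip().upper())


def group_variants(names):
    """Map each key to its spellings, keeping only keys with more than one."""
    groups = defaultdict(set)
    for name in names:
        groups[name_key(name)].add(name)
    return {key: sorted(variants)
            for key, variants in groups.items()
            if len(variants) > 1}


# A few names taken from the list above.
names = [
    "VW-HYI", "VWHyi", "VW-Hyi",
    "SH2_174", "Sh2-174",
    "HD5394_+3arcmin", "HD5394_-3arcmin",
]
duplicates = group_variants(names)
for key, variants in duplicates.items():
    print(key, variants)
```

With this rule, all three spellings of VW Hyi land in one group, while the two HD5394 offset pointings are left alone. A rule this blunt could still merge names that are genuinely different objects, so the groups are best treated as candidates to review by hand.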
