Read file on node in Vertica cluster and present result as table with string columns. Function is presenting external data as table in Vertica, without doing data loads.
Functionality is similar to external table, with CSV function having advantage of not needed to define external tables for each file. You can have million of CSV files, each with it's own columns, and one function will rule them all. CSV function produce varchar output columns, you need to convert to your data type.
select csv(<filename> [ using parameters <parameters> ] ) over();
Filename must be visible on node in Vertica cluster. If file is on client, it is not visible to Vertica cluster!
Parameters (case sensitive):
column_delimiter - column delimiter, default comma
row_separator - record separator, default line feed
column_names - line with list of column names delimited by column delimiter
column_count - number of columns, default 32
max_row_length - maximum length of row, default 4096
skip_row - number of rows to skip in file, default 0.
file_with_header - you can specify column names in separate file. Or, it can be in same file as data.
filename_column - Add column with this name and populate with data file name (no path)
path_column - Add column with this name and populate with path to data file
empty_as_null - empty column value ",," is treated as NULL. Default - it will be non-null empty string.
debug - 'y' or 'trace'. Will cause logs dumped on vertica node. Do not use.
Sample content of file:
cat /home/dbadmin/testdata/csvfile.txt
one,two,three
1,234-5,sdf
4,kdfds,sdfds
,dt,
c1,,r
sd,f,
sd,f,g,
SQL execution:
select csv('/home/dbadmin/testdata/csvfile.txt' using parameters file_with_header='/home/dbadmin/testdata/csvfile.txt', skip_rows=1, filename_column='y', path_column='y') over();
one | two | three | filename | path
-----+-------+-------+-------------+-------------------------
1 | 234-5 | sdf | csvfile.txt | /home/dbadmin/testdata/
4 | kdfds | sdfds | csvfile.txt | /home/dbadmin/testdata/
| dt | | csvfile.txt | /home/dbadmin/testdata/
c1 | | r | csvfile.txt | /home/dbadmin/testdata/
sd | f | | csvfile.txt | /home/dbadmin/testdata/
sd | f | g | csvfile.txt | /home/dbadmin/testdata/
(6 rows)