
Commit bb1dd81

Allen Maxwell committed
update README and add import files
1 parent 5c519b4 commit bb1dd81

File tree

5 files changed: +57 -1 lines changed


.ruby-gemset

+1
@@ -0,0 +1 @@
urug_csv

.ruby-version

+1
@@ -0,0 +1 @@
ruby-2.1.0

README.md

+53-1
@@ -1,4 +1,56 @@
csv_coding_exercise
===================

Basic files for the SL URUG meeting on 9/23/2014, plus a quick discussion of CSV files in general, FasterCSV, and its usage in Ruby.

In Ruby 1.9, the CSV standard library was replaced with a library called FasterCSV, authored by James Edward Gray II.

Traditionally, CSV stood for 'comma separated values', but it has become more accurate to call them 'character separated value' files, because many applications want to use alternate separators (often the pipe '|' or other characters).
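
A minimal sketch of reading with an alternate separator, assuming the standard CSV library and a made-up file name:

    require 'csv'

    # :col_sep tells the parser which delimiter the file uses (pipe here).
    CSV.foreach('pipe_delimited_file.txt', :col_sep => '|') do |row|
      puts row.inspect  # each row comes back as an array of fields
    end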

Most CSV files use the first row to label the columns below, though that isn't strictly required. It's a nice convenience, and with FasterCSV it allows you to reference the columns by name as they are read, rather than by position. This also allows files to be consumed without any problems when the columns are not in a set order from delivery to delivery.
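
As a quick illustration of that (the data below is made up), header-based access works the same no matter which order the columns arrive in:

    require 'csv'

    # Two deliveries with the same columns in different orders
    first  = "product_id,price\n1,2400\n"
    second = "price,product_id\n2400,1\n"

    [first, second].each do |data|
      CSV.parse(data, :headers => true).each do |row|
        puts row['price']  # header lookup works regardless of column position
      end
    end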

Ruby has great built-in string processing methods. If the CSV file is not malformed in any way and doesn't have any quoted fields, then parsing an entry is simply

    data.split(',')

The problem arises when we want to quote fields to allow for embedded commas in a comma-separated file, or other embedded delimiter characters. That's where it gets a little dicey, and the library makes life dramatically easier.
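
To make that concrete (the line below is loosely based on the sample product file), a naive split breaks as soon as a quoted field contains the delimiter, while the library's parser handles it:

    require 'csv'

    line = '6,"All in One Printer, Scanner, Fax",225'

    line.split(',')       # => ["6", "\"All in One Printer", " Scanner", " Fax\"", "225"]
    CSV.parse_line(line)  # => ["6", "All in One Printer, Scanner, Fax", "225"]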

The basic pattern I've used for reading CSV files is below:

    require 'csv'

    CSV.foreach('myfilename', :headers => :first_row) do |row|
      # skip blank and commented rows
      next if row.nil? || row.empty? || row[0].to_s.start_with?('#')

      # convert to hash to be able to read by column name
      hash = row.to_hash

      # process row or hash contents from file
      # ... do stuff here ...
    end

Basic things to discuss and watch out for:

* Quoted fields - use quotation marks or other characters to quote fields so you can embed the delimiter character in the data being read.
* Non-UTF8 file encodings - I've run into this and haven't found a great solution, so I'm interested in discussing it. Sometimes wide chars can cause problems too, and FasterCSV can choke on different file encodings (see the sketch after this list).
* Field order that differs from file to file - this can make reading with an array-index approach problematic.
* CSV file size - I'm not sure about limits or issues here and would like to hear people's experiences using FasterCSV with large CSV files.
* Uploading CSV files to a server in Rails apps - timeout issues. I've had to deal with this using the delayed_job gem, because Rails will time out if the CSV file is too large.
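
For the encoding issue above, one option I believe the standard CSV library supports is an :encoding option that transcodes while reading; the encoding name and file name here are just examples:

    require 'csv'

    # Transcode from Latin-1 to UTF-8 as rows are read
    CSV.foreach('latin1_file.csv', :headers => :first_row, :encoding => 'ISO-8859-1:UTF-8') do |row|
      # row fields arrive as UTF-8 strings
    end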

# Coding Exercise - let's build a CSV parser! (oh boy!!!)

The challenge this month is to build a parser to read CSV files (a rough starting-point sketch follows the requirements). The parameters are as follows:

* You may NOT use any external libraries for reading CSV files (like FasterCSV) to solve the challenge.
* You must account for quoted strings (quotes will always be double-quote chars: "") and allow the file delimiter to be embedded within the quotes.
* You must allow for a user-specified delimiter (comma, pipe, tilde...?).

We will not use a non-UTF8 file encoding or non-ASCII chars, so no worries there.
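
One possible, deliberately naive starting point (not part of the exercise files; parse_line is a made-up helper and it does no validation of unbalanced quotes):

    # Split one line into fields, honoring double-quoted fields and a caller-chosen delimiter.
    def parse_line(line, delimiter = ',')
      fields = []
      field = ''
      in_quotes = false
      chars = line.chomp.chars
      i = 0
      while i < chars.length
        c = chars[i]
        if in_quotes
          if c == '"' && chars[i + 1] == '"'  # doubled quote inside quotes -> literal quote
            field << '"'
            i += 1
          elsif c == '"'                      # closing quote
            in_quotes = false
          else
            field << c
          end
        elsif c == '"'                        # opening quote
          in_quotes = true
        elsif c == delimiter                  # end of field
          fields << field
          field = ''
        else
          field << c
        end
        i += 1
      end
      fields << field
    end

    parse_line('1,"21"" flat screen LCD Monitor",175')
    # => ["1", "21\" flat screen LCD Monitor", "175"]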

There are some sample files in the import_files directory.

The challenge will read some files that represent sales transactions for some kinds of products; the sample files are selling computer/office supplies. There are two files: a products file and a transactions file. Read the files using your CSV parser and report the following in your output:

* For each product show: highest price, lowest price, average price, total quantity sold, and total revenue for that item.
* Show the total revenue for all items.

### The other requirement is to have fun!

import_files/product_file1.csv

+1
@@ -0,0 +1 @@
product_id,name
1,MacBook Pro
2,Dell PC Laptop
3,HP Desktop PC
4,BlueTooth Keyboard and Mouse
5,"21"" flat screen LCD Monitor"
6,"""All in One Printer, Scanner, Fax"""
7,"""Office Furniture: chair, desk, lamp"""
8,business card holder

import_files/transactions.csv

+1
@@ -0,0 +1 @@
product_id,date,quantity,price
1,1/1/13,3,2400
1,1/15/13,1,900
2,2/12/13,1,1000
3,4/1/13,1,1250
4,5/2/14,15,65
2,6/16/14,3,750
8,6/30/14,5,12.5
6,9/3/14,3,225
6,9/4/14,2,235
7,9/15/14,25,1200
5,9/16/14,25,175
5,9/20/14,3,195
7,9/21/14,3,1300
