- To run: java -jar gedcomparser-1.0.jar input_file output_file
- e.g. java -jar gedcomparser-1.0.jar input/input_1.txt output_1.txt
- To compile: mvn compile
- To test: mvn test
- To make a jar file: mvn package
- Spock/Groovy is used for writing tests, which makes them more readable and intuitive.
- StAX, a streaming API for XML, is used for writing the target XML document. Streaming is very fast but has limitations, such as being forward-only: once an element has been written, it cannot be revisited.
- If the source file contains lines with an invalid format, they are reported and parsing stops there.
- By default, 'mvn package' produces a jar file in the target folder
- This project has been tested with Maven 3.1.1 and JDK 1.7 on Linux
- To improve performance, we could introduce parallelism into this program by:
  - Having separate threads parse different sections of the raw data file
  - Having separate threads work on different stages of parsing: line parsing, converting parsed lines into XML elements, writing to the XML target file, etc.
  - Combining the two approaches above
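
The stage-based idea above could look roughly like the sketch below. It is not this project's code: the class `PipelineSketch`, the `ParsedLine` type, the simplified "LEVEL TAG [VALUE]" line format, and the `<gedcom>`/`<line>` element names are all assumptions made for illustration. One thread parses lines and hands them over through a bounded queue, while the calling thread writes them out with StAX's `XMLStreamWriter`.

```java
import java.io.StringWriter;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamWriter;

class PipelineSketch {
    /** Hypothetical parsed form of one input line: "LEVEL TAG [VALUE]". */
    static final class ParsedLine {
        final String level;
        final String tag;
        final String value;
        ParsedLine(String level, String tag, String value) {
            this.level = level;
            this.tag = tag;
            this.value = value;
        }
    }

    /** Sentinel object that tells the writer stage there is no more input. */
    private static final ParsedLine END = new ParsedLine("", "", "");

    /** Stage 1: parse one raw line (simplified format, assumed for this sketch). */
    static ParsedLine parse(String line) {
        String[] parts = line.trim().split("\\s+", 3);
        if (parts.length < 2 || !parts[0].matches("\\d+")) {
            throw new IllegalArgumentException("Invalid line: " + line);
        }
        return new ParsedLine(parts[0], parts[1], parts.length == 3 ? parts[2] : "");
    }

    /** Runs the parser on a worker thread and the StAX writer on the calling thread. */
    static String run(final List<String> lines) throws Exception {
        final BlockingQueue<ParsedLine> queue = new ArrayBlockingQueue<ParsedLine>(16);
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            Future<Void> parserTask = pool.submit(new Callable<Void>() {
                public Void call() throws InterruptedException {
                    try {
                        for (String line : lines) {
                            queue.put(parse(line));
                        }
                    } finally {
                        queue.put(END); // always unblock the writer, even on failure
                    }
                    return null;
                }
            });

            // Stage 2: forward-only XML writing, in the order lines were parsed.
            StringWriter out = new StringWriter();
            XMLStreamWriter xml = XMLOutputFactory.newInstance().createXMLStreamWriter(out);
            xml.writeStartElement("gedcom");
            for (ParsedLine p = queue.take(); p != END; p = queue.take()) {
                xml.writeStartElement("line");
                xml.writeAttribute("level", p.level);
                xml.writeAttribute("tag", p.tag);
                xml.writeCharacters(p.value);
                xml.writeEndElement();
            }
            xml.writeEndElement();
            xml.close();

            parserTask.get(); // rethrows a parse failure as ExecutionException
            return out.toString();
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(run(Arrays.asList("0 INDI person", "1 NAME John")));
    }
}
```

A single queue between two stages keeps the output order identical to the input order, which matters because the StAX writer cannot go back. Splitting the input file into sections parsed by several threads would additionally need a step that reassembles the results in order before writing.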