Commit bb5c397

Update README.md
1 parent bd0cf14 commit bb5c397

1 file changed: +5 -8 lines changed

README.md

Lines changed: 5 additions & 8 deletions
@@ -1,30 +1,27 @@
1-
# (Poor man) Python ORC Reader
1+
# Python ORC Reader
22

33
## What is it?
44

5-
This is my attempt to write an ORC reader in python. The situation is that we have a lot of ORC files on local disk to consume
6-
by Python but there is no efficient way to access the file without converting it to CSV or compatible format.
5+
This is my attempt to write an ORC reader in Python. We have a lot of ORC files on local disk to consume from Python, but there is no efficient way to access them without first converting them to CSV or a compatible format.
76

87
My approach is to use [orc-core](https://orc.apache.org/docs/core-java.html) java library to read ORC file, then use
98
[py4j](https://github.com/bartdag/py4j) to bridge between Python and Java.
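
As a rough illustration of the bridging idea, here is a minimal sketch of how Python might pull rows from a Java-side reader through py4j. `JavaGateway` and `entry_point` are real py4j names, and `connect` assumes a JVM is already running a py4j gateway server on the default port. The helpers `connect` and `iter_rows` and the Java method names `openReader`, `hasNext`, `next`, and `close` are hypothetical and only stand in for whatever the java-gateway module actually exposes.

``` python
def connect():
    """Connect to a JVM that is already running a py4j gateway server."""
    # Imported lazily so the rest of the sketch can be exercised without py4j.
    from py4j.java_gateway import JavaGateway
    return JavaGateway()  # default address and port


def iter_rows(entry_point, path):
    """Yield each row of an ORC file as a Python list.

    `openReader`, `hasNext`, `next`, and `close` are hypothetical method
    names for the Java entry point; the real gateway may differ.
    """
    reader = entry_point.openReader(path)
    try:
        while reader.hasNext():
            # Every call here crosses the Python/Java boundary, which is
            # the per-row overhead this approach has to pay.
            yield list(reader.next())
    finally:
        reader.close()
```

With a gateway running, usage might look like `for row in iter_rows(connect().entry_point, "/path/to/file.orc"): print(row)`. Fetching rows in batches on the Java side would likely reduce the number of boundary crossings, at the cost of a more involved gateway.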
109

11-
I call it poor man because it may not be a proper approach. This approach may not work or may suffer from performance issue
12-
due to overhead. The proper approach would be using C++ reader from orc-core library. I want to go through this as an
13-
exercise to know more about ORC and py4j.
10+
This approach is not yet validated and may suffer from performance issues due to overhead. The proper approach would be to use the C++ reader from the orc-core library. I want to go through this as an exercise to learn more about ORC and py4j.
1411

1512

1613
## Installation
1714

1815
Until this package is available on PIP, you will have to install the package as follows:
1916

20-
1. Compile java gateway
17+
Compile java gateway
2118

2219
``` bash
2320
cd java-gateway
2421
mvn clean compile assembly:single
2522
```
2623

27-
2. Run setup.py script to install the package to the system
24+
Run setup.py script to install the package to the system
2825

2926
``` bash
3027
cd python
