-
Notifications
You must be signed in to change notification settings - Fork 27
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
GSoC 2019 Progress Updates #40
Comments
Black: The Uncompromising Code Formatter
|
Personally, I think 88 columns is too short. However, I really don't care that much. Also don't care about single/double quotes (despite having my own preference for the former). Don't worry about setting up commit hooks for now.
As far as checking goes, you can send a PR to switch the
Use your own branches and send PRs to master that I'll review. We really shouldn't spend any more time on this than the one day that we've already spent. The existing setup is perfectly fine so please don't spend more than an hour or two on changing style checking stuff tomorrow. I'd much rather you get started on the real work and we worry about this later. |
Few Queries:
|
SGTM.
Exactly where I would start.
:D I think to start we'll only need to make the edges of the interface usable in Python? All of the FST processor itself doesn't need to be visible from Python.
My perspective is that we should start with wrapping as little as possible in order to get e.g.
Since
This should serve as a great example. You might even want to copy its approach as much as possible. Take a look at the README: https://github.com/hfst/hfst/tree/master/python. |
Note that if we did end up wrapping a large portion of |
Regarding SWIG: The needed functions are the ones that https://github.com/apertium/lttoolbox/blob/master/lttoolbox/lt_proc.cc calls - grep for |
Makes sense. The MVP is wrapping every function that https://github.com/apertium/lttoolbox/blob/master/lttoolbox/lt_proc.cc#L111 Note that all the file operations can just be done in Python since that shims out to 'native' code anyway. |
|
Great!
There's no real need to wrap
Why are we wrapping
We don't need to actually call any code that hits |
The SWIG The idea was never to wrap |
@Vaydheesh any luck? Tino's advice should be helpful. |
I had exam on 14 and my next exam is tomorrow. So couldn't look into it. I'll try this after the exam |
Sounds good. Best of luck on your exams! |
|
Hope your exams went well. That sounds like a good start! I'd like to see a PR soon (maybe tomorrow) even if it's a WIP. Tino and/or I can take a quick look to double check that you're not going down the wrong path. edit: note that everything should probably go in a Makefile, a la https://github.com/hfst/hfst/blob/master/python/Makefile.am |
@Vaydheesh how's it going? |
Also,
|
I'm a bit confused. You should be using |
I am using separate files for input and output
|
Hmm, okay. Let me know if I can help (what's going wrong?). Something straightforward like
If you can get the above working, don't worry about file descriptors for now. I'd rather get something to work and then improve it than immediately trying to jump to the optimal approach. |
This is what I've run on interpreter
Output: When I checked the temporary file created in
|
I'm not sure if this will help at all but try using |
Also, instead of using |
|
The file persists, but it is empty |
Try adding some debugging to the C++ (or using gdb I suppose) to figure out whether the issue is that there's no input being read or the output isn't being written? You can also try using https://en.cppreference.com/w/cpp/io/c/tmpfile and passing in/out strings. I don't like that approach since it requires more code in the C++ wrapper but lets see if we can get something to work first. |
|
Ah, I thought already you had it working with non-temporary files. I'm not really sure why your example doesn't work. Maybe https://github.com/apertium/lttoolbox/blob/master/lttoolbox/lt_proc.cc#L308-L311? Shouldn't be relevant unless you're on Windows though. You shouldn't need to open them in append mode but that shouldn't matter. @TinoDidriksen @unhammer any ideas? Probably something obvious. |
What does the input contain? There are certain inputs that I've seen can lead lt-proc into endless loops (typically when stuff hasn't been through deformatters) |
I've been working on the multiple flags functionality and I've made the relevant changes in all the wrappers. I just need to update the changes in |
Sure! I'm a bit worried that you're just creating code that will go out of sync with the real CLI interface though. |
The wrapper can be updated by adding |
I am not well aware of the documentation process. Can you explain what needs to be done? Should I just update the https://github.com/apertium/apertium-python/blob/master/README.md and https://github.com/apertium/apertium-python/tree/master/docs with the usage? |
Yes. But someone who updates the options will have to do so in two places now, right?
Here are the docs that are generated: https://apertium-python.readthedocs.io/en/latest/. They're mostly a result of the doc comments that need to be improved. The README can also use some improvement. For example, there are no docs regarding installing new packages. |
If those changes needs to be updated in the wrapper then yes
👍 |
Is there a way for us to ensure that these updates actually occur? e.g. in the build or test? |
or even better, a way to refactor and prevent there from being two places? |
If the part of wrapping up the arguments can be avoided, then there wont be any need to maintain duplicates. Since all the parsing is being done in |
I am working on caching the object the wrapper object to prevent reinitialization import lttoolbox
import os
import tempfile
def execute_lt_proc(fst, command, input_text, input_file, output_file):
input_file.write(input_text)
input_file.flush()
fst.lt_proc(command, input_file.name, output_file.name)
return output_file.read()
def analyze(input_text, fst):
input_file = tempfile.NamedTemporaryFile(delete=False, mode='w')
output_file = tempfile.NamedTemporaryFile(delete=False)
command = ['lt-proc', '-w']
output = execute_lt_proc(fst, command, input_text, input_file, output_file)
os.remove(input_file.name)
os.remove(output_file.name)
print(output)
path = '/usr/share/apertium/apertium-eng/eng.automorf.bin'
fst = lttoolbox.FST(path)
analyze('cats\n', fst) output is Re initializing the object fixes this. I have no clue about this unexpected behavior. Is it even possible to reuse the |
Hmm... I'm not sure. @unhammer, @xavivars any thoughts? related code: https://github.com/apertium/lttoolbox/blob/master/python/lttoolbox.i#L22 |
That certainly is unexpected :-) I've used lttoolbox's What happens if you destxt the input, ie. send |
>>> analyze('cats[][\n]', fst)
b'^cats/cat<n><pl>$[][\n]'
>>> analyze('cats[][\n]', fst)
b'^cats/cat<n><pl>/cat<n><pl>$[][\n]'
>>> analyze('cats[][\n]', fst)
b'^cats/cat<n><pl>/cat<n><pl>/cat<n><pl>$[][\n]' |
Very strange. What exactly does |
It calls
Calling |
If you look at what However, this complicates things a bit – you can't simply do Maybe you could make the API be something like |
Well, given that we're not trying to expose a perfect lttoolbox API for Python, I'm fine with the latter option. It should be possible in the current setup. |
after getting familiar with |
That seems okay to me if it works but I'm not sure how it's related to the stuff above? |
By wrapping |
Hmm, I'm not seeing why what Unhammer suggested isn't possible via the current approach of passing in temporary path strings. |
Can you have a look at the http://wiki.apertium.org/wiki/User:Vaydheesh/GSoC2019Report to be submitted as GSoC report |
It LGTM! I'm not too worried about the GSoC report :) Prefer you spend the last bits of time on the PRs. |
https://github.com/apertium/apertium-python/blob/master/apertium/utils.py#L191 is this |
That TODO is actually copied from apertium-apy. I think the concern is that we're parsing .modes files instead of the source modes.xml files. I don't know if the latter are available in an installation from the package manager? |
|
Do we need to set This seems a bit of overkill which reduces code coverage due to https://github.com/apertium/apertium-python/blob/master/apertium/mode_search.py#L8-L37 |
I don't think we need to do anything on that topic until/if people complain. |
I am working on various issues opened, but suddenly appveyor builds started failing due to
Edit: This weird behavior was caused due to a unit test. I've fixed the appveyor builds |
Awesome! |
This is an issue so that notifications come through nicely.
The text was updated successfully, but these errors were encountered: