Self-adaptive translation mode for Marian (runtime domain adaptation). #887

Open
wants to merge 143 commits into base: master

Commits (143)
0eefedf
Dynamic swap working, as long as the vocabularies are the same
XapaJIaMnu Mar 24, 2021
521f634
Model and GPUSlot separation, add vocab support
kpu Mar 28, 2021
67190db
Add vocabulary padding script
kpu Mar 28, 2021
b165af8
Split code into main and library h/cpp
kpu Mar 28, 2021
4d8e327
Restore ensemble support
kpu Mar 28, 2021
203a9bb
Minor logging improvements
kpu Mar 28, 2021
c71d488
Return Histories
kpu Mar 28, 2021
47feb2b
Alignments
kpu Mar 28, 2021
8fc8d02
Fix enit
kpu Mar 28, 2021
b4bded3
Merge github.com:marian-nmt/marian-dev into dynamic_swap_mvp
kpu Mar 29, 2021
b9bc153
Merge https://github.com/kpu/marian-dev into dynamic_swap_mvp
kpu Mar 29, 2021
9b3e76a
Add an option to force loading
kpu Mar 30, 2021
cf12178
Allow CPU only compilation
XapaJIaMnu Mar 30, 2021
7e06801
Add explicit gpu device index when creating the object
XapaJIaMnu Mar 30, 2021
635cfb0
Allow multiple mini-batches
XapaJIaMnu Mar 30, 2021
ee6ff75
No stringstreams
XapaJIaMnu Mar 30, 2021
57ddeba
Sort the histories before returning them
XapaJIaMnu Apr 1, 2021
4f2b218
SwappableSlot: add GPU-to-GPU reset feature
Apr 1, 2021
fa51460
Merge pull request #1 from davidecaroselli/dynamic_swap_mvp
XapaJIaMnu Apr 1, 2021
2062438
Merge branch 'dynamic_swap_mvp' of https://github.com/kpu/marian-dev …
XapaJIaMnu Apr 2, 2021
e3f5388
Separate graph from loading to GPU
XapaJIaMnu Apr 2, 2021
ba4d166
Abort if not initialized
kpu Apr 2, 2021
f8523b7
Go back to Load instead of OverwriteFrom
kpu Apr 2, 2021
8bcfdcc
Check device index
kpu Apr 2, 2021
a893f19
Start working on code to reproduce a bug i encountered
rihardsk Feb 12, 2021
7f6d01e
Build the model implement a simplistic training loop
rihardsk Feb 17, 2021
f4e227e
Load config using the cli parser so that we can have default values f…
rihardsk Feb 17, 2021
dcb7122
Add dummy values for training sets in the config
rihardsk Feb 24, 2021
fcb9a61
Repeat the graph initialization in a cycle
rihardsk Mar 1, 2021
6560067
Add a part of the self adaptive marian's implementation
rihardsk Mar 2, 2021
7130800
Fix compatibility issues with some new refactors in master
rihardsk Mar 3, 2021
10cdffa
Fix options parsing issues
rihardsk Mar 4, 2021
286a23c
Fix remaining input parsing issues
rihardsk Mar 10, 2021
85685c6
Re-enable all of the adaptive code
rihardsk Mar 19, 2021
fca5fe4
Some further debugging, ugh
rihardsk Mar 19, 2021
37b6aa4
Fix the way inputs are initialized
rihardsk Mar 29, 2021
f67015e
Output graphviz graphs for the training graph
rihardsk Mar 30, 2021
78bcce1
Fix the segfault in the repro by moving the builder inside the loop
rihardsk Mar 31, 2021
162a17c
Move the builder initialization inside run() to fix the segfault
rihardsk Mar 31, 2021
de49880
Use a dedicated builder for the adaptive graph to avoid segfaults
rihardsk Apr 1, 2021
29415c7
Make a copy of all the swappable stuff to later adjust for training
rihardsk Apr 19, 2021
5b28f1f
Implement training with swappable stuff
rihardsk Apr 19, 2021
98b1ad1
Remove CPULoadedModelTrain in favor of just using CPULoadedModel
rihardsk Apr 20, 2021
d14da1b
Adapt self_adaptive.h to use the swappable stuff
rihardsk Apr 20, 2021
07658fb
Fix some runtime issues related to configuration
rihardsk Apr 21, 2021
06ee187
Fix issues with vocab initialization and memory allocation
rihardsk May 8, 2021
c4ff8b9
Initialize the ExpressionGraph for translation with inference=true
rihardsk May 14, 2021
16ec013
Seek to beginning of the istringstream when resetting text input
rihardsk May 14, 2021
a220a2b
When translating, directly use the trained parameters instead of load…
rihardsk May 19, 2021
4f67aab
Ensure that SwapPointers is called an even number of times
rihardsk May 19, 2021
ea1380d
Get some params from the gpu memory for debugging
rihardsk May 26, 2021
dda5995
Retrieve some debugging information in SwapPointers
rihardsk Jun 16, 2021
4e743bf
Attempt to load the io::Items representing parameters directly into t…
rihardsk Jun 30, 2021
9e898b0
Only reserve memory not fill it with values when initializing the mem…
rihardsk Jul 14, 2021
e7d339b
Load params before building the graph, drop the F0:: prefix, clear pa…
rihardsk Jul 19, 2021
26e7574
Try to clear the graph before loading the parameters in an attempt to…
rihardsk Jul 23, 2021
58851ae
Recreate the graph upon every training invocation
rihardsk Jul 23, 2021
2765e65
The wrong vocab was being passed to the printer
rihardsk Aug 10, 2021
c289397
Clean up and move memory piece extraction to a better place
rihardsk Aug 11, 2021
20c893d
Rename for readability; remove commented out code; remove debugging code
rihardsk Aug 11, 2021
724b910
Remove some redundant initialization code
rihardsk Aug 12, 2021
7a79027
Make method naming consistent in GPUEngineTrain
rihardsk Sep 8, 2021
1790ea1
Clean up some comments
rihardsk Sep 8, 2021
f8fe981
Simplify the training loop
rihardsk Sep 14, 2021
4a4214a
Make CorpusBase understand that stdin is not a file
rihardsk Sep 14, 2021
0c974eb
Move common training/translation stuff out into a separate method
rihardsk Sep 16, 2021
1176c3f
Rename TrainSet{Reader,Iterator} to AdaptiveContext{Reader,Iterator}
rihardsk Sep 16, 2021
79002cb
Add documentation comments for adaptive context reader classes
rihardsk Sep 16, 2021
632d05f
Move self-adaptive data stuff to a separate file
rihardsk Sep 16, 2021
95ed9af
Move method definitions from adaptive_context.h to .cpp
rihardsk Sep 17, 2021
030ddb0
Introduce more whitespace for readability
rihardsk Sep 17, 2021
c90a4d7
Rename and move the adaptive translation function
rihardsk Sep 17, 2021
448de67
Unhardcode the maximum translation input length parameter
rihardsk Sep 17, 2021
a6639ff
Compile adaptive_context.cpp conditionally
rihardsk Sep 17, 2021
bafcae1
Remove the marian_swapper executable
rihardsk Sep 20, 2021
afc5e15
Remove dead code from the model swapping code
rihardsk Sep 20, 2021
6aeb510
Rename some swappable classes and improve documentation
rihardsk Sep 21, 2021
ad38da9
Describe the purpose of swappable.h
rihardsk Sep 21, 2021
295040d
Explain the purpose of self-adaptive code
rihardsk Sep 22, 2021
6311f2b
Improve comments in self-adaptive code
rihardsk Sep 22, 2021
6bf3445
Check that param names and sizes match upon loading
rihardsk Sep 24, 2021
5cac0d1
Fix amun model loading
rihardsk Sep 24, 2021
1e1397d
Implement parameter name remapping for nematus models
rihardsk Sep 24, 2021
3f9c088
Work around a crash in amun model loading
rihardsk Sep 24, 2021
d4ba1fa
Don't crash when training sets not provided
rihardsk Oct 26, 2021
24e8fc3
Copy over the self-adaptive server example script from an older commit
rihardsk Oct 26, 2021
d68fd73
Clean up logging
rihardsk Oct 26, 2021
324f69a
Remove a config option for swappable stuff that isn't used any more
rihardsk Oct 26, 2021
fa5f9f1
Merge branch 'master' into adaptive-whole-graph-recreate
rihardsk Oct 26, 2021
7f43074
Disable early stopping for self-adaptive training
rihardsk Oct 27, 2021
d7676bd
Forgot to remove a file that was used for debugging
rihardsk Oct 27, 2021
e48e737
Update CHANGELOG.md
rihardsk Oct 27, 2021
017b6c1
Fix CPU-only compilation
rihardsk Oct 28, 2021
1257a45
Add a virtual destructor to CollectorBase
rihardsk Oct 28, 2021
96115c8
Fix casing in the `COMPILE_ADAPTIVE` cmake option's description
rihardsk Nov 29, 2021
ba61acd
Split out marian-adaptive server mode into a separate executable
rihardsk Nov 29, 2021
2e7e78f
Remove marian-adaptive from the .zip and .tgz targets
rihardsk Nov 29, 2021
0084a3a
Remove a comment that was made obsolete by the grandparent commit (b…
rihardsk Nov 29, 2021
30c0400
Change the defaultDispFreq option to use an unsigned value
rihardsk Nov 29, 2021
2fbb6ec
Fix indentation
rihardsk Nov 29, 2021
d09c021
Fix indentation
rihardsk Nov 29, 2021
10d5bff
Remove @brief from doc comments
rihardsk Nov 29, 2021
d41d81b
Remove commented out debugging code
rihardsk Nov 29, 2021
fde2226
Don't split the line here
rihardsk Nov 29, 2021
e407587
Fix indentation
rihardsk Nov 29, 2021
b869f68
Make it clear that validation options are disabled
rihardsk Nov 29, 2021
e04b829
Delete the pad_model_vocabulary.py script
rihardsk Nov 30, 2021
939384b
Comment on why data management options are disabled for self-adaptive…
rihardsk Nov 30, 2021
92aaeea
Explain the max-length-translate option; fix the default for max-lengt
rihardsk Nov 30, 2021
99553d5
Remove an obsolete comment
rihardsk Nov 30, 2021
8aec3ca
Remove excessive empty lines
rihardsk Nov 30, 2021
971e1dc
Split some long lines
rihardsk Nov 30, 2021
f3a085c
Forgot to add the marian_adaptive_server.cpp file to git
rihardsk Dec 1, 2021
bcbeb2d
Document the toMemoryPieces method
rihardsk Dec 2, 2021
2667ea9
Delete some more @briefs
rihardsk Dec 2, 2021
5b28786
Comment on a possibly missing "training-sets" option
rihardsk Dec 2, 2021
097effa
Remove unneeded member variables and describe member var usage
rihardsk Dec 2, 2021
507f8eb
Document some methods
rihardsk Dec 3, 2021
d797c90
Don't suggest looking at commits because they'll get squashed
rihardsk Dec 3, 2021
babf93d
Add a comment on stdin handling in CorpusBase
rihardsk Dec 3, 2021
2d1ff23
Fix a typo
rihardsk Dec 3, 2021
6955a9a
Document the `dropF0prefix` flag
rihardsk Dec 3, 2021
20cde20
Enable option validation for adaptive marian
rihardsk Dec 3, 2021
bbe5196
Add usage instructions to the adaptive/client_example.py script
rihardsk Dec 6, 2021
85d831f
Mention the tutorial repo as well
rihardsk Dec 6, 2021
7bb887a
Add punctuation for clarity
rihardsk Dec 6, 2021
9f03070
Fix a typo in a comment
rihardsk Dec 9, 2021
96615e7
Fix a typo in a comment
rihardsk Dec 9, 2021
d4a77ba
Fix a typo in a comment
rihardsk Dec 9, 2021
379418b
Revert an added space
rihardsk Dec 9, 2021
4bb6f5c
Clarify the server mode handling in ConfigParser
rihardsk Dec 9, 2021
c41a56b
Remove TSV options from self-adaptive translation
rihardsk Dec 9, 2021
6c97f82
Share code between marian-server and marian-adaptive-server
rihardsk Dec 10, 2021
88308a7
Don't require a "models" option for self-adaptive translation
rihardsk Dec 15, 2021
08d20d5
Fix crashes introduced by removing some options from self-adaptive ma…
rihardsk Dec 16, 2021
1326bb1
Disable parallel data validation for self-adaptive server mode
rihardsk Dec 16, 2021
56cfb37
Introduce a separate workspace size option for the translation graph
rihardsk Dec 28, 2021
d9cddf4
Fix alignment printing during translation
rihardsk Dec 29, 2021
892fed4
Merge remote-tracking branch 'upstream/master' into adaptive-whole-gr…
rihardsk Jan 28, 2022
3359bb7
Change "training-sets" to "train-sets"
snukky Jan 31, 2022
22230dd
Merge pull request #9 from marian-cef/adaptive-whole-graph-recreate-p…
kpu Jan 31, 2022
ea169a4
Merge branch 'master' into adaptive-whole-graph-recreate
rihardsk Feb 22, 2022
a274dfb
Mention marian-adaptive-server in the changelog
rihardsk Feb 22, 2022
1 change: 1 addition & 0 deletions CHANGELOG.md
@@ -9,6 +9,7 @@ and this project adheres to [Semantic Versioning](http://semver.org/spec/v2.0.0.
## [Unreleased]

### Added
- Adds `marian-adaptive` and `marian-adaptive-server` executables to enable self-adaptive translation (a.k.a. runtime domain adaptation).

### Fixed
- Scripts using PyYAML now use `safe_load`; see https://msg.pyyaml.org/load
1 change: 1 addition & 0 deletions CMakeLists.txt
@@ -15,6 +15,7 @@ option(COMPILE_CPU "Compile CPU version" ON)
option(COMPILE_CUDA "Compile GPU version" ON)
option(COMPILE_EXAMPLES "Compile examples" OFF)
option(COMPILE_SERVER "Compile marian-server" OFF)
option(COMPILE_ADAPTIVE "Compile marian-adaptive. Set COMPILE_SERVER=ON to enable the server mode." OFF)
option(COMPILE_TESTS "Compile tests" OFF)
if(APPLE)
option(USE_APPLE_ACCELERATE "Compile with Apple Accelerate" ON)
63 changes: 63 additions & 0 deletions scripts/self-adaptive/client_example.py
@@ -0,0 +1,63 @@
#!/usr/bin/env python

# This is an example for using self-adaptive translation in server mode.
#
# To run:
# 1. Start self-adaptive Marian in server mode, e.g.:
#      ./build/marian-adaptive-server -p 8080 -m model.npz -v vocab.yaml vocab.yaml \
#          --after-batches 10 --after-epochs 10 --learn-rate 0.1 --mini-batch 15 # other options
# 2. In a new shell, run this script:
#      python3 ./scripts/self-adaptive/client_example.py -p 8080
#
# For a more extensive example, see https://github.com/marian-cef/marian-examples/tree/master/adaptive
# or https://github.com/tilde-nlp/runtime-domain-adaptation-tutorial

from __future__ import print_function, unicode_literals, division

import sys
import time
import argparse
import json

from websocket import create_connection


def translate(batch, port=8080):
    ws = create_connection("ws://localhost:{}/translate".format(port))
    ws.send(batch)
    result = ws.recv()
    ws.close()
    return result.rstrip()


def parse_args():
    parser = argparse.ArgumentParser()
    parser.add_argument("-p", "--port", type=int, default=8080)
    return parser.parse_args()


if __name__ == "__main__":
    args = parse_args()

    # List of input sentences separated by a new line character
    inputs = "this is an example\nthe second sentence\nno context provided"
    # For each input sentence a list of parallel sentences can be provided as a
    # list of source and target sentences.
    contexts = [
        # Source-side context for the first input sentence
        ["this is a test\nthese are examples",
         # Target-side context for the first input sentence
         "das ist ein test\ndies sind Beispiele"],
        # Only one example is given as a context for the second input sentence
        ["the next sentence",
         "der nächste Satz"],
        # No context for the third input sentence
        []
    ]

    input_data = {'input': inputs, 'context': contexts}
    input_json = json.dumps(input_data)

    output_json = translate(input_json, port=args.port)
    output_data = json.loads(output_json)
    print(output_data['output'])
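
The request format used by the example script (newline-separated inputs, plus one optional source/target context pair per input) can be assembled and sanity-checked before it is sent. The following is a minimal sketch and not part of this PR: `build_request` is a hypothetical helper, and the two checks it performs (one context entry per input sentence, equal source and target line counts) are assumptions drawn from the script's comments rather than documented server requirements.

```python
import json


def build_request(sentences, contexts):
    """Build a JSON payload in the shape client_example.py sends.

    sentences: list of input sentences (joined with newlines in the request).
    contexts:  one entry per sentence; each entry is either [] (no context)
               or a (source_lines, target_lines) pair of lists.
    Hypothetical helper: the validation rules below are assumptions.
    """
    if len(sentences) != len(contexts):
        raise ValueError("expected one context entry per input sentence")

    encoded = []
    for ctx in contexts:
        if not ctx:
            encoded.append([])  # no adaptation data for this sentence
            continue
        source_lines, target_lines = ctx
        if len(source_lines) != len(target_lines):
            raise ValueError("context source/target line counts differ")
        # Each side is sent as a single newline-separated string
        encoded.append(["\n".join(source_lines), "\n".join(target_lines)])

    return json.dumps({"input": "\n".join(sentences), "context": encoded})


payload = build_request(
    ["this is an example", "no context provided"],
    [(["this is a test"], ["das ist ein test"]), []],
)
print(payload)
```

The resulting string could be passed to `translate()` from the script in place of `input_json`.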
20 changes: 20 additions & 0 deletions src/CMakeLists.txt
@@ -104,6 +104,7 @@ set(MARIAN_SOURCES
translator/nth_element.cpp
translator/helpers.cpp
translator/scorers.cpp
translator/swappable.cpp

training/graph_group_async.cpp
training/graph_group_sync.cpp
@@ -129,6 +130,12 @@ set(MARIAN_SOURCES
$<TARGET_OBJECTS:faiss>
)

if(COMPILE_ADAPTIVE)
  set(MARIAN_SOURCES ${MARIAN_SOURCES}
    data/adaptive_context.cpp
  )
endif(COMPILE_ADAPTIVE)

add_library(marian STATIC ${MARIAN_SOURCES})

target_compile_options(marian PRIVATE ${ALL_WARNINGS})
@@ -188,6 +195,7 @@ if(CUDA_FOUND)
tensors/gpu/add_all.cu
tensors/gpu/tensor_operators.cu
tensors/gpu/cudnn_wrappers.cu
tensors/gpu/swap.cu
translator/nth_element.cu
translator/helpers.cu
STATIC)
@@ -274,6 +282,18 @@ if (NOT COMPILE_LIBRARY_ONLY)
set(EXECUTABLES ${EXECUTABLES} marian_server)
endif(COMPILE_SERVER)

if(COMPILE_ADAPTIVE)
  add_executable(marian_adaptive command/marian_adaptive.cpp)
  set_target_properties(marian_adaptive PROPERTIES OUTPUT_NAME marian-adaptive)
  set(EXECUTABLES ${EXECUTABLES} marian_adaptive)

  if(COMPILE_SERVER)
    add_executable(marian_adaptive_server command/marian_adaptive_server.cpp)
    set_target_properties(marian_adaptive_server PROPERTIES OUTPUT_NAME marian-adaptive-server)
    set(EXECUTABLES ${EXECUTABLES} marian_adaptive_server)
  endif(COMPILE_SERVER)
endif(COMPILE_ADAPTIVE)

foreach(exec ${EXECUTABLES})
target_link_libraries(${exec} marian)
if(CUDA_FOUND)
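
The nested `COMPILE_ADAPTIVE` / `COMPILE_SERVER` conditions in this hunk decide which of the two new executables are built. As a rough illustration only, the sketch below models that logic in Python; the function name `adaptive_executables` is made up for this example, and it ignores the enclosing `NOT COMPILE_LIBRARY_ONLY` guard.

```python
def adaptive_executables(compile_adaptive, compile_server):
    """Return the adaptive executables the CMake conditions would add.

    Mirrors the nested if() blocks: marian-adaptive-server requires both
    COMPILE_ADAPTIVE and COMPILE_SERVER. Illustrative model only.
    """
    executables = []
    if compile_adaptive:
        executables.append("marian-adaptive")
        if compile_server:
            executables.append("marian-adaptive-server")
    return executables


# Enumerate all four flag combinations
for adaptive in (False, True):
    for server in (False, True):
        print(adaptive, server, adaptive_executables(adaptive, server))
```

In other words, setting only `COMPILE_SERVER=ON` adds neither adaptive binary, which matches the option description in the top-level `CMakeLists.txt`.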
19 changes: 19 additions & 0 deletions src/command/marian_adaptive.cpp
@@ -0,0 +1,19 @@
#include "marian.h"

#include "common/timer.h"
#include "common/utils.h"
#include "training/training.h"
#include "translator/self_adaptive.h"

using namespace marian;

int main(int argc, char **argv) {
  auto options = parseOptions(argc, argv, cli::mode::selfadaptive);
  auto task = New<TrainSelfAdaptive>(options);

  timer::Timer timer;
  task->run();
  LOG(info, "Total time: {:.5f}s", timer.elapsed());

  return 0;
}
11 changes: 11 additions & 0 deletions src/command/marian_adaptive_server.cpp
@@ -0,0 +1,11 @@
#include "translator/self_adaptive.h"
#include "translator/server_common.h"

int main(int argc, char **argv) {
  using namespace marian;

  auto options = parseOptions(argc, argv, cli::mode::selfadaptiveServer);
  auto task = New<TrainSelfAdaptive>(options);

  return runServer(task, options);
}
55 changes: 2 additions & 53 deletions src/command/marian_server.cpp
@@ -1,62 +1,11 @@
#include "marian.h"
#include "translator/beam_search.h"
#include "translator/server_common.h"
#include "translator/translator.h"
#include "common/timer.h"
#include "common/utils.h"

#include "3rd_party/simple-websocket-server/server_ws.hpp"

typedef SimpleWeb::SocketServer<SimpleWeb::WS> WSServer;

int main(int argc, char **argv) {
  using namespace marian;

  // Initialize translation task
  auto options = parseOptions(argc, argv, cli::mode::server, true);
  auto task = New<TranslateService<BeamSearch>>(options);
  auto quiet = options->get<bool>("quiet-translation");

  // Initialize web server
  WSServer server;
  server.config.port = (short)options->get<size_t>("port", 8080);

  auto &translate = server.endpoint["^/translate/?$"];

  translate.on_message = [&task, quiet](Ptr<WSServer::Connection> connection,
                                        Ptr<WSServer::InMessage> message) {
    // Get input text
    auto inputText = message->string();
    auto sendStream = std::make_shared<WSServer::OutMessage>();

    // Translate
    timer::Timer timer;
    auto outputText = task->run(inputText);
    *sendStream << outputText << std::endl;
    if(!quiet)
      LOG(info, "Translation took: {:.5f}s", timer.elapsed());

    // Send translation back
    connection->send(sendStream, [](const SimpleWeb::error_code &ec) {
      if(ec)
        LOG(error, "Error sending message: ({}) {}", ec.value(), ec.message());
    });
  };

  // Error Codes for error code meanings
  // http://www.boost.org/doc/libs/1_55_0/doc/html/boost_asio/reference.html
  translate.on_error = [](Ptr<WSServer::Connection> /*connection*/,
                          const SimpleWeb::error_code &ec) {
    LOG(error, "Connection error: ({}) {}", ec.value(), ec.message());
  };

  // Start server thread
  std::thread serverThread([&server]() {
    server.start([](unsigned short port) {
      LOG(info, "Server is listening on port {}", port);
    });
  });

  serverThread.join();

  return 0;
  return runServer(task, options);
}
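
The WebSocket handler removed from `marian_server.cpp` depends only on `task->run(inputText)` and the `quiet` flag, which is what makes it possible to share it between `marian-server` and `marian-adaptive-server` through `runServer`. The contents of `translator/server_common.h` are not shown in this part of the diff, so the following Python sketch only models the pattern of the removed lambda; `EchoTask` and `make_message_handler` are hypothetical names, not Marian APIs.

```python
import time


class EchoTask:
    """Stand-in for a translation task: anything with run(text) -> text."""

    def run(self, text):
        return text.upper()


def make_message_handler(task, quiet=False, log=print):
    """Return a callback modeled on the removed on_message lambda:
    run the task on the incoming text, optionally log timing, return the reply."""

    def on_message(input_text):
        start = time.perf_counter()
        output_text = task.run(input_text)
        if not quiet:
            log("Translation took: {:.5f}s".format(time.perf_counter() - start))
        return output_text + "\n"  # the C++ code appends std::endl

    return on_message


handler = make_message_handler(EchoTask(), quiet=True)
print(repr(handler("hello")))
```

Because the handler is parameterized only by the task object, any task type exposing a compatible `run` method could be plugged in, which is the property the two `main` functions in this PR rely on.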
2 changes: 1 addition & 1 deletion src/common/config.cpp
@@ -73,7 +73,7 @@ void Config::initialize(ConfigParser const& cp) {
}

// guess --tsv-fields, i.e. the number of fields in a TSV input, if not set
- if(get<bool>("tsv") && get<size_t>("tsv-fields") == 0) {
+ if(get<bool>("tsv", false) && get<size_t>("tsv-fields") == 0) {
size_t tsvFields = 0;

// use the length of --input-types if given
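
The change in this hunk passes a default to `get<bool>("tsv", false)`. The commit list includes "Remove TSV options from self-adaptive translation", so the `tsv` option is presumably absent in that mode and a lookup without a default would fail. How Marian's `get` reacts to a missing key is not shown in this diff; the sketch below uses a made-up `Options` class purely to illustrate the difference between a strict lookup and a lookup with a fallback.

```python
class Options:
    """Tiny stand-in for an options store; not Marian's Config class."""

    _MISSING = object()  # sentinel so that None/False stay valid defaults

    def __init__(self, values):
        self._values = dict(values)

    def get(self, key, default=_MISSING):
        if key in self._values:
            return self._values[key]
        if default is Options._MISSING:
            # Strict lookup: the caller did not supply a fallback
            raise KeyError("option '{}' is not defined".format(key))
        return default


opts = Options({"tsv-fields": 0})  # "tsv" was never registered

# Lookup with a fallback succeeds and short-circuits the TSV branch
if opts.get("tsv", False) and opts.get("tsv-fields") == 0:
    print("would guess tsv-fields")
else:
    print("tsv disabled or not defined")
```

With the fallback, the condition evaluates to false for modes that never register `tsv`, so the field-guessing block is skipped instead of aborting.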